Geocomputing curriculum - Machine Learning

MODULE 5: Machine learning

In this module, we explore the pandas and scikit-learn packages for machine learning tasks, using geoscience data examples. Students will gain an overview of how to inspect large datasets and solve problems with current data science tools.

Machine learning concepts

  • What is it that you’re trying to solve? How can machine learning help?
  • What's the difference between supervised and unsupervised methods?
  • What's the difference between classification and regression?
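
To make the questions above concrete, here is a minimal sketch using scikit-learn on synthetic data. The features, targets, and model choices are illustrative assumptions, not part of the course material. Regression predicts a continuous value, classification predicts a discrete label, and an unsupervised method (here k-means) is given no target at all.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, (200, 2))  # two hypothetical features per sample

# Supervised regression: the target is a continuous number.
y_cont = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(0, 0.1, 200)
reg = LinearRegression().fit(X, y_cont)

# Supervised classification: the target is a discrete label (0 or 1).
y_class = (X[:, 0] > X[:, 1]).astype(int)
clf = LogisticRegression().fit(X, y_class)

# Unsupervised clustering: no target is supplied, only the features.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

print(reg.predict(X[:3]))  # continuous values
print(clf.predict(X[:3]))  # class labels
print(km.labels_[:3])      # cluster IDs (arbitrary numbering)
```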

Data management for machine learning

  • DataFrames: A new way to look at well logs.
  • Exercise: loading a pandas DataFrame from a CSV.
  • Exercise: building a pandas DataFrame from a LAS file.
  • DataFrames vs arrays (vs Hadoop, Dask, etc).
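
As a rough illustration of the CSV exercise above, the sketch below loads a small, made-up well log table into a pandas DataFrame. The column names and the -999.25 null value are assumptions modelled on common log conventions. A LAS file would typically need a dedicated reader first, which is not shown here.

```python
import io

import pandas as pd

# Hypothetical CSV content; in practice this would be a file on disk.
csv_text = """DEPT,GR,RHOB,DT
1000.0,65.2,2.45,78.1
1000.5,70.8,2.47,-999.25
1001.0,-999.25,2.50,76.4
1001.5,88.1,2.52,75.0
"""

# Treat the assumed null marker as missing and index the logs by depth.
df = pd.read_csv(io.StringIO(csv_text), na_values=["-999.25"], index_col="DEPT")

print(df.head())
print(df.isna().sum())  # count of missing samples per curve
```

Indexing by depth means each row is a depth sample and each column a curve, which is one way to think of a well log as a table.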

The machine learning iterative loop

  • Data — Getting the data. Loading and storing it in an array and/or DataFrame.
  • Processing — data exploration, inspection, cleaning, and feature engineering.
  • Model — What is a model? Training a scikit-learn model (for now).
  • Results — assessing quality and performance metrics (accuracy, recall, F1, confusion matrices).
  • Repeat — What can we do to improve performance?
  • Exercise: predicting a missing well log.
  • Exercise: improving the pay flag prediction.
  • Exercise: Hugoton lithology prediction contest.
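
One pass through the loop above might look like the following sketch. Everything here is an assumption for illustration: the logs are synthetic, the pay rule is invented, and the random forest is just one possible scikit-learn model, not necessarily the one used in the exercises.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score, recall_score
from sklearn.model_selection import train_test_split

# Data: synthetic logs (hypothetical values, not from a real well).
rng = np.random.default_rng(42)
n = 400
df = pd.DataFrame({
    "GR": rng.uniform(20, 150, n),      # gamma ray, API units
    "PHI": rng.uniform(0.02, 0.30, n),  # porosity, fraction
})

# Invented pay rule, with about 5% of labels flipped to mimic noise.
pay = (df["GR"] < 75) & (df["PHI"] > 0.12)
flip = rng.random(n) < 0.05
df["PAY"] = (pay ^ flip).astype(int)

# Processing: choose features and hold out a test set.
X, y = df[["GR", "PHI"]], df["PAY"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Model: fit on the training data only.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Results: score on data the model has not seen.
y_pred = model.predict(X_test)
acc = accuracy_score(y_test, y_pred)
rec = recall_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)
print(f"accuracy={acc:.2f} recall={rec:.2f} F1={f1:.2f}")
print(cm)  # rows are true classes, columns are predicted classes
```

The Repeat step then means changing something (features, cleaning, model, hyperparameters) and comparing these scores again on the same held-out data.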

Deep learning in geoscience

  • Logistic regression, gradient descent, backprop, neural networks, deep networks.
  • Exercise: Classifying images with a neural network.
  • Exercise: Classifying images with a deep neural network.
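
The first two topics above, logistic regression and gradient descent, can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic two-cluster data; the function names, learning rate, and iteration count are arbitrary assumptions. A single sigmoid unit like this is the building block that neural networks stack into layers.

```python
import numpy as np


def sigmoid(z):
    """Map any real number to the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(X, y, lr=0.1, n_iter=2000):
    """Fit logistic regression by batch gradient descent on the log loss."""
    Xb = np.column_stack([np.ones(len(X)), X])  # prepend a bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = sigmoid(Xb @ w)              # predicted probabilities
        grad = Xb.T @ (p - y) / len(y)   # gradient of the mean log loss
        w -= lr * grad                   # step downhill
    return w


# Two synthetic, well-separated clusters (hypothetical data).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.concatenate([np.zeros(50), np.ones(50)])

w = fit_logistic(X, y)
prob = sigmoid(np.column_stack([np.ones(len(X)), X]) @ w)
pred = (prob > 0.5).astype(int)
accuracy = (pred == y).mean()
print("weights:", w)
print("training accuracy:", accuracy)
```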

MODULE 6: Geoscience recipes

Now that we have established a wide range of skills and code patterns, we share some of the recipes we use often, or which are baked into some of the tools we use. The idea is to expose the algorithms inside the tools we use every day, like map interpolation, AVO, and seismic migration. Each recipe is a self-directed exercise.

  • Exercise: AVO — from the 2-term Shuey equation to Zoeppritz, with bruges.
  • Exercise: Vibroseis data — loading the SEG-Y, correlating the data.
  • Exercise: Prestack seismic — computing prestack attributes, stacking.
  • Exercise: Wavelet estimation from seismic and logs.
  • Exercise: Linear inversion theory, regularization.
  • Exercise: The normal moveout equation. (Thanks to Leo Uieda!)
  • Exercise: Reverse time migration. (Thanks to Anton Grinevsky)
  • Exercise: Kirchhoff migration. (Based on Brian Russell's tutorial)
  • Exercise: Synthetic wedge models.
  • Exercise: Ternary diagrams and rose diagrams in matplotlib.
  • Exercise: What's inside bruges?
  • Exercise: What's inside welly?
  • Exercise: Seismic acquisition — geometries, key metrics.
  • Exercise: Map interpolation — splines, triangulation, kriging, cokriging.
  • Exercise: Deviated wells — trajectories, seismic data extraction.
  • Exercise: Shapefiles — working with GIS software.
  • Exercise: Seismographic data — see x lines post.
  • Exercise: (A very quick) Introduction to Julia.
  • Exercise: Introduction to Octave.
  • Exercise: Introduction to Lua/Torch.
  • Exercise: Introduction to R.
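
As a taste of what one of these recipes involves, the 2-term Shuey equation from the AVO exercise can be sketched in plain Python. The function name and the rock properties below are hypothetical, chosen only for illustration. The bruges library named in that exercise provides its own reflectivity functions, which this sketch does not use.

```python
import math


def shuey_two_term(vp1, vs1, rho1, vp2, vs2, rho2, theta_deg):
    """Two-term Shuey approximation: R(theta) ~ R0 + G * sin^2(theta).

    Layer 1 is the upper layer, layer 2 the lower one. Angles are in degrees.
    """
    # Average properties and contrasts across the interface.
    vp, vs, rho = (vp1 + vp2) / 2, (vs1 + vs2) / 2, (rho1 + rho2) / 2
    dvp, dvs, drho = vp2 - vp1, vs2 - vs1, rho2 - rho1

    # Intercept (normal-incidence reflectivity) and gradient.
    r0 = 0.5 * (dvp / vp + drho / rho)
    g = 0.5 * dvp / vp - 2 * (vs / vp) ** 2 * (drho / rho + 2 * dvs / vs)

    theta = math.radians(theta_deg)
    return r0 + g * math.sin(theta) ** 2


# Hypothetical properties: Vp and Vs in m/s, density in g/cm3.
upper = (3000.0, 1500.0, 2.40)
lower = (3200.0, 1700.0, 2.50)

for angle in (0, 10, 20, 30):
    print(angle, round(shuey_two_term(*upper, *lower, angle), 4))
```

The two-term form is usually considered reasonable only at small to moderate angles, which is part of the motivation for moving on to the full Zoeppritz equations.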