November 09, 2017

A new blog, and a new course

November 09, 2017/ Matt Hall

There's a great new geoscience blog on the Internet — I urge you to add it to your blog-reading app or news reader or list of links or whatever it is you use to keep track of these things. It's called Geology and Python, and it contains exactly what you'd expect it to contain!

The author, Bruno Ruas de Pinho, has nine posts up so far, all excellent. The range of topics is quite broad:

In each post, Bruno takes some geoscience challenge — nothing too huge, but the problems aren't trivial either — and then methodically steps through solving the problem in Python. He's clearly got a good quantitative brain, having recently graduated in geological engineering from the Federal University of Pelotas, aka UFPel, Brazil, and he is now available for hire. (He seems to be pretty sharp, so if you're doing anything with computers and geoscience, you should snag him.)

A new course for Calgary

We've run lots of Introduction to Python courses before, usually with the name Creative Geocomputing. Now we're adding a new dimension, combining a crash introduction to Python with a crash introduction to machine learning. It's ambitious, for sure, but the idea is not to turn you into a programmer. We aim to:

Help you set up your computer to run Python, virtual environments, and Jupyter Notebooks.
Get you started with downloading and running other people's packages and notebooks.
Verse you in the basics of Python and machine learning so you can start to explore.
Set you off with ideas and things to figure out for that pet project you've always wanted to code up.
Introduce you to other Calgarians who love playing with code and rocks.

We do all this wielding geoscientific data — it's all well logs and maps and seismic data. There are no silly examples, and we don't shy away from so-called advanced things — what's the point in computers if you can't do some things that are really, really hard to do in your head?

Tickets are on sale now at Eventbrite: it's $750 for 2 days, including all the lunch and code you can eat.

September 29, 2017

Hacking in Houston

September 29, 2017/ Matt Hall

Houston 2013
Houston 2014
Denver 2014
Calgary 2015
New Orleans 2015
Vienna 2016
Paris 2017
Houston 2017... The eighth geoscience hackathon landed last weekend!

We spent last weekend in hot, humid Houston, hacking away with a crowd of geoscience and technology enthusiasts. Thirty-eight hackers joined us on the top-floor coworking space, Station Houston, for fun and games and code. And tacos.

Matt kicking things off

Discussing projects

The hacking starts

Coding all day :)

Team coding

Figuring out street-level flooding in Houston

Alexa, how do we make an Alexa skill?

Evening at 8th Wonder Brewing

Back at it with help from NVIDIA

A Poweredge C4130 posing with David Holmes from Dell EMC

Getting ready to podcast

The organizer sofa

Making secret plans :)

NVIDIA Jetson TX1 all ready to go

Swag!

Listening to demos

David MC-ing the demos

Eric Jones from Enthought

Some of the prize-winners with their prizes

Here's a rundown of the teams and what they worked on.

Seismic Imagers

Jingbo Liu (CGG), Zohreh Souri (University of Houston).

Tech — DCGAN in TensorFlow, Amazon AWS EC2 compute.

The team looked for patterns that make seismic data different from other images, using a deep convolutional generative adversarial network (DCGAN). Using a seismic volume and a set of 2D lines, they made 121,000 sub-images (tiles) for their training set.
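For readers wondering what making tiles looks like in practice, here is a minimal sketch, not the team's actual code. It cuts a single 2D section into square, overlapping sub-images with NumPy. The function name `make_tiles`, the tile size, the stride, and the random stand-in data are all assumptions for illustration.

```python
import numpy as np


def make_tiles(section, tile_size=64, stride=32):
    """Cut a 2D array into square tiles; a stride below tile_size gives overlap."""
    n_rows, n_cols = section.shape
    tiles = []
    for top in range(0, n_rows - tile_size + 1, stride):
        for left in range(0, n_cols - tile_size + 1, stride):
            tiles.append(section[top:top + tile_size, left:left + tile_size])
    # Assumes the section is at least one tile in each direction.
    return np.stack(tiles)


# Random numbers standing in for one seismic section (hypothetical data).
rng = np.random.default_rng(0)
section = rng.normal(size=(256, 512))
tiles = make_tiles(section)
print(tiles.shape)
```

Repeating this over every inline, crossline, and 2D line is one way a modest dataset could turn into a large training set.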

The Young And The RasLAS

William Sanger (Schlumberger), Chance Sanger (Museum of Fine Arts, Houston), Diego Castañeda (Agile), Suman Gautam (Schlumberger), Lanre Aboaba (University of Arkansas).

State of the art text detection by Google Cloud Vision API

Tech — Google Cloud Vision API, Python Flask web app, Scatteract (sort of). Repo on GitHub.

Digitizing well logs is a common industry task, and current methods require a lot of manual intervention. The team's automated pipeline: convert PDF files to images, perform OCR with Google Cloud Vision API to extract headers and log track labels, then pick curves using a CNN in TensorFlow. The team implemented the workflow in a Python Flask front-end. Check out their slides.

Hutton Rocks

Kamal Hami-Eddine (Paradigm), Didi Ooi (University of Bristol), James Lowell (GeoTeric), Vikram Sen (Anadarko), Dawn Jobe (Aramco).

Tech — Amazon Echo Dot, Amazon AWS (RDS, Lambda).

The team built Hutton, a cloud-based cognitive assistant for gaining more efficient, better insights from geologic data. The project includes an integrated cloud-hosted database, an interactive web application for uploading new data, and a cognitive assistant for voice queries. Hutton builds upon existing Amazon Alexa skills. Check out their GitHub repo, and slides.

Big data > Big Lore

Licheng Zhang (CGG), Zhenzhen Zhong (CGG), Justin Gosses (Valador/NASA), Jonathan Parker (Marathon)

The team used machine learning to predict formation tops on wireline logs, which would allow for rapid generation of structure maps for exploration play evaluation, save man hours, and assist in difficult formation-top correlations. The team used the AER Athabasca open dataset of 2193 wells (yay, open data!).

Tech — Jupyter Notebooks, SciPy, scikit-learn. Repo on GitHub.

Free near surface

Tien-Huei Wang, Jing Wu, Clement Zhang (Schlumberger).

Multiples are a kind of undesired seismic signal and take expensive modeling to remove. The project used machine learning to identify multiples in seismic images. They attempted to use GAN frameworks, but found it difficult to formulate their problem, turning instead to the simpler problem of binary classification. Check out their slides.

Tech — CNN... I don't know the framework.

The Cowboyz

Mingliang Liu, Mohit Ayani, Xiaozheng Lang, Wei Wang (University of Wyoming), Vidal Gonzalez (Universidad Simón Bolívar, Venezuela).

A tight group of researchers joined us from the University of Wyoming at Laramie, and snagged one of the most enthusiastic hackers at the event, a student from Venezuela called Vidal. The team attempted acceleration of geostatistical seismic inversion using TensorFlow, a central theme in Mingliang's research.

Tech — TensorFlow.

Augur.ai

Altay Sensal (Geokinetics), Yan Zaretskiy (Aramco), Ben Lasscock (Geokinetics), Colin Sturm (Apache), Brendon Hall (Enthought).

Electrical submersible pumps (ESPs) are critical components for oil production. When they fail, they can cause significant down time. Augur.ai provides tools to analyze pump sensor data to predict when pumps are behaving irregularly. Check out their presentation!

Tech — Amazon AWS EC2 and EFS, Plotly Dash, SigOpt, scikit-learn. Repo on GitHub.

The Disaster Masters

Joe Kington (Planet), Brendan Sullivan (Chevron), Matthew Bauer (CSM), Michael Harty (Oxy), Johnathan Fry (Chevron)

Hydrologic models predict floodplain flooding, but not local street flooding. Can we predict street flooding from LiDAR elevation data, conditioned with citizen-reported street and house flooding from U-Flood? Maybe! Check out their slides.

Tech — Python geospatial and machine learning stacks: rasterio, shapely, scipy.ndimage, scikit-learn. Repo on GitHub.

The structure does WHAT?!

Chris Ennen (White Oak), Nanne Hemstra (dGB Earth Sciences), Nate Suurmeyer (Shell), Jacob Foshee (Durwella).

Inspired by the concept of an iPhone 'face ageing' app, Nate recruited a team to poke at applying the concept to maps of the subsurface. Think of a simple map of a structural field early in its life, compared to how it looks after years of interpretation and drilling. Maybe we can preview the 'aged' appearance to help plan where best to drill next to reduce uncertainty!

Tech — OpendTect, Azure ML Studio, C#, self-boosting forest cluster. Repo on GitHub.

Thank you!

Massive thanks to our sponsors — including Pioneer Natural Resources — for their part in bringing the event to life!

More thank-yous

Apart from the participants themselves, Evan and I benefitted from a team of technical support, mentors, and judges — huge thanks to all these folks:

The indefatigable David Holmes from Dell EMC. The man is a legend.
Andrea Cortis from Pioneer Natural Resources.
Francois Courteille and Issam Said of NVIDIA.
Carlos Castro, Sunny Sunkara, Dennis Cherian, Mike Lapidakis, Jit Biswas, and Rohan Mathews of Amazon AWS.
Maneesh Bhide and Steven Tartakovsky of SigOpt.
Dave Nichols and Aria Abubakar of Schlumberger.
Eric Jones from Enthought.
Emmanuel Gringarten from Paradigm.
Frances Buhay and Brendon Hall for help with catering and logistics.
The team at Station for accommodating us.
Frank's Pizza, Tacos-a-Go-Go, Cali Sandwich (banh mi), Abby's Cafe (bagels), and Freebird (burritos) for feeding us.

Finally, megathanks to Gram Ganssle, my Undersampled Radio co-host. Stalwart hack supporter and uber-fixer, Gram came over all the way from New Orleans to help teams make sense of deep learning architectures and generally smooth things over. We recorded an episode of UR at the hackathon, talking to Dawn Jobe, Joe Kington, and Colin Sturm about their respective projects. Check it out!

[Update, 29 Sep & 3 Nov] Some statistics from the event:

39 participants, including 7 women (way too few, but better than 4 out of 63 in Paris)
9 students (and 0 professors!).
12 people from petroleum companies.
18 people from service and technology companies, including 5 from Schlumberger!
13 no-shows, not including folk who cancelled ahead of time; a bit frustrating because we had a long wait list.
Furthest travelled: James Lowell from Newcastle, UK — 7560 km!
98 tacos, 67 burritos, 96 slices of pizza, 55 kolaches, and an untold number of banh mi.
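
If you want to check a 'furthest travelled' figure like that one yourself, a great-circle distance is a reasonable first estimate. This is a minimal sketch using the haversine formula on a spherical Earth. The city coordinates are approximate values I have assumed for illustration, and the result will not match a real flight path exactly.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; treats the Earth as a sphere


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate city-centre coordinates (assumed, not from the post).
newcastle = (54.978, -1.618)
houston = (29.760, -95.370)
distance = great_circle_km(*newcastle, *houston)
print(f"{distance:.0f} km")
```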

September 20, 2017

Looking ahead to SEG

September 20, 2017/ Matt Hall

The SEG Annual Meeting is coming up. Next week sees the festival of geophysics return to the global energy capital, shaken and damp but undefeated after its recent battle with Hurricane Harvey. Even though Agile will not be at the meeting this year, I wanted to point out some highlights of the week.

The Annual Meeting

The meeting will be big, as usual: 108 talk sessions, and 50 poster and e-presentation sessions. I have no idea how many presentations we're talking about but suffice to say that there are a lot. Naturally, there's a machine learning session, with the following talks:

A growing machine learning approach to optimize use of prestack and poststack seismic data (Kamal Hami-Eddine)
A machine learning approach to facies classification using well logs (Vincenzo Lipari)
A weakly supervised approach to seismic structure labeling (Yazeed Alaudah)
Automated input attribute weighting for unsupervised seismic facies analysis (Tao Zhao)
Different training sample selection strategies in unsupervised seismic facies analysis (Tao Zhao)
Geobody interpretation through multiattribute surveys, natural clusters, and machine learning (Thomas Andrew Smith)
Patterns classification in assisting seismic-facies analysis (Rongchang Liu)
Towards real-time geologic feature detection from seismic measurements using a randomized machine-learning algorithm (Dr. Youzuo Lin)

The Geophysics Hackathon

Even though we're not at the conference, we are in Houston this weekend — for the latest edition of the Geophysics Hackathon! The focus was set to be firmly on 'machine learning', but after the hurricane, we added the theme of 'disaster recovery and mitigation'. People are completely free to choose whatever project they'd like to work on; we'll be ready to help and advise on both topics. We also have some cool gear to play with: a Dell C4130 with 4 x NVIDIA P100s, NVIDIA Jetson TX1s, Amazon Echo Dots, and a Raspberry Shake. Many, many thanks to Dell EMC and Pioneer Natural Resources and all our other sponsors:

If you're one of the 70 or so people coming to this event, I'm looking forward to seeing you there... if you're not, then I'm looking forward to telling you all about it next week.

Petrel User Group

Jacob Foshee and Durwella are hosting a Petrel User Group meetup at The Dogwood, which is in midtown (not far from downtown). If you're a user of Petrel — power user or beginner, it doesn't matter — and you're interested in making the most of technology, it'd be good to see you there. Apart from anything else, you'll get to meet Jacob, who is one of those people with technology superpowers that you never know when you might need.

Rock Physics Reception

Tuesday. If you've never been to the famous Rock Physics Reception, then you're missing out. It's your best shot at bumping into the luminaries of rock physics — Colin Sayers, Stefan Gelinsky, Per Avseth, Marco Perez, Bill Goodway, Tad Smith — you know the sort of thing. If the first thing you think about when you wake up in the morning is Lamé's second parameter, RSVP right now. Hurry: there are only a handful of spots left.

There's more! Don't miss:

The Women's Network Breakfast on Wednesday.
The Wiki Committee meeting on Wednesday, 8:00 am, Hilton Room 344B.
If you're an SEG member, you can go to any committee meeting you like! Find one that matches your interests.

If you know of any other events, please drop them in the comments!

July 24, 2017

Newsflash: the Geophysics Hackathon is back!

July 24, 2017/ Matt Hall

Mark your calendar: 22–24 September (right before SEG), at a downtown Houston location to be confirmed.

We're filling the room with 50 geoscientists of all stripes. Interpreters, programmers, students, professionals... everyone is welcome. The plan: to imagine, design, and prototype some new tools in geophysics — all around the theme of machine learning. It's going to be awesome.

The schedule: we'll get started at 6 pm on Friday 22 September, and go till 10 pm. Then we pick it up again on Saturday morning, and go till 6 pm, and the same again on Sunday. Teams will present a demo to everyone on Sunday after 3 pm. There will be a few prizes, a few drinks, lots of food, and a lot of new geophysical tools and widgets.

If you want to know more about what a hackathon is, read my summary from the last one: Le grand hack! Or check out the project round-up posts, part 1 and part 2.

If you're not sure you belong, I promise that you do. One of the prize-winning teams in Paris had no coding experience! And every team needs help with brainstorming, design, testing, and presentation. Absolutely anyone can contribute, and absolutely everyone will learn something.

If you have some like-minded friends, bring them along! We need teams of 5 people, so if there are already 5 of you, you can start coding as soon as you walk in the door!

If you can't be there yourself, please share this post with someone you know.

When you're ready, click here to buy a ticket.

Thank you as always to our sponsors so far: Dell EMC and Amazon AWS. If you'd like to sponsor the Houston event, please check this page out, or just get in touch.

June 29, 2017

Subsurface Hackathon project round-up, part 2

June 29, 2017/ Evan Bianco

Following on from Part 1 yesterday, here are the other seven team projects from the hackathon:

Interactive visualization of Water Table heights over many years.

Water, water everywhere

Water Underground: Martin Bentley (NMMU), Joseph Barraud (Rolls Royce), Rabah Cheknoun (UPPA)

The team built readers for the groundwater data available from dinoloket.nl, both the groundwater levels and the hydrochemistry. They clustered the data by aggregating by month and then looking for similarities in levels in the boreholes, and built an open Jupyter notebook.
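
To give a flavour of the monthly aggregation step, here is a minimal sketch under my own assumptions. It is not the team's notebook. It averages readings per calendar month, then uses a Pearson correlation as one possible measure of similarity between two boreholes. The function names and the readings are made up.

```python
from collections import defaultdict
from datetime import date

import numpy as np


def monthly_means(readings):
    """Average (date, level) readings per (year, month)."""
    buckets = defaultdict(list)
    for day, level in readings:
        buckets[(day.year, day.month)].append(level)
    return {month: float(np.mean(levels))
            for month, levels in sorted(buckets.items())}


def similarity(series_a, series_b):
    """Pearson correlation over the months that both series share."""
    shared = sorted(set(series_a) & set(series_b))
    a = np.array([series_a[m] for m in shared])
    b = np.array([series_b[m] for m in shared])
    return float(np.corrcoef(a, b)[0, 1])


# Made-up water levels for two hypothetical boreholes.
borehole_1 = [(date(2016, 1, 5), 2.0), (date(2016, 1, 20), 2.2),
              (date(2016, 2, 10), 1.8), (date(2016, 3, 15), 1.5)]
borehole_2 = [(date(2016, 1, 12), 4.1), (date(2016, 2, 8), 3.7),
              (date(2016, 3, 3), 3.0), (date(2016, 3, 28), 3.2)]

m1 = monthly_means(borehole_1)
m2 = monthly_means(borehole_2)
score = similarity(m1, m2)
print(m1, m2, round(score, 3))
```

A matrix of such scores across many boreholes could then feed a clustering algorithm of your choice.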

Seismic from noise

OBSNoise: Fernando Villanueva-Robles (IPGP), Yann Huet (Setec-Lerm), Ngoc Huyen Luu (Ecole Polytechnique), Dorian Bagur (Telecom ParisTech), Jonathan Grandjean (Independent)

The OBSNoise project investigated the application of machine learning to coherently stack ambient noise records collected from ocean bottom seismic (OBS) arrays in order to extract reservoir information. The team's results from synthetic data showed promise. If fully developed, this technology could be a virtually real-time monitoring system of dynamic reservoir properties.

Global geochemical data analytics

The Killers: Alexandre Sache, Violaine Delahaye, Karl Sache (all from Institute Polytechnique UniLaSalle), Côme Arvis, Guillaume Ligner (Ecole Polytechnique)

Two geoscience undergrads and one automotive design student (I know right?) from UniLaSalle hooked up with two data science students from Ecole Polytechnique to interrogate the massive GeoRoc database using some clever data analytics tricks and did some novel many-dimensional geochemical classifications.

Fixing broken well data

LogFix: Guillaume Coffin (Telecom Evolution), Florian Napierala (EISTI), Camille Gimenez (Université Paris-Saclay), Tristan Siméon (Université de Montpellier), Robert Leckenby (Independent)

Truly pristine, calibrated, and corrected petrophysical data is so rare that it has a sort of mythical status. Team LogFix used machine learning to identify bad-data zones, then repair, QC, and fill in missing sections. They made impressive headway on the problem, using a dataset from the Athabasca region of Canada.

Between the hand-drawn lines

Automagical: Louis Poirier (Independent), Maggie Baber (Independent), Georg Semmler (GiGa infosystems), Björn Wieczoreck (GiGa infosystems), Jonas Kopcsek (GiGa infosystems)

You don't need to believe in magic. Team Automagical used machine learning to create 3D geological models from 2D cross-sections. They trained a predictive model using a collection of standardized hand-drawn cross-sections from human geoscientists. The model learns how to propagate rocks throughout a 3D scene. Their goal is to be able to generate cross-sections along any direction through the model. The AI learned how to do geologically realistic interpolation on simple structures. What kind of geologic complexity is possible with more input from more cross-sections?

The document on the left contains a log display with a lithology column. It's a 'hit'. The one on the right has no lithologies and is a 'miss'.

There's rocks in them hills! Hills of paper, that is

Logs on the Rocks: Daniel Stanton (Leeds University), Jack Woolam (Leeds University), Adam Goddard (Leeds University), Henri Blondelle (AgileDD)

If the oil and gas industry is to get more efficient, we'd better get really good at finding lithology and fluid information in the mountains of paper we've collectively built. Team Logs on the Rocks used CNNs to identify graphical depictions of rock types in a sea of unstructured PDFs and TIFFs. They introduced themselves as a team of non-coders, but these guys were doing cloud computing on AWS and using NVIDIA's GPUs before the end of the weekend.

Robot vision for seismic interpretation

It's not our FAULT! Claire Birnie (Leeds University), Carlos Alberto da Costa Filho (Edinburgh University), Matteo Ravasi (Statoil), Filippo Broggini (ETHZ), Gijs Straathof (SGS)

Geologic feature recognition using machine learning. The goal was to assist seismic interpreters in detecting geologic features – faults, folds, traps, etc. – in seismic data. They used Haar cascade classifiers, which are routinely used for identifying faces or kittens or beer bottles in photographs and video streams, specially trained to work on seismic data. They used the awesome OpenCV library to build this technology. At the time of writing, their website appears to be maxed out for the month, so if you're dying to see it, leave them a comment on LinkedIn asking them to increase their capacity. And in the meantime, you can check out their project's repo on GitHub.

Kudos for the open source repo, team!

It was thrilling to see such a large range of data and applications. Digital thin-sections, ground water maps, seismic data, well logs, cross-sections, information in unstructured documents, and so on. Thanks to each and every individual that showed up with their expertise and enthusiasm. We're all better off because of it.

A quick reminder that our sponsors are awesome! Please high-five them next time you meet them...

June 28, 2017

Subsurface Hackathon project round-up, part 1

June 28, 2017/ Evan Bianco

The dust has settled from the Hackathon in Paris two weeks ago. Been there, done that, came home with the T-shirt.

In the same random order they presented their 4-minute demos to our panel of esteemed judges, I present a (very) abbreviated round-up of what the teams made together over the course of the weekend. With the exception of a few teams who managed to spontaneously nucleate before the hackathon, most of these teams were made up of people who had never met each other before the event.

Just let that sink in for a second: teams of mostly mutual strangers built 13 legit machine-learning-based geoscience applications in one weekend.

An automated well log management system

Team Un-well Loggers: James Wanstall (Glencore), Niket Doshi (Teradata), Joseph Taylor (Teradata), Duncan Irving (Teradata), Jane McConnell (Teradata).

Tech: Kylo (NiFi, HDFS, Hive, Spark)

If you're working with well logs, and if you've got lots of them, you've almost certainly got gaps or inaccuracies from curve to curve and from well to well. The team's scalable, automated well-log file management system Log Healer computes missing logs and heals broken ones. Amazing.

An early result from Team Janus. The image on the left is ground truth, that on the right is predicted. Many of the features are present. Not bad for v0.1!

Meaningful cross sections from well logs

Team Janus: Daniel Buse, Johannes Camin, Paul Gabriel, Powei Huang, Fabian Kampe (all from GiGa Infosystems)

The team built an elegant machine learning workflow to attack the very hard problem of creating geologically realistic cross-section from well logs. The validation algorithm compares pixels to score the result.
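
Pixel-by-pixel scoring can be as simple as counting agreements. Here is a minimal sketch, assuming both cross-sections are integer arrays of rock-type codes with the same shape. The post does not say which metric Team Janus used, so plain pixel accuracy is my stand-in, and the tiny arrays are invented.

```python
import numpy as np


def pixel_accuracy(predicted, truth):
    """Fraction of pixels where the predicted class equals the true class."""
    predicted = np.asarray(predicted)
    truth = np.asarray(truth)
    if predicted.shape != truth.shape:
        raise ValueError("predicted and truth must have the same shape")
    return float(np.mean(predicted == truth))


# Tiny made-up sections; each integer is a hypothetical rock-type code.
truth = np.array([[0, 0, 1, 1],
                  [0, 2, 2, 1],
                  [2, 2, 2, 1]])
predicted = np.array([[0, 0, 1, 1],
                      [0, 0, 2, 1],
                      [2, 2, 1, 1]])
score = pixel_accuracy(predicted, truth)
print(f"{score:.3f}")
```

Plain accuracy rewards getting the big units right, so a metric that weights thin beds or boundaries more heavily might be a better fit for geology.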

Think Section's mindblowing photomicrograph labeling tool can also make novel camouflage patterns.

Paint-by-numbers on digital thin sections

Team Think Section: Diego Castaneda (Agile*), Brendon Hall (Enthought), Roeland Nieboer (Fugro), Jan Niederau (RWTH Aachen), Simon Virgo (RWTH Aachen)

Tech: Python (Scikit Learn, Scikit Image, Flask, NumPy, SciPy, Pandas), AWS for hosting app & Jupyter server.

Description: Mineral classification and point-counting on thin sections can be an incredibly tedious and time-consuming task. Team Think Section trained a model to segregate, classify, and label mineral grains in 200GB of high-resolution multi-polarization-angle photomicrographs.

Team Classy's super-impressive shot gather seismic event detection technology. Left: synthetic gather. Middle: predicted labels. Right: truth.

Event detection on seismic shot gathers

Team Classy: Princy Ikotoko Ndong (EOST), Anna Lim (NTNU), Yuriy Ivanov (NTNU), Song Hou (CGG), Justin Gosses (Valador).

Tech: Python (NumPy, Matplotlib), Jupyter notebooks.

The team created an AI which identifies and labels different events on a shot gather image. It can find direct waves, reflections, multiples or coherent noise. It uses a support vector machine for classification, and is simple and fast.
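
If you have not met a support vector machine before, here is a minimal sketch of the idea using scikit-learn's `SVC`. I am assuming scikit-learn here; it is not in the team's listed tech, and this is not their code. The two features per event and the two classes are invented purely for illustration.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(42)

# Two made-up features per event; real features would come from the gather.
reflections = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
multiples = rng.normal(loc=[5.0, 5.0], scale=0.5, size=(50, 2))

features = np.vstack([reflections, multiples])
labels = np.array([0] * 50 + [1] * 50)  # 0 = reflection, 1 = multiple

# A linear kernel fits a single separating plane between the classes.
classifier = SVC(kernel="linear")
classifier.fit(features, labels)

# Classify two new, hypothetical events.
predictions = classifier.predict([[0.2, -0.1], [4.8, 5.3]])
print(predictions)
```

Real gathers are nowhere near this cleanly separated, so most of the work would be in choosing features that tell the event types apart.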

model2seismic: An entirely new way to do modeling and inversion. Take note: the neural network that made this image knows no physics.

Forward and inverse modeling without the physics

Team GANsters: Lukas Mosser (Imperial), Wouter Kimman (Meridian), Jesper Dramsch (Copenhagen), Alfredo de la Fuente (Wolfram), Steve Purves (Euclidity)

Tech: PyNoddy, homegrown Python ML tools.

The GANsters created a deep-learning image-translation-based seismic inversion and forward modelling system. I urge you to go and look at their project on model2seismic. If it doesn't give you goosebumps, you are geophysically inert.

Machine learning for stratigraphic interpretation

Team Pick Pick LOG: Antoine Vanbesien (EOST), Fidèle Degni (Mines St-Étienne), Massinissa Mesbahi (Pau), Natsuki Gunji (Mines St-Étienne), Cédric Menut (EOST).

This team of data science and geoscience undergrads attacked an automated stratigraphic interpretation task. They used supervised learning to determine lithology from well logs in Alberta's Athabasca play, then attempted to teach their AI to pick stratigraphic tops. Impressive!

Pretty amazing, huh? The power of the hackathon to bring a project from barely-even-an-idea to actual-working-code is remarkable! And we're not even halfway through the teams: tomorrow I'll describe the other seven projects.

June 13, 2017

Le grand hack!

June 13, 2017/ Matt Hall

It happened! The Subsurface Hackathon drew to a magnificent close on Sunday, in an intoxicating cloud of code, creativity, coffee, and collaboration. It will take some beating.

Nine months in gestation, the hackathon was on a scale we have not attempted before. Total E&P joined us as co-organizers and made this new reach possible. They also let us use their amazing Booster — a sort of intrapreneurship centre — which was perfect for the event. Their team (thanks especially to Marine and Caroline!) did an amazing job of hosting, as well as providing several professionals from their subsurface software (thanks Jonathan and Yannick!) and data science teams (thanks Victor and David!). Arnaud Rodde and Frédéric Broust, who had to do some organization hacking of their own to make something as weird as a hackathon happen, should be proud of their teams.

Instead of trying to describe the indescribable, here are some photos:

BY THE NUMBERS

16 hours of code
13 teams
62 hackers
44 students
4 robots
568 croissants
0 lost-time incidents

I won't say much about the projects for now. The diversity was high — there were projects in thin section photography, 3D geological modeling, document processing, well log prediction, seismic modeling and inversion, and fault detection. All of the projects included some kind of machine learning, and again there was diversity there, including several deep learning applications. Neural networks are back!

Feel the buzz!

If you are curious, Gram and I recorded a quick podcast and interviewed a few of the teams:

It's going to take a few days to decompress and come down from the high. In a couple of weeks I'll tell you more about the projects themselves, and we'll edit the photos and post the best ones to Flickr (and in the meantime there are a few more pics there already).

Thank you to the sponsors!

Last thing: we couldn't have done any of this without the support of Dell EMC. David Holmes has been a rock for the hackathon project over the last couple of years, and we appreciate his love of community and code! Thank you too to Duncan and Jane at Teradata, Francois at NVIDIA, Peter and Jon at Amazon AWS, and Gram at Sandstone for all your support. Dear reader: please support these organizations!

UPDATE: 2 follow-up posts

June 08, 2017

Looking forward to EAGE

June 08, 2017/ Matt Hall

Evan, Diego and I are flying to Paris today for the EAGE Conference and Exhibition. It's exciting. We're excited.

But the excitement starts before the conference. The Subsurface Hackathon is this weekend!

My diary

Even the hackathon excitement starts before the weekend, because tomorrow, Friday, we're running the hacker's bootcamp — a sort of short course appetizer for the hackathon. We have about 25 geoscientists coming to the Booster TOTAL (an event space at TOTAL's La Défense offices) to get some hands-on practice with Python and the latest in machine learning tools. It's especially exciting because we'll also have engineers from NVIDIA on hand to help with the coaching. The idea is to help people hit the ground running when the hackathon starts on Saturday.

After that, on Saturday and Sunday, it's the hackathon itself. We have no fewer than 60 geoscientists and engineers registered for this breakout event. They're coming to the Booster to work on a wide array of machine learning ideas for the subsurface. It's going to be epic. You can read all about what happens next week, I promise.

Then on Monday it's the Data Science for Geoscience workshop, at which I'm giving a keynote. Since I'm far from possessing expertise, I'm using it as a chance to get people jazzed about helping make the coming AI revolution in geoscience a positive experience. I'm really looking forward to it.

The conference itself starts on Tuesday. In the afternoon I'm co-chairing a session on machine learning (have you spotted the theme yet?) in seismic interpretation, along with Victor Aarre of Schlumberger. It will be awesome to see what kind of progress our community is making in this field — it's fun to imagine what seismic interpretation might be like in a few years. There are so many fascinating problems to work on! Here are the talks in that session:

H Di, M Shafiq and G AlRegib (Georgia Institute of Technology), Multi-attribute K-means Cluster Analysis for Salt Boundary Detection
H Hashemi et al. (Institute of Geophysics, University of Tehran), Clustering Seismic Datasets for Optimized Facies Analysis Using a SSCSOM Technique
S Hadiloo et al. (Research Institute of Applied Sciences, ACECR), Seismic Facies Analysis by ANFIS and Fuzzy Clustering Methods to Extract Channel Patterns
A Waldeland and A Solberg (University of Oslo), Salt Classification Using Deep Learning
H Di & G AlRegib (Georgia Institute of Technology), Seismic Multi-attribute Classification for Salt Boundary Detection - A Comparison
P Xu, W Lu and B Wang (Tsinghua University), Multi-Attribute Classification Based On Sparse Autoencoder - A Gas Chimney Detection Example
Y Alaudah and G AlRegib (Georgia Institute of Technology), Weakly Supervised Seismic Structure Labeling via Orthogonal Non-Negative Matrix Factorization
J Amtmann et al. (University of Leoben/ Geo5 GmbH), Testing of Clustering Algorithms on Different 3D Seismic Models
R Noemani Rad and G Gharabeigli (NIOC Research & Technology Directorate), Half Graben Evolution in the Kopet Dagh Fold-and-thrust Belt - Sedimentation and Pale Current History

On Wednesday we'll be taking in some more talks and posters, then in the afternoon I'm reprising my keynote talk at IFPEN, a subsurface research institute in the Bois de Boulogne. I've never been there before, although I have met a few IFP scientists before. I'm looking forward to it very much.

It all ends for us on Thursday. Evan and Diego fly home and I'm off to Cambridge (the old one in the fens, not the one in Massachusetts) for a few days with family (and bookshops). Until then, expect much blogging!

Going to EAGE?

If you're reading this and would like to meet up with us at Agile or some of the Software Underground crowd — the friendliest bunch of coding geoscientists you could hope for — let's plan to meet at the end of the workshop, at the workshop location. Look for the Software Underground shirts.

June 06, 2017

What should national data repositories do?

June 06, 2017/ Matt Hall

Right now there's a conference happening in Stavanger, Norway: National Data Repository 2017. My friend David Holmes of Dell EMC, a long time supporter of Agile's recent hackathons and general geocomputing infrastructure superhero, is there. He's giving a talk, I think, and chairing at least one session. He asked a question today on Software Underground:

“If anyone has any thoughts or ideas as to what the regulators should be doing differently now is a good time to speak up :)”

My response

For me it's about raising their aspirations. Collectively, they are sitting on one of the most valuable — or invaluable — datasets in the world, comparable to Hubble, or the LHC. Better yet, the data are (in most cases) already open and they actually want to share it. And the community (us) is better tooled than ever, and perhaps also more motivated, to get cracking. So the possibility is there to see a revolution in subsurface science and exploration (in the broadest sense of the word) and my challenge to them is:

Can they now create the conditions for this revolution in earth science?

Some things I think they can do right now:

Properly fund the development of an open data platform. I'll expand on this topic below.
Don't get too twisted off on formats (go primitive), platforms (pick one), licenses (go generic), and other busy work that committees love to fret over. Articulate some principles (e.g. public first, open source, small footprint, no lock-in, componentize, no single provider, let-users-choose, or what have you), and stay agile.
Lobby NOCs and IOCs hard to embrace integrated and high-quality open data as an advantage that society, as well as industry, can share in. It's an important piece in the challenge we face to modernize the industry. Not so that it can survive for survival's sake, but so that it can serve society for as long as it's needed.
Get involved in the community: open up their processes and collaborate a lot more with the technical societies — like show up and talk about their programs. (How did I not hear about the CDA's unstructured data challenge — a subject I'm very much into — till it was over? How many other potential participants just didn't know about it?)

An open data platform

The key piece here is the open data platform. Here are the features I'd like to see of such a platform:

Optimized for users, not the data provider, hosting provider, or system administrator.
Clear rights: well-known, documented, obvious, clearly expressed open licenses for re-use.
Meaningful levels of access that are free of charge for most users and most use cases.
Access for humans (a nice mappy web interface) with no awkward or slow registration processes.
Access for machines (a nice API, perhaps even a couple of libraries expressing it).
Tools for query, discovery, and retrieval; ideally with user feedback paths ('more like this, less like that').
Ways to report, or even fix, problems in the data. This relieves you of "the data's not ready" procrastination.
Good documentation of all of this, ideally in a wiki or something that people can improve.
Support for a community of users and developers that want to do things with the data.
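
To make 'access for machines' and 'ways to report problems' a bit more concrete, here is a minimal Python sketch of what a client-side view of such a platform might feel like. Everything in it is an assumption for illustration: the Dataset and Catalogue classes, the search and report_issue methods, and the three records are all made up, and nothing here describes a real repository or API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Dataset:
    """One record in a hypothetical repository catalogue."""
    identifier: str
    kind: str      # e.g. "well-log" or "seismic"
    license: str   # a well-known open licence, stated on every record
    lon: float
    lat: float


@dataclass
class Catalogue:
    """A toy in-memory stand-in for a repository's query API (hypothetical)."""
    records: list
    issues: dict = field(default_factory=dict)

    def search(self, kind=None, bbox=None):
        """Return records of a given kind inside a (west, south, east, north) box."""
        hits = []
        for rec in self.records:
            if kind is not None and rec.kind != kind:
                continue
            if bbox is not None:
                west, south, east, north = bbox
                if not (west <= rec.lon <= east and south <= rec.lat <= north):
                    continue
            hits.append(rec)
        return hits

    def report_issue(self, identifier, message):
        """Attach a user-reported problem to a dataset so it can be fixed later."""
        if identifier not in {rec.identifier for rec in self.records}:
            raise KeyError(identifier)
        self.issues.setdefault(identifier, []).append(message)


# Made-up records, purely for illustration.
catalogue = Catalogue(records=[
    Dataset("well-001", "well-log", "CC-BY-4.0", 5.7, 58.9),
    Dataset("seis-001", "seismic", "CC-BY-4.0", 2.1, 60.3),
    Dataset("well-002", "well-log", "CC0-1.0", -114.1, 51.0),
])

# Discovery: well logs inside a rough bounding box.
nearby_wells = catalogue.search(kind="well-log", bbox=(0.0, 55.0, 10.0, 62.0))

# Feedback: flag a problem instead of waiting for the data to be 'ready'.
catalogue.report_issue("well-001", "Depth units missing from header")
```

The point of the sketch is only that query, licence information and problem reporting all sit behind one small, obvious interface; a real platform would put the same ideas behind a web API with proper storage.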

Building this platform is not trivial. There is massive file storage, database back end, web front end, licensing, and so on. Then there's the community of developers and users to engage and support. It will take years, and never be finished. It sounds hard... but people are doing it. Prototypes for seismic data exist already, and there are countless models in other verticals (just check out the Registry of Research Data Repositories, or look at the list on PLOS).

The contract to build data infrastructure is often awarded to the likes of Schlumberger, Halliburton or CGG. In theory, these companies have the engineering depth to pull it off (though this too is debatable, especially in today's web-first, native-never world). But they completely lack the culture required: there's no corporate understanding of what 'open' means. So the model is broken in subtle but fatal ways and the whole experiment fails.

I'm excited to hear what comes out of this conference. If you're there, please tell!

June 02, 2017

Conversation not discussion

June 02, 2017/ Matt Hall

It's a while since we had a 'conferences are broken' rant on the Agile blog!

Five or six of the sessions at this year's conference were... different. I already mentioned the Value In Geophysics session, which was a cross between a regular series of talks and a panel discussion. I went to another, The modern geoscientist, which was structured the same way. A third one, Fundamentals of Professional Career Branding, was a mini workshop with Jackie Rafter of Higher Landing. There were at least a couple of other such sessions.

It's awesome to see the societies experimenting with something outside the usual plethora of talks and posters. I hope they were well received, because we need more of this in our discipline, now more than ever. If you went to one and enjoyed it, please let the organizers know.

But... the sessions — especially the panel discussion sessions — lacked something. One thing really:

The sessions we saw were nowhere near participatory enough. Not even close.

The 'expert-panel-enlightens-audience' pattern is slowing us down, perpetuating broken models of leadership and hierarchy. There isn't an expert in Calgary or the universe that knows how or when this downturn is going to end, or what we need to do to improve our chances of continuing to contribute to society and make a living in our profession. So please, stop throwing people up on a stage, making them give 5-minute presentations, and occasionally asking for questions from the audience. That is nothing like a discussion. Tune in to a political debate show to see what those look like: rapid-fire, punchy, controversial. In short: interesting. And, from an organizer's point of view, really hard, which is why we should stop.

Real conversation

What I think is really needed right now, more than half-baked expert discussion, is conversation. Conversations happen between small groups of people, all sitting on the same plane, around a table, with napkins to draw on and time to draw on them. They connect people and spread awesome ideas like viruses. What's more, great conversations have outcomes.

I don't want to claim that Agile has all this figured out, but we have demonstrated various ways of connecting scientists in meaningful ways and with lasting outcomes. We've also written extensively on the subject (e.g. here and here and here and here). Other verticals have conducted many more experiments, and documented the results. Humans know how to do this.

So there's no excuse. It's not too dramatic to call the current 'situation' a crisis in our profession in Canada, and we need to get beyond tinkering at the edges and half-hearted attempts at change. Our societies need to pay attention to what's needed, and get on with making it happen.

Still more ranting...

We talked about this topic at some length on the Undersampled Radio podcast yesterday. Here's the uncut video version:
