I’m dreaming of a blueschist Christmas

The festive season is speeding towards us at the terrifying rate of 3600 seconds per hour. Have you thought about what kind of geoscientific wonders to make or buy for the most awesome kids and/or grownups in your life yet? I hope not, because otherwise this post is pretty redundant… If you have, I’m sure you can think of <AHEM> at least one more earth scientist in your life you’d like to bring a smile to this winter.

I mean, here’s a bargain to start you off: a hammer and chisel for under USD 15 — an amazing deal. The fact that they are, unbelievably, made of chocolate only adds to the uses you could put them to.

If your geoscientist is on a diet or does their fieldwork in a warm country, then obviously these chocolate tools won’t work. You could always get some metal ones instead (UK supplier, US supplier).

Image © The Chocolate Workshop

Image © The Chocolate Workshop

Before you start smashing things to bits with a hammer, especially one that melts at 34°C, it’s sometimes nice to know how hard they are. Tapping them with a chocolate bar or scratching them with your fingernail are time-tested methods, but the true geologist whips out a hardness pick.

I have never actually seen one of these (I’m not a true geologist) so the chances of your geoscientist having one, especially one as nice as this, are minuscule. USD 90 at geology.com.

Image © Geology.com

Image © Geology.com

Hammers can be used around the house too, of course, for knocking in nails or sampling interesting countertops. If your geoscientist is houseproud, how about some of Jane Hunter’s beautiful textile artworks, many of which explore geological and geomorphological themes, especially Scottish ones. The excerpt shown here is from Faults and Folds (ca. USD 1000); there are lots of others.

If textiles aren’t your thing, these hydrology maps from Muir Way are pretty cool too. From USD 80 each.

Image © Jane Hunter

Image © Jane Hunter

Topographic maps are somehow more satisfying when they are three-dimensional. So these beautiful little wooden maps from ElevatedWoodworking on Etsy, which seem too cheap to be true, look perfect.

There’s plenty more for geoscientists on Etsy, if you can look past the crass puns slapped clumsily onto mugs and T-shirts. For example, if geostatistics get you going, start at NausicaaDistribution and keep clicking. My favourites: the Chisquareatops shirt and the MCMC Hammer cross-stitch pattern.

Image © ElevatedWoodworking on Etsy

Image © ElevatedWoodworking on Etsy

I like statistics. Sometimes, not very often, people ask my where my online handle kwinkunks comes from. It’s a phonetic spelling of one of my favourite words, quincunx, which has a couple of meanings, but the most interesting one is a synonym for a Galton board or bean machine. Galton boards are awesome! Demonstrate the central limit theorem right on your desktop! From USD 10: a cheap one, and an expensive one.

Oh, and there’s a really lovely/expensive one from Lightning Calculator if your geoscientist is the sort of person who likes to have the best of everything. It costs USD 1190 and it looks fantastic.

Image © Random Walker

Image © Random Walker

Let’s get back to rocks. You can actually just give a rock to a geologist, and they’ll be happy. You just might not see much of them over the holiday, as they disappear off to look at it.

If your geologist has worked in the North Sea in their career, they will definitely, 100% enjoy these amazing things. Henk Kombrink and Kirstie Wright are distributing chunks of actual North Sea core. The best part is that you can choose the well and formation the rock comes from! We gave some resinated core slabs away as prizes at the hackathons this month, and the winners loved them.

Image © Henk Kombrink

Image © Henk Kombrink

Traditionally, I mention some books. Not that I read books anymore (reasons). If I did read books, these are the ones I’d read:


That’s it for this year! I hope there’s something here to brighten your geoscientist’s day. Have fun shopping!

PS In case there’s not enough here to choose from, you can trawl through the posts from previous years too:

Unlike most images on agilescientific.com, the ones in this post are not my property and are not open access. They are the copyright of their respective owners, and I’m using them here in accordance with typical Fair Use terms. If owners object, please let me know.

The Scottish hackathon

On 16−18 November the UK Oil & Gas Authority (OGA) hosted its first hackathon, with Agile providing the format and technical support. This followed a week of training the OGA provided — again, through Agile — back in September. The theme for the hackathon was ‘machine learning’, and I’m pretty sure it was the first ever geoscience hackathon in the UK.

Thirty-seven digital geoscientists participated in the event at Robert Gordon University; most of them appear below. Many of them had not coded at all before the bootcamp on Friday, so a lot of people were well outside their comfort zones when we sat down on Saturday. Kudos to everyone!

The projects included the usual mix of seismic-based tasks, automated well log picking, a bit of natural language processing, some geospatial processing, and seals (of the mammalian variety). Here’s a rundown of what people got up to:

Counting seals on Scottish islands

Seal Team 6: Julien Moreau, James Mullins, Alex Schaaf, Balazs Kertesz, Hassan Tolba, Tom Buckley.

Project: Julien arrived with a cool dataset: over 6000 seals located on two large TIFFs images of Linga Holm, an island off Stronsay in the Orkneys. The challenge: locate the seals automatically. The team came up with a pipeline to generate HOG descriptors, train a support vector machine on about 20,000 labelled image tiles, then scan the large TIFFs to try to identify seals. Shown here is the output of one such scan, with a few false positive and false negatives. GitHub repo.

This project won the Most Impact award.


Automatic classification of seismic sections

Team Seis Class: Jo Bagguley, Laura Bardsley, Chio Martinez, Peter Rowbotham, Mike Atkins, Niall Rowantree, James Beckwith.

Project: Can you tell if a section has been spectrally whitened? Or AGC’d? This team set out to attempt to teach a neural network the difference. As a first step, they reduced it to a binary classification problem, and showed 110 ‘final’ and 110 ‘raw’ lines from the OGA ESP 2D 2016 dataset to a convolutional neural net. The AI achieved an accuracy of 98% on this task. GitHub repro.

This project won recognition for a Job Well Done.

Why do get blocks relinquished?

Team Relinquishment Surprise: Tanya Knowles, Obiamaka Agbaneje, Kachalla Aliyuda, Daniel Camacho, David Wilkinson (not pictured).

Project: Recognizing the vast trove of latent information locked up in the several thousand reports submitted to the OGA. Despite focusing on relinquishment, they quickly discovered that most of the task is to cope with the heterogeneity of the dataset, but they did manage to extract term frequencies from the various Conclusions sections, and made an ArcGIS web app to map them.


Recognizing reflection styles on seismic

Team What’s My Seismic? Quentin Corlay, Tony Hallam, Ramy Abdallah, Zhihua Cui, Elia Gubbala, Amechi Halim.

Project: The team wanted to detect the presence of various seismic facies in a small segment of seismic data (with a view to later interpreting entire datasets). They quickly generated a training dataset, then explored three classifiers: XGBoost, Google’s AutoML, and a CNN. All of the methods gave reasonable results and were promising enough that the team vowed to continue investigating the problem. Project website. GitHub repo.

This project won the Best Execution award.


Stretchy-squeezey well log correlation

Team Dynamic Depth Warping: Jacqueline Booth, Sarah Weihmann, Khaled Muhammad, Sadiq Sani, Rahman Mukras, Trent Piaralall, Julio Rodriguez.

Project: Making picks and correlations in wireline data is hard, partly because the stratigraphic signal changes spatially — thinning and thickening, and with missing or extra sections. To try to cope with this, the team applied a dynamic time (well, depth) warping algorithm to the logs, then looking for similar sections in adjacent wells. The image shows a target GR log (left) with the 5 most similar sections. Two, maybe four, of them seem reasonable. Next the team planned to incorporate more logs, and attach probabilities to the correlations. Early results looked promising. GitHub repo.

Making lithostrat picks

Team Marker Maker: Nick Hayward, Frédéric Ramon, Can Yang, Peter Crafts, Malcolm Gall

Project: The team took on the task of sorting out lithostratigraphic well tops in a mature basin. But there are speedbumps on the road to glory, e.g. recognizing which picks are lithological (as opposed to chronological), and which pick names are equivalent. The team spent time on various subproblems, but there’s a long road ahead.

This project won recognition for a Job Well Done.


Alongside these projects, Rob and I floated around trying to help, and James Beckwith hacked on a cool project of his own for a while — Paint By Seismic, a look at unsupervised classification on seismic sections. In between generating attributes and clustering, he somehow managed to help and mentor most of the other teams — thanks James!

Thank you!

Thank you to The OGA for these events, and in particular to Jo Bagguley, whose organizational skills I much appreciated over the last few weeks (as my own skills gradually fell apart). The OGA’s own Nick Richardson, the OGTC’s Gillian White, and Robert Gordon Universty’s Eyad Elyan acted as judges.

These organizations contributed to the success of these events — please say Thank You to them when you can!


I’ll leave you with some more photos from the event. Enjoy!



Yesterday I announced that we’re hatching a new plan. The next thing. Today I want to tell you about it.

The project has the codename TRANSFORM. I like the notion of transforms: functions that move you from one domain to another. Fourier transforms. Wavelet transforms. Digital subsurface transforms. Examples:

  • The transformative effect of open source software on subsurface science. Open source accelerates our work!

  • The transformative effect of collaborative, participatory events on the community. We can make new things!

  • The transformative effect of training on ourselves and our peers. Lots of us have new superpowers!

Together, we’ve built the foundation for a new, open software platform.

A domain shift

We think it’s time to refocus the hackathons as sprints — purposefully producing a sustainable, long-lasting, high quality, open source software stack that we can all use and combine into new tools, whether open or proprietary, free or commercial.

We think it’s time to bring a full-featured unconference into the mix. The half-day ‘unsessions’ open too many paths, and leave too few explored. We need more time — to share research, plan software projects, and write code.

Together, we can launch a new era in scientific computing for the subsurface.

At the core of this new era core is a new open-source software stack, created, maintained, and implemented by a community of scientists and organizations passionate about its potential.

Sign up!

Here’s the plan. We’re hosting an unconference from 5 to 11 May 2019, with full days from Monday to Friday. The event will take place at the Château de Rosay, near Rouen, France. It will be fully residential and fully catered. We have room for about 45 participants.

The goal is to lay down a road map for designing, funding, and building an open source software stack for subsurface. In the coming days and weeks, we will formulate the plan for the week, with input from the Software Underground. We want to hear from you. Propose a session! Host a sprint! Offer a bounty! There are lots of ways to get involved.

Map data: GeoBasis-DE / BKG / Google, photo: Chateauform. Click to enlarge.

If you want to be part of this effort, as a developer, an end-user, or a sponsor, then we invite you to join us.

The unconference fee will be EUR 1000, and accommodation and food will be EUR 1500. The student fees will be EUR 240 and EUR 360. There will be at least 5 bursaries of EUR 1000 available.

For the time being, we will be accepting early commitments, with a deposit of EUR 400 to secure a place (students wishing to register now should get in touch). Soon, you will be able to sign up online… we are working on a smooth process. In the meantime, click here to register your interest, share ideas for content, or sign up by paying a deposit.

Thanks for reading. We look forward to figuring this out together.

I’m delighted to be able to announce that we already have support from Dell EMC. Thanks as ever to David Holmes for his willingness to fund experiments!

In the US or Canada? Don’t despair! There will be a North American edition in Quebec in late September.

The next thing

Over the last several years, Agile has been testing some of the new ways of collaborating, centered on digital connections:

  • It all started with this blog, which started in 2010 with my move from Calgary to Nova Scotia. It’s become a central part of my professional life, but we’re all about collaboration and blogs are almost entirely one-way, so…

  • In 2011 we launched SubSurfWiki. It didn’t really catch on, although it was a good basis for some other experiments and I still use it sometimes. Still, we realized we had to do more to connect the community, so…

  • In 2012 we launched our 52 Things collaborative, open access book series. There are well over 5000 of these out in the wild now, but it made us crave a real-life, face-to-face collaboration, so…

  • In 2013 we held the first ‘unsession’, a mini-unconference, at the Canada GeoConvention. Over 50 people came to chat about unsolved problems. We realized we needed a way to actually work on problems, so…

  • Later that year, we followed up with the first geoscience hackathon. Around 15 or so of us gathered in Houston for a weekend of coding and tacos. We realized that the community needed more coding skills, so…

  • In 2014 we started teaching a one-day Python course aimed squarely at geoscientists. We only teach with subsurface data and algorithms, and the course is now 5 days long. We now needed a way to connect all these new hackers and coders, so…

  • In 2014, together with Duncan Child, we also launched Software Underground, a chat room for discussing topics related to the earth and computers. Initially it was a Google Group but in 2015 we relaunched it as an open Slack team. We wanted to double down on scientific computing, so…

  • In 2015 and 2016 we launched a new web app, Pick This (returning soon!), and grew our bruges and welly open source Python projects. We also started building more machine learning projects, and getting really good at it.

Growing and honing

We have spent the recent years growing and honing these projects. The blog gets about 10,000 readers a month. The sixth 52 Things book is on its way. We held two public unsessions this year. The hackathons have now grown to 60 or so hackers, and have had about 400 participants in total, and five of them this year already (plus three to come!). We have also taught Python to 400 geoscientists, including 250 this year alone. And the Software Underground has over 1000 members.

In short, geoscience has gone digital, and we at Agile are grateful and excited to be part of it. At no point in my career have I been more optimistic and energized than I am right now.

So it’s time for the next thing.

The next thing is starting with a new kind of event. The first one is 5 to 11 May 2019, and it’s happening in France. I’ll tell you all about it tomorrow.

Reproducibility Zoo


The Repro Zoo was a new kind of event at the SEG Annual Meeting this year. The goal: to reproduce the results from well-known or important papers in GEOPHYSICS or The Leading Edge. By reproduce, we meant that the code and data should be open and accessible. By results, we meant equations, figures, and other scientific outcomes.

And some of the results are scary enough for Hallowe’en :)

What we did

All the work went straight into GitHub, mostly as Jupyter Notebooks. I had a vague goal of hitting 10 papers at the event, and we achieved this (just!). I’ve since added a couple of other papers, since the inspiration for the work came from the Zoo… and I haven’t been able to resist continuing.

The scene at the Repro Zoo. An air of quiet productivity hung over the booth. Yes, that is Sergey Fomel and Jon Claerbout. Thank you to David Holmes of Dell EMC for the picture.

The scene at the Repro Zoo. An air of quiet productivity hung over the booth. Yes, that is Sergey Fomel and Jon Claerbout. Thank you to David Holmes of Dell EMC for the picture.

Here’s what the Repro Zoo team got up to, in alphabetical order:

  • Aldridge (1990). The Berlage wavelet. GEOPHYSICS 55 (11). The wavelet itself, which has also been added to bruges.

  • Batzle & Wang (1992). Seismic properties of pore fluids. GEOPHYSICS 57 (11). The water properties, now added to bruges.

  • Claerbout et al. (2018). Data fitting with nonstationary statistics, Stanford. Translating code from FORTRAN to Python.

  • Claerbout (1975). Kolmogoroff spectral factorization. Thanks to Stewart Levin for this one.

  • Connolly (1999). Elastic impedance. The Leading Edge 18 (4). Using equations from bruges to reproduce figures.

  • Liner (2014). Long-wave elastic attentuation produced by horizontal layering. The Leading Edge 33 (6). This is the stuff about Backus averaging and negative Q.

  • Luo et al. (2002). Edge preserving smoothing and applications. The Leading Edge 21 (2).

  • Yilmaz (1987). Seismic data analysis, SEG. Okay, not the whole thing, but Sergey Fomel coded up a figure in Madagascar.

  • Partyka et al. (1999). Interpretational aspects of spectral decomposition in reservoir characterization.

  • Röth & Tarantola (1994). Neural networks and inversion of seismic data. Kudos to Brendon Hall for this implementation of a shallow neural net.

  • Taner et al. (1979). Complex trace analysis. GEOPHYSICS 44. Sarah Greer worked on this one.

  • Thomsen (1986). Weak elastic anisotropy. GEOPHYSICS 51 (10). Reproducing figures, again using equations from bruges.

As an example of what we got up to, here’s Figure 14 from Batzle & Wang’s landmark 1992 paper on the seismic properties of pore fluids. My version (middle, and in red on the right) is slightly different from that of Batzle and Wang. They don’t give a numerical example in their paper, so it’s hard to know where the error is. Of course, my first assumption is that it’s my error, but this is the problem with research that does not include code or reference numerical examples.

Figure 14 from Batzle & Wang (1992). Left: the original figure. Middle: My attempt to reproduce it. Right: My attempt in red, overlain on the original.

This was certainly not the only discrepancy. Most papers don’t provide the code or data to reproduce their figures, and this is a well-known problem that the SEG is starting to address. But most also don’t provide worked examples, so the reader is left to guess the parameters that were used, or to eyeball results from a figure. Are we really OK with assuming the results from all the thousands of papers in GEOPHYSICS and The Leading Edge are correct? There’s a long conversation to have here.

What next?

One thing we struggled with was capturing all the ideas. Some are on our events portal. The GitHub repo also points to some other sources of ideas. And there was the Big Giant Whiteboard (below). Either way, there’s plenty to do (there are thousands of papers!) and I hope the zoo continues in spirit. I will take pull requests until the end of the year, and I don’t see why we can’t add more papers until then. At that point, we can start a 2019 repo, or move the project to the SEG Wiki, or consider our other options. Ideas welcome!


Thank you!

The following people and organizations deserve accolades for their dedication to the idea and hard work making it a reality. Please give them a hug or a high five when you see them.

  • David Holmes (Dell EMC) and Chance Sanger worked their tails off on the booth over the weekend, as well as having the neighbouring Dell EMC booth to worry about. David also sourced the amazing Dell tech we had at the booth, just in case anyone needed 128GB of RAM and an NVIDIA P5200 graphics card for their Jupyter Notebook. (The lights in the convention centre actually dimmed when we powered up our booths in the morning.)

  • Luke Decker (UT Austin) organized a corps of volunteer Zookeepers to help manage the booth, and provided enthusiasm and coding skills. Karl Schleicher (UT Austin), Sarah Greer (MIT), and several others were part of this effort.

  • Andrew Geary (SEG) for keeping things moving along when I became delinquent over the summer. Lots of others at SEG also helped, mainly with the booth: Trisha DeLozier, Rebecca Hayes, and Beth Donica all contributed.

  • Diego Castañeda got the events site in shape to support the Repro Zoo, with a dashboard showing the latest commits and contributors.

Café con leche

At the weekend, 28 digital geoscientists gathered at MAZ Café in Santa Ana, California, to sprint on some open geophysics software projects. Teams and individuals pushed pull requests — code contributions to open source projects — left, right, and centre. Meanwhile, Senah and her team at MAZ kept us plied with coffee and horchata, with fantastic food on the side.

Because people were helping each other and contributing where they could, I found it a bit hard to stay on top of what everyone was working on. But here are some of the things I heard at the project breakdown on Sunday afternoon:

Gerard Gorman, Navjot Kukreja, Fabio Luporini, Mathias Louboutin, and Philipp Witte, all from the devito project, continued their work to bring Kubernetes cluster management to devito. Trying to balance ease of use and unlimited compute turns out to be A Hard Problem! They also supported the other teams hacking on devito.

Thibaut Astic (UBC) worked on implementing DC resistivity models in devito. He said he enjoyed the expressiveness of devito’s symbolic equation definitions, but that there were some challenges with implementing the grad, div, and curl operator matrices for EM.

Vitor Mickus and Lucas Cavalcante (Campinas) continued their work implementing a CUDA framework for devito. Again, all part of the devito project trying to give scientists easy ways to scale to production-scale datasets.

That wasn’t all for devito. Alongside all these projects, Stephen Alwon worked on adapting segyio to read shot records, Robert Walker worked on poro-elastic models for devito, and Mohammed Yadecuri and Justin Clark (California Resources) contributed too. On the second day, the devito team was joined by Felix Hermann (now Georgia Tech), with Mengmeng Yang, and Ali Siakoohi (both UBC). Clearly there’s something to this technology!

Brendon Hall and Ben Lasscock (Enthought) hacked on an open data portal concept, based on the UCI Machine Learning Repository, coincidentally based just down the road from our location. The team successfully got some examples of open data and code snippets working.

Jesper Dramsch (Heriot-Watt), Matteo Niccoli (MyCarta), Yuriy Ivanov (NTNU) and Adriana Gordon and Volodymyr Vragov (U Calgary), hacked on bruges for the weekend, mostly on its documentation and the example notebooks in the in-bruges project. Yuriy got started on a ray-tracing code for us.

Nathan Jones (California Resources) and Vegard Hagen (NTNU) did some great hacking on an interactive plotting framework for geoscience data, based on Altair. What they did looked really polished and will definitely come in useful at future hackathons.

All in all, an amazing array of projects!

This event was low-key compared to recent hackathons, and I enjoyed the slightly more relaxed atmosphere. The venue was also incredibly supportive, making my life very easy.

A big thank you as always to our sponsors, Dell EMC and Enthought. The presence of the irrepressible David Holmes and Chris Lenzsch (both Dell EMC), and Enthought’s new VP of Energy, Charlie Cosad, was greatly appreciated.


We will definitely be revisiting the sprint concept in the future einmal ist keinmal, as they say. Devito and bruges both got a boost from the weekend, and I think all the developers did too. So stay tuned for the next edition!

Volve: not open after all

Back in June, Equinor made the bold and exciting decision to release all its data from the decommissioned Volve oil field in the North Sea. Although the intent of the release seemed clear, the dataset did not carry a license of any kind. Since you cannot use unlicensed content without permission, this was a problem. I wrote about this at the time.

To its credit, Equinor listened to the concerns from me and others, and considered its options. Sensibly, it chose an off-the-shelf license. It announced its decision a few days ago, and the dataset now carries a Creative Commons Attribution-NonCommercial-ShareAlike license.

Unfortunately, this license is not ‘open’ by any reasonable definition. The non-commercial stipulation means that a lot of people, perhaps most people, will not be able to legally use the data (which is why non-commercial licenses are not open licenses). And the ShareAlike part means that we’re in for some interesting discussion about what derived products are, because any work based on Volve will have to carry the CC BY-NC-SA license too.

Non-commercial licenses are not open

Here are some of the problems with the non-commercial clause:

NC licenses come at a high societal cost: they provide a broad protection for the copyright owner, but strongly limit the potential for re-use, collaboration and sharing in ways unexpected by many users

  • NC licenses are incompatible with CC-BY-SA. This means that the data cannot be used on Wikipedia, SEG Wiki, or AAPG Wiki, or in any openly licensed work carrying that license.

  • NC-licensed data cannot be used commercially. This is obvious, but far-reaching. It means, for example, that nobody can use the data in a course or event for which they charge a fee. It means nobody can use the data as a demo or training data in commercial software. It means nobody can use the data in a book that they sell.

  • The boundaries of the license are unclear. It's arguable whether any business can use the data for any purpose at all, because many of the boundaries of the scope have not been tested legally. What about a course run by AAPG or SEG? What about a private university? What about a government, if it stands to realize monetary gain from, say, a land sale? All of these uses would be illiegal, because it’s the use that matters, not the commercial status of the user.

Now, it seems likely, given the language around the release, that Equinor will not sue people for most of these use cases. They may even say this. Goodness knows, we have enough nudge-nudge-wink-wink agreements like that already in the world of subsurface data. But these arrangements just shift the onus onto the end use and, as we’ve seen with GSI, things can change and one day you wake up with lawsuits.

ShareAlike means you must share too

Creative Commons licenses are, as the name suggests, intended for works of creativity. Indeed, the whole concept of copyright, depends on creativity: copyright protects works of creative expression. If there’s no creativity, there’s no basis for copyright. So for example, a gamma-ray log is unlikely to be copyrightable, but seismic data is (follow the GSI link above to find out why). Non-copyrightable works are not covered by Creative Commons licenses.

All of which is just to help explain some of the language in the CC BY-NC-SA license agreement, which you should read. But the key part is in paragraph 4(b):

You may distribute, publicly display, publicly perform, or publicly digitally perform a Derivative Work only under the terms of this License

What’s a ‘derivative work’? It’s anything ‘based upon’ the licensed material, which is pretty vague and therefore all-encompassing. In short, if you use or show Volve data in your work, no matter how non-commercial it is, then you must attach a CC BY-NC-SA license to your work. This is why SA licenses are sometimes called ‘viral’.

By the way, the much-loved F3 and Penobscot datasets also carry the ShareAlike clause, so any work (e.g. a scientific paper) that uses them is open-access and carries the CC BY-SA license, whether the author of that work likes it or not. I’m pretty sure no-one in academic publishing knows this.

By the way again, everything in Wikipedia is CC BY-SA too. Maybe go and check your papers and presentations now :)


What should Equinor do?

My impression is that Equinor is trying to satisfy some business partner or legal edge case, but they are forgetting that they have orders of magnitude more capacity to deal with edge cases than the potential users of the dataset do. The principle at work here should be “Don’t solve problems you don’t have”.

Encumbering this amazing dataset with such tight restrictions effectively kills it. It more or less guarantees it cannot have the impact I assume they were looking for. I hope they reconsider their options. The best choice for any open data is CC-BY.

Reproduce this!


There’s a saying in programming: untested code is broken code. Is unreproducible science broken science?

I hope not, because geophysical research is — in general — not reproducible. In other words, we have no way of checking the results. Some of it, hopefully not a lot of it, could be broken. We have no way of knowing.

Next week, at the SEG Annual Meeting, we plan to change that. Well, start changing it… it’s going to take a while to get to all of it. For now we’ll be content with starting.

We’re going to make geophysical research reproducible again!

Welcome to the Repro Zoo!

If you’re coming to SEG in Anaheim next week, you are hereby invited to join us in Exposition Hall A, Booth #749.

We’ll be finding papers and figures to reproduce, equations to implement, and data tables to digitize. We’ll be hunting down datasets, recreating plots, and dissecting derivations. All of it will be done in the open, and all the results will be public and free for the community to use.

You can help

There are thousands of unreproducible papers in the geophysical literature, so we are going to need your help. If you’ll be in Anaheim, and even if you’re not, here some things you can do:

That’s all there is to it! Whether you’re a coder or an interpreter, whether you have half an hour or half a day, come along to the Repro Zoo and we’ll get you started.

Figure 1 from Connolly’s classic paper on elastic impedance. This is the kind of thing we’ll be reproducing.

Figure 1 from Connolly’s classic paper on elastic impedance. This is the kind of thing we’ll be reproducing.

FORCE ML Hackathon: project round-up

The FORCE Machine Learning Hackathon last week generated hundreds of new relationships and nine new projects, including seven new open source tools. Here’s the full run-down, in no particular order…

Predicting well rates in real time

Team Virtual Flow Metering: Nils Barlaug, Trygve Karper, Stian Laagstad, Erlend Vollset (all from Cognite) and Emil Hansen (AkerBP).

Tech: Cognite Data Platform, scikit-learn. GitHub repo.

Project: An engineer from AkerBP brought a problem: testing the rate from a well reduces the pressure and therefore reduces the production rate for a short time, costing about $10k per day. His team investigated whether they could instead predict the rate from other known variables, thereby reducing the number of expensive tests.

This project won the Most Commercial Potential award.

The predicted flow rate (blue) compared to the true flow rate (orange). The team used various models, from multilinear regression to boosted trees.

Reinforcement learning tackles interpretation

Team Gully Attack: Steve Purves, Eirik Larsen, JB Bonas (all Earth Analytics), Aina Bugge (Kalkulo), Thormod Myrvang (NTNU), Peder Aursand (AkerBP).

Tech: keras-rl. GitHub repo.

Project: Deep reinforcement learning has proven adept at learning, and winning, games, and at other tasks including image segmentation. The team tried training an agent to pick these channels in the Parihaka 3D, as well as some other automatic interpretation approaches.

The agent learned something, but in the end it did not prevail. The team learned lots, and did prevail!

This project won the Most Creative Idea award.

Early in training, the learning agent wanders around the image (top left). After an hour of training, the agent tends to stick to the gullies (right).

A new kind of AVO crossplot?

Team ASAP: Per Avseth (Dig), Lucy MacGregor (Rock Solid Images), Lukas Mosser (Imperial), Sandeep Shelke (Emerson), Anders Draege (Equinor), Jostein Heredsvela (DEA), Alessandro Amato del Monte (ENI).

Tech: t-SNE, UMAP, VAE. GitHub repo.

Project: If you were trying to come up with a new approach to AVO analysis, these are the scientists you’d look for. The idea was to reduce the dimensionality of the input traces — using first t-SNE and UMAP then a VAE. This resulted in a new 2-space in which interesting clusters could be probed, chiefly by processing synthetics with known variations (e.g. in thickness or porosity).

This project won the Best In Show award. Look out for the developments that come from this work!

Top: Illustration of the variational autoencoder, which reduces the input data (top left) into some abstract representation — a crossplot, essentially (top middle) — and can also reconstruct the data, but without the features that did not discriminate between the datasets, effectively reducing noise (top right).

The lower image shows the interpreted crossplot (left) and the implied distribution of rock properties (right).

Acquiring seismic with crayons

Team: Jesper Dramsch (Technical University of Denmark), Thilo Wrona (University of Bergen), Victor Aare (Schlumberger), Arno Lettman (DEA), Alf Veland (NPD).

Tech: pix2pix GAN (TensorFlow). GitHub repo.

Project: Not everything tht looks like a toy is a toy. The team spent a few hours drawing cartoons of small seismic sections, then re-trained the pix2pix GAN on them. The result — an app (try it!) that turns sketches into seismic!

This project won the People’s Choice award.

A sketch of a salt diapir penetrating geological layers (left) and the inferred seismic expression, generated by the neural network. In principal, the model could also be trained to work in the other direction.

A sketch of a salt diapir penetrating geological layers (left) and the inferred seismic expression, generated by the neural network. In principal, the model could also be trained to work in the other direction.

Extracting show depths and confidence from PDFs

Team: Florian Basier (Emerson), Jesse Lord (Kadme), Chris Olsen (ConocoPhillips), Anne Estoppey (student), Kaouther Hadji (Accenture).

Project: A couple of decades ago, the last great digital revolution gave us PDFs. Lots of PDFs. But these pseudodigital documents still need to be wrangled into Proper Data. This team took on that project, trying in particular to extract both the depth of a show, and the confidence in its identification, from well reports.

This project won the Best Presentation award.

Kaouther Hadji (left), Florian Basier, Jesse Lord, and Anne Estoppey (right).

Kaouther Hadji (left), Florian Basier, Jesse Lord, and Anne Estoppey (right).

Grain size and structure from core images

Team: Eirik Time, Xiaopeng Liao, Fahad Dilib (all Equinor), Nathan Jones (California Resource Corp), Steve Braun (ExxonMobil), Silje Moeller (Cegal).

Tech: sklearn, skimage, fast.ai. GitHub repo.

Project: One of the many teams composed of professionals from all over the industry — it’s amazing to see this kind of collaboration. The team did a great job of breaking the problem down, going after what they could and getting some decent results. An epic task, but so many interesting avenues — we need more teams on these problems!

The pipeline was as ambitious as it looks. But this is a hard problem that will take some time to get good at. Kudos to this team for starting to dig into it and for making amazing progress in just 2 days.

Learning geological age from bugs

Team: David Wade (Equinor), Per Olav Svendsen (Equinor), Bjoern Harald Fotland (Schlumberger), Tore Aadland (University of Bergen), Christopher Rege (Cegal).

Tech: scikit-learn (random forest). GitHub repo.

Project: The team used DEX files from five wells from the recently released Volve dataset from Equinor. The goal was to learn to predict geological age from biostratigraphic species counts. They made substantial progress — and highlighted what a great resource Volve will be as the community explores it and publishes results like these.

David Wade and Per Olav Svendsen of Equinor (top), and some results (bottom)

Lost in 4D space!

Team: Andres Hatloey, Doug Hakkarinen, Mike Brhlik (all ConocoPhillips), Espen Knudsen, Raul Kist, Robin Chalmers (all Cegal), Einar Kjos (AkerBP).

Tech: scikit-learn (random forest regressor). GitHub repo.

Project: Another cross-industry collaboration. In their own words, the team set out to “identify trends between 4D seismic and well measurements in order to calculate reservoir pressures and/or thickness between well control”. They were motivated by real data from Valhall, and did a great job making sense of a lot of real-world data. One nice innovation: using the seismic quality as a weighting factor to try to understand the role of uncertainty. See the team’s presentation.


Clustering reveals patterns in 4D maps

Team: Tetyana Kholodna, Simon Stavland, Nithya Mohan, Saktipada Maity, Jone Kristoffersen Bakkevig (all CapGemini), Reidar Devold Midtun (ConocoPhillips).

Project: The team worked on real 4D data from an operating field. Reidar provided a lot of maps computed with multiple seismic attributes. Groups of maps represent different reservoir layers, and thirteen different time-lapse acquisitions. So… a lot of maps. The team attempted to correlate 4D effects across all of these dimensions — attributes, layers, and production time. Reidar, the only geoscientist on a team of data scientists, also provided one of the quotes of the hackathon: “I’m the geophysicist, and I represent the problem”.


That’s it for the FORCE Hackathon for 2018. I daresay there may be more in the coming months and years. If they can build on what we started last week, I think more remarkable things are on the way!


One more thing…

I mentioned the UK hackathons last time, but I went and forgot to include the links to the events. So here they are again, in case you couldn’t find them online…

What are you waiting for? Get signed up and tell your friends!

Machine learning goes mainstream

At our first machine-learning-themed hackathon, in New Orleans in 2015, we had fifteen hackers. TImes were hard in the industry. Few were willing or able to compe out and play. Well, it’s now clear that times have changed! After two epic ML hacks last year (in Paris and Houston), at which we hosted about 115 scientists, it’s clear this year is continuing the trend. Indeed, by the end of 2018 we expect to have welcomed at least 240 more digital scientists to hackathons in the US and Europe.

Conclusion: something remarkable is happening in our field.

The FORCE hackathon

Last Tuesday and Wednesday, Agile co-organized the FORCE Machine Learning Hackathon in Stavanger, Norway. FORCE is a cross-industry geoscience organization, coordinating meetings and research in subsurface. The event preceeded a 1-day symposium on the same theme: machine learning in geoscience. And it was spectacular.

Get a flavour of the spectacularness in Alessandro Amato’s beautiful photographs:

Fifty geoscientists and engineers spent two days at the Norwegian Petroleum Directorate (NPD) in Stavanger. Our hosts were welcoming, accommodating, and generous with the waffles. As usual, we gently nudged the participants into teams, and encouraged them to define projects and find data to work on. It always amazes me how smoothly this potentially daunting task goes; I think this says something about the purposefulness and resourcefulness of our community.

Here’s a quick run-down of the projects:

  • Biostrat! Geological ages from species counts.

  • Lost in 4D Space. Pressure drawdown prediction.

  • Virtual Metering. Predicting wellhead pressure in real time.

  • 300 Wells. Extracting shows and uncertainty from well reports.

  • AVO ML. Unsupervised machine learning for more geological AVO.

  • Core Images. Grain size and lithology from core photos.

  • 4D Layers. Classification engine for 4D seismic data.

  • Gully Attack. Strat trap picking with deep reinforcement learning.

  • sketch2seis. Turning geological cartoons into seismic with pix2pix.

I will do a complete review of the projects in the coming few days, but notice the diversity here. Five of the projects straddle geological topics, and five are geophysical. Two or three involve petroleum engineering issues, while two or three move into sed/strat. We saw natural language processing. We saw random forests. We saw GANs, VAEs, and deep reinforcement learning. In terms of input data, we saw core photos, PDF reports, synthetic seismograms, real-time production data, and hastily assembled label sets. In short — we saw everything.

Takk skal du ha

Many thanks to everyone that helped the event come together:

  • Peter Bormann, the mastermind behind the symposium, was instrumental in making the hackathon happen.

  • Grete Block Vargle (AkerBP) and Pernille Hammernes (Equinor) kept everyone organized and inspired.

  • Tone Helene Mydland (NPD) and Soelvi Amundrud (NPD) made sure everything was logistically honed.

  • Eva Halland (NPD) supported the event throughout and helped with the judging.

  • Alessandro Amato del Monte (Eni) took some fantastic photos — as seen in this post.

  • Diego Castaneda and Rob Leckenby helped me on the Agile side of things, and helped several teams.

And a huge thank you to the sponsors of the event — too many to name, but here they all are:


There’s more to come!

If you’re reading this thinking, “I’d love to go to a geoscience hackathon”, and you happen to live in or near the UK, you’re in luck! There are two machine learning geoscience hackathons coming up this fall:

Don’t miss out! Get signed up and we’ll see you there.