Results from the AAPG Machine Learning Unsession

Click here to visit the Google Doc write-up


Back in May, I co-hosted a different kind of conference session — an 'unsession' — at the AAPG Annual Conference and Exhibition in Salt Lake City, Utah. It was successful in achieving its main goal, which was to show the geoscience community and AAPG organizers a new way of collaborating, networking, and producing tangible outcomes from conference sessions.

It also succeeded in drawing out hundreds of ideas and questions around machine learning in geoscience. We have now combed over what the 120 people (roughly) produced on that afternoon, written it up in a Google Doc (right), and present some highlights right here in this post.

Click here to visit the Flickr photo album.


The unsession had three phases:

  1. Exploring current and future skills for geoscientists.

  2. Asking about the big questions in machine learning in geoscience.

  3. Digging into some of those questions.

Let's look at each one in turn.



Current and future skills

As an icebreaker, we asked everyone to list three skills they have that set them apart from others in their teams or organizations — their superpowers, if you will. They wrote these on green Post-It notes. We also asked for three more skills they didn't have today, but wanted to acquire in the next decade or so. These went on orange Post-Its. We were especially interested in those skills that felt intimidating or urgent. The 8 or 10 people at each table then shared these with each other, by way of introducing themselves.

The skills are listed in this Google Sheets document.

Unsurprisingly, the most common 'skills I have' were around geoscience: seismic interpretation, seismic analysis, stratigraphy, engineering, modeling, sedimentology, petrophysics, and programming. And computational methods dominated the 'skills I want' category: machine learning, Python, coding or programming, deep learning, statistics, and mathematics.

We followed this up with a more general question: How would you rate the industry's preparedness for this picture of the future, as implied by the skill gap we've identified? People could replace 'industry' with whatever similar-scale institution felt meaningful to them. As shown (right), this resulted in a bimodal distribution: apparently there are two ways to think about the future of applied geoscience. This may merit further investigation with a more thorough survey.

Get the histogram data.

[Figure: histogram of responses to the preparedness question]

Big questions in ML

After the icebreaker, we asked the tables to respond to a big question:

What are the most pressing questions in applied geoscience that can probably be tackled with machine learning?

We realized that this sounds a bit 'hammer looking for a nail', but justified asking the question this way by drawing an analogy with other important new tools of the past, such as well logging, 3D seismic, or sequence stratigraphy. The point is that we have this powerful new (to us) set of tools; what are we going to look at first? At this point, we wanted people to brainstorm, without applying constraints like time or money.

This yielded approximately 280 ideas, all documented in the Google Sheet. Once the problems had been captured, the tables rotated so that each team walked to a neighboring table, leaving all their problems behind... and adopting new ones. We then asked them to score the new problems on two axes: scope (local vs global problems) and tractability (easy vs hard problems). This provided the basis for each table to choose one problem to take to the room for voting (each person had 9 votes to cast). This filtering process resulted in the following list:

  1. How do we communicate error and uncertainty when using machine learning models and solutions? 85 votes.

  2. How do we account for data integration, integrity, and provenance in our models? 78 votes.

  3. How do we revamp the geoscience curriculum for future geoscientists? 71 votes.

  4. What does guided, searchable, legacy data integration look like? 68 votes.

  5. How can machine learning improve seismic data quality, or provide assistive technology on poor data? 65 votes.

  6. How does the interpretability of machine learning model predictions affect their acceptance? 54 votes.

  7. How do we train a model to assign value to prospects? 51 votes.

  8. How do we teach artificial intelligences foundational geology? 45 votes.

  9. How can we implement automatic core description? 42 votes.

  10. How can we contain bad uses of AI? 40 votes.

  11. Is self-steering well drilling possible? 21 votes.

I am paraphrasing most of those, but you can read the originals in the Google Sheet data harvest.
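
For anyone who wants to play with these tallies, here is a minimal Python sketch. The vote counts are copied from the list above; the short labels, the variable names, and the idea of estimating a minimum number of voters are illustrative assumptions, not part of the original data harvest.

```python
import math

# Vote counts copied from the ranked list above.
# The short labels are paraphrases, not the original wording.
votes = {
    "Communicating error and uncertainty": 85,
    "Data integration, integrity, and provenance": 78,
    "Revamping the geoscience curriculum": 71,
    "Guided, searchable legacy data integration": 68,
    "Improving seismic data quality": 65,
    "Interpretability and acceptance": 54,
    "Assigning value to prospects": 51,
    "Teaching AIs foundational geology": 45,
    "Automatic core description": 42,
    "Containing bad uses of AI": 40,
    "Self-steering well drilling": 21,
}

VOTES_PER_PERSON = 9  # from the text: each person had 9 votes to cast

total_votes = sum(votes.values())

# Rank by votes, highest first, and keep the top 6 that went on to the final stage.
ranked = sorted(votes.items(), key=lambda item: item[1], reverse=True)
top_six = [label for label, _ in ranked[:6]]

# Share of all recorded votes for each question.
share = {label: count / total_votes for label, count in votes.items()}

# Assumption: nobody cast more than 9 votes, so this is only a
# lower bound on the number of people who voted.
min_voters = math.ceil(total_votes / VOTES_PER_PERSON)

print(f"Total votes: {total_votes}")
print(f"At least {min_voters} voters")
for label, count in ranked:
    print(f"{count:3d}  {share[label]:5.1%}  {label}")
```

The listed counts sum to 620 votes, which implies at least 69 voters if nobody exceeded their nine votes. It says nothing about how many people abstained or left early.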


Exploring the questions

In the final stage of the afternoon, we took the top 6 questions from the list above, and dug into them a little deeper. Tables picked their way through our Solution Sketchpads — especially updated for machine learning problems — to help them navigate the problems. Clearly, these questions were too enormous to make much progress in the hour or so left in the day, but the point here was to sound out some ideas, identify some possible actions, and connect with others interested in working on the problem.

One of the solution sketches is shown here (right), for the Revamp the geoscience curriculum problem. They discussed the problem animatedly for an hour.

This team included — among others — an academic geostatistician, an industry geostatistician, a PhD student, a DOE geophysicist, an SEC geologist, and a young machine learning brainbox. Amazingly, this kind of diversity was typical of the tables.

See the rest of the solution sketches in Flickr.


That's it! Many thanks to Evan Bianco for the labour of capturing and digitizing the data from the event. Thanks also to AAPG for the great photos, and for granting them an open license. And thank you to my co-chairs Brendon Hall and Yan Zaretskiy of Enthought, and all the other folks who helped make the event happen — see the Productive chaos post for details.

To dig deeper, look for the complete write-up in Google Docs, and the photos in Flickr.



Just a reminder... if it's Python and machine learning skills you want, we're running a Summer School in downtown Houston the week of 13 August. Come along and get your hands on the latest in geocomputing methods. Suitable for beginners or intermediate programmers.

Don't miss out! Find out more or register now.

Productive chaos

Wednesday was a good day.

Over 150 participants came to Room 251 for all or part of the first 'unsession' at the AAPG Annual Conference and Exhibition in Salt Lake City. I was one of the hosts of the event, and emceed the afternoon.

In a nutshell, it was awesome. I have facilitated unsessions before, but this event was on a new scale. Twelve tables of 8–10 seats — covered in sticky notes, stickers, coloured pens, and large sheets of paper — quickly filled up. Together, we burned about 10 person-weeks of human productivity, raising the temperature in the room by several degrees in the process.

Diversity means good conversation

On the way in, people self-identified as mostly software (blue name tags) or mostly soft rocks (red), as a non-serious way to get a handle on how many data scientists we had vs how many people are focused on the rocks themselves — without, I hope, any kind of value judgment. The ratio was about 1:2.

As people continued to drift in, we counted people identifying with various categories, to get a very rough idea of who was in the room. The results are shown here. In addition, I counted 24 women present at the start. Part of the point here is to introduce participants to each other, but there's another purpose too. AAPG, like many scientific organizations, is grappling with diversity today. Like others, it needs to do much better. A small part of the solution is, I think, to name it and measure how we're doing at every opportunity. It's one way to pay more attention.

Harder to capture is the profound level of job diversity. People responsible for billion-dollar budgets sat with graduate students, AAPG medal winners with SEC executives. We even had a venture capitalist and a physician.

Look at all these lovely people:

Tangible and intangible output

At the start of the session, I told the room I wanted to fill the walls with things we made — with data. We easily achieved this, producing a survey of the skills geoscientists will need in the future, hundreds of high-value machine learning tasks in geoscience, a ranked list of the most interesting of these, and even some problem analysis of some of them. None of this was definitive, but I hope it will provide grist for the mill of future conversations about machine learning in geoscience.

As well as these tangible products, each person in the room walked away with new connections and new ideas — about machine learning, about collaboration, and about what scientific meetings can be like.

Acknowledgments

A lot of people contributed to making this event happen.

My unsession co-chairs, Brendon Hall and Yan Zaretskiy of Enthought, spent several hours on the phone with me over the last few weeks, shaping the content and flow of an event that was a bit, er, fuzzy.

We seeded the tables with some of the Software Underground crowd who were in town for the hackathon and AAPG. This ensures that there's no failure case: twelve people are definitely coming. And in the unlikely event that 100 people come, there are twelve allies to manage some of the chaos. Heartfelt thanks to the table hosts:

  • Didi Ooi of the University of Bristol
  • Graham Ganssle of Expero
  • Lisa Stright of Colorado State University
  • Thomas Martin of Colorado School of Mines
  • Tom Creech of ExxonMobil
  • David Holmes of Dell EMC
  • Steve Purves of Euclidity
  • Diego Castaneda of Agile
  • Evan Bianco of Agile

Jenny Cole of SEG came along to observe the session and I appreciated her enthusiastic help as it became clear we were in for more than the usual amount of entropy in the room. Theresa Curry of AAPG did an amazing job getting the venue set up, providing refreshments, and ensuring the photographers were there to capture some of the action. The ACE 2018 organizing committee, especially Zane Jobe and Lauren Birgenheier, did their part by agreeing to support including such a weird-sounding thing in the program.

Finally, thank you to the 100+ scientists who came to the event, not knowing at all what to expect. It was a privilege to receive your enthusiastic participation and thoughtful contributions. Let's do it again some time!


We will digitize the ideas and products of the unsession over the coming weeks. They will be released under an open license. Watch this space for updates.

If you're interested in the methodology we use for these events, check out Proceedings of an unsession in CSEG Recorder, November 2013. If you'd like help running an event like this, get in touch.

An invitation to start something

Most sessions at your average conference are about results — the conclusions and insights from completed research projects. But at AAPG this year, there's another kind of session, about beginnings. The 'Unsession' is back!

   Machine Learning Unsession
   Room 251 B/C, 1:15 pm, Wednesday 23 May

The topic is machine learning in geoscience. My hope is that there's a lot of emphasis on geological problems, especially in stratigraphy. But we don't know exactly where it will go, because the participants themselves will determine the topic and direction of the session.

Importantly, most of the session will not involve technical discussion. It's not a computational geology session. It's a session for everyone — we absolutely need input from anyone who's interested in how computers can help us do geoscience.

What to expect

Echoing our previous unconference-style sessions, here's the vibe my co-hosts (Brendon Hall and Yan Zaretskiy of Enthought) and I are going for:

  • Conferences are too one-way, too passive. We want more action, more tangible outcomes.
  • We want open, honest, inclusive conversations about our science, and our technical challenges. Bring your most courageous, opinionated, candid self. The stuff you’re scared to mention, or you’d normally only talk about over a beer? Bring that.
  • Listen with an open mind. The minute you think you’re right, you’ve checked out of the conversation.
  • Whoever shows up — they are the right people. (This is a rule of Open Space Technology.)
  • What happens is the only thing that could have happened. (This is a rule of Open Space Technology.)
  • There is no finish line; when it's over, it's over.
  • What we are doing is not definitive. It's just a thing that we're doing.

The session is an experiment. Failure is most definitely an option, just the least desirable one. Conversely, perfection is the least likely outcome.

If you're going to AAPG this year, I hope you'll come along to this conversation. Bring a friend!


Here's a reminder of the very first Unsession that Evan and I facilitated, way back in 2013. Argh, that's 5 years ago...

Unsolved problems in applied geoscience

I like unsolved problems. I first wrote about them way back in late 2010 — Unsolved problems was the eleventh post on this blog. I touched on the theme again in 2013, before and after the first 'unsession' at the GeoConvention, which itself was dedicated to finding the most pressing questions in exploration geoscience. As we turn towards the unsession at AAPG in Salt Lake City in May, I find myself thinking again about unsolved problems. Specifically, what are they? How can we find them? And what can we do to make them easier to solve?

It turns out lots of people have asked these questions before.


I've compiled a list of various attempts by geoscientists to list the big questions in the field. The only one I was previously aware of was Milo Backus's challenges in applied seismic geophysics, laid out in his president's column in GEOPHYSICS in 1980 and highlighted later by Larry Lines as part of the SEG's 75th anniversary. Here are some notable attempts:

  • John William Dawson, 1883 — Nova Scotia's most famous geologist listed unsolved problems in geology in his presidential address to the American Association for the Advancement of Science. They included the Cambrian Explosion, and the origin of the Antarctic icecap. 
  • Leason Heberling Adams, 1947 — One of the first experimental rock physicists, Adams made the first list I can find in geophysics, which was less than 30 years old at the time. He included the origin of the geomagnetic field, and the temperature of the earth's interior.
  • Milo Backus, 1980 — The list included direct hydrocarbon detection, seismic imaging, attenuation, and anisotropy.  
  • Mary Lou Zoback, 2000 — In her presidential address to the GSA, Zoback kept things quite high-level, asking questions about finding signal in dynamic systems, defining mass flux and energy balance, identifying feedback loops, and communicating uncertainty and risk. This last one pops up in almost every list since.
  • Calgary's geoscience community, 2013 — The 2013 unsession unearthed a list of questions from about 50 geoscientists. They included: open data, improving seismic resolution, dealing with error and uncertainty, and global water management.
  • Daniel Garcia-Castellanos, 2014 — The Retos Terrícolas blog listed 49 problems in 7 categories, ranging from the early solar system to the earth's interior, plate tectonics, oceans, and climate. The list is still maintained by Daniel and pops up occasionally on other blogs and on Wikipedia.

The list continues; you can see them all in this presentation I made for a talk (online) at the Bureau of Economic Geology last week (thank you to Sergey Fomel for hosting me!). During the talk, I took the opportunity to ask those present what their unsolved problems are, especially the ones in their own fields. Here are a few of the responses we got (the rest are in the preso):

[Image: responses to 'What are the biggest unsolved problems in your field?']

What are your unsolved problems in applied geoscience? Share them in the comments!


If you have about 50 minutes to spare, you can watch the talk here, courtesy of BEG's streaming service.

Click here to watch the talk >>>

Free the (seismic) data!

Yesterday afternoon Evan and I hosted the second unsession at the GeoConvention in Calgary. After last year's unsession exposed 'Free the data' as one of the unsolved problems in subsurface geoscience, we elected to explore this idea further. And we're addicted to this kind of guided, recorded conversation.

Attendance was a little thin, but those who came spent the afternoon deep in conversation about open data, open software, and greater industry transparency. And we unearthed an exciting and potentially epic conclusion that I hope leads to a small revolution.

What happened?

Rather than leaving the floor completely open, we again brought some structure to the proceedings. I'll post the full version to the wiki page, but here's the overview:

  1. Group seismic interpretation: 5 interpreters in 5 minutes.
  2. Stories about openness: which of 26 short stories resonate with you most?
  3. Open/closed, accessible/inaccessible: a scorecard for petroleum geoscience.
  4. Where are the opportunities? What should we move from closed to open?

As you might expect, the last part was the real point. We wanted to find some high-value areas to poke, or at least gather evidence around. And one area—one data type—was identified as being (a) closed and inaccessible in Canada and (b) much more impactful if it were open and accessible. I gave the punchline away in the title, but that data type is seismic data.

Open, public seismic data is much too juicy a topic to do justice to in this post, so stay tuned for a review of some of the specifics of how that conversation went. Meanwhile, imagine a world with free, public seismic data...

Reflections on the 2nd edition

The afternoon went well, and the outcome was intriguing, but we were definitely disappointed by the turnout. We have multiple working hypotheses about it...

  • There may not be a strong appetite for this sort of session, especially on a 'soft' topic. Next time: seismic resolution?
  • The first day might not be the best time for it, because people are still in the mood for talks. Next time: Wednesday morning?
  • The programme may not have reflected what the unsession was about, and the time was unclear. Next time: More visibility.
  • Three hours may be too much to ask from people, though you could say the same about any other session here.

We'd love to hear your thoughts too... Are we barking up completely the wrong tree? Does our community even want to have these conversations? Should we try again in 2015?