TRANSFORM happened!

transform_sticker.jpg

How do you describe the indescribable?

Last week, Agile hosted the TRANSFORM unconference in Normandy, France. We were there to talk about the open subsurface stack: the collection of open-source Python tools for earth scientists. We also spent time on the state of the Software Underground, a global community of practice for digital subsurface scientists and engineers. In effect, this was the first annual Software Underground conference. This was SwungCon 1.

The space

I knew the Château de Rosay was going to be nice. I hoped it was going to be very nice. But it wasn’t either of those things. It exceeded expectations by such a large margin, it seemed a little… indulgent. Excessive, even. And yet it was cheaper than a Hilton, and you couldn’t imagine a more perfect place to think and talk about the future of open source geoscience, or a more productive environment in which to write code with new friends and colleagues.

It turns out that a 400-year-old château set in 8 acres of parkland in the heart of Normandy is a great place to create new things. I expect Gustave Flaubert and Guy de Maupassant thought the same when they stayed there 150 years ago. The forty-two bedrooms house exactly the right number of people for a purposeful scientific meeting.

This is frustrating: I’m not doing the place justice at all.

The work

This was most people’s first experience of an unconference. It was undeniably weird walking into a week-long meeting with no schedule of events. But, despite being inexpertly facilitated by me, the 26 participants enthusiastically collaborated to create the agenda on the first morning. With time, we appreciated the possibilities of the open space — it lets the group talk about exactly what it needs to talk about, exactly when it needs to talk about it.

The topics ranged from the governance and future of the Software Underground, to the possibility of a new open access journal, interesting new events in the Software Underground calendar, new libraries for geoscience, a new ‘core’ library for wells and seismic, and — of course — machine learning. I’ll be writing more about all of these topics in the coming weeks, and there’s already lots of chatter about them on the Software Underground Slack (which hit 1500 members yesterday!).

The food

I can’t help it. I have to talk about the food.

…but I’m not sure where to start. The full potential of food — to satisfy, to delight, to start conversations, to impress, to inspire — was realized. The food was central to the experience, but somehow not even the most wonderful thing about the experience of eating at the chateau. Meals were prefaced by a presentation by the professionals in the kitchen. No dish was repeated… indeed, no seating arrangement was repeated. The cheese was — if you are into cheese — off the charts.

There was a professionalism and thoughtfulness to the dining that can perhaps only be found in France.

Sorry everyone. This was one of those occasions when you had to be there. If you weren’t there, you missed out. I wish you’d been there. You would have loved it.

The good news is that it will happen again. Stay tuned.

The digital subsurface water-cooler

swung_round_orange.png

Back in August 2016 I told you about the Software Underground, an informal, grass-roots community of people who are into rocks and computers. At its heart is a public Slack group (Slack is a bit like Yammer or Skype but much more awesome). At the time, the Underground had 130 members. This morning, we hit ten times that number: there are now 1300 enthusiasts in the Underground!

If you’re one of them, you already know that it’s easily the best place there is to find and chat to people who are involved in researching and applying machine learning in the subsurface: in geoscience, reservoir engineering, and anything else to do with the hard parts of the earth. And it’s not just about AI… it’s about data management, visualization, Python, and web applications. Here are some things that have been shared in the last 7 days:

  • News about the upcoming Software Underground hackathon in London.

  • A new Udacity course on TensorFlow.

  • Questions to ask when reviewing machine learning projects.

  • A Dockerfile to make installing Seismic Unix a snap.

  • Mark Zoback’s new geomechanics course.

It gets better. One of the most interesting conversations recently has been about starting a new online-only, open-access journal for the geeky side of geo. Look for the #journal channel.

Another emerging feature is the ‘real life’ meetup. Several social+science gatherings have happened recently in Aberdeen, Houston, and Calgary… and more are planned; check #meetups for details. If you’d like to organize a meetup where you live, Software Underground will support it financially.

softwareunderground_merch.png

We’ve also gained a website, softwareunderground.org, where you’ll find a link to sign up to the Slack group, some recommended reading, and fantastic Software Underground T-shirts and mugs! There are also other ways to support the community with a subscription or sponsorship.

If you’ve been looking for the geeks, data-heads, coders and makers in geoscience and engineering, you’ve found them. It’s free to sign up — I hope we see you in there soon!


Slack has nice desktop, web and mobile clients. Check out all the channels — they are listed on the left:

swung_convo.png

Subsurface Hackathon project round-up, part 1

The dust has settled from the Hackathon in Paris two weeks ago. Been there, done that, came home with the T-shirt.

In the same random order they presented their 4-minute demos to our panel of esteemed judges, I present a (very) abbreviated round-up of what the teams made together over the course of the weekend. With the exception of a few teams who managed to spontaneously nucleate before the hackathon, most of these teams were made up of people who had never met each other before the event.

Just let that sink in for a second: teams of mostly mutual strangers built 13 legit machine-learning-based geoscience applications in one weekend. 


Log Healer      


An automated well log management system

Team Un-well Loggers: James Wanstall (Glencore), Niket Doshi (Teradata), Joseph Taylor (Teradata), Duncan Irving (Teradata), Jane McConnell (Teradata).

Tech: Kylo (NiFi, HDFS, Hive, Spark)

If you're working with well logs, and if you've got lots of them, you've almost certainly got gaps or inaccuracies from curve to curve and from well to well. The team's scalable, automated well-log file management system Log Healer computes missing logs and heals broken ones. Amazing.


An early result from Team Janus. The image on the left is ground truth, that on the right is predicted. Many of the features are present. Not bad for v0.1!

Meaningful cross sections from well logs

Team Janus: Daniel Buse, Johannes Camin, Paul Gabriel, Powei Huang, Fabian Kampe (all from GiGa Infosystems)

The team built an elegant machine learning workflow to attack the very hard problem of creating geologically realistic cross-sections from well logs. The validation algorithm compares pixels to score the result.
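
The pixel-comparison idea can be sketched in a few lines of Python. This is only an illustration, assuming the true and predicted sections are label images of the same shape and that the score is simply the fraction of matching pixels. The function name `pixel_score` and the toy grids are hypothetical; this is not Team Janus's code.

```python
def pixel_score(truth, predicted):
    """Return the fraction of pixels whose labels match (0.0 to 1.0)."""
    if len(truth) != len(predicted):
        raise ValueError("images must have the same number of rows")
    matches = 0
    total = 0
    for row_t, row_p in zip(truth, predicted):
        if len(row_t) != len(row_p):
            raise ValueError("rows must have the same length")
        for t, p in zip(row_t, row_p):
            matches += (t == p)  # True counts as 1, False as 0
            total += 1
    if total == 0:
        raise ValueError("images must contain at least one pixel")
    return matches / total


# Toy 3 x 4 cross-sections; each integer is a hypothetical rock-unit label.
truth = [
    [1, 1, 2, 2],
    [1, 3, 3, 2],
    [3, 3, 3, 3],
]
predicted = [
    [1, 1, 2, 2],
    [1, 1, 3, 2],
    [3, 3, 2, 3],
]

score = pixel_score(truth, predicted)
print(f"pixel score = {score:.2f}")
```

A score like this is easy to compute but treats every pixel equally, so a thin but geologically important layer counts for very little.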


Think Section's mindblowing photomicrograph labeling tool can also make novel camouflage patterns.

Paint-by-numbers on digital thin sections

Team Think Section: Diego Castaneda (Agile*), Brendon Hall (Enthought), Roeland Nieboer (Fugro), Jan Niederau (RWTH Aachen), Simon Virgo (RWTH Aachen)

Tech: Python (scikit-learn, scikit-image, Flask, NumPy, SciPy, pandas), AWS for hosting app & Jupyter server.

Description: Mineral classification and point-counting on thin sections can be an incredibly tedious and time-consuming task. Team Think Section trained a model to segregate, classify, and label mineral grains in 200GB of high-resolution multi-polarization-angle photomicrographs.


Team Classy's super-impressive shot gather seismic event detection technology. Left: synthetic gather. Middle: predicted labels. Right: truth.

Event detection on seismic shot gathers

Team Classy: Princy Ikotoko Ndong (EOST), Anna Lim (NTNU), Yuriy Ivanov (NTNU), Song Hou (CGG), Justin Gosses (Valador).

Tech: Python (NumPy, Matplotlib), Jupyter notebooks.

The team created an AI which identifies and labels different events on a shot gather image. It can find direct waves, reflections, multiples or coherent noise. It uses a support vector machine for classification, and is simple and fast. 


model2seismic: An entirely new way to do modeling and inversion. Take note: the neural network that made this image knows no physics.

Forward and inverse modeling without the physics

Team GANsters: Lukas Mosser (Imperial), Wouter Kimman (Meridian), Jesper Dramsch (Copenhagen), Alfredo de la Fuente (Wolfram), Steve Purves (Euclidity)

Tech: PyNoddy, homegrown Python ML tools.

The GANsters created a deep-learning image-translation-based seismic inversion and forward modelling system. I urge you to go and look at their project on model2seismic. If it doesn't give you goosebumps, you are geophysically inert.


Team Pick Pick Log

Machine learning for stratigraphic interpretation

Team Pick Pick Log: Antoine Vanbesien (EOST), Fidèle Degni (Mines St-Étienne), Massinissa Mesbahi (Pau), Natsuki Gunji (Mines St-Étienne), Cédric Menut (EOST).

This team of data science and geoscience undergrads attacked an automated stratigraphic interpretation task. They used supervised learning to determine lithology from well logs in Alberta's Athabasca play, then attempted to teach their AI to pick stratigraphic tops. Impressive!


Pretty amazing, huh? The power of the hackathon to bring a project from barely-even-an-idea to actual-working-code is remarkable! And we're not even halfway through the teams: tomorrow I'll describe the other seven projects. 

Running away from easy

Matt and I are in Calgary at the 2017 GeoConvention. Instead of writing about highlights from Day 1, I wanted to pick on one awesome thing I saw. Throughout the convention, there is an air of sadness, of nostalgia, of struggle. But I detect a divide among us. There are people who are waiting for things to return to how they were, when life was easy. Others are exploring how to be a part of the change, instead of a victim of it. Things are no longer easy, but easy is boring.


Want to start an oil and gas company? What resources are you going to need? Computers, pricey software applications, data. Purchase all of this stuff as a one-time capital expense, build a team, get an office lease, buy desks and a Keurig. Then if all goes well, 18 months later you'll have a slide deck outlining a play that you could pitch to investors. 

Imagine getting started without laying down a huge amount of capital for all those things. What if you could rent a desk at a co-working space, access the suite of software tools that you're used to, and use their Keurig? The computer infrastructure and software are managed and maintained by an IT service company so you don't have to worry about it.

Yesterday at the Calgary GeoConvention I heard all about ResourceNET, a subscription-based cloud workstation environment for freelancers, consultants, startups, and the newly and not-so-newly underemployed community of subsurface professionals. It was introduced by ReSourceYYC, a co-working space catering to oil and gas professionals.

In making this offering, ReSourceYYC has partnered up with a number of software companies: Entero, Seisware, Surfer, ValNav, geoLOGIC, and Divestco, to name a few. The limitations and restrictions around this environment, if any, weren't totally clear. I wondered: Could I append or swap my own tools with this stack? Can I access this environment from anywhere?

It could be awesome. I think it could serve just as many freelancers and consultants as "oil and gas startups". It seems a bit too early to say, but I reckon there are literally thousands of geoscientists and engineers in Calgary that'd be all over this.

I think it's interesting and important and I hope they get it right.

Strategies for a revolution

This must be a record. It has taken me several months to get around to recording the talk I gave last year at EAGE in Vienna: Strategies for a revolution. Rather a grandiose title, sorry about that, especially over-the-top given that I was preaching to the converted: the workshop on open source. I did, at least, blog about the goings on in the workshop itself at the time. I even followed it up with a slightly cheeky analysis of the discussion at the event. But I never posted my own talk, so here it is:

Too long; didn't watch? No worries, my main points were:

  1. It's not just about open source code. We must write open access content, put our data online, and push the whole culture towards openness and reproducibility. 
  2. We, as researchers, professionals, and authors, need to take responsibility for being more open in our practices. It has to come from within the community.
  3. Our conferences need more tutorials, bootcamps, hackathons and sprints. These events build skills and networks much faster than (just) lectures and courses.
  4. We need something like an Open Geoscience Foundation to help streamline funding channels for open source projects and community events.

If you depend on open source software, or care about seeing more of it in our field, I'd love to hear your thoughts about how we might achieve the goal of having greater (scientific, professional, societal) impact with technology. Please leave a comment.

 

No secret codes: announcing the winners

The SEG / Agile / Enthought Machine Learning Contest ended on Tuesday at midnight UTC. We set readers of The Leading Edge the challenge of beating the lithology prediction in October's tutorial by Brendon Hall. Forty teams, mostly of 1 or 2 people, entered the contest, submitting several hundred entries between them. Deadlines are so interesting: it took a month to get the first entry, and I received 4 in the second month. Then I got 83 in the last twenty-four hours of the contest.

How it ended

Rank  Team                              F1      Algorithm      Language  Solution
1     LA_Team (Mosser, de la Fuente)    0.6388  Boosted trees  Python    Notebook
2     PA Team (PetroAnalytix)           0.6250  Boosted trees  Python    Notebook
3     ispl (Bestagini, Tuparo, Lipari)  0.6231  Boosted trees  Python    Notebook
4     esaTeam (Earth Analytics)         0.6225  Boosted trees  Python    Notebook
ml_contest_lukas_alfo.png

The winners are a pair of graduate petroleum engineers, Lukas Mosser (Imperial College, London) and Alfredo de la Fuente (Wolfram Research, Peru). Not coincidentally, they were also one of the more, er, energetic teams; it's safe to say that they explored a good deal of the solution space. They were also very much part of the discussion about the contest on GitHub.com and on the Software Underground Slack chat group, aka Swung (you're in there, right?).

I will be sending Raspberry Shakes to the winners, along with some other swag from Enthought and Agile. The second-place team will receive books from SEG (thank you SEG Book Mart!), and the third-place team will have to content themselves with swag. That team, led by Paolo Bestagini of the Politecnico di Milano, deserves special mention — their feature engineering approach was very influential, being used by most of the top-ranking teams.

Coincidentally, Gram and I talked to Lukas on Undersampled Radio this week:

Back up a sec, what the heck is a machine learning contest?

To enter, a team had to predict the lithologies in two wells, given wireline logs and other data. They had complete data, including lithologies, in nine other wells — the 'training' data. Teams trained a wide variety of models — from simple nearest neighbour models and support vector machines, to sophisticated deep neural networks and random forests. These met with varying success, with accuracies ranging between about 0.4 and 0.65 (i.e., error rates from 60% to 35%). Here's one of the best realizations from the winning model:

One twist that made the contest especially interesting was that teams could not just submit their predictions — they had to submit the code that made the prediction, in the open, for all their fellow competitors to see. As a result, others were quickly able to adopt successful strategies, and I'm certain the final result was better than it would have been with secret code.
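
For readers wondering how the scores quoted above relate to error rates, here is a minimal Python sketch. It assumes each depth sample gets exactly one predicted lithology label; in that single-label setting the micro-averaged F1 score is the same number as the accuracy, and the error rate is one minus it. The function name `micro_f1` and the toy labels are made up for illustration; they are not contest data or the contest's scoring code.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 for single-label, multi-class predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("need two non-empty label lists of equal length")
    # Pool true positives, false positives and false negatives over all classes.
    tp = sum(t == p for t, p in zip(y_true, y_pred))
    fp = len(y_pred) - tp  # each wrong prediction is a false positive for the predicted class
    fn = len(y_true) - tp  # ...and a false negative for the true class
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Ten hypothetical depth samples with made-up lithology labels.
y_true = ["sand", "sand", "shale", "shale", "lime", "lime", "sand", "shale", "lime", "sand"]
y_pred = ["sand", "shale", "shale", "shale", "lime", "sand", "sand", "lime", "lime", "shale"]

score = micro_f1(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
error_rate = 1 - accuracy
print(f"F1 = {score:.2f}, accuracy = {accuracy:.2f}, error rate = {error_rate:.0%}")
```

Other averaging choices (macro or weighted F1) would weight rare lithologies differently and would generally not equal the accuracy.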

I spent most of yesterday scoring the top entries by generating 100 realizations of the models. This was suggested by the competitors themselves as a way to deal with model variance. This was made a little easier by the fact that all of the top-ranked teams used the same language — Python — and the same type of model: extreme gradient boosted trees. (It's possible that the homogeneity of the top entries was a negative consequence of the open format of the contest... or maybe it just worked better than anything else.)
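
The idea of scoring many realizations can be sketched as follows. This is a toy illustration under stated assumptions: `train_and_score` is a hypothetical stand-in that only simulates a noisy score (a real entry would retrain the model with each random seed and score it on the blind wells), and summarizing with the median is an assumption, not necessarily how the contest entries were ranked.

```python
import random
import statistics


def train_and_score(seed):
    """Hypothetical stand-in for fitting one model realization and scoring it.

    Here we only simulate a score of about 0.62 with a little seed-dependent noise.
    """
    rng = random.Random(seed)
    return 0.62 + rng.gauss(0, 0.01)


def score_entry(n_realizations=100):
    """Summarize an entry by the median and spread of its seeded realizations."""
    scores = [train_and_score(seed) for seed in range(n_realizations)]
    return statistics.median(scores), statistics.stdev(scores)


median_score, spread = score_entry()
print(f"median = {median_score:.3f}, spread = {spread:.3f}")
```

Using fixed seeds keeps the ranking reproducible, and the spread gives a feel for whether two entries that differ in the third decimal place are really distinguishable.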

What now?

There will be more like this. It will have something to do with seismic data. I hope I have something to announce soon.

I (or, preferably, someone else) could write an entire thesis on learnings from this contest. I am busy writing a short article for next month's Leading Edge, so if you're interested in reading more, stay tuned for that. And I'm sure there will be others.

If you took part in the contest, please leave a comment telling us about your experience of it or, better yet, write a blog post somewhere and point us to it.

Le meilleur hackathon du monde

hackathon_2017_calendar.png

Hackathons are short bursts of creative energy, making things that may or may not turn out to be useful. In general, people work in small teams on new projects with no prior planning. The goal is to find a great idea, then manifest that idea as something that (barely) works, but might not do very much, then show it to other people.

Hackathons are intellectually and professionally invigorating. In my opinion, there's no better team-building, networking, or learning event.

The next event will be 10 & 11 June 2017, right before the EAGE Conference & Exhibition in Paris. I hope you can come.

The theme for this event will be machine learning. We had the same theme in New Orleans in 2015, but suffered a bit from a lack of data. This time we will have a collection of open datasets for participants to build off, and we'll prime hackers with a data-and-skills bootcamp on Friday 9 June. We did this once before in Calgary – it was a lot of fun. 

Can you help?

It's my goal to get 52 participants to this edition of the event. But I'll need your help to get there. Please share this post with any friends or colleagues you think might be up for a weekend of messing about with geoscience data and ideas. 

Other than participants, the other thing we always need is sponsors. So far we have three organizations sponsoring the event — Dell EMC is stepping up once again, thanks to the unstoppable David Holmes and his team. And we welcome Sandstone — thank you to Graham Ganssle, my Undersampled Radio co-host, who I did not coerce in any way.

sponsors_so_far.png

If your organization might be awesome enough to help make amazing things happen in our community, I'd love to hear from you. There's info for sponsors here.

If you're still unsure what a hackathon is, or what's so great about them, check out my November article in the Recorder (Hall 2015, CSEG Recorder, vol 40, no 9).

Nowhere near Nyquist

This is a guest post by my Undersampled Radio co-host, Graham Ganssle.

You can find Gram on the web, LinkedIn, Twitter, and GitHub.

This post is a follow up to Tuesday's post about the podcast — you might want to read that first.


Undersampled Radio was born out of a dual interest in podcasting. Matt and I both wanted to give it a shot, but we didn’t know what to talk about. We still don’t. My philosophy on UR is that it’s forumesque; we have a channel on the Software Underground where we solicit ideas, draft guests, and brainstorm about what should be on the show. We take semi-formed thoughts and give them a good think with a guest who knows more than us. Live and uncensored.

Since with words I... have not.. a way... the live nature of the show gives it a silly, laid back attitude. We attempt to bring our guests out of interview mode by asking about their intellectual curiosities in addition to their professional interests. Though the podcast releases are lightly edited, the YouTube live-stream recordings are completely raw. For a good laugh at our expense you should certainly watch one or two.

Techie deets

Have a look at the command center. It’s where all the UR magic (okay, digital trickery) happens in pre- and post-production.

It's a mess but it works!

We’ve migrated away from the traditional hardware combination used by most podcasters. Rather than use the optimum mic/mixer/spaghetti-of-cables preferred by podcasting operations which actually generate revenue, we’ve opted to use less hardware and do a bit of digital conditioning on the back end. We conduct our interviews via YouTube live (aka Google Hangouts on Air) then on my Ubuntu machine I record the audio through stereo mix using PulseAudio and do the filtering and editing in Audacity.

Though we usually interview guests via Google Hangouts, we have had one interviewee in my office for an in-person chat. It was an incredible episode that was filled with the type of nonlinear thinking which can only be accomplished face to face. I mention this because I’m currently soliciting another New Orleans recording session (message me if you’re interested). You buy the plane ticket to come record in the studio. I buy the beer we’ll drink while recording.

As Matt guessed, there actually are paddle boats rolling by while I record. Here’s the view from my recording studio; note the paddle boat on the left.

Forward projections

We have several ideas about what to do next. One is a live competition of some sort, where Matt and I compete while one or more guests judge our performance. We’re also keen to do a group chat session, in which all the members of the Software Underground will be invited to a raucous, unscripted chat about whatever’s on their minds. Unfortunately we dropped the ball on a live interview session at the SEG conference this year, but we’d still like to get together in some sciencey venue and grab randos walking by for lightning interviews.

In accord with the remainder of our professional lives, Matt and I both conduct the show in a manner which keeps us off balance. I have more fun, and learn more information more quickly, by operating in a space outside of my realm of knowledge. Ergo, we are open to your suggestions and your participation in Undersampled Radio. Come join us!

 

What will people pay for?

Many organizations in the industry are asking this question right now. Software and service companies would like to sell product, technical societies would like to survive diminished ad sales and conference revenue, entrepreneurs would like to find customers. We all need to make a living.

I was recently asked this very question by a technical society. However, it's utterly the wrong question. Even asking this question reveals a deep-seated misunderstanding of what technical societies are for.

The question is not "What will people pay for?", it's "What do people need?". 

The leaders of our profession

Geoscientists and engineers are professionals. Our professional contributions are defined by our work and its purpose, not by our jobs and their tasks. This is essentially what makes a professional different from other workers: we are purpose-oriented, not task-oriented. We're interested in the outcome, not the means.

But even professionals benefit from leadership. Professional regulators notwithstanding, our technical societies are the de facto leaders of the profession. The professional regulator is the 'line manager' of the profession, not the 'chief geoscientist'.

Leadership is about setting an example, inspiring great work, and providing the means to grow and make the best contributions people can make. Societies need to be asking themselves how they can create the conditions for a transformed profession, a more relevant and resilient one. In short, how can they be useful? How can they serve?

OK, so what do people need?

I don't claim to have all the answers, or even many of them, but here are some things I think people need:

  • Representation. Get serious about gender and race balance on your boards and committees. There is recent progress, but it's nowhere near representative. Related: get out of North America and improve global reach.
  • Better ways to contribute and connect. Experiment more — a lot more, and urgently — with meetings and conferences. Help people participate, not just attend. Help people connect, not just exchange business cards. 
  • New ways to contribute and connect. Get serious about social media. Get scientists involved — social media is not a marketing exercise. Think hard about how you can engage your members through blogs and other content.
  • Reproducible science. Go further with open access, open data, and open source code. Make your content work harder. Make it reach further. Demand more of your authors to make their work reproducible.
  • A bit less self-interest. Stop regarding things you didn't organize or produce as a threat. Other people's events and publications may be of interest to your members, and your mission is to serve them.

Don't listen to my blathering. The AGU and the EGU are real leaders in geoscience — be inspired by them, follow their lead. Pay more attention to what's happening in publishing and conferences in other technical verticals, especially technology.

Pie in the sky is still pie

People will say, "That's all great Matt, but right now it's about survival." I get this a lot, but I'm not buying it. When times are good, you don't need to do the right thing; when times are hard, you can't afford to. True, all this would be easier if you'd started doing the right thing when times were good, but you didn't, so here we are.

Sure it's tough now, but are you sure you can afford to wait till tomorrow?


I've written lots before on these topics. Suggested reading:

The sound of the Software Underground

If you are a geoscientist or subsurface engineer, and you like computery things — in other words, if you read this blog — I have a treat for you. In fact, I have two! Don't eat them all at once.

Software Underground

Sometimes (usually) we need more diversity in our lives. Other times we just want a soul mate. Or at least someone friendly to ask about that weird new seismic attribute, where to find a Python library for seismic imaging, or how to spell Kirchhoff. Chat rooms are great for those occasions, Slack is where all the cool kids go to chat, and the Software Underground is the Slack chat room for you. 

It's free to join, and everyone is welcome. There are over 130 of us in there right now — you probably know some of us already (apart from me, obvsly). Just go to http://swung.rocks/ to sign up, and we will welcome you at the door with your choice of beverage.

To give you a flavour of what goes on in there, here's a listing of the active channels:

  • #python — for people developing in Python
  • #sharp-rocks — for people developing in C# or .NET
  • #open-geoscience — for chat about open access content, open data, and open source software
  • #machinelearning — for those who are into artificial intelligence
  • #busdev — collaboration, subcontracting, and other business opportunities 
  • #general — chat about anything to do with geoscience and/or computers
  • #random — everything else

Undersampled Radio

If you have a long commute, or occasionally enjoy being trapped in an aeroplane while it flies around, you might have discovered the joy of audiobooks and podcasts. You've probably wished many times for a geosciencey sort of podcast, the kind where two ill-qualified buffoons interview hyper-intelligent mega-geoscientists about their exploits. I know I have.

Well, wish no more because Undersampled Radio is here! Well, here:

The show is hosted by New Orleans-based geophysicist Graham Ganssle and me. Don't worry, it's usually not just us — we talk to awesome guests like geophysicists Mika McKinnon and Maitri Erwin, geologist Chris Jackson, and geopressure guy Mark Tingay. The podcast is recorded live every week or three in Google Hangouts on Air — the link to that, and to show notes and everything else — is posted by Gram in the #undersampled Software Underground channel. You see? All these things are connected, albeit in a nonlinear, organic, highly improbable way. Pseudoconnection: the best kind of connection.

Indeed, there is another podcast pseudoconnected to Software Underground: the wonderful Don't Panic Geocast — hosted by John Leeman and Shannon Dulin — also has a channel: #dontpanic. Give their show a listen too! In fact, here's a show we recorded together!

Don't have an hour right now? OK, you asked for it, here's a clip from that show to get you started. It starts with John Leeman explaining what Fun Paper Friday is, and moves on to one of my regular rants about conferences...

In case you're wondering, neither of these projects is explicitly connected to Agile — I am just involved in both of them. I just wanted to clear up any confusion. Agile is not a podcast company, for the time being anyway.