Why Python beats MATLAB for geophysics

MATLAB — the scientific computing environment which includes a programming language — is amazing. It has probably done as much for the development of new geophysical methods, and for the teaching and learning of geophysics, as any other tool or language. A purely anecdotal assertion, but it's rare to meet a geophysicist who has not at least dabbled in MATLAB, and it is used daily in geophysics labs and classrooms. Geophysics <3 MATLAB.

It's easy to see why — MATLAB definitely has some advantages.

Advantages of MATLAB

  • Matrices. MATLAB implicitly treats arrays as matrices (the name means 'matrix laboratory'). As a result, notation is quite intuitive for mathematicians. For example, a*b means standard matrix multiplication, the dot product. (Slightly confusingly, to get Python-style element-wise multiplication, add a dot: a.*b).
  • Lots of functions. MATLAB has been around for over 30 years, so there are many, many useful functions. Find them either in the core product, in one of the toolboxes, or in MATLAB Central.
  • Simulink. This block-based system design and simulation engine is much-loved by engineers. It allows users to model physical systems in an intuitive, graphical environment.
  • Easy to install. The MATLAB environment is a desktop application, so it is instantly familiar and can be managed by the same processes as the other software on your machine or in your organization.
  • MATLAB is widespread in academia. Thanks to one of those generous schemes where software corporations give free software to universities, just because they're awesome and definitely not for any other reason, students and profs have easy and free access to MATLAB. Outside academia, however, you're looking at tens of thousands of dollars.
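For readers coming from the Python side, the matrix point in the list above runs the other way round in NumPy: the plain `*` is element-wise, and the matrix product gets its own operator, `@`. Here's a minimal sketch, assuming NumPy is installed; the two small arrays are made up for illustration.

```python
import numpy as np

# Two made-up 2 x 2 arrays, purely for illustration.
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

elementwise = a * b   # element-wise product, like MATLAB's a.*b
matrix = a @ b        # matrix product, like MATLAB's a*b

print(elementwise)
print(matrix)
```

So neither convention is wrong; they just pick different defaults for the same two operations.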

So far so good, but it's time for geophysics to switch to Python. On the face of it, the language has a lot in common with MATLAB: they're both easy to learn, and both have broad ecosystems that make things like image processing, statistics, and signal processing easy. But Python has some special features that make it a fantastic platform for scientific computing...

Advantages of Python

  • Free and open. Thanks to one of those generous schemes where people make software and let anyone use it for any purpose for free, Python is free! Not only is it free of charge, you are free to inspect and modify the code. Open is awesome. (There are other free alternatives to MATLAB, notably GNU Octave and SciLab.)
  • General purpose. One of the things I love about Python is its flexibility. You can use it in the shell on microtasks, or interactively, or in scripts, or to write server software, or to build enterprise software with GUIs.
  • Namespaces. Everything in MATLAB lives in the main namespace, whereas Python keeps things inherently modular. To access NumPy, say, you have to import it and then use its namespace to get at its contents: numpy.array([1, 2, 3]). This has various advantages, including flexibility, readability, learnability, and portability.
  • Introspection. A powerful idea in Python, introspection means that you (or your code) can see inside every module, class, and function. You can access private variables, or write code that 'knows' about other objects' interfaces.
  • Portable. You can run your Python code on any architecture, whereas to run MATLAB code you either need all the MATLAB licenses the software uses, or another pricey toolbox to make executables.
  • Popular. Python is the 7th most popular tag in Stack Overflow, whereas MATLAB is the 58th. While programming is not a popularity contest, think of your career, or the careers of your students. Once they graduate, Python will serve them better than MATLAB. There are over 300 jobs for Pythonistas on Stack Overflow Jobs right now. MATLAB jobs? Nine.
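To make the namespace and introspection points a bit more concrete, here's a small sketch. It assumes NumPy is installed, and the `Horizon` class is a hypothetical example invented for this illustration, not part of any library.

```python
import inspect
import math

import numpy as np

# Namespaces: both modules define sqrt, and the prefix says which one you mean.
samples = np.array([1.0, 4.0, 9.0])
roots = np.sqrt(samples)        # element-wise, returns an array
root = math.sqrt(4.0)           # works on a single number


class Horizon:
    """Hypothetical class, purely for illustration."""

    def __init__(self, name):
        self.name = name
        self._picks = []        # 'private' by convention only

    def add_pick(self, time, amplitude=0.0):
        self._picks.append((time, amplitude))


h = Horizon("top_reservoir")
h.add_pick(1.204)

# Introspection: ask objects about themselves at run time.
params = list(inspect.signature(Horizon.add_pick).parameters)
public_names = [name for name in dir(h) if not name.startswith("_")]

print(params)
print(public_names)
print(h._picks)                 # nothing stops you reading 'private' state
```

The leading underscore on `_picks` is only a convention: Python trusts you to leave it alone, but lets you look.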

So there you have it. It's time to switch to Python. If you're new to programming, there's no contest. I suppose if you're productive in MATLAB, and have access to all the toolboxes, then admittedly it's hard to say you should switch.

But I'll still say it.


I was inspired to write this post after talking to a geophysicist about using programming languages in the classroom, and by the lists in this nice post on pyzo.org. It would be interesting to hear what you use in the classroom, as an instructor or as a student. I know geophysics is being taught with the help of MATLAB (in many places), Java (e.g. at Colorado School of Mines), and Mathematica (e.g. by Chris Liner). I wonder if there's anyone using JavaScript, which wouldn't be a terrible choice. Or C++? Or Fortran?? Let us know in the comments!

Toolbox wishlist

Earlier this week, the conversation on Software Underground* turned to well-tie software.

Someone was complaining that, despite having several well-tie tools at their disposal, none of them was quite right. I've written about this phenomenon before. We, as a discipline, do not know how to tie wells. I don't mean that you don't know, I know you know, but I bet if you compared the workflows of ten geoscientists, they would all be different. That's why every legacy well in every project has thirty time-depth tables, including at least three endearingly hopeful ones called final, and the one everyone uses, called test.

As a result of all this, the topic of "what tools do people need?" came up. Leo Uieda, a researcher in Brazil, asked:

I just about remembered that I had put up this very question on Tricider some time ago. Tricider is not a website about apple-based beverages, but a site for sharing and voting on ideas. You can start with a few ideas, get votes and comments on them, and even get new ideas. Here's the top idea as of right now: an open-source petrophysics tool.

Do check out the list, and vote or comment if you like. It might help someone find a project to work on, or spark an idea for a new app or even a new company.

Another result of the well-tie software conversation was, "What are the features of the one well-tie app to rule them all?" I'll leave you to stew on that one for a while. Meanwhile, please share your thoughts in the comments.


* Software Underground is an open Slack team. In essence, it's a chat room for geocomputing geeks: software, underground, geddit? It's completely free and open to anyone — pop along to http://swung.rocks/ to sign up.

It even has its own radio station!

Tools for drawing geoscientific figures

This is a response to Boyan Vakarelov's useful post on LinkedIn about tools for creating geological figures. I especially liked his SketchUp tip.

It's a while since we wrote about our toolset, so I thought I'd document what we're currently using for making figures. You won't be surprised to hear that they're mostly open source. 

Our figure creation toolbox

  • QGIS — if it's a map, you should make it in a GIS, it's as simple as that.
  • Inkscape — for most drawing and figure creation tasks. It's just as good as Illustrator.
  • GIMP — for raster editing tasks. Rasters are no good for editable figures or line art though.
  • TimeScale Creator — a little-known tool for making editable chronostratigraphic columns. Here's an example from way back on this very blog. The best thing: you can export SVG files, then edit them in Inkscape.
  • Python, R, etc. — the best way to make reproducible scientific figures is not to draw them at all. Instead, create data visualizations programmatically.

To really appreciate how fantastic the programmatic approach is, check out Sergey Fomel's treasure trove of reproducible documents, in which every figure is really just the output of a little program that anyone can run. Here's one of my own, adapted from a previous post and a sneak peek of an upcoming Leading Edge tutorial:

Different sample interpolation styles give different amplitudes for inter-sample positions, as shown at the red 'horizon' time pick. From an upcoming tutorial in the April edition of The Leading Edge.
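The claim in that caption is easy to check numerically. What follows is a rough sketch, not the code behind the figure: the Ricker wavelet, the 4 ms sample interval, and the pick time are all assumptions chosen for illustration, and it needs only NumPy.

```python
import numpy as np

dt = 0.004                      # sample interval in seconds (assumed)
t = np.arange(25) * dt          # 25 sample times, 0 to 0.096 s
t0 = t[12]                      # put the wavelet peak exactly on a sample


def ricker(time, f=25.0):
    """Ricker wavelet with peak frequency f (Hz), centred on t0."""
    a = (np.pi * f * (time - t0)) ** 2
    return (1 - 2 * a) * np.exp(-a)


trace = ricker(t)               # synthetic stand-in for a seismic trace
pick = t0 + 0.001               # hypothetical horizon pick, 1 ms after the peak


def nearest(x):
    """Amplitude of the closest sample."""
    return float(trace[np.argmin(np.abs(t - x))])


def linear(x):
    """Straight-line interpolation between neighbouring samples."""
    return float(np.interp(x, t, trace))


def sinc(x):
    """Whittaker-Shannon (sinc) interpolation using every sample."""
    return float(np.sum(trace * np.sinc((x - t) / dt)))


amps = {"nearest": nearest(pick), "linear": linear(pick), "sinc": sinc(pick)}
for name, amp in amps.items():
    print(f"{name:>8}: {amp:.4f}")
```

Same trace, same pick, three different amplitudes. Because the script is the figure's recipe, anyone can rerun it with a different wavelet or pick and see what changes.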

Everything you wanted to know about images

Screenshots often form part of a figure, because they're so much easier than trying to figure out how to export an image, or trying to wrangle the data from scratch. If you find yourself grabbing a screenshot, and any time you're providing an image for someone else — especially if it's destined for print — you need to know all about image resolution. Read my post Save the samples for my advice. 

If you still save your images as JPEG, you also need to read my post about How to choose an image format. One day you might need the fidelity you are throwing away! Here's the short version: save everything as a PNG.

Last thing: know the difference between vector and raster graphics. Make vectors when you can.

Stop using PowerPoint!

The only bit of Boyan's post I didn't like was the bit about PowerPoint. I admit, fifteen years ago I was a bit of a slave to PowerPoint. I'd have preferred to use Illustrator at the time, but it was well beyond corporate IT's ken, and I hadn't yet discovered Inkscape. But I'm over it now — and just as well because it's a horrible drawing tool. The main limitation is not having layers, which is a show-stopper for me, but there's also the generic typography, simplistic spline editing, the inability to handle standard formats like SVG, and no scripting or plug-ins.

Getting good

If you want to learn about making effective scientific figures, I strongly recommend reading anything you can by Edward Tufte, Robert Kosara, Alberto Cairo, and Mike Bostock. For some quick inspiration check out the #dataviz hashtag on Twitter, or feast your eyes on this amazing collection of graphics, or Mike Bostock's interactive examples, or... there are too many resources to choose from.

How about you? Share your favourite tools in the comments or on Boyan's post.

A European geo-gaming hackathon

I'm convinced that hackathons are the best way to get geoscientists and engineers inventing and collaborating in new ways. They are better for learning than courses. They are better for networking than parties. And they nearly always have tacos! 

If you are unsure what a hackathon is, or why I'm so enthusiastic about them, you can read my November article in the Recorder (Hall 2015, CSEG Recorder, vol 40, no 9).

The next hackathon will be 28 and 29 May in Vienna, Austria — right before the EAGE Conference and Exhibition. You can sign up right now! Please get it in your calendar and pass it along.

Throwing down the gauntlet

Colorado School of Mines has dominated the student showing at the last 2 autumn hackathons. I know there are plenty more creative research groups out there. Come out and show the world your awesomeness — in teams of up to 4 people — and spend a weekend learning and coding. Also: there will be beer.

To everyone else: this is not a student event, it's for everyone. Most of the participants in the past have been professionals, but the more diverse it is, the more we all get out of it. So don't ask yourself if you'll fit in — you will. 

A word about the fee

Our previous hackathons have been free, but this one has a small fee. It's an experiment. Like most free events, no-shows are a challenge; I'm hoping the fee reduces the problem. If the fee makes it difficult for you to join us, please get in touch — I do not want it to be a barrier.

Just to be clear: these events do not make money. Previous events have been generously sponsored — and that's the only way they can happen. We need support for this one too: if you're a champion of creativity in science and want to support this event, you can find me at matt@agilegeoscience.com, or you can read more about sponsorship here.

Details

The dates are 28 and 29 May. The event will run 8 till 6 (or so) on both the Saturday and the Sunday. We don't have a venue finalized yet. Ideas and contributions of any kind are welcome — this is a community event.

The theme this year will be Games. If you have ideas, share them in the comments! Here are some random project ideas to get you going...

  • Acquisition optimizer: lay out the best geometry to image the geology.
  • Human inversion: add geological layers to match a seismic trace.
  • Drill wells on a budget to make the optimal map of an unseen surface.
  • Which geological section matches the (noisy) seismic section?
  • Top Trumps for global 3D seismic surveys, with data scraped from press releases.
  • Set up the best processing flow for a modeled, noisy shot gather.

It's going to be fun! If you're traveling to EAGE this year, I hope we see you there!


Photo of Vienna by Nic Piégsa, CC-BY. Photo of bridge by Dragan Brankovic, CC-BY.

Is subsurface software too pricey?

Amy Fox of Enlighten Geoscience in Calgary wrote a LinkedIn post about software pricing a couple of weeks ago. I started typing a comment... and it turned into a blog post.


I have no idea if software is 'too' expensive. Some of it probably is. But I know one thing for sure: we subsurface professionals are the only ones who can do anything about the technology culture in our industry.

Certainly most technical software is expensive. As someone who makes software, I can see why it got that way: good software is really hard to make. The market is small, compared to consumer apps, games, etc. Good software takes awesome developers (who can name their price these days), and it takes testers, scientists, managers.

But all is not lost. There are alternatives to the expensive software. We — practitioners in industry — just do not fully explore them. OpendTect is a great seismic interpretation tool, but many people don't take it seriously because it's free. QGIS is an awesome GIS application, arguably better than ArcGIS and definitely easier to use.

Sure, there are open source tools we have embraced, like Linux and MediaWiki. But on balance I think this community is overly skeptical of open source software. As evidence of this, how many oil and gas companies donate money to open source projects they use? There's just no culture for supporting Linux, MediaWiki, Apache, Python, etc. Why is that?

If we want awesome tools, someone, somewhere, has to pay the people who made them, somehow.


So why is software expensive and what can we do about it?

I used to sell Landmark's GeoProbe software in Calgary. At the time, it was USD140k per seat, plus 18% annual maintenance. A lot, in other words. It was hard to sell. It needed a sales team, dinners, and golf.  A sale of a few seats might take a year. There was a lot of overhead just managing licenses and test installations. Of course it was expensive!
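To put a number on 'a lot', here's a back-of-envelope sketch. The seat price and maintenance rate come from the paragraph above; the 5-year holding period, and the idea that maintenance is a flat percentage of list price every year, are my assumptions for illustration, not the actual contract terms.

```python
SEAT_PRICE_USD = 140_000      # list price per seat, from the text
MAINTENANCE_PERCENT = 18      # annual maintenance rate, from the text
YEARS = 5                     # hypothetical holding period (assumption)


def seat_cost(years, price=SEAT_PRICE_USD, percent=MAINTENANCE_PERCENT):
    """Total cost of one seat, assuming flat maintenance on the list price."""
    annual_maintenance = price * percent // 100
    return price + years * annual_maintenance


total = seat_cost(YEARS)
print(f"One seat over {YEARS} years: USD {total:,}")
```

Under those assumptions, the maintenance adds up to nearly as much as the licence itself within five years.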

In response, on the customer side, the corporate immune system kicked in, spawning machine lockdowns, software spending freezes, and software selection committees. These were (well, are) secret organizations of non-users that did (do) difficult and/or pointless things like workflow mapping and software feature comparisons. They have to be secret because there's a bazillion dollars and a 5-year contract on the line.

Catch 22. Even if an ordinary professional would like to try some cheaper and/or better software, there is no process for this. Hands have been tied. Decisions have been made. It's not approved. It can't be done.

Well, it can be done. I call it the 'computational geophysics manoeuvre', because that lot have known about it for years. There is an easy way to reclaim your professional right to the tools of the trade, to rediscover the creativity and fun of doing new things:

Bring or buy your own technology, install whatever the heck you want on it, and get on with your work.

If you don't think that's a possibility for you right now, then consider it a medium term goal.

Old skool plot tool

It's not very glamorous, but sometimes you just want to plot a SEG-Y file. That's why we crafted seisplot. OK, that's why we cobbled seisplot together out of various scripts and functions we had lying around, after a couple of years of blog posts and Leading Edge tutorials and the like.

Pupils of the old skool — when everyone knew how to write a bash script, pencil crayons and lead-filled beanbags ruled the desktop, and Carpal Tunnel Syndrome was just the opening act to the Beastie Boys — will enjoy seisplot. For a start, it's command line only: 

    python seisplot.py -R -c config.py ~/segy_files -o ~/plots

Isn't that... reassuring? In this age of iOS and Android and Oculus Rift... there's still the command line interface.

Features galore

So what sort of features can you look forward to? Other than all the usual things you've come to expect of subsurface software, like a complete lack of support or documentation. (LOL, I'm kidding.) Only these awesome selling points:

  • Make wiggle traces or variable density plots... or don't choose — do both!
  • If you want, the script will descend into subdirectories and make plots for every SEG-Y file it finds.
  • There are plenty of colourmaps to choose from, or if you're insane you can make your own.
  • You can make PNGs, JPGs, SVGs or PDFs. But not CGM, sorry about that.
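If you're curious how the 'descend into subdirectories' feature in the list above might work, here's a minimal sketch using only the standard library. To be clear, this is not seisplot's actual code: the function name `find_segy`, the `recursive` argument, and the list of file extensions are all assumptions for illustration.

```python
import tempfile
from pathlib import Path

SEGY_SUFFIXES = {".sgy", ".segy"}   # assumed extensions, matched case-insensitively


def find_segy(root, recursive=False):
    """Return a sorted list of SEG-Y-looking files under root."""
    root = Path(root).expanduser()
    pattern = "**/*" if recursive else "*"
    return sorted(
        p for p in root.glob(pattern)
        if p.is_file() and p.suffix.lower() in SEGY_SUFFIXES
    )


# Small demo in a throwaway directory, so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "sub").mkdir()
    for name in ["line1.sgy", "notes.txt", "sub/line2.SEGY"]:
        (base / name).touch()
    top_only = [p.name for p in find_segy(base)]
    everything = [p.name for p in find_segy(base, recursive=True)]

print(top_only)
print(everything)
```

A plotting script could then loop over the result and make one figure per file.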

Well, I say 'selling points', but the tool is 100% free. We think this is a fair price. It's also open source of course, so please — seriously, please — improve the source code, then share it with the world! The code is in GitHub, natch.

Never go full throwback

There is one more feature: you can go full throwback and add scribbles and coffee stains. Here's one for your wall:


The 2D seismic line in this post is from the USGS NPRA Seismic Data Archive, and is in the public domain. This is line number 31-81-PR (links directly to SEG-Y file).

The big data eye-roll

First, let's agree on one thing: 'big data' is a half-empty buzzword. It's shorthand for 'more data than you can look at', but really it's more than that: it branches off into other hazy territory like 'data science', 'analytics', 'deep learning', and 'machine intelligence'. In other words, it's not just 'large data'. 

Anyway, the buzzword doesn't bother me too much. What bothers me is that when I talk to people at geoscience conferences about 'big data', about half of them roll their eyes and proclaim something like this: "Big data? We've been doing big data since before these punks were born. Don't talk to me about big data."

This is pretty demonstrably a load of rubbish.

What the 'big data' movement is trying to do is not acquire loads of data then throw 99% of it away. They are not processing it in a purely serial pipeline, making arbitrary decisions about parameters on the way. They are not losing most of it in farcical enterprise data management train-wrecks. They are not locking most of their data up in such closed systems that even they don't know they have it.

They are doing the opposite of all of these things.

If you think 'big data', 'data science' and 'machine learning' are old hat in geophysics, then you have some catching up to do. Sure, we've been toying with simple neural networks for years, e.g. probabilistic neural nets with 1 hidden layer (though this approach is very, very far from being mainstream in subsurface), but today this is child's play. Over and over, and increasingly so in the last 3 years, people are showing how new technology, built specifically to handle the special challenge that terabytes bring, can transform any quantitative endeavour: social media and online shopping, sure, but also astronomy, robotics, weather prediction, and transportation. These technologies will show up in petroleum geoscience and engineering. They will eat geostatistics for breakfast. They will change interpretation.

So when you read that Google has open sourced its TensorFlow deep learning library (9 November), or that Microsoft has too (yesterday), or that Facebook has too (months ago), or that Airbnb has too (in August), or that there are a bazillion other super easy-to-use packages out there for sophisticated statistical learning, you should pay a whole heap of attention! Because machine learning is coming to subsurface.

Moving ahead with social interpretation

After quietly launching Pick This — our social image interpretation tool — in February, we've been busily improving the tool and now we're moving into 2016 with a plan for world domination. I summed up the first year of development in one of the interpretation sessions at SEG 2015. Here's a 13-minute version of my talk:

In 2016 we'll be exploring ways to adapt the tool to in-house corporate use, mainly by adding encryption and private groups. This way, everyone with @awesome.com email addresses, say, would be connected to each other, and their stuff would only be shared among the group, not with the general public.
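As a rough sketch of the email-domain idea, and emphatically not how Pick This is or will be implemented, grouping could be as simple as the following. The function names are hypothetical, and a real system would also need verified addresses and the encryption mentioned above.

```python
from collections import defaultdict


def domain_of(email):
    """Return the lower-cased part of an address after the last '@'."""
    return email.rsplit("@", 1)[-1].lower()


def group_by_domain(emails):
    """Map each email domain to the set of addresses that share it."""
    groups = defaultdict(set)
    for email in emails:
        groups[domain_of(email)].add(email.lower())
    return dict(groups)


def can_view(viewer_email, owner_email):
    """Private content is visible only within the owner's domain."""
    return domain_of(viewer_email) == domain_of(owner_email)


users = ["ana@awesome.com", "Bo@Awesome.com", "cy@example.org"]
groups = group_by_domain(users)
print(groups)
print(can_view("ana@awesome.com", "bo@awesome.com"))
print(can_view("cy@example.org", "ana@awesome.com"))
```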

Some other functionality is on the list of things to do:

  • Other types of interpretation than points, lines and polygons.
  • Ways to find content more easily, for example with tags like 'Seismic' or 'Outcrop'.
  • Ways to follow individuals, or get notifications of new interpretations on an image.
  • More ways to visualize and generally get at the data Pick This produces.

We're always open to suggestions. Please get in touch if you have a neat idea!

Notes from a hackathon

The spirit of invention is alive and well in exploration geophysics! Last weekend, Agile hosted the 3rd annual Geophysics Hackathon at Propeller, a large and very cool co-working space in New Orleans, Louisiana.

A community of creative scientists

Commensurate with the lower-than-usual turnout at the SEG Annual Meeting, which our event preceded, we had 15 hackers. The remaining hackers were not competing, but hanging out and self-teaching or hacking around with code.

As in Denver, we had an amazing showing from Colorado School of Mines, with 6 participants. I don't know what's in the water over there in the Rockies, or what the profs have been feeding these students, but it works. Such smart, creative talent. But it can't stay this one-sided... one day we'll provoke Stanford into competitive geophysics programming.

Other than the Mines crew, we had one other student (Agile's Ben Bougher, who's at UBC), the dynamic wiki duo from SEG, and the rest were professional geoscientists from large and small companies, so it was pretty well balanced between academia and industry.

Thank you

As always, we are indebted to the sponsors and supporters of the hackathon. The event would be impossible without their financial support, and much less fun without their eager participation. This year we teamed up with three companies:

  • OpenGeoSolutions, a fantastic group of geophysicists based in Calgary. You won't find better advice on signal processing problems. Jamie Alison and Greg Partyka also regularly do us the honour of judging our hackathon demos, which is wonderful.
  • EMC, a huge cloud computing company, generously supported us through David Holmes, their representative for our industry, and a fellow Landmark alum. David also kindly joined us for much of the hackathon, including the judging, which was great for the teams.
  • Palladium Consulting, a Houston-based bespoke software house run by Sebastian Good, were a new sponsor this year. Sebastian reached out to a New Orleans friend and business partner of his, Graham Ganssle, to act as a judge, and he was beyond generous with his time and insight all weekend. He also acted as a rich source of local knowledge.

Although he craves no spotlight, I have to recognize the personal generosity of Karl Schleicher of UT Austin, who is one of the most valuable assets our community has. His tireless promotion of open data and open source software is an inspiration.

And finally, Maitri Erwin again visited to judge the demos on Sunday. She brings the perfect blend of a deep and rigorous expertise in exploration geoscience and a broad and futuristic view of technology in the service of humankind. 

I will do a round up of the projects in the next couple of weeks. Look out for that because all of the projects this year were 'different'. In a good way.


If this all sounds like fun, mark your calendars for 2016! I think we're going to try running it after SEG next year, so set aside 22 and 23 October 2016, and we'll see you there. Bring a team!

PS You can already sign up for the hackathon in Europe at EAGE next year!

The hack is back: learn new skills in New Orleans

Looking for a way to broaden your skills for the next phase of your career? Need some networking that isn't just exchanging business cards? Maybe you just need a reminder that subsurface geoscience is the funnest thing ever? I have something for you...

It's the third Geophysics Hackathon! The most creative geoscience event of the year. Completely free, as always, and fun for everyone — not just programmers. So mark your calendar for the weekend of 17 and 18 October, sign up on your own or with a team, and come to New Orleans for the most creative 48 hours of your career so far.

What is a hackathon?

It's a fun, 2-day event full of geophysics and tech. Most people participate in teams of up to 4 people, but you can take part on your own too. There's plenty of time on the first morning to find projects to work on, or maybe you already have something in mind. At the end of the second day, we show each other what we've been working on with a short demo. There are some fun prizes for especially interesting projects.

You don't have to be a programmer to join the fun. If you're more into geological interpretation, or reservoir engineering, or graphic design, or coming up with amazing ideas — there's a place for you at the hackathon. 

FAQ

  • How much does it cost? It's completely free!
  • I don't believe you. Believe it. Coffee and tacos will be provided. Just bring a laptop.
  • When is it? 17 and 18 October, doors open at 8 am each day, and we go till about 5.30.
  • So I won't miss the SEG Icebreaker? No, we'll all go!
  • Where is it? Propeller, 4035 Washington Avenue, New Orleans
  • How do I sign up? Find out more and register for the event at ageo.co/geohack15

Being part of it all

If this all sounds awesome to you, and you'll be in New Orleans this October, sign up! If you don't think it's for you, please drop in for a visit and a coffee — give me a chance to convince you to sign up next time.

If you own or work for an organization that wants to see more innovation in the world, please think about sponsoring this event, or a future one.

Last thing: I'd really appreciate any signal boost you can offer. Please consider forwarding this post to the most creative geoscientist you know, especially if they're in the Houston or New Orleans area. I'm hoping that, with your help, this can be our biggest event ever.