March 31, 2017

SEG-Y Rev 2 again: little-endian is legal!

March 31, 2017/ Matt Hall

Big news! Little-endian byte order is finally legal in SEG-Y files.

That's not all. I already spilled the beans on 64-bit floats. You can now have up to 18 quintillion traces (18 exatraces?) in a seismic line. And, finally, the hyphen confusion is cleared up: it's 'SEG-Y', with a hyphen. All this is spelled out in the new SEG-Y specification, Revision 2.0, which was officially released yesterday after at least five years in the making. Congratulations to Jill Lewis, Rune Hagelund, Stewart Levin, and the rest of the SEG Technical Standards Committee.

Back up a sec: what's an endian?

Whenever you have to think about the order of bytes (the 8-bit chunks in a 'word' of 32 bits, for example) — for instance when you send data down a wire, or store bytes in memory, or in a file on disk — you have to decide if you're Roman Catholic or Church of England.

What?

It's not really about religion. It's about eggs.

In one of the more obscure satirical analogies in English literature, Jonathan Swift wrote about the ideological tussle between two factions of Lilliputians in Gulliver's Travels (1726). The Big-Endians liked to break their eggs at the big end, while the Little-Endians preferred the pointier option. Chaos ensued.

Two hundred and fifty years later, Danny Cohen borrowed the terminology in his 1 April 1980 paper, On Holy Wars and a Plea for Peace — in which he positioned the Big-Endians, preferring to store the big bytes first in memory, against the Little-Endians, who naturally prefer to store the little ones first. Big bytes first is how the Internet shuttles data around, so big-endian is sometimes called network byte order. The drawing (right) shows how the 4 bytes in a 32-bit 'word' (the hexadecimal codes 0A, 0B, 0C and 0D) sit in memory.
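To make the two layouts concrete, here is a minimal Python sketch using the standard library's struct module. It packs the same 32-bit example word, 0x0A0B0C0D, both ways; the variable names are just for illustration.

```python
import struct
import sys

word = 0x0A0B0C0D  # the 32-bit example value from the text

big = struct.pack(">I", word)     # '>' means big-endian: most significant byte first
little = struct.pack("<I", word)  # '<' means little-endian: least significant byte first

print(big.hex(" "))     # 0a 0b 0c 0d
print(little.hex(" "))  # 0d 0c 0b 0a
print(sys.byteorder)    # the native order of whatever machine runs this
```

The bytes are identical in both cases; only their order in memory (or in the file) differs.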

Because we write ordinary numbers big-endian style — 2017 has the thousands first, the units last — big-endian might seem intuitive. Then again, lots of people write dates as, say, 31-03-2017, which is analogous to little-endian order. Cohen reviews the computational considerations in his paper, but really these are just conventions. Some architectures pick one, some pick the other. It just happens that the x86 architecture that powers most desktop and laptop computers is little-endian, so people have been illegally (and often accidentally) writing little-endian SEG-Y files for ages. Now it's actually allowed.
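Here is a small sketch of why the convention matters when reading a file. It writes a 4-byte IEEE float in little-endian order, then reads it back both ways. The sample value is made up for illustration; it does not come from any real SEG-Y file.

```python
import struct

value = 1500.0  # a hypothetical sample value

raw = struct.pack("<f", value)       # written little-endian, as an x86 machine might
right = struct.unpack("<f", raw)[0]  # read back with the matching byte order
wrong = struct.unpack(">f", raw)[0]  # read assuming big-endian instead

print(right)  # 1500.0
print(wrong)  # a completely different number
```

No error is raised in the second case; you just get nonsense, which is why accidental little-endian files can be so confusing.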

Still other byte orders are possible. Some processors, notably ARM and other RISC architectures, are bi-endian: they can be switched between big- and little-endian operation. A few historical machines were middle-endian (aka mixed-endian). You can think of this as analogous to the month-first American date format: 03-31-2017. For example, the two halves of a 32-bit word might be reversed compared to their 'pure' endian order. I guess this is like breaking your boiled egg in the middle. Swift did not tell us which religious denomination these hapless folks subscribe to.
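As a rough illustration of the 'halves reversed' idea, this sketch takes the two pure layouts of the example word 0x0A0B0C0D and swaps their 16-bit halves. It is only a demonstration of the byte shuffling, not a model of any particular processor.

```python
import struct

word = 0x0A0B0C0D

big = struct.pack(">I", word)     # 0a 0b 0c 0d
little = struct.pack("<I", word)  # 0d 0c 0b 0a

# Swap the two 16-bit halves of each 'pure' layout
big_halves_swapped = big[2:] + big[:2]           # 0c 0d 0a 0b
little_halves_swapped = little[2:] + little[:2]  # 0b 0a 0d 0c

print(big_halves_swapped.hex(" "))
print(little_halves_swapped.hex(" "))
```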

OK, that's enough about byte order

I agree. So I'll end with this handy SEG-Y cheatsheet. Click here for the PDF.

References and acknowledgments

Cohen, Danny (April 1, 1980). On Holy Wars and a Plea for Peace. IETF. IEN 137. "...which bit should travel first, the bit from the little end of the word, or the bit from the big end of the word? The followers of the former approach are called the Little-Endians, and the followers of the latter are called the Big-Endians." Also published in IEEE Computer, October 1981 issue.

Thumbnail image: “Remember, people will judge you by your actions, not your intentions. You may have a heart of gold -- but so does a hard-boiled egg.” by Kate Ter Haar is licensed under CC BY 2.0

January 20, 2017

News and updates and a sandwich

January 20, 2017/ Matt Hall

Plans for the hackathon in Paris in June are well underway. We now have two major sponsors: Dell EMC and Total E&P will both be supporting the event with generous funding. Bolstered by this, I've set a goal of getting 50 participants in the event. Imagine that!

If you would like to help us reach this goal, please consider printing out some of these posters (right) and putting them up in your place of work or study >> hi-res PDF << It should even be readable in black & white, if that's your only option.

You can find links to everything you need to know about the event at agilescientific.com/paris.

Le grand sandwich délicieux

The hackathon is really just the filling in a delicious Parisian sandwich of geocomputing goodness. The bread at the bottom is the Hacker Bootcamp on 9 June. The filling is the hackathon weekend... and the final piece is the EAGE workshop on machine learning. Convened by geoscientists at Total and IFP, it should be a great day of knowledge sharing and discussion. I can't wait.

11 days to go!

There are only 11 days left to take part in the SEG Machine Learning contest, in which you are challenged to predict lithologies in two wells, given some wireline logs and lithologies in several other nearby wells. Everything you need to get started, even if you've never tried anything like this before, is right here. See Brendon Hall's TLE article for more deets.

The radio show for geo-nerds

Undersampled Radio is still going strong. We just recorded episode 32 today. Last week's chat with Prof Chris Jackson (Imperial College London) — who's embarking on a GSA lecture tour this year — was a real cracker, check it out:

The other thing you need to know about Chris is that he's started writing his blog again. It's awesome, of course, and you should probably just go and read it now...

December 20, 2016

2016 retrospective

December 20, 2016/ Matt Hall

As we see out the year — or rather shove it out, slamming the door firmly behind it, then changing the locks and filing a restraining order — we like to glance back over the blog. We remember the posts that we enjoyed writing, and the ones you seemed to enjoy reading, and record them here for posterity.

The most popular

The great thing about writing on the web, compared to print, is that you quickly find out whether it was any good, or useful, or at least slightly interesting. You can't hide from data. Without adjusting for the age of posts (older ones have had longer to garner readers of course), the most popular posts of the last 12 months — from the 47 we have published — were:

Why Python beats MATLAB for geophysics — a comparison of the two popular languages.
Working without a job — how I think we can leave this downturn as a stronger profession.
x lines of Python: synthetic wedge model — the first in a new series of posts on programming.
Tools for drawing geoscientific figures — the best tools for making awesome illustrations.

None of these posts comes anywhere near the most popular page on the site, k is for wavenumber, which I wrote in 2012 but still gets about 600 pageviews a month, nearly 4% of the traffic on the site. Other perennials include Well tie workflow, What is anisotropy? and What is SEG Y?

If you gauge popularity by real engagement — comments, which are like diamonds to bloggers — then, apart from the pieces I already mentioned, these were the next most commenty posts:

Open source FWI, I mean geoscience (13 comments) — a report from a workshop at EAGE.
Helpful horizons (10 comments) — my own favourite post of the year, definitely the most useful.
Well data woes (10 comments) — my struggles with wrangling well data.
Copyright and seismic data (9 comments) — probably the post I worked hardest on.

Where is everybody?

We don't collect data about our readers beyond what's reported by your browser to Google Analytics, most of which is pretty esoteric. But it is interesting to see the geographic distribution of our readers. The top dozen cities from the roughly two thirds of sessions — out of about 9000 monthly sessions — that report this information:

Houston (3,457 users)
Calgary (2,244)
London (1,500)
Perth (723)
Kuala Lumpur (700)
Stavanger
Delhi
Rio de Janeiro
Leeds
Aberdeen
Jakarta
New York

Last thing

You rock! I mean it. This blog would be pretty pointless without your eyeballs. We appreciate every visit, however short, and when you share a post with someone... it really makes our day. I love hearing from readers, even about typos. Especially aobut typos. Anyway, the point is: thank you for stopping by, and being part of this global community of geoscientists.

Whatever festival you celebrate this week, have a peaceful time*. And all the best for 2017!

* Well, maybe squeeze in a bit of writing: it's good for you.

Previous Retrospective posts... 2011 retrospective • 2012 retrospective • 2013 retrospective • 2014 retrospective

There was no Retrospective in 2015, I was too discombobulated this time last year :(

December 13, 2016

SEG machine learning contest: there's still time

December 13, 2016/ Matt Hall

Have you been looking for an excuse to find out what machine learning is all about? Or maybe learn a bit of the Python programming language? If so, you need to check out Brendon Hall's tutorial in the October issue of The Leading Edge. Entitled "Facies classification using machine learning", it's a walk-through of a basic statistical learning workflow, applied to a small dataset from the Hugoton gas field in Kansas, USA.

But it was also the launch of a strictly fun contest to see who can get the best prediction from the available data. The rules are spelled out in the contest's README, but in a nutshell, you can use any reproducible workflow you like in Python, R, Julia or Lua, and you must disclose the complete workflow. The idea is that contestants can learn from each other.

Left: crossplots and histograms of wireline log data, coloured by facies; the idea is to highlight possible data issues, such as highly correlated features. Right: true facies (left) and predicted facies (right) in a validation plot. See the rest of the paper for details.

What's it all about?

The task at hand is to predict sedimentological facies from well logs. Such log-derived facies are sometimes called e-facies. This is a familiar task to many development geoscientists, and there are many, many ways to go about it. In the article, Brendon trains a support vector machine to discriminate between facies. It does a fair job, but the accuracy of the result is less than 50%. The challenge of the contest is to do better.

Indeed, people have already done better; here are the current standings:

Rank	Team	F1	Algorithm	Language	Solution
1	gccrowther	0.580	Random forest	Python	Notebook
2	LA_Team	0.568	DNN	Python	Notebook
3	gganssle	0.561	DNN	Lua	Notebook
4	MandMs	0.552	SVM	Python	Notebook
5	thanish	0.551	Random forest	R	Notebook
6	geoLEARN	0.530	Random forest	Python	Notebook
7	CannedGeo	0.512	SVM	Python	Notebook
8	BrendonHall	0.412	SVM	Python	Initial score in article

As you can see, DNNs (deep neural networks) are, in keeping with the amazing recent advances in the problem-solving capability of this technology, doing very well on this task. Of the 'shallow' methods, random forests are quite prominent, and indeed are a great first-stop for classification problems as they tend to do quite well with little tuning.
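If the F1 column in the standings is unfamiliar, here is a minimal pure-Python sketch of how an F1 score can be computed for a multi-class problem like this one. The facies labels are made up, and the micro-averaging shown is an assumption: this post does not say which averaging the contest uses, so check the README before comparing numbers.

```python
from collections import Counter

def f1_scores(y_true, y_pred):
    """Return (per-class F1 dict, micro-averaged F1) for two label sequences."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1   # correct prediction for class t
        else:
            fp[p] += 1   # p was predicted but wrong
            fn[t] += 1   # t was missed
    per_class = {}
    for c in set(y_true) | set(y_pred):
        denom = 2 * tp[c] + fp[c] + fn[c]
        per_class[c] = 2 * tp[c] / denom if denom else 0.0
    total_tp = sum(tp.values())
    total = 2 * total_tp + sum(fp.values()) + sum(fn.values())
    micro = 2 * total_tp / total if total else 0.0
    return per_class, micro

# Hypothetical facies labels for six log samples
true_facies = ["SS", "SS", "SH", "SH", "LS", "LS"]
pred_facies = ["SS", "SH", "SH", "SH", "LS", "SS"]

per_class, micro = f1_scores(true_facies, pred_facies)
print(per_class)
print(round(micro, 3))
```

With exactly one label per sample, the micro-averaged F1 works out to the same number as plain accuracy, which is a handy sanity check.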

How do I enter?

There are still more than 6 weeks to enter: you have until 31 January. There is a little overhead (you need to learn a bit about git and GitHub, there's some programming, and of course machine learning is a massive field to get up to speed on) but don't be discouraged. The very first entry was from Bryan Page, a self-described non-programmer who dusted off some basic skills to improve on Brendon's notebook. And you can run the notebook right here in mybinder.org (if it's up today; it's been a bit flaky lately) and play around with a few parameters yourself.

The contest aspect is definitely low-key. There's no money on the line — just a goody bag of fun prizes and a shedload of kudos that will surely get the winners into some awesome geophysics parties. My hope is that it will encourage you (yes, you) to have fun playing with data and code, trying to do that magical thing: predict geology from geophysical data.

Reference

Hall, B (2016). Facies classification using machine learning. The Leading Edge 35 (10), 906–909. doi: 10.1190/tle35100906.1. (This paper is open access: you don't have to be an SEG member to read it.)

December 06, 2016

Le meilleur hackathon du monde

December 06, 2016/ Matt Hall

Hackathons are short bursts of creative energy, making things that may or may not turn out to be useful. In general, people work in small teams on new projects with no prior planning. The goal is to find a great idea, then manifest that idea as something that (barely) works, but might not do very much, then show it to other people.

Hackathons are intellectually and professionally invigorating. In my opinion, there's no better team-building, networking, or learning event.

The next event will be 10 & 11 June 2017, right before the EAGE Conference & Exhibition in Paris. I hope you can come.

see this event on Eventbrite

The theme for this event will be machine learning. We had the same theme in New Orleans in 2015, but suffered a bit from a lack of data. This time we will have a collection of open datasets for participants to build off, and we'll prime hackers with a data-and-skills bootcamp on Friday 9 June. We did this once before in Calgary – it was a lot of fun.

Can you help?

It's my goal to get 52 participants to this edition of the event. But I'll need your help to get there. Please share this post with any friends or colleagues you think might be up for a weekend of messing about with geoscience data and ideas.

Other than participants, the other thing we always need is sponsors. So far we have three organizations sponsoring the event — Dell EMC is stepping up once again, thanks to the unstoppable David Holmes and his team. And we welcome Sandstone — thank you to Graham Ganssle, my Undersampled Radio co-host, who I did not coerce in any way.

If your organization might be awesome enough to help make amazing things happen in our community, I'd love to hear from you. There's info for sponsors here.

If you're still unsure what a hackathon is, or what's so great about them, check out my November article in the Recorder (Hall 2015, CSEG Recorder, vol 40, no 9).

November 18, 2016

The disappearing lake trick

November 18, 2016/ Matt Hall

On Sunday 20 November it's the 36th anniversary of the 1980 Lake Peigneur drilling disaster. The shallow lake — almost just a puddle at about 3 m deep — disappeared completely when the Texaco wellbore penetrated the Diamond Crystal Salt Company mine at a depth of about 350 m.

Location, location, location

It's thought that the rig, operated by Wilson Brothers Ltd, was in the wrong place. It seems a calculation error or misunderstanding resulted in the incorrect coordinates being used for the well site. (I'd love to know if anyone knows more about this as the Wikipedia page and the video below offer slightly different versions of this story, one suggesting a CRS error, the other a triangulation error.)

The entire lake sits on top of the Jefferson Island salt dome, but the steep sides of the salt dome, and a bit of bad luck, meant that a few metres were enough to spoil everyone's day. If you have 10 minutes, it's worth watching this video...

Apparently the accident happened at about 0430, and the crew abandoned the subsiding rig before breakfast. The lake was gone by dinner time. Here's how John Warren, a geologist and proprietor of Saltworks, describes the emptying in his book Evaporites (Springer 2006, and repeated on his awesome blog, Salty Matters):

“Eyewitnesses all agreed that the lake drained like a giant unplugged bathtub—taking with it trees, two oil rigs [...], eleven barges, a tugboat and a sizeable part of the Live Oak Botanical Garden. It almost took local fisherman Leonce Viator Jr. as well. He was out fishing with his nephew Timmy on his fourteen-foot aluminium boat when the disaster struck. The water drained from the lake so quickly that the boat got stuck in the mud, and they were able to walk away! The drained lake didn’t stay dry for long, within two days it was refilled to its normal level by Gulf of Mexico waters flowing backwards into the lake depression through a connecting bayou...”

The other source that seems reliable is Oil Rig Disasters, a nice little collection of data about various accidents. It ends with this:

“Federal experts from the Mine Safety and Health Administration were not able to apportion blame due to confusion over whether Texaco was drilling in the wrong place or that the mine’s maps were inaccurate. Of course, all evidence was lost.”

If the bit about the location is true, it may be one of the best stories of the perils of data management errors. If anyone (at Chevron?!) can find out more about it, please share!

October 25, 2016

Tune in to Undersampled Radio

October 25, 2016/ Matt Hall

Back in the summer I mentioned Undersampled Radio, the world's newest podcast about geoscience. Well, geoscience and computers. OK, machine learning and geoscience. And conferences.

We're now 25 shows in, having started with Episode 0 on 28 January. The show is hosted by Graham 'Gram' Ganssle, a consulting and research geophysicist based in New Orleans, and me. Appropriately enough, I met Gram at the machine-learning-themed hackathon we did at SEG in 2015. He was also a big help with the local knowledge.

I broadcast from one of the phone rooms at The HUB South Shore. Gram has the luxury of a substantial book-lined office, which I imagine has ample views of paddle-steamers lolling on the Mississippi (but I actually have no idea where it is).

To get an idea of what we chat about, check out the guests on some recent episodes:

Ep 23, Forest Through the Trees — David Holmes, CTO of Energy at Dell EMC.
Ep 22, Geomechanicists vs Geomechanicians — Amy Fox, a geomechanic in Calgary.
Ep 20, Hygge — Jesper Dramsch, a PhD student in Copenhagen.
Ep 18, The Rock Botherer — Chris Jackson, a geologist at Imperial College London.
Ep 17, Rock Women Rock — Maitri Erwin, a geophysicist at Nexen in Houston.
Ep 16, Today's Technology Sucks Less — Gerard Gorman, a computational physicist at Imperial.

Better than cable

The podcast is more than just a podcast: it's really a live TV show, broadcasting on YouTube Live. You can catch the action while it's happening on the Undersampled Radio channel. However, it's not easy to catch live because the episodes are not that predictable: they are announced about 24 hours in advance on the Software Underground Slack group (you are in there, right?). We should try to put them out on the @undrsmpldrdio Twitter feed too...

So, go ahead and watch the very latest episode, recorded last Thursday. We spoke to Tim Hopper, a data scientist in Raleigh, NC, who works at Distil Networks, a cybersecurity firm. It turns out that using machine learning to filter web traffic has some features in common with computational geophysics...

You can subscribe to the show in iTunes or Google Play, or anywhere else good podcasts are served. Grab the RSS Feed from the UndersampledRad.io website.

Of course, we take guest requests. Who would you like to hear us talk to?

August 04, 2016

The sound of the Software Underground

August 04, 2016/ Matt Hall

If you are a geoscientist or subsurface engineer, and you like computery things — in other words, if you read this blog — I have a treat for you. In fact, I have two! Don't eat them all at once.

Software Underground

Sometimes (usually) we need more diversity in our lives. Other times we just want a soul mate. Or at least someone friendly to ask about that weird new seismic attribute, where to find a Python library for seismic imaging, or how to spell Kirchhoff. Chat rooms are great for those occasions, Slack is where all the cool kids go to chat, and the Software Underground is the Slack chat room for you.

It's free to join, and everyone is welcome. There are over 130 of us in there right now — you probably know some of us already (apart from me, obvsly). Just go to http://swung.rocks/ to sign up, and we will welcome you at the door with your choice of beverage.

To give you a flavour of what goes on in there, here's a listing of the active channels:

#python — for people developing in Python
#sharp-rocks — for people developing in C# or .NET
#open-geoscience — for chat about open access content, open data, and open source software
#machinelearning — for those who are into artificial intelligence
#busdev — collaboration, subcontracting, and other business opportunities
#general — chat about anything to do with geoscience and/or computers
#random — everything else

Undersampled Radio

If you have a long commute, or occasionally enjoy being trapped in an aeroplane while it flies around, you might have discovered the joy of audiobooks and podcasts. You've probably wished many times for a geosciencey sort of podcast, the kind where two ill-qualified buffoons interview hyper-intelligent mega-geoscientists about their exploits. I know I have.

Well, wish no more because Undersampled Radio is here! Well, here:

iTunes — subscribe in the app on your device. Best for iOS.
Google Play Music — subscribe in the app. Best for Android.
UndersampledRad.io — the website and RSS feed.

The show is hosted by New Orleans-based geophysicist Graham Ganssle and me. Don't worry, it's usually not just us: we talk to awesome guests like geophysicists Mika McKinnon and Maitri Erwin, geologist Chris Jackson, and geopressure guy Mark Tingay. The podcast is recorded live every week or three in Google Hangouts on Air; the link to that, and to show notes and everything else, is posted by Gram in the #undersampled Software Underground channel. You see? All these things are connected, albeit in a nonlinear, organic, highly improbable way. Pseudoconnection: the best kind of connection.

Indeed, there is another podcast pseudoconnected to Software Underground: the wonderful Don't Panic Geocast — hosted by John Leeman and Shannon Dulin — also has a channel: #dontpanic. Give their show a listen too! In fact, here's a show we recorded together!

Don't have an hour right now? OK, you asked for it, here's a clip from that show to get you started. It starts with John Leeman explaining what Fun Paper Friday is, and moves on to one of my regular rants about conferences...

In case you're wondering, neither of these projects is explicitly connected to Agile — I am just involved in both of them. I just wanted to clear up any confusion. Agile is not a podcast company, for the time being anyway.

June 08, 2016

PRESS START

June 08, 2016/ Matt Hall

The dust has settled from the Subsurface Hackathon 2016 in Vienna, which coincided with EAGE's 78th Conference and Exhibition (some highlights). This post builds on last week's quick summary with more detailed descriptions of the teams and what they worked on. If you want to contact any of the teams, you should be able to track them down via the links to Twitter and/or GitHub.

A word before I launch into the projects. None of the participants had built a game before. Many were relatively new to programming — completely new in one or two cases. Most of the teams were made up of people who had never worked together on a project before; indeed, several team mates had never met before. So get ready to be impressed, maybe even amazed, at what members of our professional community can do in 2 days with only mild provocation and a few snacks.

Traptris

An 8-bit-style video game, complete with music, combining Tetris with basin modeling.

Team: Chris Hamer, Emma Blott, Natt Turner (all MSc students at the University of Leeds), Jesper Dramsch (PhD student, Technical University of Denmark, Copenhagen). GitHub repo.

Tech: Python, with PyGame.

Details: The game is just like Tetris, except that the blocks have lithologies: source, reservoir, and seal. As you complete a row, it disappears, as usual. But in this game, the row reappears on a geological cross-section beside the main game. By completing further rows with just-right combinations of lithologies, you build an earth model. When it's deep enough, and if you've placed source rocks in the model, the kitchen starts to produce hydrocarbons. These migrate if they can, and are eventually trapped, if you've managed to build a trap, that is. The team impressed the judges with their solid gameplay and boisterous team spirit. Just installing PyGame and building some working code was an impressive feat for the least experienced team of the hackathon.

Prize: We rewarded this rambunctious team for their creative idea, which it's hard to imagine any other set of human beings coming up with. They won Samsung Gear VR headsets, so I'm looking forward to the AR version of the game.

Flappy Trace

A ridiculously addictive seismic interpretation game. "So seismic, much geology".

Team: Håvard Bjerke (Roxar, Oslo), Dario Bendeck (MSc student, Leeds), and Lukas Mosser (PhD student, Imperial College London).

Tech: Python, with PyGame. GitHub repo.

Details: You start with a trace on the left of the screen. More traces arrive, slowly at first, from the right. The controls move the approaching trace up and down, and the pick point is set as it moves across the current trace and off the screen. Gradually, an interpretation is built up. It's like trying to fly along a seismic horizon, one trace at a time. The catch is that the better you get, the faster it goes. All the while, encouragements and admonishments flash up, with images of the doge meme. Just watching someone else play is weirdly mesmerizing.

Prize: The judges wanted to recognize this team for creating such a dynamic, addictive game with real personality. They won DIY Gamer kits and an awesome book on programming Minecraft with Python.

Guess What!

Human seismic inversion. The player must guess the geology that produces a given trace.

Team: Henrique Bueno dos Santos, Carlos André (both UNICAMP, São Paulo), and Steve Purves (Euclidity, Spain).

Tech: Python web application, on Flask. It even used Agile's nascent geo-plotting library, g3.js, which I am pretty excited about. GitHub repo. You can even play the game online!

Details: This project was on a list of ideas we crowdsourced from the Software Underground Slack, and I really hoped someone would give it a try. The team consisted of a postdoc, a PhD student, and a professional developer, so it's no surprise that they managed a nice implementation. The player is presented with a synthetic seismic trace and must place reflection coefficients that will, she hopes, forward model to match the trace. She may see how she's progressing only a limited number of times before submitting her final answer, which receives a score. There are so many ways to control the game play here, I think there's a lot of scope for this one.
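For a sense of what 'forward model to match the trace' involves, here is a minimal pure-Python sketch. It is not the team's code: the Ricker wavelet, its 25 Hz frequency, the 4 ms sample interval, and the RMS misfit used as a score are all assumptions chosen for illustration.

```python
import math

def ricker(freq, dt, length):
    """Ricker wavelet samples centred on zero time (length in seconds)."""
    half = int(round(length / (2 * dt)))
    out = []
    for i in range(-half, half + 1):
        a = (math.pi * freq * i * dt) ** 2
        out.append((1 - 2 * a) * math.exp(-a))
    return out

def forward_model(reflectivity, wavelet):
    """Convolve reflection coefficients with a wavelet, keeping the input length."""
    n, half = len(reflectivity), len(wavelet) // 2
    trace = [0.0] * n
    for i, rc in enumerate(reflectivity):
        if rc == 0.0:
            continue  # only reflectors contribute
        for j, w in enumerate(wavelet):
            k = i + j - half
            if 0 <= k < n:
                trace[k] += rc * w
    return trace

def misfit(a, b):
    """Root-mean-square difference between two traces (lower is better)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

# Hypothetical target: two reflectors in a 100-sample trace
wavelet = ricker(freq=25.0, dt=0.004, length=0.128)
target_rc = [0.0] * 100
target_rc[30], target_rc[60] = 0.2, -0.15
target = forward_model(target_rc, wavelet)

# A player's guess: right amplitudes, second reflector two samples too deep
guess_rc = [0.0] * 100
guess_rc[30], guess_rc[62] = 0.2, -0.15
guess = forward_model(guess_rc, wavelet)

print(misfit(target, guess))   # small but non-zero
print(misfit(target, target))  # 0.0 for a perfect match
```

A game could turn the misfit into a score in many ways; limiting how often the player may see it is one of the gameplay controls mentioned above.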

Prize: This team impressed everyone with the far-reaching implications of the game — and the rich possibilities for the future. They were rewarded with SparkFun Digital Sandboxes and a copy of The Thrilling Adventures of Lovelace and Babbage.

DiamondChaser

aka DiamonChaser (sic). A time- and budget-constrained drilling simulator aimed at younger players.

Team: Paul Gabriel, Björn Wieczoreck, Daniel Buse, Georg Semmler, and Jan Gietzel (all at GiGa infosystems, Freiberg)

Tech: TypeScript, which compiles to JS. BitBucket repo. You can play the game online too!

Details: This tight-knit group of colleagues — all professional developers, but using unfamiliar technology — produced an incredibly polished app for the demo. The player is presented with a blank cross section, and some money. After choosing what kind of drill bit to start with, the drilling begins and the subsurface is gradually revealed. The game is then a race against the clock and the ever-diminishing funds, as diamonds and other bonuses are picked up along the way. The team used geological models from various German geological surveys for the subsurface, adding a bit of realism.

Prize: Everyone was impressed with the careful design and polish of the app this team created, and the quiet industry they brought to the event. They each won a CellAssist OBD2 device and a copy of Charles Petzold's Code.

Some of the participants waiting for the judges to finish their deliberations. Standing, from left: Håvard Bjerke, Henrique Bueno dos Santos, Steve Purves. Seated: Jesper Dramsch, Lukas Mosser, Natt Turner, Emma Blott, Dario Bendeck, Carlos André, Björn Wieczoreck, Paul Gabriel.

Credits and acknowledgments

Thank you to all the hackers for stepping into the unknown and coming along to the event. I think it was everyone's first hackathon. It was an honour to meet everyone. Special thanks to Jesper Dramsch for all the help on the organizational side, and to Dragan Brankovic for taking care of the photography.

The Impact HUB Vienna was a terrific venue, providing us with multiple event spaces and plenty of room to spread out. HUB hosts Steliana and Laschandre were a great help. Der Mann produced the breakfasts. Il Mare pizzeria provided lunch on Saturday, and Maschu Maschu on Sunday.

Thank you to Kristofer Tingdahl, CEO of dGB Earth Sciences and a highly technical, as well as thoughtful, geoscientist. He graciously agreed to act as a judge for the demos, and I think he was most impressed with the quality of the teams' projects.

Last but far from least, a huge Thank You to the sponsor of the event, EMC, the cloud computing firm that was acquired by Dell late last year. David Holmes, the company's CTO (Energy) was also a judge, making an amazing opportunity for the hackers to show off their skills, and sense of humour, to a progressive company with big plans for our industry.

June 01, 2016

Open source geoscience is _________________

June 01, 2016/ Matt Hall

As I wrote yesterday, I was at the Open Source Geoscience workshop at EAGE Vienna 2016 on Monday. Happily, the organizers made time for discussion. However, what passes for discussion in the traditional conference setting is, as I've written before, stilted.

What follows is not an objective account of the proceedings. It's more of a poorly organized collection of soundbites and opinions with no real conclusion... so it's a bit like the actual discussion itself.

TL;DR The main take home of the discussion was that our community does not really know what to do with open source software. We find it difficult to see how we can give stuff away and still make loads of money.

I'm not giving away my stuff

Paraphrasing a Schlumberger scientist:

Schlumberger sponsors a lot of consortiums, but the consortiums that will deliver closed source software are our favourites.

I suppose this is a way to grab competitive advantage, but of course there are always the other consortium members so it's hardly an exclusive. A cynic might see this position as a sort of reverse advantage — soak up the brightest academics you can find for 3 years, and make sure their work never sees the light of day. If you patent it, you can even make sure no-one else gets to use the ideas for 20 years. You don't even have to use the work! I really hope this is not what is going on.

I loved the quote Sergey Fomel shared; paraphrasing Matthias Schwab, his former advisor at Stanford:

Never build things you can't take with you.

My feeling is that if a consortium only churns out closed source code, then it's not too far from being a consulting shop. Apart from the cheap labour, cheap resources, and no corporation tax.

Yesterday, in the talks in the main stream, I asked most of the presenters how people in the audience could go and reproduce, or simply use, their work. The only thing that was available was a commercial OpendTect plugin of dGB's, and one free-as-in-beer MATLAB utility. Everything else was unavailable for any kind of inspection, and in one case the individual would not even reveal the technology framework they were using.

Support and maintenance

Paraphrasing a Saudi Aramco scientist:

There are too many bugs in open source, and no support.

The first point is, I think, a fallacy. It's like saying that Wikipedia contains inaccuracies. I'd suggest that open source code has about the same number of bugs as proprietary software. Software has bugs. Some people think open source is less buggy; as Eric Raymond put it in what he dubbed Linus's Law: "Given enough eyeballs, all bugs are shallow." Kristofer Tingdahl (dGB) pointed out that the perceived lack of support is a business opportunity for the open source community. Another participant mentioned the importance of having really good documentation. That costs money of course, which means finding ways for industry to support open source software development.

The same person also said something like:

[Open source software] changes too quickly, with new versions all the time.

...which says a lot about the state of application management in many corporations and, again, may represent an opportunity rather than a threat to the open source movement.

Only in this industry (OK, maybe a couple of others) will you hear the impassioned cry, "Less change!"

The fog of torpor

When a community is falling over itself to invent new ways to do things, create new value for people, and find new ways to get paid, few question the sharing and re-use of information. And by 'information' I mean code and data, not a few PowerPoint slides. Certainly not all information, but lots. I don't know which is the cause and which is the effect, but the correlation is there.

In a community where invention is slow, on the other hand, people are forced to be more circumspect, and what follows is a cynical suspicion of the motives of others. Here's my impression of the dynamic in the room during the discussion on Monday, and of course I'm generalizing horribly:

Operators won't say what they think in front of their competitors
Vendors won't say what they think in front of their customers and competitors
Academics won't say what they think in front of their consortium ~~customers~~ sponsors
Students won't say what they think in front of their advisors and potential employers

This all makes discussion a bit stilted. But it's not impossible to have group discussions in spite of these problems. I think we achieved a real, honest conversation in the two Unsessions we've done in Calgary, and I think the model we used would work perfectly in all manner of non-technical and technical settings. We just have to start doing it. Why our convention organizers feel unable to try new things at conferences is beyond me.

I can't resist finishing on something a person at Chevron said at the workshop:

I'm from Chevron. I was going to say something earlier, but I thought maybe I shouldn't.

This just sums our industry up.
