What is a sprint?

In October we're hosting our first 'code sprint'! What is that?

A code sprint is a type of hackathon, in which efforts are focused around a small number of open source projects. They are related to, but not really the same as, sprints in the Scrum software development framework. They are non-competitive — the only goal is to improve the software in question, whether it's adding functionality, fixing bugs, writing tests, improving documentation, or doing any of the other countless things that good software needs. 

On 13 and 14 October, we'll be hacking on 3 projects:

  • Devito: a high-level finite difference library for Python. Devito featured in three Geophysical Tutorials at the end of 2017 and beginning of 2018 (see Witte et al. for Part 3). The project needs help with code, tests, model examples, and documentation. There will be core devs from the project at the sprint. GitHub repo is here.
  • Bruges: a simple collection of Python functions representing basic geophysical equations. We built this library back in 2015, and have been chipping away ever since. It needs more equations, better docs, and better tests — and the project is basic enough for anyone to contribute to it, even a total Python newbie (see the sketch just after this list for the kind of function it collects). GitHub repo is here.
  • G3.js: a JavaScript wrapper for D3.js, a popular plotting toolkit for web developers. When we tried to adapt D3.js to geoscience data, we found we wanted to simplify basic tasks like making vertical plots, and plotting raster-like data (e.g. seismic) with line plots on top (e.g. horizons). Experience with JavaScript is a must. GitHub repo is here.
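
To give you a flavour of the sort of thing Bruges collects, here's a minimal sketch of a typical function. The name, signature, and coefficients are illustrative, not necessarily what's in the library:

    import numpy as np

    def gardner(vp, alpha=310, beta=0.25):
        """Estimate bulk density from P-wave velocity with Gardner's relation.

        Illustrative only; not necessarily the bruges API. With vp in m/s,
        the default coefficients give density in kg/m3.
        """
        return alpha * np.asarray(vp, dtype=float)**beta

    print(gardner(3000))   # about 2300 kg/m3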

The sprint will be at a small joint called MAZ Café Con Leche, located in Santa Ana about 10 km or 15 minutes from the Anaheim Convention Center where the SEG Annual Meeting is happening the following week.

Thank you, as ever, to our fantastic sponsors: Dell EMC and Enthought. These two companies are powered by amazing people doing amazing things. I'm very grateful to them both for being such enthusiastic champions of the change we're working for in our science and our industry. 

If you like the sound of spending the weekend coding, talking geophysics, and enjoying the best coffee in southern California, please join us at the Geophysics Sprint! Register on Eventbrite and we'll see you there.

Get out of the way

This tweet from the Ecological Society of America conference was interesting:

This kind of thing is not new — many conferences have 'No photos' signs around the posters and the talk sessions. 'No tweeting' seems pretty extreme though. I'm not sure if that's what the ESA was pushing for in this case, but either way the message is: 'No sharing stuff'. They do have a hashtag though, so...

Anyway, I tweeted this in response:

I think this tells you just as much about how broken the conference model is, as about how naïve/afraid our technical societies are.

I think there's a general rule: if you're trying to control the flow of information, you're getting in the way. You're also going to be disappointed because you can't control the flow of information — perhaps because it's not yours to control. I want to say to the organizers: The people you invited into your society are, thankfully, enthusiastic collaborators who can't wait to share the exciting things they heard at your conference. Why on earth would you try to shut that down? Why wouldn't you go out of your way to support them, amplify them, and find more people like them?

But wait, the no-tweeting society asks, what if the author didn't want anyone to share their work? My first question is: why did you give a talk then? My second question is: did the sharer give you proper attribution? If not — you are right to be annoyed and your society should help set this norm in your community. If so — see my first question.

Technical societies need to get over the idea that they own their communities and the knowledge their communities produce. They fret about revenue and membership numbers, but they just need to focus on making their members' technical and professional lives richer and more connected. The rest will take care of itself.


Interested in this topic? Here's a great post about tweeting at conferences, by Jacquelyn Gill. It also links to lots of other opinions, and there are lots of comments.

Image by Rob Salguero-Gómez.

Life lessons from a neural network

The latest Geophysical Tutorial came out this week in The Leading Edge. It's by my friend Gram Ganssle, and it's about neural networks. Although the example in the article is not, strictly speaking, a deep net (it only has one hidden layer), it concisely illustrates many of the features of deep learning.

Whilst editing the article, it struck me that some of the features of deep learning are really features of life. Maybe humans can learn a few life lessons from neural networks! 

Seek nonlinearity

Activation functions are one of the most important ingredients in a neural network. They are the reason neural nets are able to learn complex, nonlinear relationships without a gigantic number of parameters.
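
To see why, here's a toy sketch of my own (not the network in the tutorial): a one-hidden-layer net with a ReLU activation can produce a kinked function like |x|, something no purely linear model can do.

    import numpy as np

    def relu(z):
        """The rectified linear unit: zero for negative inputs, identity otherwise."""
        return np.maximum(0, z)

    def tiny_net(x, w1, b1, w2, b2):
        """One hidden layer. Without the nonlinearity, this collapses to a single linear map."""
        return w2 @ relu(w1 @ x + b1) + b2

    x = np.linspace(-1, 1, 5).reshape(1, -1)
    w1 = np.array([[1.0], [-1.0]]); b1 = np.zeros((2, 1))
    w2 = np.array([[1.0, 1.0]]);    b2 = np.zeros((1, 1))
    print(tiny_net(x, w1, b1, w2, b2))   # approximately |x|: [[1. 0.5 0. 0.5 1.]]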

Life lesson: look for nonlinearities in your life. Go to an event aimed at another profession. Take a new route to work. Buy a random volume at your local bookshop. Pick that ice-cream flavour you've never dared try (durian, anyone?).

Iterate

Neural networks learn by repetition. They start with random guesses about what might work, then they process each data point a hundred, maybe a hundred thousand, times, check the answer, adjust weights, and get a little better each time. 
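
Here's that loop in miniature, as a sketch of my own (not Graham's code): a single weight, fitted by guessing, checking the error, and nudging, over and over.

    import numpy as np

    rng = np.random.default_rng(42)
    x = rng.uniform(-1, 1, 100)
    y = 3.0 * x + rng.normal(0, 0.1, 100)   # the 'truth' is y = 3x plus noise

    w = rng.normal()            # start with a random guess
    learning_rate = 0.1

    for epoch in range(200):    # iterate: predict, measure error, adjust, repeat
        error = w * x - y
        gradient = 2 * np.mean(error * x)    # derivative of mean squared error w.r.t. w
        w -= learning_rate * gradient

    print(round(w, 2))          # close to 3.0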

Life lesson: practice makes perfect. You won't get anything right the first time (if you do, celebrate!). The important thing is that you pay attention, figure out what to change, and tweak it. Then try again.

More data

One of the things we know for sure about neural networks is that they work best when they train on a lot of data. They need to see as much of the problem domain as possible, including the edge cases and the worst cases.

Life lesson: seek data. If you're a geologist, get out into the field and see more rocks. Geophysicists: look at more seismic. Whoever you are, read more. Afterwards, share what you find with others, and listen to what they have learned.

Stretch metaphors

Yes, well, I could probably go on. Convolutional networks teach us to create new things by mixing ideas from different parts of our experience. Long training times for neural nets teach us to be patient, and invest in GPUs. Hidden layers with many units teach us to... er, expect a lot of parameters in our lives...?

Anyway, the point is that life is like a neural net. Or maybe, no less interestingly, neural nets are like life. My impression is that most of the innovations in deep learning have come from people looking at their own interpretive and discriminatory powers and asking, "What do I do here? How do I make these decisions?" — and then trying to approximate that heuristic or thought process in code.

What's the lesson here? I have no idea. Enjoy your weekend!


Thumbnail image by Flickr user latteda, licensed CC-BY. The Leading Edge cover is copyright of SEG, used here under fair use.

Results from the AAPG Machine Learning Unsession

Click here to visit the Google Doc write-up

Back in May, I co-hosted a different kind of conference session — an 'unsession' — at the AAPG Annual Conference and Exhibition in Salt Lake City, Utah. It was successful in achieving its main goal, which was to show the geoscience community and AAPG organizers a new way of collaborating, networking, and producing tangible outcomes from conference sessions.

It also succeeded in drawing out hundreds of ideas and questions around machine learning in geoscience. We have now combed over what the 120 people (roughly) produced on that afternoon, written it up in a Google Doc (right), and present some highlights right here in this post.

Click here to visit the Flickr photo album.

The unsession had three phases:

  1. Exploring current and future skills for geoscientists.

  2. Asking about the big questions in machine learning in geoscience.

  3. Digging into some of those questions.

Let's look at each one in turn.



Current and future skills

As an icebreaker, we asked everyone to list three skills they have that set them apart from others in their teams or organizations — their superpowers, if you will. They wrote these on green Post-It notes. We also asked for three more skills they didn't have today, but wanted to acquire in the next decade or so. These went on orange Post-Its. We were especially interested in those skills that felt intimidating or urgent. The 8 or 10 people at each table then shared these with each other, by way of introducing themselves.

The skills are listed in this Google Sheets document.

Unsurprisingly, the most common 'skills I have' were around geoscience: seismic interpretation, seismic analysis, stratigraphy, engineering, modeling, sedimentology, petrophysics, and programming. And computational methods dominated the 'skills I want' category: machine learning, Python, coding or programming, deep learning, statistics, and mathematics.

We followed this up with a more general question — 'How would you rate the industry's preparedness for this picture of the future, as implied by the skill gap we've identified?' People could substitute 'industry' for whatever similar-scale institution felt meaningful to them. As shown (right), this resulted in a bimodal distribution: apparently there are two ways to think about the future of applied geoscience — this may merit more investigation with a more thorough survey.

Get the histogram data.


Big questions in ML

After the icebreaker, we asked the tables to respond to a big question:

What are the most pressing questions in applied geoscience that can probably be tackled with machine learning?

We realized that this sounds a bit 'hammer looking for a nail', but justified asking the question this way by drawing an analogy with other important new tools of the past — well logging, or 3D seismic, or sequence stratigraphy. The point is that we have this powerful new (to us) set of tools; what are we going to look at first? At this point, we wanted people to brainstorm, without applying constraints like time or money.

This yielded approximately 280 ideas, all documented in the Google Sheet. Once the problems had been captured, the tables rotated so that each team walked to a neighboring table, leaving all their problems behind... and adopting new ones. We then asked them to score the new problems on two axes: scope (local vs global problems) and tractability (easy vs hard problems). This provided the basis for each table to choose one problem to take to the room for voting (each person had 9 votes to cast). This filtering process resulted in the following list:

  1. How do we communicate error and uncertainty when using machine learning models and solutions? 85 votes.

  2. How do we account for data integration, integrity, and provenance in our models? 78 votes.

  3. How do we revamp the geoscience curriculum for future geoscientists? 71 votes.

  4. What does guided, searchable, legacy data integration look like? 68 votes.

  5. How can machine learning improve seismic data quality, or provide assistive technology on poor data? 65 votes.

  6. How does the interpretability of machine learning model predictions affect their acceptance? 54 votes.

  7. How do we train a model to assign value to prospects? 51 votes.

  8. How do we teach artificial intelligences foundational geology? 45 votes.

  9. How can we implement automatic core description? 42 votes.

  10. How can we contain bad uses of AI? 40 votes.

  11. Is self-steering well drilling possible? 21 votes.

I am paraphrasing most of those, but you can read the originals in the Google Sheet data harvest.


Exploring the questions

In the final stage of the afternoon, we took the top 6 questions from the list above, and dug into them a little deeper. Tables picked their way through our Solution Sketchpads — especially updated for machine learning problems — to help them navigate the problems. Clearly, these questions were too enormous to make much progress in the hour or so left in the day, but the point here was to sound out some ideas, identify some possible actions, and connect with others interested in working on the problem.

One of the solution sketches is shown here (right), for the 'Revamp the geoscience curriculum' problem. They discussed the problem animatedly for an hour.

This team included — among others — an academic geostatistician, an industry geostatistician, a PhD student, a DOE geophysicist, an SEC geologist, and a young machine learning brainbox. Amazingly, this kind of diversity was typical of the tables.

See the rest of the solution sketches in Flickr.


That's it! Many thanks to Evan Bianco for the labour of capturing and digitizing the data from the event. Thanks also to AAPG for the great photos, and for granting them an open license. And thank you to my co-chairs Brendon Hall and Yan Zaretskiy of Enthought, and all the other folks who helped make the event happen — see the Productive chaos post for details.

To dig deeper, look for the complete write-up in Google Docs, and the photos in Flickr.



Just a reminder... if it's Python and machine learning skills you want, we're running a Summer School in downtown Houston the week of 13 August. Come along and get your hands on the latest in geocomputing methods. Suitable for beginners or intermediate programmers.

Don't miss out! Find out more or register now.

Visualization in Copenhagen, part 2

In Part 1, I wrote about six of the projects teams contributed at the Subsurface Hackathon in Copenhagen in June. Today I want to tell you about the rest of them. 


A data exploration tool

Team GeoClusterFu...n: Dan Stanton (University of Leeds), Filippo Broggini (ETH Zürich), Francois Bonneau (Nancy), Danny Javier Tapiero Luna (Equinor), Sabyasachi Dash (Cairn India), Nnanna Ijioma (geophysicist). 

Tech: Plotly Dash. GitHub repo.

Project: The team set out to build an interactive web app — a totally new thing for all of them — to make interactive plots from data in a CSV. They ended up with the basis of a useful tool for exploring geoscience data. Project page.

Four sixths of the GeoClusterFu...n team cluster around a laptop.


AR outcrop on your phone

Team SmARt_OGs: Brian Burnham (University of Aberdeen), Tala Maria Aabø (Natural History Museum of Denmark), Björn Wieczoreck, Georg Semmler and Johannes Camin (GiGa Infosystems).

Tech: ARKit/ARCore, WebAR, Firebase. GitLab repo. 

Project: Bjørn and his colleagues from GiGa Infosystems have been at all the European hackathons. This time, he knew he wanted to get virtual outcrops on mobiles phones. He found a willing team, and they got it done! Project page.

Three views from the SmARt_OGs' video. See the full version.


Rock clusters in latent space

The Embedders: Lukas Mosser (Imperial College London), Jesper Dramsch (Technical University of Denmark), Ben Fischer (PricewaterhouseCoopers), Harry McHugh (DUG), Shubhodip Konar (Cairn India), Song Hou (CGG), Peter Bormann (ConocoPhillips).

Tech: Bokeh, scikit-learn, Multicore-TSNE. GitHub repo.

Project: There has been a lot of recent interest in the t-SNE algorithm as a way to reduce the dimensionality of complex data. The team explored its use on subsurface data, and found promising applications. Web page. Project page.

The Embedders built a web app to cluster the data in an LAS file. The clusters (top left) are generated by the t-SNE algorithm.
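
If you haven't met t-SNE, here's a minimal sketch of that kind of workflow, reducing some fake well-log curves to two dimensions with scikit-learn. The synthetic data and parameter choices are mine, not the team's:

    import numpy as np
    from sklearn.manifold import TSNE

    # Fake 'well log' samples: 500 depths, 4 curves, drawn from two loose
    # clusters standing in for two facies.
    rng = np.random.default_rng(0)
    facies_a = rng.normal([60, 2.3, 0.25, 90], [6, 0.2, 0.03, 9], (250, 4))
    facies_b = rng.normal([110, 2.6, 0.10, 70], [11, 0.3, 0.01, 7], (250, 4))
    logs = np.vstack([facies_a, facies_b])

    # Standardize, then embed into 2D; similar samples end up near each other.
    logs = (logs - logs.mean(axis=0)) / logs.std(axis=0)
    embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(logs)

    print(embedding.shape)   # (500, 2), ready to colour by cluster and plot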


Fully mixed reality

Team Hands On GeoLabs: Will Sanger (Western Geco), Chance Sanger (Houston Museum of Fine Arts), Pierre Goutorbe (Total), Fernando Villanueva (Institut de Physique du Globe de Paris).

Project: Starting with the ambitious goal of combining the mixed reality of the Meta AR gear with the mixed reality of the Gempy sandbox, the team managed to display some seismic data in the AR headset and interact with it using simple hand gestures. Project page.

The team demonstrate the Meta AR headset.


Huge grids over the web

Team Grid Vizards: Fabian Kampe, Daniel Buse, Jonas Kopcsek, Paul Gabriel (all from GiGa Infosystems)

Tech: three.js. GitHub repo.

Project: Paul and his team wanted to visualize hundreds of millions or billions of grid cells — all in the browser. They ended up with about 20 million points working very smoothly, and impressed everyone. Project page.


Interpreting RGB displays for spec decomp

Team: Florian Smit (Technical University of Denmark), Gijs Straathof (SGS), Thomas Gazzola (Total), Julien Capgras (Total), Steve Purves (Euclidity), Tom Sandison (Shell)

Tech: Python, react.js. GitHub repos: Client. Backend.

Project: Spectral decomposition is still a mostly qualitative tool, especially the interpretation of RGB-blended displays. This team set out to make intuitive, attractive forward models of the spectral response of wells. This should help interpret seismic data, and perhaps make more useful RGB displays too. Intriguing and promising work. Project page.
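
For background, an RGB blend simply maps the amplitudes in three frequency bands to the red, green, and blue channels. Here's a rough sketch of the idea on a synthetic section; the bands and filter details are my own choices, not the team's:

    import numpy as np
    from scipy.signal import butter, filtfilt, hilbert

    def band_envelope(section, low, high, fs):
        """Bandpass a section along the time axis, then take the amplitude envelope."""
        b, a = butter(4, [low, high], btype='band', fs=fs)
        return np.abs(hilbert(filtfilt(b, a, section, axis=0), axis=0))

    rng = np.random.default_rng(1)
    fs = 250                                  # sample rate in Hz (4 ms sampling)
    seismic = rng.normal(size=(500, 200))     # synthetic 'seismic': time x traces

    bands = [(10, 20), (20, 40), (40, 80)]    # Hz; arbitrary picks for the sketch
    rgb = np.dstack([band_envelope(seismic, lo, hi, fs) for lo, hi in bands])
    rgb /= rgb.max(axis=(0, 1))               # scale each channel to [0, 1]

    # rgb is an (nt, ntraces, 3) array you can pass straight to plt.imshow().
    print(rgb.shape)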


That's it for another year! Twelve new geoscience visualization projects — ten of them open source. And another fun, creative weekend for 63 geoscientists — all of whom left with new connections and new skills. All this compressed into one weekend. If you haven't experienced a hackathon yet, I urge you to seek one out.

I will leave you with two videos — and an apology. We are so focused on creating a memorable experience for everyone in the room, that we tend to neglect the importance of capturing what's happening. Early hackathons only had the resulting blog post as the document of record, but lately we've been trying to livestream the demos at the end. Our success has been, er, mixed... but they were especially wonky this time because we didn't have livestream maestro Gram Ganssle there. So, these videos exist, and are part of the documentation of the event, but they barely begin to convey the awesomeness of the individuals, the teams, or their projects. Enjoy them, but next time — you should be there!

Visualization in Copenhagen, part 1


It's finally here! The round-up of projects from the Subsurface Hackathon in Copenhagen last month. This is the first of two posts presenting the teams and their efforts, in the same random order the teams presented them at the end of the event.


Subsurface data meets Pokemon Go

Team Geo Go: Karine Schmidt, Max Gribner, Hans Sturm (all from Wintershall), Stine Lærke Andersen (University of Copenhagen), Ole Johan Hornenes (University of Bergen), Per Fjellheim (Emerson), Arne Kjetil Andersen (Emerson), Keith Armstrong (Dell EMC). 

Project: With Pokemon Go as inspiration, the team set out to prototype a geoscience visualization app that placed interactive subsurface data elements into a realistic 3D environment.


Visualizing blind spots in data

Team Blind Spots: Jo Bagguley (UK Oil & Gas Authority), Duncan Irving (Teradata), Laura Froelich (Teradata), Christian Hirsch (Aalborg University), Sean Walker (Campbell & Walker Geophysics).

Tech: Flask, Bokeh, AWS for hosting app. GitHub repo.

Project: Data management always comes up as an issue in conversations about geocomputing, but few are bold enough to tackle it head on. This team built components for checking the integrity of large amounts of raw data, before passing it to data science projects. Project page.

Sean, Laura, and Christian. Jo and Duncan were out doing research. Note the kanban board in the background — agile all the way!


Volume uncertainties visualization

Team Fortuna: Natalia Shchukina (Total), Behrooz Bashokooh (Shell), Tobias Staal (University of Tasmania), Robert Leckenby (now Agile!), Graham Brew (Dynamic Graphics), Marco van Veen (RWTH Aachen). 

Tech: Flask, Bokeh, Altair, Holoviews. GitHub repo.

Project: Natalia brought some data with her: lots of surface grids. The team built a web app to compute uncertainty sections and maps, then display them dynamically and interactively — eliciting audible gasps from the room. Project page.

The Fortuna app: Probability of being in the zone (left) and entropy (right). Cross-sections are shown at the top, maps on the bottom.


Differences and similarities with RGB blends

Team RGBlend: Melanie Plainchault and Jonathan Gallon (Total), Per Olav Svendsen, Jørgen Kvalsvik and Max Schuberth (Equinor).

Tech: Python, Bokeh. GitHub repo.

Project: One of the more intriguing ideas of the hackathon was not so much a fancy visualization technique as a novel way of producing a visualization — differencing 3 images and visualizing the differences in RGB space. It reminded me of an old blog post about the spot-the-difference game. Project page.

The differences (lower right) between three time-lapse seismic amplitude maps.
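
Here's a rough numpy reconstruction of the idea, sketched from the description above rather than from the team's code: take three co-registered maps, difference them pairwise, and show the differences as the red, green, and blue channels.

    import numpy as np
    import matplotlib.pyplot as plt

    # Three co-registered amplitude maps, e.g. from three time-lapse surveys.
    rng = np.random.default_rng(7)
    base = rng.normal(size=(300, 300))
    monitor_1 = base + 0.2 * rng.normal(size=base.shape)   # small change
    monitor_2 = base + 0.5 * rng.normal(size=base.shape)   # bigger change

    def normalize(a):
        """Scale an array to [0, 1] so it can be used as a colour channel."""
        return (a - a.min()) / (a.max() - a.min())

    # Each pairwise difference becomes one colour channel.
    rgb = np.dstack([normalize(np.abs(monitor_1 - base)),
                     normalize(np.abs(monitor_2 - monitor_1)),
                     normalize(np.abs(monitor_2 - base))])

    plt.imshow(rgb)
    plt.show()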


Augmented reality geological maps

Team AR Sandbox: Simon Virgo (RWTH Aachen), Miguel de la Varga (RWTH Aachen), Fabian Antonio Stamm (RWTH Aachen), Alexander Schaaf (University of Aberdeen).

Tech: Gempy. GitHub repo.

Project: I don't have favourite projects, but if I did, this would be it. The GemPy group had already built their sandbox when they arrived, but they extended it during the hackathon. Wonderful stuff. Project page.

Magic box of sand: Sculpting a landscape (left), and the projected map (right). You can't even imagine how much fun it was to play with.


Augmented reality seismic wavefields

Team Sandbox Seismics: Yuriy Ivanov (NTNU Trondheim), Ana Lim (NTNU Trondheim), Anton Kühl (University of Copenhagen), Jean Philippe Montel (Total).

Tech: GemPy, Devito. GitHub repo.

Project: This team worked closely with Team AR Sandbox, but took it in a different direction. They instead read the velocity from the surface of the sand, then used Devito to simulate a seismic wavefield propagating across the model, and projected that wavefield onto the sand. See it in action in my recent Code Show post. Project page.

Yuriy Ivanov demoing the seismic wavefield moving across the sandbox.


Pretty cool, right? As usual, all of these projects were built during the hackathon weekend, almost exclusively by teams that formed spontaneously at the event itself (I think one team was self-contained from the start). If you didn't notice the affiliations of the participants — go back and check them out; I think this might have been an unprecedented level of collaboration!

Next time we'll look at the other six projects. [UPDATE: Next post is here.]

Before you go, check out this awesome video Wintershall made about the event. A massive thank you to them for supporting the event and for recording this beautiful footage — and for agreeing to share it under a CC-BY license. Amazing stuff!

Lots of news!

I can't believe it's been a month since my last post! But I've now recovered from the craziness of the spring — with its two hackathons, two conferences, two new experiments, as well as the usual courses and client projects — and am ready to start getting back to normal. My goal with this post is to tell you all the exciting stuff that's happened in the last few weeks.

Meet our newest team member

There's a new Agilist! Robert Leckenby is a British–Swiss geologist with technology tendencies. Rob has a PhD in Dynamic characterisation and fluid flow modelling of fractured reservoirs, and has worked in various geoscience roles in large and small oil & gas companies. We're stoked to have him in the team!

Rob lives near Geneva, Switzerland, and speaks French and several other human languages, as well as Python and JavaScript. He'll be helping us develop and teach our famous Geocomputing course, among other things. Reach him at robert@agilescientific.com.


Geocomputing Summer School

We have trained over 120 geoscientists in Python so far this year, but most of our training is in private classes. We wanted to fix that, and open the Geocomputing class up for anyone to take. Well, anyone in the Houston area :) It's called Summer School, it's happening the week of 13 August, and it's a 5-day crash course in scientific Python and the rudiments of machine learning. It's designed to get you a long way up the learning curve. Read more and enroll.


A new kind of event

We have several more events happening this year, including hackathons in Norway and in the UK. But the event in Anaheim, right before the SEG Annual Meeting, is going to be a bit different. Instead of the usual Geophysics Hackathon, we're going to try a sprint around open source projects in geophysics. The event is called the Open Geophysics Sprint, and you can find out more here on events.agilescientific.com.

That site — events.agilescientific.com — is our new events portal, and our attempt to stay on top of the community events we are running. Soon, you'll be able to sign up for events on there too (right now, most of them are still handled through Eventbrite), but for now it's at least a place to see everything that's going on. Thanks to Diego for putting it together!

Code Show version 1.0

Last week we released Code Show version 1.0. In a new experiment, we teamed up with Total and the European Association of Geoscientists and Engineers at the EAGE Annual Conference and Exhibition in Copenhagen. Our goal was to bring a little of the hackathon to as many conference delegates as possible. We succeeded in reaching a few hundred people over the three days, making a lot of new friends in the process. See the action in this Twitter Moment.

What was on the menu?

The augmented reality sandbox that Simon Virgo and his colleagues brought from the University of Aachen. The sandbox displayed both a geological map generated by the GemPy 3D implicit geological modeling tool and a seismic wavefield animation generated by the Devito modeling and inversion project. Thanks to Yuriy Ivanov (NTNU) and others in his hackathon team for contributing the seismic modeling component.

Demos from the Subsurface Hackathon. We were fortunate to have lots of hackathon participants make time for the Code Show. Graham Brew presented the uncertainty visualizer his team built; Jesper Dramsch and Lukas Mosser showed off their t-SNE experiments; Florian Smit and Steve Purves demoed their RGB explorations; and Paul Gabriel shared the GiGa Infosystems projects in AR and 3D web visualization. Many thanks to those folks and their teams.

AR and VR demos by the Total team. Dell EMC provided HTC Vive and Meta 2 kits, with Dell Precision workstations, for people to try. They were a lot of fun, provoking several cries of disbelief and causing at least one person to collapse in a heap on the floor.

Python demos by the Agile team. Dell EMC also kindly provided lots more Dell Precision workstations for general use. We hooked up some BBC micro:bit microcontrollers, Microsoft Azure IoT DevKits, and other bits and bobs, and showed anyone who would listen what you can do with a few lines of Python. Thank you to Carlos da Costa (University of Edinburgh) for helping out!

Tech demos by engineers from Intel and INT. Both companies are very active in visualization research and generously spent time showing visitors their technology. 

The code show in full swing.

v 2.0 next year... maybe?

The booth experience was new to us. Quite a few people came to find us, so it was nice to have a base, rather than cruising around as we usually do. I'd been hoping to get more people set up with Python on their own machines, but this may be too in-depth for most people in a trade show setting. Most were happy to see some new things and maybe tap out some Python on a keyboard.

Overall, I'd call it a successful experiment. If we do it next year in London, we have a very good idea of how to shape an even more engaging experience. I think most visitors enjoyed themselves this year though; if you were one of them, we'd love to hear from you!

Big open data... or is it?

Huge news for data scientists and educators. Equinor, the company formerly known as Statoil, has taken a bold step into the open data arena. On Thursday last week, it 'disclosed' all of its subsurface and production data for the Volve oil field, located in the North Sea. 

What's in the data package?

A lot! The 40,000-file package contains 5 TB of data; that's 5,000 GB!


This collection is substantially larger, both deeper and broader, than any other open subsurface dataset I know of. Most excitingly, Equinor has released a broad range of data types, from reports to reservoir models: 3D and 4D seismic, well logs and real-time drilling records, and everything in between. The only slight problem is that the seismic data are bundled in very large files at the moment; we've asked for them to be split up.

Questions about usage rights

Regular readers of this blog will know that I like open data. One of the cornerstones of open data is access, and there's no doubt that Equinor have done something incredible here. It would be preferable not to have to register at all, but free access to this dataset — which I'm guessing cost more than USD 500 million to acquire — is an absolutely amazing gift to the subsurface community.

Another cornerstone is the right to use the data for any purpose. This involves the owner granting certain privileges, such as the right to redistribute the data (say, for a class exercise) or to share derived products (say, in a paper). I'm almost certain that Equinor intends the data to be used this way, but I can't find anything actually granting those rights. Unfortunately, if they aren't explicitly granted, the only safe assumption is that you cannot share or adapt the data.

For reference, here's the language in the CC-BY 4.0 licence:

 

Subject to the terms and conditions of this Public License, the Licensor hereby grants You a worldwide, royalty-free, non-sublicensable, non-exclusive, irrevocable license to exercise the Licensed Rights in the Licensed Material to:

  1. reproduce and Share the Licensed Material, in whole or in part; and
  2. produce, reproduce, and Share Adapted Material.
 

You can dig further into the requirements for open data in the Open Data Handbook.

The last thing we need is yet another industry dataset with unclear terms, so I hope Equinor attaches a clear licence to this dataset soon. Or, better still, just uses a well-known licence such as CC-BY (this is what I'd recommend). This will clear up the matter and we can get on with making the most of this amazing resource.

More about Volve

The Volve field was discovered in 1993, but not developed until 15 years later. It produced oil and gas for 8.5 years, starting on 12 February 2008 and ending on 17 September 2016, though about half of that came in the first 2 years (see below). The facility was the Maersk Inspirer jack-up rig, standing in 80 m of water, with an oil storage vessel in attendance. Gas was piped to Sleipner A. In all, the field produced 10 million Sm³ (63 million barrels) of oil, so is small by most standards, with a peak rate of 56,000 barrels per day.

Volve production over time in standard m³ (i.e. at 20°C). Multiply by 6.29 for barrels.

The production was from the Jurassic Hugin Formation, a shallow-marine sandstone with good reservoir properties, at a depth of about 3000 m. The top reservoir depth map from the discovery report in the data package is shown here. (I joined Statoil in 1997, not long after this report was written, and the sight of this page brings back a lot of memories.)

 

The top reservoir depth map from the discovery report. The Volve field (my label) is the small closure directly north of Sleipner East, with the 15/9-19 well on it.

 

Get the data

To explore the dataset, you must register in the 'data village', which Equinor has committed to maintaining for 2 years. It only takes a moment. You can get to it through this link.

Let us know in the comments what you think of this move, and do share what you get up to with the data!

Visualize this!

The Copenhagen edition of the Subsurface Hackathon is over! For three days during the warmest June in Denmark for over 100 years, 63 geoscientists and programmers cooked up hot code in the Rainmaking Loft, one of the coolest, and warmest, coworking spaces you've ever seen. As always, every one of the participants brought their A game, and the weekend flew by in a blur of creativity, coffee, and collaboration. And croissants.

Pierre enjoying the Meta AR headset that Dell EMC provided.

Our sponsors have always been unusually helpful and inspiring, pushing us to get more audacious, but this year they were exceptionally engaged and proactive. Dell EMC, in the form of David and Keith, provided some fantastic tech for the teams to explore; Total supported Agile throughout the organization phase, and Wintershall kindly arranged for the event to be captured on film — something I hope to be able to share soon. See below for the full credit roll!


During the event, twelve teams dug into the theme of visualization and interaction. As in Houston last September, we started the event on Friday evening, after the Bootcamp (a full day of informal training). We have a bit of process to form the teams, and it usually takes a couple of hours. But with plenty of pizza and beer for fuel, the evening flew by. After that, it was two whole days of coding, followed by demos from all of the teams and a few prizes. Check out some of the pictures:

Thank you very much to everyone that helped make this event happen! Truly a cast of thousands:

  • David Holmes of Dell EMC for unparalleled awesomeness.
  • The whole Total team, but especially Frederic Broust, Sophie Segura, Yannick Pion, and Laurent Baduel...
  • ...and also Arnaud Rodde for helping with the judging.
  • The Wintershall team, especially Andreas Beha, who also acted as a judge.
  • Brendon Hall of Enthought for sponsoring the event.
  • Carlos Castro and Kim Saabye Pedersen of Amazon AWS.
  • Mathias Hummel and Mahendra Roopa of NVIDIA.
  • Eirik Larsen of Earth Science Analytics for sponsoring the event and helping with the judging.
  • Duncan Irving of Teradata for sponsoring, and sorting out the T-shirts.
  • Monica Beech of Ikon Science for participating in the judging.
  • Matthias Hartung of Target for acting as a judge again.
  • Oliver Ranneries, plus Nina and Eva of Rainmaking Loft.
  • Christopher Backholm for taking such great photographs.

Finally, some statistics from the event:

  • 63 participants, including 8 women (still way too few, but 100% better than 4 out of 63 in Paris)
  • 15 students plus a handful of post-docs.
  • 19 people from petroleum companies.
  • 20 people from service and technology companies, including 7 from GiGa Infosystems!
  • 1 no-show, which I think is a new record.

I will write a summary of all the projects in a couple of weeks when I've caught my breath. In the meantime, you can read a bit about them on our new events portal. We'll be steadily improving this new tool over the coming weeks and months.

That's it for another year... except we'll be back in Europe before the end of the year. There's the FORCE Hackathon in Stavanger in September, then in November we'll be in Aberdeen and London running some events with the Oil and Gas Authority. If you want some machine learning fun, or are looking for a new challenge, please come along!

Simon Virgo (centre) and his colleagues in Aachen built an augmented reality sandbox, powered by their research group's software, Gempy. He brought it along and three teams attempted projects based on the technology. Above, some of the participants are having a scrum meeting to keep their project on track.