Closing the gap: what next?

I wrote recently about closing the gap between data science and the subsurface domain, naming some strategies that I think will speed up this process of digitalization.

But even after the gap has closed in your organization, you’re really just getting started. It’s not enough to have contact between the two worlds; you need most of your activity to be there. This means moving it from wherever it is now. This means time, and effort, and mistakes.

Strategies for 2020+

Hardly any organizations have got to this point yet. And I certainly don’t know what it looks like when we get there as a discipline. But nonetheless I think I’m starting to see what’s going to be required to continue to build into the gap. Soon, we’re going to need to think about these things.

  • We’re bad at hiring; we need to be awesome at it*. We need to stop listening to the pop psychology peddled by HR departments (Myers-Briggs, lol) and get serious about hiring brilliant scientific and technical talent. It’s going to take some work because a lot of brilliant scientists and technical talent aren’t that interested in subsurface.

  • You need to get used to the idea that digital scientists can do amazing things quickly. These are not your corporate timelines. There are no weekly meetings. Prototyping a digital technology, or proving a concept, takes days. Give me a team of 3 people and I can give you a prototype this time next week.

  • You don’t have to do everything yourself. In fact, you don’t want to, because that would be horribly slow. For example, if you have a hard problem at hand, Kaggle can get 3000 teams of fantastically bright people to look at it for you. Next week.

  • We need benchmark datasets. If anyone is going to be able to test anything, or believe any claims about machine learning results, then we need benchmark data. Otherwise, what are we to make of claims like “98% accuracy”? (Nothing, it’s nonsense.)

  • We need faster research. We need to stop asking people for static, finished work — that they can only present with PowerPoint — months ahead of a conference… then present it as if it’s bleeding edge. And do you know how long it takes to get a paper into GEOPHYSICS?

  • You need Slack and Stack Overflow in your business. These two tools have revolutionized how technical people communicate and help each other. If you have a large organization, then your digital scientists need to talk to each other. A lot. Skype and Yammer won’t do. Check out the Software Underground if you want to see how great Slack is.
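
To make the ‘98% accuracy’ point from the list above concrete, here is a minimal sketch in plain Python. The labels are invented for illustration (a hypothetical pay / no-pay classification with 2 pay intervals in 100), and the helper functions are assumptions of this sketch, not anyone’s published method. It shows a model that has learned nothing still scoring 98%, which is why a headline number means little without a shared benchmark and an agreed metric.

```python
# Hypothetical, made-up labels: 98 "no pay" intervals and 2 "pay" intervals.
y_true = ["no pay"] * 98 + ["pay"] * 2

# A "model" that has learned nothing: it always predicts the majority class.
y_pred = ["no pay"] * len(y_true)


def accuracy(truth, pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(truth, pred)) / len(truth)


def recall(truth, pred, label):
    """Fraction of the true `label` samples that the model actually found."""
    relevant = [p for t, p in zip(truth, pred) if t == label]
    return sum(p == label for p in relevant) / len(relevant)


acc = accuracy(y_true, y_pred)
rec = recall(y_true, y_pred, "pay")
print(f"accuracy: {acc:.0%}, recall on 'pay': {rec:.0%}")
```

The do-nothing model reports high accuracy while finding none of the intervals anyone cares about. With a public benchmark dataset, everyone can at least compute the same metrics on the same data.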

Even if your organization is not quite ready for these ideas yet, you can start laying the groundwork. Maybe your team is ready. You only need a couple of allies to get started; there’s always something you can do right now to bring change a little sooner. For example, I bet you can:

  • List 3 new places you could look for amazing, hireable scientists to start conversations with.

  • Find out who’s responsible for technical communities of practice and ask them about Slack and SO.

  • Find 3 people who’d like to help organize a hackathon for your department before the summer holidays.

  • Do some research about what it takes to organize a Kaggle-style contest.

  • Get with a colleague and list 3 datasets you could potentially de-locate and release publicly.

  • Challenge the committee to let you present in a new way at your next technical conference.

I guarantee that if you pick up one of these ideas and run with it for a bit, it’ll lead somewhere fun and interesting. And if you need help at some point, or just want to talk about it, you know where to find us!

* I’m not being flippant here. Next time you’re at a conference, go and talk to the grad students, all sweaty in their suits, getting fake interviews from recruiters. Look at their CVs and resumes. Visit the recruitment room. Go and look at LinkedIn. The whole thing is totally depressing. We’ve trained them to present the wrong versions of themselves.

Closing the analytics–domain gap

I recently figured out where Agile lives. Or at least where we strive to live. We live on the isthmus — the thin sliver of land — between the world of data science and the domain of the subsurface.

We’re not alone. A growing number of others live there with us. There’s an encampment; I wrote about it earlier this week.

Backman’s Island, one of my favourite kayaking destinations, is a passable metaphor for the relationship between machine learning and our scientific domain.

Closing the gap in your organization

In some organizations, there is barely a connection. Maybe a few rocks at low tide, so you can hop from one to the other. But when we look more closely we find that the mysterious and/or glamorous data science team, and the stories that come out of it, seem distinctly at odds with the daily reality of the subsurface professionals. The VP talks about a data-driven business, deep learning, and 98% accuracy (whatever that means), while the geoscientists and engineers battle with raster logs, giant spreadsheets, and trying to get their data from Petrel into ArcGIS (or, help us all, PowerPoint) so they can just get on with their day.

We’re not going to learn anything from those organizations, except maybe marketing skills.

We can learn, however, from the handful of organizations, or parts of them, that are serious about not only closing the gap, but building new paths, and infrastructure, and new communities out there in the middle. If you’re in a big company, they almost certainly exist somewhere in the building — probably keeping their heads down because they are so productive and don’t want anyone messing with what they’ve achieved.

Here are some of the things they are doing:

  • Blending data science teams into asset teams, sitting machine learning specialists with subsurface scientists and engineers. Don’t make the same mistake with machine learning that our industry made with innovation — giving it to a VP and trying to bottle it. Instead, treat it like Marmite: spread it very thinly on everything.*

  • Treating software like knowledge sharing. Code is, hands down, the best way to share knowledge: it’s unambiguous, tested (we hope anyway), and, above all, you can actually use it. Best practice documents are, I’m afraid, not worth the paper they would be printed on, if anyone even knew how to find them.

  • Learning to code. OK, I’m biased because we train people… but it seriously works. When you have 300 geoscientists in your organization that embrace computational thinking, that can write a function in Python, that know what a support vector machine is for — that changes things. It changes every conversation.

  • Providing infrastructure for digital science. Once you have people with skills, you need people with powers. The power to install software, instantiate a virtual machine, or recruit a coder. You need people with tools, like version control, continuous integration, and communities of practice.

  • Realizing that they need to look in new places. Those much-hyped conversations everyone is having with Google or Amazon are, admittedly, pretty cool to see in the extractive industries (though if you really want to live on the cutting edge of geospatial analytics, you should probably be talking to Uber). You will find more hope and joy in Kaggle, Stack Overflow, and any given hackathon than you will in any of the places you’ve been looking for ‘innovation’ for the last 20 years.

This machine learning bandwagon we’re on is not about being cool, or giving keynotes, or saying ‘deep learning’ and ‘we’re working with Google’ all the time. It’s about equipping subsurface professionals to make better and safer scientific, industrial, and business decisions with more evidence and more certainty.

And that means getting serious about closing that gap.

I thought about this gap, and Agile’s place in it (along with the several hundred other digital subsurface scientists in the world), after attempting to draw the ‘big picture’ of data science on one of our courses recently. Here’s a rendering of that drawing, without further comment. It didn’t quite fit with my ‘sliver of land’ analogy somehow…

On the left, the world of ‘advanced analytics’; on the right, how the disciplines of data science and earth science overlap on and intersect the computational world. We live in the green belt. (Yes, we could argue for hours about these terms, but let’s not.)

* If you don’t know what Marmite is, it’s not too late to catch up.

Digitalization... of what?

I've been hearing a lot about 'digitalization', or 'digital transformation', recently. What is this buzzword?

The general idea seems to be: exploit lots and lots of data (which we already have), with analytics and machine learning probably, to do a better job finding and producing fuel and energy safely and responsibly.

At the centre of it all is usually data. Lots of data, usually in a lake. And this is where it all goes wrong. Digitalization is not about data. And it's not about technology either. Or cloud. Or IoT.

Interest in the terms "digital transformation" and "digitalization" since 2004, according to Google Trends. The data reveal a slight preference for the term "digitalization" in central and northern Europe. Google Ngram Viewer indicates that the term "digitalization" has been around for over 100 years, but it is also a medical term connected with the therapeutic use of digitalis. Just to be clear, that's not what we're talking about.

It's about people

Oh no, here I go with the hand-wavy, apple-pie "people not process" nonsense... well, yes. I'm convinced that it's humans we're transforming, not data or technology. Or clouds. Or Things.

I think it's worth spelling out, because I think most corporations have not grasped the human aspect yet. And I don't think it's unreasonable to say that petroleum has a track record of not putting people at the centre of its activities, so I worry that this will happen again. This would be bad, because it might mean that digitalization not only fails to get traction — which would be bad enough because this revolution is long overdue — but also that it causes unintended problems.

Without people, digital transformation is just another top-down 'push' effort, with too much emphasis on supply. I think it's smarter to create demand, or 'pull', so that professionals are asking for support, and tools, and databases, and are engaged in how those things are created.

Put technical professionals at the heart of the revolution, and the rest will follow. The inverse is not true.


This is far from an exhaustive list, but here are some ideas for ways to get ahead in digital transformation:

  • Make it easy for digitally curious people to dip a toe in. Build a beginner-friendly computing environment, and encourage people to use it. Challenge your IT people to support a culture of experimentation and creativity. 
  • Give those curious professionals access to professional development channels, whether it's our courses, other courses, online channels like Coursera, or whatever.
  • Build a community of practice for 'scientific computing'. Whether it's a Yammer group or something more formal, be sure to encourage frequent face-to-face meetups, and perhaps an intranet portal.
  • Start to connect subsurface professionals with software engineers, especially web programmers and data scientists, elsewhere in the organization. I think the best way is to embed programmers into technical teams. 
  • Encourage participation in external channels like conferences and publications, data science contests, hackathons, open source projects, and so on. I guarantee you'll see a step change in skills and enthusiasm.

The bottom line is that we're looking for a profound cultural change. It will be slow. More than that, it needs to be slow. It might only take a year or two to get traction for an idea like "digital first". But deeper concepts, like "machine readable microservices" or "data-driven decisions" or "reproducible workflows", must take longer because you can't build that high without a solid foundation. Successfully applying specific technologies like deep learning, augmented reality, or blockchain will certainly require a high level of technology literacy, and will take years to get right.

What's going on with scientific computing in your organization? Are you 'digitally curious'? Do you feel well-supported? Do you know others in your organization like you?

The circuit board image in the thumbnail for this post is by Carl Drougge, licensed CC-BY-SA.

Silos are a feature, not a bug

If you’ve had the same problem for a long time, maybe it’s not a problem. Maybe it’s a fact.
— Yitzhak Rabin

"Break down the silos" is such a hackneyed phrase that it's probably not even useful any more. But I still hear it all the time — especially in our work in governments and large enterprises. It's the wrong idea — silos are awesome.

The thing is: people like silos. That's why they are there. Whether they are project teams, organizational units, technical communities, or management layers, silos are comfortable. People can speak freely. Everyone shares the same values. There is trust. There is purpose.

The problem is that much of the time (maybe most of the time, for most people) you want silos not to matter. Don't confuse this with not wanting them to exist. They do exist: get used to it. So now make them not matter. Cope, don't fix.

Permeable seals

In the context of groups of humans who want to work together, what do permeable silos look like? I mean really leaky ones.

The answer is: it depends. Here are the features they will have:

  • They serve their organization. The silo must serve the organization or community it's part of. I think a service-oriented mindset gets the best out of people: get them asking "How can I help?". If it is not serving anyone, the silo needs to die.
  • They are internally effective. This is the whole point of the silo, so this had better be true. Make sure people can do a better job because of the efficiencies of the silo. Resources are provided. Responsibilities are understood. The shared purpose must result in great things... if not, the silo needs to die.
  • They are open. This is the leakiness criterion. If someone needs something from the silo, it must be obvious how to get it, and the cost of getting it must be very low. If someone wants to join the silo, it's obvious how to do this, and they are welcomed. If something about the silo needs to change, there is a clear path to making this known.
  • They are transparent. People need to know what the silo is for. If people look in, they can see how things work. Don't build secret clubs, black boxes, or other dark places. Conversely, if people in the silo want to look outside, they can. Importantly: if the silo's level of transparency doesn't make you uncomfortable, you're not doing enough of it.

The openness is key. Ideally, the mechanism for getting things from the silo is the same one that the silo's own inhabitants use. This is by far the simplest, cheapest way to nail it. Think of it as an interface; if you're a programmer, think of it as an API. Indeed, in many cases, it will involve an actual API. If this does not exist, other people will come up with their own solutions, and if this happens often enough, the silo will cease to be useful to the organization. Competition between silos is unhelpful.

Build more silos!

A government agency can be a silo, as long as it has a rich, free interface for other agencies and the general public to access its services. Geophysics can be a silo, as long as it's easy for a wave-curious engineer to join in, and the silo is promoting excellence and professional development amongst its members. An HR department can be a silo, as long as its practices and procedures are transparent and people can openly ask why the heck they still use Myers–Briggs tests.

Go and build a silo. Then make it not matter most of the time.

Image: Silos by Flickr user Guerretto, licensed CC-BY.

What will people pay for?

Many organizations in the industry are asking this question right now. Software and service companies would like to sell product, technical societies would like to survive diminished ad sales and conference revenue, entrepreneurs would like to find customers. We all need to make a living.

I was recently asked this very question by a technical society. However, it's utterly the wrong question. Even asking this question reveals a deep-seated misunderstanding of what technical societies are for.

The question is not "What will people pay for?", it's "What do people need?". 

The leaders of our profession

Geoscientists and engineers are professionals. Our professional contributions are defined by our work and its purpose, not by our jobs and their tasks. This is essentially what makes a professional different from other workers: we are purpose-oriented, not task-oriented. We're interested in the outcome, not the means.

But even professionals benefit from leadership. Professional regulators notwithstanding, our technical societies are the de facto leaders of the profession. The professional regulator is the 'line manager' of the profession, not the 'chief geoscientist'.

Leadership is about setting an example, inspiring great work, and providing the means to grow and make the best contributions people can make. Societies need to be asking themselves how they can create the conditions for a transformed profession, a more relevant and resilient one. In short, how can they be useful? How can they serve?

OK, so what do people need?

I don't claim to have all the answers, or even many of them, but here are some things I think people need:

  • Representation. Get serious about gender and race balance on your boards and committees. There is recent progress, but it's nowhere near representative. Related: get out of North America and improve global reach.
  • Better ways to contribute and connect. Experiment more — a lot more, and urgently — with meetings and conferences. Help people participate, not just attend. Help people connect, not just exchange business cards. 
  • New ways to contribute and connect. Get serious about social media. Get scientists involved — social media is not a marketing exercise. Think hard about how you can engage your members through blogs and other content.
  • Reproducible science. Go further with open access, open data, and open source code. Make your content work harder. Make it reach further. Demand more of your authors to make their work reproducible.
  • A bit less self-interest. Stop regarding things you didn't organize or produce as a threat. Other people's events and publications may be of interest to your members, and your mission is to serve them.

Don't listen to my blathering. The AGU and the EGU are real leaders in geoscience — be inspired by them, follow their lead. Pay more attention to what's happening in publishing and conferences in other technical verticals, especially technology.

Pie in the sky is still pie

People will say, "That's all great, Matt, but right now it's about survival." I get this a lot, but I'm not buying it. When times are good, you don't need to do the right thing; when times are hard, you can't afford to. True, all this would be easier if you'd started doing the right thing when times were good, but you didn't, so here we are.

Sure it's tough now, but are you sure you can afford to wait till tomorrow?

I've written lots before on these topics. Suggested reading: