Closing the gap: what next?

I wrote recently about closing the gap between data science and the subsurface domain, naming some strategies that I think will speed up this process of digitalization.

But even after the gap has closed in your organization, you’re really just getting started. It’s not enough to have contact between the two worlds, you need most of your actvity to be there. This means moving it from wherever it is now. This means time, and effort, and mistakes.

Strategies for 2020+

Hardly any organizations have got to this point yet. And I certainly don’t know what it looks like when we get there as a discipline. But nonetheless I think I’m starting to see what’s going to be required to continue to build into the gap. Soon, we’re going to need to think about these things.

  • We’re bad at hiring; we need to be awesome at it*. We need to stop listening to the pop psychology peddled by HR departments (Myers-Briggs, lol) and get serious about hiring brilliant scientific and technical talent. It’s going to take some work because a lot of brilliant scientists and technical talent aren’t that interested in subsurface.

  • You need to get used to the idea that digital scientists can do amazing things quickly. These are not your corporate timelines. There are no weekly meetings. Protoyping a digital technology, or proving a concept, takes days. Give me a team of 3 people and I can give you a prototype this time next week.

  • You don’t have to do everything yourself. In fact, you don’t want to, because that would be horribly slow. For example, if you have a hard problem at hand, Kaggle can get 3000 teams of fantastically bright people to look at it for you. Next week.

  • We need benchmark datasets. If anyone is going to be able to test anything, or believe any claims about machine learning results, then we need benchmark data. Otherwise, what are we to make of claims like “98% accuracy”? (Nothing, it’s nonsense.)

  • We need faster research. We need to stop asking people for static, finished work — that they can only present with PowerPoint — months ahead of a conference… then present it as if it’s bleeding edge. And do you know how long it takes to get a paper into GEOPHYSICS?

  • You need Slack and Stack Overflow in your business. These two tools have revolutionized how technical people communicate and help each other. If you have a large organization, then your digital scientists need to talk to each other. A lot. Skype and Yammer won’t do. Check out the Software Underground if you want to see how great Slack is.

Even if your organization is not quite ready for these ideas yet, you can start laying the groundwork. Maybe your team is ready. You only need a couple of allies to get started; there’s always something you can do right now to bring change a little sooner. For example, I bet you can:

  • List 3 new places you could look for amazing, hireable scientists to start conversations with.

  • Find out who’s responsible for technical communities of practice and ask them about Slack and SO.

  • Find 3 people who’d like to help organize a hackathon for your department before the summer holidays.

  • Do some research about what it takes to organize a Kaggle-style contest.

  • Get with a colleague and list 3 datasets you could potentially de-locate and release publically.

  • Challenge the committe to let you present in a new way at your next technical conference.

I guarantee that if you pick up one of these ideas and run with it for a bit, it’ll lead somewhere fun and interesting. And if you need help at some point, or just want to talk about it, you know where to find us!


* I’m not being flippant here. Next time you’re at a conference, go and talk to the grad students, all sweaty in their suits, getting fake interviews from recruiters. Look at their CVs and resumes. Visit the recruitment room. Go and look at LinkedIn. The whole thing is totally depressing. We’ve trained them to present the wrong versions of themselves.

Closing the analytics–domain gap

I recently figured out where Agile lives. Or at least where we strive to live. We live on the isthmus — the thin sliver of land — between the world of data science and the domain of the subsurface.

We’re not alone. A growing number of others live there with us. There’s an encampment; I wrote about it earlier this week.

Backman’s Island, one of my favourite kayaking destinations, is a passable metaphor for the relationship between machine learning and our scientific domain.

Backman’s Island, one of my favourite kayaking destinations, is a passable metaphor for the relationship between machine learning and our scientific domain.

Closing the gap in your organization

In some organizations, there is barely a connection. Maybe a few rocks at low tide, so you can hop from one to the other. But when we look more closely we find that the mysterious and/or glamorous data science team, and the stories that come out of it, seem distinctly at odds with the daily reality of the subsurface professionals. The VP talks about a data-driven business, deep learning, and 98% accuracy (whatever that means), while the geoscientists and engineers battle with raster logs, giant spreadsheets, and trying to get their data from Petrel into ArcGIS (or, help us all, PowerPoint) so they can just get on with their day.

We’re not going to learn anything from those organizations, except maybe marketing skills.

We can learn, however, from the handful of organizations, or parts of them, that are serious about not only closing the gap, but building new paths, and infrastructure, and new communities out there in the middle. If you’re in a big company, they almost certainly exist somewhere in the building — probably keeping their heads down because they are so productive and don’t want anyone messing with what they’ve achieved.

Here are some of the things they are doing:

  • Blending data science teams into asset teams, sitting machine learning specialists with subsurface scientists and engineers. Don’t make the same mistake with machine learning that our industry made with innovation — giving it to a VP and trying to bottle it. Instead, treat it like Marmite: spread it very thinly on everything.*

  • Treating software like knowledge sharing. Code is, hands down, the best way to share knowledge: it’s unambiguous, tested (we hope anyway), and — above all — you can actually use it. Best practice documents are I’m afraid, not worth the paper they would be printed on if anyone even knew how to find them.

  • Learning to code. OK, I’m biased because we train people… but it seriously works. When you have 300 geoscientists in your organization that embrace computational thinking, that can write a function in Python, that know what a support vector machine is for — that changes things. It changes every conversation.

  • Providing infrastructure for digital science. Once you have people with skills, you need people with powers. The power to install software, instantiate a virtual machine, or recruit a coder. You need people with tools, like version control, continuous integration, and communities of practice.

  • Realizing that they need to look in new places. Those much-hyped conversations everyone is having with Google or Amazon are, admittedly, pretty cool to see in the extractive industries (though if you really want to live on the cutting edge of geospatial analytics, you should probably be talking to Uber). You will find more hope and joy in Kaggle, Stack Overflow, and any given hackathon than you will in any of the places you’ve been looking for ‘innovation’ for the last 20 years.

This machine learning bandwagon we’re on is not about being cool, or giving keynotes, or saying ‘deep learning’ and ‘we’re working with Google’ all the time. It’s about equipping subsurface professionals to make better and safer scientific, industrial, and business decisions with more evidence and more certainty.

And that means getting serious about closing that gap.


I thought about this gap, and Agile’s place in it — along with the several hundred other digital subsurface scientists in the world — after drawing an attempt at drawing the ‘big picture’ of data science on one of our courses recently. Here’s a rendering of that drawing, without further comment. It didn’t quite fit with my ‘sliver of land’ analogy somehow…

On the left, the world of ‘advanced analytics’, on the right, how the disciplines of data science and earth science overlap on and intersect the computational world. We live in the green belt. (yes, we could argue for hours about these terms, but let’s not.)

On the left, the world of ‘advanced analytics’, on the right, how the disciplines of data science and earth science overlap on and intersect the computational world. We live in the green belt. (yes, we could argue for hours about these terms, but let’s not.)


* If you don’t know what Marmite is, it’s not too late to catch up.

Digitalization... of what?

I've been hearing a lot about 'digitalization', or 'digital transformation', recently. What is this buzzword?

The general idea seems to be: exploit lots and lots of data (which we already have), with analytics and machine learning probably, to do a better job finding and producing fuel and energy safely and responsibly.

At the centre of it all is usually data. Lots of data, usually in a lake. And this is where it all goes wrong. Digitalization is not about data. And it's not about technology either. Or cloud. Or IoT.

Interest in the terms "digital transformation" and "digitalization" since 2004, according to  Google Trends . The data reveal a slight preference for the term "digitalization" in central and northern Europe.  Google Ngram Viewer  indicates that the term "digitalization" has been around for over 100 years, but it is also a medical term connected with the therapeutic use of  digitalis . Just to be clear, that's not what we're talking about.

Interest in the terms "digital transformation" and "digitalization" since 2004, according to Google Trends. The data reveal a slight preference for the term "digitalization" in central and northern Europe. Google Ngram Viewer indicates that the term "digitalization" has been around for over 100 years, but it is also a medical term connected with the therapeutic use of digitalis. Just to be clear, that's not what we're talking about.

It's about people

Oh no, here I go with the hand-wavy, apple-pie "people not process" nonsense... well, yes. I'm convinced that it's humans we're transforming, not data or technology. Or clouds. Or Things.

I think it's worth spelling out, because I think most corporations have not grasped the human aspect yet. And I don't think it's unreasonable to say that petroleum has a track record of not putting people at the centre of its activities, so I worry that this will happen again. This would be bad, because it might mean that digitalization not only fails to get traction — which would be bad enough because this revolution is long overdue — but also that it causes unintended problems.

Without people, digital transformation is just another top-down 'push' effort, with too much emphasis on supply. I think it's smarter to create demand, or 'pull', so that professionals are asking for support, and tools, and databases, and are engaged in how those things are created.

Put technical professionals at the heart of the revolution, and the rest will follow. The inverse is not true.

Strategies

This is far from an exhaustive list, but here are some ideas for ways to get ahead in digital transformation:

  • Make it easy for digitally curious people to dip a toe in. Build a beginner-friendly computing environment, and encourage people to use it. Challenge your IT people to support a culture of experimentation and creativity. 
  • Give those curious professionals access to professional development channels, whether it's our courses, other courses, online channels like Lynda.com or Coursera, or whatever. 
  • Build a community of practice for 'scientific computing'. Whether it's a Yammer group or something more formal, be sure to encourage frequent face-to-face meetups, and perhaps an intranet portal.
  • Start to connect subsurface professionals with software engineers, especially web programmers and data scientists, elsewhere in the organization. I think the best way is to embed programmers into technical teams. 
  • Encourage participation in external channels like conferences and publications, data science contests, hackathons, open source projects, and so on. I guarantee you'll see a step change in skills and enthusiasm.

The bottom line is that we're looking for a profound culutral change. It will be slow. More than that, it needs to be slow. It might only take a year or two to get traction for an idea like "digital first". But deeper concepts, like "machine readable microservices" or "data-driven decisions" or "reproducible workflows", must take longer because you can't build that high without a solid foundation. Successfully applying specific technologies like deep learning, augmented reality, or blockchain, will certainly require a high level of technology literacy, and will take years to get right.

What's going on with scientific computing in your organization? Are you 'digitally curious'? Do you feel well-supported? Do you know others in your organization like you?


The circuit board image in the thumbnail for this post is by Carl Drougge, licensed CC-BY-SA.

Your competitive advantage is...

Two weeks ago I explained why I think competitive advantage is often attached to the wrong things. So corporations hide things they would get more benefit from sharing. This week, I offer one opinion on what competitive advantage is, and how to get more of it.

Data are often not even treated as an asset, never mind an advantageI recently happened on a blog by Rob Karel about business process. That sounds a bit dull perhaps, but I liked this:

The "data is an asset" rhetoric doesn't translate to [monetary value], because data in and of itself has no value! The only value data/information has to offer... is in the context of the business processes, decisions, customer experiences, and competitive differentiators it can enable.

So if data are not a competitive advantage, and neither are most of the software and industrial technology we use, and nor yet are industry training courses or consoritum-based research programs, then what is? My list follows after the jump.

Read More