Big data in geoscience

Big data is what we got when the decision cost of deleting data became greater than the cost of storing it.
George Dyson, at Strata London

I was looking for something to do in London this week. Tempted by the Deep-water continental margins meeting in Piccadilly, I instead took the opportunity to attend a different kind of conference. The media group O'Reilly, led by the inspired Tim O'Reilly, organizes conferences. They're known for being energetic, quirky, and small-company-friendly. I wanted to see one, so I came to Strata.

Strata is the conference for big data, one of the woolliest buzzwords in computer science today. Some people are skeptical that it's anything other than a new way to provoke fear and uncertainty in IT executives, the only known way to make them spend money. Indeed, Google "big data" and the top 5 hits are: Wikipedia (obvsly), IBM, McKinsey, Oracle, and EMC. It might be hype, but all this attention might lead somewhere good. 

We're all big data scientists

Geoscientists, especially geophysicists, are unfazed by the concept of big data. The acquisition data from a 3D survey can easily require 10TB (10,240GB) or even 100TB of storage. The data must be written, read, processed, and re-written dozens of times during processing, then delivered, loaded, and interpreted. In geoscience, big data is normal data.
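To get a feel for where those terabytes come from, here is a back-of-envelope sketch. The survey geometry in it (shot count, receivers per shot, record length, sample interval) is made up for illustration and is not taken from any real survey. The function name is hypothetical too. The only fixed fact it leans on is that a SEG-Y trace carries a 240-byte header, and it assumes 4-byte samples.

```python
def survey_size_bytes(n_shots, n_receivers, record_s, sample_ms,
                      bytes_per_sample=4, trace_header_bytes=240):
    """Rough size of raw acquisition data, one trace per shot-receiver pair."""
    # Samples per trace, counting the sample at time zero.
    samples_per_trace = round(record_s * 1000 / sample_ms) + 1
    n_traces = n_shots * n_receivers
    bytes_per_trace = trace_header_bytes + samples_per_trace * bytes_per_sample
    return n_traces * bytes_per_trace


# Hypothetical survey: 100,000 shots, 5,000 live receivers per shot,
# 6 s records sampled every 2 ms.
size = survey_size_bytes(n_shots=100_000, n_receivers=5_000,
                         record_s=6.0, sample_ms=2.0)

# Binary units, matching the 10TB = 10,240GB convention used above.
size_tb = size / 1024**4
print(f"{size_tb:.1f} TB")
```

With these invented numbers the raw data alone lands in the multi-terabyte range, before any of the dozens of intermediate copies made during processing.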

So it's great that big data problems are being hacked on by thousands of developers, researchers, and companies that, until about a year ago, were only interested in games and the web. About 99% of them are not working on problems in geophysics or petroleum, but there will be insight and technology that will benefit our industry.

It's not just about data management. Some of the most creative data scientists in the world are at this conference. People are showing dense, and sometimes beautiful, visualizations of giant datasets, like the transport displays by James Cheshire's research group at UCL (right). I can't wait to show some of these people a SEG-Y or LAS file and, unencumbered by our curmudgeonly tradition of analog display metaphors, see how they would display it.

Would the wiggle display pass muster?