What's inside? 52 things!

On Tuesday we announced our forthcoming community collaboration book. So what's in there? So much magic, it's hard to know where to start. Here's a list of the first dozen chapters:

  1. Anisotropy is not going away, Vladimir Grechka, Shell
  2. Beware the interpretation-to-data trap, Evan Bianco, Agile
  3. Calibrate your intuition, Taras Gerya, ETH Zürich
  4. Don’t ignore seismic attenuation, Carl Reine, Nexen
  5. Don’t neglect your math, Brian Russell, Hampson-Russell
  6. Don’t rely on preconceived notions, Eric Andersen, Talisman
  7. Evolutionary understanding in seismic interpretation, Clare Bond, University of Aberdeen
  8. Explore the azimuths, David Gray, Nexen
  9. Five things I wish I’d known, Matt Hall, Agile
  10. Geology comes first, Chris Jackson, Imperial College London
  11. Geophysics is all around, José M Carcione, OGS Trieste, Italy
  12. How to assess a colourmap, Matteo Niccoli, MyCarta blog
  13. ...

When I read that list, I cannot wait to read the book — and I've read it three times already! This is not even one quarter of the book. You can guess from the list that some are technical, others are personal, a few may be controversial.

One thing we had fun with was organizing the contents. The chapters are, as you see, in alphabetical order. But each piece has thematic tags. Some were a little hard to classify, I admit, and some people will no doubt wonder why, say, Bill Goodway's The magic of Lamé is labeled 'basics', but there you go.

I also used the tags to look for natural groupings among the chapters. Each chapter has three tags. If we connect the three tags belonging to each chapter, and do the same for every chapter, we can count the connections and draw a graph (right). I made this one in Gephi.

The layout is automatic: relative positions are calculated by modeling the connections as springs whose stiffness depends on the number of links. Node size is a function of connectedness. Isn't it great that geology is in the middle?
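I made the real graph in Gephi, but the same idea is easy to sketch in Python with NetworkX. The chapter titles below come from the list above; their tags, though, are invented for illustration, and NetworkX's spring layout stands in for Gephi's force-directed layout:

```python
import itertools
import networkx as nx

# Invented tags for three of the chapters (the real book has 52, three tags each)
chapters = {
    "Geology comes first": ["geology", "basics", "interpretation"],
    "Explore the azimuths": ["pre-stack", "attributes", "rock physics"],
    "How to assess a colourmap": ["basics", "attributes", "mapping"],
}

G = nx.Graph()
for tags in chapters.values():
    # Connect every pair of tags within a chapter; the edge weight
    # counts how many chapters share that pair
    for a, b in itertools.combinations(tags, 2):
        if G.has_edge(a, b):
            G[a][b]["weight"] += 1
        else:
            G.add_edge(a, b, weight=1)

# Force-directed layout, like Gephi's: heavier edges pull nodes closer together
pos = nx.spring_layout(G, weight="weight", seed=42)

# Most-connected tags first; in the full book, geology sits at the centre
print(sorted(G.degree, key=lambda item: -item[1]))
```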

Now, without worrying too much about the details, I used the graph to help group the chapters non-exclusively into the following themes:

  • Fundamentals  basics, mapping (16 chapters)
  • Concepts  geology, analogs (12 chapters)
  • Interpretation  needed a theme of its own (21 chapters)
  • Power tools  attributes, ninja skills (9 chapters)
  • Pre-stack  rock physics, pre-stack, processing (11 chapters)
  • Quantitative  mathematics, analysis (20 chapters)
  • Integration  teamwork, workflow (15 chapters)
  • Innovation  history, innovation, technology (9 chapters)
  • Skills  learning, career, managing (15 chapters)

I think this accurately reflects the variety in the book. Next post we'll have a look at the variety among the authors — perhaps it explains the breadth of themes. 

Today's the day!

We're super-excited. We said a week ago we'd tell you why today. 

At the CSEG-CSPG conference last year, we hatched a plan. The idea was simple: ask as many amazing geoscientists as we could to write something fun and/or interesting and/or awesome and/or important about geophysics. Collect the writings. Put them in a book and/or ebook and/or audiobook, and sell it at a low price. And let the content out into the wild under a Creative Commons license, so that others can share it, copy it, and spread it.

So the idea was conceived as Things You Should Know About Geophysics. And today the book is born... almost. It will be available on 1 June, but you can see it right now at Amazon.com, pre-order or wish-list it. It will be USD19, or about 36 cents per chapter. For realz.

The brief was deliberately vague: write up to 600 words on something that excites or inspires or puzzles you about exploration geophysics. We had no idea what to expect. We knew we'd get some gold. We hoped for some rants.

Incredibly, within 24 hours of sending the first batch of invites, we had a contribution. We were thrilled, beyond thrilled, and this was the moment we knew it would work out. Like any collaborative project, it was up and down. We'd get two or three some days, then nothing for a fortnight. We extended deadlines and crossed fingers, and eventually called 'time' at year's end, with 52 contributions from 38 authors.

Like most of what we do, this is a big experiment. We think we can have it ready for 1 June but we're new to this print-on-demand lark. We think the book will be in every Amazon store (.ca, .co, .uk, etc), but it might take a few weeks to roll through all the sites. We think it'll be out as an ebook around the same time. Got an idea? Tell us how we can make this book more relevant to you!

News of the month

Welcome to our more-or-less regular news post. Seen something awesome? Get in touch!

Convention time!

Next week is Canada's annual petroleum geoscience party, the CSPG CSEG CWLS GeoConvention. Thousands of applied geoscientists will descend on Calgary's downtown Telus Convention Centre to hear about the latest science and technology in the oilfield, and catch up with old friends. We're sad to be missing out this year — we hope someone out there will be blogging!

GeoConvention highlights

There are more than fifty technical sessions at the conference this year. For what it's worth, these are the presentations we'd be sitting in the front row for if we were going:

Now run to the train and get to the ERCB Core Research Centre for...

Guided fault interpretation

We've seen automated fault interpretation before, and now Transform have an offering too. A strongly tech-focused company, they have a decent shot at making it work in ordinary seismic data — the demo shows a textbook example:

GPU processing on the desktop

On Monday Paradigm announced their adoption of NVIDIA's Maximus technology into their desktop applications. Getting all gooey over graphics cards seems very 2002, but this time it's not about graphics — it's about speed. Reserving a Quadro processor for graphics, Paradigm is computing seismic attributes on a dedicated Tesla graphics processing unit, or GPU, rather than on the central processing unit (CPU). This is cool because GPUs are massively parallel and much, much faster at certain kinds of computation, since they don't carry the process management, I/O, and other overheads that CPUs do. This is why seismic processing companies like CGGVeritas are adopting them for imaging. Cutting-edge stuff!
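We don't know anything about Paradigm's implementation, but here's a toy illustration of why the GPU route is attractive, using CuPy, a modern open source library whose arrays mimic NumPy's but live on an NVIDIA GPU. The array sizes and the 'attribute' (per-trace RMS amplitude) are made up for the example:

```python
import numpy as np
import cupy as cp   # NumPy-like arrays that live on an NVIDIA GPU

# Fake seismic traces: 10 000 traces of 3000 samples each (sizes are arbitrary)
traces_cpu = np.random.randn(10_000, 3000).astype(np.float32)
traces_gpu = cp.asarray(traces_cpu)          # copy the data to GPU memory

# A toy 'attribute': the RMS amplitude of each trace. The expression is the
# same on both devices because CuPy mirrors the NumPy API; on the GPU the
# work is spread across thousands of cores.
rms_gpu = cp.sqrt(cp.mean(traces_gpu ** 2, axis=1))
rms_cpu = np.sqrt(np.mean(traces_cpu ** 2, axis=1))

print(np.allclose(cp.asnumpy(rms_gpu), rms_cpu, atol=1e-5))   # same answer
```

Real attribute computation is far more involved, of course, but the pattern is the same: move the data to the device once, then run highly parallel array operations on it.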

In other news...

This regular news feature is for information only. We aren't connected with any of these organizations, and don't necessarily endorse their products or services. 

One week countdown

We're super-excited, dear reader. Even more than usual.

At the Calgary GeoConvention last year, we hatched a plan. The idea was simple: ask as many amazing geophysicists as we could to help us create something unique and fun. Now, as the conference creeps up on us again, it's almost ready. A new product from Agile that we think will make you smile.

Normally we like to talk about what we're up to, but this project has been a little different. We weren't at all sure it was going to work out until about Christmas time. And it had a lot of moving parts, so the timeline has been, er, flexible. But the project fits nicely into our unbusiness model: it has no apparent purpose other than being interesting and fun. Perfect!

In an attempt to make it look like we have a marketing department, or perhaps to confirm that we definitely do not, let's count down to next Tuesday morning, in milliseconds of course. Come back then — we hope to knock your socks at least partly off...

K is for Wavenumber

Wavenumber, sometimes called the propagation number, is in broad terms a measure of spatial scale. It can be thought of as a spatial analog to temporal frequency, and is often called spatial frequency. It is usually defined as the number of wavelengths per unit distance, or, in terms of wavelength λ:

$$k = \frac{1}{\lambda}$$

The units are \(\mathrm{m}^{-1}\), which are nameless in the International System, though \(\mathrm{cm}^{-1}\) are called kaysers in the cgs system. The concept is analogous to frequency \(f\), measured in \(\mathrm{s}^{-1}\) or Hertz, which is the reciprocal of period \(T\); that is, \(f = 1/T\). In a sense, period can be thought of as a temporal 'wavelength' — the length of an oscillation in time.

If you've explored the applications of frequency in geophysics, you'll have noticed that we sometimes don't use ordinary frequency f, in Hertz. Because geophysics deals with oscillating waveforms, ones that vary around a central value (think of a wiggle trace of seismic data), we often use the angular frequency. This way we can also express the close relationship between frequency and phase, which is an angle. So in many geophysical applications, we want the angular wavenumber. It is expressed in radians per metre:

$$k = \frac{2\pi}{\lambda}$$

The relationship between angular wavenumber and angular frequency is analogous to that between wavelength and ordinary frequency — they are related by the velocity V:

$$k = \frac{\omega}{V}$$

It's unfortunate that there are two definitions of wavenumber. Some people reserve the term spatial frequency for the ordinary wavenumber, or use ν (that's a Greek nu, not a vee — another potential source of confusion!), or even σ for it. But just as many call it the wavenumber and use k, so the only sure way through the jargon is to specify what you mean by the terms you use. As usual!
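To keep the two conventions straight, here's a tiny numerical check in Python. The frequency and velocity are invented for illustration, not taken from any particular dataset:

```python
import numpy as np

f = 50.0      # ordinary frequency in Hz (an assumed, typical seismic value)
V = 2500.0    # velocity in m/s (also assumed)

wavelength = V / f                       # lambda = V/f = 50 m
k_ordinary = 1 / wavelength              # 0.02 cycles per metre
k_angular = 2 * np.pi / wavelength       # ~0.126 radians per metre

omega = 2 * np.pi * f                    # angular frequency in rad/s
print(np.isclose(k_angular, omega / V))  # True: k = omega / V
```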

Just as for temporal frequency, the portal to wavenumber is the Fourier transform, computed along each spatial axis. Here are two images and their 2D spectra — a photo of some ripples, a binary image of some particles, and their fast Fourier transforms. Notice how the more organized image has a more organized spectrum (as well as some artifacts from post-processing on the image), while the noisy image's spectrum is nearly 'white':
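The figures were made in FIJI, but here's a minimal NumPy sketch of the same idea: build a synthetic 'ripples' image and check that its 2D amplitude spectrum peaks at the expected wavenumber. The image size, sample spacing, ripple wavelength, and dip are all invented for illustration:

```python
import numpy as np

dx = 0.5                      # assumed sample spacing, metres
nx = ny = 256
x = np.arange(nx) * dx
y = np.arange(ny) * dx
X, Y = np.meshgrid(x, y)

# Ripples with an 8 m wavelength, dipping at 30 degrees
wavelength = 8.0
k = 2 * np.pi / wavelength
theta = np.radians(30)
image = np.cos(k * (X * np.cos(theta) + Y * np.sin(theta)))

# 2D FFT; fftshift puts zero wavenumber at the centre of the spectrum
spectrum = np.abs(np.fft.fftshift(np.fft.fft2(image)))

# Ordinary wavenumber axes (cycles per metre), one for each spatial axis
kx = np.fft.fftshift(np.fft.fftfreq(nx, d=dx))
ky = np.fft.fftshift(np.fft.fftfreq(ny, d=dx))

# The spectral peak should sit at a radial wavenumber of 1/8 = 0.125 cycles/m
iy, ix = np.unravel_index(np.argmax(spectrum), spectrum.shape)
print(np.hypot(kx[ix], ky[iy]))   # ~0.125
```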

Explore our other posts about scale.

The particle image is from the sample images in FIJI. The FFTs were produced in FIJI.

Update

on 2012-05-03 16:41 by Matt Hall

Following up on Brian's suggestion in the comments, I added a brief workflow to the SubSurfWiki page on wavenumber. Please feel free to add to it or correct it if I messed anything up.

Opening data in Nova Scotia

When it comes to data, open doesn't mean being part of a public relations campaign. Open data must be put to work, and making it work can take a lot of work, by a number of contributors across organizations.

Also, open data should be accessible by more than the privileged few in the right location at the right time, or with the right connections. The better way to connect is through digital data stewardship.

I will be speaking about the state of the onshore Nova Scotia petroleum database at the Nova Scotia Energy R&D Forum in Halifax on 16 & 17 May, and about the direction this might head for the collective benefit of regulators, researchers, explorationists, and the general public. Here's the abstract for the talk:

Read More

The Agile toolbox

Some new businesses go out and raise millions in capital before they do anything else. Not us — we only do what we can afford. Money makes you lazy. It's technical consulting on a shoestring!

If you're on a budget, open source is your best friend. More than this, an open toolbox is less dependent on particular hardware and less tied to particular workflows. Better yet, avoiding large technology investments helps us avoid vendor lock-in, and the resulting data lock-in, keeping us more agile. And there are two more important things about open source:

  • You know exactly what the software does, because you can read the source code
  • You can change what the software does, because you can change the source code

Anyone who has waited 18 months for a software vendor to fix a bug or add a feature, then 18 more months for their organization to upgrade the software, knows why these are good things.

So what do we use?

In the light of all this, people often ask us what software we use to get our work done.

Hardware  Matt is usually on a dual-screen Apple iMac running OS X 10.6, while Evan is on a Samsung Q laptop (with a sweet solid-state drive) running Windows. Our plan, insofar as we have a plan, is to move to Mac Pro as soon as the new ones come out in the next month or two. Pure Linux is tempting, but Macs are just so... nice.

Geoscience interpretation  dGB OpendTect, GeoCraft, Quantum GIS (above). The main thing we lack is a log visualization and interpretation tool. We don't use them much yet, but Madagascar and GMT are plugged right into OpendTect. For getting started on stratigraphic charts, we use TimeScale Creator.

A quick aside, for context: when I sold Landmark's GeoProbe seismic interpretation tool, back in 2003 or so, the list price was USD140 000 per user, choke, plus USD25k per year in maintenance. GeoProbe is very powerful now (and I have no idea what it costs), but OpendTect is a much better tool than that early edition was. And it's free (as in speech, and as in beer).

Geekery, data mining, analysis  Our core tools for data mining are Excel, Spotfire Silver (an amazing but proprietary tool), MATLAB and/or GNU Octave, random Python. We use Gephi for network analysis, FIJI for image analysis, and we have recently discovered VISAT for remote sensing images. All our mobile app development has been in MIT AppInventor so far, but we're playing with the PhoneGap framework in Eclipse too. 

Writing and drawing  Google Docs for words, Inkscape for vector art and composites, GIMP for rasters, iMovie for video, Adobe InDesign for page layout. And yeah, we use Microsoft Office and OpenOffice.org too — sometimes it's just easier that way. For managing references, Mendeley is another recent discovery — it is 100% awesome. If you only look at one tool in this post, look at this.

Collaboration  We collaborate with each other and with clients via Skype, Dropbox, Google+ Hangouts, and various other Google tools (for calendars, etc). We also use wikis (especially SubSurfWiki) for asynchronous collaboration and documentation. As for social media, we try to maintain some presence on Google+, Facebook, and LinkedIn, but our main channel is Twitter.

Web  This website is hosted by Squarespace for reliability and reduced maintenance. The MediaWiki instances we maintain (both public and private) are on MediaWiki's open source platform, running on Amazon's Elastic Compute servers for flexibility. An EC2 instance is basically an online Linux box, running Ubuntu and Bitnami's software stack, plus some custom bits and pieces. We are launching another website soon, running WordPress on Amazon EC2. Hover provides our domain names — an awesome Canadian company.

Administrative tools  Every business has some business tools. We use Tick to track our time — it's very useful when working on multiple projects, with subcontractors, and so on. For accounting we recently found Wave, and it is the best thing ever. If you have a small business, please check it out — after headaches with several other products, it's the best bean-counting tool I've ever used.

If you have a geeky geo-toolbox of your own, we'd love to hear about it. What tools, open or proprietary, couldn't you live without?

Checklists for everyone

Avoidable failures are common and persistent, not to mention demoralizing and frustrating, across many fields — from medicine to finance, business to government. And the reason is increasingly evident: the volume and complexity of what we know has exceeded our individual ability to deliver its benefits correctly, safely, or reliably. Knowledge has both saved and burdened us.

I first learned about Atul Gawande from Bill Murphy's talk at the 1IWRP conference last August, where he offered the surgeon's research model for all imperfect sciences, casting the spectrum of problems in a simple–complicated–complex ternary space. In The Checklist Manifesto, Gawande writes about a topic that is relevant to all of geoscience: the problem of extreme complexity. And I have been batting around the related ideas of cookbooks, flowcharts, recipes, and to-do lists for maximizing professional excellence ever since. After all, it is challenging, and takes a great deal of wisdom, to cut through the chaff and reduce a problem to its irreducible and essential bits. Then I finally read this book.

The creation of the now heralded 19-item surgical checklist found its roots in three places — the aviation industry, restaurant kitchens, and building construction:

Thinking about averting plane crashes in 1935, or stopping infections in central lines in 2003, or rescuing drowning victims today, I realized that the key problem in each instance was essentially a simple one, despite the number of contributing factors. One needed only to focus attention on the rudder and elevator controls in the first case, to maintain sterility in the second, and to be prepared for cardiac bypass in the third. All were amenable, as a result, to what engineers call "forcing functions": relatively straightforward solutions that force the necessary behavior — solutions like checklists.

What is amazing is that it took more than two years, and a global project sponsored by the World Health Organization, to devise such a seemingly simple piece of paper. But what a change it has had. Major complications fell by 36%, and deaths fell by 47%. Would you adopt a technology that cut complications by a third and deaths by nearly half? Most would without batting an eye.

But the checklist paradigm is not without skeptics. There is resistance to the introduction of checklists because they threaten our autonomy as professionals, the ego and intelligence we have trained hard to attain: the individual must surrender being the virtuoso. But a checklist enables teamwork and communication, engaging subordinates and empowering them at crucial points in the activity. The secret is that a checklist, done right, is more than just tick marks on a piece of paper — it is a vehicle for delivering behavioural change.

I can imagine huge potential for checklists in the problems we face in petroleum geoscience. But what would such checklists look like? Do you know of any in use today?

News of the month

A few bits of news about geology, geophysics, and technology in the hydrocarbon and energy realm. Do drop us a line if you hear of something you think we ought to cover.

All your sequence strat

The SEPM, which today calls itself the Society for Sedimentary Geology (not the Society of Economic Palaeontologists and Mineralogists, which is where the name comes from, IIRC), has upgraded its website. It looks pretty great (nudge nudge, AAPG!). The awesome SEPM Strata, a resource for teaching and learning sequence stratigraphy, also got a facelift. 

Hat-tip to Brian Romans for this one.

Giant sand volcano

Helge Løseth of Statoil, whom we wrote about last week in connection with the Source Rocks from Seismic workflow, was recently in the news again. This time he and his exploration team were describing the Pleistocene extrusion of more than 10 km³ of sand onto the sea-floor in the northern North Sea, enough to bury Manhattan in 160 m of sand.

The results are reported in Løseth, H, N Rodrigues, and P Cobbold (2012) and build on earlier work by the same team (Rodrigues et al. 2009). 

Tape? There's still tape??

Yes, there's still tape. This story just caught my eye because I had no idea people were still using tape. It turns out that the next generation of tape, Ultrium LTO-6, will be along in the second half of 2012. The specs are pretty amazing: 8 TB (!) of compressed data, and about 200 MB/s (that's megabytes) transfer rates. The current generation of cartridges, LTO-5, cost about $60 and hold 3 TB — a similar-performing hard drive will set you back more than double that. 

The coolest cluster

Physics enables geophysics in lots of cool ways. CGGVeritas is using a 600 kW Green Revolution Cooling CarnotJet liquid cooling system to refrigerate 24 cluster racks in GRC's largest installation to date. In the video below, you can see an older 100 kW system. The company claims that these systems, in which the 40°C racks sit bathed in non-conductive oil, reduce the cost of cooling a supercomputer by about 90%... pretty amazing.

Awesomer still, the cluster is built from Supermicro's GPU-accelerated SuperServer machines. GPUs, or graphics processing units, have massively parallel architectures (over 1000 cores per server), and can perform some operations much faster than ordinary CPUs, which are engineered to perform 'executive' functions as well as just math.

This regular news feature is for information only. We aren't connected with any of these organizations, and don't necessarily endorse their products or services. The cartridge image is licensed CC-BY-SA by Wikimedia Commons user andy_hazelbury. The CarnotJet image is from grcooling.com and thought to be fair use.

How big is that volume?

Sometimes you need to know how much space you need for a seismic volume. One of my machines only has 4GB of RAM, so if I don't want to run out of memory, I need to know how big a volume will be. Or your IT department might want help figuring out how much disk to buy next year.

Fortunately, since all seismic data is digital these days, it's easy to figure out how much space we will need. We simply count the samples in the volume, then account for the bit-depth. So, for example, if a 3D volume has 400 inlines and 300 traces per line, then it has 120 000 traces in total. If each trace is 6 seconds long, and the sample interval is 2 ms, then each trace has 6000/2 = 3000 samples (3001 actually, but let's not worry too much about that), so that's about 360 million samples. For a 32-bit volume, each sample requires 32/8 = 4 bytes, so we're at about 1 440 000 000 bytes. To convert to kilobytes, divide by 2¹⁰, or 1024, then do it again for MB and again for GB: this volume comes out to roughly 1.3 GB.
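Here's that arithmetic as a quick Python sketch. The function name and signature are just for illustration, and it ignores trace and file headers (a SEG-Y file, for instance, carries a couple of hundred extra bytes per trace), so treat the result as a lower bound:

```python
def volume_size_gib(n_inlines, n_xlines, trace_length_s, dt_s, bit_depth=32):
    """Approximate size of an uncompressed seismic volume, ignoring headers."""
    n_traces = n_inlines * n_xlines
    n_samples = int(trace_length_s / dt_s) + 1   # +1 for the sample at t = 0
    total_bytes = n_traces * n_samples * bit_depth // 8
    return total_bytes / 2**30                   # divide by 1024 three times

# The example from the post: 400 x 300 traces, 6 s records, 2 ms sampling
print(volume_size_gib(400, 300, 6.0, 0.002))     # ~1.34
```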

It's worth noting that some seismic interpretation tools have proprietary compressed formats available for seismic data, Landmark's 'brick' format for example. This optionally applies a JPEG-like compression to reduce the file size, and can also make some sections display faster because of the way the compressed file is organized. The amount of compression depends on the frequency content of the data. The compression is lossy, though: some of the original data is irretrievably lost in the process. If you do use such a file for visualization and interpretation, you may want to use a full bit-depth, full-fidelity file for attribute analysis.

Do you have any tricks for managing large datasets? We'd love to hear them!