How to QC a seismic volume

I've had two emails recently about quality checking seismic volumes. And last month, this question popped up on LinkedIn:

We have written before about making a data quality volume for your seismic — a handy way to incorporate uncertainty into risk maps — but these recent questions seem more concerned with checking a new volume for problems.

First things first

Ideally, you'd get to check the volume before delivery (at the processing shop, say), otherwise you might have to actually get it loaded before you can perform your QC. I am assuming you've already been through the processing, so you've seen shot gathers, common-offset gathers, etc. This is all about the stack. Nonetheless, the processor needs to prepare some things:

  • The stack volume, of course, with and without any 'cosmetic' filters (eg fxy, fk).
  • A semblance (coherency, similarity, whatever) volume.
  • A fold volume.
  • Make sure the processor has some software that can rapidly scan the data, plot amplitude histograms, compute a spectrum, pick a horizon, and compute phase. If not, install OpendTect (everyone should have it anyway), or you'll have to load the volume yourself.

There are also some things you can do ahead of time. 

  1. Be part of the processing from the start. You don't want big surprises at this stage. If a few lines got garbled during file creation, no problem. If there's a problem with ground-roll attenuation, you're not going to be very popular.
  2. Make sure you know how the survey was designed — where the corners are, where you would expect live traces to be, and which way the shot and receiver lines went (if it was an orthogonal design). Get maps, take them with you.
  3. Double-check the survey parameters. The initial design was probably changed. The PowerPoint presentation was never updated. The processor probably has the wrong information. General rule with subsurface data: all metadata is probably wrong. Ideally, talk to someone who was involved in the planning of the survey.
  4. You didn't skip (2) did you? I'm serious, double check everything.

Crack open the data

OK, now you are ready for a visit with the processor. Don't fall into the trap of looking at the geology though — it will seduce you (it's always pretty, especially if it's the first time you've seen it). There is work to do first.

  1. Check the cornerpoints of the survey. I like the (0, 0) trace at the SW corner. The inline and crossline numbering should be intuitive and simple. Make sure the survey is the correct way around with respect to north.
  2. Scan through timeslices. All of them. Is the sample interval what you were expecting? Do you reach the maximum time you expected, based on the design? Make sure the traces you expect to be live are live, and the ones you expect to be dead are dead (a quick programmatic check is sketched just after this list). Check for acquisition footprint. Start with greyscale, then try another colourmap.
  3. Repeat (2) but in a similarity volume (or semblance, coherency, whatever). Look for edges, and geometric shapes. Check again for footprint.
  4. Look through the inlines and crosslines. These usually look OK, because it's what processors tend to focus on.
  5. Repeat (4) but in a similarity volume.
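
Some of these checks are easy to script if you can get at the loaded data with Python. Here's a minimal sketch, assuming the stack is sitting in a 3D NumPy array with a known sample interval — the array and `dt` below are stand-ins, not any particular loader's output:

 import numpy as np

 # Stand-ins for illustration: `data` is an (inline, xline, sample) array
 # and `dt` is the sample interval in seconds, as read from the trace headers.
 data = np.random.randn(100, 120, 751)
 dt = 0.004

 print("Sample interval:", dt * 1000, "ms")
 print("Record length:", (data.shape[-1] - 1) * dt, "s")

 # Traces that are identically zero are probably dead.
 dead = np.all(data == 0, axis=-1)
 print("Dead traces:", dead.sum(), "of", dead.size)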

Dive into the details

  1. Check some spectrums. Select some subsets of the data — at least 100 traces and 1000 ms from shallow, deep, north, south, east, west — and check the average spectrums. There should be no conspicuous notches or spikes, which could be signs of all sorts of things from poorly applied filters to reverberation. (A quick code sketch for this and the next item follows the list.)
  2. Check the amplitude histograms from those same subsets. It should be 32-bit data — accept no less. Check the scaling — the numbers don't mean anything, so you can make them range over whatever you like. Something like ±100 or ±1000 tends to make for convenient scaling of amplitude maps and so on; ±1.0 or less can be fiddly in some software. Check for any departures from an approximately Laplacian (double exponential) distribution: clipping, regular or irregular spikes, or a skewed or off-centre distribution.
  3. Interpret a horizon and check its phase. See Purves (Leading Edge, October 2014) or SubSurfWiki for some advice.
  4. By this time, the fold volume should yield no surprises. If any of the rest of this checklist throws up problems, the fold volume might help troubleshoot.
  5. Check any other products you asked for. If you asked for gathers or angle stacks (you should), check them too.
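
Here's that sketch. It assumes you have a subset of traces in a 2D NumPy array and know the sample interval; the names and numbers are stand-ins, not any particular package's API:

 import numpy as np
 import matplotlib.pyplot as plt

 traces = np.random.randn(100, 1000)   # stand-in: 100 traces, 1000 samples each
 dt = 0.001                            # stand-in: 1 ms sample interval in seconds

 # Average amplitude spectrum of the subset.
 spec = np.mean(np.abs(np.fft.rfft(traces, axis=-1)), axis=0)
 freq = np.fft.rfftfreq(traces.shape[-1], d=dt)

 fig, (ax0, ax1) = plt.subplots(1, 2, figsize=(10, 4))
 ax0.plot(freq, spec)
 ax0.set_xlabel('frequency [Hz]')
 ax0.set_ylabel('average amplitude')

 # Amplitude histogram: look for clipping, spikes, or skew.
 ax1.hist(traces.ravel(), bins=101)
 ax1.set_xlabel('amplitude')
 plt.show()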

Last of all, before actual delivery, talk to whoever will be loading the data about what kind of media they prefer, and what kind of file organization. They may also have some preferences for the contents of the SEG-Y file and trace headers. Pass all of this on to the processor. And don't forget to ask for All The Seismic.

What about you?

Have I forgotten anything? Are there things you always do to check a new seismic volume? Or if you're really brave, maybe you have some pitfalls or even horror stories to share...

Introducing Bruges


Welcome to Bruges, a Python library (previously known as agilegeo) that contains a variety of geophysical equations used in processing, modelling and analysing seismic reflection and well log data. Here's what's in the box so far, with new stuff being added every week:


Simple AVO example

          VP [m/s]   VS [m/s]   ρ [kg/m³]
 Rock 1   3300       1500       2400
 Rock 2   3050       1400       2075

Imagine we're studying the interface between the two layers whose rock properties are shown here...

To compute the reflection coefficient at zero offset, we pass our rock properties into the Aki-Richards equation and set the incident angle to zero:

 >>> import bruges as b
 >>> vp1, vs1, rho1 = 3300, 1500, 2400    # Rock 1 from the table
 >>> vp2, vs2, rho2 = 3050, 1400, 2075    # Rock 2
 >>> b.reflection.akirichards(vp1, vs1, rho1, vp2, vs2, rho2, theta1=0)
 -0.111995777064

Similarly, compute the reflection coefficient at 30 degrees:

 >>> b.reflection.akirichards(vp1, vs1, rho1, vp2, vs2, rho2, theta1=30)
 -0.0965206980095

To calculate the reflection coefficients for a series of angles, we can pass in a list:

 >>> b.reflection.akirichards(vp1, vs1, rho1, vp2, vs2, rho2, theta1=[0,10,20,30])
 [-0.11199578 -0.10982911 -0.10398651 -0.0965207 ]

Similarly, we could compute all the reflection coefficients for all incidence angles from 0 to 70 degrees, in one degree increments, by passing in a range:

 >>> b.reflection.akirichards(vp1, vs1, rho1, vp2, vs2, rho2, theta1=range(70))
 [-0.11199578 -0.11197358 -0.11190703 ... -0.16646998 -0.17619878 -0.18696428]

A few more lines of code, shown in the Jupyter notebook, and we can make some plots:
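
Something like this, give or take — a sketch, not the notebook's exact code:

 import numpy as np
 import matplotlib.pyplot as plt
 import bruges as b

 vp1, vs1, rho1 = 3300, 1500, 2400   # Rock 1
 vp2, vs2, rho2 = 3050, 1400, 2075   # Rock 2

 theta = np.arange(0, 70)
 rc = b.reflection.akirichards(vp1, vs1, rho1, vp2, vs2, rho2, theta1=theta)

 plt.plot(theta, rc)
 plt.xlabel('incidence angle [degrees]')
 plt.ylabel('reflection coefficient')
 plt.show()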


Elastic moduli calculations

With the same set of rocks in the table above we could quickly calculate the Lamé parameters λ and µ, say for the first rock, like so (in SI units),

 >>> b.rockphysics.lam(vp1, vs1, rho1), b.rockphysics.mu(vp1, vs1, rho1)
 15336000000.0 5400000000.0

Sure, the equations for λ and µ in terms of P-wave velocity, S-wave velocity, and density are pretty straightforward:

λ = ρ(VP² − 2VS²)    and    µ = ρVS²

but there are many other elastic moduli formulations that aren't. Bruges knows all of them, even the weird ones in terms of E and λ.


All of these examples, and lots of others — Backus averaging, for example — are available in this Jupyter notebook, if you'd like to work through them on your own.


Bruges is a...

It is very much early days for Bruges, but the goal is to expose all the geophysical equations that geophysicists like us depend on in our daily work. If you can't find what you're looking for, tell us what's missing, and together, we'll make it grow.

What's a handy geophysical equation that you employ in your work? Let us know in the comments!

Seismic inception

A month ago, some engineers at Google blogged about how they had turned a deep learning network in on itself and produced some fascinating and/or disturbing images:

One of the images produced by the team at Google. CC-BY.

The basic recipe, which Google later open sourced, involves training a deep learning network (basically a multi-layer neural network) on some labeled images, animals maybe, then searching for matching patterns in a target image, like these clouds. If it finds something, it emphasizes it — given the data, it tries to construct an animal. Then do it again.
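
If you're curious what that looks like in code, here's a minimal sketch of the same idea — written with PyTorch and torchvision rather than the caffe setup described below, and with a made-up input file and an arbitrary layer choice:

 import torch
 import torchvision.models as models
 import torchvision.transforms as T
 from PIL import Image

 # A pretrained image-recognition network stands in for the trained DNN.
 model = models.vgg16(pretrained=True).features.eval()
 target_layer = 20                     # arbitrary convolutional layer to 'dream' from

 img = Image.open('seismic.png').convert('RGB')   # hypothetical input image
 x = T.Compose([T.Resize(512), T.ToTensor()])(img).unsqueeze(0)
 x.requires_grad_(True)

 for step in range(20):
     act = x
     for i, layer in enumerate(model):
         act = layer(act)
         if i == target_layer:
             break
     loss = act.norm()                 # L2 norm of the layer's activations
     loss.backward()
     with torch.no_grad():             # gradient ascent on the image itself
         x += 0.01 * x.grad / (x.grad.abs().mean() + 1e-8)
         x.grad.zero_()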

Or, here's how a Google programmer puts it (one of my favourite sentences ever)...

Making the "dream" images is very simple. Essentially it is just a gradient ascent process that tries to maximize the L2 norm of activations of a particular DNN layer. 

That's all! Anyway, the point is that you get utter weirdness:

OK, cool... what happens if you feed it seismic?

That was my first thought; I'm sure it was yours too. The second thing I thought, and the third, and the fourth, was: wow, this software is hard to compile. I spent an unreasonable amount of time getting caffe, the Berkeley Vision and Learning Center's deep learning software, working. But on Friday I cracked it, so today I got to satisfy my curiosity.

The short answer is: reptiles. These weirdos were 8 levels down, which takes about 20 minutes to reach on my iMac.

Seismic data from the Virtual Seismic Atlas, courtesy of Fugro. 

THE DEEPDREAM TREATMENT. Mostly reptiles.

Er, right... what's the point in all this?

That's a good question. It's just a bit of fun really. But it makes you wonder:

  • What if we train the network on seismic facies? I think this could be very interesting.
  • Better yet, what if we train it on geology? Probably spurious: seismic is not geology.
  • Does this mean learning networks are just dumb machines, or can they see more than us? Tough one — human vision is highly fallible. There are endless illusions to prove this. But computers only do what we tell them, at least for now. I think if we're careful what we ask for, we can use these highly non-linear data-crunching algorithms for good.
  • Are we out of a job? Definitely not. How do you think machines will know what to learn? The challenge here is to make this work, and then figure out how it can help change, or at least accelerate, our understanding of the subsurface.

This deep learning stuff — of which the University of Toronto was a major pioneer during its emergence in about 2010 — is part of the machine learning revolution that you are, like it or not, experiencing. It will take time, and it will make awful mistakes, but the indications are that machine learning will eat every analytical method for breakfast. Customer behaviour prediction, computer vision, natural language processing, all this stuff is reeling from the relatively sudden and widespread availability of inexpensive computer intelligence. 

So what are we going to do with that?

Okay, one more, from Paige Bailey's Twitter feed.

Software, stats, and tidal energy

Today was the last day of the conference part of SciPy 2015 in Austin. Almost all the talks at this conference have been inspiring and/or enlightening. This makes it all the more wonderful that the organizers get the talks online within a couple of hours (!), so you can see everything (compared to about 5% maximum coverage at SEG).

Jake Vanderplas, a young astronomer and data scientist at UW's eScience Institute, gave the keynote this morning. He eloquently reviewed the history and state-of-the-art of the so-called SciPy stack, the collection of tools that Pythonistic scientists use to get their research done. If you're just getting started in this world, it's about the best intro you could ask for:

Chris Fonnesbeck treated the room to what might as well have been a second keynote, so well did he express his convictions. Beautiful slides, and a big message: statistics matters.

Kristen Thyng, an energetic contributor to the conference, gave a fantastic talk about tidal energy, her main field, as well as one about perceptual colourmaps, which is more of a hobby. The work includes some very nice visualizations of tidal currents in my home province...

Finally, I highly recommend watching the lightning talks. They're filled with mind-blowing ideas, many of them eliciting spontaneous applause (imagine that!), and I doubt you will ever witness a more effective exercise in building a community of passionate professionals. It's remarkable. (If you don't have an hour, these three are awesome.)

Next we'll be enjoying the 'sprints', a weekend of coding on open source projects. We'll be back to geophysics blogging next week :)

Geophysics at SciPy 2015

Yesterday was the geoscience day at SciPy 2015 in Austin.

At lunchtime, Paige Bailey (Chevron) organized a Birds of a Feather on GIS. This was a much-needed meetup for anyone interested in spatial data. It was useful to hear about the tools the fifty-or-so participants use every day, and a great chance to air some frustrations like "Why is it so hard to install a geospatial stack?" and questions like "How do people make attractive maps with the toolset?"

One way to make attractive maps is to go beyond the screen and 3D print them. Almost any subsurface dataset could seem more tangible and believable as a 3D object, and Joe Kington (Chevron) showed us how to make data into objects. Just watch:

Matteus Ueckermann followed up with some virtual elevation models, showing how Python can process not just a few tiles of data, but can handle hydrology modeling for the entire world:

Nicola Creati (OGS, Trieste) showed us the PyGmod package, a new and fully parallel geodynamic simulation tool for HPC nuts. So now you can make more plate tectonic models before most people are out of bed!

We also heard from Lindsey Heagy and Gudnir Rosenkjaer from UBC, talking about various applications of Rowan Cockett's awesome SimPEG package to their work. As at the hackathon in Denver, it's very clear that this group's investment in and passion for a well-architected, integrated package is well worth the work, giving everyone who works with it superpowers. And, as we all know, superpowers are awesome. Especially geophysical ones.

Last up, I talked about striplog, a small package for handling interval and point data in logs, core, and other 1D datasets. It's still very immature, but almost ready for real-world users, so if you think you have a use case, I'd love to hear from you.

Today is the last day of the conference part, before we head into the coding sprints tomorrow. Stay tuned for more, or follow the #scipy2015 hashtag to keep up. See all the videos, which go up almost right after talks, on YouTube.

Attribute analysis and statistics

Last week I wrote a basic introduction to attribute analysis. The post focused on the different ways of thinking about sampling and intervals, and on how instantaneous attributes have to be interpolated from the discrete data. This week, I want to look more closely at those interval attributes. We'd often like to summarize the attributes of an interval into a single number, perhaps to make a map.

Before thinking about amplitudes and seismic traces, it's worth reminding ourselves about different kinds of average. This table from SubSurfWiki might help... 

A peculiar feature of seismic data, from a statistical point of view, is the lack of the very low frequencies needed to give it a trend. Because of this, it oscillates around zero, so the average amplitude over a window tends to zero — seismic data has a mean value of zero. So not only do we have to think about interpolation issues when we extract attributes, we also have to think about statistics.

Fortunately, once we understand the issue it's easy to come up with ways around it. Look at the trace (black line) below:

The mean is, as expected, close to zero. So I've applied some other statistics to represent the amplitude values, shown as black dots, in the window (the length of the plot):

  • Average absolute amplitude (light green) — treat all values as positive and take the mean.
  • Root-mean-square amplitude (dark green) — tends to emphasize large values, so it's a bit higher.
  • Average energy (magenta) — the mean of the magnitude of the complex trace, or the envelope, shown in grey.
  • Maximum amplitude (blue) — the absolute maximum value encountered, which is higher than the actual sample values (which are all integers in this fake dataset) because of interpolation.
  • Maximum energy (purple) — the maximum value of the envelope, which is higher still because it is phase independent.
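
Here's a quick sketch of how you might compute these from a single windowed trace with NumPy and SciPy. The trace is just random numbers standing in for real data, and the envelope is the magnitude of the analytic (complex) trace, matching the definitions above:

 import numpy as np
 from scipy.signal import hilbert

 trace = np.random.randn(251)            # stand-in for a windowed trace

 envelope = np.abs(hilbert(trace))       # magnitude of the complex trace

 mean_amp = trace.mean()                 # close to zero, as discussed
 avg_abs = np.abs(trace).mean()          # average absolute amplitude
 rms = np.sqrt(np.mean(trace**2))        # root-mean-square amplitude
 avg_energy = envelope.mean()            # average energy (mean envelope)
 max_amp = np.abs(trace).max()           # maximum amplitude
 max_energy = envelope.max()             # maximum energy (peak envelope)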

There are other statistics besides these, of course. We could compute the median average, or some other mean. We could take the strongest trough, or the maximum derivative (steepest slope). The options are really only limited by your imagination, and the physical relationship with geology that you expect.

We'll return to this series over the summer, asking questions like How do you know what to expect? and Does a physically realistic relationship even matter? 


To view and run the code that I used in creating the figures for this post, grab the IPython/Jupyter Notebook.

An attribute analysis primer

A question on Stack Exchange the other day reminded me of the black magic feeling I used to have about attribute analysis. It was all very meta: statistics of combinations of attributes, with shifted windows and crazy colourbars. I realized I haven't written much about the subject, despite the fact that many of us spend a lot of time trying to make sense of attributes.

Time slices, horizon slices, and windows

One of the first questions a new attribute-analyser has is, "Where should the window be?" Like most things in geoscience: it depends. There are lots of ways of doing it, so think about what you're after...

  • Timeslice. Often the most basic top-down view is a timeslice, because they are so easy to make. This is often where attribute analysis begins, but since timeslices cut across stratigraphy, not usually where it ends.
  • Horizon. If you're interested in the properties of a strong reflector, such as a hard, karsted unconformity, maybe you just want the instantaneous attribute from the horizon itself.
  • Zone. If the horizon was hard to interpret, or is known to be a gradual facies transition, you may want to gather statistics from a zone around it. Or perhaps you couldn't interpret the thing you really wanted, but only that nice strong reflection right above it... maybe you can bootstrap yourself from there. 
  • Interval. If you're interested in a stratigraphic interval, you can bookend it with existing horizons, perhaps with a constant shift on one or both of them.
  • Proportional. If seismic geomorphology is your game, then you might get the most reasonable inter-horizon slices from proportionally slicing between stratigraphic surfaces. Most volume interpretation software supports this.

There are some caveats to simply choosing the stratigraphic interval you are after. Beware of choosing an interval that strong reflectors come into and out of. They may have an unduly large effect on most statistics, and could look 'geological'. And if you're after spectral attributes, do remember that the Fourier transform needs time! The only way to get good frequency resolution is to provide long windows: a 100 ms window gives you frequency information every 10 Hz.
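
That frequency spacing is just the reciprocal of the window length — a two-line sanity check:

 window_length = 0.100                 # window length in seconds (100 ms)
 df = 1.0 / window_length              # frequency sample spacing: 10 Hz
 print(df)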

Extraction depends on sample interpolation

When you extract an attribute, say amplitude, from a trace, it's easy to forget that the software has to do some approximation to give you an answer. This is because seismic traces are not continuous curves, but discrete series, with samples typically every 1, 2, or 4 milliseconds. Asking for the amplitude at some arbitrary time, like the point at which a horizon crosses a trace, means the software has to interpolate between samples somehow. Different software do this in different ways (linear, spline, polynomial, etc), and the methods give quite different results in some parts of the trace. Here are some samples interpolated with a spline (black curve) and linearly (blue). The nearest sample gives the 'no interpolation' result.
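
To see the effect yourself, here's a small sketch with SciPy — the sample times, amplitudes, and pick time are all made up:

 import numpy as np
 from scipy.interpolate import interp1d, CubicSpline

 dt = 0.004                              # 4 ms sample interval
 t = np.arange(0, 0.101, dt)             # sample times in seconds
 amp = np.random.randn(t.size)           # stand-in for trace samples

 t_pick = 0.0372                         # an arbitrary horizon pick time

 linear = interp1d(t, amp)(t_pick)                  # linear interpolation
 spline = CubicSpline(t, amp)(t_pick)               # cubic spline
 nearest = amp[np.argmin(np.abs(t - t_pick))]       # the 'no interpolation' result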

As well as deciding how to handle non-sampled parts of the trace, we have to decide how to represent attributes operating over many samples. In a future post, we'll give some guidance for using statistics to extract information about the entire window. What options are available and how do we choose? Do we take the average? The maximum? Something else?

There's a lot more to come!

As I wrote this post, I realized that this is a massive subject. Here are some aspects I have not covered today:

  • Calibration is a gaping void in many published workflows. How can we move past "that red blob looks like a point bar so I drew a line around it in PowerPoint" to "there's a 70% chance of finding reservoir quality sand at that location"?
  • This article was about single-trace attributes at single instants or over static windows. Multi-trace and volume attributes, like semblance, curvature, and spectral decomposition, need a post of their own.
  • There are a million attributes (though only a few that count, just ask Art Barnes) so choosing which ones to use can be a challenge. Criteria range from what software licenses you have to what is physically reasonable.
  • Because there are a million attributes, the art of combining attributes with statistical methods like principal component analysis or multi-linear regression needs a look. This gets into seismic inversion.

We'll return to these ideas over the next few weeks. If you have specific questions or workflows to share, please leave a comment below, or get in touch by email or Twitter.

To view and run the code that I used in creating the figures for this post, grab the IPython/Jupyter Notebook.

Corendering more attributes

My recent post on multi-attribute data visualization painted two seismic attributes on a timeslice. Let's look now at corendering attributes extracted on a seismic horizon. I'll reproduce the example Matt gave in his post on colouring maps.

Although colour choices come down to personal preference, there are some points to keep in mind:

  • Data that varies relatively gradually across the canvas — e.g. elevation here — should use a colour scale that varies monotonically in hue and luminance, e.g. CubeHelix or Matteo Niccoli's colourmaps.
  • Data that varies relatively quickly across the canvas — e.g. my similarity data (a member of the family that includes coherence, semblance, and so on) — should use a monochromatic colour scale, e.g. black–white.
  • If we've chosen our colourmaps wisely, there should be some unused hues for rendering other additional attributes. In this case, there are no red hues in the elevation colourmap, so we can map redness to instantaneous amplitude.

Adding a light source

Without wanting to get too gimmicky, we can sometimes enliven the appearance of an attribute, accentuating its texture, by simulating a bumpy surface and shining a virtual light onto it. This isn't the same as casting a light source on the composite display. We can make our light source act on only one of our attributes and leave the others unchanged. 

Similarity attribute displayed using a greyscale colourbar (left). Bump mapping of the similarity attribute using a light source positioned at azimuth 350 degrees, inclination 20 degrees.

The technique is called hill-shading. The terrain doesn't have to be a physical surface; it can be a slice. And unlike physical bumps, we're not actually making a new surface with relief, we are merely modifying the surface's luminance from an artificial light source. The result is a more pronounced texture.
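
For the curious, matplotlib can do this kind of hill-shading directly. A minimal sketch, assuming the similarity slice is a 2D array (here just random numbers), with the same light-source angles as the figure caption:

 import numpy as np
 import matplotlib.pyplot as plt
 from matplotlib.colors import LightSource

 similarity = np.random.rand(200, 300)   # stand-in for a similarity slice

 ls = LightSource(azdeg=350, altdeg=20)
 shaded = ls.shade(similarity, cmap=plt.cm.gray)

 plt.imshow(shaded)
 plt.axis('off')
 plt.show()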

One view, two dimensions, three attributes

Constructing this display takes a bit of trial and error. It wasn't immediately clear where to position the light source to get the most pronounced view. Furthermore, the amplitude extraction looked quite noisy, so I softened it a little using a Gaussian filter. Plus, I wanted to show only the brightest of the bright spots, so that all took a bit of fiddling.

Even though 3D data visualization is relatively common, my assertion is that it is much harder to get 3D visualization right than 2D. Looking at the three colour-bars I've placed in the legend, I'm reminded of this difficulty of adding a third dimension; it's much harder to produce a colour-cube for the legend than a series of colour-bars. Maybe the best we can achieve is a colour-square like last time, with a colour-bar for the overlay on the side.

Check out the IPython notebook for the code used to create these figures.

Pick This again

Since I last wrote about it, Pick This! has matured. We have continued to improve the tool, which is a collaboration between Agile and the 100% awesome Steve Purves at Euclidity.

Here's some of the new stuff we've added:

  • Multiple lines and polygons for each interpretation. This was a big limitation; now we can pick multiple fault sticks, say.
  • 'Preshows', to show the interpreter some text or an image before they interpret. In beta, talk to us if you want to try it.
  • Interpreter cohorts, with randomized selection, so we can conduct blind trials. In beta, again, talk to us.
  • Complete picking history, so we can replay the entire act of interpretation. Coming soon: new visualizations of results that use this data.

Some of this, such as replaying the entire picking event, is of interest to researchers who want to know how experts interpret images. Remotely sensed images — whether in geophysics, radiology, astronomy, or forensics — are almost always ambiguous. Look at these faults, for example. How many are there? Where are they exactly? Where are their tips?  

A seismic line from the Browse Basin, offshore western Australia. Data courtesy of CGG and the Virtual Seismic Atlas.

Most of the challenges on the site are just fun challenges, but some — like the Browse Basin challenge, above — are part of an experiment by researchers Juan Alcalde and Clare Bond at the University of Aberdeen. Please help them with their research by taking part and making an interpretation! It would also be super if you could fill out your profile page — that will help Juan and Clare understand the results. 

If you're at the AAPG conference in Denver then you can win bonus points by stopping by Booth 404 to visit Juan and Clare. Ask them all about their fascinating research, and say hello from us!

While you're on the site, check out some of the other images — or upload one yourself! This one was a real eye-opener: time-lapse seismic reflections from the water column, revealing dynamic thermohaline stratification. Can you pick this?

Pick This challenge showing time-lapse frames from a marine 3D. The seabed is shown in blue at the bottom of the images.

May linkfest

The pick of the links from the last couple of months. We look for the awesome, so you don't have to :)

ICYMI on Pi Day, pimeariver.com wants to check how close river sinuosity comes to pi. (TL;DR — not very.)

If you're into statistics, someone at Imperial College London recently released a nice little app for stochastic simulations of simple calculations. Here's a back-of-the-envelope volumetric calculation by way of example. Good inspiration for our Volume* app.

I love it when people solve problems together on the web. A few days ago Chris Jackson (also at Imperial) posted a question about converting projected coordinates...

I responded with a code snippet that people quickly improved. Chris got several answers to his question, and I learned something about the pyproj library. Open source wins again!

In answering that question, I also discovered that Github now renders most IPython Notebooks. Sweet!

Speaking of notebooks, Beaker looks interesting: individual code blocks support different programming languages within the same notebook and allow you to pass data from one cell to another. For instance, you could do your basic stuff in Python, computationally expensive stuff in Julia, then render a visualization with JavaScript. Here's a simple example from their site.

Python is the language for science, but JavaScript certainly rules the visual side of the web. Taking after JavaScript data-artists like Bret Victor and Mike Bostock, Jack Schaedler has built a fantastic website called Seeing circles, sines, and signals containing visual explanations of signal processing concepts.

If that's not enough for you, there's loads more where that came from: Gallery of Concept Visualization. You're welcome.

My recent notebook about finding small things with 2D seismic grids sparked some chatter on Twitter. People had some great ideas about modeling non-random distributions, like clustered or anisotropic populations. Lots to think about!

Getting help quickly is perhaps social media's most potent capability — though some people do insist on spoiling everything by sharing U might be a genius if u can solve this! posts (gah, stop it!). Earth Science Stack Exchange is still far from being the tool it can be, but there have been some relevant questions on geophysics lately:

A fun thread came up on Reddit too recently: Geophysics software you wish existed. Perfect for inspiring people at hackathons! I'm keeping a list of hacky projects for the next one, by the way.

Not much to say about 3D models in Sketchfab, other than: they're wicked! I mean, check out this annotated anticline. And here's one by R Mahon based on sedimentological experiments by John Shaw and others...