Real and apparent seismic frequency

There's a Jupyter Notebook for you to follow along with this tutorial. You can run it right here in your browser.


We often use Ricker wavelets to model seismic, for example when making a synthetic seismogram with which to help tie a well. One simple way to guesstimate the peak or central frequency of the wavelet that will model a particular seismic section is to count the peaks per unit time in the seismic. But this tends to overestimate the actual frequency, because the apparent frequency of a Ricker wavelet — the frequency implied by the spacing of its extrema — is higher than its peak frequency. The question is, how much higher?

To investigate, let's make a Ricker wavelet and see what it looks like in the time and frequency domains.

>>> T, dt, f = 0.256, 0.001, 25

>>> import bruges
>>> w, t = bruges.filters.ricker(T, dt, f, return_t=True)

>>> import scipy.signal
>>> f_W, W = scipy.signal.welch(w, fs=1/dt, nperseg=256)
[Figure: the Ricker wavelet in the time domain, and its power spectrum.]

When we count the peaks in a section, the assumption is that this apparent frequency — that is, the reciprocal of the apparent period, which we read from the spacing of the extrema — tells us the dominant or peak frequency.

To help see why this assumption is wrong, let's compare the Ricker with a signal whose apparent frequency does match its peak frequency: a pure cosine:

>>> import numpy as np
>>> c = np.cos(2 * 25 * np.pi * t)
>>> f_C, C = scipy.signal.welch(c, fs=1/dt, nperseg=256)
[Figure: the 25 Hz cosine in the time domain, and its power spectrum.]

Notice that the cosine is much narrower in bandwidth than the Ricker. If we allowed more oscillations, it would be narrower still. If it lasted forever, it would be a spike in the frequency domain.
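We can check that claim with a quick experiment (not part of the original notebook): make a much longer cosine and look at its spectrum.

>>> t_long = np.arange(0, 4.096, dt)   # 4 s of signal instead of 0.256 s
>>> c_long = np.cos(2 * 25 * np.pi * t_long)
>>> f_CL, C_L = scipy.signal.welch(c_long, fs=1/dt, nperseg=4096)

The peak near 25 Hz in C_L is far narrower than the one we got from the short cosine.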

Let's overlay the signals to get a picture of the difference in the relative periods:

[Figure: the Ricker wavelet overlaid on the 25 Hz cosine.]

The practical consequence of this is that if we estimate the peak frequency to be \(f\ \mathrm{Hz}\), then we need to reduce \(f\) by some factor if we want to design a wavelet to match the data. To get this factor, we need to know the apparent period of the Ricker function, as given by the time difference between the two minima.

Let's look at a couple of different ways to find those minima: numerically and analytically.

Find minima numerically

We'll use scipy.optimize.minimize to find a numerical solution. In order to use it, we'll need a slightly different expression for the Ricker function — casting it in terms of a time basis t. We'll also keep f as a variable, rather than hard-coding it in the expression, to give us the flexibility of computing the minima for different values of f.

Here's the equation we're implementing:

$$ w(t, f) = (1 - 2\pi^2 f^2 t^2)\ e^{-\pi^2 f^2 t^2} $$

In Python:

>>> def ricker(t, f):
...     return (1 - 2*(np.pi*f*t)**2) * np.exp(-(np.pi*f*t)**2)

Check that the wavelet looks like it did before, by comparing the output of this function when f is 25 with the wavelet w we were using before:

>>> f = 25
>>> np.allclose(w, ricker(t, f=25))
True

Now we call SciPy's minimize function on our ricker function. It iteratively searches for a minimum, then gives us the x (which is really t in our case) at that minimum:

>>> import scipy.optimize
>>> f = 25
>>> scipy.optimize.minimize(ricker, x0=0, args=(f,))

      fun: -0.4462603202963996
 hess_inv: array([[1]])
      jac: array([-2.19792128e-07])
  message: 'Optimization terminated successfully.'
     nfev: 30
      nit: 1
     njev: 10
   status: 0
  success: True
        x: array([0.01559393])

So the minimum amplitude, given by fun, is -0.44626 and it occurs at an x (time) of \(\pm 0.01559\ \mathrm{s}\).

In comparison, the minima of the cosine function occur at times of \(\pm 0.02\ \mathrm{s}\). In other words, the Ricker's apparent half-period is \(0.02 - 0.01559 = 0.00441\ \mathrm{s}\) shorter than that of the pure waveform — proportionally, that's...

>>> (0.02 - 0.01559) / 0.02
0.22050000000000003

...about 22% shorter. This means that if we naively estimate frequency by counting peaks or zero crossings, we'll tend to overestimate the peak frequency of the wavelet: the apparent frequency needs reducing by about 22% to recover the peak frequency. That's assuming the wavelet is approximately Ricker-like; if it isn't, we can use the same method to estimate the error for other functions.
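For example, here's a rough sketch — my own, not from the original notebook — of how that numerical recipe could be wrapped up for any symmetric, zero-phase wavelet function with the same call signature as our ricker(t, f):

def apparent_over_peak(func, f=25):
    """Ratio of apparent frequency (from trough spacing) to peak frequency."""
    # Start the search just to the right of the central peak at t = 0.
    result = scipy.optimize.minimize(func, x0=0.005, args=(f,))
    t_min = abs(result.x[0])          # time of the nearest trough
    # The troughs are one apparent period apart, as for a cosine.
    return (1 / (2 * t_min)) / f

For our ricker this returns about 1.28, which is the same discrepancy expressed the other way round (1 / 1.28 ≈ 0.78, a 22% shorter period); any other symmetric wavelet function could be passed in to estimate its own correction factor.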

This is good to know, but it would be interesting to know if this parameter depends on frequency, and also to have a more precise way to describe it than a decimal. To get at these questions, we need an analytic solution.

Find minima analytically

Python's SymPy package is a bit like Maple — it understands math symbolically. We'll use sympy.solve to find an analytic solution. It turns out we need to write the Ricker function in yet another way, using SymPy symbols and expressions for \(\mathrm{e}\) and \(\pi\).

>>> import sympy as sp
>>> t, f = sp.Symbol('t'), sp.Symbol('f')
>>> r = (1 - 2*(sp.pi*f*t)**2) * sp.exp(-(sp.pi*f*t)**2)

Now we can easily find the solutions to the Ricker equation, that is, the times at which the function is equal to zero:

>>> sp.solvers.solve(r, t)
[-sqrt(2)/(2*pi*f), sqrt(2)/(2*pi*f)]

But this is not quite what we want. We need the minima, not the zero-crossings.

Maybe there's a better way to do this, but here's one way. Note that the gradient (slope or derivative) of the Ricker function is zero at the minima, so let's just solve the first time derivative of the Ricker function. That will give us the three times at which the function has a gradient of zero.

>>> dwdt = sp.diff(r, t)
>>> sp.solvers.solve(dwdt, t)
[0, -sqrt(6)/(2*pi*f), sqrt(6)/(2*pi*f)]

In other words, the non-zero minima of the Ricker function are at:

$$ \pm \frac{\sqrt{6}}{2\pi f} $$

Let's just check that this evaluates to the same answer we got from scipy.optimize, which was 0.01559.

>>> np.sqrt(6) / (2 * np.pi * 25)
0.015593936024673521

The solutions agree.

While we're looking at this, we can also compute the analytic solution to the amplitude of the minima, which SciPy calculated as -0.446. We just plug one of the expressions for the minimum time into the expression for r:

>>> r.subs({t: sp.sqrt(6)/(2*sp.pi*f)})
-2*exp(-3/2)

Apparent frequency

So what's the result of all this? What's the correction we need to make?

The minima of the Ricker wavelet are \(\sqrt{6}\ /\ \pi f_\mathrm{actual}\ \mathrm{s}\) apart — this is the apparent period. If we're assuming a pure tone, this period corresponds to an apparent frequency of \(\pi f_\mathrm{actual}\ /\ \sqrt{6}\ \mathrm{Hz}\). For \(f = 25\ \mathrm{Hz}\), this apparent frequency is:

>>> (np.pi * 25) / np.sqrt(6)
32.06374575404661

If we were to try to model the data with a 32 Hz Ricker, the frequency would be too high. We need to multiply the apparent frequency by a factor of \(\sqrt{6} / \pi\), like so:

>>> 32.064 * np.sqrt(6) / (np.pi)
25.00019823475659

This gives the correct frequency of 25 Hz.

To sum up, rearranging the expression above:

$$ f_\mathrm{actual} = f_\mathrm{apparent} \frac{\sqrt{6}}{\pi} $$

Expressed as a decimal, the factor we were seeking is therefore \(\sqrt{6}\ /\ \pi\):

>>> np.sqrt(6) / np.pi
0.779696801233676

That is, the reduction factor is 22%.
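If you find yourself doing this a lot, it's easily wrapped up in a one-line helper — just a convenience, not something from the original post:

def ricker_peak_from_counted(f_apparent):
    """Convert a frequency estimated by counting peaks to the Ricker peak frequency."""
    return f_apparent * np.sqrt(6) / np.pi

For instance, ricker_peak_from_counted(32.06) gives back very nearly 25 Hz.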


Curious coincidence: in the recent Pi Day post, I mentioned the Riemann zeta function of 2 as a way to compute \(\pi\). It evaluates to \((\pi / \sqrt{6})^2\). Is there a million-dollar connection between the humble Ricker wavelet and the Riemann hypothesis?

I doubt it.
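Incidentally, if you want to check that zeta claim for yourself, it's a one-liner (my own aside, using scipy.special.zeta):

>>> from scipy.special import zeta
>>> np.isclose(zeta(2), (np.pi / np.sqrt(6))**2)
True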

 
 

To make a wedge

We'll need a wavelet like the one we made last time. We could import it, if we've made one, but SciPy also has one so we can save ourselves the trouble. Remember to put %pylab inline at the top if using IPython notebook.

import numpy as np
from scipy.signal import ricker
import matplotlib.pyplot as plt

Now we need to make a physical earth model with three rock layers. In this example, let's make an acoustic impedance earth model. To keep it simple, let's define the earth model with two-way-travel time along the vertical axis (as opposed to depth). There are a number of ways you could describe a wedge using math, and you could probably come up with a way that is better than mine. Here's a way:

n_samples, n_traces = 600, 500
rock_names = ['shale 1', 'sand', 'shale 2']
rock_grid = np.zeros((n_samples, n_traces))

def make_wedge(n_samples, n_traces, layer_1_thickness, start_wedge, end_wedge):
    for j in np.arange(n_traces):
        for i in np.arange(n_samples):
            if i <= layer_1_thickness:
                rock_grid[i][j] = 1
            if i > layer_1_thickness:
                rock_grid[i][j] = 3
            # The i > layer_1_thickness test keeps the wedge below the top layer.
            if i > layer_1_thickness and j >= start_wedge and i - layer_1_thickness < j - start_wedge:
                rock_grid[i][j] = 2
            if j >= end_wedge and i > layer_1_thickness + (end_wedge - start_wedge):
                rock_grid[i][j] = 3
    return rock_grid
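For what it's worth, here's a loop-free way to build the same geometry with NumPy broadcasting — my own variation, not part of the original notebook:

def make_wedge_vectorized(n_samples, n_traces, layer_1_thickness, start_wedge, end_wedge):
    """Same wedge geometry as make_wedge, built with boolean masks."""
    i = np.arange(n_samples)[:, None]    # sample (time) index as a column
    j = np.arange(n_traces)[None, :]     # trace index as a row
    grid = np.where(i <= layer_1_thickness, 1, 3)              # flat layering
    wedge = (i > layer_1_thickness) & (j >= start_wedge) & \
            (i - layer_1_thickness < j - start_wedge)
    grid = np.where(wedge, 2, grid)                            # insert the sand wedge
    capped = (j >= end_wedge) & (i > layer_1_thickness + (end_wedge - start_wedge))
    return np.where(capped, 3, grid).astype(float)             # cap the wedge thickness

It produces the same rock_grid as the looping version, just without the nested for loops.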

Let's insert some numbers into our wedge function and make a particular geometry.

layer_1_thickness = 200
start_wedge = 50
end_wedge = 250
rock_grid = make_wedge(n_samples, n_traces, 
            layer_1_thickness, start_wedge, 
            end_wedge)

plt.imshow(rock_grid, cmap='copper_r')

Now we can give each layer in the wedge properties.

vp = np.array([3300., 3200., 3300.]) 
rho = np.array([2600., 2550., 2650.]) 
AI = vp*rho
AI = AI / 10e6 # re-scale (optional step)

Then we assign those values to every sample in the rock model.

model = np.copy(rock_grid)
model[rock_grid == 1] = AI[0]
model[rock_grid == 2] = AI[1]
model[rock_grid == 3] = AI[2]
plt.imshow(model, cmap='Spectral')
plt.colorbar()
plt.title('Impedances')

Now we can compute the reflection coefficients. I have left out a plot of them, but you can check it out in the full version in the nbviewer.

upper = model[:-1, :]
lower = model[1:, :]
rc = (lower - upper) / (lower + upper)
maxrc = np.amax(np.abs(rc))   # used later to symmetrize the colour scale

Now we make the wavelet interact with the model using convolution. The convolution function already exists in the SciPy signal library, so we can just import it.

from scipy.signal import convolve

def make_synth(f):
    wavelet = ricker(512, 1e3/(4.*f))   # scipy's ricker wants a width in samples, not a frequency in Hz
    wavelet = wavelet / max(wavelet)    # normalize
    synth = np.zeros((rc.shape[0] + len(wavelet) - 1, n_traces))
    for k in range(n_traces):
        synth[:, k] = convolve(rc[:, k], wavelet)
    # Trim the convolution tails so the synthetic lines up with the model.
    synth = synth[len(wavelet)//2 : -(len(wavelet)//2), :]
    return synth

Finally, we plot the results.

frequencies = np.array([5, 10, 15])
plt.figure(figsize=(15, 4))

for i in np.arange(len(frequencies)):
    this_plot = make_synth(frequencies[i])
    plt.subplot(1, len(frequencies), i+1)
    plt.imshow(this_plot, cmap='RdBu', vmax=maxrc, vmin=-maxrc, aspect=1)
    plt.title('%d Hz wavelet' % frequencies[i])
    plt.grid()
    plt.axis('tight')
    # Add some labels
    for k, name in enumerate(rock_names):
        plt.text(400, 100 + ((end_wedge - start_wedge)*k + 1), name,
                 fontsize=14, color='gray',
                 horizontalalignment='center', verticalalignment='center')

 

That's it. As you can see, the marriage of building mathematical functions and plotting them can be a really powerful tool you can apply to almost any physical problem you happen to find yourself working on.

You can access the full version in the nbviewer. It has a few more figures than what is shown in this post.

A day of geocomputing

I will be in Calgary in the new year and running a one-day version of this new course. To start building your own tools, pick a date and sign up:

Eventbrite - Agile Geocomputing

To plot a wavelet

As I mentioned last time, a good starting point for geophysical computing is to write a mathematical function describing a seismic pulse. The IPython Notebook is designed to be used seamlessly with Matplotlib, which is nice because we can throw our function on a graph and see if we were right. When you start your own notebook, type

ipython notebook --pylab inline

We'll make use of a few functions within NumPy, a workhorse to do the computational heavy-lifting, and Matplotlib, a plotting library.

import numpy as np
import matplotlib.pyplot as plt

Next, we can write some code that defines a function called ricker. It computes a Ricker wavelet for a range of discrete time values t and a dominant frequency, f:

def ricker(f, length=0.512, dt=0.001):
    t = np.linspace(-length/2, (length-dt)/2, int(length/dt))
    y = (1. - 2.*(np.pi**2)*(f**2)*(t**2)) * np.exp(-(np.pi**2)*(f**2)*(t**2))
    return t, y

Here the function needs 3 input parameters: the frequency, f, the length of time over which we want the wavelet defined, length, and the sample interval of the signal, dt. Calling the function returns two arrays: the time axis, t, and the values of the function, y.

To create a 5 Hz Ricker wavelet, assign the value of 5 to the variable f, and pass it into the function like so,

f = 5
t, y = ricker(f)

To plot the result,

plt.plot(t, y)

But with a few more commands, we can improve the cosmetics,

plt.figure(figsize=(7,4))
plt.plot(t, y, lw=2, color='black', alpha=0.5)
plt.fill_between(t, y, 0, where=y > 0.0, interpolate=False, color='blue', alpha=0.5)
plt.fill_between(t, y, 0, where=y < 0.0, interpolate=False, color='red', alpha=0.5)

# Axes configuration and settings (optional)
plt.title('%d Hz Ricker wavelet' %f, fontsize = 16 )
plt.xlabel( 'two-way time (s)', fontsize = 14)
plt.ylabel('amplitude', fontsize = 14)
plt.ylim((-1.1,1.1))
plt.xlim((min(t),max(t)))
plt.grid()
plt.show()

Next up, we'll make this wavelet interact with a model of the earth using some math. Let me know if you get this up and running on your own.

Let's do it

It's short notice, but I'll be in Calgary again early in the new year, and I will be running a one-day version of this new course. To start building your own tools, pick a date and sign up:

Eventbrite - Agile Geocomputing

E is for Envelope

There are 8 layers in this simple blocky earth model. You might say that there are only 7 important pieces of information for this cartoon earth: the 7 reflectivity contrasts at the layer boundaries.
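To make that concrete, here's a tiny sketch with made-up impedance values (not the ones in the model shown): eight layers give exactly seven reflection coefficients.

import numpy as np

# Eight hypothetical acoustic impedances, one per layer (arbitrary units).
ai = np.array([6.0, 6.5, 5.8, 7.1, 6.9, 7.4, 6.2, 6.8])
rc = (ai[1:] - ai[:-1]) / (ai[1:] + ai[:-1])
print(len(rc))   # 7 reflectivity contrasts for 8 layers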

The seismic traces, however, contain more information than that. On the zero-phase trace, there are 21 extrema (peaks and troughs). Worse yet, on the phase-rotated trace there are 28. So somehow the process of wave propagation has embedded more information than we need. Actually, in that case, maybe we shouldn't call it information: it's noise.

It can be hard to tell what is real and what is side lobe, and soon you are assigning geological significance to noise. A literal interpretation of the peaks and troughs would produce far more layers than there actually are. If you interpret every extremum as matching a boundary, you will be wrong.

Consider the envelope. The envelope has extrema positioned exactly at each boundary, and perhaps more importantly, it literally envelopes (I enunciate it differently here for drama) the part of the waveform associated with that reflection. 7 boundaries, 7 bumps on the envelope, correctly positioned in the time domain.
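If you want to compute an envelope yourself, a minimal sketch looks like this, using SciPy's Hilbert transform (the trace here is just a stand-in pulse; any 1D array of amplitudes will do):

import numpy as np
from scipy.signal import hilbert

# A stand-in trace: a 25 Hz Ricker-like pulse.
t = np.arange(-0.128, 0.128, 0.001)
trace = (1 - 2*(np.pi*25*t)**2) * np.exp(-(np.pi*25*t)**2)

# scipy's hilbert() returns the analytic signal; its magnitude is the envelope.
envelope = np.abs(hilbert(trace))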

Notice how the envelope encompasses all phase rotations from 0 to 360 degrees; it's phase invariant. Does this make it more robust? But it's so broad! Are we losing precision or accuracy by accompanying our trace with its envelope? What does vertical resolution really mean anyway?

Does this mean that every time there is an envelope maximum I can expect a true layer boundary? I, for one, don't know if this is foolproof in the face of interfering wavelets, but it has implications for how we work as seismic interpreters. Tomorrow we'll take a look at the envelope attribute in the context of real data.