March 26, 2018

Real and apparent seismic frequency

March 26, 2018/ Matt Hall

There's a Jupyter Notebook for you to follow along with this tutorial. You can run it right here in your browser.

We often use Ricker wavelets to model seismic, for example when making a synthetic seismogram with which to help tie a well. One simple way to guesstimate the peak or central frequency of the wavelet that will model a particlar seismic section is to count the peaks per unit time in the seismic. But this tends to overestimate the actual frequency because the maximum frequency of a Ricker wavelet is more than the peak frequency. The question is, how much more?

To investigate, let's make a Ricker wavelet and see what it looks like in the time and frequency domains.

>>> T, dt, f = 0.256, 0.001, 25

>>> import bruges
>>> w, t = bruges.filters.ricker(T, dt, f, return_t=True)

>>> import scipy.signal
>>> f_W, W = scipy.signal.welch(w, fs=1/dt, nperseg=256)

When we count the peaks in a section, the assumption is that this apparent frequency — that is, the reciprocal of apparent period or distance between the extrema — tells us the dominant or peak frequency.

To help see why this assumption is wrong, let's compare the Ricker with a signal whose apparent frequency does match its peak frequency: a pure cosine:

>>> c = np.cos(2 * 25 * np.pi * t)
>>> f_C, C = scipy.signal.welch(c, fs=1/dt, nperseg=256)

Notice that the signal is much narrower in bandwidth. If we allowed more oscillations, it would be even narrower. If it lasted forever, it would be a spike in the frequency domain.

Let's overlay the signals to get a picture of the difference in the relative periods:

The practical consequence of this is that if we estimate the peak frequency to be $f\ \mathrm{Hz}$, then we need to reduce $f$ by some factor if we want to design a wavelet to match the data. To get this factor, we need to know the apparent period of the Ricker function, as given by the time difference between the two minima.

Let's look at a couple of different ways to find those minima: numerically and analytically.

Find minima numerically

We'll use scipy.optimize.minimize to find a numerical solution. In order to use it, we'll need a slightly different expression for the Ricker function — casting it in terms of a time basis t. We'll also keep f as a variable, rather than hard-coding it in the expression, to give us the flexibility of computing the minima for different values of f.

Here's the equation we're implementing:

$$ w(t, f) = (1 - 2\pi^2 f^2 t^2)\ e^{-\pi^2 f^2 t^2} $$

In Python:

>>> def ricker(t, f):
>>>     return (1 - 2*(np.pi*f*t)**2) * np.exp(-(np.pi*f*t)**2)

Check that the wavelet looks like it did before, by comparing the output of this function when f is 25 with the wavelet w we were using before:

>>> f = 25
>>> np.allclose(w, ricker(t, f=25))
True

Now we call SciPy's minimize function on our ricker function. It itertively searches for a minimum solution, then gives us the x (which is really t in our case) at that minimum:

>>> import scipy.optimize
>>> f = 25
>>> scipy.optimize.minimize(ricker, x0=0, args=(f))

fun: -0.4462603202963996
 hess_inv: array([[1]])
      jac: array([-2.19792128e-07])
  message: 'Optimization terminated successfully.'
     nfev: 30
      nit: 1
     njev: 10
   status: 0
  success: True
        x: array([0.01559393])

So the minimum amplitude, given by fun, is -0.44626 and it occurs at an x (time) of $\pm 0.01559\ \mathrm{s}$.

In comparison, the minima of the cosine function occur at a time of $\pm 0.02\ \mathrm{s}$. In other words, the period appears to be $0.02 - 0.01559 = 0.00441\ \mathrm{s}$ shorter than the pure waveform, which is...

>>> (0.02 - 0.01559) / 0.02
0.22050000000000003

...about 22% shorter. This means that if we naively estimate frequency by counting peaks or zero crossings, we'll tend to overestimate the peak frequency of the wavelet by about 22% — assuming it is approximately Ricker-like; if it isn't we can use the same method to estimate the error for other functions.

This is good to know, but it would be interesting to know if this parameter depends on frequency, and also to have a more precise way to describe it than a decimal. To get at these questions, we need an analytic solution.

Find minima analytically

Python's SymPy package is a bit like Maple — it understands math symbolically. We'll use sympy.solve to find an analytic solution. It turns out that it needs the Ricker function writing in yet another way, using SymPy symbols and expressions for $\mathrm{e}$ and $\pi$.

import sympy as sp
t, f = sp.Symbol('t'), sp.Symbol('f')
r = (1 - 2*(sp.pi*f*t)**2) * sp.exp(-(sp.pi*f*t)**2)

Now we can easily find the solutions to the Ricker equation, that is, the times at which the function is equal to zero:

>>> sp.solvers.solve(r, t)
[-sqrt(2)/(2*pi*f), sqrt(2)/(2*pi*f)]

But this is not quite what we want. We need the minima, not the zero-crossings.

Maybe there's a better way to do this, but here's one way. Note that the gradient (slope or derivative) of the Ricker function is zero at the minima, so let's just solve the first time derivative of the Ricker function. That will give us the three times at which the function has a gradient of zero.

>>> dwdt = sp.diff(r, t)
>>> sp.solvers.solve(dwdt, t)
[0, -sqrt(6)/(2*pi*f), sqrt(6)/(2*pi*f)]

In other words, the non-zero minima of the Ricker function are at:

$$ \pm \frac{\sqrt{6}}{2\pi f} $$

Let's just check that this evaluates to the same answer we got from scipy.optimize, which was 0.01559.

>>> np.sqrt(6) / (2 * np.pi * 25)
0.015593936024673521

The solutions agree.

While we're looking at this, we can also compute the analytic solution to the amplitude of the minima, which SciPy calculated as -0.446. We just plug one of the expressions for the minimum time into the expression for r:

>>> r.subs({t: sp.sqrt(6)/(2*sp.pi*f)})
-2*exp(-3/2)

Apparent frequency

So what's the result of all this? What's the correction we need to make?

The minima of the Ricker wavelet are $\sqrt{6}\ /\ \pi f_\mathrm{actual}\ \mathrm{s}$ apart — this is the apparent period. If we're assuming a pure tone, this period corresponds to an apparent frequency of $\pi f_\mathrm{actual}\ /\ \sqrt{6}\ \mathrm{Hz}$. For $f = 25\ \mathrm{Hz}$, this apparent frequency is:

>>> (np.pi * 25) / np.sqrt(6)
32.06374575404661

If we were to try to model the data with a Ricker of 32 Hz, the frequency will be too high. We need to multiply the frequency by a factor of $\sqrt{6} / \pi$, like so:

>>> 32.064 * np.sqrt(6) / (np.pi)
25.00019823475659

This gives the correct frequency of 25 Hz.

To sum up, rearranging the expression above:

$$ f_\mathrm{actual} = f_\mathrm{apparent} \frac{\sqrt{6}}{\pi} $$

Expressed as a decimal, the factor we were seeking is therefore $\sqrt{6}\ /\ \pi$:

>>> np.sqrt(6) / np.pi
0.779696801233676

That is, the reduction factor is 22%.

Curious coincidence: in the recent Pi Day post, I mentioned the Riemann zeta function of 2 as a way to compute $\pi$. It evaluates to $(\pi / \sqrt{6})^2$. Is there a million-dollar connection between the humble Ricker wavelet and the Riemann hypothesis?

I doubt it.

March 14, 2018

Happy π day, Einstein

March 14, 2018/ Matt Hall

It's Pi Day today, and also Einstein's 139th birthday. MIT celebrates it at 6:28 pm — in honour of pi's arch enemy, tau — by sending out its admission notices.

And Stephen Hawking died today. He will leave a great, black hole in modern science. I saw him lecture in London not long after A Brief History of Time came out. It was one of the events that inspired me along my path to science. I recall he got more laughs than a lot of stand-ups I've seen.

But I can't really get behind 3/14. The weird American way of writing dates, mixed-endian style, really irks me. As a result, I have previously boycotted Pi Day, instead celebrating it on 31/4, aka 31 April, aka 1 May. Admittedly, this takes the edge off the whole experience a bit, so I've decided to go full big-endian and adopt ISO-8601 from now on, which means Pi Day is on 3141-5-9. Expect an epic blog post that day.

Transcendence

Anyway, I will transcend the bickering over dates (pausing only to reject 22/7 and 6/28 entirely so don't even start) to get back to pi. It so happens that Pi Day is of great interest in our house this year because my middle child, Evie (10), is a bit obsessed with pi at the moment. Obsessed enough to be writing a book about it (she writes a lot of books; some previous topics: zebras, Switzerland, octopuses, and Settlers of Catan fan fiction, if that's even a thing).

I helped her find some ways to generate pi numerically. My favourite one uses Riemann's zeta function, which we'd recently watched a Numberphile video about. It's the sum of the reciprocals of the natural numbers raised to increasing powers:

$$\zeta(s) = \sum_{n=1}^\infty \frac{1}{n^s}$$

Leonhard Euler solved the Basel problem in 1734, proving that $\zeta(2) = \pi^2 / 6$, so you can compute pi slowly with a naive implementation of the zeta function:

def zeta(s, terms=1000):
    z = 0
    for t in range(1, int(terms)):
        z += 1 / t**s
    return z

(6 * zeta(2, terms=1e7))**0.5

Which returns pi, correct to 6 places:

3.141592558095893

Or you can use one of the various optimized versions of the zeta function, for example this one from the floating point math library mpmath (which I got from this awesome list of 100 ways to compute pi):

>>> from mpmath import *
>>> mp.dps = 50
>>> mp.pretty = True
>>>
>>> sqrt(6*zeta(2))
3.1415926535897932384626433832795028841971693993751068

...which is correct to 50 decimal places.

Here's the bit of Evie's book where she explains a bit about transcendental numbers.

Evie's book shows the relationships between the sets of natural numbers (N), integers (Z), rationals (Q), algebraic numbers (A), and real numbers (R). Transcendental numbers are real, but not algebraic. (Some definitions also let them be complex.)

I was interested in this, because while I 'knew' that pi is transcendental, I couldn't really articulate what that really meant, and why (say) √2, which is also irrational, is not also transcendental. Succinctly, transcendental means 'non-algebraic' (equivalent to being non-constructible). Since √2 is obviously the solution to $x^2 - 2 = 0$, it is algebraic and therefore not transcendental.

Weirdly, although hardly any numbers are known to be transcendental, almost all real numbers are. Isn't maths awesome?

Have a transcendental pi day!

The xkcd comic is by Randall Munroe and licensed CC-BY-NC.

March 06, 2018

Jounce, Crackle and Pop

March 06, 2018/ Matt Hall

I saw this T-shirt recently, and didn't get it. (The joke or the T-shirt.)

It turns out that the third derivative of displacement $x$ with respect to time $t$ — that is, the derivative of acceleration $\mathbf{a}$ — is called 'jerk' (or sometimes, boringly, jolt, surge, or lurch) and is measured in units of m/s³.

So far, so hilarious, but is it useful? It turns out that it is. Since the force $\mathbf{F}$ on a mass $m$ is given by $\mathbf{F} = m\mathbf{a}$, you can think of jerk as being equivalent to a change in force. The lurch you feel at the onset of a car's acceleration — that's jerk. The designers of transport systems and rollercoasters manage it daily.

$$ \mathrm{jerk,}\ \mathbf{j} = \frac{\mathrm{d}^3 x}{\mathrm{d}t^3}$$

Here's a visualization of velocity (green line) of a Tesla Model S driving in a parking lot. The coloured stripes show the acceleration (upper plot) and the jerk (lower plot). Notice that the peaks in jerk correspond to changes in acceleration.

The snap you feel at the start of the lurch? That's jounce — the fourth derivative of displacement and the derivative of jerk. Eager et al (2016) wrote up a nice analysis of these quantities for the examples of a trampolinist and roller coaster passenger. Jounce is sometimes called snap... and the next two derivatives are called crackle and pop.

What about momentum?

If the momentum $\mathrm{p}$ of a mass $m$ moving at a velocity $v$ is $m\mathbf{v}$ and $\mathbf{F} = m\mathbf{a}$, what is mass times jerk? According to the physicist Philip Gibbs, who investigated the matter in 1996, it's called yank:

“Momentum equals mass times velocity.
Force equals mass times acceleration.
Yank equals mass times jerk.
Tug equals mass times snap.
Snatch equals mass times crackle.
Shake equals mass times pop.”

There are jokes in there, help yourself.

What about integrating?

Clearly the integral of jerk is acceleration, and that of acceleration is velocity, the integral of which is displacement. But what is the integral of displacement with respect to time? It's called absement, and it's a pretty peculiar quantity to think about. In the same way that an object with linearly increasing displacement has constant velocity and zero acceleration, an object with linearly increasing absement has constant displacement and zero velocity. (Constant absement at zero displacement gives rise to the name 'absement': an absence of displacement.)

Integrating displacement over time might be useful: the area under the displacement curve for a throttle lever could conceivably be proportional to fuel consumption for example. So absement seems to be a potentially useful quantity, measured in metre-seconds.

Integrate absement and you get absity (a play on 'velocity'). Keep going and you get abseleration, abserk, and absounce. Are these useful quantities? I don't think so. A quick look at them all — for the same Tesla S dataset I used before — shows that the loss of detail from multiple cumulative summations makes for rather uninformative transformations:

You can reproduce the figures in this article with the Jupyter Notebook Jerk_jounce_etc.ipynb. Or you can launch a Binder right here in your browser and play with it there, without installing a thing!

References

David Eager et al (2016). Beyond velocity and acceleration: jerk, snap and higher derivatives. Eur. J. Phys. 37 065008. DOI: 10.1088/0143-0807/37/6/065008

Amarashiki (2012). Derivatives of position. The Spectrum of Riemannium blog, retrieved on 4 Mar 2018.

The dataset is from Jerry Jongerius's blog post, The Tesla (Elon Musk) and
New York Times (John Broder) Feud. I have no interest in the 'feud', I just wanted a dataset.

The T-shirt is from Chummy Tees; the image is their copyright and used here under Fair Use terms.

The vintage Snap, Crackle and Pop logo is copyright of Kellogg's and used here under Fair Use terms.

February 28, 2018

Digitalization... of what?

February 28, 2018/ Matt Hall

I've been hearing a lot about 'digitalization', or 'digital transformation', recently. What is this buzzword?

The general idea seems to be: exploit lots and lots of data (which we already have), with analytics and machine learning probably, to do a better job finding and producing fuel and energy safely and responsibly.

At the centre of it all is usually data. Lots of data, usually in a lake. And this is where it all goes wrong. Digitalization is not about data. And it's not about technology either. Or cloud. Or IoT.

Interest in the terms "digital transformation" and "digitalization" since 2004, according to Google Trends. The data reveal a slight preference for the term "digitalization" in central and northern Europe. Google Ngram Viewer indicates that the… — Interest in the terms "digital transformation" and "digitalization" since 2004, according to Google Trends. The data reveal a slight preference for the term "digitalization" in central and northern Europe. Google Ngram Viewer indicates that the term "digitalization" has been around for over 100 years, but it is also a medical term connected with the therapeutic use of digitalis. Just to be clear, that's not what we're talking about.

It's about people

Oh no, here I go with the hand-wavy, apple-pie "people not process" nonsense... well, yes. I'm convinced that it's humans we're transforming, not data or technology. Or clouds. Or Things.

I think it's worth spelling out, because I think most corporations have not grasped the human aspect yet. And I don't think it's unreasonable to say that petroleum has a track record of not putting people at the centre of its activities, so I worry that this will happen again. This would be bad, because it might mean that digitalization not only fails to get traction — which would be bad enough because this revolution is long overdue — but also that it causes unintended problems.

Without people, digital transformation is just another top-down 'push' effort, with too much emphasis on supply. I think it's smarter to create demand, or 'pull', so that professionals are asking for support, and tools, and databases, and are engaged in how those things are created.

Put technical professionals at the heart of the revolution, and the rest will follow. The inverse is not true.

Strategies

This is far from an exhaustive list, but here are some ideas for ways to get ahead in digital transformation:

Make it easy for digitally curious people to dip a toe in. Build a beginner-friendly computing environment, and encourage people to use it. Challenge your IT people to support a culture of experimentation and creativity.
Give those curious professionals access to professional development channels, whether it's our courses, other courses, online channels like Lynda.com or Coursera, or whatever.
Build a community of practice for 'scientific computing'. Whether it's a Yammer group or something more formal, be sure to encourage frequent face-to-face meetups, and perhaps an intranet portal.
Start to connect subsurface professionals with software engineers, especially web programmers and data scientists, elsewhere in the organization. I think the best way is to embed programmers into technical teams.
Encourage participation in external channels like conferences and publications, data science contests, hackathons, open source projects, and so on. I guarantee you'll see a step change in skills and enthusiasm.

The bottom line is that we're looking for a profound culutral change. It will be slow. More than that, it needs to be slow. It might only take a year or two to get traction for an idea like "digital first". But deeper concepts, like "machine readable microservices" or "data-driven decisions" or "reproducible workflows", must take longer because you can't build that high without a solid foundation. Successfully applying specific technologies like deep learning, augmented reality, or blockchain, will certainly require a high level of technology literacy, and will take years to get right.

What's going on with scientific computing in your organization? Are you 'digitally curious'? Do you feel well-supported? Do you know others in your organization like you?

The circuit board image in the thumbnail for this post is by Carl Drougge, licensed CC-BY-SA.

February 23, 2018

Finding Big Bertha with a hot wire

February 23, 2018/ Matt Hall

Sunday will be the 131st birthday of General Andrew McNaughton, a Canadian electrical engineer who served in the Canadian Expeditionary Force in the First World War. He was eventually appointed commander of the Canadian Corps Heavy Artillery and went on to serve in the Second World War as well.

So what is a professional soldier doing on a blog about geoscience? Well, McNaughton was part of the revolution of applied acoustics and geophysics that emerged right before and after the First World War.

Along with eminent British physicist Lawrence Bragg, engineer William Sansome Tucker, and physicist Charles Galton Darwin (the other Charles Darwin's grandson), among others, McNaughton applied physics to the big problem of finding the big noisy things that were trying to blow everyone up. They were involved in an arms race of their own — German surveyor Ludger Mintrop was trying to achieve the same goal from the other side of the trenches.

After gaining experience as a gunner, McNaughton became one of a handful of scientists and engineers involved in counter-battery operations. Using novel ranging techniques, these scientists gave the allied forces a substantial advantage over the enemy. Counter-battery fire became an weapon at pivotal battles like Vimy Ridge, and certainly helped expedite the end of the war.

If all this sounds like a marginal way to win a battle, stop think for a second about these artillery. The German howitzer, known as 'Big Bertha' (left), could toss an 820 kg (1800 lb) shell about 12.5 km (7.8 miles). In other words, it was incredibly annoying.

Combining technologies

Localization accuracy on the order of 5–10 m on the large majority of gun positions was eventually achieved by the coordinated use of several technologies, including espionage, cartography, aerial reconnaissance photography, and the new counter-measures of flash spotting and sound ranging.

Flash spotting was more or less just what it sounds like: teams of spotters recording the azimuth of artillery flashes, then triangulating artillery positions from multiple observations. The only real trick was in reporting the timing of flashes to help establish that the flashes came from the same gun.

Sound ranging, on the other hand, is a tad more complicated. It seems that Lawrence Bragg was the first to realize that the low frequency sound of artillery fire — which he said lifted him off the privy seat in the outhouse at his lodgings — might be a useful signal. However, microphones were not up to the task of detecting such low frequencies. Furthermore, the signal was masked by the (audible) sonic boom of the shell, as well as the shockwaves of passing shells.

Elsewhere in Belgium, William Tucker had another revelation. Lying inside a shack with holes in its walls, he realized that the 20 Hz pressure wave from the gun created tiny puffs of air through the holes. So he looked for a way to detect this pulse, and came up with a heated platinum wire in a rum jar. The filament's resistance dropped when cooled by the wavefront's arrival through an aperture. The wire was unaffected by the high-frequency shell wave. Later, moving-coil 'microphones' (geophones, essentially) were also used, as well as calibration for wind and temperature. The receivers were coupled with a 5-channel string galvanometer, invented by French engineers, to record traces onto 35-mm film bearing timing marks:

McNaughton continued to develop these technologies through the war, and by the end was successfully locating the large majority of enemy artillery locations, and was even able to specify the calibre of the guns and their probable intended targets. Erster Generalquartiermeister Erich Ludendorff commented at one point in the war:

“According to a captured English document the English have a well- developed system of sound-ranging which in theory corresponds to our own. Precautions are accordingly to be taken to camouflage the sound: e.g. registration when the wind is contrary, and when there is considerable artillery activity, many batteries firing at the same time, simultaneous firing from false positions, etc.”

An acoustic arsenal

The hot-wire artillery detector was not Tucker's only acoustic innovation. He also pioneered the use of acoustic mirrors for aircraft detection. Several of these were built around the UK's east coast, starting around 1915 — the three shown here are at Denge in Kent. They were rendered obselete by the invention of radar around the beginning of World War Two.

Acoustic and seismic devices are still used today in military and security applications, though they are rarely mentioned in applied geophysics textbooks. If you know about any interesting contemporary uses, tell us about it in the comments.

According to Crown Copyright terms, the image of McNaughton is out of copyright. The acoustic mirror image is by Paul Russon, licensed CC-BY-SA. The uncredited/unlicensed galvanometer trace is from the excellent Stop, hey, what's that sound article on the geographical imaginations blog; I assume it is out of copyright. The howitzer image is out of copyright.

This post on Target acquisition and counter battery is very informative and has lots of technical details, though most of it pertains to later technology. The Boom! Sounding out the enemy article on ScienceNews for Students is also very nice, with lots of images.

February 21, 2018

Unsolved problems in applied geoscience

February 21, 2018/ Matt Hall

I like unsolved problems. I first wrote about them way back in late 2010 — Unsolved problems was the eleventh post on this blog. I touched on the theme again in 2013, before and after the first 'unsession' at the GeoConvention, which itself was dedicated to finding the most pressing questions in exploration geoscience. As we turn towards the unsession at AAPG in Salt Lake City in May, I find myself thinking again about unsolved problems. Specifically, what are they? How can we find them? And what can we do to make them easier to solve?

It turns out lots of people have asked these questions before.

I've compiled a list of various attempts by geoscientists to list he big questions in the field. The only one I was previous aware of was Milo Backus's challenges in applied seismic geophysics, laid out in his president's column in GEOPHYSICS in 1980 and highlighted later by Larry Lines as part of the SEG's 75th anniversary. Here are some notable attempts:

John William Dawson, 1883 — Nova Scotia's most famous geologist listed unsolved problems in geology in his presidential address to the American Association for the Advancement of Science. They included the Cambrian Explosion, and the origin of the Antarctic icecap.
Leason Heberling Adams, 1947 — One of the first experimental rock physicists, Adams made the first list I can find in geophysics, which was less than 30 years old at the time. He included the origin of the geomagnetic field, and the temperature of the earth's interior.
Milo Backus, 1980 — The list included direct hydrocarbon detection, seismic imaging, attenuation, and anisotropy.
Mary Lou Zoback, 2000 — As her presidential address to the GSA, Zoback kept things quite high-level, asking questions about finding signal indynamic systems, defining mass flux and energy balance, identifying feedback loops, and communicating uncertainty and risk. This last one pops up in almost every list since.
Calgary's geoscience community, 2013 — The 2013 unsession unearthed a list of questions from about 50 geoscientists. They included: open data, improving seismic resolution, dealing with error and uncertainty, and global water management.
Daniel Garcia-Castellanos, 2014 — The Retos Terrícolas blog listed 49 problems in 7 categories, ranging from the early solar system to the earth's interior, plate tectonics, oceans, and climate. The list is still maintained by Daniel and pops up occasionally on other blogs and on Wikipedia.

The list continues — you can see them all in this presentation I made for a talk (online) at the Bureau of Economic Geology last week (thank you to Sergey Fomel for hosting me!). During the talk, I took the opportunity to ask those present what their unsolved problems are, especially the ones in their own fields. Here are a few of what we got (the rest are in the preso):

1-what-are-the-biggest-unsolved-problems-in-your-field-1.jpg

What are your unsolved problems in applied geoscience? Share them in the comments!

If you have about 50 minutes to spare, you can watch the talk here, courtesy of BEG's streaming service.

Click here to watch the talk >>>

February 08, 2018

Easier, better, faster, stronger

February 08, 2018/ Matt Hall

Yesterday I pushed a new release of bruges to Python's main package repository, PyPi. Version 0.3.3 might not sound like an especially auspicious version perhaps, but I'm excited about the new things we've added recently. It has come a long way since we announced it back in 2015, so if you haven't checked it out lately, now's a good time to take another look.

What is bruges again?

Bruges is a...

In other words, nothing fancy — just equations. It is free, open source software. It's aimed at geophysicists who use Python.

How do you install it? The short answer is pip:

    pip install bruges

So what's new?

Here are the highlights of what's been improved and added in the last few months:

The reflectivity equations in reflection module now work on arrays for the Vp, Vs, and rho values, as well as the theta values. This is about 10 times faster than running a loop over elements; the Zoeppritz solution is 100× faster.
The various Zoeppritz solutions and the Aki–Richards approximations now return the complex reflectivity and therefore show post-critical amplitudes correctly.
A new reflection coefficient series function, reflection.reflectivity(), makes it easier to compute offset reflectivities from logs.
Several new linear and non-linear filters are in bruges.filters, including median (good for seismic horizons), mode (good for waveform classification), symmetric nearest-neighbours or snn, and kuwahara.
The wavelets ricker(), sweep() (aka Klauder) and ormsby() wavelet now all work for a sequence of frequencies, returning a wavelet bank. Also added a sinc() wavelet, with a taper option to attenuate the sidelobes.
Added inverse_gardner, and other density and velocity transforms, to petrophysics.
Added transform.v_rms() (RMS velocity), transform.v_avg() (average velocity) and transform.v_bac() (naïve Backus average). These all operate in a 'cumulative' average-down-to sense.
Added a coordinate transformation to translate between arbitrarily oriented (x,y) and (inline, line) coordinates.

Want to try using it right now, with no installation? Give it a spin in My Binder! See how easy it is to compute elastic moduli, or offset reflection coefficients, or convert a log to time.

Want to support the development of open source geophysics software? Here's how:

Use it! This is the main thing we care about.
Report problems on the project's Issue page.
Fork the project and make your own changes, then share them back.
Pay us for the development of functionality you need.

January 31, 2018

This year's social coding events

January 31, 2018/ Matt Hall

If you've always wondered what goes on at our hackathons, make 2018 the year you find out. There'll be plenty of opportunities. We'll be popping up in Salt Lake City, right before the AAPG annual meeting, then again in Copenhagen, before EAGE. We're also running events at the AAPG and EAGE meetings. Later, in the autumn, we'll be making some things happen around SEG too.

If you just want to go sign up right now, head to the Events page. If you want more deets first, read on.

Salt Lake City in May: machine learning and stratigraphy

This will be one of our 'traditional' hackathons. We're looking for 7 or 8 teams of four to come and dream up, then hack on, new ideas in geostatistics and machine learning, especially around the theme of stratigraphy. Not a coder? No worries! Come along to the bootcamp on Friday 18 May and acquire some new skills. Or just show up and be a brainstormer, tester, designer, or presenter.

Thank you to Earth Analytics for sponsoring this event. If you'd like to sponsor it too, check out your options. The bottom line is that these events cost about $20,000 to put on, so we appreciate all the help we can get.

It doesn't stop with the hackathon demos on Sunday. At the AAPG ACE, Matt is part of the team bringing you the Machine Learning Unsession on Wednesday afternoon. If you're interested in the future of computation and geoscience, come along and be heard. It wouldn't be the same without you.

Copenhagen in June: visualization and interaction

After events in Vienna in 2016 and Paris in 2017, we're looking forward to being back in Europe in June. The weekend before the EAGE conference, we'll be hosting the Subsurface Hackathon once again. Partnering with Dell EMC and Total E&P, as last year, we'll be gathering 60 eager geoscientists to explore data visualization, from plotting to virtual reality. I can't wait.

In the EAGE Exhibition itself, we're cooking up something else entirely. The Codeshow is a new kind of conference event, mixing coding tutorials with demos from the hackathon and even some mini-hackathon projects to get you started on your own. It's 100% experimental, just the way we like it.

Anaheim in October: something exciting

We'll be at SEG in Anaheim this year, in the middle of October. No idea what exactly we'll be up to, but there'll be a hackathon for sure (sign up for alerts here). And tacos, lots of those.

You can get tickets to most of these events on the Event page. If you have ideas for future events, or questions about them, drop us a line or leave a comment on this post!

I'll leave you with a short and belated look at the hackathon in Paris last year...

A quick look at the Subsurface Hackathon in Paris, June 2017.

January 10, 2018

What is scientific computing?

January 10, 2018/ Matt Hall

I started my career in sequence stratigraphy, so I know a futile discussion about semantics when I see one. But humour me for a second.

As you may know, we offer a multi-day course on 'geocomputing'. Somebody just asked me: what is this mysterious, made-up-sounding discipline? Swiftly followed by: can you really teach people how to do computational geoscience in a few days? And then: can YOU really teach people anything??

Good questions

You can come at the same kind of question from different angles. For example, sometimes professional programmers get jumpy about programming courses and the whole "learn to code" movement. I think the objection is that programming is a profession, like other kinds of engineering, and no-one would dream of offering a 3-day course on, say, dentistry for beginners.

These concerns are valid, sort of.

No, you can't learn to be a computational scientist in 3 days. But you can make a start. A really good one at that.
And no, we're not programmers. But we're scientists who get things done with code. And we're here to help.
And definitely no, we're not trying to teach people to be software engineers. We want to see more computational geoscientists, which is a different thing entirely.

So what's geocomputing then?

Words seem inadequate for nuanced discussion. Let's instead use the language of ternary diagrams. Here's how I think 'scientific computing' stacks up against 'computer science' and 'software engineering'...

If you think these are confusing, just be glad I didn't go for tetrahedrons.

These are silly, of course. We could argue about them for hours I'm sure. Where would IT fit? ("It's all about the business" or something like that.) Where does Agile fit? (I've caricatured our journey, or tried to.) Where do you fit?

December 29, 2017

x lines of Python: contour maps

December 29, 2017/ Matt Hall

Difficulty rating: EASY

Following on from the post a couple of weeks ago about colourmaps, I wanted to poke into contour maps a little more. Ostensibly, making a contour plot in matplotlib is a one-liner:

plt.contour(data)

But making a contour plot look nice takes a little more work than most of matplotlib's other plotting functions. For example, to change the contour levels you need to make an array containing the levels you want... another line of code. Adding index contours needs another line. And then there's all the other plotty stuff.

Here's what we'll do:

Load the data from a binary NumPy file.
Check the data looks OK.
Get the min and max values from the map.
Generate the contour levels.
Make a filled contour map and overlay contour lines.
Make a map with index contours and contour labels.

The accompanying notebook sets out all the code you will need. You can even run the code right in your browser, no installation required.

Here's the guts of the notebook:

import numpy as np
import matplotlib.pyplot as plt

seabed = np.load('../data/Penobscot_Seabed.npy')
seabed *= -1
mi, ma = np.floor(np.nanmin(seabed)), np.ceil(np.nanmax(seabed))
step = 2
levels = np.arange(10*(mi//10), ma+step, step)
lws = [0.5 if level % 10 else 1 for level in levels]

# Make the plot
fig = plt.figure(figsize=(12, 8))
ax = fig.add_subplot(1,1,1)
im = ax.imshow(seabed, cmap='GnBu_r', aspect=0.5, origin='lower')
cb = plt.colorbar(im, label="TWT [ms]")
cb.set_clim(mi, ma)
params = dict(linestyles='solid', colors=['black'], alpha=0.4)
cs = ax.contour(seabed, levels=levels, linewidths=lws, **params)
ax.clabel(cs, fmt='%d')
plt.show()

This produces the following plot:

Blog