# Jounce, Crackle and Pop

I saw this T-shirt recently, and didn't get it. (The joke or the T-shirt.)

It turns out that the third derivative of displacement $$x$$ with respect to time $$t$$ — that is, the derivative of acceleration $$\mathbf{a}$$ — is called 'jerk' (or sometimes, boringly, jolt, surge, or lurch) and is measured in units of m/s³.

So far, so hilarious, but is it useful? It turns out that it is. Since the force $$\mathbf{F}$$ on a mass $$m$$ is given by $$\mathbf{F} = m\mathbf{a}$$, you can think of jerk as being equivalent to a change in force. The lurch you feel at the onset of a car's acceleration — that's jerk. The designers of transport systems and rollercoasters manage it daily.

$$\mathrm{jerk,}\ \mathbf{j} = \frac{\mathrm{d}^3 x}{\mathrm{d}t^3}$$

Here's a visualization of velocity (green line) of a Tesla Model S driving in a parking lot. The coloured stripes show the acceleration (upper plot) and the jerk (lower plot). Notice that the peaks in jerk correspond to changes in acceleration.

The snap you feel at the start of the lurch? That's jounce  — the fourth derivative of displacement and the derivative of jerk. Eager et al (2016) wrote up a nice analysis of these quantities for the examples of a trampolinist and roller coaster passenger. Jounce is sometimes called snap... and the next two derivatives are called crackle and pop.
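If you want to try this chain of derivatives yourself, here's a minimal sketch, assuming evenly sampled velocity. The function name `derivatives` and the synthetic cubic are made up for illustration; this is not the code from the notebook.

```python
import numpy as np

def derivatives(v, dt):
    """Return acceleration, jerk and jounce from evenly sampled velocity."""
    a = np.gradient(v, dt)   # acceleration, m/s^2
    j = np.gradient(a, dt)   # jerk, m/s^3
    s = np.gradient(j, dt)   # jounce (snap), m/s^4
    return a, j, s

# Synthetic check: if v = t^3 then a = 3t^2, j = 6t and s = 6.
dt = 0.01
t = np.arange(0, 2, dt)
a, j, s = derivatives(t**3, dt)
```

Every differentiation amplifies noise, so on real measurements you would probably want to smooth the signal first.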

If the momentum $$\mathbf{p}$$ of a mass $$m$$ moving at a velocity $$\mathbf{v}$$ is $$m\mathbf{v}$$ and $$\mathbf{F} = m\mathbf{a}$$, what is mass times jerk? According to the physicist Philip Gibbs, who investigated the matter in 1996, it's called yank:
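For a constant mass, this drops straight out of differentiating $$\mathbf{F} = m\mathbf{a}$$ with respect to time, so yank is also the rate of change of force. The symbol $$\mathbf{Y}$$ is my own shorthand here, and not necessarily the notation Gibbs used:

$$\mathrm{yank,}\ \mathbf{Y} = \frac{\mathrm{d}\mathbf{F}}{\mathrm{d}t} = m\,\frac{\mathrm{d}\mathbf{a}}{\mathrm{d}t} = m\mathbf{j}$$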

Clearly the integral of jerk is acceleration, and that of acceleration is velocity, the integral of which is displacement. But what is the integral of displacement with respect to time? It's called absement, and it's a pretty peculiar quantity to think about. In the same way that an object with linearly increasing displacement has constant velocity and zero acceleration, an object with linearly increasing absement has constant displacement and zero velocity. (Constant absement at zero displacement gives rise to the name 'absement': an absence of displacement.)

Integrating displacement over time might be useful: the area under the displacement curve for a throttle lever could conceivably be proportional to fuel consumption for example. So absement seems to be a potentially useful quantity, measured in metre-seconds.
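Here's a rough sketch of how you might compute absement from sampled displacement, using a cumulative trapezoid rule. The function name `absement` and the lever example are assumptions for illustration only, not the notebook's code.

```python
import numpy as np

def absement(x, dt):
    """Cumulative time-integral of displacement (trapezoid rule), in m·s."""
    steps = 0.5 * (x[1:] + x[:-1]) * dt
    return np.concatenate([[0.0], np.cumsum(steps)])

# A lever held at a constant 0.5 m displacement for 10 s.
dt = 0.1
x = np.full(101, 0.5)
A = absement(x, dt)   # grows linearly, ending at 0.5 m * 10 s = 5 m·s
```

The constant displacement gives a linearly increasing absement, which is the relationship described above.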

Integrate absement and you get absity (a play on 'velocity'). Keep going and you get abseleration, abserk, and absounce. Are these useful quantities? I don't think so. A quick look at them all — for the same Tesla S dataset I used before — shows that the loss of detail from multiple cumulative summations makes for rather uninformative transformations:

You can reproduce the figures in this article with the Jupyter Notebook Jerk_jounce_etc.ipynb. Or you can launch a Binder right here in your browser and play with it there, without installing a thing!

### References

David Eager et al (2016). Beyond velocity and acceleration: jerk, snap and higher derivatives. Eur. J. Phys. 37 065008. DOI: 10.1088/0143-0807/37/6/065008

Amarashiki (2012). Derivatives of position. The Spectrum of Riemannium blog, retrieved on 4 Mar 2018.

The dataset is from Jerry Jongerius's blog post, The Tesla (Elon Musk) and New York Times (John Broder) Feud. I have no interest in the 'feud', I just wanted a dataset.

The T-shirt is from Chummy Tees; the image is their copyright and used here under Fair Use terms.

The vintage Snap, Crackle and Pop logo is copyright of Kellogg's and used here under Fair Use terms.

### Matt Hall

Matt is a geoscientist in Nova Scotia, Canada. Founder of Agile Scientific, co-founder of The HUB South Shore. Matt is into geology, geophysics, and machine learning.

# This year's social coding events

If you've always wondered what goes on at our hackathons, make 2018 the year you find out. There'll be plenty of opportunities. We'll be popping up in Salt Lake City, right before the AAPG annual meeting, then again in Copenhagen, before EAGE. We're also running events at the AAPG and EAGE meetings. Later, in the autumn, we'll be making some things happen around SEG too.

If you just want to go sign up right now, head to the Events page. If you want more deets first, read on.

### Salt Lake City in May: machine learning and stratigraphy

This will be one of our 'traditional' hackathons. We're looking for 7 or 8 teams of four to come and dream up, then hack on, new ideas in geostatistics and machine learning, especially around the theme of stratigraphy. Not a coder? No worries! Come along to the bootcamp on Friday 18 May and acquire some new skills. Or just show up and be a brainstormer, tester, designer, or presenter.

### Algorithmic puzzles and stuff

These are spectacular: randomly generated agate-like jigsaw puzzles. Every one is different! Even the shapes of the wooden pieces are generated with maths. They cost about USD 95, and come from Boston-based Nervous System. The same company has lots of other rock- and fossil-inspired stuff, like ammonite jewellery (from about USD 50) and some very cool coasters that look a bit like radiolarians (USD 48 for 4).

### There's always books

You can't go wrong with books. These all just came out, and just might appeal to a geoscientist. And if these all sound a bit too much like reading for work, try the Atlas of Beer instead. Click on a book to open its page at Amazon.com.

### The posts of Christmas past

If by any chance there aren't enough ideas here, or you are buying for a very large number of geoscientists, you'll have to dredge through the historical listicles of yesteryear: 2011, 2012, 2013, 2014, 2015, or 2016. You'll find everything there, from stocking stuffers to Triceratops skulls.

The images in this post are all someone else's copyright and are used here under fair use guidelines. I'm hoping the owners are cool with people helping them sell stuff!


# x lines of Python: Let's play golf!

Normally in the x lines of Python series, I'm trying to do something useful in as few lines of code as possible, but — and this is important — without sacrificing clarity. Code golf, on the other hand, tries solely to minimize the number of characters used, and to heck with clarity. This might, and probably will, result in rather obfuscated code.

So today in x lines, we set x = 1 and see what kind of geophysics we can express. Follow along in the accompanying notebook if you like.

### A Ricker wavelet

One of the basic building blocks of signal processing and therefore geophysics, the Ricker wavelet is a compact, pulse-like signal, often employed as a source in simulation of seismic and ground-penetrating radar problems. Here's the equation for the Ricker wavelet:

$$A = (1-2 \pi^2 f^2 t^2) e^{-\pi^2 f^2 t^2}$$

where $$A$$ is the amplitude at time $$t$$, and $$f$$ is the centre frequency of the wavelet. Here's one way to translate this into Python, more or less as expressed on SubSurfWiki:

    import numpy as np
    def ricker(length, dt, f):
        """Ricker wavelet at frequency f Hz, length and dt in seconds.
        """
        t = np.arange(-length/2, length/2, dt)
        y = (1.0 - 2.0*(np.pi**2)*(f**2)*(t**2)) * np.exp(-(np.pi**2)*(f**2)*(t**2))
        return t, y

That is already pretty terse at 261 characters, but there are lots of obvious ways, and some non-obvious ways, to reduce it. We can get rid of the docstring (the long comment explaining what the function does) for a start. And use the shortest possible variable names. Then we can exploit the redundancy in the repeated appearance of $$\pi^2f^2t^2$$... eventually, we get to:

    def r(l,d,f):import numpy as n;t=n.arange(-l/2,l/2,d);k=(n.pi*f*t)**2;return t,(1-2*k)/n.exp(k)

This weighs in at just 95 characters. Not a bad reduction from 261, and it's even not too hard to read. In the notebook accompanying this post, I check its output against the version in our geophysics package bruges, and it's legit:

The 95-character Ricker wavelet in green, with the points computed by the function in BRuges.
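If you don't have bruges to hand, a quick sanity check is to compare the golfed function against the readable one from earlier in this post. This is only a sketch: the wavelet length, sample interval and frequency below are arbitrary values I picked for illustration.

```python
import numpy as np

def ricker(length, dt, f):
    """Readable Ricker wavelet: f in Hz, length and dt in seconds."""
    t = np.arange(-length/2, length/2, dt)
    y = (1.0 - 2.0*(np.pi**2)*(f**2)*(t**2)) * np.exp(-(np.pi**2)*(f**2)*(t**2))
    return t, y

# The golfed version, exactly as given in the post.
def r(l,d,f):import numpy as n;t=n.arange(-l/2,l/2,d);k=(n.pi*f*t)**2;return t,(1-2*k)/n.exp(k)

# Arbitrary test parameters: 128 ms long, 1 ms sampling, 25 Hz.
t1, y1 = ricker(0.128, 0.001, 25)
t2, y2 = r(0.128, 0.001, 25)
same = np.allclose(t1, t2) and np.allclose(y1, y2)
```

Dividing by `n.exp(k)` instead of multiplying by `n.exp(-k)` saves a character and gives the same numbers to within floating-point tolerance.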

What else can we do?

In the notebook for this post, I run through some more algorithms for which I have unit-tested examples in bruges:

To give you some idea of why we don't normally code like this, here's what the Aki–Richards solution looks like:

    def r(a,c,e,b,d,f,t):import numpy as n;w=f-e;x=f+e;y=d+c;p=n.pi*t/180;s=n.sin(p);return w/x-(y/a)**2*w/x*s**2+(b-a)/(b+a)/n.cos((p+n.arcsin(b/a*s))/2)**2-(y/a)**2*(2*(d-c)/y)*s**2

A bit hard to debug! But there is still some point to all this — I've found I've had to really understand Python's order of mathematical operations, and find different ways of doing familiar things. Playing code golf also makes you think differently about repetition and redundancy. All good food for developing the programming brain.

Do have a play with the notebook, which you can even run in Microsoft Azure, right in your browser! Give it a try. (You'll need an account to do this. Create one for free.)

Many thanks to Jesper Dramsch and Ari Hartikainen for helping get my head into the right frame of mind for this silliness!


# Unweaving the rainbow

Last week at the Canada GeoConvention in Calgary I gave a slightly silly talk on colourmaps with Matteo Niccoli. It was the longest, funnest, and least fruitful piece of research I think I've ever embarked upon. And that's saying something.

### Freeing data from figures

It all started at the Unsession we ran at the GeoConvention in 2013. We asked a roomful of geoscientists, 'What are the biggest unsolved problems in petroleum geoscience?'. The list we generated was topped by Free the data, and that one topic alone has inspired several projects, including this one.

Our goal: recover digital data from any pseudocoloured scientific image, without prior knowledge of the colourmap.

I subsequently proffered this challenge at the 2015 Geophysics Hackathon in New Orleans, and a team from Colorado School of Mines took it on. Their first step was to plot a pseudocoloured image in (red, green, blue) space, which reveals the colourmap and brings you tantalizingly close to retrieving the data. Or so it seems...
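As a rough illustration of that first step (and emphatically not the Mines team's code), here's a sketch that collects the distinct colours in an image so they can be plotted as points in RGB space. The function name `colours_in_image` and the three-colour ramp are made up for this example.

```python
import numpy as np

def colours_in_image(img):
    """Return the unique (r, g, b) triples in an H x W x 3 image array."""
    pixels = img.reshape(-1, 3)
    return np.unique(pixels, axis=0)

# Toy pseudocoloured image: integer data mapped through a made-up 3-colour ramp.
cmap = np.array([[0, 0, 255], [0, 255, 0], [255, 0, 0]], dtype=np.uint8)
data = np.array([[0, 1, 2], [2, 1, 0]])
img = cmap[data]                      # shape (2, 3, 3)
colours = colours_in_image(img)       # one row per distinct colour
```

The unique colours trace out the colourmap as a set of points, but `np.unique` returns them sorted by value, not in colourmap order. Working out the order, and hence which colour means 'low' and which means 'high', is the part that remains hard.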

Here's our talk:


# The quick green forsterite jumped over the lazy dolomite

The best-known pangram (a sentence containing every letter of the alphabet) is probably 'The quick brown fox jumps over the lazy dog'.

There are lots of others of course. If you write like James Joyce, there are probably an infinite number of others. The point is to be short, and one of the shortest, with only 29 letters (!), even has a geological flavour:

I know what you're thinking: Cool, but what's the shortest set of mineral names that uses all the letters of the alphabet? What logophiliac geologist would not wonder the same thing?

Well, we posed this question in the most recent "Riddle me this" segment on the Undersampled Radio podcast. This blog post is my solution.

### The set cover problem

Finding pangrams in a list of words amounts to solving the classical set cover problem:

Our universe is the alphabet, and our $$S$$ is the list of $$m$$ mineral names. There is a slight twist in our case: the set cover problem wants the smallest subset of $$S$$ — the fewest members. But in this problem, I suspect there are several 4-word solutions (judging from my experiments), so I want the smallest total size of the members of the subset. That is, I want the fewest total letters in the solution.

### The solution

The set cover problem was shown to be NP-complete in 1972. What does this mean? It means that it's easy to tell if you have an answer (do you have all the letters of the alphabet?), but the only way to arrive at a solution is — to oversimplify massively — by brute force. (If you're interested in this stuff, this edition of the BBC's In Our Time is one of the best intros to P vs NP and complexity theory that I know of.)

Anyway, the point is that if we find a better way than brute force to solve this problem, then we need to write a paper about it immediately, claim our prize, collect our turkey, then move to a sunny tax haven with good water and double-digit elevation.

So, this could take a while: there are over 95 billion ways to draw 3 words from my list of 4600 mineral names. If we need 4 minerals, there are 400 trillion combinations... and a quick calculation suggests that my laptop will take a little over 50 years to check all the combinations.
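You can check the size of the search space with the standard library (Python 3.8 or later). Treat this as a back-of-envelope sketch: the figures quoted above line up with ordered draws (about 97 billion and 447 trillion), whereas ignoring the order of the words shrinks the counts by factors of 3! = 6 and 4! = 24.

```python
import math

n = 4600                          # number of mineral names in the list

ordered_3 = math.perm(n, 3)       # ordered draws of 3 distinct names
unordered_3 = math.comb(n, 3)     # same, ignoring order
ordered_4 = math.perm(n, 4)       # ordered draws of 4 distinct names
unordered_4 = math.comb(n, 4)     # same, ignoring order
```

Since a pangram doesn't care about word order, iterating over combinations instead of permutations is an easy saving before any cleverness is applied.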

### Can't we speed it up a bit?

Brute force is one thing, but we don't need to be brutish about it. Maybe we can think of some strategies to give ourselves a decent chance:

• The list is alphabetically sorted, so randomize the list before searching. (I did this.)
• Guess some 'useful' minerals and ensure that you get to them. (I did this too, with quartz.)
• Check there are at least 26 letters in the candidate words, and (if it's only records we care about) no more than 44, because I have a solution with 45 letters (see below).
• We could sort the list into word length order. That way we search shorter things first, so we should get shorter lists (which we want) earlier.
• My solution does not depend much on Python's set type. Maybe we could do more with set theory.
• Before inspecting the last word in each list, we could make sure it contains at least one letter that's so far missing.
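A few of these strategies can be sketched in a handful of lines. This is not the code in my notebook, just a minimal illustration: the function name `shortest_pangram`, the `record` parameter and the toy word list are all made up, and the search is only exhaustive over that tiny list.

```python
from itertools import combinations
from string import ascii_lowercase

ALPHABET = frozenset(ascii_lowercase)

def shortest_pangram(words, k, record=None):
    """Brute-force the k-word subset covering the alphabet with fewest letters.

    Only solutions with fewer letters than `record` are accepted, if given.
    """
    sets = {w: frozenset(w) & ALPHABET for w in words}
    best, best_len = None, record
    # Sort by word length so shorter candidates are tried first.
    for combo in combinations(sorted(words, key=len), k):
        n = sum(len(w) for w in combo)
        if n < 26 or (best_len is not None and n >= best_len):
            continue  # too short to be a pangram, or no improvement
        if frozenset().union(*(sets[w] for w in combo)) >= ALPHABET:
            best, best_len = combo, n
    return best, best_len

# Toy list: four long names plus some short distractors.
words = ['quartz', 'kvanefjeldite', 'abswurmbachite', 'pyroxmangite',
         'gold', 'talc', 'mica', 'zircon']
best, n_letters = shortest_pangram(words, 4)
```

On the full list of 4600 names the same loop is exactly the brute force that would take decades, so the pruning only helps at the margins.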

The best solution I've come up with so far has 45 letters, so there's plenty of room for improvement:

'quartz', 'kvanefjeldite', 'abswurmbachite', 'pyroxmangite'
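Checking a candidate is the easy part, and Python's set type does most of the work. A quick sketch (the variable names are just for illustration):

```python
from string import ascii_lowercase

solution = ['quartz', 'kvanefjeldite', 'abswurmbachite', 'pyroxmangite']

letters = set(''.join(solution))
is_pangram = letters >= set(ascii_lowercase)   # superset test
total = sum(len(w) for w in solution)          # total letters used
wasted = total - 26                            # repeats beyond one of each
```

With 45 letters in total, 19 of them are repeats, which gives a feel for how much slack a better solution could remove.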

My solution is in this Jupyter Notebook. Please put me out of my misery by improving on it.


# Two new short courses in Calgary

We're running two one-day courses in Calgary for the CSPG Spring Education Week. One of them is a bit... weird, so I thought I'd try to explain what we're up to.

Both classes run from 8:30 till 4:00, and both of them cost just CAD 425 for CSPG members.

### Get introduced to Python

The first course is Practical programming for geoscientists. It's essentially a short version of our 2 to 3 day Creative geocomputing course: we'll take a whirlwind tour through the Python programming language, then spend the afternoon looking at some basic practical projects. It might seem trivial, but leaving with a machine fully loaded with all the tools you'll need, plus a long list of resources and learning aids, is worth the price of admission alone.

If you've always wanted to get started with the world's easiest-to-learn programming language, this is the course you've been waiting for!

### Hashtag geoscience

This is the weird one. Hashtag geoscience: communicating geoscience in the 21st century. Join me, Evan, Graham Ganssle (my co-host on Undersampled Radio) — and some special guests — for a one-day sci comm special. Writing papers and giving talks is all so 20th century, so let's explore social media, blogging, podcasting, open access, open peer review, and all the other exciting things that are happening in scientific communication today. These tools will not only help you in your job, you'll find new friends, new ideas, and you might even find new work.

I hope a lot of people come to this event. For one, it supports the CSPG (we're not getting paid, we're on expenses only). Secondly, it'll be way more fun with a crowd. Our goal is for everyone to leave burning to write a blog, record a podcast, or at least create a Twitter account.

One of our special guests will be young-and-famous geoscience vlogger Dr Chris. Coincidentally, we just interviewed him on Undersampled Radio. Here's the uncut video version; audio will be on iTunes and Google Play in a couple of days:


# Unearthing gold in Toronto

I just got home from Toronto, the mining capital of the world, after an awesome weekend hacking with Diego Castañeda (a recent PhD grad in astrophysics who is working with us) and Anneya Golob (another astrophysicist and Diego's partner). Given how much I bang on about hackathons, it might surprise you to know that this was the first hackathon I have properly participated in, without having to order tacos or run out for more beer every couple of hours.

Participants being briefed by one of the problem sponsors on the first evening.

### What on earth is Unearthed?

The event (read about it) was part of a global series of hackathons organized by Unearthed Solutions, a deservedly well-funded non-profit based in Australia that is seeking to disrupt every single thing in the natural resources sector. This was their fourteenth event, but their first in Canada. Remarkably, they got 60 or 70 hackers together for the event, which I know from my experience organizing events takes a substantial amount of work. Avid readers might remember us mentioning them before, especially in a guest post by Jelena Markov and Tom Horrocks in 2014.

A key part of Unearthed's strategy is to engage operating companies in the events. Going far beyond mere sponsorship, Barrick Gold sent several mentors to the event, the Chief Innovation Officer Michelle Ash, as well as two judges, Ed Humphries (head of digital transformation) and Iain Allen (head of digital mining). Barrick provided the challenge themes, as well as data and vivid descriptions of operational challenges. The company was incredibly candid with the participants, and should be applauded for its support of what must have felt like a pretty wild idea.

Team Auger Effect: Diego and Anneya hacking away on Day 2.

### What went down?

It's hard to describe a hackathon to someone who hasn't been to one. It's like trying to describe the Grand Canyon, ice climbing, or a 1985 Viña Tondonia Rioja. It's always fun to see and hear the reactions of the judges and other observers that come for the demos in the last hours of the event: disbelief at what small groups of humans can do in a weekend, for little tangible reward. It flies in the face of everything you think you know about creativity, productivity, motivation, and collaboration. Not to mention intellectual property.

As the fifteen (!) teams made their final 5-minute pitches, it was clear that every single one of them had created something unique and useful. The judges seemed genuinely blown away by the level of accomplishment. It's hard to capture the variety, but I'll have a go with a non-comprehensive list. First, there was a challenge around learning from geoscience data:

• BGC Engineering, one of the few pro teams and First Place winner, produced an impressive set of tools for scraping and analysing public geoscience data. I think it was a suite of desktop tools rather than a web application.
• Mango (winners of the Young Innovators award), Smart Miner (second place overall), Crater Crew, Aureka, Notifyer, and others presented map-based browsers for public mining data, with assistance from varying degrees of machine intelligence.
• Auger Effect (me, Diego, and Anneya) built a three-component system consisting of a browser plugin, an AI pipeline, and a social web app, for gathering, geolocating, and organizing data sources from people as they research.

The other challenge was around predictive maintenance:

• Tyrelyze, recognizing that two people a year are killed by tyre failures, created a concept for laser scanning haul truck tyres during operations. These guys build laser scanners for core, and definitely knew what they were doing.
• Decelerator (winners of the People's Choice award) created a concept for monitoring haul truck driving behaviour, to flag potentially expensive driving habits.
• Snapfix.io looked at inventory management for mine equipment maintenance shops.
• Arcana, Leo & Zhao, and others looked at various other ways of capturing maintenance and performance data from mining equipment, and used various strategies to try to predict

I will try to write some more about the thing we built... and maybe try to get it working again! The event was immensely fun, and I'm so glad we went. We learned a huge amount about mining too, which was eye-opening. Massive thanks to Unearthed and to Barrick on all fronts. We'll be back!

Brad Bechtold of Cisco (left) presenting the Young Innovator award for under-25s to Team Mango.

The winners of the People's Choice Award, Team Decelerate.

The winners of the contest component of the event, BGC Engineering, with Ed Humphries of Barrick (left).

UPDATE  View all the results and submissions from the event.

Wish there was a hackathon just for geoscientists and subsurface engineers?
You're in luck! Join us in Paris for the Subsurface Hackathon — sponsored by Dell EMC, Total E&P, NVIDIA, Teradata, and Sandstone. The theme is machine learning, and registration is open. There's even a bootcamp for anyone who'd like to pick up some skills before the hack.

# No secret codes: announcing the winners

The SEG / Agile / Enthought Machine Learning Contest ended on Tuesday at midnight UTC. We set readers of The Leading Edge the challenge of beating the lithology prediction in October's tutorial by Brendon Hall. Forty teams, mostly of 1 or 2 people, entered the contest, submitting several hundred entries between them. Deadlines are so interesting: it took a month to get the first entry, and I received 4 in the second month. Then I got 83 in the last twenty-four hours of the contest.

### How it ended

| Rank | Team | F1 | Algorithm | Language | Solution |
|---|---|---|---|---|---|
| 1 | LA_Team (Mosser, de la Fuente) | 0.6388 | Boosted trees | Python | Notebook |
| 2 | PA Team (PetroAnalytix) | 0.6250 | Boosted trees | Python | Notebook |
| 3 | ispl (Bestagini, Tuparo, Lipari) | 0.6231 | Boosted trees | Python | Notebook |
| 4 | esaTeam (Earth Analytics) | 0.6225 | Boosted trees | Python | Notebook |

The winners are a pair of graduate petroleum engineers, Lukas Mosser (Imperial College, London) and Alfredo de la Fuente (Wolfram Research, Peru). Not coincidentally, they were also one of the more, er, energetic teams; it's safe to say that they explored a good deal of the solution space. They were also very much part of the discussion about the contest on GitHub.com and on the Software Underground Slack chat group, aka Swung (you're in there, right?).

I will be sending Raspberry Shakes to the winners, along with some other swag from Enthought and Agile. The second-place team will receive books from SEG (thank you SEG Book Mart!), and the third-place team will have to content themselves with swag. That team, led by Paolo Bestagini of the Politecnico di Milano, deserves special mention — their feature engineering approach was very influential, being used by most of the top-ranking teams.

Coincidentally Gram and I talked to Lukas on Undersampled Radio this week:

### Back up a sec, what the heck is a machine learning contest?

To enter, a team had to predict the lithologies in two wells, given wireline logs and other data. They had complete data, including lithologies, in nine other wells — the 'training' data. Teams trained a wide variety of models — from simple nearest neighbour models and support vector machines, to sophisticated deep neural networks and random forests. These met with varying success, with accuracies ranging between about 0.4 and 0.65 (i.e., error rates from 60% to 35%). Here's one of the best realizations from the winning model:

One twist that made the contest especially interesting was that teams could not just submit their predictions — they had to submit the code that made the prediction, in the open, for all their fellow competitors to see. As a result, others were quickly able to adopt successful strategies, and I'm certain the final result was better than it would have been with secret code.

I spent most of yesterday scoring the top entries by generating 100 realizations of the models. This was suggested by the competitors themselves as a way to deal with model variance. This was made a little easier by the fact that all of the top-ranked teams used the same language — Python — and the same type of model: extreme gradient boosted trees. (It's possible that the homogeneity of the top entries was a negative consequence of the open format of the contest... or maybe it just worked better than anything else.)
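To give a flavour of what scoring many realizations involves, here's a minimal sketch. It is not the contest's scoring script: the function names, the made-up labels, and the choice to summarize with a mean and standard deviation are all assumptions for illustration. It uses the fact that, for a single-label multiclass problem like this one, micro-averaged F1 is the same number as accuracy.

```python
import numpy as np

def micro_f1(y_true, y_pred):
    """Micro-averaged F1; for single-label multiclass this equals accuracy."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

def score_realizations(y_true, realizations):
    """Score each realization (one set of predictions each) and summarize."""
    scores = np.array([micro_f1(y_true, y_pred) for y_pred in realizations])
    return scores.mean(), scores.std()

# Made-up facies labels and three made-up realizations, purely illustrative.
y_true = [1, 2, 2, 3, 3, 3, 1, 2]
realizations = [
    [1, 2, 2, 3, 3, 1, 1, 2],   # 7 of 8 correct
    [1, 2, 3, 3, 3, 3, 1, 1],   # 6 of 8 correct
    [2, 2, 2, 3, 3, 3, 1, 2],   # 7 of 8 correct
]
mean_score, spread = score_realizations(y_true, realizations)
```

In practice each realization would come from refitting the same model with a different random seed, and the spread shows how much of the gap between two entries is just model variance.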

### What now?

There will be more like this. It will have something to do with seismic data. I hope I have something to announce soon.

I (or, preferably, someone else) could write an entire thesis on learnings from this contest. I am busy writing a short article for next month's Leading Edge, so if you're interested in reading more, stay tuned for that. And I'm sure there will be others.