x lines of Python: web scraping and web APIs

The Web is obviously an incredible source of information, and sometimes we'd like access to that information from within our code. Indeed, if the information keeps changing — like the price of natural gas, say — then we really have no alternative.

Fortunately, Python provides tools to make it easy to access the web from within a program. In this installment of x lines of Python, I look at getting information from Wikipedia and requesting natural gas prices from Yahoo Finance. All that in 10 lines of Python — total.

As before, there's a completely interactive, live notebook version of this post for you to run, right in your browser. Quick tip: Just keep hitting Shift+Enter to run the cells. There's also a static repo if you want to run it locally.

Geological ages from Wikipedia

Instead of writing the sentences that describe the code, I'll just show you the code. Here's how we can get the duration of the Jurassic period fresh from Wikipedia:

url = "http://en.wikipedia.org/wiki/Jurassic"
r = requests.get(url).text
start, end = re.search(r'<i>([\.0-9]+)–([\.0-9]+)&#160;million</i>', r.text).groups()
duration = float(start) - float(end)
print("According to Wikipedia, the Jurassic lasted {:.2f} Ma.".format(duration))

The output:

According to Wikipedia, the Jurassic lasted 56.30 Ma.

There's the opportunity for you to try writing a little function to get the age of any period from Wikipedia. I've given you a spot of help, and you can even complete it right in your browser — just click here to launch your own copy of the notebook.
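
If you'd like a head start on that, here's one possible shape for the function — a rough sketch that assumes every period's Wikipedia page marks up its age range the same way the Jurassic page does (it may not, so expect to tweak the regex):

import re
import requests

def get_age(period):
    """Return the start, end and duration of a geological period, according to Wikipedia."""
    url = "http://en.wikipedia.org/wiki/" + period
    r = requests.get(url)
    m = re.search(r'<i>([\.0-9]+)–([\.0-9]+)&#160;million</i>', r.text)
    start, end = [float(x) for x in m.groups()]
    return start, end, start - end

start, end, duration = get_age("Triassic")
print("According to Wikipedia, the Triassic lasted {:.2f} Ma.".format(duration))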

Gas price from Yahoo Finance

url = "http://download.finance.yahoo.com/d/quotes.csv"
params = {'s': 'HHG17.NYM', 'f': 'l1'}
r = requests.get(url, params=params)
price = float(r.text)
print("Henry Hub price for Feb 2017: ${:.2f}".format(price))

Again, the output is fast, and pleasingly up-to-the-minute:

Henry Hub price for Feb 2017: $2.86

I've added another little challenge in the notebook. Give it a try... maybe you can even adapt it to find other live financial information, such as stock prices or interest rates.

What would you like to see in x lines of Python? Requests welcome!

Welly to the wescue

I apologize for the widiculous title.

Last week I described some headaches I was having with well data, and I introduced welly, an open source Python tool that we've built to help cure the migraine. The first versions of welly were built — along with the first versions of striplog — for the Nova Scotia Department of Energy, to help with their various data wrangling efforts.

Aside — all software projects funded by government should in principle be open source.

Today we're using welly to get data out of LAS files and into so-called feature vectors for a machine learning project we're doing for Canstrat (kudos to Canstrat for their support for open source software!). In our case, the features are wireline log measurements. The workflow looks something like this:

  1. Read LAS files into a welly 'project', which contains all the wells. This bit depends on lasio.
  2. Check what curves we have with the project table I showed you on Thursday.
  3. Check curve quality by passing a test suite to the project, and making a quality table (see below).
  4. Fix problems with curves with whatever tricks you like. I'm not sure how to automate this.
  5. Export as the X matrix, all ready for the machine learning task.

Let's look at these key steps as Python code.

1. Read LAS files

from welly import Project
p = Project.from_las('data/*.las')

2. Check what curves we have

Now we have a project full of wells and can easily make the table we saw last week. This time we'll use aliases to simplify things a bit — this trick allows us to refer to all GR curves as 'Gamma', so for a given well, welly will take the first curve it finds in the list of alternatives we give it. We'll also pass a list of the curves (called keys here) we are interested in:
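
For example, the alias dictionary and the list of keys might look something like this — the mnemonics here are plausible stand-ins, not the actual ones from this project:

# Hypothetical alias dictionary and key list — your mnemonics will differ.
alias = {
    'Gamma':   ['GR', 'GRC', 'GAM', 'SGR'],
    'Density': ['RHOB', 'DEN', 'RHOZ'],
    'Sonic':   ['DT', 'AC', 'DTP'],
}
keys = ['Gamma', 'Density', 'Sonic']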

The project table. The name of the curve selected for each alias is shown, along with its mean and units as a quick QC. A couple of those RHOB curves definitely look dodgy, and they turned out to be DRHO correction curves.

3. Check curve quality

Now we have to define a suite of tests. Lists of tests to run on each curve are held in a Python data structure called a dictionary. As well as tests for specific curves, there are two special test lists: Each and All, which are run on each curve encountered, and on all curves together, respectively. (The latter is needed when, for example, we want to compare the curves to each other to look for duplicates.) The welly module quality contains some predefined tests, but you can also define your own test functions — these functions take a curve as input, and return either True (for a pass) or False (for a fail).

import welly.quality as qty
from IPython.display import HTML

tests = {
    'All': [qty.no_similarities],
    'Each': [qty.no_monotonic],
    'Gamma': [
        qty.all_positive,
        qty.mean_between(10, 100),
    ],
    'Density': [qty.mean_between(1000, 3000)],
    'Sonic': [qty.mean_between(180, 400)],
}

html = p.curve_table_html(keys=keys, alias=alias, tests=tests)
HTML(html)
The green dot means that all tests passed for that curve. Orange means some tests failed. If all tests fail, the dot is red. The quality score shows a normalized score for all the tests on that well. In this case, RHOB and DT are failing the 'mean_between' test because they have Imperial units.

4. Fix problems

Now we can fix any problems. This part is not yet automated, so it's a fairly hands-on process. Here's a very high-level example of how I fix one issue:

import numpy as np

def fix_negs(c):
    c[c < 0] = np.nan
    return c

# Glossing over some details, we give a mnemonic, a test
# to apply, and the function to apply if the test fails.
fix_curve_if_bad('GAM', qty.all_positive, fix_negs)
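
In case you're wondering how a helper like that might be wired up, here's a purely hypothetical sketch — it is not welly's API, and it assumes the project iterates over wells, that each well keeps its curves in a dict-like well.data attribute keyed by mnemonic, and that curves behave like NumPy arrays:

# Hypothetical helper — not part of welly.
def fix_curve_if_bad(mnemonic, test, fix):
    for well in p:                       # assumes the project iterates over wells
        curve = well.data.get(mnemonic)  # assumes curves live in a dict keyed by mnemonic
        if curve is None:
            continue
        if not test(curve):              # the test returns False on failure...
            well.data[mnemonic] = fix(curve)  # ...so apply the fix and store the result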

What I like about this workflow is that the code itself is the documentation. Everything is fully reproducible: load the data, apply some tests, fix some problems, and export or process the data. There's no need for intermediate files called things like DT_MATT_EDIT or RHOB_DESPIKE_FINAL_DELETEME. The workflow is completely self-contained.

5. Export

The data can now be exported as a matrix, specifying a depth step that all data will be interpolated to:

X, _ = p.data_as_matrix(X_keys=keys, step=0.1, alias=alias)

That's it. We end up with a 2D array of log values that will go straight into, say, scikit-learn*. I've omitted here the process of loading the Canstrat data and exporting that, because it's a bit more involved. I will try to look at that part in a future post. For now, I hope this is useful to someone. If you'd like to collaborate on this project in the future — you know where to find us.
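
Just to make that hand-off concrete, here's a hypothetical sketch of feeding X to scikit-learn with a recent version of the library — it assumes you already have a matching vector of labels, y (the Canstrat lithologies, which I'm not covering here):

# Hypothetical continuation — X comes from welly, y is a label vector you supply.
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline

clf = make_pipeline(SimpleImputer(strategy='median'),   # well logs usually have gaps
                    StandardScaler(),
                    RandomForestClassifier(n_estimators=100))
clf.fit(X, y)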

* For more on scikit-learn, don't miss Brendon Hall's tutorial in October's Leading Edge.


I'm happy to let you know that agilegeoscience.com and agilelibre.com are now served over HTTPS — so connections are private and secure by default. This is just a matter of principle for the Web, and we go to great pains to ensure our web apps modelr.io and pickthis.io are served over HTTPS. Find out more about SSL from DigiCert, the provider of Squarespace's (and Agile's) certs, which are implemented with the help of the non-profit Let's Encrypt, which we use and support with dollars.

Well data woes

I probably shouldn't be telling you this, but we've built a little tool for wrangling well data. I wanted to mention it, because it's doing some really useful things for us — and maybe it can help you too. But I probably shouldn't because it's far from stable and we're messing with it every day.

But hey, what software doesn't have a few or several or loads of bugs?

Buggy data?

It's not just software that's buggy. Data is as buggy as heck, and subsurface data is, I assert, the buggiest data of all. Give units or datums or coordinate reference systems or filenames or standards or basically anything at all a chance to get corrupted in cryptic ways, and they take it. Twice if possible.

By way of example, we got a package of 10 wells recently. It came from a "data management" company. There are issues... Here are some of them:

  • All of the latitude and longitude data were in the wrong header fields. No coordinate reference system in sight anywhere. This is normal of course, and the only real side-effect is that YOU HAVE NO IDEA WHERE THE WELL IS.
  • Header chaos aside, the files were non-standard LAS sort-of-2.0 format, because tops had been added in their own little completely illegal section. But the LAS specification has a section for stuff like this (it's called OTHER in LAS 2.0).
  • Half the porosity curves had units of v/v, and half %. No big deal...
  • ...but a different half of the porosity curves were actually v/v. Nice.
  • One of the porosity curves couldn't make its mind up and changed scale halfway down. I am not making this up.
  • Several of the curves were repeated with other names, e.g. GR and GAM, DT and AC. Always good to have a spare, if only you knew if or how they were different. Our tool curvenam.es tries to help with this, but it's far from perfect.
  • One well's RHOB curve was actually the PEF curve. I can't even...

The remarkable thing is not really that I have this headache. It's that I expected it. But this time, I was out of paracetamol.

Cards on the table

Our tool welly, which I stress is very much still in development, tries to simplify the process of wrangling data like this. It has a project object for collecting a lot of wells into a single data structure, so we can get a nice overview of everything: 


Our goal is to include these curves in the training data for a machine learning task to predict lithology from well logs. The trained model can make really good lithology predictions... if we start with non-terrible data. Next time I'll tell you more about how welly has been helping us get from this chaos to non-terrible data.

Nowhere near Nyquist

This is a guest post by my Undersampled Radio co-host, Graham Ganssle.

You can find Gram on the web, LinkedIn, Twitter, and GitHub.

This post is a follow up to Tuesday's post about the podcast — you might want to read that first.


Undersampled Radio was born out of a dual interest in podcasting. Matt and I both wanted to give it a shot, but we didn’t know what to talk about. We still don’t. My philosophy on UR is that it’s forumesque; we have a channel on the Software Underground where we solicit ideas, draft guests, and brainstorm about what should be on the show. We take semi-formed thoughts and give them a good think with a guest who knows more than us. Live and uncensored.

Since with words I... have not.. a way... the live nature of the show gives it a silly, laid back attitude. We attempt to bring our guests out of interview mode by asking about their intellectual curiosities in addition to their professional interests. Though the podcast releases are lightly edited, the YouTube live-stream recordings are completely raw. For a good laugh at our expense you should certainly watch one or two.

Techie deets

Have a look at the command center. It’s where all the UR magic (okay, digital trickery) happens in pre- and post-production.

It's a mess but it works!

We’ve migrated away from the traditional hardware combination used by most podcasters. Rather than use the optimum mic/mixer/spaghetti-of-cables preferred by podcasting operations which actually generate revenue, we’ve opted to use less hardware and do a bit of digital conditioning on the back end. We conduct our interviews via YouTube live (aka Google Hangouts on Air) then on my Ubuntu machine I record the audio through stereo mix using PulseAudio and do the filtering and editing in Audacity.

Though we usually interview guests via Google Hangouts, we have had one interviewee in my office for an in-person chat. It was an incredible episode that was filled with the type of nonlinear thinking which can only be accomplished face to face. I mention this because I’m currently soliciting another New Orleans recording session (message me if you’re interested). You buy the plane ticket to come record in the studio. I buy the beer we’ll drink while recording.

As Matt guessed, there actually are paddle boats rolling by while I record. Here's the view from my recording studio; note the paddle boat on the left.

Forward projections

We have several ideas about what to do next. One is a live competition of some sort, where Matt and I compete while a guest or two judges our performance. We're also keen to do a group chat session, in which all the members of the Software Underground will be invited to a raucous, unscripted chat about whatever's on their minds. Unfortunately we dropped the ball on a live interview session at the SEG conference this year, but we'd still like to get together in some sciencey venue and grab randos walking by for lightning interviews.

In accord with the remainder of our professional lives, Matt and I both conduct the show in a manner which keeps us off balance. I have more fun, and learn more quickly, by operating in a space outside my realm of knowledge. Ergo, we are open to your suggestions and your participation in Undersampled Radio. Come join us!

 

Tune in to Undersampled Radio

Back in the summer I mentioned Undersampled Radio, the world's newest podcast about geoscience. Well, geoscience and computers. OK, machine learning and geoscience. And conferences.

We're now 25 shows in, having started with Episode 0 on 28 January. The show is hosted by Graham 'Gram' Ganssle, a consulting and research geophysicist based in New Orleans, and me. Appropriately enough, I met Gram at the machine-learning-themed hackathon we did at SEG in 2015. He was also a big help with the local knowledge.

I broadcast from one of the phone rooms at The HUB South Shore. Gram has the luxury of a substantial book-lined office, which I imagine has ample views of paddle-steamers lolling on the Mississippi (but I actually have no idea where it is). 

To get an idea of what we chat about, check out the guests on some recent episodes:

Better than cable

The podcast is really more than just a podcast — it's a live TV show, broadcast on YouTube Live. You can catch the action while it's happening on the Undersampled Radio channel. However, it's not easy to catch live because the episodes are not that predictable — they are announced about 24 hours in advance on the Software Underground Slack group (you are in there, right?). We should try to put them out on the @undrsmpldrdio Twitter feed too... 

So, go ahead and watch the very latest episode, recorded last Thursday. We spoke to Tim Hopper, a data scientist in Raleigh, NC, who works at Distil Networks, a cybersecurity firm. It turns out that using machine learning to filter web traffic has some features in common with computational geophysics...

You can subscribe to the show in iTunes or Google Play, or anywhere else good podcasts are served. Grab the RSS Feed from the UndersampledRad.io website.

Of course, we take guest requests. Who would you like to hear us talk to? 

Working without a job

I have drafted variants of this post lots of times. I've never published them because advice always feels... presumptuous. So let me say: I don't have any answers. But I do know that the usual way of 'finding work' doesn't work any more, so maybe the need for ideas, or just hope, has grown. 

Lots of people are out of work right now. I just read that 120,000 jobs have been lost in the oil industry in the UK alone. It's about the same order of magnitude in Canada, maybe as much as 200,000. Indeed, several of my friends — smart, uber-capable professionals — are newly out of jobs. There's no fat left to trim in operator or service companies... but the cuts continue. It's awful.

The good news is that I think we can leave this downturn with a new, and much better, template for employment. The idea is to be more resilient for 'next time' (the coming mergers, the next downturn, the death throes of the industry, that sort of thing).

The tragedy of the corporate professional 

At least 15 years ago, probably during a downturn, our corporate employers started telling us that we are responsible for our own careers. This might sound like a cop-out, maybe it was even meant as one, but really it's not. Taken at face value, it's a clear empowerment.

My perception is that most professionals did not rise to the challenge, however. I still hear, literally all the time, that people can't submit a paper to a conference, or give a talk, or write a blog, or that they can't take a course, or travel to a workshop. Most of the time this comes from people who have not even asked, they just assume the answer will be No. I worry that they have completely given in; their professional growth curtailed by the real or imagined conditions of their employment.

More than just their missed opportunity, I think this is a tragedy for all of us. Their expertise effectively gone from the profession, these lost scientists are unknown outside their organizations.

Many organizations are happy for things to work out that way, but when they make the situation crystal clear by letting people go, the inequity is obvious. The professional realizes, too late, that the career they were supposed to be managing (and perhaps thought they were managing by completing their annual review forms on time) was just that — a career, not a job. A career spanning multiple jobs and, it turns out, multiple organizations.

I recently read on LinkedIn someone wishing newly let-go people good luck, hoping that they could soon 'resume their careers'. I understand the sentiment, but I don't see it the same way. You don't stop being a professional; it's not a job. Your career continues, it's just going in a different direction. It's definitely not 'on hold'. If you treat it that way, you're missing an opportunity, perhaps the best one of your career so far.

What you can do

Oh great, unsolicited advice from someone who has no idea what you're going through. I know. But hey, you're reading a blog, what did you expect? 

  • Do you want out? If you think you might want to leave the industry and change your career in a profound way, do it. Start doing it right now and don't look back. If your heart's not in this work, the next months and maybe years are really not going to be fun. You're never going to have a better run at something completely different.
  • You never stop being a professional, just like a doctor never stops being a doctor. If you're committed to this profession, grasp and believe this idea. Your status as such is unrelated to the job you happen to have or the work you happen to be doing. Regaining ownership of our brains would be the silveriest of linings to this downturn.
  • Your purpose as a professional is to offer help and advice, informed by your experience, in and around your field of expertise. This has not changed. There are many, many channels for this purpose. A job is only one. I firmly believe that if you create value for people, you will be valued and — eventually — rewarded.
  • Establish a professional identity that exists outside and above your work identity. Get your own business cards. Go to meetings and conferences on your own time. Write papers and articles. Get on social media. Participate in the global community of professional geoscientists. 
  • Build self-sufficiency. Invest in a powerful computer and fast Internet. Learn to use QGIS and OpendTect. Embrace open source software and open data. If and when you get some contracting work, use Tick to count hours, Wave for accounting and invoicing, and Todoist to keep track of your tasks. 
  • Find a place to work — I highly recommend coworking spaces. There is one near you, I can practically guarantee it. Trust me, it's a much better place to work than home. I can barely begin to describe the uplift, courage, and inspiration you will get from the other entrepreneurs and freelancers in the space.
  • Find others like you, even if you can't get to a coworking space, your new peers are out there somewhere. Create the conditions for collaboration. Find people on meetup.com, go along to tech and start-up events at your local university, or if you really can't find anything, organize an event yourself! 
  • Note that there are many ways to make a living. Money in exchange for time is one, but it's not a very efficient one. It's just another hokey self-help business book, but reading The 4-Hour Workweek honestly changed the way I look at money, time, and work forever.
  • Remember entrepreneurship. If you have an idea for a new product or service, now's your chance. There's a world of making sh*t happen out there — you genuinely do not need to wait for a job. Seek out your local startup scene and get inspired. If you've only ever worked in a corporation, people's audacity will blow you away.

If you are out of a job right now, I'm sorry for your loss. And I'm excited to see what you do next.

What will people pay for?

Many organizations in the industry are asking this question right now. Software and service companies would like to sell product, technical societies would like to survive diminished ad sales and conference revenue, entrepreneurs would like to find customers. We all need to make a living.

I was recently asked this very question by a technical society. However, it's utterly the wrong question. Even asking this question reveals a deep-seated misunderstanding of what technical societies are for.

The question is not "What will people pay for?", it's "What do people need?". 

The leaders of our profession

Geoscientists and engineers are professionals. Our professional contributions are defined by our work and its purpose, not by our jobs and their tasks. This is essentially what makes a professional different from other workers: we are purpose-oriented, not task-oriented. We're interested in the outcome, not the means.

But even professionals benefit from leadership. Professional regulators notwithstanding, our technical societies are the de facto leaders of the profession. The professional regulator is the 'line manager' of the profession, not the 'chief geoscientist'.

Leadership is about setting an example, inspiring great work, and providing the means to grow and make the best contributions people can make. Societies need to be asking themselves how they can create the conditions for a transformed profession, a more relevant and resilient one. In short, how can they be useful? How can they serve?

OK, so what do people need?

I don't claim to have all the answers, or even many of them, but here are some things I think people need:

  • Representation. Get serious about gender and race balance on your boards and committees. There is recent progress, but it's nowhere near representative. Related: get out of North America and improve global reach.

  • Better ways to contribute and connect. Experiment more — a lot more, and urgently — with meetings and conferences. Help people participate, not just attend. Help people connect, not just exchange business cards.

  • New ways to contribute and connect. Get serious about social media. Get scientists involved — social media is not a marketing exercise. Think hard about how you can engage your members through blogs and other content.

  • Reproducible science. Go further with open access, open data, and open source code. Make your content work harder. Make it reach further. Demand more of your authors to make their work reproducible.

  • A bit less self-interest. Stop regarding things you didn't organize or produce as a threat. Other people's events and publications may be of interest to your members, and your mission is to serve them.

Don't listen to my blathering. The AGU and the EGU are real leaders in geoscience — be inspired by them, follow their lead. Pay more attention to what's happening in publishing and conferences in other technical verticals, especially technology.

Pie in the sky is still pie

People will say, "That's all great Matt, but right now it's about survival." I get this a lot, and I sympathize, but I'm not buying it. When times are good, you don't need to do the right thing; when times are hard, you can't afford to. True, all this would be easier if you'd started doing the right thing when times were good, but you didn't, so here we are.

Sure it's tough now, but are you sure you can afford to wait till tomorrow?


I've written lots before on these topics.

x lines of Python: read and write SEG-Y

Reading SEG-Y files comes up a lot in the geophysicist's workflow. Writing, less often, but it does come up occasionally. As long as we're mostly concerned with trace data and not location, both of these tasks can be fairly easily accomplished with ObsPy. 

Today we'll load some seismic, compute an attribute on it, and save a new SEG-Y, in 10 lines of Python.


ObsPy is a rare thing. It demonstrates what a research group can accomplish with a little planning and a lot of perseverance (cf. my whinging earlier this year about certain consortiums in our field). It's an open source Python package from the geophysicists at the University of Munich — Karl Bernhard Zoeppritz studied there for a while, so you know it's legit. The tool serves their research needs in earthquake and global seismology, and also happens to handle SEG-Y files quite nicely.

Aside: I think SixtyNorth's segpy is actually the way to go for reading and writing SEG-Y; ObsPy is probably overkill for most applications — it's about 80 times the size for one thing. I just happen to be familiar with it and it's super easy to install: conda install obspy. So, since minimalism is kind of the point here, look out for a future x lines of Python using that library.

The sentences

As before, we'd like to express the process in just a few sentences of plain English. Assuming we just want to read the data into a NumPy array, look at it, do something to it, and write a new file, here's what we're doing:

  1. Read (or really index) the file as an ObsPy Stream object.
  2. Stack (in the NumPy sense) the Trace objects into a single NumPy array. We have data!
  3. Get the 99th percentile of the amplitudes to make plotting easier.
  4. Plot the data so we can see it.
  5. Get the sample interval of the data from a trace header.
  6. Compute the similarity attribute using our library bruges.
  7. Make a new Stream object to hold the outbound data.
  8. Add a Stats object, which holds the header, and recycle some header info.
  9. Append info about our data to the header.
  10. Write a new SEG-Y file with our computed data in it!
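
To give you a flavour, here's a minimal sketch of the reading side — roughly sentences 1 to 5, plus a stand-in attribute — using ObsPy's generic reader. The filename is a placeholder, and the notebook itself uses bruges for the similarity attribute and handles the headers and the SEG-Y export properly:

import numpy as np
import matplotlib.pyplot as plt
from obspy import read

stream = read('data/seismic.segy', format='SEGY')   # 1. Read the file as a Stream of Traces
data = np.stack([t.data for t in stream])           # 2. Stack the Traces into one NumPy array
clip = np.percentile(data, 99)                      # 3. 99th percentile of the amplitudes
plt.imshow(data.T, cmap='Greys', vmin=-clip, vmax=clip, aspect='auto')  # 4. Plot it
plt.show()
dt = stream[0].stats.delta                          # 5. Sample interval, in seconds
energy = data**2                                    # A stand-in attribute; the notebook uses bruges's similarity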

There's a bit more in the Jupyter Notebook (examining the file and trace headers, for example, and a few more plots) which, remember, you can run right in your browser! You don't need to install a thing. Please give it a look! Quick tip: Just keep hitting Shift+Enter to run the cells.

If you like this sort of thing, and are planning to be at the SEG Annual Meeting in Dallas next month, you might like to know that we'll be teaching our Creative Geocomputing class there. It's basically two days of this sort of thing, only with friends to learn with and us to help. Come and learn some new skills!

The seismic data used in this post is from the NPRA seismic repository of the USGS. The data is in the public domain.

x lines of Python: synthetic wedge model

Welcome to a new blog series! Like the A to Z and the Great Geophysicists, I expect it will be sporadic and unpredictable, but I know you enjoy life's little nonlinearities as much as I do.

The idea with this one — x lines of Python — is to share small geoscience workflows in x lines or fewer. I'm not sure about the value of x, but I think 10 seems reasonable for most tasks. If x > 10 then the task may have been too big... If x < 5 then it was probably too small.

Python developer Raymond Hettinger says that each line of code should be equivalent to a sentence... so let's say that that's the measure of what's OK to put in a single line. 

Synthetic wedge model

To kick things off, follow this link to a live Jupyter Notebook environment showing how you can make a simple synthetic three-rock wedge model in only 9 lines of code.

The sentences represented by the code that made the data in these images are:

  1. Set up the size of the model.
  2. Make the slanty bit, with 1's in the wedge and 2's in the base.
  3. Add the top of the model as 0; these numbers will turn into rocks.
  4. Define the velocity and density of rocks 0 to 2.
  5. Distribute those properties through the model.
  6. Calculate the acoustic impedance everywhere.
  7. Calculate the reflection coefficients in the model.
  8. Make a Ricker wavelet.
  9. Convolve the wavelet with the reflection coefficients.
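
If you'd like to see the shape of those nine sentences in code before opening the notebook, here's a NumPy-only sketch. The sizes and rock properties are arbitrary, and the notebook differs in the details (the wavelet, for instance), so treat this as a sketch rather than the notebook's code:

import numpy as np

length, depth = 40, 100                                   # 1. Size of the model
model = 1 + np.tri(depth, length, -depth//3, dtype=int)   # 2. Slanty bit: 1's in the wedge, 2's in the base
model[:depth//3, :] = 0                                    # 3. Top of the model is rock 0

rocks = np.array([[2540, 2550],                           # 4. Vp and rho for rocks 0 to 2
                  [2400, 2450],
                  [2650, 2800]])

earth = rocks[model]                                      # 5. Distribute properties through the model
imp = np.prod(earth, axis=-1)                             # 6. Acoustic impedance = Vp x rho
rc = (imp[1:] - imp[:-1]) / (imp[1:] + imp[:-1])          # 7. Reflection coefficients

t = np.arange(-0.032, 0.032, 0.001)                       # 8. A 40 Hz Ricker wavelet
w = (1 - 2*(np.pi*40*t)**2) * np.exp(-(np.pi*40*t)**2)

synth = np.apply_along_axis(np.convolve, 0, rc, w, mode='same')   # 9. Convolve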

Your turn!

All of the notebooks we share in this series will be hosted on mybinder.org. I'm excited about this because it means you can run and edit them live, without installing anything at all. Give it a go right now.

You can see them on GitHub too, and fork or clone them from there. Note that if you look at the notebook for this post on GitHub, you'll be able to view it, but not change or run code unless you get everything running on your own machine. (To do that, you can more or less follow the instructions in my User Guide to the TLE tutorials).

Please do take this notion of x as 'par' as a challenge. If you'd like to try to shoot under par, please do — and share your efforts. Code golf is a fun way to learn better coding habits. (And maybe some bad ones.) There is a good chance I will shoot some bogies on this course.

We will certainly take requests too — what tasks would you like to see in x lines of Python?

What's that funny noise?

Seismic reflections are strange noises. Around 50 Hz, narrow band, very quiet, and difficult to interpret. It is possible to convert seismic traces (active or passive) into audible sound with a shift in pitch and a time stretch.
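
The simplest version of that trick is just to replay the samples at an audio rate. Here's a rough sketch, assuming you already have a trace as a 1D NumPy array — the function name and sample rates are placeholders:

import numpy as np
from scipy.io import wavfile

def audify(trace, filename='trace.wav', rate=44100):
    """Write a seismic trace to a WAV file, replayed at an audio sample rate.
    A 4 ms sample interval replayed at 44.1 kHz is about 176 times faster,
    so energy at around 50 Hz ends up in the audible kHz range."""
    x = np.asarray(trace, dtype=float)
    x = x / np.max(np.abs(x))                               # normalize to -1..1
    wavfile.write(filename, rate, (x * 32767).astype(np.int16))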

Made by the legendary Emory Cook, who recorded everything from steel bands to racing cars to ionospheric noises to this treatment of Hugo Benioff's earthquake recordings. Epic.

Curiously the audification thing has never really caught on in exploration geophysics — a bit surprising, given the fascination with spectral decomposition over the last 15 years or so. And especially so when you consider that our hearing has a dynamic range of about 100 dB, which is comparable to, indeed slightly greater than, our vision (about 90 dB).

Paolo Dell'Aversana of ENI wants to change that. Rather than listening to 'raw' seismic, he's sending it to a MIDI interface and listening to it as a piano roll. Just try to imagine playing seismic on a piano for a second, then listen to his weird and wonderful results — at 9:45 in this EAGE video:

In this EAGE E-Lecture Paolo Dell'Aversana discusses how digital music technology can support geophysical data analysis and interpretation. If you've read any of Dell'Aversana's articles, you'll know he has one of the most creative minds in exploration geophysics. Skip to 9:45 for the crazy seismic piano roll.

On the subject of weird sounds, one of my favourite Wikipedia pages is List of unexplained sounds. I especially love the eerie recordings of mysterious underwater noises, like this one called Upsweep:

No-one knows what makes that noise! My money's on a volcanic vent, but that doesn't explain the seasonality. Maybe we should do a hackathon on these unexplained sounds some time. If you know of any others — I'd love to hear about them.


If you enjoy strange infrasound as much as I do, I recommend following these two scientists on Twitter:


If you really like strange noises, don't forget to check out the Undersampled Radio podcast!