x lines of Python: load curves from LAS

Welcome to the latest x lines of Python post, in which we have a crack at some fundamental subsurface workflows... in as few lines of code as possible. Ideally, x < 10.

We've met curves once before in the series — in the machine learning edition, in which we cheated by loading the data from a CSV file. Today, we're going to get it from an LAS file — the popular standard for wireline log data.

Just as we previously used the pandas library to load CSVs, we're going to save ourselves a lot of bother by using an existing library — lasio by Kent Inverarity. Indeed, we'll go even further by also using Agile's library welly, which uses lasio behind the scenes.

The actual data loading is only 1 line of Python, so we have plenty of extra lines to try something more ambitious. Here's what I go over in the Jupyter notebook that goes with this post:

  1. Load an LAS file with lasio.
  2. Look at its header.
  3. Look at its curve data.
  4. Inspect the curves as a pandas DataFrame.
  5. Load the LAS file with welly.
  6. Look at welly's Curve objects.
  7. Plot part of a curve.
  8. Smooth a curve.
  9. Export a set of curves as a matrix.
  10. BONUS: fix some broken things in the file header.

Each one of those steps is a single line of Python. Together, I think they cover many of the things we'd like to do with well data once we get our hands on it. Have a play with the notebook and explore what you can do.
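If you don't have the notebook to hand, here's the flavour of it in a few lines. This is only a sketch: the file name and the GR mnemonic are placeholders for whatever your own LAS file happens to contain.

    import lasio
    from welly import Well

    las = lasio.read('example.las')    # 1. load with lasio (file name is a placeholder)
    print(las.well)                    # 2. the ~Well section of the header
    print(las.curves)                  # 3. the curve metadata
    df = las.df()                      # 4. the curves as a pandas DataFrame

    w = Well.from_las('example.las')   # 5. load the same file with welly
    gr = w.data['GR']                  # 6. a welly Curve object (assumes a GR curve exists)
    gr.plot()                          # 7. plot the curve

The smoothing, export, and header-fixing steps are in the notebook itself.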

Next time we'll take things a step further and dive into some seismic petrophysics.

The Rock Property Catalog again

Do you like data? Data about rocks? Open, accessible data that you can use for any purpose without asking? Read on.

After writing about anisotropy back in February, and then experimenting with storing rock properties in SubSurfWiki later that month, a few things happened:

  • The server I run the wiki on — legacy Amazon AWS infrastructure — crashed, and my backup strategy turned out to be <cough> flawed. It's now running on state-of-the-art Amazon servers. So my earlier efforts were mostly wiped out... Leaving the road clear for a new experiment!
  • I came across an amazing resource called Mudrock Anisotropy, or — more appealingly — Mr Anisotropy. Compiled by Steve Horne, it contains over 1000 records of rocks, gathered from the literature. It is also public domain and carries only a disclaimer. But it's a spreadsheet, and emailing a spreadsheet around is not sustainable.
  • The Common Ground database, built by John A. Scales, Hans Ecke and Mike Batzle at Colorado School of Mines in the late 1990s, is now defunct, officially discontinued as of about two weeks ago. It contains over 4000 records, and is public domain. The trouble is, you have to restore a SQLite database to use it.

All this was pointing towards a new experiment. I give you: the Rock Property Catalog again! This time it contains not 66 rocks, but 5095 rocks. Most of them have \(V_\mathrm{P}\), \(V_\mathrm{S}\) and \(\rho\). Many of them have Thomsen's parameters too. Most have a lithology, and they all have a reference. Looking for Cretaceous shales in North America to use as analogs on your crossplots? There's a rock for that.

As before, you can query the catalog in various ways, either via the wiki or via the web API. Let's say we want to find shales with a velocity over 5000 m/s. You have a few options:

  1. Go to the semantic search form on the wiki and type [[lithology::shale]][[vp::>5000]]
  2. Make a so-called inline query on your own wiki page (you need an account for this).
  3. Make a query via the web API with a rather long URL: http://www.subsurfwiki.org/api.php?action=ask&query=[[RPC:%2B]][[lithology::shale]][[Vp::>5000]]|%3FVp|%3FVs|%3FRho&format=jsonfm

I updated the Jupyter Notebook I published last time with a new query. It's pretty hacky. I'll work on this to produce a more robust method, with some error handling and cleaner code — stay tuned.
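For what it's worth, option 3 is only a few lines with the requests library. The parameters below mirror the long URL above, and the response shape is the usual Semantic MediaWiki ask result; treat this as a sketch rather than a guaranteed recipe.

    import requests

    url = "http://www.subsurfwiki.org/api.php"
    params = {
        "action": "ask",
        "query": "[[RPC:+]][[lithology::shale]][[Vp::>5000]]|?Vp|?Vs|?Rho",
        "format": "json",
    }
    r = requests.get(url, params=params)
    rocks = r.json().get("query", {}).get("results", {})   # dict keyed by wiki page name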

The database supports lots of properties, including:

  • Citation and reference
  • Description, lithology, colour (you can have pictures if you want!)
  • Location, lat/lon, basin, age, depth
  • Vp, Vs, \(\rho\), as well as \(\rho_\mathrm{dry}\) and \(\rho_\mathrm{grain}\)
  • Thomsen's \(\epsilon\), \(\delta\), and \(\gamma\)
  • Static and dynamic Young's modulus and Poisson ratio
  • Confining pressure, pore pressure, effective stress, axial stress
  • Frequency
  • Fluid, saturation type, saturation
  • Porosity, permeability, temperature
  • Composition

There is more from the Common Ground data to add, especially photographs. But for now, I'd love some feedback: is this the right set of properties? Do we need more? I want this to be useful — what kind of data and metadata would you like to see? 

I'll end with the usual appeal — I'm open to any kind of suggestions or help with this. Perhaps you can contribute new rocks, or a paper containing data? Or maybe you have some wiki skills, or can help write bots to improve the data? What can you bring? 

Submitting assumptions for meaningful answers

The best talk of the conference was Ran Bachrach's on seismics for unconventionals. He described the physics to his audience with enthusiasm, conviction, and a sense of duty, and explained why they should care. Isotropic, VTI, and orthorhombic anisotropy models are used not because they are right, but because they are simple. If the assumptions you bring to the problem are reasonable, the answers can be considered meaningful. If you haven't considered and tested your assumptions, you haven't subscribed to reason. In a sense, you haven't held up your end of the bargain, and there will never be agreement. This talk should be mandatory viewing for anyone working on seismic for unconventionals. Advocacy for reason. Too bad it wasn't recorded.

I am both privileged and obliged to celebrate such nuggets of awesomeness. That's a big reason why I blog. And conversely, we should call out crappy talks when we see them, to raise the bar. Indeed, to quote Zen Faulkes, "...we should start creating more of an expectation that scientific talks will be reviewed and critiqued. And names will be named."

The talk from HEF Petrophysical, entitled Towards modelling three-dimensional oil sands permeability distribution using borehole image logs, drew me in. I was curious enough to show up. But as the talk unfolded, my curiosity was left unsatisfied. A potentially interesting workflow, transforming high-resolution resistivity measurements into flow permeability, was obfuscated with a pointless upscaling step. The meat of anything like this is in the transform itself, but it was missing. It's also the most trivial bit: just cross-plot one property with another and show people. So I am guessing they didn't have any permeability data. If that was the case, how can you stand up and talk about permeability? It was a sandwich without the filling. The essential thing that defines a piece of work is the creativity: the thing you add that wasn't there before. I was disappointed. Disappointed that it was accepted, and that no one else piped up.

I will paraphrase a conversation I had with Ran at the coffee break: Some are not aware, some choose to ignore, and some forget that works of geoscience are problems of extreme complexity. In fact, the only way we can cope with complexity is to make certain assumptions that make our problem solvable. If all you do is say "here is my solution", you suck. But if instead you ask, "Have I convinced you that my assumptions are reasonable?", it entirely changes the conversation. It entirely changes the specialist's role. Only when we understand your assumptions can we talk about whether the results are reasonable.

Have you ever felt conflicted on whether or not you should say something?

Interpreting spectral gamma-ray logs

Before you can start interpreting spectral gamma-ray logs (or, indeed, any kind of data), you need to ask about quality.

Calibrate your tool...

The main issues affecting the quality of the logs are tool calibration and drilling mud composition. I think there's a tendency to assume that delivered logs have been rigorously quality checked, but... they haven't. The only safe assumption is that nobody cares about your logs as much as you. (There is a huge opportunity for service companies here — but in my experience they tend to be focused on speed and quantity, not quality.)

Calibration is critical. The measurement device in the tool consists of a thallium-doped NaI crystal and a photomultiplier. Both of these components are sensitive to temperature, so calibration is especially important when the temperature of the tool is changing often. If the surface temperature is very different from the downhole temperature (think winter in Canada), calibrate often.

Drilling mud containing KCl (to improve borehole stability) increases the apparent potassium content of the formation, while barite acts as a gamma-ray absorber and reduces the count rates, especially in the low energies (potassium).

One of the key quality control indicators is negative readings on the uranium log. A few negative values are normal, but many zero-crossings may indicate that the tool was improperly calibrated. It is imperative to quality control all of the logs, for bad readings and pick-up effects, before doing any quantitative work.

...and your interpretation

Most interpretations of spectral gamma-ray logs focus on the relationships between the three elemental concentrations. In particular, Th/K and Th/U are often used for petrophysical interpretation and log correlation. In calculating these ratios, Schlumberger uses the following cut-offs: if uranium < 0.5 then uranium = 0.5; if potassium < 0.004 then potassium = 0.001 (according to my reference manual for the natural gamma tool).
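In code, those cut-offs amount to a clip before the division. Here's a minimal NumPy version; the argument names and units (K as a fraction, Th and U in ppm) are my assumptions, not anything from a tool manual.

    import numpy as np

    def spectral_ratios(k, th, u):
        """Th/K and Th/U with the cut-offs above applied first."""
        k = np.where(k < 0.004, 0.001, k)   # clip low potassium readings
        u = np.where(u < 0.5, 0.5, u)       # clip low uranium readings
        return th / k, th / u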

In general, high K values may be caused by the presence of potassium feldspars or micas. Glauconite usually produces a spike in the K log. High Th values may be associated with the presence of heavy minerals, particularly in channel deposits. Increased Th values may also be associated with an increased input of terrigenous clays. Increases in U are frequently associated with the presence of organic matter. For example, according to the ODP, particularly high U concentrations (> 5 ppm) and low Th/U ratios (< 2) often occur in black shale deposits.

The logs here, from Kansas Geological Survey open file 90-27 by Macfarlane et al., show a quite overt interpretive approach, with the Th/K log labelled with minerals (feldspar, mica, illite–smectite) and the Th/U log labelled with uranium 'fixedness', a proxy for organic matter.

Sounds useful. But really, you can probably find a paper to support just about any interpretation you want to make. Which isn't to say that spectral gamma-ray is no use — it's just not diagnostic on its own. You need to calibrate it to your own basin and your own stratigraphy. This means careful, preferably quantitative, comparison of core and logs.


What is spectral gamma-ray?

The spectral gamma-ray log is a measure of the natural radiation in rocks. The amplitude of the signal from the gamma-ray tool, which is just a sensor with no active source, is proportional to the energy of the gamma-ray photons it encounters. Being able to differentiate between photons of different energies turns out to be very handy. Compared to the ordinary gamma-ray log, which ignores the energies and only counts the photons, it's like seeing in colour instead of black and white.

Why do we care about gamma radiation?

First, what are gamma rays? Highly energetic photons: electromagnetic radiation with very short wavelengths. 

Being able to see different energies, or 'colours', means we can differentiate between the radioactive decay of different elements. Elements decay by radiating energy, and the 'colour' of that energy is characteristic of that element (actually, of each isotope). So, we can tell by looking at the energy of a photon if we are seeing a potassium atom (40K) or a uranium atom (238U) decay. These are very different isotopes, with very different habits. We can do geology!

In fact, all sorts of radioisotopes occur naturally in the earth. By far the most abundant are potassium 40K, thorium 232Th and uranium 238U. Of these, potassium is the most abundant in sedimentary rocks, but thorium and uranium are present in small quantities, and have particular sedimentological implications.

What exactly are we measuring?

Potassium 40K decays to argon about 10% of the time, with γ-emission at 1.46 MeV (the other 90% of the time it decays to calcium). However, the decays in the 232Th and 238U series occur by α- and β-particle emission, which don't always result in photon emission. The tool in fact measures γ-radiation from the decay of thallium 208Tl in the 232Th series, and from bismuth 214Bi in the 238U series. The spectral gamma-ray tool must be calibrated to known samples to give concentrations of 232Th and 238U from its readings. Proper calibration is vital, and is temperature-sensitive (of note in Canada!).

The concentrations of the three elements are estimated from the spectral measurements. The concentration of potassium is usually measured in percent (%) or per mil (‰), or sometimes in kilograms per tonne, which is equivalent to per mil. The other two elements are measured in parts per million (ppm).

Here is the gamma-ray spectrum from a single sample from 509 m below the sea-floor at ODP Site 1201. The final spectrum (heavy black line) is shown after removing the background spectrum (gray region) and applying a three-point mean boxcar filter. The thin black line shows the raw spectrum. Vertical lines mark the interval boundaries defined by Peter Blum (an ODP scientist at Texas A&M). Prominent energy peaks relating to certain elements are identified at the top of the figure. The inset shows the spectrum for energies >1500 keV at an expanded scale. 
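The three-point mean boxcar filter mentioned there is just a running average. Something like this, where the counts array is a stand-in for a measured spectrum:

    import numpy as np

    counts = np.random.poisson(lam=20, size=256).astype(float)   # stand-in spectrum
    boxcar = np.ones(3) / 3
    smoothed = np.convolve(counts, boxcar, mode='same')          # three-point mean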

We wouldn't normally look at these spectra. Instead, the tool provides logs for K, Th, and U. Next time, I'll look at the logs.

Spectrum illustration by Wikipedia user Inductiveload, licensed GFDL; decay chain by Wikipedia user BatesIsBack, licensed CC-BY-SA.

Rocks, pores and fluids

At an SEG seismic rock physics conference in China several years ago, I clearly remember a catch phrase used by one of the presenters: "It's all about rocks, pores, and fluids." He used it several times throughout his talk as an invocation for geophysicists to translate their seismic measurements of the earth into terms that are more appealing to others. Nobody cares about the VP/VS ratio in a reservoir. Even though I found the repetition slightly off-putting, he succeeded — the phrase stuck. It's all about rocks, pores, and fluids.

Fast forward to the SEG IQ Earth Forum a few months ago. The message reared its head again, but in a different form. After dinner one evening, I was speaking with Ran Bachrach about advances in seismic rock physics technology: the glamour and the promise of the state of the art. It was a topic right up his alley, but surprisingly, he seemed ambivalent and under-enthused, which was unusual for him. "More often than not," he said, "we can get all the information we need from the triple combo."

What is the triple combo? 

I felt embarrassed that I had never heard of the term, like I had been missing something this whole time. The triple combo is the standard set of measurements used in formation evaluation and wireline logging: gamma-ray, porosity, and resistivity. Simply put, the triple combo tells us about rocks, pores, and fluids.

I find it curious that the very things we are interested in are impossible to measure directly. For example:

  • A gamma-ray log measures naturally occurring radioactive minerals. We use this to make inferences about lithology.
  • A neutron log measures Compton scattering in proportion to the number of hydrogen atoms. This is a proxy for pores.
  • A resistivity log measures the formation's resistance to electrical current. We use this to tell us about fluid type and saturation.

Subsurface geotechnology isn't only about recording the earth's constituents in isolation. Some measurements, the sonic log for instance, are useful because of the fact that they are an aggregate of all three.

The well log is a section of the Thebaud_E-74 well available from the offshore Nova Scotia Play Fairway Analysis.

Cope don't fix

Some things genuinely are broken. International financial practices. Intellectual property law. Most well tie software. 

But some things are the way they are because that's how people like them. People don't like sharing files, so they stash their own. Result: shared-drive cancer — no, it's not just your shared drive that looks that way. The internet is similarly wild, chaotic, and wonderful — but no-one uses Yahoo! Directory to find stuff. When chaos is inevitable, the only way to cope is fast, effective search.

So how shall we deal with the chaos of well log names? There are tens of thousands — someone at Schlumberger told me last week that they alone have over 50,000 curve and tool names. But these names weren't dreamt up to confound the geologist and petrophysicist — they reflect decades of tool development and innovation. There is meaning in the morass.

Standards are doomed

Twelve years ago POSC had a go at organizing everything. I don't know for sure what became of the effort, but I think it died. Most attempts at standardization are doomed. Standards are awash with compromise, so they aren't perfect for anything. And they can't keep up with changes in technology, because they take years to change. Doomed.

Instead of trying to fix the chaos, cope with it.

A search tool for log names

We need a search tool for log names. Here are some features it should have:

  • It should be free, easy to use, and fast
  • It should contain every log and every tool from every formation evaluation company
  • It should provide human- and machine-readable output to make it more versatile
  • You should get a result for every search, never drawing a blank
  • Results should include lots of information about the curve or tool, and links to more details
  • Users should be able to flag or even fix problems, errors, and missing entries in the database

To my knowledge, there are only two tools a little like this: Schlumberger's Curve Mnemonic Dictionary, and the SPWLA's Mnemonics Data Search. Schlumberger's widget only includes their tools, naturally. The SPWLA database does at least include curves from Baker Hughes and Halliburton, but it's at least 10 years out of date. Both fail if the search term is not found. And they don't provide machine-readable output, only HTML tables, so it's difficult to build a service on them.

Introducing fuzzyLAS

We don't know how to solve this problem, but we're making a start. We have compiled a database containing 31,000 curve names, and a simple interface and web API for fuzzily searching it. Our tool is called fuzzyLAS. If you'd like to try it out, please get in touch. We'd especially like to hear from you if you often struggle with rogue curve mnemonics. Help us build something useful for our community.
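To give a feel for the idea, here's a toy fuzzy lookup using nothing but Python's standard library. The mnemonic dictionary is tiny and made up, and this is not how fuzzyLAS works under the hood; it's just an illustration of matching a rogue curve name to the nearest known one.

    from difflib import get_close_matches

    # A tiny, made-up sample of curve mnemonics.
    MNEMONICS = {
        'GR': 'Gamma-ray',
        'DT': 'Sonic slowness',
        'RHOB': 'Bulk density',
        'NPHI': 'Neutron porosity',
        'ILD': 'Deep induction resistivity',
    }

    def fuzzy_lookup(query, n=3, cutoff=0.5):
        """Return the closest known mnemonics to a possibly rogue curve name."""
        hits = get_close_matches(query.upper(), MNEMONICS, n=n, cutoff=cutoff)
        return [(hit, MNEMONICS[hit]) for hit in hits]

    print(fuzzy_lookup('RHO8'))   # [('RHOB', 'Bulk density')]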

The digital well scorecard

In my last post, I ranted about the soup of acronyms that refer to well log curves, a too-frequent book-keeping debacle. This pain, along with others before it, has motivated me to design a solution. At this point all I have is this sketch, a wireframe of should-be software that allows you to visualize every bit of borehole data you can think of:

The goal is simple: show me where the data is in the domain of the wellbore. I don't want to see the data explicitly (yet), just its whereabouts in relation to all other data: data from many disaggregated files, reports, and so on. It is part inventory, part book-keeping, part content management system. Clear the fog before the real work can begin. Because not even experienced folks can see clearly in a fog.

The scorecard doesn't yield a number or a grade point like a multiple choice test. Instead, you build up a quantitative display of your data extents. With the example shown above, I don't even have to look at the well log to tell you that you are in for a challenging well tie, with the absence of sonic measurements in the top half of the well. 

The people that I showed this to immediately understood what was being expressed. They got it right away, which bodes well for my preliminary sketch. Can you imagine using a tool like this, and if so, what features would you need?
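As a crude stand-in for that kind of display, here's a sketch that reads a single LAS file with lasio and draws a bar for each curve spanning the depths where it actually has values. The file name is a placeholder, and this is only a toy version of the wireframe described above.

    import lasio
    import matplotlib.pyplot as plt

    las = lasio.read('example.las')    # placeholder file name
    df = las.df()                      # depth-indexed DataFrame of curves

    fig, ax = plt.subplots(figsize=(6, 4))
    for i, curve in enumerate(df.columns):
        valid = df[curve].dropna()
        if len(valid):
            # One bar per curve, spanning the depths where data exists.
            ax.plot([i, i], [valid.index.min(), valid.index.max()], lw=6)
    ax.set_xticks(range(len(df.columns)))
    ax.set_xticklabels(df.columns, rotation=90)
    ax.invert_yaxis()                  # depth increases downwards
    ax.set_ylabel('Depth')
    plt.show()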

Swimming in acronym soup

In a few rare instances, an abbreviation can become so well known that it is adopted into everyday language, more familiar than the words it used to stand for. It's embarrassing, but I needed to actually look up LASER, and you might feel the same way with SONAR. These acronyms are the exception. Most are obscure barriers to entry in technical conversations. They can be constructs for wielding authority and exclusivity. Welcome to the club, if you know the password.

No domain of subsurface technology is riddled with more acronyms than well log analysis and formation evaluation. This is a big part of — perhaps too much of a part of — why petrophysics is hard. Last week, I came across a well with an extended suite of logs, and I wanted to make a synthetic. Have a glance at the image and see which curve names you recognize (the size represents how often each name is encountered across many files from the same well).

I felt like I was being spoken to by some earlier delinquent: I got yer well logs right here buddy. Have fun sorting this mess out.

The Log ASCII Standard (*.LAS) file format goes a long way towards exposing descriptive information in the header. But this information is often incomplete or missing, and says nothing about the quality or completeness of the data. I had to scan 5 files to compile this soup. A micro-travesty and a failure, in my opinion. How does one turn this into meaningful information for geoscience?

Whose job is it to sort this out? The service company that collected the data? The operator that paid for it? A third party down the road?

What I need is not only an acronym look-up table, but also a data range tool to show me what I've got in the file (or files), and at which locations and depths I've got it. A database to give me more information about these acronyms would be nice too, and a feature that allows me to compare multiple files, wells, and directories at once. It would be like a life preserver. Maybe we should build it.
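A first pass at that kind of inventory is only a few lines with lasio (the library from the first post above). This assumes a folder of LAS files at a made-up path, and simply lists each file's depth range and curve mnemonics:

    import glob
    import lasio

    for fname in sorted(glob.glob('wells/*.las')):      # hypothetical folder
        las = lasio.read(fname)
        depths = las.index                              # the reference (depth) curve
        curves = ', '.join(c.mnemonic for c in las.curves)
        print(f"{fname}: {depths.min():.1f} to {depths.max():.1f} m | {curves}")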

I made the word cloud by pasting text into wordle.net. I extracted the text from the data files using the wonderful LASReader written by Warren Weckesser. Yay, open source!

News of the month

Our more-or-less regular news round-up is here again. News tips?

Geophysics giant

On Monday the French geophysics company CGGVeritas announced a deal to buy most of Fugro's Geoscience division for €1.2 billion (a little over $1.5 billion). What's more, the two companies will enter into a joint venture in seabed acquisition. Fugro, based in the Netherlands, will pay CGGVeritas €225 million for the privilege. CGGVeritas also pick up commercial rights to Fugro's data library, which they will retain. Over 2500 people are involved in the deal — and CGGVeritas are now officially Really Big. 

Big open data?

As Evan mentioned in his reports from the SEG IQ Earth Forum, Statoil is releasing some of their Gullfaks dataset through the SEG. This dataset is already 'out there' as the Petrel demo data, though there has not yet been an announcement of exactly what's in the package. We hope it includes gathers, production data, core photos, and so on. The industry needs more open data! What legacy dataset could your company release to kickstart innovation?

Journal innovation

Again, as Evan reported recently, SEG is launching a new peer-reviewed, quarterly journal — Interpretation. The first articles will appear in early 2013. The journal will be open access... but only till the end of 2013. Perhaps they will reconsider if they get hundreds of emails asking for it to remain open access! Imagine the impact on the reach and relevance of the SEG that would have. Why not email the editorial team?

In another dabble with openness, The Leading Edge has opened up its latest issue on reserves estimation, so you don't need to be an SEG member to read it. Why not forward it to your local geologist and reservoir engineer?

Updating a standard

It's all about SEG this month! The SEG is appealing for help revising the SEG-Y standard, for its revision 2. If you've ever whined about the lack of standardness in the existing standard, now's your chance to help fix it. If you haven't whined about SEG-Y, then I envy you, because you've obviously never had to load seismic data. This is a welcome step, though I wonder if the real problems are not in the standard itself, but in education and adoption.

The SEG-Y meeting is at the Annual Meeting, which is coming up in November. The technical program is now online, a fact which made me wonder why on earth I paid $15 for a flash drive with the abstracts on it.

Log analysis in OpendTect

We've written before about CLAS, a new OpendTect plug-in for well logs and petrophysics. It's now called CLAS Lite, and is advertised as being 'by Sitfal', though it was previously 'by Geoinfo'. We haven't tried it yet, but the screenshots look very promising.

This regular news feature is for information only. We aren't connected with any of these organizations, and don't necessarily endorse their products or services. Except OpendTect, which we definitely do endorse.