X lines of Python: Ternary diagrams

Difficulty rating: beginner-friendly

(I just realized that calling the more approachable tutorials ‘easy’ is perhaps not the most sympathetic way to put it. But I think this one is fairly approachable.)

If you’re new to Python, plotting is a great way to get used to data structures, and even syntax, because you get immediate visual feedback. Plots are just fun.

Data loading

The first thing is to load the data, which is contained in a Google Sheets spreadsheet. If you make a sheet public, it’s easy to make a URL that provides a CSV. Happily, the Python data management library pandas can read URLs directly, so loading the data is quite easy — the only slightly ugly thing is the long URL:

    import pandas as pd
    uid = "1r7AYOFEw9RgU0QaagxkHuECvfoegQWp9spQtMV8XJGI"
    url = f"https://docs.google.com/spreadsheets/d/{uid}/export?format=csv"
    df = pd.read_csv(url) 

This dataset contains results from point-counting 51 shallow marine sandstones from the Eocene Sobrarbe Formation. We’re going to plot normalized volume percentages of quartz grains, detrital carbonate grains, and undifferentiated matrix. Three parameters? Two degrees of freedom? Let’s make a ternary plot!

Data exploration

Once you have the data in pandas, and before getting to the triangular stuff, we should have a look at it. Seaborn, a popular statistical plotting library, has a nifty ‘pairplot’ which plots the numerical parameters against each other to help reveal patterns in the data. On the diagonal, it shows kernel density estimations to reveal the distribution of each property:

    import seaborn as sns
    vars = ['Matrix', 'Quartz', 'Carbonate', 'Bioclasts', 'Authigenic']
    sns.pairplot(df, vars=vars, hue='Facies Association')
ternary_data_pairplot.png

Normalization is fairly straightforward. For each column, e.g. df['Carbonate'], we make a new column, e.g. df['C'], which is normalized to the sum of the three components, given by df[cols].sum(axis=1):

cols = ['Carbonate', 'Quartz', 'Matrix']
for col in cols:
    df[col[0]] = df[col] * 100 / df[cols].sum(axis=1)

The ternary plot

For the ternary plot itself I’m using the python-ternary library, which is pretty hands-on in that most plots take quite a bit of code. But the upside of this is that you can do almost anything you want. (Theres one other option for Python, the ever-reliable plotly, and there’s a solid-looking package for R too in ggtern.)

We just need a few lines of plotting code (left) to pull a ternary diagram (right) together.

    fig, tax = ternary.figure(scale=100)
    fig.set_size_inches(5, 4.5)

    tax.scatter(df[['M', 'Q', 'C']].values)
    tax.gridlines(multiple=20)
    tax.get_axes().axis('off')
ternary_tiny.png

But here you see what I mean about this being quite a low-level library: each element of the plot has to be added explicitly. So if we want axis labels, titles, and other annotations, we need more code… all of which is laid out in the accompanying notebook. You can download this from GitHub, or run in right now, right in your browser, with these links:

Binder   Run the accompanying notebook in MyBinder

Open In Colab   Run the notebook in Google Colaboratory (note you need to install python-ternary)

Give it a go, and have fun making your own ternary plots in Python! Share them on LinkedIn or Twitter.

Quartz, carbonate and matrix quantities (normalized to 100%) for 51 calcareous sandstones from the Eocene Sobrarbe Formation. The ternary plot was made with python-ternary library for Python and matplotlib.

Quartz, carbonate and matrix quantities (normalized to 100%) for 51 calcareous sandstones from the Eocene Sobrarbe Formation. The ternary plot was made with python-ternary library for Python and matplotlib.

Visualization in Copenhagen, part 2

In Part 1, I wrote about six of the projects teams contributed at the Subsurface Hackathon in Copenhagen in June. Today I want to tell you about the rest of them. 


A data exploration tool

Team GeoClusterFu...n: Dan Stanton (University of Leeds), Filippo Broggini (ETH Zürich), Francois Bonneau (Nancy), Danny Javier Tapiero Luna (Equinor), Sabyasachi Dash (Cairn India), Nnanna Ijioma (geophysicist). 

Tech: Plotly Dash. GitHub repo.

Project: The team set out to build an interactive web app — a totally new thing for all of them — to make interactive plots from data in a CSV. They ended up with the basis of a useful tool for exploring geoscience data. Project page.

Four sixths of the GeoClusterFu...n team cluster around a laptop.

Four sixths of the GeoClusterFu...n team cluster around a laptop.


AR outcrop on your phone

Team SmARt_OGs: Brian Burnham (University of Aberdeen), Tala Maria Aabø (Natural History Museum of Denmark), Björn Wieczoreck, Georg Semmler and Johannes Camin (GiGa Infosystems).

Tech: ARKit/ARCore, WebAR, Firebase. GitLab repo. 

Project: Bjørn and his colleagues from GiGa Infosystems have been at all the European hackathons. This time, he knew he wanted to get virtual outcrops on mobiles phones. He found a willing team, and they got it done! Project page.

Three views from the SmartOGs's video.  See the full version.

Three views from the SmartOGs's video. See the full version.


Rock clusters in latent space

The Embedders: Lukas Mosser (Imperial College London), Jesper Dramsch (Technical University of Denmark), Ben Fischer (PricewaterhouseCoopers), Harry McHugh (DUG), Shubhodip Konar (Cairn India), Song Hou (CGG), Peter Bormann (ConocoPhillips).

Tech: Bokeh, scikit-learn, Multicore-TSNE. GitHub repo.

Project: There has been a lot of recent interest in the t-SNE algorithm as a way to reduce the dimensionality of complex data. The team explored its application to subsurface data, and found promising applications. Web page. Project page.

The Embeders built a web app to cluster the data in an LAS file. The clusters (top left) are generated by the t-SNE algorithm.

The Embeders built a web app to cluster the data in an LAS file. The clusters (top left) are generated by the t-SNE algorithm.


Fully mixed reality

Team Hands On GeoLabs: Will Sanger (Western Geco), Chance Sanger (Houston Museum of Fine Arts), Pierre Goutorbe (Total), Fernando Villanueva (Institut de Physique du Globe de Paris).

Project: Starting with the ambitious goal of combining the mixed reality of the Meta AR gear with the mixed reality of the Gempy sandbox, the team managed to display and interact with some seismic data in the AR headset, which  allows interaction with simple hand gestures. Project page.

The team demonstrate the Meta AR headset.

The team demonstrate the Meta AR headset.


Huge grids over the web

Team Grid Vizards: Fabian Kampe, Daniel Buse, Jonas Kopcsek, Paul Gabriel (all from GiGa Infosystems)

Tech: three.js. GitHub repo.

Project: Paul and his team wanted to visualize hundreds of millions or billions of grid cells — all in the browser. They ended up with about 20 million points working very smoothly, and impressed everyone. Project page.

grid_vizards.png

Interpreting RGB displays for spec decomp

Team: Florian Smit (Technical University of Denmark), Gijs Straathof (SGS), Thomas Gazzola (Total), Julien Capgras (Total), Steve Purves (Euclidity), Tom Sandison (Shell)

Tech: Python, react.js. GitHub repos: Client. Backend.

Project: Spectral decomposition is still a mostly quantitative tool, especially the interpretation of RGB-blended displays. This team set out to make intuitive, attractive forward models of the spectral response of wells. This should help interpret seismic data, and perhaps make more useful RGB displays too. Intriguing and promising work. Project page.

RGB_log.png

That's it for another year! Twelve new geoscience visualization projects — ten of them open source. And another fun, creative weekend for 63 geoscientists — all of whom left with new connections and new skills. All this compressed into one weekend. If you haven't experienced a hackathon yet, I urge you to seek one out.

I will leave you with two videos — and an apology. We are so focused on creating a memorable experience for everyone in the room, that we tend to neglect the importance of capturing what's happening. Early hackathons only had the resulting blog post as the document of record, but lately we've been trying to livestream the demos at the end. Our success has been, er, mixed... but they were especially wonky this time because we didn't have livestream maestro Gram Ganssle there. So, these videos exist, and are part of the documentation of the event, but they barely begin to convey the awesomeness of the individuals, the teams, or their projects. Enjoy them, but next time — you should be there!

Visualization in Copenhagen, part 1

CPH_blog_banner.png

It's finally here! The round-up of projects from the Subsurface Hacakthon in Copenhagen last month. This is the first of two posts presenting the teams and their efforts, in the same random order the teams presented them at the end of the event.


Subsurface data meets Pokemon Go

Team Geo Go: Karine Schmidt, Max Gribner, Hans Sturm (all from Wintershall), Stine Lærke Andersen (University of Copenhagen), Ole Johan Hornenes (University of Bergen), Per Fjellheim (Emerson), Arne Kjetil Andersen (Emerson), Keith Armstrong (Dell EMC). 

Project: With Pokemon Go as inspiration, the team set out to prototype a geoscience visualization app that placed interactive subsurface data elements into a realistic 3D environment.

180610_agile_scientific_78.jpg

Visualizing blind spots in data

Team Blind Spots: Jo Bagguley (UK Oil & Gas Authority), Duncan Irving (Teradata), Laura Froelich (Teradata), Christian Hirsch (Aalborg University), Sean Walker (Campbell & Walker Geophysics).

Tech: Flask, Bokeh, AWS for hosting app. GitHub repo.

Project: Data management always comes up as an issue in conversations about geocomputing, but few are bold enough to tackle it head on. This team built components for checking the integrity of large amounts of raw data, before passing it to data science projects. Project page.

Sean, Laura, and Christian. Jo and Duncan were out doing research. Note the kanban board in the background — agile all the way!

Sean, Laura, and Christian. Jo and Duncan were out doing research. Note the kanban board in the background — agile all the way!


Volume uncertainties visualization

Team Fortuna: Natalia Shchukina (Total), Behrooz Bashokooh (Shell), Tobias Staal (University of Tasmania), Robert Leckenby (now Agile!), Graham Brew (Dynamic Graphics), Marco van Veen (RWTH Aachen). 

Tech: Flask, Bokeh, Altair, Holoviews. GitHub repo.

Project: Natalia brought some data with her: lots of surface grids. The team built a web app to compute uncertainty sections and maps, then display them dynamically and interactively — eliciting audible gasps from the room. Project page.

The Fortuna app: Probability of being the the zone (left) and entropy (right). Cross-sections are shown at the top, maps on the bottom.


Differences and similarities with RGB blends

Team RGBlend: Melanie Plainchault and Jonathan Gallon (Total), Per Olav Svendsen, Jørgen Kvalsvik and Max Schuberth (Equinor).

Tech: Python, Bokeh. GitHub repo.

Project: One of the more intriguing ideas of the hackathon was not just so much a fancy visualization technique, as a novel way of producing a visualization — differencing 3 images and visualizing the differences in RGB space. It reminded me of an old blog post about the spot the difference game. Project page.

The differences (lower right) between three time-lapse seismic amplitude maps.

The differences (lower right) between three time-lapse seismic amplitude maps.


Augmented reality geological maps

Team AR Sandbox: Simon Virgo (RWTH Aachen), Miguel de la Varga (RWTH Aachen), Fabian Antonio Stamm (RWTH Aachen), Alexander Schaaf (University of Aberdeen).

Tech: Gempy. GitHub repo.

Project: I don't have favourite projects, but if I did, this would be it. The GemPy group had already built their sandbox when they arrived, but they extended it during the hackathon. Wonderful stuff. Project page.

magic box of sand: Sculpting a landscape (left), and the projected map (right). You can't even imagine how much fun it was to play with.


Augmented reality seismic wavefields

Team Sandbox Seismics: Yuriy Ivanov (NTNU Trondheim), Ana Lim (NTNU Trondheim), Anton Kühl (University of Copenhagen), Jean Philippe Montel (Total).

Tech: GemPy, Devito. GitHub repo.

Project: This team worked closely with Team AR Sandbox, but took it in a different direction. They instead read the velocity from the surface of the sand, then used devito to simulate a seismic wavefield propagating across the model, and projected that wavefield onto the sand. See it in action in my recent Code Show post. Project page.

Yuriy Ivanov demoing the seismic wavefield moving across the sandbox.


Pretty cool, right? As usual, all of these projects were built during the hackathon weekend, almost exclusively by teams that formed spontaneously at the event itself (I think one team was self-contained from the start). If you didn't notice the affiliations of the participants — go back and check them out; I think this might have been an unprecedented level of collaboration!

Next time we'll look at the other six projects. [UPDATE: Next post is here.]

Before you go, check out this awesome video Wintershall made about the event. A massive thank you to them for supporting the event and for recording this beautiful footage — and for agreeing to share it under a CC-BY license. Amazing stuff!

Visualize this!

The Copenhagen edition of the Subsurface Hackathon is over! For three days during the warmest June in Denmark for over 100 years, 63 geoscientists and programmers cooked up hot code in the Rainmaking Loft, one of the coolest, and warmest, coworking spaces you've ever seen. As always, every one of the participants brought their A game, and the weekend flew by in a blur of creativity, coffee, and collaboration. And croissants.

Pierre enjoying the  Meta AR headset  that DEll EMC provided.

Pierre enjoying the Meta AR headset that DEll EMC provided.

Our sponsors have always been unusually helpful and inspiring, pushing us to get more audacious, but this year they were exceptionally engaged and proactive. Dell EMC, in the form of David and Keith, provided some fantastic tech for the teams to explore; Total supported Agile throughout the organization phase, and Wintershall kindly arranged for the event to be captured on film — something I hope to be able to share soon. See below for the full credit roll!

sponsors.png

During th event, twelve teams dug into the theme of visualization and interaction. As in Houston last September, we started the event on Friday evening, after the Bootcamp (a full day of informal training). We have a bit of process to form the teams, and it usually takes a couple of hours. But with plenty of pizza and beer for fuel, the evening flew by. After that, it was two whole days of coding, followed by demos from all of the teams and a few prizes. Check out some of the pictures:

Thank you very much to everyone that helped make this event happen! Truly a cast of thousands:

  • David Holmes of Dell EMC for unparallelled awesomeness.
  • The whole Total team, but especially Frederic Broust, Sophie Segura, Yannick Pion, and Laurent Baduel...
  • ...and also Arnaud Rodde for helping with the judging.
  • The Wintershall team, especially Andreas Beha, who also acted as a judge.
  • Brendon Hall of Enthought for sponsoring the event.
  • Carlos Castro and Kim Saabye Pedersen of Amazon AWS.
  • Mathias Hummel and Mahendra Roopa of NVIDIA.
  • Eirik Larsen of Earth Science Analytics for sponsoring the event and helping with the judging.
  • Duncan Irving of Teradata for sponsoring, and sorting out the T-shirts.
  • Monica Beech of Ikon Science for participating in the judging.
  • Matthias Hartung of Target for acting as a judge again.
  • Oliver Ranneries, plus Nina and Eva of Rainmaking Loft.
  • Christopher Backholm for taking such great photographs.

Finally, some statistics from the event:

  • 63 participants, including 8 women (still way too few, but 100% better than 4 out of 63 in Paris)
  • 15 students plus a handful of post-docs.
  • 19 people from petroleum companies.
  • 20 people from service and technology companies, including 7 from GiGa-infosystems!
  • 1 no-show, which I think is a new record.

I will write a summary of all the projects in a couple of weeks when I've caught my breath. In the meantime, you can read a bit about them on our new events portal. We'll be steadily improving this new tool over the coming weeks and months.

That's it for another year... except we'll be back in Europe before the end of the year. There's the FORCE Hackathon in Stavanger in September, then in November we'll be in Aberdeen and London running some events with the Oil and Gas Authority. If you want some machine learning fun, or are looking for a new challenge, please come along!

Simon Virgo (centre) and his colleagues in Aachen built an augmented reality sandbox, powered by their research group's software,  Gempy . He brought it along and three teams attempted projects based on the technology. Above, some of the participants are having a scrum meeting to keep their project on track.

Simon Virgo (centre) and his colleagues in Aachen built an augmented reality sandbox, powered by their research group's software, Gempy. He brought it along and three teams attempted projects based on the technology. Above, some of the participants are having a scrum meeting to keep their project on track.


Looking forward to Copenhagen

We're in Copenhagen for the Subsurface Bootcamp and Hackathon, which start today, and the EAGE Annual Conference and Exhibition, which starts next week. Walking around the city yesterday, basking in warm sunshine and surrounded by sun-giddy Scandinavians, it became clear that Copenhagen is a pretty special place, where northern Europe and southern Europe seem to have equal influence.

The event this weekend promises to be the biggest hackathon yet. It's our 10th, so I think we have the format figured out. But it's only the third in Europe, the theme — Visualization and interaction — is new for us, and most of the participants are new to hackathons so there's still the thrill of the unknown! 

Many thanks to our sponsors for helping to make this latest event happen! Support these organizations: they know how to accelerate innovation in our industry.

sponsors.png

New events for UK

By the way, we just announced two new hackathons, one in London and one in Aberdeen, for the autumn. They are happening just before PETEX, the PESGB petroleum conference; find out more here. You can skill up for these events at some new courses, also just announced. The UK Oil and Gas Authority is offering our Intro to Geocomputing and Machine Learning class for free — apply here for a place. The courses are oversubscribed, so be sure to tell the OGA why you should get a place!

Code Show

There is a lot of other stuff happening at the EAGE exhibition this year — the HPC area, a new start-up area, and a digital transformation area which I hope is as bold as it sounds. Here's the complete schedule and some highlights:

There's lots of other stuff of course — EAGE has the most varied programme of any subsurface conference — but these are the sessions I'd be at if I had time to go to any sessions this year. But I won't because The hackathon is not all that's happening! Next week, starting on Tuesday, we're conducting a new experiment with the Code Show. In partnership with EAGE and Total, this is our attempt to bring some of the hackathon experience to everyone at EAGE. We'll be showing people the projects from the hackathon, talking to them about programming, and helping them get started on their own coding adventure. So if you're at EAGE, swing by Booth #1830 and say Hi.

x lines of Python: contour maps

Difficulty rating: EASY

Following on from the post a couple of weeks ago about colourmaps, I wanted to poke into contour maps a little more. Ostensibly, making a contour plot in matplotlib is a one-liner:

plt.contour(data)

But making a contour plot look nice takes a little more work than most of matplotlib's other plotting functions. For example, to change the contour levels you need to make an array containing the levels you want... another line of code. Adding index contours needs another line. And then there's all the other plotty stuff.

Here's what we'll do:

  1. Load the data from a binary NumPy file.
  2. Check the data looks OK.
  3. Get the min and max values from the map.
  4. Generate the contour levels.
  5. Make a filled contour map and overlay contour lines.
  6. Make a map with index contours and contour labels.

The accompanying notebook sets out all the code you will need. You can even run the code right in your browser, no installation required.

Here's the guts of the notebook:

 
import numpy as np
import matplotlib.pyplot as plt

seabed = np.load('../data/Penobscot_Seabed.npy')
seabed *= -1
mi, ma = np.floor(np.nanmin(seabed)), np.ceil(np.nanmax(seabed))
step = 2
levels = np.arange(10*(mi//10), ma+step, step)
lws = [0.5 if level % 10 else 1 for level in levels]

# Make the plot
fig = plt.figure(figsize=(12, 8))
ax = fig.add_subplot(1,1,1)
im = ax.imshow(seabed, cmap='GnBu_r', aspect=0.5, origin='lower')
cb = plt.colorbar(im, label="TWT [ms]")
cb.set_clim(mi, ma)
params = dict(linestyles='solid', colors=['black'], alpha=0.4)
cs = ax.contour(seabed, levels=levels, linewidths=lws, **params)
ax.clabel(cs, fmt='%d')
plt.show()

This produces the following plot:

my_map.png

Old skool plot tool

It's not very glamorous, but sometimes you just want to plot a SEG-Y file. That's why we crafted seisplot. OK, that's why we cobbled seisplot together out of various scripts and functions we had lying around, after a couple of years of blog posts and Leading Edge tutorials and the like.

Pupils of the old skool — when everyone knew how to write a bash script, pencil crayons and lead-filled beanbags ruled the desktop, and Carpal Tunnel Syndrome was just the opening act to the Beastie Boys — will enjoy seisplot. For a start, it's command line only: 

    python seisplot.py -R -c config.py ~/segy_files -o ~/plots

Isn't that... reassuring? In this age of iOS and Android and Oculus Rift... there's still the command line interface.

Features galore

So what sort of features can you look forward to? Other than all the usual things you've come to expect of subsurface software, like a complete lack of support or documentation. (LOL, I'm kidding.) Only these awesome selling points:

  • Make wiggle traces or variable density plots... or don't choose — do both!
  • If you want, the script will descend into subdirectories and make plots for every SEG-Y file it finds.
  • There are plenty of colourmaps to choose from, or if you're insane you can make your own.
  • You can make PNGs, JPGs, SVGs or PDFs. But not CGM, sorry about that.

Well, I say 'selling points', but the tool is 100% free. We think this is a fair price. It's also open source of course, so please — seriously, please — improve the source code, then share it with the world! The code is in GitHub, natch.

Never go full throwback

There is one more feature: you can go full throwback and add scribbles and coffee stains. Here's one for your wall:


The 2D seismic line in this post is from the USGS NPRA Seismic Data Archive, and are in the public domain. This is line number 31-81-PR (links directly to SEG-Y file).

Monday highlights from SEG

Ben and I are in New Orleans at the 2015 SEG Annual Meeting, a fittingly subdued affair, given the industry turmoil recently. Lots of people are looking for work, others are thankful to have it.

We ran our annual Geophysics Hackathon over the weekend; I'll write more about that later this week. In a nutshell: despite a low-ish turnout, we had 6 great projects, all of them quite different from anything we've seen before. Once again, Colorado School of Mines dominated.

Beautiful maps

One of the most effective ways to make a tight scientific argument is to imagine trying to convince the most skeptical person you know that your method works. When it comes to seismic attribute analysis, I am that skeptical person.

Some of the nicest images I saw today were in the 'Attributes for Stratigraphic Analysis' session, chaired by Rupert Cole and Yuefeng Sun. For example, Tao Zhao, one of Kurt Marfurt's students, showed some beautiful images from the Waka 3D offshore New Zealand (Zhao & Marfurt). He used 2D colourmaps to co-render two attributes together, along with semblance mapped to opacity on a black layer, and were very nice to look at. However I was left wondering, and not for the first time, how we can do a better job calibrating those maps to geology. We (the interpretation community) need to stop side-stepping that issue; it's central to our credibility. Even if you have no wells, as in this study, you can still use forward models, analogs, or at least interpretation by a sedimentologist, preferably two.

© SEG and Zhao & Marfurt. Left to right: Peak spectral frequency and peak spectral magnitude; GLCM homogeneity; shape index and curvedness. All of the attributes are also corendered with Sobel edge detection.

© SEG and Zhao & Marfurt. Left to right: Peak spectral frequency and peak spectral magnitude; GLCM homogeneity; shape index and curvedness. All of the attributes are also corendered with Sobel edge detection.

Pavel Jilinski at GeoTeric gave a nice talk (Calazans Muniz et al.) about applying some of these sort of fancy displays to a large 3D dataset in Brazil, in a collaboration with Petrobras. The RGB displays of spectral attributes were as expected, but I had not seen their cyan-magenta-yellow (CMY) discontinuity displays before. They map dip to the yellow channel, similarity to the magenta channel, and 'tensor discontinuity' to the cyan channel. No, I don't know what that means either, but the displays were pretty cool.

Publications news

This evening we enjoyed the Editor's Dinner (I coordinate a TLE column and review for Geophysics and Interpretation, so it's totally legit). Good things are coming to the publication world: adopted Canadian Mauricio Sacchi is now Editor-in-Chief, there are no more page charges for colour in Geophysics (up to 10 pages), and watch out for video abstracts next year. Also, Chris Liner mentioned that Interpretation gets 18% of its submissions from oil companies, compared to only 5% for Geophysics. And I heard, but haven't verified, that downturns result in more papers. So at least our journals are healthy. (You do read them, right?)

That's it for today (well, yesterday). More tomorrow!


References

Calazans Muniz, Moises, Thomas Proença, and Pavel Jilinski (2015). Use of Color Blend of seismic attributes in the Exploration and Production Development - Risk Reduction. SEG Technical Program Expanded Abstracts 2015: pp. 1638-1642. doi: 10.1190/segam2015-5916038.1

Zhao, Tao, and Kurt J. Marfurt (2015). Attribute assisted seismic facies classification on a turbidite system in Canterbury Basin, offshore New Zealand. SEG Technical Program Expanded Abstracts 2015: pp. 1623-1627. doi: 10.1190/segam2015-5925849.1

Corendering more attributes

My recent post on multi-attribute data visualization painted two seismic attributes from on a timeslice. Let's look now at corendering attributes extracted on a seismic horizon. I'll reproduce the example Matt gave in his post on colouring maps.

Although colour choices come down to personal preference, there are some points to keep in mind:

  • Data that varies relatively gradually across the canvas — e.g. elevation here — should use a colour scale that varies monotonically in hue and luminance, e.g. CubeHelix or Matteo Niccoli's colourmaps.
  • Data that varies relatively quickly across the canvas — e.g. my similarity data, (a member of the family that includes coherencesemblance, and so on) — should use a monochromatic colour scale, e.g. black–white. 
  • If we've chosen our colourmaps wisely, there should be some unused hues for rendering other additional attributes. In this case, there are no red hues in the elevation colourmap, so we can map redness to instantaneous amplitude.

Adding a light source

Without wanting to get too gimmicky, we can sometimes enliven the appearance of an attribute, accentuating its texture, by simulating a bumpy surface and shining a virtual light onto it. This isn't the same as casting a light source on the composite display. We can make our light source act on only one of our attributes and leave the others unchanged. 

Similarity attribute Displayed using a Greyscale Colourbar (left). Bump mapping of similarity attribute using a lightsource positioned at azimuth 350 degrees, inclination 20 degrees. 

The technique is called hill-shading. The terrain doesn't have to be a physical surface; it can be a slice. And unlike physical bumps, we're not actually making a new surface with relief, we are merely modifying the surface's luminance from an artificial light source. The result is a more pronounced texture.

One view, two dimensions, three attributes

Constructing this display takes a bit of trial an error. It wasn't immediately clear where to position the light source to get the most pronounced view. Furthermore, the amplitude extraction looked quite noisy, so I softened it a little bit using a Gaussian filter. Plus, I wanted to show only the brightest of the bright spots, so that all took a bit of fiddling.

Even though 3D data visualization is relatively common, my assertion is that it is much harder to get 3D visualization right, than for 2D. Looking at the 3 colour-bars that I've placed in the legend. I'm reminded of this difficulty of adding a third dimension; it's much harder to produce a colour-cube in the legend than a series of colour-bars. Maybe the best we can achieve is a colour-square like last time, with a colour-bar for the overlay on the side.

Check out the IPython notebook for the code used to create these figures.

The (bad) stuff of legend

What is a legend? Merriam–Webster says:

  1. A story from the past that is believed by many people but cannot be proved to be true.
  2. An explanatory list of the symbols on a map or chart.

I think we can combine these:

An explanatory list from the past that is believed by many to be useful but which cannot be proved to be.

Maybe that goes too far, sometimes you need a legend. But often, very often, you don't. At the very least, you should always try hard to make the legend irrelevant. Why, and how, can you do this? 

A case study

On the right is a non-scientific caricature of a figure from a paper I just finished reviewing for Geophysics. I won't give any more details because I don't want to pick on it unduly — lots of authors make the same mistakes.

Here are some of the things I think are confusing about this figure, detracting from the science in the paper. 

  • Making the reader cross-reference the line decoration with the legend makes it harder to make the comparison you're asking them to make. Just label the lines directly. 
  • Using unhelpful, generic names like 1, 2, and 3 for the models leads the reader into cross-reference Inception. The models were shown and explained on the previous page. 
  • Inception again: the models 1, 2, and 3 were shown in the previous figure parts (a), (b), and (c) respectively. So I had to cross-reference deeper still to really find out about them. 
  • The paper used colour elsewhere, so the use of black and white line decoration here seems unnecessary. There are other ways to ensure clarity if the paper is photocopied.
  • Everything on the same visual plane, so to speak, so the chart cannot take any more detail, such as gridlines. 

Getting better

I have tried to fix some of this in the version of the figure shown here. It's the same size as the original. The legend, such as it is, is now a visual key to the models. Careful juxtaposition of figures could obviate the need even for this extra key. The idea would be to use the colours and names of the models in every figure, to link them more intuitively.

The principles at work:

  • Reduce the fatigue of reading by labeling things directly.
  • Avoid using 'a' and 'b' or other generic names. Call the parts before and after, or 8 ms gate and 16 ms gate
  • Put things you want people to compare next to each other: models with data, output with input, etc. 
  • Use less ink for decoration, more ink for data. Gently direct the reader's attention. 

I'm sure there are other improvements we could make. Do you have any tips to share for making better figures? Leave them in the comments. 


Update, 30 Jan 2015

Some great comments came in today, and the point about black and white is well taken. Indeed, our 52 Things books are all black and white, and I end up transforming most images and figures to (I hope) make them clearer without colour. Here's how I'd do this figure in black and white.