EarthArXiv wants your preprints

If you're into science, and especially physics, you've heard of arXiv, which has revolutionized how research in physics is shared. Others followed, including bioRxiv, SocArXiv and PaleorXiv*.

Well, get excited, because today, at last, there is an open preprint server especially for earth science: EarthArXiv has landed!

I could write a long essay about how great this news is, but the best way to get the full story is to listen to two of the founders — Chris Jackson (Imperial College London and fellow University of Manchester alum) and Tom Narock (University of Maryland, Baltimore) — on Undersampled Radio this morning:

Congratulations to Chris and Tom, and everyone involved in EarthArXiv!

  • Friedrich Hawemann, ETH Zurich, Switzerland
  • Daniel Ibarra, Earth System Science, Stanford University, USA
  • Sabine Lengger, University of Plymouth, UK
  • Angelo Pio Rossi, Jacobs University Bremen, Germany
  • Divyesh Varade, Indian Institute of Technology Kanpur, India
  • Chris Waigl, University of Alaska Fairbanks, USA
  • Sara Bosshart, International Water Association, UK
  • Alodie Bubeck, University of Leicester, UK
  • Allison Enright, Rutgers - Newark, USA
  • Jamie Farquharson, Université de Strasbourg, France
  • Alfonso Fernandez, Universidad de Concepcion, Chile
  • Stéphane Girardclos, University of Geneva, Switzerland
  • Surabhi Gupta, UGC, India

Don't underestimate how important this is for earth science. Indeed, there's another new preprint server coming to the earth sciences in 2018, as the AGU — with Wiley! — prepare to launch ESSOAr. Not as a competitor for EarthArXiv (I hope), but as another piece in the rich open-access ecosystem of reproducible geoscience that's developing. (By the way, AAPG, SEG, SPE: you need to support these initiatives. They want to make your content more relevant and accessible!)

It's very, very exciting to see this new piece of infrastructure for open access publishing. I urge you to join in! You can submit all your published work to EarthArXiv — as long as the journal's policy allows it — so you should make sure your research gets into the hands of the people who need it.

I hope every conference from now on has an EarthArXiv Your Papers party. 


* Including snarXiv, don't miss that one!

Why Python beats MATLAB for geophysics

MATLAB — the scientific computing environment which includes a programming language — is amazing. It has probably done as much for the development of new geophysical methods, and for the teaching and learning of geophysics, as any other tool or language. A purely anecdotal assertion, but it's rare to meet a geophysicist who has not at least dabbled in MATLAB, and it is used daily in geophysics labs and classrooms. Geophysics <3 MATLAB.

It's easy to see why — MATLAB definitely has some advantages.

Advantages of MATLAB

  • Matrices. MATLAB implicitly treats arrays as matrices (the name means 'matrix laboratory'). As a result, notation is quite intuitive for mathematicians. For example, a*b means standard matrix multiplication, which generalizes the dot product. (Slightly confusingly, to get Python-style element-wise multiplication, you add a dot: a.*b.)
  • Lots of functions. MATLAB has been around for over 30 years, so there are many, many useful functions. Find them either in the core product, in one of the toolboxes, or in MATLAB Central.
  • Simulink. This block-based system design and simulation engine is much-loved by engineers. It allows users to model physical systems in an intuitive, graphical environment.
  • Easy to install. The MATLAB environment is a desktop application, so it is instantly familiar and can be managed under the same processes as other software on your machine or in your organization.
  • MATLAB is widespread in academia. Thanks to one of those generous schemes where software corporations give free software to universities, just because they're awesome and definitely not for any other reason, students and profs have easy and free access to MATLAB. Outside academia, however, you're looking at tens of thousands of dollars.
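
For comparison with the matrix notation in the first bullet, here is a minimal sketch of the same two operations in Python. It assumes NumPy is installed, and the arrays a and b are made up for illustration. Note that NumPy's convention is the reverse of MATLAB's: * is element-wise and @ is the matrix product.

```python
import numpy as np

# Two small 2x2 arrays, invented for this example.
a = np.array([[1, 2],
              [3, 4]])
b = np.array([[5, 6],
              [7, 8]])

# Element-wise product: what MATLAB writes as a.*b
elementwise = a * b

# Matrix product: what MATLAB writes as a*b
matrix_product = a @ b

print(elementwise)
print(matrix_product)
```

Which convention feels more natural probably depends on whether you think of your data as matrices or as grids of samples.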

So far so good, but it's time for geophysics to switch to Python. On the face of it, the language has a lot in common with MATLAB: they're both easy to learn, and both have broad ecosystems that make things like image processing, statistics, and signal processing easy. But Python has some special features that make it a fantastic platform for scientific computing...

Advantages of Python

  • Free and open. Thanks to one of those generous schemes where people make software and let anyone use it for any purpose for free, Python is free! Not only is it free of charge, you are free to inspect and modify the code. Open is awesome. (There are other free alternatives to MATLAB, notably GNU Octave and SciLab.)
  • General purpose. One of the things I love about Python is its flexibility. You can use it in the shell on microtasks, or interactively, or in scripts, or to write server software, or to build enterprise software with GUIs.
  • Namespaces. Everything in MATLAB lives in the main namespace, whereas Python keeps things inherently modular. To access NumPy, say, you have to import it and then use its namespace to get at its contents: numpy.array([1, 2, 3]). This has various advantages, including flexibility, readability, learnability, and portability.
  • Introspection. A powerful idea in Python, introspection means that you (or your code) can see inside every module, class, and function. You can access private variables, or write code that 'knows' about other objects' interfaces.
  • Portable. You can run your Python code on any architecture, whereas to run MATLAB code you either need all the MATLAB licenses the software uses, or another pricey toolbox to make executables.
  • Popular. Python is the 7th most popular tag in Stack Overflow, whereas MATLAB is the 58th. While programming is not a popularity contest, think of your career, or the careers of your students. Once they graduate, Python will serve them better than MATLAB. There are over 300 jobs for Pythonistas on Stack Overflow Jobs right now. MATLAB jobs? Nine.
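
The namespace and introspection bullets are easier to appreciate with a small example. This is only a sketch using the standard library; the function scaled_amplitude is hypothetical and exists purely to give the code something to inspect.

```python
import cmath
import inspect
import math

# Namespaces: two modules can each define sqrt without clashing,
# and the prefix tells the reader which one is meant.
real_root = math.sqrt(16)        # a float
complex_root = cmath.sqrt(-16)   # a complex number


def scaled_amplitude(frequency, amplitude=1.0):
    """Hypothetical function, defined only so there is something to inspect."""
    return amplitude * frequency


# Introspection: ask an object about its own interface at run time.
parameters = list(inspect.signature(scaled_amplitude).parameters)
docstring = inspect.getdoc(scaled_amplitude)
has_floor = hasattr(math, "floor")

print(real_root, complex_root)
print(parameters, has_floor)
print(docstring)
```

The same inspect calls work on functions from third-party packages, which is part of what makes interactive exploration in Python so pleasant.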

So there you have it. It's time to switch to Python. If you're new to programming, there's no contest. I suppose if you're productive in MATLAB, and have access to all the toolboxes, then admittedly it's hard to say you should switch.

But I'll still say it.


I was inspired to write this post after talking to a geophysicist about using programming languages in the classroom, and by the lists in this nice post on pyzo.org. It would be interesting to hear what you use in the classroom — as an instructor or as a student. I know geophysics is being taught with the help of MATLAB (in many places), Java (e.g. at Colorado School of Mines), Mathematica (e.g. by Chris Liner). I wonder if there's anyone using JavaScript, which wouldn't be a terrible choice. Or C++? Or Fortran?? Let us know in the comments!

Try an outernship

In my experience, consortiums under-deliver. We can get the best of both worlds by making the industry–academia interface more permeable.

At one of my clients, I have the pleasure of working with two smart, energetic young geologists. One recently finished, and the other recently started, a 14-month super-internship. Neither one had more than a BSc in geology when they started, and both are going on to do a postgraduate degree after they finish with this multinational petroleum company.

This is 100% brilliant — for them and for the company. After this gap-year-on-steroids, what they accomplish in their postgraduate studies will be that much more relevant, to them, to industry, and to the science. And corporate life, the good bits anyway, can teach smart and energetic people about time management, communication, and collaboration. So by holding back for a year, I think they've actually got a head-start.

The academia–industry interface

Chatting to these young professionals, it struck me that there's a bigger picture. Industry could get much better at interfacing with academia. Today, it tends to happen at a few key relationships, in recruitment, and in a few long-lasting joint industry projects (often referred to as JIPs or consortiums). Most of these interactions happen on an annual timescale, and strictly via presentations and research reports. In a distributed company, most of the relationships are through R&D or corporate headquarters, so the benefits to the other 75% or more of the company are quite limited.

Less secrecy, free the data! This worksheet is from the Unsolved Problems Unsession in 2013.

Instead, I think the interface should be more permeable and dynamic. I've sat through several JIP meetings as researchers have shown work of dubious relevance, using poor or incomplete data, with little understanding of the implications or practical possibilities of their insights. This isn't their fault: the petroleum industry sucks at sharing its goals, methods, uncertainties, and data (a great unsolved problem!).

Increasing permeability

Here's my solution: ordinary human collaboration. Send researchers to intern alongside industry scientists for a month or two. Let them experience the incredible data and the difficult problems first hand. But don't stop there. Send the industry scientists to outern (yes, that is probably a word) alongside the academics, even if only for a week or two. Let them experience the freedom of sitting in a laboratory playground all day, working on problems with brilliant researchers. Let's help people help each other with real side-by-side collaboration, building trust and understanding in the process. A boring JIP meeting once a year is not knowledge sharing.

Have you seen good examples of industry, government, or academia striving for more permeability? How do the high-functioning JIPs do it? Let us know in the comments.


If you liked this, check out some of my other posts on collaboration and knowledge sharing...