# Attribute analysis and statistics

Last week I wrote a basic introduction to attribute analysis. The post focused on the different ways of thinking about sampling and intervals, and on how instantaneous attributes have to be interpolated from the discrete data. This week, I want to look more closely at those interval attributes. We'd often like to summarize the attributes of an interval into a single number, perhaps to make a map.

Before thinking about amplitudes and seismic traces, it's worth reminding ourselves about different kinds of average. This table from SubSurfWiki might help...
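As a quick refresher, here is a minimal sketch of a few common averages using Python's standard library. The numbers in `values` are made up for illustration and are not taken from the SubSurfWiki table.

```python
import math
import statistics

# Hypothetical, strictly positive values chosen only for illustration.
values = [1.0, 2.0, 4.0, 8.0]

arithmetic = statistics.mean(values)            # sum / count
geometric = statistics.geometric_mean(values)   # nth root of the product
harmonic = statistics.harmonic_mean(values)     # reciprocal of the mean reciprocal
median = statistics.median(values)              # middle value of the sorted data
rms = math.sqrt(statistics.mean([v * v for v in values]))  # root-mean-square

for name, value in [
    ("arithmetic mean", arithmetic),
    ("geometric mean", geometric),
    ("harmonic mean", harmonic),
    ("median", median),
    ("RMS", rms),
]:
    print(f"{name:16s} {value:.4f}")
```

Note that the geometric and harmonic means only make sense for positive values, which is one reason they are awkward for signed quantities like seismic amplitude.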

A peculiar feature of seismic data, from a statistical point of view, is the lack of the very low frequencies needed to give it a trend. Because of this, it oscillates around zero, so the average amplitude over a window tends to zero: seismic data has a mean value of zero. So not only do we have to think about interpolation issues when we extract attributes, we also have to think about statistics.
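To see the zero-mean behaviour for yourself, here is a rough sketch that builds a synthetic trace. The random reflectivity, the 25 Hz Ricker wavelet, the 2 ms sample interval and the window position are all assumptions made for illustration, not the data behind this post's figures.

```python
import numpy as np

def ricker(f, dt, half_length=0.064):
    """Ricker wavelet with peak frequency f (Hz), sampled every dt seconds."""
    n = int(round(half_length / dt))
    t = np.arange(-n, n + 1) * dt
    a = (np.pi * f * t) ** 2
    return (1 - 2 * a) * np.exp(-a)

rng = np.random.default_rng(0)
dt = 0.002                                  # assumed sample interval, seconds
reflectivity = rng.normal(size=500)         # assumed random reflectivity series
trace = np.convolve(reflectivity, ricker(25, dt))  # band-limited, no DC component

mean_amp = trace.mean()
rms_amp = np.sqrt(np.mean(trace ** 2))
window_mean = trace[200:300].mean()         # an arbitrary 100-sample window

print(f"mean of whole trace: {mean_amp:.2e}")
print(f"mean of window:      {window_mean:.2e}")
print(f"RMS of whole trace:  {rms_amp:.2e}")
```

The wavelet has essentially no energy at zero frequency, so the mean of the whole trace is tiny compared with its RMS. The mean over a short window is not exactly zero, but it tells you very little about how strong the reflections are.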

Fortunately, once we understand the issue it's easy to come up with ways around it. Look at the trace (black line) below:

The mean is, as expected, close to zero. So I've applied some other statistics to represent the amplitude values, shown as black dots, in the *window* (the length of the plot):

- **Average absolute amplitude** (light green): treat all values as positive and take the mean.
- **Root-mean-square amplitude** (dark green): tends to emphasize large values, so it's a bit higher.
- **Average energy** (magenta): the mean of the magnitude of the complex trace, or the envelope, shown in grey.
- **Maximum amplitude** (blue): the absolute maximum value encountered, which is higher than the actual sample values (which are all integers in this fake dataset) because of interpolation.
- **Maximum energy** (purple): the maximum value of the envelope, which is higher still because it is phase independent.
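The statistics above can be sketched in a few lines of NumPy. This is not the code from the notebook: the synthetic trace, the window limits, the upsampling factor, and the helper names `ricker`, `analytic_signal` and `upsample` are all assumptions for illustration. The envelope is computed with an FFT-based analytic signal (the same idea as `scipy.signal.hilbert`), and the interpolation uses zero-padding in the frequency domain, which treats the trace as periodic.

```python
import numpy as np

def ricker(f, dt, half_length=0.064):
    """Ricker wavelet with peak frequency f (Hz), sampled every dt seconds."""
    n = int(round(half_length / dt))
    t = np.arange(-n, n + 1) * dt
    a = (np.pi * f * t) ** 2
    return (1 - 2 * a) * np.exp(-a)

def analytic_signal(x):
    """Complex trace: zero the negative frequencies, double the positive ones."""
    n = len(x)
    h = np.zeros(n)
    h[0] = 1
    if n % 2 == 0:
        h[n // 2] = 1
        h[1:n // 2] = 2
    else:
        h[1:(n + 1) // 2] = 2
    return np.fft.ifft(np.fft.fft(x) * h)

def upsample(x, factor):
    """Band-limited interpolation by zero-padding the spectrum."""
    n = len(x)
    X = np.fft.rfft(x)
    if n % 2 == 0:
        X[-1] *= 0.5  # split the Nyquist bin so the original samples are honoured
    return np.fft.irfft(X, n=n * factor) * factor

# Assumed synthetic trace: sparse random reflectivity convolved with a wavelet.
rng = np.random.default_rng(42)
dt = 0.002
reflectivity = rng.normal(size=200) * (rng.random(200) < 0.2)
trace = np.convolve(reflectivity, ricker(25, dt))

factor = 8                      # assumed interpolation factor
fine = upsample(trace, factor)  # finely sampled version of the same trace

start, stop = 80, 160           # assumed window, in samples
win = trace[start:stop]
fine_slice = slice(start * factor, stop * factor)

stats = {
    "average absolute amplitude": np.mean(np.abs(win)),
    "RMS amplitude": np.sqrt(np.mean(win ** 2)),
    "average energy": np.mean(np.abs(analytic_signal(trace))[start:stop]),
    "maximum amplitude": np.max(np.abs(fine[fine_slice])),
    "maximum energy": np.max(np.abs(analytic_signal(fine))[fine_slice]),
}

for name, value in stats.items():
    print(f"{name:28s} {value:.4f}")
```

Here the two averages and the average energy use the sample values in the window, while the two maxima use the interpolated trace, which is why they can exceed any individual sample.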

There are other statistics besides these, of course. We could compute the median average, or some other mean. We could take the strongest trough, or the maximum derivative (steepest slope). The options are really only limited by your imagination, and the physical relationship with geology that you expect.
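A few of these alternatives take one line each. In this sketch the integer samples in `window` and the 4 ms sample interval are invented for illustration, and the slope is a simple finite-difference estimate on the raw samples rather than on an interpolated trace.

```python
import numpy as np

# Hypothetical window of integer amplitudes and an assumed sample interval.
window = np.array([0, 3, 5, 2, -4, -7, -3, 1, 4, 2, -1, -2], dtype=float)
dt = 0.004  # seconds

median_amp = np.median(window)                  # robust to a few extreme samples
strongest_trough = window.min()                 # most negative amplitude
slope = np.gradient(window, dt)                 # finite-difference derivative
steepest_slope = np.max(np.abs(slope))          # amplitude units per second

print(f"median amplitude:  {median_amp}")
print(f"strongest trough:  {strongest_trough}")
print(f"steepest slope:    {steepest_slope:.1f}")
```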

**We'll return to this series over the summer, asking questions like "How do you know what to expect?" and "Does a physically realistic relationship even matter?"**

*To view and run the code that I used in creating the figures for this post, grab the IPython/Jupyter Notebook.*