Crazy Useful Paper on Statistics

Want to know when to use Standard Deviation (SD) as opposed to Standard Error (SE) or a Confidence Interval (CI)? Then you should read this really useful paper in JCB about error bars in scientific papers. Here is just a sampling of their useful rules:

Rule 3: error bars and statistics should only be shown for independently repeated experiments, and never for replicates. If a "representative" experiment is shown, it should not have error bars or P values, because in such an experiment, n = 1...

Rule 4: because experimental biologists are usually trying to compare experimental results with controls, it is usually appropriate to show inferential error bars, such as SE or CI, rather than SD. However, if n is very small (for example n = 3), rather than showing error bars and statistics, it is better to simply plot the individual data points.
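For concreteness, here is a minimal Python sketch (hypothetical numbers; nothing in this code comes from the paper itself) of the three quantities those rules distinguish:

```python
import numpy as np
from scipy import stats

data = np.array([4.2, 5.1, 4.8, 5.5, 4.9])   # hypothetical independent repeats, n = 5
n = data.size

sd = data.std(ddof=1)                  # SD: how scattered the individual values are
se = sd / np.sqrt(n)                   # SE: uncertainty of the mean (SD / sqrt(n))
t_crit = stats.t.ppf(0.975, df=n - 1)  # two-sided 95% t multiplier
ci_half = t_crit * se                  # half-width of the 95% CI for the mean

print(f"mean   = {data.mean():.2f}")
print(f"SD     = {sd:.2f}  (descriptive)")
print(f"SE     = {se:.2f}  (inferential)")
print(f"95% CI = mean +/- {ci_half:.2f}")
# Per Rule 4: if n were only 3, skip the bars and just plot the raw points.
```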

They also have some handy tips on interpreting figures in papers -- such as this one about SEs:


Figure 5. Estimating statistical significance using the overlap rule for SE bars. Here, SE bars are shown on two separate means, for control results C and experimental results E, when n is 3 (left) or n is 10 or more (right). "Gap" refers to the number of error bar arms that would fit between the bottom of the error bars on the controls and the top of the bars on the experimental results; i.e., a gap of 2 means the distance between the C and E error bars is equal to twice the average of the SEs for the two samples. When n = 3, and double the length of the SE error bars just touch (i.e., the gap is 2 SEs), P is ~0.05 (we don't recommend using error bars where n = 3 or some other very small value, but we include rules to help the reader interpret such figures, which are common in experimental biology).
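If you want to sanity-check that overlap rule yourself, here is a quick Python sketch of my own. It assumes equal n and equal SEs in the two groups (which is what the figure depicts) and recovers a P of roughly 0.05 for a gap of 2 SEs at n = 3; the n = 10 line is simply the same arithmetic applied to a larger sample:

```python
import numpy as np
from scipy import stats

def p_from_gap(gap, n, se=1.0):
    """Two-tailed P from a standard two-sample t-test when the gap between
    the 1-SE error bars equals `gap` times the SE (equal n, equal SEs assumed)."""
    diff = (gap + 2) * se        # distance between the two means
    sd = se * np.sqrt(n)         # the SD that corresponds to this SE
    _, p = stats.ttest_ind_from_stats(mean1=0.0, std1=sd, nobs1=n,
                                      mean2=diff, std2=sd, nobs2=n)
    return p

print(f"n = 3,  gap = 2 SE -> P = {p_from_gap(2, 3):.3f}")    # close to 0.05
print(f"n = 10, gap = 1 SE -> P = {p_from_gap(1, 10):.3f}")   # also close to 0.05
```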

Definitely read the whole thing.

The craziest part is how many scientific papers still screw this up. My personal favorite is the use of SE with tiny n's, which hides the fact that you really can't be that confident about the mean.
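To put a number on that: the 95% CI half-width is the two-sided t multiplier times the SE, and that multiplier blows up at small n. A quick sketch (standard t quantiles, nothing paper-specific):

```python
from scipy import stats

for n in (3, 5, 10, 30):
    t_crit = stats.t.ppf(0.975, df=n - 1)   # two-sided 95% multiplier
    print(f"n = {n:2d}: 95% CI half-width = {t_crit:.2f} x SE")
# At n = 3 the multiplier is ~4.30, so an SE bar shows less than a quarter
# of the uncertainty a 95% CI would show.
```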

Hat-tip: Faculty of 1000.


The problem with "many scientific papers", particularly those in high-citation-index journals, goes way beyond this. Take a scan through a few issues of Science and Nature and see how many times 1) no error bars are provided when they should be, 2) one cannot tell what statistic (SEM, SD, etc.) the bars convey when they are provided, and 3) no proper inferential analysis is provided at all. "Real" biologists don't believe in statistics, so of course they don't use them properly when forced... The development of gene-array techniques, and the requisite need for biologists to "speak" inferential analyses, has been entertaining to watch, frankly.

On the other hand, those of us in the "sainted p value" type of fields can err on the other side: namely, the misconception that statistical analysis "reveals" or "demonstrates" an effect, as if something is only true (in the state-of-nature sense) if the experiment reaches P<0.05. This leads to all kinds of erroneous thinking and interpretation of real findings.

By Drugmonkey (not verified) on 30 Apr 2007 #permalink

"this really useful paper"

It was straightforward and useful, but I have some quibbles with it.

As I'm a physicist and not a biologist, I usually don't have as much use for the more formal statistical methods such as tests. Consequently, I had never heard of SE before reading blogs. I wish the paper had dropped the SE rules and pressed harder on the point that "CIs make things easier to understand".

But I would especially quibble with the discussion of replicates and with the implicit assumption that experimental results are normally distributed.

First, I don't think that "the errors would reflect the accuracy of pipetting" is correct; they would reflect the precision of the measurement. And usually you need to assess that precision and its distribution in order to do proper statistical testing.

So as a physicist I would probably break their suggested rules and put error bars on replicates specifically, perhaps forced to use ranges for n = 2 or 3, but preferably SD throughout.

Second, I would use these graphs or real tests to establish the distributions, and then a suitable test to compare groups. The rules seem fine when it is known that independent normal distributions are observed.
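In code terms I mean something like the following rough sketch (made-up numbers; the particular tests are just examples of the idea, not recommendations from the paper):

```python
import numpy as np
from scipy import stats

control = np.array([1.02, 0.95, 1.10, 0.98, 1.05, 0.99])   # hypothetical replicates
treated = np.array([1.21, 1.35, 1.18, 1.40, 1.25, 1.30])

def looks_normal(sample, alpha=0.05):
    """Shapiro-Wilk check: True if normality is not rejected at level alpha."""
    _, p = stats.shapiro(sample)
    return p > alpha

if looks_normal(control) and looks_normal(treated):
    stat, p = stats.ttest_ind(control, treated, equal_var=False)   # Welch t-test
    test = "Welch t-test"
else:
    stat, p = stats.mannwhitneyu(control, treated, alternative="two-sided")
    test = "Mann-Whitney U"

print(f"{test}: statistic = {stat:.2f}, P = {p:.3g}")
```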

But I like the later discussion of within-group correlation. Often in experimental physics you find that you advertently or inadvertently increase precision by exploiting short-term stability in a time series. But making later inferences about direct comparisons, real accuracy, or repeatable precision is then difficult.

By Torbjörn Larsson (not verified) on 30 Apr 2007 #permalink

"when it is known"

Actually, in an experimental situation I would prefer to be precise: "when it is established earlier".

By Torbjörn Larsson (not verified) on 30 Apr 2007 #permalink

Torbjörn, it's usually a good idea to give some measure of precision, even in physics. If, for example, you show two curves in a graph, the reader should be given some idea of the uncertainty in those estimates. In some cases there will be virtually none compared to the range of Y-values spanned by the curves, but in other cases there will be a lot.

This could be due to measurement uncertainty, or to the fact that the curve presents a Y vs. X model while several more independent variables (or input factors) varied during the measurements.
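For instance, something like this minimal matplotlib sketch (made-up data, just to illustrate the idea of giving a fitted curve a visible band of precision):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 40)
y = 2.0 * x + 1.0 + rng.normal(scale=1.5, size=x.size)   # hypothetical measurements

coeffs = np.polyfit(x, y, deg=1)          # straight-line fit
y_fit = np.polyval(coeffs, x)
resid_sd = np.std(y - y_fit, ddof=2)      # residual scatter (2 fitted parameters)

plt.plot(x, y, "o", label="measurements")
plt.plot(x, y_fit, label="fit")
plt.fill_between(x, y_fit - resid_sd, y_fit + resid_sd, alpha=0.3,
                 label="fit +/- 1 residual SD")
plt.xlabel("x")
plt.ylabel("y")
plt.legend()
plt.show()
```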