The Significance of Negative Data

Yes you've guessed it, I'm in Italy. Here is another entry dealing with scientific thinking.

Spurred on by some comments left by Coturnix on the Three Types of Experiments entry, and by the Microparadigm paper (see my entry, and another discussion of this paper at In the Pipeline), I now present to you ... the significance of negative data.

Now most of the older (and well read) philosophers of science such as Kuhn, Popper and Feyerabend were obsessed with the physical sciences, and as Ernst Mayr has pointed out in several books, their ideas are less applicable to the life sciences. Even the principle of Occam's Razor often fails biology. The main mechanisms that operate within a living organism arose from evolution, a producer of diversification. Every system (think signal cascades, morphogen programs, circadian mechanisms, molecular motors) is a baroque assembly of proteins, each with its own intricate parts that may resemble those of related proteins but have diverged to meet the needs of the organism at this point in time. At first glance, our assumptions as to what the underlying mechanisms "should be" are often much more simplistic than the mechanisms that are eventually found. Often this is not because nature is weird but because we make naive assumptions about natural systems.

So how does one study biology?

Well, in simplistic terms, lots of biology is exploratory ... much more so than the physical sciences. We are explorers of the weird and wonderful mechanisms produced by natural selection.

Hold on ... is this really true for all research done in biology?

At this point we come back to the three types of experiments. From my previous post:

Type A Experiment: every possible result is informative.
Type B Experiment: some possible results are informative, other results are uninformative.
Type C Experiment: every possible result is uninformative.

There is even a little saying that accompanies this ...

The goal is to maximize type A and minimize type C.

In that post I then stated that type B experiments were actually the most important. Well, let's rephrase our three types of experiments in terms that more accurately describe their function in how the life sciences operate:

Type B Experiment: exploratory science
Type A Experiment: explanatory science
(Type C Experiment: forget about it, or ask yourself "what am I trying to demonstrate?")

Just like an expedition team, as a life scientist you first attempt to find something new (type B experimentation). Once the strange and wonderful product of natural selection is discovered, you then work out the details of the mechanism (type A experimentation).

As you can see my argument is that:

Type B Experiment = exploratory science = some possible results are informative, other results are uninformative

What do I mean? When exploring the vast possible space of what "might be out there" you will often get negative data. The question becomes: is that negative data informative? (This is the point that Coturnix and others have raised.) To a certain degree, all data is informative. However I would argue that negative data from exploratory experiments is not that informative for two reasons:

1 - In exploring the vast space of possibilities, to say that a particular biological mechanism does not occur is not saying very much. If I told you that I couldn't find any evidence for my crazy idea, you would probably reply "big deal". But if I had convincing evidence that supported my crazy idea, you would say "wow" (I hope).
2 - A failure to find such a mechanism can occur through an improper experiment. Furthermore the reason that a certain experiment is "improper" may not be apparent to any reasonable researcher. If I told you that I couldn't find any evidence for my crazy idea, you could always say "maybe you didn't try hard enough". Or you could say "maybe you used the wrong technique". This is why grad students and postdocs on occasion hate their PIs.
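
One way to make these two reasons concrete is a quick Bayesian back-of-the-envelope calculation. The sketch below is only an illustration: the function name `posterior` and all three numbers (a low prior for the "crazy idea", a 30% chance that the experiment detects a real mechanism, a 1% false positive rate) are assumptions I made up, not measurements of anything.

```python
def posterior(prior, p_result_if_true, p_result_if_false):
    """Bayes' rule: probability the hypothesis is true after seeing a result."""
    evidence = prior * p_result_if_true + (1 - prior) * p_result_if_false
    return prior * p_result_if_true / evidence


# Hypothetical numbers, chosen only for illustration.
prior = 0.05           # a "crazy idea": unlikely before the experiment
sensitivity = 0.30     # chance the experiment detects the mechanism if it exists
false_positive = 0.01  # chance of a positive result if the mechanism does not exist

# Belief in the idea after a positive result and after a negative result.
after_positive = posterior(prior, sensitivity, false_positive)
after_negative = posterior(prior, 1 - sensitivity, 1 - false_positive)

print(round(after_positive, 2), round(after_negative, 3))
```

With these made-up inputs a positive result lifts the idea from 5% to roughly 61%, while a negative result only lowers it to about 3.6%. The low prior is reason 1, and the low chance of detecting a real mechanism (the "maybe you used the wrong technique" problem) is reason 2.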

I'll explain with three examples, one fictional, two real.

Example 1: The lunar cell cycle check point (fictional) from a comment I left:

Say that your hypothesis is the lunar check point ... cells divide normally unless there is a full moon. Well that would be easy to test - measure doubling rates of your cells every day and see if there is a drop in multiplication the day that there is a full moon. Now if the drop happens, you've gained some potential insight: "Ha, the moon may activate a cell division check point - although my evidence is correlative". And that would be a major discovery. But if you never saw a decrease in cell division, you would have negative data. This (IMHO) makes this a type B experiment.

Now, negative data is data. But it's weak data. It doesn't formally disprove the hypothesis. For example one may say - "Well these are tissue culture cells and so they are transformed and maybe missing the check point, but perhaps other cells have the check point." or "Maybe the cells have to be grown in direct contact with moon light for the checkpoint to be activated" or "Maybe a subset of cells in the hypothalamus detect the signal and send a secondary signal to other cells" ... etc.
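
The comparison in this fictional experiment could be sketched roughly as follows. Everything here is hypothetical: the function `full_moon_drop`, the daily doubling rates and the choice of which day counts as the full moon are invented for illustration, and a real analysis would need many lunar cycles and a proper statistical test.

```python
from statistics import mean


def full_moon_drop(doubling_rates, full_moon_days):
    """Mean doubling rate on ordinary days minus the mean rate on full-moon days."""
    moon = [r for day, r in enumerate(doubling_rates) if day in full_moon_days]
    other = [r for day, r in enumerate(doubling_rates) if day not in full_moon_days]
    return mean(other) - mean(moon)


# Hypothetical daily doubling rates (divisions per day); day 3 is the full moon.
full_moon_days = {3}
rates_no_effect = [1.0, 1.1, 0.9, 1.0, 1.0, 1.0]
rates_with_drop = [1.0, 1.0, 1.0, 0.5, 1.0, 1.0]

negative_result = full_moon_drop(rates_no_effect, full_moon_days)
positive_result = full_moon_drop(rates_with_drop, full_moon_days)
```

In the first made-up data set the difference is essentially zero (the negative result), in the second it is 0.5 divisions per day. Note that the zero says nothing about whether other cell types, or cells grown in moonlight, would behave differently.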

Example 2: Neurogenesis.

It was long believed that you are born with all the neurons that you'll ever have. Researchers have recently discovered that this is FALSE. From a previous entry:

Dan Gilbert explained that stress came in two forms: a positive stress that challenges us and that may enhance neurogenesis, and a negative stress (think despair) that actually inhibits neurogenesis. Animals in cages are under heavy negative stress, and as Gilbert explained "that's why no one ever observed neurogenesis in the lab."

Now to get back to the three types of experiments ... Many inexperienced researchers fail to perform control experiments and this often transforms explanatory science from type A into type B experiments (or into type C experiments). It is important to recognize that if you aren't performing exploratory experiments you should try to perform enough controls so that any particular result that you get from your experiment is informative. You should also perform controls when you undertake exploratory science, but it's hard to control for everything.

Last example:

How could Ferdinand and Isabella control for Columbus' exploratory trip to China? And he made a big discovery - not the one he was looking for. But even if he was just looking for new land it would have been hard to perform a control experiment. And if Columbus turned back a day before he reached the Caribbean, could he have said, "there is no land out there"?

I have thought for a while now that a Journal of Negative Results could be very useful, btw. It would serve as a means of seeing how previous researchers have tried to address a problem, and let future workers avoid making the same mistakes.

By Mark Wright (not verified) on 28 Sep 2007

I love good posts on methodology.

Even the principle of Occam's Razor often fails biology.

I would say that the principle is fine, but it is less likely to give correct theories as default in biology.

Also AFAIU the razor is used with great effect in specific areas. Likelihood methods are used in cladistics, and interpreted as Bayesian likelihoods they express parsimony.

The principle of parsimony is used for several reasons:

- It suggests using simpler formal theories first, which are simpler to test and less likely to contain errors.
- It gives fewer reversals in knowledge, ie the theory may be wrong but we save on number of failures.
- Nature is basically simple for reasons of symmetries.

But the basic simplicity of fundamental theories guarantees that applications are complicated instead. For example in physics, which I know more of, EM theory may be simple but the near- to farfield coupling and its description around an antenna is anything but. And evolutionary biology may have a simple formal principle in inheritance, but the applications (first mechanisms and then, twice removed, specific outcomes) are anything but.

On top of that inherent complexity evolution is the ultimate tinkerer. Life is just not fair. :-)

By Torbjörn Larsson, OM (not verified) on 28 Sep 2007

Occam's Razor doesn't fail in all areas of biology; genetics, especially genetics from the mid-20th century, used this a ton. Jacob and Wollman, for example, brilliantly applied it to deducing that bacteria had a circular genome, though the particular mechanism they cited was a little inaccurate.