Bad Math from David Berlinski

I'm away on vacation this week, so this is a repost of one of the early GM/BM entries from when it was on Blogger. As usual, I've revised it slightly. Berlinski actually showed up and responded; a digest of the discussion back and forth is scheduled to appear here later this week.

-------------------------------

In my never-ending quest for bad math to mock, I was taking a look at the Discovery Institute's website, where I found an essay, On the Origin of Life, by David Berlinski. Bad math? Oh, yeah. Bad, sloppy, crappy math. Some of it just duplicates things I've criticized before, but there are a few different tricks in this mess.

Before I jump in to look at a bit of it, I'd like to point out a general technique that's used in this article. It's *very* wordy. It rambles, it wanders off on tangents, it mixes quotes from various people into its argument in superfluous ways. The point of this seems to be to keep you, the reader, somewhat off balance; it's harder to analyze an argument when the argument is so scattered around, and it's easier to miss errors when the steps of the argument are separated by large quantities of cutesy writing. Because of this, the section I'm going to quote is fairly long; it's the shortest I could find that actually contained enough of the argument I want to talk about to be coherent.

>The historical task assigned to this era is a double one: forming chains of
>nucleic acids from nucleotides, and discovering among them those capable of
>reproducing themselves. Without the first, there is no RNA; and without the
>second, there is no life.
>
>In living systems, polymerization or chain-formation proceeds by means of the
>cell's invaluable enzymes. But in the grim inhospitable pre-biotic, no enzymes
>were available. And so chemists have assigned their task to various inorganic
>catalysts. J.P. Ferris and G. Ertem, for instance, have reported that activated
>nucleotides bond covalently when embedded on the surface of montmorillonite, a
>kind of clay. This example, combining technical complexity with general
>inconclusiveness, may stand for many others.
>
>In any event, polymerization having been concluded--by whatever means--the result
>was (in the words of Gerald Joyce and Leslie Orgel) "a random ensemble of
>polynucleotide sequences": long molecules emerging from short ones, like fronds
>on the surface of a pond. Among these fronds, nature is said to have discovered
>a self-replicating molecule. But how?
>
>Darwinian evolution is plainly unavailing in this exercise or that era, since
>Darwinian evolution begins with self-replication, and self-replication is
>precisely what needs to be explained. But if Darwinian evolution is unavailing,
>so, too, is chemistry. The fronds comprise "a random ensemble of polynucleotide
>sequences" (emphasis added); but no principle of organic chemistry suggests
>that aimless encounters among nucleic acids must lead to a chain capable of
>self-replication.
>
>If chemistry is unavailing and Darwin indisposed, what is left as a mechanism?
>The evolutionary biologist's finest friend: sheer dumb luck.
>
>Was nature lucky? It depends on the payoff and the odds. The payoff is clear:
>an ancestral form of RNA capable of replication. Without that payoff, there is
>no life, and obviously, at some point, the payoff paid off. The question is the
>odds.
>
>For the moment, no one knows how precisely to compute those odds, if only
>because within the laboratory, no one has conducted an experiment leading to a
>self-replicating ribozyme. But the minimum length or "sequence" that is needed
>for a contemporary ribozyme to undertake what the distinguished geochemist
>Gustaf Arrhenius calls "demonstrated ligase activity" is known. It is roughly
>100 nucleotides.
>
>Whereupon, just as one might expect, things blow up very quickly. As Arrhenius
>notes, there are 4^100 or roughly 10^60 nucleotide sequences that are 100
>nucleotides in length. This is an unfathomably large number. It exceeds the
>number of atoms contained in the universe, as well as the age of the universe
>in seconds. If the odds in favor of self-replication are 1 in 10^60, no betting
>man would take them, no matter how attractive the payoff, and neither
>presumably would nature.
>
>"Solace from the tyranny of nucleotide combinatorials," Arrhenius
>remarks in discussing this very point, "is sought in the feeling
>that strict sequence specificity may not be required through all
>the domains of a functional oligmer, thus making a large number of
>library items eligible for participation in the construction of the
>ultimate functional entity." Allow me to translate: why assume that
>self-replicating sequences are apt to be rare just because they are long? They
>might have been quite common.

>They might well have been. And yet all experience is against it. Why
>should self-replicating RNA molecules have been common 3.6 billion
>years ago when they are impossible to discern under laboratory
>conditions today? No one, for that matter, has ever seen a ribozyme
>capable of any form of catalytic action that is not very specific in its
>sequence and thus unlike even closely related sequences. No one has ever seen a
>ribozyme able to undertake chemical action without a suite of enzymes in
>attendance. No one has ever seen anything like it.
>
>The odds, then, are daunting; and when considered realistically, they are even
>worse than this already alarming account might suggest. The discovery of a
>single molecule with the power to initiate replication would hardly be
>sufficient to establish replication. What template would it replicate against?
>We need, in other words, at least two, causing the odds of their joint
>discovery to increase from 1 in 10^60 to 1 in 10^120. Those two sequences would
>have been needed in roughly the same place. And at the same time. And organized
>in such a way as to favor base pairing. And somehow held in place. And buffered
>against competing reactions. And productive enough so that their duplicates
>would not at once vanish in the soundless sea.
>
>In contemplating the discovery by chance of two RNA sequences a mere 40
>nucleotides in length, Joyce and Orgel concluded that the requisite "library"
>would require 10^48 possible sequences. Given the weight of RNA, they observed
>gloomily, the relevant sample space would exceed the mass of the earth. And
>this is the same Leslie Orgel, it will be remembered, who observed that "it was
>almost certain that there once was an RNA world."
>
>To the accumulating agenda of assumptions, then, let us add two more: that
>without enzymes, nucleotides were somehow formed into chains, and that by means
>we cannot duplicate in the laboratory, a pre-biotic molecule discovered how to
>reproduce itself.

Ok. Lots of stuff there, huh? Let's boil it down.

The basic argument is the good old *big numbers* argument. Berlinski wants to come up with some really whoppingly big numbers to make things look bad. So, he makes his first big numbers appeal by looking at polymer chains that could have self-replicated. He argues (not terribly well) that the minimum length for a self-replicating polymer is 100 nucleotides. From this, he then argues that the odds of creating a self-replicating chain are 1 in 10^60.
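The raw count, at least, is easy to check. Here's a minimal Python sketch of the arithmetic; the function and constant names are mine, and the only inputs are the figures quoted above (four nucleotides, chains of length 100).

```python
import math

SEQUENCE_LENGTH = 100  # nucleotides, the length quoted from Arrhenius
ALPHABET_SIZE = 4      # four possible nucleotides at each position


def count_sequences(length: int, alphabet_size: int = ALPHABET_SIZE) -> int:
    """Number of distinct sequences of the given length."""
    return alphabet_size ** length


total = count_sequences(SEQUENCE_LENGTH)
print(f"4^100 has {len(str(total))} digits")       # 4^100 has 61 digits
print(f"log10(4^100) = {math.log10(total):.2f}")   # log10(4^100) = 60.21
```

So "roughly 10^60" is a fair rounding of 4^100. All this counts is how many sequences exist; it says nothing about how many of them can replicate.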

Wow, that's a big number. He goes to some trouble to stress just what a whopping big number it is. Yes, Dave, it's a big number. In fact, it's not just a big number, it's a bloody *huge* number. The shame of it is, it's *wrong*; and what's worse, he *knows* it's wrong. Right after he introduces it, he quotes a geochemist who pointed out that those are stupid odds, because there's probably more than one replicator in there. In fact, we can be pretty certain that there's more than one: we know lots of ways of modifying RNA/DNA chains that don't affect their ability to replicate. How many of those 10^60 cases self-replicate? We don't know. Berlinski just handwaves. Let's look again at how he works around that:

>They might well have been. And yet all experience is against it. Why should
>self-replicating RNA molecules have been common 3.6 billion years ago when they
>are impossible to discern under laboratory conditions today? No one, for that
>matter, has ever seen a ribozyme capable of any form of catalytic action that
>is not very specific in its sequence and thus unlike even closely related
>sequences. No one has ever seen a ribozyme able to undertake chemical action
>without a suite of enzymes in attendance. No one has ever seen anything like
>it.

So - first, he takes a jump away from the math, so that he can wave his hands around. Then he tries to strengthen the appeal to big numbers by pointing out that we don't see simple self-replicators in nature today.

Remember what I said in my post about Professor Culshaw, the HIV-AIDS denialist? You can't apply a mathematical model designed for one environment in another environment without changing the model to match the change in the environment. The fact that it's damned unlikely that we'll see new simple self-replicators showing up *today* is irrelevant to discussing the odds of them showing up billions of years ago. Why? Because the environment is different. In the days when a first self-replicator developed, there was *no competition for resources*. Today, any time you have the set of precursors to a replicator, they're part of a highly active, competitive biological system.

And then, he goes back to try to re-invoke the big-numbers argument by making it look even worse; and he does it by using an absolutely splendid example of bad combinatorics:

>The odds, then, are daunting; and when considered realistically, they are even
>worse than this already alarming account might suggest. The discovery of a
>single molecule with the power to initiate replication would hardly be
>sufficient to establish replication. What template would it replicate against?
>We need, in other words, at least two, causing the odds of their joint
>discovery to increase from 1 in 10^60 to 1 in 10^120. Those
>two sequences would have been needed in roughly the same place. And at the same
>time. And organized in such a way as to favor base pairing. And somehow held in
>place. And buffered against competing reactions. And productive enough so that
>their duplicates would not at once vanish in the soundless sea.

The odds of one self-replicating molecule developing out of a soup of pre-biotic chemicals are, according to Berlinski, 1 in 10^60. But then, the replicator can't replicate unless it has a "template" to replicate against; and the odds of that are also, he claims, 1 in 10^60. Therefore, the probability of having both the replicator and the "template" is the product of the probabilities of each one, or 1 in 10^120.

The problem? Oh, no biggie. Just totally invalid false independence. The simple product rule for probabilities, P(A and B) = P(A) × P(B), only works if the two events are completely independent. They aren't. If you've got a soup of nucleotides and polymers, and you get a self-replicating polymer, it's in an environment where the "target template" is quite likely to occur. So the events are *not* independent - and so you can't just multiply the two probabilities.
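To make the point concrete, here's a small sketch of the general product rule, P(A and B) = P(A) × P(B given A). The 1-in-10^60 figure is Berlinski's, taken at face value purely for illustration. The 1-in-1000 conditional probability is a number I made up to show how the result moves; it is not an estimate of anything.

```python
from fractions import Fraction


def joint_probability(p_a: Fraction, p_b_given_a: Fraction) -> Fraction:
    """General product rule: P(A and B) = P(A) * P(B | A)."""
    return p_a * p_b_given_a


# Berlinski's figure for a single replicator, taken at face value.
p_replicator = Fraction(1, 10**60)

# Independent case: P(B | A) is just P(B), which gives his 1 in 10^120.
independent = joint_probability(p_replicator, Fraction(1, 10**60))

# Dependent case: a purely hypothetical P(B | A) of 1 in 1000, standing in
# for "a template is much more likely once a replicator already exists".
dependent = joint_probability(p_replicator, Fraction(1, 1000))

print(independent == Fraction(1, 10**120))  # True
print(dependent == Fraction(1, 10**63))     # True
```

The only way to get to 1 in 10^120 is to assume that the existence of a replicator tells you nothing at all about the existence of a template. Any dependence between the two changes the answer, potentially by dozens of orders of magnitude.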

Oh, and he repeats the same error he made before: assuming that there's exactly *one* "template" molecule that can be used for replication.
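The "exactly one" assumption matters a lot, and a quick sketch shows why. If some number of the possible sequences would do the job, the chance that a random sequence is one of them is that number divided by the total. The candidate counts below are hypothetical placeholders (nobody knows the real figure), and the function name is mine.

```python
from fractions import Fraction

TOTAL_SEQUENCES = 4 ** 100  # all chains of length 100


def chance_of_hit(functional_count: int, total: int = TOTAL_SEQUENCES) -> Fraction:
    """Chance that one random sequence is among the functional ones."""
    return Fraction(functional_count, total)


# Hypothetical counts of workable sequences; only the first matches
# the "exactly one" assumption.
for functional_count in (1, 10**10, 10**30):
    p = chance_of_hit(functional_count)
    print(f"{functional_count:.0e} workable sequences -> about 1 in {float(1 / p):.1e}")
```

Every factor of ten in the count of workable sequences takes a factor of ten off the odds, so the headline number is only as good as the guess that the count is one.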

And even that is just looking at a tiny aspect of the mathematical part: the entire argument about a template is a strawman; no one argues that the earliest self-replicator could only replicate by finding another perfectly matched molecule of exactly the same size that it could reshape into a duplicate of itself.

Finally, he rehashes his invalid-model argument: because we don't see primitive self-replicators in today's environment, that must mean that they were unlikely in a pre-biotic environment.

This is what mathematicians call "slop". A pile of bad reasoning based on fake numbers pulled out of thin air, leading to assertions based on the presence of really big numbers. All of which is what you expect from an argument that deliberately uses wrong numbers, invalid combinatorics, and misapplication of models. It's hard to imagine what else he could have gotten wrong.

While you're revising, you might spellcheck Berlinski's name. :)

(Although if there *was* a separate David Berlinksi--maybe an evil twin, maybe an alternate personality--it would help explain that weird self-interview he did...)

By Anton Mates (not verified) on 28 Aug 2006

Since Berlinski is under discussion, I would like to pre-clarify something that will undoubtedly come up in the comments. Although Berlinski is a Senior Fellow of the Discovery Institute's Center for Science & Culture, he is on the record as not endorsing Intelligent Design. From a Knight-Ridder article of September 27, 2005:

But in an e-mail message, Berlinski declared, "I have never endorsed intelligent design."

You seem to be claiming that muddled thinking is a tactic with a purpose.

That's absurd. Muddled thinking is just muddled thinking. It's not a plan with the purpose of hiding holes in an arguement. The muddled thining is the reasons for the holes in the argument, not something that comes along later to cover the holes.

The muddled thining is the reasons for the holes in the argument...

Yeah, that sorta makes sense. :)

By Pete Dunkelberg (not verified) on 04 Mar 2007