A recent essay in Nature, titled "A question of class" (by Jeffrey Parsons and Yair Wand), puts the case that classification is crucial to science and needs to be understood. They hold, as I do, that a poor understanding of classification - particularly of the concepts/words "class" and "category" - leads to unproductive and dangerous conclusions within science. But I don't think they get there quite yet...
Classification - the act of putting things into classes - is something that every science does, ranging from elements and planets, to diseases, taxa and functions. The authors make the following, it seems to me entirely arbitrary, specification:
In science, classification is rarely related to survival so directly, yet the underlying principles are the same. Phenomena can be categorized in many ways on the basis of the properties they share. However, categories are useful only if they make it possible to infer further information, and only if they do so consistently and over a reasonable time period. To distinguish a general category from a more useful one with inferences, we call the latter a 'class'.
Whereas a category simply reflects a repeating pattern of properties, a class additionally indicates that relationships exist between these properties, even if the mechanisms behind the relationships are unknown. For example, all the things in a room make up a category (they share the property of being in the room), but not necessarily a class; their presence in the room might not reflect some deeper underlying principle.
In fact, in traditional logic class and category are pretty much the same thing.* That you have put something into a class or category merely means that things that fall under it share some properties. These can be properties in the mind of the categoriser, or they can be shared properties out there in the world.
But something like this distinction is required, for reasons I will explain. We need to make a distinction between things gathered for conceptual reasons into an equivalence class, and things that just are in an equivalence class because they in fact have related properties. Rather than take up the confusing terminology of Parsons and Wand, I'll rely on a prior distinction between type and taxon. A type is a class that is formed merely for conceptual purposes; that is, it exists in the head of the cogniser. A taxon is a class that is formed in virtue of sharing causal properties and historical origins.
Let's consider their examples. A "planet", as defined by the International Astronomical Union in the recent redefinition, excludes Pluto and other "planetoids" not because different causes are involved in their formation, but for purposes of convenience. The same causes are involved in the making of Jupiter and Pluto, but we need to have a classification scheme to deal with the increasing number of Trans-uranic objects of significant size.
But in the case of disease, we classify in various ways. Sometimes we know what the etiology of the disease is and classify it that way. Other times we merely have a group of similar diseases or symptoms, and classify together on that basis. In fields where the etiologies remain opaque, such as psychiatry, we only have classification by similarity, and this leads to trashcan categories - groups of everything else. The former are taxa. The latter are types. Types are formed by, as it were, superficial similarities. Taxa are formed on the basis of large numbers of properties, where the underlying causal etiology is opaque, or by a few causal properties where the etiology is known. Consider the DSM IV. There is a category of "autistic spectrum disorders", which includes full autism and Asperger's Syndrome. So far as I can tell, these are quite distinct etiologies, and quite distinct symptomatologies. Putting them together for reasons of resemblance interferes with investigation of the causes and behaviours of these conditions.
In biology, there has been a longstanding issue about what counts as a biological classification, and since around 1975 or so the debate has largely been among three positions: those who use a similarity measure based on arbitrary choices of properties, called numerical taxonomy or phenetics (from the Greek phainero, to seem); those who hold that etiology (in this case genealogy of taxa) is sufficient, called phylogenetic systematics, or cladistics; and those who wish to use some more or less arbitrary mixture, called evolutionary systematics. The traditional Linnaean system can be applied to all of these approaches, and so does not represent anything methodologically deep, but rather only a matter of convention. However, those who use similarity metrics often defend Linnaean systematics, although not all who defend it are pheneticists. Some cladists prefer the Linnaean system as it "stores" in its descriptions a lot of information about the organisms, and wish merely to ensure that Linnaean groups are monophyletic; which is to say, that a Linnaean group of any rank does not exclude parts of an evolutionary branch based on similarities.
Philosopher Manfred Laubichler, of Arizona State University, has said that the difference between phenetics and cladistics is that in phenetics, the relationship is similarity, as I have said here, but the relationship between cladistic objects is one of identity. What does this mean? Take whales - a more derived form of mammals one cannot expect to find - no hindlimbs, no fur, often no teeth. And yet they are still mammals. Not "like" mammals, they are mammals. That is, they are "the same" as any other mammal. Even if they lost their mammary glands and started to bear young in egg sacs, they would still be mammals.
The reason why this identity matters is that even if whales lost some "key" characters of being mammals, like fur, milk, and live birth, they would still have a vast number of retained mammalian characters, and we could inductively project that they would, for example, share mammalian cytochrome C genes, and have evidence of genes that grow fur, mammary glands and hindlimbs still in their genomes. We could project their overall metabolism on a knowledge of their relatives. We would know that, if the nearest non-whale relatives of whales had a particular hormone or enzyme, so too would those whales.
But on a resemblance criterion, this is not inductively projectible. All that a class formed from resemblance allows you to infer is that the members, by definition, resemble each other in various ways you have set up by using the resemblance criterion. Imagine a class of warm blooded organisms (actually proposed by Julian Huxley as Homeothermia). If you know something is a member of that class, what can you infer? That it has warm blood, and nothing else. If you know something is a member of Cetacea, what do you know? Lots. You know its reproductive organs, internal physiology, skeletal elements, hearing structures, etc., no matter how derived or modified they may be.
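The contrast can be made concrete with a small sketch in Python. It is only an illustration under stated assumptions: the tree, the trait labels and the function names (infer_from_taxon, infer_from_type) are hypothetical toy choices, not a real dataset or anyone's published method. Membership in a taxon lets us project everything inherited along the lineage, whereas membership in a type returns only the properties used to define it.

```python
# Toy illustration only: the tree and trait labels below are simplified
# assumptions chosen for the example, not real character data.

# Each taxon points to the more inclusive group it is nested within.
PARENT = {
    "Cetacea": "Artiodactyla",
    "Artiodactyla": "Mammalia",
    "Chiroptera": "Mammalia",
    "Mammalia": "Amniota",
    "Aves": "Amniota",
}

# Traits recorded at the group where (for this sketch) they are assumed to arise.
CLADE_TRAITS = {
    "Amniota": {"amniotic membranes"},
    "Mammalia": {"three middle-ear bones", "genes for fur", "genes for mammary glands"},
    "Cetacea": {"underwater hearing structures"},
    "Aves": {"feathers"},
}

# A type is defined by a resemblance criterion and nothing else.
TYPE_DEFINITIONS = {
    "Homeothermia": {"warm blood"},
}


def infer_from_taxon(taxon):
    """Collect every trait projected onto a taxon by descent (walk up the tree)."""
    traits = set()
    node = taxon
    while node is not None:
        traits |= CLADE_TRAITS.get(node, set())
        node = PARENT.get(node)  # None once we pass the root
    return traits


def infer_from_type(type_name):
    """A type licenses only the properties that were used to define it."""
    return set(TYPE_DEFINITIONS[type_name])


if __name__ == "__main__":
    print(sorted(infer_from_taxon("Cetacea")))
    print(sorted(infer_from_type("Homeothermia")))
```

Knowing that something is in Cetacea yields the whale-specific trait plus everything recorded for Mammalia and Amniota; knowing that something is in Homeothermia yields "warm blood" and stops there.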
Now there's no reason why a type and a taxon cannot be coextensive, and a reason why many scientists are confused about this is that they often, but not always, are. Types are formed by the sharing of lots of superficial properties, and these are, in biology at any rate, often as much the result of shared evolutionary history as the deeper causal properties. But to investigate the natures of taxa, one must use etiological properties (such as homologies in biology), not superficial ones, no matter how useful the superficial ones are for diagnosis or identification.
Something that I think the oft-derided philosopher Alfred North Whitehead got very right is what he called the fallacy of misplaced concreteness. Here the mistake is to think that because you have a noun, there is a thing that the noun names. In taxonomy, scientists often mistake the name for the thing it names, or even conclude that the name of a class indicates there actually is a thing. This is a reason why types can mislead - have a type name and you soon start to think in terms of there being some natural thing the type names. But if the name is merely one of convenience, that is an error in inference and science. And this is the problem that Parsons and Wand do not quite reach.
* The real distinction between a class and a category is between a group formed by its members' properties (a class), and a group formed by a predicate, or propositional attribute, of its members. In short, a class may be non-conceptual, while a category is conceptual and linguistic. It doesn't matter here.
Haha, finally, distinction between "real relationships" and "imagined relationships." (imagined relationships: kind of like "Colubridae")
Thank you, John - this will help me to clarify my thinking.
So I guess you're accusing Parsons and Wand of a category error? :-)
To quibble, does Devil Facial Tumour Disease qualify as a more derived mammal than a whale?
"Autistic Spectrum Disorders" is not a DSM-IV category.
There is a category called "Pervasive Developmental Disorders" which includes the specific diagnosis of autism, Asperger syndrome, PDD-NOS, Rett syndrome, and CDD (childhood disintegrative disorder).
There are many doubts as to whether Rett's belongs in this category, and it's unclear why Rett's is in the DSM when Down syndrome and Fragile X (and so on) aren't. And very little is known about CDD, which is very rare.
On the other hand, you would find wide consensus that Asperger's is a form of autism. The remaining disagreement is whether Asperger's is distinct from autism. Many papers provide evidence and arguments in either direction.
You would have no problems finding a lot of papers and researchers who take the position that there is no difference between autism and Asperger's.
Also, Asperger individuals almost invariably also meet DSM-IV autism criteria which, because of how the DSM-IV is written (there is a priority rule), makes diagnosing Asperger's nearly impossible. Asperger individuals who've been involved in my work (where the distinction between autism and Asperger's is made) have all met DSM-IV autism criteria as well as autism cut-offs on the 2 gold-standard autism diagnostic instruments.
Re "etiologies," autism and Asperger's both are in the large majority idiopathic, with only a small percentage (less than 20%) of individuals having a possible etiology (e.g. an associated genetic syndrome, like Fragile X or West syndrome). The distinction between etiological and idiopathic autism or Asperger's is important and is routinely made in autism research.
Would you agree that using "life" as a noun is an example of the fallacy of misplaced concreteness?
Isn't schizophrenia used as a rather broad classification that includes quite a wide variety of symptoms? My impression is that when a practitioner isn't sure how to classify the patient, he/she is deemed schizophrenic. I am probably wrong.
While I do not have an in-depth knowledge myself and would be quite keen to learn more on the subject, I was under the impression that cladistics comes under fire quite a lot for not actually revealing etiology. That due to the nature of cladistic analysis all you can identify is types because of the way it compares characters which have been arbitrarily established as important by the person creating the matrix.
But even as I write this I realise that perhaps that is only a major limitation of cladistics in terms of its use in palaeontology.
How is cladistics more generally applied?
Ernest, that's an interesting question, although I find it hard to quantify "more derived" without some idea of the characters used to measure it. HeLa also raises that problem.
Michelle, thanks for the clarification. But I remain convinced that there is no basic etiological commonality between Asperger's and classical autism. I say this as a probable Aspie.
Jim: Yes. "Life" is what physics does in a locale that interests biologists, that's all.
pubcat: I said that we set up etiological classes either because we know the etiology, or because we have used deep properties that are likely to give us taxa. Cladistics cannot tell you what the etiologies actually are - that comes before we do cladistics. I once asked Gareth Nelson what criteria he used for character inclusion for his cladograms, and he replied "I hope I would know my organisms by now!" (His organisms were fishes, of course.) Cladistics may shed light on what characters are informative, but it is not freestanding: you need to know something about the organisms first. And eliminating homoplasies (which are similarities, of course) requires knowledge of the etiologies to some degree.
I believe that all organisms with a significant brain are evolutionarily hard-wired to categorize objects (and, in humans at least, concepts). Categorization allows inferences that have survival value. (I haven't seen this kind of animal before, but it looks like a big cat, so I'll avoid it as I would a leopard.) Imagine being an organism that treats EVERY entity as entirely novel. After dealing with object 1, the organism would have to deal with object 2 as if it were completely novel, even if it is just a different side of the same leaf. [Such a creature wouldn't even be able to have this thought, because 'leaf' is itself a category]. There could be no learning because no lesson could be generalized. Survival would be a matter of pure luck.
Perhaps categorization is the fundamental purpose of a brain.
The human need to categorize is so deep that it is impossible to even COMMUNICATE without doing it. Nouns are names for categories of objects or ideas; verbs are names for categories of actions; adjectives.... I don't believe I need to belabor the point, but without categories, language is impossible. Even unspoken communication would be impossible because we could not infer a meaning from a gesture without associating it with some category of meaning.
I think the scientific preoccupation with classification is a natural (and sometimes overzealous?) extension of an innate hard-wired behavior. We just have to do it.
p.s. not "Trans-uranic", but "trans-Neptunian".
And, off-topic: how many of your readers have been accused of having Asperger syndrome? (I include myself and at least 5 of my 8 closest colleagues). I only ask this because I think its most basic symptom is an ability for intense concentration on a subject (and a preoccupation with numbers). It seems that this ability is almost a prerequisite for a career in science.
Ability to categorize is a basic characteristic of all living things. All living things, on encountering an object, place it in one of four categories: ignore it, run from it, try to eat it, try to make love to it. The categories may overlap, and some living things have other categories.
Jim, I find that an almost incomprehensible claim. Do you think that merely because living things reliably react to stimuli they are somehow classifying those stimuli? If so, then rocks classify hard objects that hit them, according to those that crush them and those that don't. Humans classify, as do most organisms with higher central nervous systems, but to really do it properly, the organism must have a cognitive map of the world, and that eliminates most organisms and many that have CNSs. Without that cognitive map, and one that affects future behaviour, all we can say is that there is a mapping relation between stimuli and response, and not that it is a classificatory one.
The human need to categorize is so deep that it is impossible to even COMMUNICATE without doing it.
That is quite true. However, a distinction I suspect that John is making is one of "natural" vs "artificial" aggregation. For example, I've always been a big fan of prototyping computer languages as opposed to traditional statically typed languages. The latter have types which are organized based on subjective "designer" constraints, whereas prototyping languages automatically group objects together more naturally, based on their experimental history and usage taxonomy. In prototyping languages, there are either no types ("typeless"), or every object is a type unto itself - depending on your perspective. However, every object will still always have a "taxon", based on its derivation. Thus, prototyping languages always seemed to me a more natural environment for experimenting with evolutionary programming (if a mutable "object" is what is being evolved). Static types or classes are just too rigid. The downside is that, as objects evolve, it becomes difficult to mentally categorize them. The upside is that many novel objects are free to evolve, freed from human preconception.
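A rough sketch of that idea, written in Python rather than in an actual prototype-based language. Everything here is a hypothetical illustration: the Proto class and its derive, get, lineage and descends_from methods are invented for the example and do not correspond to any particular language's object model. Objects carry no declared type, only a record of what they were derived from, so an object can diverge in every visible slot and still belong to its ancestor's "taxon".

```python
class Proto:
    """A minimal prototype-style object: no declared type, only a parent."""

    def __init__(self, parent=None, **slots):
        self.parent = parent      # the object this one was derived from
        self.slots = dict(slots)  # properties set directly on this object

    def derive(self, **overrides):
        """Create a descendant that inherits slots unless it overrides them."""
        return Proto(parent=self, **overrides)

    def get(self, name):
        """Look a slot up on this object, then on each ancestor in turn."""
        obj = self
        while obj is not None:
            if name in obj.slots:
                return obj.slots[name]
            obj = obj.parent
        raise AttributeError(name)

    def lineage(self):
        """Return the chain of objects from this one back to the root."""
        chain = []
        obj = self
        while obj is not None:
            chain.append(obj)
            obj = obj.parent
        return chain

    def descends_from(self, ancestor):
        """Membership by derivation, not by resemblance."""
        return any(obj is ancestor for obj in self.lineage())


if __name__ == "__main__":
    mammal = Proto(fur=True, hindlimbs=True, milk=True)
    whale = mammal.derive(fur=False, hindlimbs=False)
    odd_whale = whale.derive(milk=False)  # shares no surface slot values with mammal

    print(odd_whale.get("fur"), odd_whale.get("milk"))
    print(odd_whale.descends_from(mammal))
```

The last object differs from the root in every slot, yet descends_from still reports it as part of the same lineage, which is the sense in which an object always keeps a "taxon" even when no type describes it.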