Eldest Children are Smarter: A Study in Effect Sizes

The story from about two weeks ago that eldest children have a significantly higher IQ was really big news, but I didn't have time to talk about it then. Now that I have had time to look at the articles about it, I think that some statement about what the word "significant" means is in order.

The NYTimes reported:

The eldest children in families tend to develop higher I.Q.'s than their siblings, researchers are reporting today, in a large study that could settle more than a half-century of scientific debate about the relationship between I.Q. and birth order.

The average difference in I.Q. was slight -- three points higher in the eldest child than in the closest sibling -- but significant, the researchers said. And they said the results made it clear that it was due to family dynamics, not to biological factors like prenatal environment.

Researchers have long had evidence that firstborns tended to be more dutiful and cautious than their siblings, and some previous studies found significant I.Q. differences. But critics said those reports were not conclusive, because they did not take into account the vast differences in upbringing among families.

Three points on an I.Q. test may not sound like much. But experts say it can be a tipping point for some people -- the difference between a high B average and a low A, for instance. That, in turn, can have a cumulative effect that could mean the difference between admission to an elite private liberal-arts college and a less exclusive public one. (Emphasis mine.)

The studies cited in the NYTimes article are here and here.

Now just to be clear, I do not object to the findings of this study. I am not qualified to judge their methods, so if they say that they found a statistically significant difference of 3 IQ points then I can buy that.

However, there is a huge difference between statistically significant and practically significant, and this is the source of my objection -- particularly the last sentence in the quote above.

The suggestion that 3 IQ points will have practical significance to someone's grades and hence their admission to college is tripe of the highest order.

Here is the evidence, taken from Gutman et al. 2003. They performed a risk assessment, an IQ test, and a psychiatric exam on children aged four and then followed these children throughout their school lives. They wanted to determine what effect negative factors -- risk factors such as disadvantaged minority status, low education, low occupational status, large family size, father absence, multiple negative life events, rigid parenting values, maternal anxiety, maternal mental illness, and poor parent-child interaction style -- and positive factors -- such as preschool intelligence and good mental health -- would have on later academic performance.

They then ranked the children according to preschool intelligence in high and low risk categories to determine their average GPAs (the measure of academic performance). That data is below:

[Figure: average GPA by preschool IQ group for the low and high risk children, from Gutman et al. 2003 (image: gradesiq.jpg)]

As you can see, higher IQ goes with higher GPA in both the low and high risk groups. However, look at the size of the effect. The difference between the low and high IQ groups is 2 standard deviations, and IQ scores are normalized so that the standard deviation is 15 points. So there is a 30 IQ point average difference between the groups. The difference in GPA between the low and the high IQ groups is about .6 for the low risk group and about .2 for the high risk group. Thus, by my calculations, each additional IQ point would contribute about .02 GPA points in the low risk group (.6/30) and about .007 in the high risk group (.2/30) -- an absolutely minuscule amount. A 3 point difference would then be worth roughly .06 and .02 GPA points, respectively.
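For anyone who wants to check the arithmetic, here is a minimal sketch in Python. It assumes, as the back-of-the-envelope calculation above does, that GPA changes linearly with IQ across the 30 point gap. The GPA gaps are my approximate readings from the figure, not exact values from Gutman et al., and the names are only illustrative.

```python
# Back-of-the-envelope check of the per-point GPA effect.
# Assumption: GPA changes linearly with IQ between the low and high IQ groups.

SD_IQ = 15          # IQ scores are normalized to a standard deviation of 15
GROUP_GAP_SD = 2    # the low and high IQ groups are about 2 SDs apart
IQ_GAP = SD_IQ * GROUP_GAP_SD  # 30 IQ points

# Approximate GPA gaps between low and high IQ groups, read off the figure.
GPA_GAP = {"low risk": 0.6, "high risk": 0.2}


def gpa_per_iq_point(gpa_gap: float, iq_gap: float = IQ_GAP) -> float:
    """GPA change attributed to a single IQ point under the linear assumption."""
    return gpa_gap / iq_gap


def gpa_shift(iq_points: float, gpa_gap: float) -> float:
    """GPA change attributed to a given number of IQ points."""
    return iq_points * gpa_per_iq_point(gpa_gap)


per_point = {group: gpa_per_iq_point(gap) for group, gap in GPA_GAP.items()}
three_points = {group: gpa_shift(3, gap) for group, gap in GPA_GAP.items()}

for group in GPA_GAP:
    print(f"{group}: {per_point[group]:.4f} GPA per IQ point, "
          f"{three_points[group]:.3f} GPA for 3 IQ points")
```

Under these assumptions, the 3 point birth-order difference is worth about .06 of a grade point in the low risk group and about .02 in the high risk group.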

Now you can argue that IQ is not stable across the lifetime. (IQ scores are relatively but not absolutely stable over the lifetime, and the study that looked at the subject tested 11 year olds as opposed to 4 year olds.) You could also argue that the population presented in Gutman et al. overemphasized risk factors -- which is something they admit -- and hence that the study overemphasizes the role of risk factors in academic achievement.

However, I choose to interpret this data as suggesting that while large changes in IQ may have practically significant effects on academic performance, a change in 3 IQ points certainly does not.

This gets to my point. What makes me nuts about studies like this is that they confuse statistical significance with practical significance. Statistically, 3 IQ points may indeed be significant, but will it tell you anything about any particular set of siblings? No.

Here is my test for practical significance... If I am walking down the street and see someone, I have a limited amount of information about them. From that limited information, I am forced to draw conclusions. Would knowing that they are the eldest sibling let me infer that they are smart? No. Would knowing that they are the eldest sibling even let me infer that they are smarter than their younger siblings? On average yes, but average isn't really going to help me here. The distributions of intelligence between the eldest and their younger siblings overlap so substantially that no reasonable statement can be made about a random person met on the street.
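To put a rough number on that overlap, here is a small sketch in Python. The assumptions are mine and not from the studies: IQ is normally distributed with a standard deviation of 15 in both groups, the mean gap is 3 points, and the two siblings' scores have a given correlation (0 treats them as unrelated draws; the 0.5 used below is purely hypothetical). The function names are illustrative too.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def cohens_d(mean_diff: float, sd: float) -> float:
    """Standardized effect size: mean difference in units of standard deviation."""
    return mean_diff / sd


def prob_eldest_higher(mean_diff: float, sd: float, correlation: float = 0.0) -> float:
    """Probability that the eldest outscores a younger sibling.

    Assumes both scores are normal with the same SD. The difference between
    the two scores then has SD equal to sd * sqrt(2 * (1 - correlation)).
    """
    sd_of_difference = sd * math.sqrt(2.0 * (1.0 - correlation))
    return normal_cdf(mean_diff / sd_of_difference)


d = cohens_d(3, 15)
p_unrelated = prob_eldest_higher(3, 15, correlation=0.0)
p_correlated = prob_eldest_higher(3, 15, correlation=0.5)  # hypothetical value

print(f"Cohen's d: {d:.2f}")
print(f"P(eldest higher), uncorrelated scores: {p_unrelated:.3f}")
print(f"P(eldest higher), correlation 0.5:     {p_correlated:.3f}")
```

Under these assumptions the eldest comes out ahead in roughly 56 to 58 pairs out of 100, which is barely better than a coin flip.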

This is the same reason why racism and sexism are useless. (We will put aside the moral issues.) They are useless because race and gender are not good proxies by which I could infer meaningful statements about particular individuals; therefore, I should not base my behavior towards them on those statements. Prejudice is not only ethically wrong; it is patently illogical.

What distresses me about how this study was presented is not that it is wrong. It is correct. It is all the cocktail party banter and pop-psychology that will result from it -- assumptions about other people's intelligence based on irrelevant traits -- that disturbs me. It is all the assumptions that parents will make about their children that are completely unwarranted by the facts. These falsities have a cost, and that cost results from the failure to delineate practical from statistical significance.

Jake, very nice analysis of this finding. I am really tired of reading stories that seem to try to reduce the wonderful complexity and dynamism of human individuality to a mashup of various statistical categories, and I like the approach you've taken to dispelling the notion that this IQ difference, even if it is statistically significant, is useful in predicting behavior in a practical sense.

Have you considered making clear the target of your wrath? The authors of the journal article make no claims such as those in the Times piece that you quote from, but you haven't stated that. This is just an example of poor science journalism, not poor science. In fact, the conclusion to the paper is a whopping three lines:

We believe the results of the present study provide strong evidence in favor of a negative effect of increasing birth order on intelligence among young adult men. This is a result that should be tested out in similar materials, which are available at least in the other Nordic countries. Based on our data, we can scarcely make inferences as to the nature of this relation.

Hardly a call to using birth order as a cue for discrimination...

They are useless because race and gender are not good proxies by which I could infer meaningful statements about particular individuals

Of course you can, with the caveat that the truth value is probabilistic and that the aspect in question may or may not be useful or profound.

Science is at fault here. If we just used the phrase "statistically reliable" instead of "statistically significant" we'd avert a lot of confusion.