The Human Genome Project: Hype meets reality

By oracknows on June 14, 2010.

I've had the immense good fortune to have trained and ultimately become a physician-scientist at a time when the pace of discovery has been staggering and paradigm changes in science have occurred just over the course of my career. microRNA, the shift from single gene studies to genomics, the development of targeted therapies, the completion of the Human Genome Project: these are but a few examples. Of course, the Human Genome Project is arguably the granddaddy of all of the huge changes and paradigm shifts that have occurred to revolutionize biomedical research. Back when I was in graduate school, it was inconceivable to me that we would ever be able to sequence the entire human genome in my lifetime. Back then, DNA sequencing was a tedious affair requiring tricky reactions, difficult-to-pour gels to separate nucleotide fragments, and hours of poring over radiographs and manually matching fragments. Indeed, my PhD thesis project in the early 1990s involved cloning a gene whose length was approximately 2,300 bases; it took months to sequence once it was isolated, and the tools to search DNA sequence databases to verify that it was a novel gene were primitive at best, involving e-mailing the sequence to an NIH server and then waiting for the results to come back. Does anyone remember doing that?

Arguably the most useful spinoffs of the Human Genome Project and Craig Venter's competing project to sequence the human genome were the high throughput techniques developed to sequence DNA rapidly and to match and line up the appropriate sequences to produce an actual sequence of each chromosome, as well as the computational tools to analyze the results. The results over the last decade have been nothing short of paradigm changing. Add to that the ability to analyze the levels of expression of every gene in the human genome simultaneously on a chip, which, while not part of the genome project, did use similar technology and computational tools, and the revolution that has occurred in molecular biology is unprecedented. Virtually overnight, we have gone from studying single genes, looking for the effects of increasing or decreasing their level of expression and studying the function of their individual protein products, to studying numbers of genes grouped into networks of similar function and related signaling tied together in "hubs" whose perturbation may be at the heart of much of the dysfunction leading to cancer and other diseases. There's just one problem (well, there are several, but I'm going to address primarily one problem in this post), and it is described in an article by Nicholas Wade that appeared in the Sunday edition of the New York Times, entitled A Decade Later, Genetic Map Yields Few New Cures:

Ten years after President Bill Clinton announced that the first draft of the human genome was complete, medicine has yet to see any large part of the promised benefits.
For biologists, the genome has yielded one insightful surprise after another. But the primary goal of the $3 billion Human Genome Project -- to ferret out the genetic roots of common diseases like cancer and Alzheimer's and then generate treatments -- remains largely elusive. Indeed, after 10 years of effort, geneticists are almost back to square one in knowing where to look for the roots of common disease.

Disappointingly, this is more or less true. The Human Genome Project did spawn a revolution in studying the biology of disease. Unfortunately, that revolution has not yet made it to using the information to develop treatments for diseases that plague humanity, particularly diseases that claim the most lives (heart disease and cancer) or the most quality of life (Alzheimer's disease, for example). These are common diseases, particularly heart disease and cancer, although in all fairness it should be pointed out that cancer is not a single disease. Be that as it may, here's an example Wade provides to show how the results of the Human Genome Project have thus far been disappointing when applied to trying to predict or treat human disease:

One sign of the genome's limited use for medicine so far was a recent test of genetic predictions for heart disease. A medical team led by Nina P. Paynter of Brigham and Women's Hospital in Boston collected 101 genetic variants that had been statistically linked to heart disease in various genome-scanning studies. But the variants turned out to have no value in forecasting disease among 19,000 women who had been followed for 12 years.

The old-fashioned method of taking a family history was a better guide, Dr. Paynter reported this February in The Journal of the American Medical Association.

This is the paper to which Wade is referring. Basically, what Paynter et al. did was to examine a cohort of 19,313 initially healthy women enrolled in the Women's Genome Health Study and followed for a median period of 12.3 years, and to construct genetic risk scores from the National Human Genome Research Institute's catalog of genome-wide association study (GWAS) results published between 2005 and June 2009. The endpoints were myocardial infarction, stroke, arterial revascularization, and cardiovascular death. Unfortunately, what they found was that the genetic risk score developed from the 101 single nucleotide polymorphisms (SNPs) did not predict cardiovascular disease as manifested by MIs, strokes, or the need for angioplasty or coronary artery bypass surgery, nor did it predict death from heart disease. Clearly, this study was quite disappointing, although the results of the recent paper on the genetics of autism may give some hope that this is not the final word because that paper found that it was primarily uncommon SNPs that were associated with autism. A major limitation of this paper, as discussed by the authors, was that it looked only at common SNPs:

Limitations of our study merit consideration. As suggested by the strong effect of family history on cardiovascular disease risk, there is a substantial risk component due to genes and shared environment, which may be elucidated by future genetic research. While the NHGRI catalog is based on all available published genome-wide studies, these have focused to date only on common SNPs and, thus, we also were unable to assess the potential contributions of rare alleles. However, if only discovered through a major increase in sample size, it is possible that unidentified variants will have increasingly small effects.22 It also may be possible in the future to obtain stable estimates of the exact effect or HR for use in a weighted score and to find interactions between genes or within genes and other markers, both of which may improve predictive ability.
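To make the idea of a "genetic risk score" concrete, here is a minimal sketch of two ways such a score might be computed: a simple count of risk alleles, and the kind of weighted score the authors mention as a future possibility. The function names, the use of the log of the hazard ratio as the weight, and all of the numbers are assumptions for illustration only; they are not taken from the Paynter paper.

```python
import math
from typing import Sequence


def allele_count_score(risk_allele_counts: Sequence[int]) -> int:
    """Unweighted score: total risk alleles carried (0, 1, or 2 per SNP)."""
    for count in risk_allele_counts:
        if count not in (0, 1, 2):
            raise ValueError("each SNP must contribute 0, 1, or 2 risk alleles")
    return sum(risk_allele_counts)


def weighted_score(risk_allele_counts: Sequence[int],
                   hazard_ratios: Sequence[float]) -> float:
    """Weighted score: each risk allele counts for log(HR) of its SNP.

    Weighting by log(HR) is an assumed convention, not the paper's method.
    """
    if len(risk_allele_counts) != len(hazard_ratios):
        raise ValueError("need exactly one hazard ratio per SNP")
    return sum(c * math.log(hr)
               for c, hr in zip(risk_allele_counts, hazard_ratios))


# Hypothetical genotype at four SNPs and made-up per-allele hazard ratios.
genotype = [0, 1, 2, 1]
made_up_hrs = [1.05, 1.10, 1.20, 1.08]

print(allele_count_score(genotype))
print(weighted_score(genotype, made_up_hrs))
```

With per-allele hazard ratios this close to 1, even a person carrying many risk alleles ends up with a score only modestly different from anyone else's, which is one way to see why such scores can fail to beat a family history.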

As Wade states, describing a second study derived from the Human Genome Project, the human HapMap, which catalogues common genetic variants in European, East Asian and African genomes:

It now seems more likely that each common disease is mostly caused by large numbers of rare variants, ones too rare to have been cataloged by the HapMap.

Which is exactly what Pinto et al found regarding autism, by the way.

Still, it's hard to rate the results of attempts thus far to apply findings from the Human Genome Project to predicting or treating disease as anything other than highly disappointing, at least to the public who funded the project. Part of the problem was the hype around the Human Genome Project during the 1990s, when it was being carried out, and particularly shortly after its results were announced and published in 2000. As Wade points out, Francis Collins, who was in charge of the Genome Project at the time, was as guilty as anyone of feeding this hype. Remember, he predicted that the genetic diagnosis of diseases would be accomplished in ten years (i.e., right about now) and that five years after that the treatments and cures would start rolling out, something that now appears unlikely. Clearly, those grand predictions have not panned out to the extent expected in those heady days right after the human genome sequence and map were first published. The pharmaceutical industry has spent billions of dollars, as Wade points out, and by and large failed to come up with the expected results, largely because the genetics of most common human diseases is far more complex than we had expected. Wade quotes Harold Varmus, who gets it quite right:

"Genomics is a way to do science, not medicine," said Harold Varmus, president of the Memorial Sloan-Kettering Cancer Center in New York, who in July will become the director of the National Cancer Institute.

The last decade has brought a flood of discoveries of disease-causing mutations in the human genome. But with most diseases, the findings have explained only a small part of the risk of getting the disease. And many of the genetic variants linked to diseases, some scientists have begun to fear, could be statistical illusions.

None of this is surprising to those who kept a level head on their shoulders ten years ago. For one thing, as has been pointed out ad nauseam on this blog, whenever you look at large numbers of anything and try to link them to something, there will be many false positives, and there will be a lot of noise. Because of the enormous amount of data generated in GWAS, it's not at all surprising that, statistical tests notwithstanding, most of the associations detected would be due to chance or to statistical flukes. This is particularly true since scientists don't actually sequence the genomes of people in these studies. Rather, they look for sites in the genome where many people have a variant bit of DNA, known as the single nucleotide polymorphism, or SNP. When you start looking for differences in 1.2 million SNPs, you will find them. Lots of them. It isn't the SNPs per se that tell us a lot, but rather the genes implicated by the SNPs, and, more importantly, the biological pathways and functions of the networks of genes implicated by them. That's why I liked Pinto et al so much. That study implicated not just SNPs, but identified potential biological pathways that are altered in autism and autistic spectrum disorders. This is information that scientists can sink their teeth into in order to really understand the biology of autism. From the understanding of biology will eventually emerge treatments. Moreover, as was pointed out, the cost of sequencing a genome has fallen dramatically over the last decade. Within the next year, it is thought that the cost will fall to between $5,000 and $10,000 to sequence one genome, and I have been to talks where it is predicted that within three years the cost will fall further to around $1,000.
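The false-positive problem is easy to put numbers on. The sketch below takes the 1.2 million SNP figure from above and asks how many "hits" pure chance would produce at a conventional p < 0.05 cutoff, then shows the much stricter per-test cutoff a Bonferroni correction would demand. The 0.05 level and the choice of Bonferroni are assumptions chosen because they are the simplest illustration; I am not claiming that any particular study used them.

```python
def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected count of 'significant' results if no SNP is truly associated."""
    return n_tests * alpha


def bonferroni_threshold(n_tests: int, family_alpha: float = 0.05) -> float:
    """Per-test p-value cutoff that holds the family-wise error rate at family_alpha."""
    return family_alpha / n_tests


n_snps = 1_200_000  # SNP count mentioned in the post

# Chance findings expected at an uncorrected p < 0.05 cutoff.
print(expected_false_positives(n_snps, 0.05))

# Per-SNP cutoff needed to keep the overall false-positive risk near 5%.
print(bonferroni_threshold(n_snps))
```

In other words, an uncorrected analysis would be expected to flag tens of thousands of SNPs by chance alone, which is why a single SNP association means so little without replication or a plausible biological pathway behind it.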

I'd pay $1,000 to have my genome sequenced. That's cheaper than a typical MRI scan.

One thing that I couldn't help but notice in the blogospheric discussions of this article is that Wade's description of evolution as it relates to the Human Genome Project and applying its results to human disease is what came under the most fire, with the broader medical implications of the article mostly ignored. In particular, Larry Moran, P.Z. Myers, and Jonathan Eisen are peeved that Wade commented that the number of genes in humans is "astonishingly small" compared to "lower" animals like the roundworm and fruit fly, which have comparable numbers of genes to humans. I guess it's just the difference between me as a physician, who saw that the main point of the article is the difficulty we've discovered over the last decade in translating genetic information into treatments for common human diseases, and that of evolutionary biologists. In fact, the whole bit by Wade about the genomes of worms and flies compared to human genomes struck me as almost a throwaway point not necessary to the article, and leaving it out would have allowed the science blogosphere not to be distracted from the main point of the article. While I can see Larry's, P.Z.'s, and Jonathan's points and did cringe inwardly just a bit when Wade compared worm and fruit fly genomes to the human genome, I can't help but think that they are missing the forest for the trees in this particular instance. I would posit that the forest is this statement in Wade's article:

As more people have their entire genomes decoded, the roots of genetic disease may eventually be understood, but at this point there is no guarantee that treatments will follow. If each common disease is caused by a host of rare genetic variants, it may not be susceptible to drugs.

"The only intellectually honest answer is that there's no way to know," Dr. Lander said. "One can prefer to be an optimist or a pessimist, but the best approach is to be an empiricist."

In other words, what we are finding as a result of the Human Genome Project is that the actual physical sequencing and deducing of the sequence of the human genome was the easy part of the project. Figuring out what all those genes do, how they do it, how they interact, and what perturbations lead to disease is going to be so much harder than the sequencing which, when it comes down to it, was primarily a technical, chemical, and engineering problem. Further, figuring out how to intervene will be even more difficult than figuring out the function. It may turn out that how genes are regulated may be far more important than the actual sequences of genes for many of the common diseases. Moreover, the Human Genome Project and projects derived from it, related to it, or spun off from it have been a boon to basic science in so many ways, particularly in comparative genomics. The more organisms there are that have their genomes sequenced, the more we learn about how different sets of genes result in different phenotypes.

Indeed, as I've said on other occasions, our understanding of breast cancer has undergone a sea change over the last decade, largely due to results and technology derived from the Human Genome Project, as well as our ability to measure the levels of expression of every gene in the genome simultaneously, a technology known as cDNA microarray analysis or whole genome expression profiling that was developed as the Human Genome Project was in its later stages. For example, where once we simply divided tumors into those that did and did not make the estrogen receptor, whole genome expression profiling has now identified biological subtypes of breast cancer based on gene expression, including the less aggressive luminal subtypes versus the more aggressive basal subtypes. Multiple prognostic assays based on gene expression have been developed, the most commonly used and reliable of which is the Oncotype DX assay, which started out based on 250 genes linked to breast cancer progression selected from the Human Genome Project. That set was whittled down to a 21 gene assay that can be done on paraffin-embedded tissue and that reliably predicts prognosis and which women with estrogen receptor-positive, node negative breast cancer do and do not require chemotherapy. We use this test right now. Coming into use is a test known as the Mammaprint assay, which looks at 70 genes and can predict the risk of distant metastases.

In the end, we have to remember that the translation of basic science discoveries into actual treatments used in humans typically takes on the order of 24 years, as John Ioannidis has documented, and many of the potentially useful discoveries based on the Human Genome Project probably haven't even been made yet. What was wrong was not the Human Genome Project itself but rather our expectations of how fast the project would bear fruit in terms of amazing new treatments for diseases based on personal genomics. We have learned that, as is so often the case, our preconceptions about nature and genetics from ten years ago were far too simplistic and optimistic. Indeed, given how complex the interplay between genes, proteins, and gene regulation is, we may actually not be doing so badly, and we may not yet have the resources, mathematical models, and computing power to fully exploit genomic medicine based on the Human Genome Project and its spinoffs. Given that in 20 years I'll be in my late 60s, I certainly hope that we have managed to use the fruits of the Human Genome Project to improve the care of common diseases like heart disease and cancer.

Excellent post. You expand on many of the points that Richard Lewontin was making about gene hype thirty years ago.

Hype or not, this project got a lot of people in my generation (those of us who were in high school during the Clinton Era) very much interested in science. I know of several lab techs (like myself) who went into it because we were all wanting to do genetic research. Heck, some of us salivated at the prospect of ever running a PCR or flow cytometry analyzer.
So, on the one hand, we have to look at this critically. Was it the silver bullet it promised to be? Appears it wasn't. Did it help shove science along, even a little bit? Yes. I'd say yes.
Much like NASA, the reasoning behind these investments is not very well understood until we need them, or until they inspire someone to do something we need.

I don't think it's an argument against the research, just against the hyping of it.

It's always a problem, selling an expensive research project to people who would resist making an "investment" into the unknown versus being honest about how long a lot of the results would take to reach people who need them and the additional research in other areas that would need to precede some of those. Some of that might be necessary over-selling but it can lead to over-expectations and eventually those can cause trouble too.

In the wake of Lawrence Krauss' paper about the possible non-existence of black holes you would have to wonder what the abandonment of that huge PR success would mean for the reputation of science. It would be a disaster. Not because of the theory not being a valid representation of the knowledge at the time but because of the hype and its installation into popular culture.

"Genes" are, if anything, even more over-hyped as an explanation for just about everything these days, often without any evidence. You can't open a newspaper most days without seeing that. And it's a far more dangerous one.

To me, mapping the human genome was like flying to the moon. Something to be done, whether it had an actual benefit or not. Like the moon landing, it gave us lots of new technologies and tools, even if you can argue about the scientific value with regard to immediate benefits for mankind.
I'm actually not too surprised that fewer and fewer genetic diseases turn out to be dependent on single point errors; after 500 million years of evolution the body got pretty good at dealing with those. If the body can handle whole chromosome errors like Down's and XXY and still reasonably function, a couple of genes here or there are unlikely to have big effects (unless they are fatal errors, but those get removed from the gene pool).

I'd like to see what, if anything, comes out of the Personal Genome Project. They'll open the data up for any computer scientist (specifically, Machine Learning experts) to have a go at it.

I could imagine a future where you can predict health outcomes with some accuracy, just by looking at someone's genome, without knowing which specific genes are the relevant ones. As an analogy, it's possible to accurately classify an email as spam or ham, without knowing or caring which words are the spammiest ones, or which ones have a statistically significant association (after controlling for multiple comparisons) with spam.

Complaints about hype fail to take into account its context with regard to the nature of funding research. While it cannot substitute for a well-structured research proposal, it's thoroughly ingrained in the funding process. Want more funding for your virology research? Argue to the DoD that it's relevant to combatting bioterrorism. However tangential the relationship may be in reality, you are expected to construct a role for your molecule(s) of interest in a popular disease/health process.

Hype is even more important for a project like the HGP, which both: 1.) requires massive funding 2.) must be justified to science illiterate bureaucrats.

Orac,
I think you're on target today, as is often the case. The oncologist in me would just emphasize one point:

What you say about OncotypeDx is a good example of how genetic analyses of tumor specimens can help patients and doctors establish a rational approach to treating a tumor. As you're well-aware, other, similar kinds of tests are available to evaluate lung cancers, leukemias, lymphomas and other malignancy types.

The public is insufficiently informed of the distinction between tumor biopsy sequencing, which is very valuable in that it can inform/direct better care vs. whole-body (germline) sequencing which at this point is interesting but not of much practical significance to patients.

The technology arising from this project is completely revolutionizing diagnosis and treatment of infectious disease. It will allow rapid detection of outbreaks within institutions and in the general population, so that they can be acted on much sooner than before. That application alone should save millions of lives.

Hello friends -

Rather, they look for sites in the genome where many people have a variant bit of DNA, known as the single nucleotide polymorphism, or SNP. When you start looking for differences in 1.2 million SNPs, you will find them. Lots of them. It isn't the SNPs per se that tell us a lot, but rather the genes implicated by the SNPs, and, more importantly, the biological pathways and functions of the networks of genes implicated by them.

This sentence crystallizes my difficulty in getting a firm grasp on how much wow factor is in the Pinto study. Can anyone help?

When you 'look for differences in 1.2 million SNPs', what are you using as a 'correct' set of data as a baseline? Are we comparing two groups against each other; i.e., one autism group and one without, or one had a heart attack group and one who did not? Or do we compare both groups with a third set of data?

Any insight is appreciated. Thanks.

- pD

This is one of those situations that seems like a disappointment precisely because it's being examined from the wrong perspective.

As Orac points out, the genome is So. Fricking. Huge and So. Fricking. Complicated. that just getting a handle on one tiny section of it is going to be a major achievement. The deal is that while the genome is sequenced, we have only just started to delve deeper into it. Once a critical mass of genome data has been gone over and analyzed, then that's when we'll start seeing the big discoveries.

We're in the position of the French soldiers who found the Rosetta Stone -- except in our case, the Greek translations aren't attached to the stone, but scattered throughout that part of the desert. We're finding bits of them, little by little -- half of a letter here, a quarter of one there -- and our Champollions are working like dogs trying to fit them into the proper sequence. But once enough of the Greek letters have been found, the work suddenly gets faster.

@pD

(pardon me if this is too basic an explanation, just trying to make sure I don't give half an answer)

You can't have SNPs from 1 genome. They are generated by comparing the sequence of a gene from many sources. The 1.2 million SNPs were generated by sequencing comparisons of x genes and genomes (I don't know the specific details atm because I'm generally constrained to single celled organisms). They are, in essence, functioning as an index of possible changes, irrespective of host condition.

Then you can take 2 populations, ASD and non-ASD and see if any given SNP is only present in ASD population or is statistically significantly over-represented in the ASD samples.

SNPs are what is available to work with now; however, as sequencing costs decrease I fully expect more sequencing to be done. We get to do that now with bacteria, where ~$25,000 worth of Illumina sequencing generates 30x coverage for 70 or so bacterial genomes (details excluded for simplicity's sake).

The value to whole genome sequencing as opposed to a SNP array is that you don't have to know what you're looking for with genome sequencing. The same holds true for gene expression experiments where the use of high throughput sequencing will permit better studies as researchers won't be limited to analyzing the expression of what were thought to be genes at the time the array was manufactured.
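The two-population comparison described above (is a given SNP over-represented in the ASD samples?) can be sketched for a single SNP as a chi-square test on a 2x2 table of allele counts. This is only an illustration under stated assumptions: the function name and the allele counts are made up, a plain Pearson chi-square with one degree of freedom is assumed rather than whatever test a real study would use, and a real analysis would also have to correct for the number of SNPs tested.

```python
import math
from typing import Tuple


def allele_chi_square(case_variant: int, case_reference: int,
                      control_variant: int, control_reference: int
                      ) -> Tuple[float, float]:
    """Pearson chi-square (1 degree of freedom) on a 2x2 allele-count table.

    Returns (chi-square statistic, p-value).
    """
    a, b, c, d = case_variant, case_reference, control_variant, control_reference
    total = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column of the table must be non-empty")
    chi2 = total * (a * d - b * c) ** 2 / denominator
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Made-up counts: 100 cases and 100 controls, two alleles per person.
chi2, p = allele_chi_square(case_variant=60, case_reference=140,
                            control_variant=40, control_reference=160)
print(chi2, p)
```

A result like this might look interesting for one SNP in isolation, but it would be nowhere near convincing once the same test has been repeated across more than a million SNPs.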

My take: The difficulty is that the complete set of interacting gene and epigenetic variations, interacting with the environment over the life course, has emergent properties which are in most cases for all practical purposes impossible to predict from the more reductionist level of the individual genetic variants. Variant A increases your risk for outcome X in combination with some others, under particular environmental circumstances; is irrelevant in other combinations; and perhaps is protective in others. Furthermore, it may have a completely different interaction with outcome Y.

You can't look at any of them in isolation or even just at specific combinations of a few. It's reasonable to hope that some combinations of gene variants will be discovered that have a reasonably strong and consistent relationship with some outcome, good or bad; but most of them probably won't work for everybody. We may get some insight into disease etiology from all this, but only as a result of solving a lot of very complicated puzzles. So don't hold your breath.

JohnV -

Thank you.

I believe, in this case, what some folks were trying to explain to me previously was that our 1.2 million SNPs is very likely a tiny subset of the true number of SNPs actually out there, and as such, if we had ten or twenty million SNPs in our index, we would have had a larger percentage of people in the autism group with meaningful (i.e., hit on a gene) CNVs. Does this sound accurate?

The 1.2 million SNPs were generated by sequencing comparisons of x genes and genomes (I don't know the specific details atm because I'm generally constrained to single celled organisms).

Does anyone else know the answer to this question?

Regarding the simplicity of your explanation, you need not worry. I appreciate your comments.

- pD

Orac, I'm sorry if you fell for the hype. That's unfortunate collateral damage.

The science behind genome sequences is sound. The majority of scientists supported the Human Genome project because they knew that the technology would be developed and they knew that many other genomes would be sequenced before we got to humans. These scientists also knew that a lot of important information would come out of the sequencing projects.

However, in order to sell it to the politicians and the general public, it was necessary to make it out as a project that would advance cures, not knowledge. That's what all the hype about medicine was for. It was about politics and "framing" not science.

Today, in 2010, the vast majority of papers making use of genome sequences are about basic science, not medicine. That's exactly what most of us expected would happen.

Nicholas Wade has missed the whole point. As far as most scientists were concerned, it was never about curing diseases. You seem to have missed the point as well.

Hundreds of genomes have been sequenced. The results have led to a much better understanding of evolution and of how genes work. The results have strengthened our ideas about the numbers of genes and our understanding of the relationship between evolution and developmental biology.

Can you understand why we are so upset with Nicholas Wade? He's missed all the scientific benefits of this new knowledge in order to focus on the fact that we haven't yet cured cancer. That, in turn, highlights one of the main problems with science education these days. It's the focus on applications instead of knowledge. If a prominent science writer can't recognize this problem then we are in bigger trouble than I imagined.

Why couldn't we have sold the genome projects as pure science, like the large hadron collider or the Hubble telescope? That would have solved the problem.

Larry Moran--

"Collateral damage" means "we were trying to hurt him, sorry we hurt you." Are you really saying that it's okay that they lied to the people who they were asking for funding, and the problem is only that they were too good at it, so scientists-in-training also believed the lies?

I'm not sure why it's surprised so many people that cures for cancer, Alzheimer's, etc didn't appear within a decade of the Human Genome Project completing its primary goal. This is a huge amount of data. Far more than a metric shit-ton, or a mega-crap-load even. This is a staggeringly big chunk of data, much of which nobody had ever looked at before. That will take a very long time to even begin to understand, much less to make wonder drugs out of.

To put it into perspective....

In 1977, NASA launched two revolutionary interplanetary spacecraft: Voyagers 1 and 2. They were remarkably well equipped, and took lots of pictures and collected huge amounts of data during their journeys to Jupiter, Saturn, Uranus, Neptune, and through the interplanetary medium and now progressing towards interstellar space. They are currently traversing the heliopause, which has turned out to be a fairly complicated thing -- there isn't just a single heliosheath to pass through, but rather it seems to be a set of layers, and they are constantly in motion.

Both Voyagers are still operational, and beaming back data, though the data rate is now greatly reduced due to the loss of various instruments (both now have nonfunctional scan platforms, for instance) and declining power margins, which is really just a function of age. The signal as it reaches Earth is now weaker than a flashlight across the room, yet the massive radio telescopes of the Deep Space Network can receive and amplify it enough to make sense of it.

During their primary missions alone, the Voyagers produced a staggering amount of data. It's less than what Cassini is producing now, but it's still a huge amount of data. Scientists are still finding things in it. A moon was recently discovered -- in Voyager imagery over a quarter century old. It had been in the data all that time, but no one had seen it.

So the Voyagers produced a staggeringly huge amount of data. But the human genome is *bigger* and much more complicated, and unlike the Voyagers, we didn't engineer it so we have less basis for comparison. The challenge is bigger. I doubt we will have a handle on the human genome for a good while to come.

Larry Moran writes:

Why couldn't we have sold the genome projects as pure science, like the large hadron collider...? That would have solved the problem.

Because there's a big round tunnel of nothin' under Texas that was supposed to be the Superconducting Super Collider, America's version of the LHC, only bigger and better. It was sold as pure science, and was stopped by appropriations cutoff in mid-construction, perhaps the largest U.S. federal project ever to have its funding pulled after multiple billions had already been spent.

Congresspersons and their constituents see "pure science" as something that should only have money wasted on it by the pointy-headed academics who tend to be interested in that sort of thing. Science is viewed as having the great benefit of keeping these folks generally out of the trouble they'd otherwise cause in the real world.

I was about to post pretty much the same thing as Jud. The reason being that, for whatever reason, seemingly no one gives a crap in this country. Here's the intro paragraph from Wikipedia:

"The Superconducting Super Collider (SSC) or Desertron[1] was a particle accelerator complex under construction in the vicinity of Waxahachie, Texas that was set to be world's largest and most energetic, surpassing the current record held by the Large Hadron Collider. [snip] The project was canceled in 1993."

"Why couldn't we have sold the genome projects as pure science, like the large hadron collider or the Hubble telescope? That would have solved the problem."

It might have solved the problem of wrong perception and inappropriate expectations, but it might also have "solved" the entire problem of the genome project altogether, by not allowing it to happen. People are often reluctant to spend money without the possibility of concrete return.

"That, in turn, highlights one of the main problems with science education these days. It's the focus on applications instead of knowledge."

I don't think this is just science education. It's what leads to things like, 'Why do I need to study Literature? I'm never going to use it.' We've moved so far toward touting education as a means toward making a living that we've forgotten about it as a good thing in itself. So if the science we're working on doesn't lead to an almost immediate, or at least recognizable, usable result, we don't see the point of doing it.

The title of Wade's article alone ticked me off because it relies on misconceptions in the first place. Anyone who's actually read any of the articles--TIME magazine doesn't count--about the genome project would understand the vast complexity and the machine limits on processing that information.

Sigh...just when I start to feel normal...

The conclusion thus far is: Science illiteracy is a powerful weapon in the hands of politicians and marketing professionals. Its value is not to be underestimated.

Vicki asks,

Are you really saying that it's okay that they lied to the people who they were asking for funding, and the problem is only that they were too good at it, so scientists-in-training also believed the lies?

Yes.

But I'd like to add that "okay" isn't the same as "ideal." A lot of the funding for the Human Genome Project was provided by other countries and the UK did a significant part of the sequencing. I wonder if politicians in those other countries were more open to the scientific goals of the project?

Orac, I'm sorry if you fell for the hype. That's unfortunate collateral damage.

Are you reading the same post I wrote? If so, I'm not sure how you could come to the conclusion that I "fell for the hype." Really. I'm serious. My whole point was that the hype was too great, that it's not surprising that we don't have all those fantastic cures yet, and that there have been some fantastic basic science spinoffs from the HGP, the lack of "cures" notwithstanding.

The science behind genome sequences is sound.

Tell me where I ever said the science of the HGP wasn't sound. You're attacking a straw man here.

The majority of scientists supported the Human Genome project because they knew that the technology would be developed and they knew that many other genomes would be sequenced before we got to humans. These scientists also knew that a lot of important information would come out of the sequencing projects.

None of which is denied or argued against in my post. In fact, I point out that there have been some fantastic scientific spinoffs as a result of the HGP, some of which have already--amazingly--reached clinical utility. Sorry if I used more cancer- and disease-oriented examples, but, really, that's what I know. They were convenient examples. What you said that irked me was how you focused on, in essence, one paragraph in a large article about the worm and fruit fly genome and used that to castigate Wade's description of evolution. Ditto PZ, albeit to a lesser extent in that he also pointed out that the fact that the HGP hadn't yielded a treasure trove of cures yet is not the standard by which to evaluate the research program's success or failure. In fact, if you had said the above was your real problem with Wade's article, I probably wouldn't have even mentioned your post because I wouldn't have found it so narrow. But you mentioned nothing about Wade's overall focus on the lack of translatable science that has resulted in treatments. You spent your entire post castigating Wade because he doesn't understand evolution and falls for the whole "higher organism" fallacy based on numbers of genes in various organisms.

[...]

Today, in 2010, the vast majority of papers making use of genome sequences are about basic science, not medicine. That's exactly what most of us expected would happen.

I was around then, too. I remember the hype, and the scientific community got into it almost as much as politicians did. Methinks there's a lot of 20-20 hindsight going on here. The difference is that I admit it.

Nicholas Wade has missed the whole point. As far as most scientists were concerned, it was never about curing diseases. You seem to have missed the point as well.

Go back and read your own post. I repeat: You never made this argument in your post. Consequently, it is a massive straw man to criticize me for supposedly criticizing you for having made this point. In fact, I would have partially, if not largely agreed with this argument. But that's not the argument you made. You spent your entire post trashing Nicholas Wade for his lack of understanding of evolution and his expressing wonder that humans have roughly the same number of protein-coding genes that worms and fruit flies do.

Hundreds of genomes have been sequenced. The results have led to a much better understanding of evolution and of how genes work. The results have strengthened our ideas about the numbers of genes and our understanding of the relationship between evolution and developmental biology.

No one denies this. I didn't deny it. See above.

Can you understand why we are so upset with Nicholas Wade? He's missed all the scientific benefits of this new knowledge in order to focus on the fact that we haven't yet cured cancer. That, in turn, highlights one of the main problems with science education these days. It's the focus on applications instead of knowledge. If a prominent science writer can't recognize this problem then we are in bigger trouble than I imagined.

Again, that's not the argument you made. It's the argument you're making now, but it's not the one you made in the post I linked to. You may be upset with Nicholas Wade, but the reason you gave for being upset with him in the post to which I linked is not the reason you are giving now.

Orac says,

Are you reading the same post I wrote? If so, I'm not sure how you could come to the conclusion that I "fell for the hype." Really. I'm serious. My whole point was that the hype was too great, that it's not surprising that we don't have all those fantastic cures yet, and that there have been some fantastic basic science spinoffs from the HGP, the lack of "cures" notwithstanding.

Sorry. My reading of your posting suggested that you were disappointed in the lack of progress on curing disease. It suggested to me (and still does) that you thought the main goal was to advance medicine.

I apologize if I've misread your intent but, honestly, when I go back and re-read it I still don't see it as clearly as you claim.

But that's not the argument you made.

I agree. I didn't make that argument in my post. I never claimed otherwise. I didn't think we were discussing my posting.

The point I made on Sandwalk was that Nicholas Wade doesn't understand the science he discusses. That's sad.

"To me, mapping the human genome was like flying to the moon. Something to be done, whether it had an actual benefit or not."

To me it's closer to launching Sputnik. By itself, it was not that useful, but it is the necessary first step to communication satellites, Hubble, Voyager, etc.

Back when I first heard about it, I thought the genome project was overhyped, even though I believed it would be incredibly important. It doesn't do you much good to transcribe a copy of an ancient scroll unless you figure out how to read the language. With the genome, even when you know what every sequence does, even that's mostly useful for diagnostic/prevention planning purposes unless you can do something with those sequences.

Just as an example of how basic research can unexpectedly lead to massive economic consequences which were not obvious at the time, refer to a paper published in the Physikalische Zeitschrift in, I believe, 1917, by a physicist named Albert Einstein, predicting a phenomenon known as stimulated emission. This paper, which looked rather innocent at the time, has now given birth to the technology known as the laser. The economic development and employment generated by the invention of the laser amounts to hundreds of billions of dollars. Certainly, the late Prof. Einstein had not the slightest notion at the time that his paper would result in such consequences.

I think the point which really needs to be made is that the sequencing of the human genome (and the recording of the small sample of variations obtained so far) is but one step on just one of the many paths of obtaining knowledge about where diseases and illnesses come from. Where this particular path branched off from the others is debatable - I'd suggest looking at Mendel's pioneering work on heritable traits and gene theory; Watson/Crick gave a mechanical explanation of how these genes could be transmitted; this led to people trying to figure out how to spot the genes, and then to others wanting to sequence various genomes, including the genome of the organism currently known as Homo sapiens sapiens. It's also worth noting that probably less than 1% of the global population have had their genomes sequenced, so effectively the researchers are working from a greatly reduced data pool for comparisons (with 6 billion people in the world, we'd need 600,000,000 complete 46-chromosome samples in the genome's databases alone just to handle one tenth of the diversity of the world's human population... and even then, can we guarantee we'd be looking at the interesting one-tenth?).

There's a lot we've learned about the human genome already. We've learned we have a lot of genetic material in common with just about every other species on the planet, and that only about 2% of our genetic data actually makes us "human". We know what the genome is made of (and it's amazing - a single, complex molecule made of four different bases and thus a limited number of combinations and permutations, but apparently limitless complexity as a result) and we know how it replicates across both mitotic and meiotic cell divisions (and thus how variations in the genome can arise).

On the other hand, we don't know which bits of the genome are the ones which make us "human". We don't know what the majority of genes actually do, or (as far as I'm aware) how they function except on a very micro- level. We don't understand how these micro-level alterations potentiate out to make macro-level alterations - for example, while there have been geneplexes found which are common across persons with various forms of mental illness, there's no real explanation of how those geneplexes work to make the person actually mentally ill. Heck, there's very little actual work that's been done on how micro-level alterations in brain chemistry can cause macro-level alterations in "normal" behaviour, so working out how the genes for a particular mental illness actually function is something which will come a lot later on down the line.

On the gripping hand, genetics as a science isn't precisely an old stager. If it reaches the age of, for example, theoretical physics, before providing something concrete, we might have to worry. Let's come back in another two or three millennia and have a look then.

Prometheus;

I am not nearly as impressed with the Pinto et al. study on copy number variations in autism as you are.

In autism studies (autopsy and MRI), control groups are matched on age, IQ, and gender. The control groups in this paper were not matched on age, IQ, or gender.

In the SAGE control cohort (N=1880), the male/female ratio was M=31%, F=69%, exactly opposite the ASD group, which consisted primarily of multiple-incidence families (M=83%, F=17%), consistent with an M/F ratio of 4:1.

It could be argued that CNVs are gender neutral and that this is therefore a representative control group. However, other studies have shown that males are at higher risk for genetic mutations than females:

http://www.bioone.org/doi/abs/10.1111/j.1558-5646.2007.00250.x?journalC…

"In many instances, there are large sex differences in mutation rates, recombination rates, selection, rates of gene flow, and genetic drift. Mutation rates are often higher in males, a difference that has been estimated both directly and indirectly. The higher male mutation rate appears related to the larger number of cell divisions in male lineages but mutation rates also appear gene- and organism-specific".

Failure to control for IQ may be even more questionable. The phenomenon of diagnostic substitution, where diagnoses of MR have declined in direct proportion to the increase in autism diagnoses, suggests that many of the CNVs in this study are not specific to autism but rather to a variety of developmental disorders with or without the co-occurring problems of loosely defined 'autism'.

The authors also did not control for age, as they readily admit.

They reported approximately 4% of the autism group with the CNVs, but about 2% of the controls also showed the same CNV mutations.

Prometheus, do you consider the control groups of this study representative, and if so, why?

Failure to control for IQ may be even more questionable. The phenomenon of diagnostic substitution, where diagnoses of MR have declined in direct proportion to the increase in autism diagnoses, suggests that many of the CNVs in this study are not specific to autism but rather to a variety of developmental disorders with or without the co-occurring problems of loosely defined 'autism'.

If you were to try to match by IQ, you would have the choice of eliminating most of your subjects with classical autism, the vast majority of whom test as substantially below normal in IQ, or using as controls individuals who are classed as mentally retarded but not autistic, and who would almost certainly have other abnormalities. When it comes to classical autism, nobody is quite sure how to properly measure IQ, as standard measures of IQ have been developed for individuals without the communication and perceptual difficulties that are prevalent in autism. When you can't properly measure something, it is hard to control for it.

It is likely that the shift in diagnosis from MR to autism does not reflect an increase in mistaken autism diagnoses, but rather the decline of an earlier practice of lumping everybody who tested low on standardized IQ tests under the category of MR.

Trrll--I agree with your basic point: the former error was probably at least mostly misclassification of autistic people as mentally retarded, not a recent misclassification of MR neurotypicals as autistic. That said, IQ is problematic even if you limit the population to neurotypical people who are not developmentally disabled: it is influenced by any number of cultural factors. A silly but perhaps illustrative example: a chimp was marked down on an IQ test because, asked to select the food, she picked a picture of a flower instead of an ice cream cone. Chimpanzees eat flowers, and she'd never tasted an ice cream cone.

@Vicki:

"A silly but perhaps illustrative example: a chimp was marked down on an IQ test because, asked to select the food, she picked a picture of a flower instead of an ice cream cone. Chimpanzees eat flowers, and she'd never tasted an ice cream cone."

It's not that silly, actually... if anything, it's the use of an intellectual ability test designed for humans on a non-human-but-rather-sentient animal that is silly. Reminds me of a study in which some children were tested using WISC-R and gave some interesting responses (in the sense that they were consistent across the group) and yet were all marked down for these responses in the test. The main response reported relates to the following question:

"You go to the grocery store to buy bread but the bread has been sold out. What do you do?"

The test rubric allows for one answer: "Go to another store".

The children tested responded with "Go home".

Why would this be the most intelligent thing to do, rather than go to another store? Well... the children tested were not from white, middle-class backgrounds. They came from schools in inner-city areas, and a prominent feature of these areas was the existence of a serious gang culture. Another was poverty. Going to the shop and finding no bread would necessitate going back home because of a number of reasons: family shopping tied to that store via social security/welfare office payment arrangements, so the child would not be given money and therefore could not go to another shop; and, even if the child had money for the bread, going to another shop might necessitate the entering of another neighbourhood that would be entry into a rival gang's territory and leave the child in danger.

In essence, the problem lies in the notion that intelligence can be tied to the concept of answering the question correctly as per the rubric of the test, whereas - in real life - what the rubric says is the 'intelligent answer' may actually be unintelligent behaviour in which to engage. And this is a problem in intelligence testing.

I can understand that response behaviour has to be tightly controlled for in order to have the test fully standardisable, but - like I say - this is one of the major issues in testing intelligence... what is intelligent per the test may not be intelligent per real life.

Failure!

The only bigger failure has been string theory.

This is a great post with interesting comments, I really enjoyed reading it. I must say that the subject matter is well discussed and I will definitely be coming back for more. I'm gonna bookmark and share this with my friends. Thank you for this!
