Biology is a hard problem

New genetic disorders pop up all the time -- each one represents a child who may face incredible challenges, or even be doomed to death. A child named Bertrand exhibited some serious symptoms -- profound developmental disabilities -- shortly after he was born, and no one could figure out what was wrong with him. So his doctors took advantage of 21st-century biotechnology and sequenced his genome, and the genomes of both of his parents, and asked what novel mutations the child carried.

For years, sequencing was too expensive for common use—in 2001, the cost of sequencing a single human genome was around a hundred million dollars. But by 2010, with the advent of new technologies, that figure had dropped by more than ninety-nine per cent, to roughly fifty thousand dollars. To reduce costs further, the Duke researchers, including Shashi and a geneticist named David Goldstein, planned to sequence only the exome—the less than two per cent of the genome that codes for proteins and gives rise to the vast majority of known genetic disorders. In a handful of isolated cases, exome sequencing had been successfully used by doctors desperate to identify the causes of mysterious, life-threatening conditions. If the technique could be shown to be more broadly effective, the Duke team might help usher in a new approach to disease discovery.

For their study, Shashi, Goldstein, and their colleagues assembled a dozen test subjects, all suffering from various undiagnosed disorders. There were nine children, two teen-agers, and one adult; their symptoms included everything from spine abnormalities to severe intellectual disabilities. The researchers began by sequencing each patient and both biological parents—what’s known as a parent-child trio. There are between thirty and fifty million base pairs in the human exome; the average child’s exome differs from each of his parents’ in roughly fifteen thousand spots. The researchers could dismiss most of those variations—either they corresponded to already known conditions, or they occurred frequently enough in the general population to rule out their being the cause of a rare disease, or they were involved in biological processes that were unrelated to the patient’s symptoms. That left a short list of about a dozen genes for each patient.
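The winnowing described in that passage is essentially a chain of exclusion filters. Here is a minimal sketch of the idea, not the Duke team's actual pipeline: the `Variant` fields, the 1% frequency cutoff, and the gene names are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str                    # placeholder gene name (hypothetical)
    population_frequency: float  # fraction of the general population carrying it
    known_condition: bool        # already tied to a characterized disorder
    relevant_to_symptoms: bool   # gene acts in a process related to the symptoms


def shortlist(variants, max_frequency=0.01):
    """Return sorted gene names that survive all three exclusion filters.

    max_frequency is an assumed cutoff; the article gives no number.
    """
    genes = set()
    for v in variants:
        if v.known_condition:
            continue  # corresponds to an already known condition
        if v.population_frequency > max_frequency:
            continue  # too common to explain a rare disease
        if not v.relevant_to_symptoms:
            continue  # biology unrelated to the patient's symptoms
        genes.add(v.gene)
    return sorted(genes)


# Toy data: only GENE_D passes every filter.
example = [
    Variant("GENE_A", 0.20, False, True),
    Variant("GENE_B", 0.0001, True, True),
    Variant("GENE_C", 0.0001, False, False),
    Variant("GENE_D", 0.0001, False, True),
]
candidates = shortlist(example)
print(candidates)
```

In a real study each of those three yes/no fields hides a great deal of curation and judgment; the point is only that the short list is what remains after the easy exclusions.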

In Bertrand's case, they narrowed it down to one likely gene responsible for his condition -- a gene for which each of his parents also carried a variant, although paired in both cases with a normal, functional allele. Bertrand was unlucky: he inherited one bad copy from his mother, and another bad (but different) copy from his father.
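That inheritance pattern, two different damaging variants in the same gene with one from each parent, is what geneticists call compound heterozygosity. A rough sketch of how you might flag it in trio data follows. The function name, the dictionary layout, and the variant labels are hypothetical, and a real analysis would also have to consider zygosity, phasing, and whether each variant is actually damaging.

```python
def compound_het_genes(child, mother, father):
    """Flag genes where the child carries two different parental variants.

    Each argument maps a gene name to the set of variant labels carried
    in that gene (an assumed, simplified representation). A gene is
    flagged when the child has at least one variant found only in the
    mother and at least one found only in the father.
    """
    hits = []
    for gene, child_variants in child.items():
        mother_variants = mother.get(gene, set())
        father_variants = father.get(gene, set())
        maternal_only = (child_variants & mother_variants) - father_variants
        paternal_only = (child_variants & father_variants) - mother_variants
        if maternal_only and paternal_only:
            hits.append(gene)
    return sorted(hits)


# Toy trio: the variant labels are placeholders, not real NGLY1 alleles.
child = {"NGLY1": {"variant_m", "variant_p"}, "GENE_X": {"variant_x"}}
mother = {"NGLY1": {"variant_m"}, "GENE_X": {"variant_x"}}
father = {"NGLY1": {"variant_p"}}

flagged = compound_het_genes(child, mother, father)
print(flagged)
```

GENE_X is not flagged because the child's only variant there comes from one parent, which is the situation both parents are in for NGLY1: a carrier with a working second copy.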

Then there was Bertrand. The Duke team thought it was likely that mutations on one of his candidate genes, known as NGLY1, were responsible for his problems. Normally, NGLY1 produces an enzyme that plays a crucial role in recycling cellular waste, by removing sugar molecules from damaged proteins, effectively decommissioning them. Diseases that affect the way proteins and sugar molecules interact, known as congenital disorders of glycosylation, or CDGs, are extremely rare—there are fewer than five hundred cases in the United States. Since the NGLY1 gene operates in cells throughout the body, its malfunction could conceivably cause problems in a wide range of biological systems.

The article points out that one of the things that has made tracking down the genetic cause of this disorder difficult is academic competition. Lots of people are born with novel genetic disorders, and they go to their high-powered geneticist/MD, and they get part or all of their genome sequenced, and then the sequence is kept private. This is now the doctor's discovery: making it open knowledge would make it likely that someone else would use it and publish it, and that the doctor wouldn't get credit for it. That doesn't help patients, but it does help careers.

And sharing is the next step. It's clear that Bertrand has an anomalous form of NGLY1, but that doesn't demonstrate that it is the cause (remember, he's got 15,000 other variations from his parents' genomes). The clincher would be to find other kids with similar phenotypes who also had NGLY1 variants, and then you'd be relatively certain you'd found the cause. If you had lots of sequence data, you might also find people who had the NGLY1 variants but none of the disease symptoms, which would rule out NGLY1 as the cause. It's a real problem that information gets locked up in little academic kingdoms, and is difficult to pry out without promising authorship on a paper…and who wants to be the 63rd author on a paper that has 200 contributors, anyway?

So the article ends up pointing out a flaw in poor Bertrand's genome, and another flaw in the institution of science.

I have to point out another problem, though, and this one has been known about in genetics for a long, long time: the high visibility of mutations of large effect, and how they skew our perception of how the genome works. These mutations exist, and Bertrand's case is an excellent example: a defect in a single gene wreaks global havoc on the system, causes profoundly disruptive symptoms, and draws a bull's-eye around itself to attract the attention of geneticists. But the overwhelming majority of allelic variants do nothing detectable at all -- again, witness Bertrand's 15,000 differences that were ruled out as causal -- and in other genetic disorders we can't rule out the possibility that multiple genes have to be messed up to trigger the problem, so that focusing on them just one at a time means you miss the causes.

We know this is the case in cancer, for instance. There are central players that frequently end up mutated to cause oncogenesis -- myc, ras, and p53, to name a few -- but no cancer is caused by just one genetic change; it requires multiple steps to initiate. Further, there are multiple components, each with its own likely cause: proliferation is different from suppression of apoptosis, which is different from metastasis, and every patient has a different genetic profile. That's why you're not going to find any responsible doctor claiming that they found THE gene that causes cancer and have THE cure.

But here's another example: a large genetic study that used similar techniques to those applied to Bertrand, looking for the heritable cause for a more complex and subtle disorder, schizophrenia. They didn't find one. They found a hundred.

One clue to this complexity, and how schizophrenia as a disease is "built", has come this week in new research published in Nature which looks at the genetic basis for the illness. In one of the largest genetic studies of its kind, a team of scientists from around the world compared the genomes of 36,989 people with schizophrenia with 113,075 control participants. They identified 128 independent genes in the people with schizophrenia, 83 of which were not known about until now.

Although this is an important study, it would be false to say that genomics work will lead to an imminent breakthrough in terms of a cure for mental illnesses. What we can do with this information is to ask better questions about what to research next in this field, for example some of the new genes identified are involved with immune processes, which provides the first real evidence for a long-held hypothesis that connects schizophrenia with immune system problems.

The medical team studying Bertrand got lucky and found a single gene as the likely source of his problems (Bertrand is not lucky at all, though: that we know what's wrong with him is a world away from being able to fix him). What makes people tick is a constellation of genes interacting cooperatively with one another, and you generally can't map single genes to single phenotypic traits.

It's going to be hard to figure that out. That's why we need more biologists!

Do you accept guest blogs? If you send your e-mail address, I will attach one I have been trying to get published without success. Its title is "Multiple Hypotheses as a partial antidote to science misconduct".

By Dr. Tom Poulson on 22 Jul 2014