Meet the Monkey Cousins

Trace your genealogy back 25 million years, and you'll meet long-tailed monkey-like primates living in trees. Those primates were not just the ancestors of ourselves, but of all the other apes--chimpanzees, bonobos, gorillas, orangutans, and gibbons--along with the monkeys of the Eastern Hemisphere, such as baboons and langurs. By comparing ourselves to these other primates, scientists can get clues to our evolution over the past 25 million years. Until now, most of those clues have come from fossils and studies on the behavior and physiology of apes and monkeys. But in the past few years scientists have begun to pore over a new record: the one that is inscribed in our genome and the genomes of other apes and monkeys.

The first draft of the human genome was published in 2000, and in 2005 came the genome of the chimpanzee--our closest living relative. Scientists compared the two genomes to get a sense of what the genome of our common ancestor looked like, and how the genomes of both species have changed over the past few million years. (I wrote about the first wave of chimp/human studies here). One of the biggest surprises came when one team of researchers concluded that the ancestors of chimpanzees and humans interbred for over a million years, producing hybrid humanzees.

But there's a limit to how much you can learn from just two genomes. If you find two versions of a gene that are nearly--but not quite--identical in humans and chimpanzees, it's hard to know for sure how that difference evolved. Imagine, for the sake of brevity, that the human version of a gene is AAAT, and the chimpanzee version is AAAC. (Real genes are hundreds or thousands of nucleotides long.) It's possible that the ancestor of humans and chimps had the AAAC version of the gene, and in humans the C mutated to T. But it's also possible that humans have the ancestral version, and in chimps the T flipped to C. It's even possible that the ancestral version was neither. It might have started out as AAAA, and in humans the final A became T and in chimps A became C.

The way through this impasse is to compare chimpanzees and humans to a third species--ideally another primate. But it was not until today that scientists had a third primate genome to study. Now they have all the DNA from a macaque.

There are 22 species of macaque in the world, their natural ranges reaching as far west as Gibraltar and as far east as Japan. They're tough, adaptable monkeys that can be found living in cities and temples. Scientists have long studied macaques to learn things about ourselves. The Rh factor in blood types is short for Rhesus factor, discovered in rhesus macaques. It was the macaque's special role in science that put its genome near the top of the list to be sequenced. An inventory of every gene in the macaque genome would make the monkeys even more useful models for human biology.

At the same time, the macaque genome promised to bring human evolution into sharper focus. Humans, chimpanzees, and macaques share an ancestor that lived 25 million years ago. Imagine that you discover that the gene I just mentioned is AAAC in macaques. The simplest explanation for the three versions of the gene is that the ancestor had AAAC, which macaques and chimpanzees inherited. Only in humans did it flip to AAAT. Now imagine that you can make this sort of judgment on all of the roughly 18,000 genes in the human genome.
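For readers who like to see the logic spelled out, here is a toy Python sketch of that outgroup reasoning. It is only an illustration of the "simplest explanation" argument, not the method used in the macaque papers. The function names `infer_ancestor` and `changed_sites` are invented for this example, and the sketch assumes the three sequences are already aligned and that the explanation requiring the fewest changes is the right one.

```python
def infer_ancestor(human, chimp, outgroup):
    """Guess the human-chimp ancestral sequence site by site.

    Assumes three aligned sequences of equal length. Where human and
    chimp agree, that base is taken as ancestral. Where they differ,
    the outgroup (here, macaque) breaks the tie; if it matches
    neither, the site is left unresolved as '?'.
    """
    if not (len(human) == len(chimp) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")
    ancestor = []
    for h, c, o in zip(human, chimp, outgroup):
        if h == c:
            ancestor.append(h)
        elif o in (h, c):
            ancestor.append(o)
        else:
            ancestor.append("?")
    return "".join(ancestor)


def changed_sites(ancestor, descendant):
    """List (position, old_base, new_base) for resolved sites that changed."""
    return [(i, a, d) for i, (a, d) in enumerate(zip(ancestor, descendant))
            if a != "?" and a != d]


ancestor = infer_ancestor("AAAT", "AAAC", "AAAC")
print(ancestor)                          # AAAC
print(changed_sites(ancestor, "AAAT"))   # [(3, 'C', 'T')]
print(changed_sites(ancestor, "AAAC"))   # []
```

With the made-up sequences from the example, the sketch places the single change on the human branch. If the macaque version matched neither the human nor the chimp version, the site would stay unresolved, which is the same impasse you face with only two genomes.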

The macaque genome team has published three papers in the journal Science, along with a dedicated web site. The papers are the latest in a long series of papers that show how intimately intertwined evolutionary biology and medical research have become (despite unfounded claims to the contrary). You just need to look at the title of the lead paper: "Evolutionary and biomedical insights from the rhesus macaque genome."

The scientists examine a lot of biology in the papers, but four topics really jump out: ancestral genes involved in diseases, fast-evolving genes, the origin of new genes, and the spread of genome parasites. Below the fold, I'll hit them one at a time...

1. Ancestral genes and diseases. Some people carry versions of genes (known as alleles) that either cause diseases or predispose them to getting sick. When scientists studied the chimpanzee genome, they discovered that chimpanzees sometimes carry a human disease allele without suffering the human disease. The macaque genome team expanded this search, looking for matches in chimp and macaque genomes to every known human disease allele. They discovered 229 cases in which the disease allele turns out to be the ancestral version. Some of these alleles are quite nasty. Some cause severe mental retardation. Others cause potentially fatal defects in metabolism. Healthy people, for example, convert a chemical called phenylalanine into another one called tyrosine in the process of building proteins. But a genetic defect will stop this transformation, causing phenylalanine to build up to dangerous levels--a condition called phenylketonuria. Macaques have the phenylketonuria gene. And yet they are not poisoned by phenylalanine.

This paradox is not so paradoxical if you bear in mind that no gene works alone. Genes work in networks with dozens of other genes. These entire networks evolve over time, as natural selection fine-tunes the genes to work together successfully. But if conditions change, alleles that worked well in those networks may start to work badly, and new versions of the genes will be favored by natural selection. In the case of 229 genes, it appears, we are in that awkward intermediate stage, with obsolete alleles still in circulation.

2. Fast-evolving genes. The genomes of humans and macaques are, on average, 93 percent identical. About eleven percent of their genes produce identical proteins. Others differ by just a few amino acids, and others by dozens. Natural selection could have driven some of those differences, but so could luck. Even neutral mutations sometimes become widespread through nothing more than chance. Comparing genomes helps scientists tell these changes apart.

By reconstructing the ancestral genome of macaques, chimpanzees, and humans, scientists can tally up the changes along each branch. They can identify mutations that cause significant changes to the structure of proteins, as well as mutations that leave the protein unchanged. These techniques even let scientists distinguish genes that have experienced strong selection from ones that have experienced weak selection. Previous studies on chimpanzees and humans identified 35 genes that experienced strong selection. Adding the macaque genome into the analysis sharpens up the picture considerably. The macaque genome team has identified 178 genes.
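To make the tallying idea concrete, here is a rough Python sketch that compares an ancestral coding sequence with a descendant and counts how many codons changed the protein and how many changed silently. It is a crude count under stated assumptions, not the statistical approach the genome team used: real analyses correct for how many protein-changing and silent mutations are possible in each gene, among other things. The function name `tally_changes` and the example sequences are made up; the codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def tally_changes(ancestral, descendant):
    """Count codons that differ, split by effect on the protein.

    Returns (protein_changing, silent). Assumes both sequences are
    aligned, in frame, gap-free, and a multiple of three in length.
    """
    if len(ancestral) != len(descendant) or len(ancestral) % 3:
        raise ValueError("need aligned, in-frame coding sequences")
    changing = silent = 0
    for i in range(0, len(ancestral), 3):
        old, new = ancestral[i:i + 3], descendant[i:i + 3]
        if old == new:
            continue
        if CODON_TABLE[old] == CODON_TABLE[new]:
            silent += 1      # same amino acid, protein unchanged
        else:
            changing += 1    # different amino acid (or stop codon)
    return changing, silent


# Made-up example: ATG GCT AAA TTT -> ATG GCC AGA TTT
print(tally_changes("ATGGCTAAATTT", "ATGGCCAGATTT"))  # (1, 1)
```

A gene with many protein-changing differences relative to silent ones is a candidate for strong selection, although a skewed ratio on its own does not prove it.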

Many of these fast-evolving genes help build the immune system. That's not too surprising, given the grave threat of disease and the swift evolution of parasites. Among the surprise entries are two genes that help build hair. Apes and Old World monkeys might have undergone strong selection on these genes as they adapted to changes in climate or to attract mates with a handsome coat. And some of these genes are best known for making proteins in cancer cells. In order to explain that particularly weird finding, I have to jump into the next topic...

3. The origin of new genes. Comparing genes from one species to another can cause serious headaches. That's because it's hard to find a clean, one-to-one correspondence. Many genes in the human genome belong to gene families--groups of dozens, even hundreds of genes that have very similar sequences. Other primates have gene families as well, but some of their families have more genes than ours, and some have fewer. The new macaque genome study does a great job of demonstrating how this confusing situation came to be. Over the course of millions of years, genes get accidentally duplicated. Some copies are later lost, and some evolve into new forms.

The macaque genome team took a close look at a particularly interesting family of genes called PRAME genes. Humans have dozens of PRAME genes, with some people having more copies than others. Scientists aren't exactly sure what PRAME genes do, but it seems they have a role in producing sperm, judging from the fact that they normally make proteins only in the testes. But PRAME genes also have a dark side: they often switch on in cancer cells. There are actually many genes that play this dual role, so many in fact that they have their own name: cancer/testis genes. There's something very useful to cancer cells in the genes used to build sperm--possibly their ability to grow and divide quickly. (For more on this connection, see my article in the January issue of Scientific American on cancer and evolution.)

The macaque team used the monkey's genome to trace the evolution of PRAME genes. Their results are summed up nicely in this picture. Each line represents a gene, and the three clusters of genes belong, from top to bottom, to macaques, chimps, and humans.

The PRAME gene family started with a single gene which was accidentally duplicated. The two new genes then became four, and four became eight. These three rounds of duplication had already taken place before the ancestors of macaques, humans, and chimps diverged 25 million years ago. All eight kinds of PRAME genes can be found in their genomes. But evolution did not then grind to a halt. In the new primate lineages, some PRAME genes disappeared thanks to mutations that snipped them out of genomes. In other cases, PRAME genes were duplicated yet again, expanding the family.

But these new genes were not just extra copies of old ones. They acquired mutations to their sequences that changed the shape of the proteins they made. In some cases, the selection acting on these genes was intense. It's possible that as these genes evolved for their function in making sperm, they became more effective as cancer genes. Understanding that evolution may help scientists understand how cancer cells benefit from them.

4. Genomic parasites. Comparing primate genomes shows that primates have been particularly prone to picking up duplicated genes. One reason for this may be that our genomes are overrun with self-replicating segments of DNA called mobile elements. When they copy themselves, they sometimes copy ordinary genes as well.

Mobile elements are a motley collection of weird chunks of DNA that take up about half of the human genome. Many of them got their start from viruses that invaded ancient primates. Known as endogenous retroviruses, they are related to HIV. But while HIV jumps from one host to another, endogenous retroviruses fused with a host genome and were passed down from one generation to the next. Their DNA mutated over time, but in some cases they still retained the ability to make copies of themselves that were then inserted back into the genome. Other mobile elements cannot replicate themselves but depend instead on these more viral elements to do the work for them. Parasites of parasites, as it were.

Mobile elements are particularly tough to survey. Protein-coding genes usually have some very clear markers--instructions that tell the cell when to stop copying them, for example--that computers can search for. Mobile elements, on the other hand, can become obscured by mutations. By comparing several different versions of the same mobile element, scientists can recognize them more easily. The macaque genome allows scientists to do exactly that. It turns out that the ancestor of macaques, chimps, and humans had already been infected by seven different endogenous retroviruses. After the three branches split, the viruses made new copies of themselves. In the human lineage, only two viruses at most have infected us, and both have become trapped in our genome. (Last fall I wrote about how researchers resurrected one of them.) Macaques, by contrast, have been infected by eight viruses since their ancestors split off from our own. I wonder if that difference reflects a difference in how many viruses each species faces. And given that we got HIV from chimpanzees and monkeys, it would help to understand the relationship between primates and their viruses better.

Mobile elements are also important medically because when they make new copies of themselves, they can wreak havoc in the genome. They can pick up parts of genes or entire genes and move them to other places in the genome. Genes that are normally kept silent can become active; they can respond to new signals. Scientists have linked 118 genetic disorders to mobile elements, including hemophilia, breast cancer, and muscular dystrophy. The macaque genome team points out that understanding their evolution in humans and macaques will make the monkeys better models for these diseases. But they also observe that mobile elements can be a creative force as well. They create new copies of genes, which can take on new functions. They can insert parts of genes into other genes, in a kind of natural genetic engineering that can generate new kinds of proteins. Once again, medicine and evolution prove intertwined. Thus the macaque genome helps us better understand the mysterious encyclopedia of DNA we carry inside us.

(For more information check out an interactive site Science has set up here.)

[Macaque image courtesy of the Southwest National Primate Research Center at Southwest Foundation for Biomedical Research in San Antonio]


I truly hope that the ignorant neurosurgeon Michael Egnor, newest ID flogger for the Disco Institute, reads this post. He has repeatedly denied that medicine has anything to learn from evolutionary biology.

I am a non-scientist. Could you take a few minutes and explain something to me? You say:
" they have all the DNA from a macaque.

There are 22 species of macaque in the world"

First - If there are 22 species, doesn't each specie (singular of species?) have a different genome?
Second - Don't any two individuals of the same species also have different genomes, though to a lesser degree?
Third - If the human genome is 93% the same as the macaque genome, and 94% the same as the chimpanzee genome, what percent the same are any two human genomes, or the genomes of different macaque species?
And, finally, depending on the answer to #3, if two humans have different genomes, how does one decide which is "the" human genome?

Carl, there's an important distinction to make between genes that are rapidly evolving because of positive selection and those evolving rapidly due to relaxed selective constraint. Not all the rapidly evolving genes are under positive selection -- we require further evidence to conclude natural selection is driving rapid evolution.

Karl (Re: Comment #4),
You've raised some really good questions. First of all, the species of macaque that has been sequenced is the rhesus macaque (Macaca mulatta). Each other species of macaque (Japanese, pigtail, etc.) will, indeed, have its own genome, though they will almost certainly be very similar to the rhesus genome.

The answer to your second question is "yes". Each individual of a species will have differences at the nucleotide level from each other individual. So, in that sense, each individual has his or her own unique genome. However, the value of a complete genome sequence like this comes from knowing where all the parts are and what they're likely to be. Most of the differences between individuals will be very minor -- a nucleotide here, a mobile element insertion there. And those individual differences, when compared to the full sequence can give lots of insights into what makes individuals different, which loci might be involved in a disease mechanism, or what sort of structure may be present within and between populations of individuals.

I can't answer your third question precisely because I don't have the figures. But, I can guarantee that any two human genomes will likely be well over 99% similar. In fact, there are two fully sequenced human genomes (the public one and the Celera one). The percent similarity between different macaque species will largely depend on how far in the past they diverged from one another. The phylogeny showing the relationships between the various species of macaques is not completely resolved (I believe), but being that they're all in the same genus (Macaca), it's not likely to be very much divergence. Possibly less than that between humans and chimpanzees.

Your final question is a good one. The way the two human genomes were sequenced involved using sequence from several individuals. We don't really know who the "humans" are for the public human genome project. That's kept secret. Though, it has been reported that the lion's share of the DNA for that project came from one individual from New York due to the good quality that those samples yielded. The Celera project, which is a privately funded complete human genome that was published at the same time as the public one, used five individuals, one of whom is the president and founder of Celera, Craig Venter.

There are other projects ongoing, such as the International HapMap Project, that are busily trying to get partial and/or complete genomes from a wide diversity of humans from various ethnic groups in order to help us determine what genetic differences there may be between human subpopulations. Having the complete reference sequences from the public genome and Celera's genome, however, is absolutely critical to these other efforts.

Hope this helped.

RMP [#5] Point taken. I'd point out, however, that the authors themselves never brought up relaxed selection in the main paper.

I'd be grateful if someone could elucidate this sentence from one of the Science articles:

"Given that a slower rate of variability at the single-nucleotide level in the X chromosome compared with autosomes has been interpreted as support for speciation models, this difference is worthy of further investigation." [Citation omitted.]

This appears near the bottom of the right-hand column on page 224 of the article at the following link: http://www.sciencemag.org/cgi/reprint/316/5822/222.pdf

The phylogeny showing the relationships between the various species of macaques is not completely resolved (I believe), but being that they're all in the same genus (Macaca), it's not likely to be very much divergence. Possibly less than that between humans and chimpanzees.

Why are they "lower" animals?
'Cause Man's so vain.

Else the genus Pan might have another species.

By Pete Dunkelberg on 13 Apr 2007

Every single person has a genetic contribution to mankind. If we lower the types and variety available in our gene pool we imperil our survival. Perhaps genes from other closely related species may be necessary. That would be way cool.

Pete (Re: Comment #9),

I'm not sure why you quoted me, but I re-read the section of my comment you quoted, then I re-read my entire comment (#6). I can't find any place in which I implied that rhesus monkeys are "lower" animals. I didn't even use the word "lower".

Like most people working in evolution, I don't usually deal in lower-higher distinctions. As you are probably aware, those sorts of distinctions rarely have useful meanings in the context of biology. The terms "ancestral" and "derived" come up much more often when we try to explain if a characteristic is more similar to the way we think it was in the common ancestor or if it has been changed significantly since then.

Now, you may occasionally hear scientists, including me, use the term "lower", but I'd argue that those instances are more conversational laziness than an actual misunderstanding of the facts on the part of the speaker. I would completely agree, however, that such laziness should be kept out of classrooms, papers, and public talks because it gives those unfamiliar with evolution the wrong idea about the picture of life that the field is showing us.

As for the number of species within the genus, Pan, that's always going to be a tough nut to crack. Anytime you deal with the question of how many species are in a certain genus, you have to consider exactly which one of the several different species definitions you're going to use to make your groupings. And, while the species definition argument is always a lively and thought-provoking discussion to have, my experience has told me that there really is no "one right way" to do species determinations. It just depends on what seems to most reflect the biology of the things you're studying. But, there's almost always a robust case to be made that you should have chosen a different species definition and, therefore, that the number of species in your genus should change.

Rick (Re: Comment #10),

I have always heard that the Celera Genome is all Craig Venter's DNA. In fact, based on conversations I've had with researchers who should know what they're talking about, I believe that to be the case. However, before I made my original comment (#6), I tried to do some fact checking via Google to make sure I could provide a reference for that assertion if one were demanded.

During this quick search, I couldn't find a source of irreproachable evidence for the claim that Celera is 100% Venter, and I found a few places that mentioned that the Celera genome was actually built from 5 individuals, only one of which was Venter. So, in the interests of honesty, I decided to go with the "watered down" number instead of just posting what I have heard for years but can't back up with proof, which is that the Celera Genome is 100% Craig Venter.

If anyone has a source that settles the question, please post it. I'm now interested in which account is the truth. I'm sure it will come up again at some point and I'd prefer to have a definitive answer.

Just another excellent, exciting post, Carl.

Thanks! And a hat-tip to the several who gave thoughtful responses to the interesting questions and comments on the thread...!

By Steviepinhead on 21 Apr 2007