Recently there has been a flood of press about epigenetics and non-coding RNA. What is lacking from these articles is a description of how DNA is packaged and what DNA elements such as promoters and enhancers do. Today I would like to touch upon all of these subjects with a post on how DNA is organized and how this affects the turning on or off of genes.
OK here we go ...
One of the biggest findings over the past couple of years is how the act of transcription feeds back onto the organization of DNA.
What do I mean by that?
Well, in our cells, DNA is wrapped around highly conserved structures called nucleosomes, each made up of eight "histone" proteins (two copies each of histone 2A, 2B, 3 and 4). These histone subunits have long tails that jut out of the main body of the nucleosome and accumulate modifications such as methylation or acetylation. If you were to reach into a cell and pull out the DNA, it would resemble a pearl necklace, except that the string (the DNA) would be wrapped twice around each pearl (the nucleosome), and each pearl would also have hair-like projections (the histone tails) that can be decorated by different modifications. Neighboring nucleosomes tend to have similar modifications, so the modifications cluster at distinct places in the necklace and thus mark these regions.
Modifications found on any particular nucleosome influence how the cell treats the wrapped DNA. Take promoter regions, for example. These are DNA segments where RNA Polymerase II (or Pol II for short), the major mRNA-synthesizing enzyme, starts to transcribe DNA into RNA. It turns out that these promoter regions are usually wrapped around nucleosomes that are modified on histone 3 so that the lysine at amino acid number 4 is tri-methylated. The terminology for such a modification is H3K4me3 (H3 = histone 3; K4 = lysine at position 4; me3 = trimethylated). This modification can recruit large protein complexes to the promoter, such as the Nucleosome Remodeling Factor or NURF. Such proteins can either directly recruit Pol II and Pol II activating proteins or, in the case of NURF, loosen up the connections between the DNA and the nucleosome to give Pol II and Pol II activating proteins a better chance to recognize the promoter sequences. In the end, all these interactions help to stimulate the transcription of a neighboring gene.
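If the naming scheme feels cryptic, here is a minimal Python sketch that splits a mark name such as H3K4me3 into its parts. The `HistoneMark` type, the `parse_mark` function and the regular expression are my own illustrative names, not part of any real library, and the pattern is an assumption that only covers simple methylation and acetylation names.

```python
import re
from typing import NamedTuple


class HistoneMark(NamedTuple):
    histone: str       # which histone carries the mark, e.g. "H3"
    residue: str       # one-letter amino acid code, e.g. "K" for lysine
    position: int      # residue number within the histone, e.g. 4
    modification: str  # e.g. "me3" (trimethylated) or "ac" (acetylated)


# Assumption: only simple names such as H3K4me3 or H3K9ac are handled.
_MARK_RE = re.compile(r"^(H2A|H2B|H3|H4)([A-Z])(\d+)(me[123]|ac)$")


def parse_mark(name: str) -> HistoneMark:
    """Split a histone mark name into histone, residue, position and modification."""
    match = _MARK_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized histone mark: {name!r}")
    histone, residue, position, modification = match.groups()
    return HistoneMark(histone, residue, int(position), modification)


print(parse_mark("H3K4me3"))
```

Calling `parse_mark("H3K9me1")` works the same way and reports lysine 9 of histone 3 carrying a single methyl group.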
Other histone modifications have different roles, some stimulating and others inhibiting transcription. An army of labs is trying to figure out exactly what's going on. What is strange about this whole histone modification business is that the sequence of marks along the genome (i.e. the pattern of modifications present along our necklace) is inherited by the two daughter cells after division. That means that as the DNA is copied, each new strand gets loaded onto nucleosomes that are modified in the exact same way as in the parental DNA. No one knows how this can happen! (Yes, this is a big problem in biology right now!) This weird type of inheritance is what is often termed "epigenetic" - daughter cells not only inherit a sequence of DNA but also a pattern of modified histones found at various points along the genome.
Recently there has been evidence that histone modifications not only influence RNA production but also the reverse. For example, look at this map of various histone modifications found on nucleosomes along 1000 highly active and 1000 silent genes in an immune cell line (ChIP-Solexa sequencing, from a 2007 Cell paper from the Zhao group, see the refs).
You can look at these graphs and say, so histone modifications dictate gene expression! The real answer may be something even stranger. One characteristic of these modifications is how they vary along the length of a transcribed portion of DNA. For example, mono-methylation of histone 3 on lysine 9 (H3K9me1) is really high on a region of the DNA just around the transcription start site (txStart) and slowly decreases towards the end of the transcribed DNA (txEnd). Other modifications show other patterns, but they all tend to vary depending on how the underlying DNA is transcribed. It is likely that these modification patterns are responsive to transcription.
How could this work?
Well, certain histone-modifying enzymes, such as Set1, the yeast histone H3-lysine 4 (H3K4) methylase, bind to the C-terminal tail (aka C-terminal domain or CTD) of Pol II. You could imagine that as RNA Pol II transcribes an mRNA, this enzyme behaves like an ant and leaves a trail of pheromones behind, but in this case the pheromones are histone modifications. As other RNA polymerases come along, the modifications left by their predecessors will influence how they work and how their RNA transcripts should be treated. What is neat about Pol II's CTD is that it consists of a seven amino-acid sequence (YSPTSPS) that is repeated many times (52 times in the human Pol II). At the beginning of transcription, proteins that associate with the promoter add a phosphate to serine 5 (the second serine in the repeat) in many of these repeats. Yes, another mark! As Pol II transcribes a new RNA message, serine 5 loses its phosphate and serine 2 gains a phosphate. So this mark also serves as a timer for the Pol II enzyme! Based on the phosphate composition of the CTD, we roughly know how long Pol II has been making RNA. It is now easy to imagine how various histone modification enzymes could climb on or off Pol II along its journey and mark the underlying histones. Histone modifications (a spatial mark) are thus coupled to Pol II CTD phosphorylation state (a temporal mark).
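To make the "timer" idea concrete, here is a toy Python sketch. It assumes, purely for illustration, that serine 5 phosphates are lost and serine 2 phosphates gained linearly as Pol II moves along the gene; the real kinetics are certainly messier. The function names `ctd_state` and `estimate_progress` are hypothetical.

```python
HEPTAD = "YSPTSPS"  # the repeated seven amino-acid sequence from the text
N_REPEATS = 52      # number of repeats in human Pol II, per the text

# Serine 2 and serine 5 are the 2nd and 5th residues of each repeat.
assert HEPTAD[1] == "S" and HEPTAD[4] == "S"


def ctd_state(progress: float) -> dict:
    """Toy CTD phosphorylation state after a given fraction of the gene is transcribed.

    Assumption (not from the literature): Ser5-P decreases and Ser2-P
    increases linearly with progress, where 0.0 is the start and 1.0 the end.
    """
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must be between 0 and 1")
    return {
        "Ser5P": round(N_REPEATS * (1.0 - progress)),
        "Ser2P": round(N_REPEATS * progress),
    }


def estimate_progress(ser5p: int, ser2p: int) -> float:
    """Read the 'timer' backwards: estimate progress from the phosphate counts."""
    total = ser5p + ser2p
    if total == 0:
        raise ValueError("no phosphorylated repeats to read")
    return ser2p / total


state = ctd_state(0.75)
print(state, estimate_progress(state["Ser5P"], state["Ser2P"]))
```

The point is only that an enzyme able to tell Ser5-P from Ser2-P could, in principle, tell the start of a gene from its end.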
But there is another way that Pol II can influence histone modifications. For this example we will look at silent regions of the genome that contain long repeats, aka heterochromatic loci, in fission yeast. In these regions Pol II chugs along producing transcripts from the "silent DNA", except that as the newly made RNA exits Pol II, it is recognized by siRNAs that are loaded onto the RITS complex. This multi-protein monster then does all sorts of crazy things - it attracts histone modifying enzymes, it triggers the decay of the newly made transcript, and it recruits an RNA-dependent RNA polymerase that helps generate more siRNAs. In addition, the histone modifications can spread to neighboring nucleosomes like wildfire, thereby silencing the entire region near the heterochromatic loci. Paradoxically, it would seem that the RITS complex also needs histone modifications to be properly recruited to these heterochromatic loci. It is thus clear that there exist very intricate feedback loops between RNA transcription and histone modification. To read more about RITS, see this entry.
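The "wildfire" spreading can be pictured with a very crude Python sketch. It assumes a deterministic rule where, at each step, every marked nucleosome passes its mark to its immediate neighbors; real spreading is probabilistic and is held back by boundary elements, none of which is modeled here. The function name `spread` is hypothetical.

```python
from typing import List


def spread(marks: List[bool], steps: int) -> List[bool]:
    """Spread a silencing mark along a row of nucleosomes.

    Assumption: at each step, every marked nucleosome (True) marks the
    nucleosome on either side of it. No boundaries, no mark removal.
    """
    current = list(marks)
    for _ in range(steps):
        nxt = list(current)
        for i, marked in enumerate(current):
            if marked:
                if i > 0:
                    nxt[i - 1] = True
                if i < len(current) - 1:
                    nxt[i + 1] = True
        current = nxt
    return current


# One marked nucleosome in the middle of a row of seven.
row = [False, False, False, True, False, False, False]
for step in range(4):
    print(step, "".join("#" if m else "." for m in spread(row, step)))
```

Starting from a single marked nucleosome, the marked patch grows by one nucleosome on each side per step until the whole row is covered.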
Now having told you all this I want to throw out there one last concept. Besides histone modification, the actual position of the nucleosome contributes quite a bit to how a given genomic sequence is treated by the cellular machinery. Thus if you slide the nucleosome to a new position, then certain DNA sequences become "more" accessible to Pol II and other DNA binding proteins. Likewise when the histones accumulate certain modifications they tend to bind the DNA more tightly or more loosely. All these factors feed back onto transcription.
You see it is a very complicated mess, but one that allows for a very dynamic regulation of gene expression.
Tomorrow I'll talk about a recent manuscript that appeared in Nature last month where the authors show that the transcription of non-coding stretches of DNA found in front of a particular gene alters how nucleosomes are associated with that gene's promoter and thus affects how that gene is transcribed by RNA Polymerase II.
Refs:
Huck Hui Ng, François Robert, Richard A. Young and Kevin Struhl
Targeted Recruitment of Set1 Histone Methylase by Elongating Pol II Provides a Localized Mark and Memory of Recent Transcriptional Activity
Mol. Cell (2003) 11:709-719
Artem Barski, Suresh Cuddapah, Kairong Cui, Tae-Young Roh, Dustin E. Schones, Zhibin Wang, Gang Wei, Iouri Chepelev and Keji Zhao
High-Resolution Profiling of Histone Methylations in the Human Genome
Cell (2007) 129:823-837
Great post, dude!
You see it is a very complicated mess, but one that allows for a very dynamic regulation of gene expression.
Presumably this will also have implications for the evolution of novel phenotypes. Fascinating post!