Last week’s excitement about the discovery of “seventh” and “eighth” DNA bases might’ve obscured parts of that work that shed a little more light on a very murky corner of epigenetics’ cytosine modifications — “Where do the methyls go?” And related research also published last week gets a bit closer to answering, “What does 5-hydroxymethyl-C really do, anyway?”
The first, a Science paper by Yi Zhang and his group at the University of North Carolina, does introduce two new cytosine modifications, 5-formylcytosine and 5-carboxylcytosine. But more interesting, I think, is that the group gets halfway to proving a plausible mechanism for cytosine demethylation, which no one has demonstrated yet. And during the same week, researchers at UCLA and New England Biolabs (which, full disclosure, owns this site) reported in Genome Biology that 5hmC is concentrated in gene enhancer sequences and in gene bodies themselves, in the first genome-wide map of 5hmC in human embryonic stem cells.
As every epigeneticist knows, cells can methylate cytosine with enzymes like DNMT1, regulating nearby gene expression by turning it down or completely off. These Cs can’t stay methylated forever, and in plants, a DNA repair mechanism snips out a little of the methylated strand and fills in the gap. But no one knows how it’s done in mammals.
Zhang’s group got a clue from other base-changing chemistry. “The proposed reaction of [5mC] demethylation is very similar to the conversion of thymine to uracil,” Zhang says. “So that’s how we initially identified Tet.” The thymine-to-uracil reaction has two oxidation steps and a decarboxylating step carried out by two different enzymes. And since Tet had already been shown to oxidize 5mC into 5hmC, his group saw that enzyme as a prime suspect.
With the right thin-layer chromatography conditions and modification-insensitive restriction enzymes, the UNC researchers present convincing evidence that Tet enzymes 1, 2, and 3 do indeed convert 5mC in short DNA fragments to 5hmC, 5hmC to 5-formylC, and 5-formylC to 5-carboxylC.
Would it surprise him if 5-formylC and 5-carboxylC turned out to have regulatory roles? “Probably not,” says Zhang. “The fact that you can detect them suggests they will probably have some function — otherwise, if there’s no regulatory role, [the reaction] should go all the way to carboxyl.”
Still, the group can’t rule out the possibility that a DNA-repair-type mechanism operates in mammals too. “The next step is looking for the decarboxylase,” says Zhang. “Without finding the decarboxylase, this pathway is not proved.”
Concerning old-hat 5hmC, to me the most interesting part of the UCLA-NEB group’s study is its finding that this “sixth” DNA base tends to show up in enhancer regions and within actual gene sequences. Led by UCLA’s Steven Jacobsen, the group used hmeDIP-seq (immunoprecipitating fragmented DNA with a 5hmC antibody) to make the first genomic map of 5hmC in human embryonic stem cells. Here’s how the authors summarize some of the study’s findings:
“As did Song et al., we found that a large fraction of 5hmC peaks were enriched over genes. However, we also found that 5hmC is enriched over predicted hESC enhancers further suggesting a potential role of 5hmC in gene regulation. Moreover, we observed enrichment of 5hmC peaks with transcription binding sites such as those of pluripotency factors OCT4 and NANOG.”
But it doesn’t look like the evidence is clear enough yet to call 5hmC a regulatory modification. In fact, the truth might be a whole lot more complicated.
“Plotting the distribution of 5hmC peaks over RefSeq genes with different expression levels, we observed that 5hmC is enriched near the transcription start sites (TSS) of lowly expressed genes, whereas 5hmC is depleted at TSS of highly expressed genes (Figure 1c). This is in contrast to data reported by Song et al. that suggested that 5hmC levels positively correlate with expression in mouse cerebellum, suggesting possible differences in the role of 5hmC in different tissues.”
So, possibly downregulatory in ES cells, but apparently upregulatory in mouse brain. However, because previous studies have reported that 5hmC inhibits DNMT1 and methyl-CpG-binding protein, the authors suggest that this modification might act to negatively regulate 5mC. In effect, this may encourage enhancer proteins and transcription factors to bind, the authors reason.
The 5hmC modification also seems to show up often where the G-C content of one strand switches from high-G to high-C. Such switches might be characteristic of replication termini and recombination hotspots, and the UCLA group suggests 5hmC plays a protein-binding role in these places too.
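For readers who want a feel for what a strand switching from high-G to high-C looks like in practice, here is a minimal Python sketch. It uses the standard GC-skew measure, (G - C) / (G + C), over fixed windows and flags any window whose skew sign flips relative to the previous one, in either direction. How the study itself detected these switches isn't described here, so the window size, the function names, and the toy sequence are all assumptions made for illustration.

```python
def gc_skew(window: str) -> float:
    """Return (G - C) / (G + C) for one strand; 0.0 if the window has no G or C."""
    seq = window.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def skew_switches(seq: str, window: int = 10, step: int = 10):
    """Yield start positions of windows whose skew sign is opposite to the
    last window that had a non-zero skew (hypothetical helper, not from the paper)."""
    previous = 0.0
    for start in range(0, len(seq) - window + 1, step):
        current = gc_skew(seq[start:start + window])
        if previous * current < 0:  # sign flipped: G-rich to C-rich or the reverse
            yield start
        if current != 0.0:
            previous = current


# Made-up toy sequence: a G-rich stretch followed by a C-rich stretch.
toy = "GGGAGGTGGA" + "CCTCCACCCA"
switches = list(skew_switches(toy))
print(switches)
```

On real data you would use much larger windows and then ask whether 5hmC peaks fall near the flagged positions more often than expected by chance.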
(Edited to add: Picture of a newfangled Mystery Machine by Flickr user luckylynda74 used here under a Creative Commons license.)
Ito, S., Shen, L., Dai, Q., Wu, S., Collins, L., Swenberg, J., He, C., & Zhang, Y. (2011). Tet Proteins Can Convert 5-Methylcytosine to 5-Formylcytosine and 5-Carboxylcytosine. Science, 333 (6047), 1300-1303. DOI: 10.1126/science.1210597
Stroud, H., Feng, S., Morey Kinney, S., Pradhan, S., & Jacobsen, S. (2011). 5-Hydroxymethylcytosine is associated with enhancers and gene bodies in human embryonic stem cells. Genome Biology, 12 (6). DOI: 10.1186/gb-2011-12-6-r54