Nucleic Acids Res. 2012 Jun;40(11):5023-33.
doi: 10.1093/nar/gks144. Epub 2012 Feb 16.

Widespread occurrence of 5-methylcytosine in human coding and non-coding RNA

Jeffrey E Squires et al. Nucleic Acids Res. 2012 Jun.

Abstract

The modified base 5-methylcytosine (m5C) is well studied in DNA, but investigations of its prevalence in cellular RNA have been largely confined to tRNA and rRNA. In animals, the two m5C methyltransferases NSUN2 and TRDMT1 are known to modify specific tRNAs and have roles in the control of cell growth and differentiation. To map modified cytosine sites across a human transcriptome, we coupled bisulfite conversion of cellular RNA with next-generation sequencing. We confirmed 21 of the 28 previously known m5C sites in human tRNAs and identified 234 novel tRNA candidate sites, mostly in anticipated structural positions. Surprisingly, we discovered 10,275 sites in mRNAs and other non-coding RNAs. We observed that the distribution of modified cytosines between RNA types was not random; within mRNAs they were enriched in the untranslated regions and near Argonaute binding regions. We also identified five new sites modified by NSUN2, broadening its known substrate range to another tRNA, the RPPH1 subunit of RNase P and two mRNAs. Our data demonstrate the widespread presence of modified cytosines throughout coding and non-coding sequences in a transcriptome, suggesting a broader role of this modification in the post-transcriptional control of cellular RNA function.

Figures

Figure 1.
Single-nucleotide resolution mapping of m5C candidate sites in RNA. HeLa cell RNA preparations were spiked with a trace amount of in vitro transcribed R-Luc RNA and bisulfite-converted as detailed in the ‘Materials and Methods’ section. (A) Negative control R-Luc and (B) endogenous tRNA(AspGUC) as a positive control were first analyzed by conventional sequencing to establish the efficacy of bisulfite conversion (top panels; columns signify cytosine positions along the RNA sequence, rows represent individually sequenced alleles, open boxes indicate cytosine to uracil conversion read as thymidine in cDNA and filled boxes indicate a retained cytosine). Numbers below refer to cytosine positions in the primary RNA sequence. Nucleotide positions highlighted in red designate previously identified m5C sites in tRNA(AspGUC). Dual axis charts (bottom panels) display next-generation sequencing data mapped with a limit of 2 mismatches (2 mm) for the same control RNAs. Blue bars represent bisulfite-induced cytosine conversion, while red lines represent read coverage across individual residues. Top and bottom panels are aligned to each other by interrogated cytosine residues.
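The read-out described in this caption (unmethylated cytosine converted to uracil and read as thymidine in cDNA, methylated cytosine retained) can be illustrated with a minimal Python sketch. It assumes an idealized, fully efficient conversion; the function names and the toy sequence are hypothetical and are not taken from the paper.

```python
def bisulfite_convert(rna: str, methylated: set[int]) -> str:
    """Return the cDNA-level read of an RNA after idealized bisulfite treatment.

    Positions are 1-based. An unmethylated C is converted to U and read as T;
    a methylated C (m5C) is retained as C. Native U residues also read as T.
    Assumption: conversion is 100% efficient, which real experiments only approach.
    """
    out = []
    for pos, base in enumerate(rna.upper(), start=1):
        if base == "C":
            out.append("C" if pos in methylated else "T")
        elif base == "U":
            out.append("T")
        else:
            out.append(base)
    return "".join(out)


def retained_cytosines(rna: str, read: str) -> list[int]:
    """Return 1-based positions where a reference C is still read as C."""
    return [
        pos
        for pos, (ref, obs) in enumerate(zip(rna.upper(), read), start=1)
        if ref == "C" and obs == "C"
    ]


# Toy sequence (hypothetical) with a single methylated cytosine at position 3.
toy_rna = "GCCUACGC"
toy_read = bisulfite_convert(toy_rna, methylated={3})
print(toy_read)                               # GTCTATGT
print(retained_cytosines(toy_rna, toy_read))  # [3]
```

In the caption's terms, each retained position corresponds to a filled box and each converted cytosine to an open box.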
Figure 2.
Defining parameters for m5C candidate site selection using tRNA data. (A) Plot of conversion against read coverage at 28 known m5C sites in human tRNAs (43). The majority of the previously identified tRNA m5C sites had a conversion of 80% or less and read coverage of at least 10 (shaded area). (B) Dependence of the proportion of novel tRNA candidate sites in anticipated tRNA structural positions (red circles in tRNA cloverleaf cartoon) on chosen conversion cut-off. Read coverage threshold was ≥10. Colour code for dots and lines refers to mapping at different colour-space mismatch limits.
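The selection parameters named in this caption (conversion of 80% or less, read coverage of at least 10) can be expressed as a simple filter. The following is a minimal sketch, assuming per-site counts of converted (T) and retained (C) reads are already available; the class, the function and the example counts are hypothetical, and the authors' actual pipeline may apply further steps not shown here.

```python
from dataclasses import dataclass


@dataclass
class CytosineSite:
    name: str
    c_reads: int  # reads retaining C at this position (not converted)
    t_reads: int  # reads showing T at this position (converted)

    @property
    def coverage(self) -> int:
        return self.c_reads + self.t_reads

    @property
    def conversion(self) -> float:
        # Fraction of reads in which the cytosine was converted.
        return self.t_reads / self.coverage if self.coverage else float("nan")


def is_candidate(site: CytosineSite,
                 max_conversion: float = 0.80,
                 min_coverage: int = 10) -> bool:
    """Apply the caption's cut-offs: coverage >= 10 and conversion <= 80%."""
    return site.coverage >= min_coverage and site.conversion <= max_conversion


# Made-up counts for illustration only.
sites = [
    CytosineSite("site_A", c_reads=6, t_reads=14),  # coverage 20, conversion 0.70
    CytosineSite("site_B", c_reads=1, t_reads=49),  # coverage 50, conversion 0.98
    CytosineSite("site_C", c_reads=4, t_reads=2),   # coverage 6, too shallow
]
candidates = [s.name for s in sites if is_candidate(s)]
print(candidates)  # ['site_A']
```

Keeping the two cut-offs as parameters mirrors panel (B), where the conversion cut-off is varied while the coverage threshold stays at 10 or more.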
Figure 3.
Validation of novel m5C candidate sites. Conventional bisulfite sequencing data is shown for three novel sites, (A) residue 48 in tRNA(LysCUU), (B) residue 174 in the RNase P RNA component H1 (RPPH1), and (C) residue 748 in cyclin-dependent kinase 2 interacting protein mRNA (CINP). Top panels display results for endogenous transcripts. Data for spiked-in in vitro transcribed negative controls harboring the same sequence flanked by unique priming sites are also shown (middle panels) as are corresponding next-generation sequencing results (lower panels). Numbering of cytosine positions is as described in Figure 1; positions highlighted in red designate m5C sites identified by next-generation sequencing. See Supplementary Figure 3 for additional validation data.
Figure 4.
Analysis of methyltransferase target sites. HeLa cells were transfected with siRNAs targeting DNMT1, TRDMT1 and NSUN2 or a non-targeting control siRNA (NTC) as indicated on the left. Conventional bisulfite sequencing data was obtained as described in Figure 1 and is shown for (A) tRNA(AspGUC) (residue 38 is a known TRDMT1 target), (B) tRNA(LeuCAA) (residue 34 is a known NSUN2 target), (C) CINP and (D) nicotinate phosphoribosyltransferase domain containing 1 (NAPRT1) mRNAs, and (E) RPPH1 non-coding RNA. Nucleotide positions highlighted in red below designate m5C sites identified by next-generation sequencing. Green boxes indicate sites selectively responding to MTase depletion. Hatched boxes indicate intronic sequence. See Supplementary Figure S4 for data on RNAi knock-down efficiency and Supplementary Figure S2 for analysis of additional target sites.
Figure 5.
Analysis of bias in m5C candidate site location within mRNA. (A) Histogram showing relative enrichment of m5C sites in 3′ UTR and 5′ UTR compared to CDS (error bars: 95% confidence interval, Poisson distribution). (B) RNA-binding protein target density versus distance from m5C site in protein-coding transcripts (both CDS and UTR). Confidence bands are shown in grey (SEM).

