PLoS Comput Biol. 2015 Sep 1;11(9):e1004441.
doi: 10.1371/journal.pcbi.1004441. eCollection 2015 Sep.

Analysis of Nearly One Thousand Mammalian Mirtrons Reveals Novel Features of Dicer Substrates

Jiayu Wen et al. PLoS Comput Biol.

Abstract

Mirtrons are microRNA (miRNA) substrates that utilize the splicing machinery to bypass the necessity of Drosha cleavage for their biogenesis. Expanding our recent efforts for mammalian mirtron annotation, we use meta-analysis of aggregate datasets to identify ~500 novel mouse and human introns that confidently generate diced small RNA duplexes. These comprise nearly 1000 total loci distributed in four splicing-mediated biogenesis subclasses, with 5'-tailed mirtrons as, by far, the dominant subtype. Thus, mirtrons surprisingly comprise a substantial fraction of endogenous Dicer substrates in mammalian genomes. Although mirtron-derived small RNAs exhibit overall expression correlation with their host mRNAs, we observe a subset with substantial differences that suggest regulated processing or accumulation. We identify characteristic sequence, length, and structural features of mirtron loci that distinguish them from bulk introns, and find that mirtrons preferentially emerge from genes with larger numbers of introns. While mirtrons generate miRNA-class regulatory RNAs, we also find that mirtrons exhibit many features that distinguish them from canonical miRNAs. We observe that conventional mirtron hairpins are substantially longer than Drosha-generated pre-miRNAs, indicating that the characteristic length of canonical pre-miRNAs is not a general feature of Dicer substrate hairpins. In addition, mammalian mirtrons exhibit unique patterns of ordered 5' and 3' heterogeneity, which reveal hidden complexity in miRNA processing pathways. These include broad 3'-uridylation of mirtron hairpins, atypically heterogeneous 5' termini that may result from exonucleolytic processing, and occasionally robust decapitation of the 5' guanine (G) of mirtron-5p species defined by splicing. 
Altogether, this study reveals that this extensive class of non-canonical miRNA bears a multitude of characteristic properties, many of which raise general mechanistic questions regarding the processing of endogenous hairpin transcripts.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Examples of novel mirtrons confidently annotated in this study.
(A) Drosha-mediated and splicing-mediated pathways for the generation of Dicer-substrate pre-miRNA hairpins. For both canonical miRNA loci and mirtron loci of all four classes, the critical judgement for their annotation is whether the small RNA evidence supports the notion that their progenitor hairpins were subject to Dicer cleavage to generate specific miRNA/miRNA* (star) duplexes. The thickness of the arrows leading to Dicer indicates the relative number of substrates generated by each pathway. (B) Mirtron supported by especially robust evidence, including the presence of both miRNA and star reads in Ago-IP data and the recovery of abundant phased loop reads. Note that loop reads were not found in the Ago-IP data, providing evidence for the selection of specific species in Ago complexes. (C) A mirtron locus lacking substantial "star" read evidence, whose confidence is bolstered by hundreds of reads in Ago-IP libraries. (D) Example of a skin-restricted mirtron whose reads are highly represented in Ago1-IP, Ago2-IP and Ago3-IP data. (E) Example of a heart-restricted mirtron. Over 96% of reads collected from nearly 700 human small RNA datasets were from heart. (F) Example of a "two-tailed" mirtron supported by abundant Ago-IP reads. We infer that generation of the pre-miRNA involves splicing followed by removal of both 5' and 3' tails (see A).
Fig 2. Greatly expanded annotations of human and mouse mirtrons.
(A) Numbers of splicing-derived miRNAs in human and mouse, categorized as conventional, 5'-tailed, 3'-tailed, and two-tailed mirtrons. Most of the miRNAs newly annotated in this study were 5'-tailed mirtrons, reflecting their status as the dominant mirtron class in human and mouse. (B) Few mirtrons were annotated from small RNA data in both mouse and human, and only a subset of these were constrained in primary sequence. (C, D) Human and mouse mirtrons are generally modestly expressed, but were annotated to higher levels of evidence than hundreds of human and mouse miRNAs in the miRBase registry (i.e. that have <50 reads in the aggregate data analyzed in this study). Most mirtrons were supported by evidence from Ago-IP datasets (red bars). (E, F) Cumulative distribution function (CDF) plots of enrichment of canonical miRNAs and mirtron-derived miRNAs in Ago complexes. (E) Analysis of human small RNAs. Rat RmC was used as control IP; since Ago4 is not expressed in HeLa cells, it effectively serves as another control IP. Canonical miRNAs were enriched in Ago1-3-IP data as well as input RNA (which is mostly composed of Ago-bound miRNAs), relative to control IP data. Mirtron-derived small RNAs showed similar Ago-IP enrichment, except that they also exhibited enrichment between Ago1-3-IP and input RNA libraries. (F) Analysis of mouse small RNAs shows similar enrichment of canonical miRNAs and mirtron-derived small RNAs in Ago1 and Ago2 complexes relative to control IgG complex.
Fig 3. Correlation of mirtron and host gene expression.
(A) We calculated the Pearson correlation coefficients of the accumulation of mouse mirtron-derived small RNAs and spliced RNA-seq reads directly flanking the mirtron across seven tissues. We also performed 100 control comparisons where the tissue origins were shuffled. The cumulative distribution function (CDF) of these correlations was plotted, and observed to be significantly positively correlated (by Mann-Whitney U-test). (B) The binned distribution of mirtron/mRNA Pearson correlation coefficients was plotted. This visualization emphasizes their positive correlation, but also highlights a subset of discordant loci. (C) Examples of correlated and discordant expression of mirtron-derived miRNAs and host mRNAs across tissues. We show host level gene expression as reads per kilobase of transcript per million mapped reads (RPKM) and the spliced exonic reads that directly cross the mirtronic locus as reads per million mapped reads (RPM). Mirtron-derived miRNAs are quantified as reads per million mapped miRNA reads (RPMM).
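
The comparison described in Fig 3A can be sketched in a few lines. This is a minimal illustration and not the authors' pipeline: the function names `pearson` and `shuffled_controls`, the fixed random seed, and the seven toy expression values are all assumptions made for the example. It computes the Pearson correlation between a mirtron's small RNA profile and its host's flanking spliced-read profile across tissues, then builds a null set of 100 correlations by permuting the tissue order of one profile.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one profile is constant
    return cov / math.sqrt(var_x * var_y)


def shuffled_controls(mirtron, host, n_shuffles=100, seed=0):
    """Correlations obtained after permuting the tissue order of the host profile."""
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    controls = []
    for _ in range(n_shuffles):
        permuted = list(host)
        rng.shuffle(permuted)
        controls.append(pearson(mirtron, permuted))
    return controls


# Toy values for one locus across seven tissues (illustrative, not real data).
mirtron_rpmm = [5.0, 12.0, 3.0, 40.0, 8.0, 1.0, 20.0]
host_rpm = [2.0, 6.0, 1.5, 18.0, 5.0, 0.5, 9.0]

observed = pearson(mirtron_rpmm, host_rpm)
controls = shuffled_controls(mirtron_rpmm, host_rpm)
```

In a full analysis, `observed` would be collected for every mirtron locus and the resulting distribution compared against the pooled shuffled correlations, for example with a Mann-Whitney U-test as the legend describes.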
Fig 4. Sequence and length properties of mirtron-containing introns.
(A) Comparison of mirtron-bearing introns with total introns in human. The distribution of total intron lengths is much broader than for mirtrons. The dominant class of 5'-tailed mirtrons derives mostly from introns that are <3 kb in length, while the 3'-tailed mirtrons and conventional mirtrons derive from very short introns. (B, C) Nucleotide bias of small RNAs from 5'-tailed mirtrons. Three anchor points were considered, as schematized on the 5'-tailed mirtron model in the center (1, 2, 3, arrows). (B) Biased nucleotide identities of mirtron-5p reads from the dominant class of 5'-tailed mirtrons. Compared to an equivalent sequence range of control introns of similar length, mirtron-5p reads exhibit substantial 5'-U bias and overall enrichment of G across their lengths. The G bias is greater in the 5' than 3' regions of the mirtron-5p reads, and is not evident in bulk intron sequences downstream of their ~22 nt lengths. (C) Biased nucleotide identities of mirtron-3p reads from the dominant class of 5'-tailed mirtrons. Compared to control introns, there is substantial 5'-U bias (evident when the reads are aligned by their 5' ends) and substantial C-bias across their length. Note that the bulk introns exhibit polypyrimidine tracts upstream of the splice acceptor site (YAG), but mirtrons exhibit greater representation of C while control introns show greater bias for U. (D) Mirtronic regions exhibit much lower minimum free energy (MFE) than control intronic regions. CDF (cumulative distribution function) is plotted for MFE/base distribution. (E) All four classes of mirtrons are hosted by genes with greater numbers of introns than average genes. Various classes of other intronic non-coding RNAs (e.g. tRNAs, snoRNAs, and either conserved or non-conserved canonical miRNAs) typically reside in genes with larger numbers of introns than bulk genes, but their averages are intermediate to all classes of mirtrons. (F) Bar graphs that emphasize the individual properties of genes that host various classes of non-coding RNAs. It is evident that all four classes of mirtrons have a broader distribution of intron numbers relative to other types of non-coding RNAs.
Fig 5. Broader distribution of hairpin lengths in mirtrons vs. canonical miRNAs.
(A) Example of a conventional mirtron with extremely long pre-miRNA hairpin. (B) Example of two-tailed mirtron with extremely long pre-miRNA hairpin. In both cases, small RNA reads were recovered specifically from the genomically distant miRNA/star duplexes. (C) Analysis of mouse pre-miRNA lengths. The left plot illustrates individual loci, while the right plot summarizes their overall behavior. Canonical miRNAs exhibit a very tight distribution with no pre-miRNAs greater than 82 nt. The average lengths of 3'-tailed, 5'-tailed, and two-tailed mirtron pre-miRNAs are similar to canonical pre-miRNAs, but 5'-tailed mirtrons exhibit a noticeably broader length distribution. Conventional mirtrons exhibit noticeably longer pre-miRNA lengths than the other classes. (D) Analysis of human pre-miRNA lengths. Their overall properties are similar to mouse loci, including the subpopulation of long 5'-tailed mirtron hairpins and the substantially increased length of conventional mirtron pre-miRNAs as a class.
Fig 6. Distinct patterns of terminal heterogeneity in mirtron-derived small RNAs.
(A, B) 5'-end heterogeneity in the 5p and 3p reads from human (A) and mouse (B) miRNA loci. There are several distinctions between canonical miRNAs and specific classes of mirtrons. These include substantial populations of mirtron-5p reads that lack their 5' nucleotide defined by splice donor sites, namely 5p reads from conventional mirtrons and 3'-tailed mirtrons (*), and overall greater 5' heterogeneity in the 5' reads from 5'-tailed mirtrons (#). (C, D) 3'-end heterogeneity in the 5p and 3p reads from human (C) and mouse (D) miRNA loci. Particularly notable are the dominant populations of 3'-tailed reads from 3p arms of conventional mirtrons and 5'-tailed mirtrons (marked by + signs), i.e., those reads that are defined by splice acceptor sites.
Fig 7. Unexpected patterns of 5' heterogeneity and processing of mirtrons.
(A) A 5'-tailed mirtron in Irak1 exhibits strong heterogeneity in its 5p species that differ in register by 2 nt; this is accompanied by strong heterogeneity in its 3p species. Inspection of this array of isomiR sequences suggests that distinct 5' ends of 5p species may instruct alternative Dicer cleavage. This may be accompanied by subsequent 3' resection of "long" 3p reads produced by Dicer cleavage closer to the terminal loop, and retention of 3'-uridylated 3p reads when the Dicer cleavage is further from the loop. (B) A counter-example in which broad 5' heterogeneity of 5p species, here distributed equally over three nucleotides, is not accompanied by 5' heterogeneity of 3p species, which are extremely precisely defined. All of these reads accumulate similarly in total and Ago-IP data. (C-E) Frequent 5' decapitation of select mirtron-5p reads defined by splicing. (C) Example of a 3'-tailed mirtron (hsa-mir-4745) exhibiting nearly complete decapitation of 5'-G from its 5p reads (i.e., "xU" reads). These reads are present in multiple Ago-IP datasets. (D) Example of a conventional mirtron (hsa-mir-1236) exhibiting high-frequency "xU" reads supported by Ago-IP evidence. (E) Summary of 5' heterogeneity amongst 5p reads from human conventional and 3'-tailed mirtrons, indicating that many mirtrons are subject to 5' decapitation.
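
The 5' heterogeneity and decapitation summaries in Figs 6 and 7 amount to tabulating where reads begin relative to the splice-defined 5' end. The sketch below is one way to do that and is not taken from the paper: the function name `five_prime_offset_fractions`, the plus-strand coordinate convention, and the toy coordinates are assumptions for illustration. An offset of 0 means a read starts at the annotated 5' nucleotide, and an offset of +1 corresponds to a read missing that first nucleotide (an "xU" read when the 5' G is removed).

```python
from collections import Counter


def five_prime_offset_fractions(read_starts, annotated_start):
    """Fraction of reads at each 5' offset from the annotated start.

    Assumes plus-strand coordinates, so a positive offset means the read
    begins downstream of the annotated 5' end.
    """
    counts = Counter(start - annotated_start for start in read_starts)
    total = sum(counts.values())
    if total == 0:
        return {}  # no reads, nothing to summarize
    return {offset: n / total for offset, n in sorted(counts.items())}


# Toy example (hypothetical coordinates): 8 of 10 reads lack the first nucleotide.
annotated_start = 100
read_starts = [101] * 8 + [100] * 2

fractions = five_prime_offset_fractions(read_starts, annotated_start)
decapitated_fraction = fractions.get(1, 0.0)
```

For minus-strand loci the sign of the offset would need to be flipped, and in practice each read would be weighted by its count in the library.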
