RNA Biol. 2019 Nov;16(11):1534-1546.
doi: 10.1080/15476286.2019.1637680. Epub 2019 Aug 12.

miRBaseMiner, a tool for investigating miRBase content

Xiangfu Zhong et al. RNA Biol. 2019 Nov.

Abstract

MicroRNAs (miRNAs) are small non-coding RNA molecules that play a central role in gene regulation. miRBase is the standard reference source for analysis and interpretation of experimental studies. However, the richness and complexity of the annotation are often underappreciated by users. Moreover, even for experienced users, the size of the resource can make it difficult to explore the annotation to determine features such as species coverage, the impact of specific characteristics, and changes between successive releases. A further consideration is that each new miRBase release contains entries that have had limited review and that may be removed in a future release to ensure the quality of annotation. To aid the miRBase user, we developed a software tool, miRBaseMiner, for investigating miRBase annotation and generating custom annotation sets. We apply the tool to characterize each release from v9.2 to v22 to examine how annotation has changed across releases and to highlight some of the annotation features that users should keep in mind when using miRBase for data analysis. These include: (1) entries with identical or very similar sequences; (2) entries with multiple annotated genome locations; (3) hairpin precursor entries with extremely low estimated minimum free energy; (4) entries possessing reverse complementarity; (5) entries with 3' poly(A) ends. As each of these factors can impact the identification of dysregulated features and subsequent clinical or biological conclusions, miRBaseMiner is a valuable resource for anyone using miRBase as a reference source.
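Three of the features listed above (identical sequences, reverse complementarity and 3' poly(A) ends) can be illustrated with a minimal Python sketch. This is not miRBaseMiner's code or interface: the function names, the toy entry names and sequences, and the `min_run=4` threshold for a poly(A) end are all assumptions made for illustration.

```python
from collections import defaultdict

# Toy records: hypothetical names and sequences, not taken from miRBase.
entries = {
    "xxx-miR-1": "UGAGGUAGUAGGUUGUAUAGUU",
    "xxx-miR-2": "UGAGGUAGUAGGUUGUAUAGUU",  # identical to xxx-miR-1
    "xxx-miR-3": "AACUAUACAACCUACUACCUCA",  # reverse complement of xxx-miR-1
    "xxx-miR-4": "CUGACCUAUGAAUUGACAAAAA",  # ends in a run of A
}

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(seq):
    """Return the RNA reverse complement of seq."""
    return seq.translate(COMPLEMENT)[::-1]


def group_by_sequence(entries):
    """Map each sequence to the list of entry names carrying it."""
    by_seq = defaultdict(list)
    for name, seq in entries.items():
        by_seq[seq].append(name)
    return by_seq


def identical_groups(entries):
    """Groups of entry names that share exactly the same sequence."""
    groups = group_by_sequence(entries).values()
    return sorted(sorted(names) for names in groups if len(names) > 1)


def reverse_complement_pairs(entries):
    """Pairs of entries whose sequences are exact reverse complements."""
    by_seq = group_by_sequence(entries)
    pairs = set()
    for name, seq in entries.items():
        for other in by_seq.get(reverse_complement(seq), []):
            if other != name:
                pairs.add(tuple(sorted((name, other))))
    return sorted(pairs)


def has_poly_a_end(seq, min_run=4):
    """True if seq ends in at least min_run A's (threshold is an assumption)."""
    return seq.endswith("A" * min_run)


poly_a_entries = sorted(n for n, s in entries.items() if has_poly_a_end(s))
```

Exact matching is the simplest possible criterion; near-identical sequences would need a distance measure instead.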

Keywords: microRNA; NGS; annotation; characterization; miRBase; miRBaseMiner; miRNA.

Figures

Figure 1.
(a) Schematic of miRNA nomenclature used in miRBase release 22. Each entry contains the following fields delimited with a hyphen: (1) three to seven letters indicating species; (2) miR/mir indicating miRNA and miRNA hairpin precursor, respectively; (3) numeric suffix that is assigned sequentially to new entries (in plant miRNAs there is no hyphen delimiter between this field and the species field). Additional letters indicate that the mature miRNA sequence is shared by entries with the same numeric suffix, and additional numbers indicate that the miRNA is generated from different hairpin precursors; (4) 3p/5p indicates from which arm of the hairpin precursor the miRNA was generated. (b) Overview of content in successive releases of miRBase (Y-axis corresponds to miRBase releases from 9.2 to 22). Left: bar plot showing the number of annotated species (X-axis) in each miRBase release. Middle: heat map of the number of miRNA entries for the 26 species with more than 500 entries in miRBase v22. X-axis corresponds to the 26 species, ordered by total number of miRNAs for each species. Only a few species contain a large number of miRNAs (in red). Right: bar plot showing the total number of miRNA entries summed over all species (X-axis). (c) Sequence length distribution of miRNAs from the 26 species in (b). The average miRNA length is 21-22 nucleotides, but many entries are shorter or longer than this.
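The naming scheme in panel (a) can be sketched as a regular expression. The pattern below is an illustrative assumption built only from the caption (species prefix, miR/mir, numeric suffix with optional letters and copy number, optional 3p/5p arm). It is not miRBase's or miRBaseMiner's parser, it makes the hyphen after miR/mir optional as a simplification, and it does not cover names outside this scheme (for example let-7 family names). `NAME_RE` and `parse_mirna_name` are hypothetical names.

```python
import re

# Pattern assembled from the caption's description; an assumption, not an
# official grammar for miRBase identifiers.
NAME_RE = re.compile(
    r"^(?P<species>[a-z]{3,7})-"   # (1) species prefix
    r"(?P<kind>miR|mir)-?"         # (2) miR = mature, mir = hairpin precursor
    r"(?P<number>\d+)"             # (3) sequentially assigned numeric suffix
    r"(?P<letter>[a-z]+)?"         #     optional letter(s)
    r"(?:-(?P<copy>\d+))?"         #     optional number for distinct hairpins
    r"(?:-(?P<arm>[35]p))?$"       # (4) optional hairpin arm
)


def parse_mirna_name(name):
    """Split a miRNA name into its fields, or return None if it does not fit."""
    match = NAME_RE.match(name)
    if match is None:
        return None
    fields = match.groupdict()
    fields["kind"] = "mature" if fields["kind"] == "miR" else "hairpin"
    fields["number"] = int(fields["number"])
    return fields


parsed = parse_mirna_name("hsa-miR-19b-1-5p")
```

Fields that are absent from a name come back as `None`, so a caller can tell "no arm annotated" apart from a parse failure (which returns `None` for the whole name).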
Figure 2.
(a) Bar plots showing the number of miRNA (right column) and pre-miRNA (left column) entries that were updated between successive versions of miRBase from release 9.2 to 22. The rows correspond to four categories: NEW, NAME, SEQUENCE and DELETE. In each plot, X-axis: miRBase versions in chronological order. Y-axis: the number of updated miRNAs/hairpin precursors. Red corresponds to human data and green to mouse. (b) Five types of sequence changes that occur between successive miRBase releases. The first three examples are changes between releases 21 and 22. The last two examples are changes between miRBase versions 17 and 18. Nucleotide changes in sequences between two releases are marked in blue. The bottom sequence is the original miRNA sequence in the previous version of miRBase; the top sequence is from the newer release. The number after each sequence box represents the frequency of the corresponding type of miRNA sequence change in miRBase 22.
Figure 3.
The presence of identical sequences in human and mouse miRNA entries from miRBase version 9.2 to 22 (a, b), and sequence similarity in miRBase 22 (c, d). (a) Left: human miRNAs; right: mouse miRNAs. In miRBase 9.2, there are no miRNAs in human or mouse sharing an identical sequence with other entries. The colour denotes the number of miRNAs annotated with that sequence, with yellow to red indicating an increasing number. The number in each cell represents the number of miRNA entries with the sequence in that row in the miRBase version corresponding to that column. X-axis indicates the miRBase version; Y-axis indicates the duplicated miRNA sequence. (b) Examples of miRNAs sharing an identical sequence. The number in the green hexagon refers to the corresponding row in (a). The text in red upper case indicates the type of annotation change; NEW: newly added miRNA; DELETE: miRNA entry deleted from miRBase. The text above the arrows indicates the miRBase versions in which that change occurred. Right-hand plots: the similarity network of human and mouse miRNAs and hairpin precursors in miRBase version 22, based on the pairwise Levenshtein distance matrix for Levenshtein distances less than three nucleotides. (c) Human miRNA and hairpin precursor network; (d) mouse miRNA and pre-miRNA network. Each dot represents a miRNA or hairpin precursor. Blue edge: two similar miRNAs; red edge: pre-miRNA and its respective miRNA; green edge: two similar hairpin precursors. The darker colour (blue/green) corresponds to a Levenshtein distance equal to 0; the lighter colour corresponds to larger Levenshtein distances.
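The similarity networks in panels (c) and (d) rest on pairwise Levenshtein distances below three. A minimal sketch of that idea follows, using the standard dynamic-programming edit distance. The function names and toy sequences are hypothetical, and this is not presented as the implementation used by miRBaseMiner.

```python
from itertools import combinations


def levenshtein(a, b):
    """Edit distance (insertions, deletions, substitutions) between a and b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (ca != cb),  # substitution (free on a match)
            ))
        prev = curr
    return prev[-1]


def similarity_edges(seqs, max_distance=2):
    """Edges (name1, name2, distance) for all pairs within max_distance.

    max_distance=2 mirrors "less than three nucleotides" in the caption.
    """
    edges = []
    for (n1, s1), (n2, s2) in combinations(sorted(seqs.items()), 2):
        d = levenshtein(s1, s2)
        if d <= max_distance:
            edges.append((n1, n2, d))
    return edges


# Toy sequences (hypothetical, not from miRBase).
toy = {"a": "ACGUACGU", "b": "ACGUACGU", "c": "ACGUACGA", "d": "GGGGCCCC"}
edges = similarity_edges(toy)
```

Comparing all pairs is quadratic in the number of entries, which is acceptable for a single species but would likely need indexing or pre-filtering by length for a whole release.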
Figure 4.
The workflow of miRBaseMiner.

Grants and funding

This work was supported by Helse Sør-Øst Grants [2016122, 2015034] and Norwegian Research Council Grant [274715].
