BMC Bioinformatics. 2010 Sep 22;11:476.
doi: 10.1186/1471-2105-11-476.

Improving performance of mammalian microRNA target prediction

Hui Liu et al. BMC Bioinformatics.

Abstract

Background: MicroRNAs (miRNAs) are single-stranded non-coding RNAs known to regulate a wide range of cellular processes by silencing gene expression at the protein and/or mRNA levels. Computational prediction of miRNA targets is essential for elucidating the detailed functions of miRNAs. However, the specificity and sensitivity of existing algorithms are still too poor to generate meaningful, workable hypotheses for subsequent experimental testing. Constructing a richer and more reliable training data set, and developing an algorithm that properly exploits it, is the key to improving the performance of current prediction algorithms.

Results: A comprehensive training data set is constructed for mammalian miRNAs, with positive targets obtained from miRecords, the most up-to-date miRNA target depository, and negative targets derived from 20 microarray data sets. A new algorithm, SVMicrO, is developed. It has a 2-stage structure in which a site support vector machine (Site-SVM) is followed by a UTR-SVM. SVMicrO makes predictions based on 21 optimal site features and 18 optimal UTR features, selected by training from a comprehensive collection of 113 site features and 30 UTR features. SVMicrO performance has been evaluated comprehensively on the training data, proteomics data, and immunoprecipitation (IP) pull-down data. Comparisons with several popular algorithms demonstrate consistent improvements in prediction specificity, sensitivity and precision in all tested cases. All related materials, including source code and genome-wide predictions of human targets, are available at http://compgenomics.utsa.edu/svmicro.html.

Conclusions: A new 2-stage, SVM-based miRNA target prediction algorithm called SVMicrO is developed and shown to achieve robust performance. It holds the promise of continued improvement whenever better training data become available, that is, data containing additional verified or high-confidence positive targets and properly selected negative targets.

Figures

Figure 1
The block diagram of SVMicrO. SVMicrO includes three steps. First, a site filter is applied to find the potential binding sites of the probing miRNA. Second, Site-SVM extracts features from each potential site and assigns a score indicating the confidence that the site is a true site. Finally, the site scores, together with other UTR features, are considered by the UTR-SVM to produce the final prediction of whether the UTR is a target.
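
The three-step data flow in this caption can be sketched as follows. This is a minimal illustration, not the SVMicrO implementation: the seed-match rule in `site_filter`, the two placeholder scorers, and the toy sequences are all assumptions. In SVMicrO the two scoring stages are trained SVMs operating on the selected site and UTR features.

```python
from typing import Callable, List

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def site_filter(mirna: str, utr: str) -> List[int]:
    """Step 1 (assumed rule): UTR positions that pair with miRNA nucleotides 2-7."""
    seed_match = reverse_complement(mirna[1:7])
    return [i for i in range(len(utr) - 5) if utr[i:i + 6] == seed_match]


def predict_utr(
    mirna: str,
    utr: str,
    site_scorer: Callable[[str, str, int], float],
    utr_scorer: Callable[[List[float]], float],
) -> float:
    """Steps 2 and 3: score every candidate site, then score the whole UTR."""
    sites = site_filter(mirna, utr)
    site_scores = [site_scorer(mirna, utr, pos) for pos in sites]
    return utr_scorer(site_scores)


def toy_site_scorer(mirna: str, utr: str, pos: int) -> float:
    """Hypothetical stand-in for Site-SVM: favor an A right after the seed match."""
    following = utr[pos + 6:pos + 7]
    return 1.0 if following == "A" else 0.5


def toy_utr_scorer(site_scores: List[float]) -> float:
    """Hypothetical stand-in for UTR-SVM: add up the site scores."""
    return sum(site_scores)


# Toy inputs, made up for illustration only.
mirna = "UGGAAUGUAAAGAAGUAUGUAU"
utr = "AAACAUUCCAGGGCAUUCCUU"
candidate_sites = site_filter(mirna, utr)
utr_score = predict_utr(mirna, utr, toy_site_scorer, toy_utr_scorer)
```

Passing the scorers in as callables keeps the two stages separable, so a trained model could replace either placeholder without changing the data flow.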
Figure 2
Binding structure and regional definition of miRNA and target site.
Figure 3
Binding sequence logo of miR-1 predicted by Site-SVM. The 22 nucleotides of the miR-1 sequence are plotted from 5' to 3'. The height of each nucleotide is proportional to its probability of binding to the site.
Figure 4
Comparison of ROC curves based on training data. (a) Entire ROC curves based on training data. (b) Zoom-in view of ROC curves based on training data. To investigate the performance of SVMicrO, its ROC performance obtained from cross-validation was compared with that of several other popular target prediction algorithms, including TargetScan, PITA, PicTar and miRanda.
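
For readers unfamiliar with how such ROC curves are built, the sketch below computes ROC points and the area under the curve from prediction scores and binary labels. It is a generic illustration under stated assumptions (no tied scores, at least one positive and one negative example), and the scores and labels are invented, not data from the paper.

```python
from typing import List, Sequence, Tuple


def roc_points(scores: Sequence[float], labels: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) points.

    Examples are visited from highest to lowest score. Assumes no tied
    scores and that both classes are present in `labels`.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for i in order:
        if labels[i]:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Invented toy data: 1 marks a true target, 0 a non-target.
toy_scores = [0.9, 0.8, 0.4, 0.3]
toy_labels = [1, 0, 1, 0]
toy_curve = roc_points(toy_scores, toy_labels)
toy_auc = auc(toy_curve)
```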
Figure 5
Number of true positives among top-ranked predictions. This figure shows the precision of each algorithm as the number of true targets found among varying numbers of top-ranked genes.
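
The quantity plotted here can be computed as in the following sketch, which counts true targets among the k highest-scoring genes for several values of k. The function name, gene identifiers, and scores are hypothetical and serve only to illustrate the calculation.

```python
from typing import Dict, Iterable, Set


def true_positives_at_k(
    scores: Dict[str, float], positives: Set[str], ks: Iterable[int]
) -> Dict[int, int]:
    """Count known targets among the top-k genes, ranked by descending score."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return {k: sum(1 for gene in ranked[:k] if gene in positives) for k in ks}


# Invented toy predictions and validated targets.
toy_scores = {"g1": 0.9, "g2": 0.8, "g3": 0.4, "g4": 0.3, "g5": 0.1}
toy_positives = {"g1", "g3", "g5"}
counts = true_positives_at_k(toy_scores, toy_positives, [1, 3, 5])
```

Dividing each count by its k gives precision at k, which is how curves of this kind relate to precision.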
Figure 6
Cumulative sum of protein fold change as a function of ranked predictions. (a) Cumulative sum of protein fold change as a function of ranked predictions for hsa-miR-124. (b) Cumulative sum of protein fold change as a function of ranked predictions for hsa-miR-1. SVMicrO shows a faster drop in cumulative fold change (CFC) than the other algorithms, which indicates that SVMicrO achieves higher precision and fewer false positives.
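
A cumulative fold change curve of this kind can be sketched as below. The sketch assumes that fold changes are expressed on a log scale, so that down-regulated proteins contribute negative values; the caption does not state the scale, and the gene names and numbers are invented.

```python
from itertools import accumulate
from typing import Dict, List


def cumulative_fold_change(
    scores: Dict[str, float], fold_change: Dict[str, float]
) -> List[float]:
    """Running sum of fold changes, visiting genes from highest to lowest score.

    Assumes log-scale fold changes (negative means down-regulated) and that
    every scored gene has a measured fold change.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    return list(accumulate(fold_change[gene] for gene in ranked))


# Invented toy data.
toy_scores = {"g1": 0.9, "g2": 0.8, "g3": 0.4}
toy_fold_change = {"g1": -1.0, "g2": -0.5, "g3": 0.25}
cfc_curve = cumulative_fold_change(toy_scores, toy_fold_change)
```

Under this assumption, a ranking that places truly repressed targets first produces a curve that falls steeply at the start, which is the behavior the caption attributes to SVMicrO.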
Figure 7
ROC curves for the predictions of miR-124 tested on IP pull-downs. The ROC curves were plotted based on 388 high-confidence positive targets determined by IP pull-down experiments.
Figure 8
Number of true positives among top ranked predictions of miR-124.
Figure 9
ROC curves for the predictions of miR-1 tested on IP pull-downs. The ROC curves were plotted based on 56 high-confidence positive targets determined by IP pull-down experiments.
Figure 10
Number of true positives among top ranked predictions of miR-1.
