Sci Rep. 2015 Jan 23:5:8004.
doi: 10.1038/srep08004.

MBSTAR: multiple instance learning for predicting specific functional binding sites in microRNA targets

Sanghamitra Bandyopadhyay et al. Sci Rep.

Abstract

MicroRNA (miRNA) regulates gene expression by binding to specific sites in the 3' untranslated regions of its target genes. Machine learning based miRNA target prediction algorithms first extract a set of features from potential binding sites (PBSs) in the mRNA and then train a classifier to distinguish targets from non-targets. However, they do not consider whether the PBSs are functional, and consequently produce high false positive rates. This substantially affects follow-up functional validation by experiments. We present a novel machine learning based approach, MBSTAR (Multiple instance learning of Binding Sites of miRNA TARgets), for accurate prediction of true or functional miRNA binding sites. A multiple instance learning framework is adopted to handle the lack of information about the actual binding sites in the target mRNAs. A total of 9531 biologically validated interacting and 973 non-interacting miRNA-mRNA pairs are identified from Tarbase 6.0 and confirmed with a PAR-CLIP dataset. MBSTAR achieves the highest number of binding sites overlapping with PAR-CLIP, with a maximum F-score of 0.337. Compared to the other methods, MBSTAR also predicts target mRNAs with the highest accuracy. The tool and genome-wide predictions are available at http://www.isical.ac.in/~bioinfo_miu/MBStar30.htm.
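
For readers unfamiliar with the two ideas the abstract relies on, the following is a minimal illustrative sketch, not MBSTAR's implementation. It assumes the standard multiple instance learning convention (a bag, here a miRNA-mRNA pair, is positive if at least one of its instances, here a potential binding site, is positive) and assumes the reported F-score is the balanced F1 measure. The function names, the score threshold, and the example numbers are hypothetical.

```python
from typing import Sequence


def bag_label(instance_scores: Sequence[float], threshold: float = 0.5) -> bool:
    """Standard MIL assumption (not confirmed as MBSTAR's rule):
    a bag is positive if at least one instance scores at or above the threshold."""
    return any(score >= threshold for score in instance_scores)


def f_score(tp: int, fp: int, fn: int) -> float:
    """Balanced F-score (F1): harmonic mean of precision and recall.
    Returns 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical site scores for two miRNA-mRNA pairs (bags).
positive_pair = [0.12, 0.81, 0.33]   # one site passes the threshold
negative_pair = [0.05, 0.20]         # no site passes the threshold

print(bag_label(positive_pair))      # bag treated as interacting
print(bag_label(negative_pair))      # bag treated as non-interacting
print(f_score(tp=2, fp=1, fn=1))     # precision = recall = 2/3
```

Under this convention only the pair-level label is needed for training, which is why the framework suits data where the exact binding site within a validated target is unknown.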

Figures

Figure 1. Distribution of PAR-CLIP gold standard clusters according to (a) biological complexity and (b) read numbers (number of T→C conversions).
Figure 2. Scatter plot of false positive rate and true positive rate of MBSTAR and other algorithms (miRanda, TargetScan100p, MirTarget2 and SVMicrO) for verified positive and non-target interactions. The plot also contains the ROC curve of MBSTAR, with an AUC (area under curve) of 0.7.
Figure 3. Scatter plot of precision and recall of MBSTAR and other algorithms (miRanda, TargetScan100p, MirTarget2 and SVMicrO) for verified positive and non-target interactions. The plot also contains the precision-recall curve of MBSTAR, with an AUC (area under curve) of 0.93.
Figure 4. Distribution of predicted PAR-CLIP gold standard clusters according to biological complexity by (a) MBSTAR, (b) miRanda, (c) MirTarget2, (d) SVMicrO, (e) TargetScan25p, (f) TargetScan50p, (g) TargetScan75p and (h) TargetScan100p.
Figure 5. Cumulative distribution of sensitivity for MBSTAR, miRanda, TargetScan, MirTarget2 and SVMicrO according to (a) biological complexity and (b) number of T→C conversions.
Figure 6. Empirical cumulative distribution of sensitivity for MBSTAR, miRanda, MirTarget2, SVMicrO and TargetScan according to normalized scores.
Figure 7. Number of predicted sites verified by PAR-CLIP for MBSTAR, miRanda, MirTarget2, SVMicrO and TargetScan according to normalized cut-off score intervals.
Figure 8. Process flowchart of the proposed MBSTAR.
Figure 9. Classification of positive and negative instances by the multiple instance learning methodology when only the bag label is known.
Figure 10. Interaction prediction for one positive bag (hsa-let-7a/NM_181833) and one negative bag (hsa-miR-9/NM_172216) with the MIL technique, where (+) denotes a true binding site of a miRNA target and (-) denotes a negative binding-site instance.
