RNA. 2011 May;17(5):820-34.
doi: 10.1261/rna.2387911. Epub 2011 Mar 9.

MicroRNA transfection and AGO-bound CLIP-seq data sets reveal distinct determinants of miRNA action

Jiayu Wen et al. RNA. 2011 May.

Abstract

Microarray expression analyses following miRNA transfection/inhibition and, more recently, Argonaute cross-linked immunoprecipitation (CLIP)-seq assays have been used to detect miRNA target sites. CLIP and expression approaches measure differing stages of miRNA functioning: initial binding of the miRNP complex and subsequent message repression. We use nonparametric predictive models to characterize a large number of known target and flanking features, utilizing miRNA transfection, HITS-CLIP, and PAR-CLIP data. In particular, we utilize the precise spatial information provided by CLIP-seq to analyze the predictive effect of target flanking features. We observe distinct target determinants between expression-based and CLIP-based data. Target flanking features such as flanking region conservation are an important AGO-binding determinant; we hypothesize that CLIP experiments have a preference for strongly bound miRNP-target interactions involving adjacent RNA-binding proteins that increase the strength of cross-linking. In contrast, seed-related features are major determinants in expression-based studies, but less so for CLIP-seq studies, and increased miRNA concentrations typical of transfection studies contribute to this difference. While there is a good overlap between miRNA targets detected by miRNA transfection and CLIP-seq, the detection of CLIP-seq targets is largely independent of the level of subsequent mRNA degradation. Also, models built using CLIP-seq data show strong predictive power between independent CLIP-seq data sets, but are not strongly predictive for expression change. Similarly, models built from expression data are not strongly predictive for CLIP-seq data sets, supporting the finding that the determinants of miRNA binding and mRNA degradation differ. Predictive models and results are available at http://servers.binf.ku.dk/antar/.

Figures

FIGURE 2.
Predictive power of both individual miRNA-target features and feature combinations. (A) miRNA-interaction features are grouped into six categories by their sequence, structure, or positional characteristics (listed in the colored box). (B) The bar plot shows the univariate feature AUC for the positive set vs. negative sets in the miRNA transfection, HITS-CLIP, and PAR-CLIP data sets. The error bars show standard errors of the AUC. (C) Comparison of combined predictive performance for each feature category. AUCs for feature combinations in each of the six categories are shown as bars for all three data sets (leave-one-miRNA-family-out cross-validation; Random Forest classifier). Two additional feature combinations are also shown: the combination of all flanking features and the combination of all features. All flanking features include flanking nucleotide composition, target free-energy-based features, flanking strand asymmetry, flanking region conservation, and relative/minimum distance to 3′ UTR ends. Bar colors in B and C follow the same color scheme as A.
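As a rough illustration of the two evaluation ideas named in this caption, the sketch below computes a univariate AUC for a single feature and generates leave-one-family-out splits. It is not the authors' code: the function names, the toy scores, and the family labels are assumptions made for the example, and the AUC is computed with the standard pairwise (Mann-Whitney) definition rather than any specific method from the paper.

```python
def univariate_auc(pos_values, neg_values):
    """AUC of one feature: the probability that a randomly chosen positive
    site has a higher feature value than a randomly chosen negative site.
    Ties count as half a win (Mann-Whitney definition)."""
    wins = 0.0
    for p in pos_values:
        for n in neg_values:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_values) * len(neg_values))


def leave_one_family_out(families):
    """Yield (held_out_family, train_indices, test_indices) so that all
    sites of one miRNA family are held out together in each fold."""
    for held_out in sorted(set(families)):
        train = [i for i, f in enumerate(families) if f != held_out]
        test = [i for i, f in enumerate(families) if f == held_out]
        yield held_out, train, test


# Hypothetical feature values for positive and negative target sites.
example_auc = univariate_auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.1])

# Hypothetical family label for each site in a small data set.
example_splits = list(leave_one_family_out(["let-7", "miR-1", "let-7", "miR-124"]))
```

Holding out a whole family at once keeps sites that share a seed from appearing in both the training and test folds, which is the usual motivation for grouping the cross-validation this way.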
FIGURE 1.
Feature importance ranking and predictive power for miRNA transfection/inhibition, HITS-CLIP, and PAR-CLIP data sets. (A) Feature importance rankings were evaluated using the Gini impurity criterion (normalized between 0 and 1) from Random Forest classification for three data sets, separately. Top 30 features are shown: miRNA transfection data (dark blue, in decreasing order), HITS-CLIP (light blue), and PAR-CLIP (pink). (B) The heatmap shows the AUCs for the corresponding features across individual data sets and miRNA families (number of 3′ UTRs in each data set in parentheses). AUCs from high to low are represented by colors from red to white.
FIGURE 3.
Association of targets detected by PAR-CLIP and targets detected by knock-down expression, stratified by fold change. The distribution for each decile of fold change after inhibiting the top 25 miRNAs in HEK293 cells (x-axis) is plotted against the proportion of transcripts showing at least one PAR-CLIP cluster (y-axis). The least-squares fitted line for the proportion is shown.
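The kind of analysis this caption describes can be sketched as follows: transcripts are sorted by fold change, split into ten equal-sized bins, the fraction with at least one CLIP cluster is computed per bin, and a least-squares line is fitted to those fractions. This is a minimal sketch under assumptions; the function names and the toy data are hypothetical, and the paper's exact binning procedure is not given here.

```python
def decile_proportions(fold_changes, has_cluster):
    """Sort transcripts by fold change, split them into 10 bins of
    (nearly) equal size, and return the fraction of transcripts in each
    bin that carry at least one cluster. Needs at least 10 transcripts."""
    order = sorted(range(len(fold_changes)), key=lambda i: fold_changes[i])
    n = len(order)
    proportions = []
    for d in range(10):
        bin_indices = order[d * n // 10:(d + 1) * n // 10]
        hits = sum(1 for i in bin_indices if has_cluster[i])
        proportions.append(hits / len(bin_indices))
    return proportions


def least_squares_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy data: 20 transcripts, the 10 with the largest fold change have a cluster.
toy_fold_changes = list(range(20))
toy_has_cluster = [i >= 10 for i in range(20)]
toy_props = decile_proportions(toy_fold_changes, toy_has_cluster)
toy_slope, toy_intercept = least_squares_line(list(range(1, 11)), toy_props)
```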
FIGURE 4.
Comparison of feature predictive power between PAR-CLIP groups of the top-10 and bottom-10 expressed miRNA families. (A) The heatmap shows the AUCs for the corresponding features across individual data sets and miRNA families. AUCs from high to low are represented by colors from red to white. (B) Comparison of AUCs for each feature category between the two groups. P-values measure the statistical significance of AUC differences.
FIGURE 5.
Comparisons of predictive models trained on miRNA transfection, HITS-CLIP, and PAR-CLIP data sets. (A) The heatmap shows the predictive performance (measured in AUC) of models using all features, trained and validated on different data sets. Diagonal cells (yellow framed) show the AUCs for leave-one-miRNA-family-out cross-validation of the models. Off-diagonal cells show the model trained on the row data set and tested on the column data set. (B) Comparisons of the correlation of expression change and posterior probability of prediction generated by the miRNA transfection-trained model (red) and the PAR-CLIP-trained model (blue). The two models were applied to miRNA transfection and protein-OE data sets without thresholding on fold change. Linear regression lines were fitted to all predicted targets with posterior probability ≥0.5, and dashed lines are the 95% confidence interval of the fitting. (C) The performance of the model trained on the miRNA transfection data using all features and, for comparison, several current target prediction programs. All were evaluated on the protein-OE data set. The ROC curves (red) show the true positive rate (sensitivity) vs. false positive rate (1−specificity) for the positive set (note: fold change threshold ≥1.4-fold) vs. the negative set (low fold change) classifications. Short red horizontal lines on the ROC curves mark predictions with an FDR limit of 50%, 20%, and 10% (FDR estimation in the Supplemental Materials and Methods), showing the trade-off between sensitivity and specificity. Colored dots represent maximum sensitivity (percentage of miRNA-target interactions predicted from the positive set) vs. 1−specificity (percentage of miRNA-target interactions predicted from the negative set) of predicted miRNA-target interactions for each corresponding prediction program. The CLIP-seq models evaluated on the same data are also shown.
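To make the ROC quantities in panel C concrete, the sketch below computes one point on a ROC curve, i.e. the sensitivity and 1−specificity obtained when every interaction with a posterior probability at or above a threshold is called a target. The function name, the scores, and the labels are illustrative assumptions; the FDR estimation mentioned in the caption is described in the paper's supplement and is not reproduced here.

```python
def roc_point(scores, labels, threshold):
    """Return (sensitivity, 1 - specificity) when every item with
    score >= threshold is predicted positive.
    labels: 1 for the positive set, 0 for the negative set."""
    true_pos = sum(1 for s, l in zip(scores, labels) if l == 1 and s >= threshold)
    false_pos = sum(1 for s, l in zip(scores, labels) if l == 0 and s >= threshold)
    n_pos = sum(1 for l in labels if l == 1)
    n_neg = sum(1 for l in labels if l == 0)
    return true_pos / n_pos, false_pos / n_neg


# Hypothetical posterior probabilities and set membership for five interactions.
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.2]
toy_labels = [1, 1, 1, 0, 0]
sensitivity, one_minus_specificity = roc_point(toy_scores, toy_labels, 0.5)
```

Sweeping the threshold from high to low and collecting these points traces the full ROC curve; a program that outputs only a fixed set of predictions corresponds to a single such point, as with the colored dots in the panel.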
