Genome Biol. 2014 Jan 7;15(1):R2.
doi: 10.1186/gb-2014-15-1-r2.

Advancing the functional utility of PAR-CLIP by quantifying background binding to mRNAs and lncRNAs

Matthew B Friedersdorf et al. Genome Biol.

Abstract

Background: Sequence-specific RNA-binding proteins are important regulators of gene expression. Several related crosslinking-based, high-throughput sequencing methods, including PAR-CLIP, have recently been developed to determine direct binding sites of global protein-RNA interactions. However, no studies have quantitatively addressed the contribution of background binding to datasets produced by these methods.

Results: We measured non-specific RNA background in PAR-CLIP data, demonstrating that covalently crosslinked background binding is common, reproducible and apparently universal among laboratories. We show that quantitative determination of background is essential for identifying targets of most RNA-binding proteins and can substantially improve motif analysis. We also demonstrate that by applying background correction to an RNA-binding protein of unknown binding specificity, Caprin1, we can identify a previously unrecognized RNA recognition element not otherwise apparent in a PAR-CLIP study.

Conclusions: Empirical background measurements of global RNA-protein crosslinking are a necessary addendum to other experimental controls, such as performing replicates, because covalently crosslinked background signals are reproducible and otherwise unavoidable. Recognizing and quantifying the contribution of background extends the utility of PAR-CLIP and can improve mechanistic understanding of protein-RNA specificity, protein-RNA affinity and protein-RNA association dynamics.

Figures

Figure 1
Control immunoprecipitations of PAR-CLIP gels show significant background RNA. Phosphorimages of SDS acrylamide gels following the PAR-CLIP protocol show radiolabeled RNPs from HEK293 cell lysates. The IP from each lysate contains four replicates. Brackets indicate extracted regions of the HuR RBP and controls using FLAG-GFP (tagged green fluorescent protein as G20, G35, G45) and no tagged protein.
Figure 2
PAR-CLIP background reads contain many T-to-C conversions and G-rich motifs. A) Pie charts of all PARalyzer-utilized reads from HuR PAR-CLIP and the three background gel slices (G45, G35 and G20) that contain T-to-C conversions, which suggest crosslinking events. PARalyzer-utilized reads are defined as sequences containing 0, 1 or 2 T-to-C conversions that map uniquely to the human reference genome. B) Pie charts of the genomic locations to which HuR, total RNA and background clusters map in the human reference genome. Background clusters are the union of all three background samples. Total RNA was prepared from 4SU-containing, crosslinked lysates that were partially digested with RNase T1. C) Motif logo of the 25 most frequent 8-mer motifs in the union of all three background samples. The G-rich motif is significantly enriched compared to shuffled sequences preserving the dinucleotide frequencies of the library (p-value = 7.04 × 10⁻⁸).
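The read definition in panel A can be illustrated with a small Python sketch. It assumes each read is already aligned to its reference segment on the same strand with no indels; the function names are hypothetical and this is not PARalyzer's actual implementation.

```python
def count_t_to_c(reference: str, read: str) -> int:
    """Count positions where the reference has T and the read has C.

    Assumes an ungapped, same-strand alignment of equal length.
    Other mismatch types are ignored by this sketch.
    """
    if len(reference) != len(read):
        raise ValueError("reference and read must have the same length")
    return sum(
        1
        for ref_base, read_base in zip(reference.upper(), read.upper())
        if ref_base == "T" and read_base == "C"
    )


def is_utilized(reference: str, read: str, max_conversions: int = 2) -> bool:
    """Keep reads with 0, 1 or 2 T-to-C conversions, as in the caption."""
    return count_t_to_c(reference, read) <= max_conversions


def conversion_fraction(pairs) -> float:
    """Fraction of (reference, read) pairs with at least one T-to-C conversion."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    converted = sum(1 for ref, read in pairs if count_t_to_c(ref, read) > 0)
    return converted / len(pairs)
```

Unique mapping to the genome, the other half of the definition, would be handled by the aligner and is outside this sketch.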
Figure 3
High-abundance background sites are common and reproducible between different PAR-CLIP background gel slices. Area-proportional elliptical Venn diagrams of reads from three background samples isolated from different size ranges on an SDS-PAGE gel. A) All sites with one or more reads. B) Sites with five or more reads. C) Sites with 25 or more reads.
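The counts behind such a Venn diagram can be sketched in Python as follows. The sketch assumes sites are matched across samples by an identical identifier and that per-site read counts are already available; the study may instead match sites by genomic overlap, and all names here are hypothetical.

```python
def sites_at_depth(counts: dict, min_reads: int) -> set:
    """Sites in one sample supported by at least min_reads reads."""
    return {site for site, n in counts.items() if n >= min_reads}


def venn_regions(samples: dict, min_reads: int) -> dict:
    """Map each combination of sample names to the number of sites
    found in exactly that combination at the given read threshold."""
    kept = {name: sites_at_depth(counts, min_reads) for name, counts in samples.items()}
    all_sites = set().union(*kept.values())
    regions = {}
    for site in all_sites:
        members = frozenset(name for name, sites in kept.items() if site in sites)
        regions[members] = regions.get(members, 0) + 1
    return regions


# Toy data only; these numbers are illustrative, not from the study.
toy = {
    "G20": {"a": 30, "b": 6, "c": 1},
    "G35": {"a": 40, "b": 2, "d": 1},
    "G45": {"a": 26, "b": 7},
}
for threshold in (1, 5, 25):
    shared = venn_regions(toy, threshold).get(frozenset(toy), 0)
    print(threshold, shared)
```

Raising the threshold shrinks every region of the diagram, so the share of sites common to all three samples can be compared across panels A to C.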
Figure 4
Experimentally measured background reads distinguish commonly detected background sites from authentic binding events. Three representative examples of genomic regions containing PAR-CLIP reads: MALAT1 lncRNA, ELAVL1 coding sequence (CDS), and ELAVL1 3’UTR. MALAT1 lncRNA and the ELAVL1 CDS have significant background binding (three middle panels G20, G35, G45) while the definitive HuR RBP binding site in ELAVL1 3’UTR lacks any reads from background libraries but contains reads in the total (Tot) library. Grey bars represent unique sequencing reads while blue/red marks or green/tan marks represent T-to-C conversions detected on the positive or negative genomic strand, respectively. The numbers in the upper left corners are the scale of the maximum read depth for an individual nucleotide. Depictions of these binding events to the full-length MALAT1 and ELAVL1 transcripts are shown in Additional file 3: Figure S3 and Additional file 4: Figure S4, respectively.
Figure 5
Background binding sites are present in many published PAR-CLIP libraries, especially those of putatively weak-binding RBPs. Previously published PAR-CLIP results from multiple studies were processed using PARalyzer and analyzed for sites that overlap with background and with each other. A) Genome browser window depicting overlapping sites in an exon of the XIST lncRNA from various PAR-CLIP studies. Red horizontal bars in the top panel indicate areas with read evidence in at least one of the three background libraries. Horizontal blue bars indicate reads from various PAR-CLIP studies, re-analyzed in this study, all with the same PARalyzer parameters. Black bars indicate PAR-CLIP studies not re-analyzed in this study but found in the doRiNA database. B) Bar graph depicting percent overlap of background sites with sites from 33 different PAR-CLIP experiments (see Additional file 1: Table S1 for info on RBP and study). Vertical red bars are from PAR-CLIPs of putative or “non-professional” RBPs, which are defined as previously unrecognized RBPs because they do not contain any known RNA recognition motifs or domains. Orange bars are PAR-CLIP experiments that produced very few reads (roughly 10,000 or fewer after processing and mapping). Blue bars are all other PAR-CLIP experiments using established RBPs.
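A percent-overlap figure of the kind plotted in panel B could be computed along these lines. This is a minimal Python sketch that assumes sites are half-open intervals given as (chromosome, strand, start, end) tuples and that any shared base on the same strand counts as overlap; the study's exact overlap criterion is not stated here, and the function names are hypothetical.

```python
def overlaps(a: tuple, b: tuple) -> bool:
    """True if two (chrom, strand, start, end) half-open intervals share a base."""
    same_strand = a[0] == b[0] and a[1] == b[1]
    return same_strand and a[2] < b[3] and b[2] < a[3]


def percent_overlap(sites: list, background: list) -> float:
    """Percentage of sites that overlap at least one background site.

    Uses a simple all-against-all comparison, which is fine for a sketch
    but would be replaced by an interval index for genome-scale data.
    """
    if not sites:
        return 0.0
    hits = sum(1 for site in sites if any(overlaps(site, bg) for bg in background))
    return 100.0 * hits / len(sites)


# Toy coordinates only; not taken from the study.
toy_sites = [
    ("chrX", "-", 100, 130),
    ("chrX", "-", 200, 220),
    ("chr1", "+", 10, 40),
    ("chr1", "+", 50, 60),
]
toy_background = [("chrX", "-", 120, 150), ("chr1", "-", 10, 40)]
print(percent_overlap(toy_sites, toy_background))
```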
Figure 6
Sites found in an increasing number of PAR-CLIP libraries increase in their percent overlap with PAR-CLIP background sites. X-axis shows bins of PAR-CLIP identified sites appearing in exactly the indicated number of different PAR-CLIP libraries. Y-axis indicates the percent overlap for those sites with background sites.
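The binning described for the two axes can be sketched in Python as below. The sketch assumes each site has a shared identifier across libraries and that background membership is a simple set lookup; both are simplifications, and the names are hypothetical.

```python
from collections import Counter


def overlap_by_library_count(libraries: dict, background: set) -> dict:
    """For each bin n (sites seen in exactly n libraries), return the
    percentage of those sites that are also background sites."""
    # Each library contributes a site at most once.
    n_libs = Counter(site for sites in libraries.values() for site in set(sites))
    totals = Counter()
    hits = Counter()
    for site, n in n_libs.items():
        totals[n] += 1
        if site in background:
            hits[n] += 1
    return {n: 100.0 * hits[n] / totals[n] for n in sorted(totals)}


# Toy identifiers only; not taken from the study.
toy_libraries = {
    "A": {"s1", "s2", "s3"},
    "B": {"s1", "s2"},
    "C": {"s1", "s4"},
}
toy_background = {"s1", "s3"}
print(overlap_by_library_count(toy_libraries, toy_background))
```

The returned dictionary maps the x-axis bin to the y-axis value, so it can be passed directly to a bar-plotting routine.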
Figure 7
Accounting for background binding dramatically improves motif finding from PAR-CLIP data. A) Percent change in motif enrichment after background correction for the full Pum motif (UGUAHAUA) and the half Pum motif (UGUA). Blue bars indicate improvement for the Pum library after background correction. HuR (red bars) and Total (green bars) are controls that do not specifically associate with Pum motifs. B) Matched-pair analysis of 7mers from uncorrected Caprin1 sites versus 7mers from background-corrected Caprin1 sites (normalized for library size). Motif 1 and Motif 2 are enriched by background correction and both are A-rich, with Motif 1 containing poly(A) signal sequences. Motif 3 is U-rich and moderately depleted by background correction, while Motif 4 is G-rich and dramatically depleted.
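The percent change in panel A can be illustrated with a short Python sketch. It assumes "enrichment" is simply the fraction of site sequences containing the motif, which is a stand-in for whatever enrichment statistic the study used, and it models background correction as dropping background sequences from the list. In IUPAC notation H means A, C or U; the function names are hypothetical.

```python
import re

# IUPAC RNA codes needed for the motifs in the caption, plus N.
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U", "H": "[ACU]", "N": "[ACGU]"}


def motif_regex(motif: str) -> "re.Pattern":
    """Translate an IUPAC RNA motif such as UGUAHAUA into a regex."""
    return re.compile("".join(IUPAC[base] for base in motif.upper()))


def motif_fraction(sequences: list, motif: str) -> float:
    """Fraction of sequences containing the motif at least once."""
    if not sequences:
        return 0.0
    pattern = motif_regex(motif)
    return sum(1 for seq in sequences if pattern.search(seq.upper())) / len(sequences)


def percent_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    if before == 0:
        raise ValueError("percent change is undefined when the starting value is 0")
    return 100.0 * (after - before) / before


# Toy sequences only; not taken from the study.
uncorrected = ["AAUGUAAAUAGG", "GGGGGGGG", "CCUGUACAUACC", "GGGAGGGG"]
background = {"GGGGGGGG", "GGGAGGGG"}
corrected = [seq for seq in uncorrected if seq not in background]

before = motif_fraction(uncorrected, "UGUAHAUA")
after = motif_fraction(corrected, "UGUAHAUA")
print(percent_change(before, after))
```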
Figure 8
Background correction removes misleading enrichment of HuR binding in regions downstream of transcription start sites. Reads from each library were normalized for number of nucleotides per library and length of region (fraction of nucleotides per library per Kb). Background reads were derived from the union of all three backgrounds. Numbers 5+, 10+ and 25+ indicate HuR clusters with at least 5, 10 or 25 reads per cluster. Regions 25-50 bp downstream of transcription start sites (green bars) and 50-100 bp downstream (purple bars) appear enriched prior to background correction; however, they were no longer enriched after correction.
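The normalization in this caption (fraction of nucleotides per library per Kb) can be written out as a small Python sketch. The caption does not say how the corrected values were derived, so the ratio to background shown here is only one plausible way to compare libraries and is an assumption, as are the function names.

```python
def density_per_kb(region_nt: int, library_nt: int, region_length_bp: int) -> float:
    """Fraction of a library's sequenced nucleotides that fall in a region,
    scaled per kilobase of region length."""
    if library_nt <= 0 or region_length_bp <= 0:
        raise ValueError("library size and region length must be positive")
    return (region_nt / library_nt) / (region_length_bp / 1000.0)


def fold_over_background(sample_density: float, background_density: float) -> float:
    """Ratio of sample to background density (an assumed comparison,
    not necessarily the correction used in the study)."""
    if background_density == 0:
        return float("inf")
    return sample_density / background_density


# Toy numbers only; not taken from the study.
hur = density_per_kb(region_nt=500, library_nt=1_000_000, region_length_bp=25)
bg = density_per_kb(region_nt=250, library_nt=500_000, region_length_bp=25)
print(fold_over_background(hur, bg))
```

A ratio near 1 in a given region would mean the sample is no more concentrated there than the background, which is the situation the caption describes for the regions just downstream of transcription start sites.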
