Front Bioeng Biotechnol. 2015 Jan 14;2:88.
doi: 10.3389/fbioe.2014.00088. eCollection 2014.

Discovery of Protein-lncRNA Interactions by Integrating Large-Scale CLIP-Seq and RNA-Seq Datasets

Jun-Hao Li et al. Front Bioeng Biotechnol.

Abstract

Long non-coding RNAs (lncRNAs) are emerging as important regulatory molecules in developmental, physiological, and pathological processes. However, the precise mechanisms and functions of most lncRNAs remain largely unknown. Recent advances in high-throughput sequencing of immunoprecipitated RNAs after cross-linking (CLIP-Seq) provide powerful ways to identify biologically relevant protein-lncRNA interactions. In this study, by analyzing millions of RNA-binding protein (RBP) binding sites from 117 CLIP-Seq datasets generated by 50 independent studies, we identified 22,735 RBP-lncRNA regulatory relationships. We found that a single lncRNA is generally bound and regulated by one or multiple RBPs, the combination of which may coordinately regulate gene expression. We also revealed the expression correlation of these interaction networks by mining expression profiles of over 6000 normal and tumor samples from 14 cancer types. Our combined analysis of CLIP-Seq data and genome-wide association study (GWAS) data discovered hundreds of disease-related single nucleotide polymorphisms (SNPs) residing in the RBP binding sites of lncRNAs. Finally, we developed interactive web implementations to provide visualization, analysis, and downloading of the aforementioned large-scale datasets. Our study represents an important step in the identification and analysis of RBP-lncRNA interactions and shows that these interactions may play crucial roles in cancer and genetic diseases.

Keywords: CLIP-Seq; GWAS; RNA-Seq; RNA-binding protein; long non-coding RNA.
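
The SNP analysis described in the abstract amounts to asking which variants fall inside RBP binding-site intervals. The following is a minimal Python sketch of that overlap test, not the authors' pipeline. It assumes BED-style 0-based half-open coordinates; the function name `snps_in_binding_sites`, the SNP identifiers, and all coordinates are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def snps_in_binding_sites(snps, sites):
    """Map each SNP to the RBPs whose binding sites cover it.

    snps  : iterable of (snp_id, chrom, pos), pos is 0-based
    sites : iterable of (chrom, start, end, rbp), 0-based half-open [start, end)
    Returns {snp_id: sorted list of RBP names}; SNPs with no hit are omitted.
    """
    # Group binding sites by chromosome so each SNP is only compared
    # against sites on its own chromosome.
    sites_by_chrom = defaultdict(list)
    for chrom, start, end, rbp in sites:
        sites_by_chrom[chrom].append((start, end, rbp))

    hits = {}
    for snp_id, chrom, pos in snps:
        rbps = {
            rbp
            for start, end, rbp in sites_by_chrom[chrom]
            if start <= pos < end
        }
        if rbps:
            hits[snp_id] = sorted(rbps)
    return hits


# Toy input with made-up coordinates (assumption, for illustration only).
sites = [
    ("chr8", 100, 140, "HuR"),
    ("chr8", 120, 160, "U2AF65"),
    ("chr8", 300, 330, "eIF4AIII"),
]
snps = [
    ("snp_a", "chr8", 125),  # inside both the HuR and U2AF65 sites
    ("snp_b", "chr8", 200),  # between sites
    ("snp_c", "chr1", 125),  # chromosome with no sites
]
result = snps_in_binding_sites(snps, sites)
print(result)
```

The linear scan per chromosome is adequate for a toy example; with millions of binding sites, an interval tree or a sorted sweep would likely be a better fit.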

Figures

Figure 1
The genomic context distributions of binding sites for 47 human RBPs. Binding sites are mapped to genomic features in the following priority order: CDS, 3′UTR, 5′UTR, lncRNA, pseudogene, sncRNA, intron, intergenic.
Figure 2
The genome-wide binding map of Ezh2 in mouse. The outer track shows the mouse chromosomes, labeled with the lncRNAs bound by Ezh2. The red tiles of the inner track represent the genomic coordinates of the corresponding binding sites.
Figure 3
The distribution of lncRNAs bound by different numbers of RBPs. The subpanel magnifies the histogram bars for lncRNAs bound by more than 10 RBPs. SNHG1, GAS5, NEAT1, and SNHG16, which are bound by 42, 40, 39, and 39 RBPs, respectively, are marked.
Figure 4
The genome-wide binding map of HuR, Ago2, and MOV10 in human. The outermost track shows ideograms of chr1, chr13, and chr22 of the human genome. lncRNAs bound by these RBPs are labeled on the periphery, and those bound by at least two of the three RBPs at identical binding sites are colored red. The blue, green, and purple tracks indicate the binding positions of HuR, Ago2, and MOV10, respectively.
Figure 5
RBP–lncRNA interactions are supported by co-expression analysis in 14 types of cancer. (A) Histograms show RBP–lncRNA interactions with expression association (Pearson correlation, p < 0.05) in at least one cancer type. (B) The expression levels of PUM2 and TUG1 are positively correlated (p < 0.05) in all 14 cancers. (C) The PUM2 binding sites on TUG1 are inferred from PAR-CLIP data, and the consensus recognition motif UGURUAUA is conserved in mammals.
Figure 6
The GWAS-associated SNPs and binding sites of three RBPs in the PVT1 locus. Gene annotations from UCSC, lncRNAs from GENCODE, GWAS and LD SNPs, binding sites of eIF4AIII/HuR/U2AF65, and the LD plot from HapMap are shown in that order. The SNP rs10283090, which overlaps binding sites of HuR and U2AF65, is magnified in the bottom panel.
Figure 7
An example of displaying RBP target sites in the deepView Browser of starBase V2.0. The predicted FUS binding sites on MEG3 are visible in the RBP binding sites track. This track also shows the binding sites of other RBPs on MEG3, such as TDP-43 and PTB, which facilitates comparative analysis of binding events of multiple RBPs.
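
The priority order given in the Figure 1 caption can be read as a first-match rule: a binding site that overlaps several genomic features is counted under the highest-ranked one. The sketch below illustrates that rule in Python under stated assumptions. The function name `assign_feature` and the toy input are hypothetical, the overlap computation itself is assumed to have been done upstream, and ASCII apostrophes stand in for the prime symbols in 3′UTR and 5′UTR.

```python
from collections import Counter

# Priority order as listed in the Figure 1 caption (highest first).
PRIORITY = [
    "CDS", "3'UTR", "5'UTR", "lncRNA",
    "pseudogene", "sncRNA", "intron", "intergenic",
]


def assign_feature(overlapping_features):
    """Return the highest-priority feature a binding site overlaps.

    overlapping_features : iterable of feature names the site overlaps.
    A site that overlaps no annotated feature is treated as intergenic.
    """
    present = set(overlapping_features)
    for feature in PRIORITY:
        if feature in present:
            return feature
    return "intergenic"


# Toy input (assumption): each inner list holds the features one site overlaps.
site_overlaps = [
    ["intron", "lncRNA"],  # counted as lncRNA
    ["3'UTR", "CDS"],      # counted as CDS
    ["intron"],            # counted as intron
    [],                    # counted as intergenic
]
distribution = Counter(assign_feature(f) for f in site_overlaps)
print(distribution)
```

Resolving each site to a single category keeps the per-RBP distributions summing to the total number of binding sites, which is what a stacked distribution plot needs.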
