Nucleic Acids Res. 2013 Jan;41(2):e35.
doi: 10.1093/nar/gks967. Epub 2012 Nov 5.

Long non-coding RNAs function annotation: a global prediction method based on bi-colored networks

Xingli Guo et al. Nucleic Acids Res. 2013 Jan.

Abstract

Growing evidence demonstrates that long non-coding RNAs (lncRNAs) play key roles in diverse biological processes. There is a critical need to annotate the functions of the increasing number of available lncRNAs. In this article, we apply a global network-based strategy to this problem for the first time. We develop a bi-colored network-based global function predictor, the long non-coding RNA global function predictor ('lnc-GFP'), to predict probable functions for lncRNAs at large scale by integrating gene expression data and protein interaction data. The performance of lnc-GFP is evaluated on protein-coding and lncRNA genes. Cross-validation tests on protein-coding genes with known function annotations indicate that our method can achieve a precision of up to 95% with a suitable parameter setting. Of the 1713 lncRNAs in the bi-colored network, the 1625 (94.9%) in the maximum connected component are all functionally characterized. For lncRNAs expressed in mouse embryonic stem cells and neuronal cells, the putative functions inferred by our method closely match those reported in the literature.

Figures

Figure 1.
Principles of lnc-GFP. (A) The coding–non-coding bi-colored network is represented as a graph. (B) Function T is used to compute the previous knowledge score between an unannotated lncRNA v and the given function category f. (C) Function S is used to compute the final association score between v and f based on the genes known to be annotated with f. The computation not only simulates the iterative propagation of the ‘function flow’ on the network but also applies a local constraint given by the previous knowledge score.
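The caption above describes an iterative propagation of scores over the network, tempered by a local prior. The exact definitions of T and S are not given here, so the following is only a minimal sketch of that general idea, assuming a standard propagation-with-restart update. The function name `propagate_scores`, the mixing weight `alpha`, the iteration count and the toy node names are all hypothetical and are not taken from lnc-GFP.

```python
def propagate_scores(adjacency, prior, alpha=0.5, iterations=50):
    """Spread prior scores over an undirected graph (illustrative only).

    adjacency: dict mapping each node to the set of its neighbours
    prior: dict mapping each node to a prior score (stands in for T)
    alpha: assumed weight of network flow versus the local prior
    Returns a dict of final scores (stands in for S).
    """
    scores = dict(prior)
    for _ in range(iterations):
        updated = {}
        for node, neighbours in adjacency.items():
            # Each neighbour splits its current score evenly among its own neighbours.
            flow = sum(scores[n] / len(adjacency[n]) for n in neighbours)
            # Mix incoming flow with the node's own prior (the local constraint).
            updated[node] = alpha * flow + (1 - alpha) * prior[node]
        scores = updated
    return scores


# Toy network: g1 and g2 are genes annotated with the function of interest,
# g3 is an unannotated gene, l1 and l2 are lncRNAs (all names hypothetical).
toy_adjacency = {
    "g1": {"l1"},
    "g2": {"l1"},
    "g3": {"l1", "l2"},
    "l1": {"g1", "g2", "g3"},
    "l2": {"g3"},
}
toy_prior = {"g1": 1.0, "g2": 1.0, "g3": 0.0, "l1": 0.0, "l2": 0.0}
toy_scores = propagate_scores(toy_adjacency, toy_prior)
```

In this toy network, l1 is directly linked to both annotated genes and ends up with a higher score than l2, which is reachable only through g3. This mirrors the intuition that lncRNAs closer to annotated genes are stronger candidates for the same function.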
Figure 2.
Coding–non-coding bi-colored biological network. (A) A maximum connected subnetwork of the mouse bi-colored network. Red nodes represent protein-coding genes and green nodes represent lncRNAs; a blue line represents co-expression between two genes, a light blue line represents both co-expression and protein interaction, and a black line represents protein interaction. (B) The distribution of ‘co-expression’ edges and ‘protein interaction’ edges in the bi-colored network. (C) The degree distribution of the bi-colored network. Here, k is the degree and P(k) denotes the probability that a node has degree k. (D) Superior performance of our bi-colored network.
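The degree distribution P(k) in panel (C) can be made concrete with a small computation. The sketch below assumes the network is given as a list of unique undirected edges without self-loops; the function name `degree_distribution` and the toy edge list are illustrative and do not come from the paper's data.

```python
from collections import Counter


def degree_distribution(edges):
    """Return {k: P(k)}, the fraction of nodes that have degree k.

    edges: iterable of (u, v) pairs, assumed unique and undirected.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Count how many nodes share each degree value.
    nodes_per_degree = Counter(degree.values())
    total_nodes = len(degree)
    return {k: nodes_per_degree[k] / total_nodes for k in sorted(nodes_per_degree)}


# Hypothetical toy edges between genes (g*) and lncRNAs (l*).
toy_edges = [("g1", "l1"), ("g2", "l1"), ("g3", "l1"), ("g3", "l2")]
toy_distribution = degree_distribution(toy_edges)
```

Here three of the five nodes have degree 1, one has degree 2 and one has degree 3, so P(1) = 0.6 and P(2) = P(3) = 0.2.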
Figure 3.
Performance of lnc-GFP. (A) The performance of lnc-GFP in cross-validation tests. (B) The performance of lnc-GFP in noisy bi-colored networks in which a fraction of the edges is randomized. (C) The performance of lnc-GFP in noisy bi-colored networks in which a fraction of the edges is deleted.
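The robustness test in panel (C) relies on removing a fraction of the network's edges. How the authors sampled edges is not stated here, so the following is only a sketch assuming uniform random deletion; `delete_edges`, its `fraction` and `seed` parameters, and the toy edges are hypothetical. Edge randomization as in panel (B) is not shown because the randomization scheme is not described in this text.

```python
import random


def delete_edges(edges, fraction, seed=0):
    """Return a copy of `edges` with a random fraction removed.

    fraction: share of edges to delete, between 0 and 1
    seed: fixed seed so the perturbed network is reproducible
    """
    rng = random.Random(seed)
    edges = list(edges)
    n_remove = int(round(fraction * len(edges)))
    # Choose which edge positions to drop, uniformly at random.
    removed = set(rng.sample(range(len(edges)), n_remove))
    return [edge for i, edge in enumerate(edges) if i not in removed]


# Hypothetical toy edges; deleting half of them leaves two.
noisy_input = [("g1", "l1"), ("g2", "l1"), ("g3", "l1"), ("g3", "l2")]
noisy_edges = delete_edges(noisy_input, fraction=0.5)
```

A predictor would then be re-run on the perturbed edge list and its precision compared with the result on the intact network.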
Figure 4.
LncRNAs involved in diverse GO BPs. Here, the rank denotes the rank threshold. For each rank threshold, the numbers of lncRNAs and GO BPs involved in the predicted ‘lnc2go’ associations are given at the top of the bars.
