Utilizing sequence intrinsic composition to classify protein-coding and long non-coding transcripts
- PMID: 23892401
- PMCID: PMC3783192
- DOI: 10.1093/nar/gkt646
Abstract
Distinguishing protein-coding from non-coding transcripts is challenging, especially for transcripts reconstructed from high-throughput sequencing data of poorly annotated species. This study developed and evaluated a powerful signature tool, the Coding-Non-Coding Index (CNCI), which profiles adjoining nucleotide triplets to distinguish protein-coding from non-coding sequences independently of known annotations. CNCI is effective for classifying incomplete transcripts and sense-antisense pairs. Applied across species, CNCI offered highly accurate classification of transcripts assembled from whole-transcriptome sequencing data, demonstrated gene evolutionary divergence between vertebrates and invertebrates, or between plants, and provided a long non-coding RNA catalog of orangutan. CNCI software is available at http://www.bioinfo.org/software/cnci.
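The abstract describes CNCI only as profiling adjoining nucleotide triplets. As a rough illustration of what such a profile might look like, the sketch below counts pairs of neighbouring, non-overlapping triplets in a single reading frame and converts the counts to frequencies. This is not the CNCI implementation: the function names, the single-frame choice, and the handling of ambiguous bases are all assumptions made for this example.

```python
from collections import Counter


def adjoining_triplet_counts(seq, frame=0):
    """Count pairs of adjoining, non-overlapping triplets in one reading frame.

    Hypothetical helper for illustration only; not part of the CNCI software.
    """
    # Normalise case and treat RNA input (U) as DNA (T).
    seq = seq.upper().replace("U", "T")
    # Split the sequence into complete triplets starting at the given frame.
    triplets = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    pairs = Counter()
    for first, second in zip(triplets, triplets[1:]):
        # Skip pairs containing ambiguous bases such as N (an assumption).
        if set(first + second) <= set("ACGT"):
            pairs[(first, second)] += 1
    return pairs


def adjoining_triplet_frequencies(seq, frame=0):
    """Convert adjoining-triplet pair counts into relative frequencies."""
    counts = adjoining_triplet_counts(seq, frame)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in counts.items()}


if __name__ == "__main__":
    profile = adjoining_triplet_frequencies("ATGGCCATGGCC")
    for pair, freq in sorted(profile.items()):
        print(pair, round(freq, 3))
```

A classifier built on this idea would presumably compare such profiles between known coding and non-coding sequences. The abstract does not specify how CNCI scores or combines them, so that step is left out here.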
Similar articles
- De novo approach to classify protein-coding and noncoding transcripts based on sequence composition. Methods Mol Biol. 2014;1182:203-7. doi: 10.1007/978-1-4939-1062-5_18. PMID: 25055913
- CNIT: a fast and accurate web tool for identifying protein-coding and long non-coding transcripts based on intrinsic sequence composition. Nucleic Acids Res. 2019 Jul 2;47(W1):W516-W522. doi: 10.1093/nar/gkz400. PMID: 31147700 Free PMC article.
- PLEK: a tool for predicting long non-coding RNAs and messenger RNAs based on an improved k-mer scheme. BMC Bioinformatics. 2014 Sep 19;15(1):311. doi: 10.1186/1471-2105-15-311. PMID: 25239089 Free PMC article.
- Non-coding Natural Antisense Transcripts: Analysis and Application. J Biotechnol. 2021 Nov 10;340:75-101. doi: 10.1016/j.jbiotec.2021.08.005. Epub 2021 Aug 8. PMID: 34371054 Review.
- Classification and experimental identification of plant long non-coding RNAs. Genomics. 2019 Sep;111(5):997-1005. doi: 10.1016/j.ygeno.2018.04.014. Epub 2018 Apr 19. PMID: 29679643 Review.
Cited by
- Unraveling the role of long non-coding RNAs in chronic heat stress-induced muscle injury in broilers. J Anim Sci Biotechnol. 2024 Oct 8;15(1):135. doi: 10.1186/s40104-024-01093-6. PMID: 39375773 Free PMC article.
- The lncRNA SNHG26 drives the inflammatory-to-proliferative state transition of keratinocyte progenitor cells during wound healing. Nat Commun. 2024 Oct 5;15(1):8637. doi: 10.1038/s41467-024-52783-8. PMID: 39366968 Free PMC article.
- Integrated Metabolome, Transcriptome and Long Non-Coding RNA Analysis Reveals Potential Molecular Mechanisms of Sweet Cherry Fruit Ripening. Int J Mol Sci. 2024 Sep 12;25(18):9860. doi: 10.3390/ijms25189860. PMID: 39337346 Free PMC article.
- Full-Length Transcriptome Construction and Systematic Characterization of Virulence Factor-Associated Isoforms in Vairimorpha (Nosema) Ceranae. Genes (Basel). 2024 Aug 23;15(9):1111. doi: 10.3390/genes15091111. PMID: 39336702 Free PMC article.
- Whole-transcriptome analyses of ovine lung microvascular endothelial cells infected with bluetongue virus. Vet Res. 2024 Sep 27;55(1):122. doi: 10.1186/s13567-024-01372-0. PMID: 39334220 Free PMC article.