Current methods of gene prediction, their strengths and weaknesses
- PMID: 12364589
- PMCID: PMC140543
- DOI: 10.1093/nar/gkf543
Abstract
While the genomes of many organisms have been sequenced over the last few years, transforming such raw sequence data into knowledge remains a hard task. A great number of prediction programs have been developed that try to address one part of this problem, which consists of locating the genes along a genome. This paper reviews the existing approaches to predicting genes in eukaryotic genomes and underlines their intrinsic advantages and limitations. The main mathematical models and computational algorithms adopted are also briefly described and the resulting software classified according to both the method and the type of evidence used. Finally, the several difficulties and pitfalls encountered by the programs are detailed, showing that improvements are needed and that new directions must be considered.