Using database matches with HMMGene for automated gene detection in Drosophila
- PMID: 10779492
- PMCID: PMC310864
- DOI: 10.1101/gr.10.4.523
Abstract
The application of the gene finder HMMGene to the Adh region of Drosophila melanogaster is described, and the prediction results are analyzed. HMMGene is based on a probabilistic model called a hidden Markov model, and this probabilistic framework facilitates the inclusion of database matches of varying degrees of certainty. Database matches are shown to clearly improve the performance of the gene finder. For instance, on a high-quality test set, the sensitivity for coding exons predicted with both ends correct rises from 62% to 70% when matches to proteins, cDNAs, repeats, and transposons are included. When ESTs are used, however, specificity drops more than sensitivity increases. This is attributed to the high noise level in EST matches; the causes of this noise and possible ways to reduce it are discussed.
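To make the exon-level figures above concrete, the following is a minimal sketch of how sensitivity and specificity for "exons predicted with both ends correct" might be computed. It assumes the convention common in gene-finding evaluation, where specificity is the fraction of predicted exons that are correct. The function name `exon_level_accuracy` and the coordinates are hypothetical illustrations, not data or code from the paper.

```python
def exon_level_accuracy(annotated, predicted):
    """Exon-level sensitivity and specificity under exact matching.

    annotated, predicted: iterables of (start, end) coordinate pairs.
    A predicted exon counts as correct only if BOTH ends match an
    annotated exon exactly.
    Assumed convention (not taken from the paper's text):
      sensitivity = correct exons / annotated exons
      specificity = correct exons / predicted exons
    """
    annotated_set = set(annotated)
    predicted_set = set(predicted)
    exact = annotated_set & predicted_set
    sensitivity = len(exact) / len(annotated_set) if annotated_set else 0.0
    specificity = len(exact) / len(predicted_set) if predicted_set else 0.0
    return sensitivity, specificity


# Hypothetical coordinates, for illustration only.
annotated = [(100, 200), (300, 450), (600, 720), (900, 1000)]
predicted = [(100, 200), (300, 440), (600, 720), (800, 850), (900, 1000)]

sn, sp = exon_level_accuracy(annotated, predicted)
print(f"sensitivity={sn:.2f} specificity={sp:.2f}")
```

In this toy case, three of the four annotated exons are recovered exactly, while two of the five predictions are wrong (one with a shifted end, one spurious). The sketch shows how additional predictions supported by noisy evidence can lower specificity even when sensitivity is unchanged or slightly higher.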
Comment in
- A biologist's view of the Drosophila genome annotation assessment project. Genome Res. 2000 Apr;10(4):391-3. doi: 10.1101/gr.10.4.391. PMID: 10779478. Review.