Using Dirichlet mixture priors to derive hidden Markov models for protein families
- PMID: 7584370
Abstract
A Bayesian method for estimating the amino acid distributions in the states of a hidden Markov model (HMM) for a protein family or the columns of a multiple alignment of that family is introduced. This method uses Dirichlet mixture densities as priors over amino acid distributions. These mixture densities are determined from examination of previously constructed HMMs or multiple alignments. It is shown that this Bayesian method can improve the quality of HMMs produced from small training sets. Specific experiments on the EF-hand motif are reported, for which these priors are shown to produce HMMs with higher likelihood on unseen data, and fewer false positives and false negatives in a database search task.
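The abstract does not spell out the estimator, so the following is only a minimal sketch of how a Dirichlet mixture prior is commonly combined with observed counts: each component is weighted by how well it explains the counts, and the estimate is the weighted average of the per-component posterior means. The function names, the three-letter toy alphabet, and the component parameters are all illustrative assumptions, not values from the paper.

```python
import math

def log_marginal(counts, alpha):
    """Log probability of the counts under one Dirichlet component.

    The multinomial coefficient is omitted because it is identical for
    every component and cancels when the weights are normalized.
    """
    n, a = sum(counts), sum(alpha)
    out = math.lgamma(a) - math.lgamma(n + a)
    for c, al in zip(counts, alpha):
        out += math.lgamma(c + al) - math.lgamma(al)
    return out

def posterior_mean(counts, weights, alphas):
    """Return (component posteriors, estimated letter probabilities)."""
    logs = [math.log(q) + log_marginal(counts, a)
            for q, a in zip(weights, alphas)]
    top = max(logs)                      # subtract max for numerical stability
    unnorm = [math.exp(v - top) for v in logs]
    total = sum(unnorm)
    resp = [u / total for u in unnorm]   # P(component | counts)

    n = sum(counts)
    est = [0.0] * len(counts)
    for r, a in zip(resp, alphas):
        denom = n + sum(a)
        for i, (c, al) in enumerate(zip(counts, a)):
            est[i] += r * (c + al) / denom   # pseudocount-smoothed frequency
    return resp, est

# Hypothetical two-component prior over a toy three-letter alphabet.
weights = [0.5, 0.5]
alphas = [[5.0, 1.0, 1.0], [1.0, 1.0, 5.0]]

# Three observations of the first letter in one alignment column.
resp, est = posterior_mean([3, 0, 0], weights, alphas)
print(resp, est)
```

With only a few observations the prior dominates, which is the regime the abstract highlights (small training sets); as counts grow, the estimate approaches the observed frequencies. A real protein application would use a 20-letter alphabet and components fitted to existing alignments.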