Hum Mutat. 2013 Jan;34(1):57-65.
doi: 10.1002/humu.22225. Epub 2012 Nov 2.

Predicting the functional, molecular, and phenotypic consequences of amino acid substitutions using hidden Markov models

Free PMC article

Hashem A Shihab et al. Hum Mutat. 2013 Jan.

Abstract

The rate at which nonsynonymous single nucleotide polymorphisms (nsSNPs) are being identified in the human genome is increasing dramatically owing to advances in whole-genome/whole-exome sequencing technologies. Automated methods capable of accurately and reliably distinguishing between pathogenic and functionally neutral nsSNPs are therefore assuming ever-increasing importance. Here, we describe the Functional Analysis Through Hidden Markov Models (FATHMM) software and server: a species-independent method with optional species-specific weightings for the prediction of the functional effects of protein missense variants. Using a model weighted for human mutations, we obtained performance accuracies that outperformed traditional prediction methods (i.e., SIFT, PolyPhen, and PANTHER) on two separate benchmarks. Furthermore, in one benchmark, we achieve performance accuracies that outperform current state-of-the-art prediction methods (i.e., SNPs&GO and MutPred). We demonstrate that FATHMM can be efficiently applied to high-throughput/large-scale human and nonhuman genome sequencing projects with the added benefit of phenotypic outcome associations. To illustrate this, we evaluated nsSNPs in wheat (Triticum spp.) to identify some of the important genetic variants responsible for the phenotypic differences introduced by intense selection during domestication. A Web-based implementation of FATHMM, including a high-throughput batch facility and a downloadable standalone package, is available at http://fathmm.biocompute.org.uk.

Figures

Figure 1
The distribution of the predicted magnitude of effect for disease-associated (shaded region) and functionally neutral (unshaded region) AASs in the SwissVar dataset using our unweighted and weighted methods (A and B, respectively). From this, we calculated prediction thresholds at which both specificity and sensitivity were maximized (−3.0 and −1.5, respectively).
Figure 2
Receiver operating characteristic (ROC) curves for the top-ranking computational prediction algorithms evaluated using the SwissVar dataset. Here, we compare our unweighted method against SIFT and PANTHER (A—full curve; B—10% false positive rate) whereas our weighted method is compared to SNPs&GO and MutPred (C—full curve; D—10% false positive rate). Full ROC curves for all computational prediction algorithms evaluated are made available in Supp. Figure S3.
Figure 3
The intersection of disease-associated amino acid substitutions correctly identified by the top-ranking computational prediction algorithms evaluated using the SwissVar dataset. Here, we compare our unweighted method against SIFT and PANTHER (A) whereas our weighted method is compared to SNPs&GO and MutPred (B).
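
The Figure 1 caption describes choosing a prediction threshold from the score distributions of disease-associated and neutral substitutions. The sketch below illustrates one way such a threshold could be selected; it is not the authors' procedure. It assumes that a substitution is called damaging when its score is at or below the threshold, and it interprets "both specificity and sensitivity were maximized" as maximizing the smaller of the two values. The function names and the toy scores are hypothetical and are not taken from the SwissVar dataset.

```python
from typing import List, Tuple


def sensitivity_specificity(
    scores: List[float], labels: List[bool], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) at a given threshold.

    labels[i] is True for a disease-associated substitution.
    Assumption: a score at or below the threshold is called damaging.
    """
    tp = sum(1 for s, y in zip(scores, labels) if y and s <= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y and s > threshold)
    tn = sum(1 for s, y in zip(scores, labels) if not y and s > threshold)
    fp = sum(1 for s, y in zip(scores, labels) if not y and s <= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def balanced_threshold(scores: List[float], labels: List[bool]) -> float:
    """Pick the observed score that maximizes min(sensitivity, specificity).

    Ties are resolved in favor of the lowest (most stringent) threshold.
    """
    best_value = -1.0
    best_threshold = None
    for t in sorted(set(scores)):
        sens, spec = sensitivity_specificity(scores, labels, t)
        value = min(sens, spec)
        if value > best_value:
            best_value, best_threshold = value, t
    return best_threshold


# Toy data, invented for illustration only.
toy_scores = [-5.0, -4.0, -3.5, -1.0, -2.0, 0.5, 1.0, 2.0]
toy_labels = [True, True, True, True, False, False, False, False]

if __name__ == "__main__":
    t = balanced_threshold(toy_scores, toy_labels)
    print(t, sensitivity_specificity(toy_scores, toy_labels, t))
```

Sweeping the threshold across all observed scores, as `balanced_threshold` does, also yields the sensitivity and false positive rate pairs from which a ROC curve like those in Figure 2 is drawn.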
