Abstract
Sequence-based protein function and structure prediction depends crucially on sequence-search sensitivity and accuracy of the resulting sequence alignments. We present an open-source, general-purpose tool that represents both query and database sequences by profile hidden Markov models (HMMs): 'HMM-HMM–based lightning-fast iterative sequence search' (HHblits; http://toolkit.genzentrum.lmu.de/hhblits/). Compared to the sequence-search tool PSI-BLAST, HHblits is faster owing to its discretized-profile prefilter, has 50–100% higher sensitivity and generates more accurate alignments.
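The idea of a discretized-profile prefilter can be illustrated with a toy sketch. The code below is not the HHblits implementation. It assumes a tiny hand-made alignment, three hypothetical prototype columns (hydrophobic-rich, charged-rich, uniform) and Kullback-Leibler divergence as the distance, none of which come from the abstract. It only shows how each profile column might be replaced by the index of its closest prototype, so that a profile becomes a short string of discrete letters that can be compared quickly.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_frequencies(msa, pseudocount=0.1):
    """Return one amino acid frequency vector per alignment column."""
    profile = []
    for j in range(len(msa[0])):
        counts = {a: pseudocount for a in AMINO_ACIDS}
        for seq in msa:
            if seq[j] in counts:  # skip gaps and unknown residues
                counts[seq[j]] += 1.0
        total = sum(counts.values())
        profile.append([counts[a] / total for a in AMINO_ACIDS])
    return profile


def make_prototype(favoured, weight=0.8):
    """Hypothetical prototype column: `weight` is shared by the favoured residues."""
    rest = len(AMINO_ACIDS) - len(favoured)
    return [
        weight / len(favoured) if a in favoured else (1.0 - weight) / rest
        for a in AMINO_ACIDS
    ]


def kl_divergence(p, q):
    """Kullback-Leibler divergence between two strictly positive distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def discretize(profile, prototypes):
    """Map each profile column to the index of its closest prototype column."""
    return [
        min(range(len(prototypes)), key=lambda k: kl_divergence(col, prototypes[k]))
        for col in profile
    ]


# Assumed toy alphabet: 0 = hydrophobic-rich, 1 = charged-rich, 2 = uniform.
PROTOTYPES = [
    make_prototype("AILMFVW"),
    make_prototype("DEKR"),
    [1.0 / len(AMINO_ACIDS)] * len(AMINO_ACIDS),
]

msa = ["LIKE", "LVKD", "ILRE"]  # made-up alignment, one string per sequence
letters = discretize(column_frequencies(msa), PROTOTYPES)
print(letters)
```

In this sketch a four-column profile shrinks to four small integers. A real prefilter would use a much larger, trained alphabet and a fast alignment step on the resulting letter sequences; those details are outside what the abstract states.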
Acknowledgements
We acknowledge financial support by the Deutsche Forschungsgemeinschaft (grant SFB646) and by a Gastprofessur grant from Ludwig-Maximilians Universität Munich financed through the Excellence Initiative of the Bundesministerium für Bildung und Forschung.
Author information
Contributions
M.R. performed research, J.S. initiated and guided research, A.B. generated the profile-column alphabet, A.H. contributed code for fast file access, and M.R. and J.S. wrote the manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–10, Supplementary Tables 1 and 2 (PDF 3828 kb)
Supplementary Data 1
100 random sequences from the nr database used for run time benchmark. (TXT 55 kb)
Supplementary Data 2
List of query-template pairs for alignment benchmark. (TXT 84 kb)
Supplementary Data 3
3D homology model of PIP49/FAM69B. (TXT 161 kb)
Supplementary Data 4
Training and test sets of SCOP domain sequences for sensitivity benchmark. (TXT 174 kb)
Supplementary Data 5
FASTA formatted multiple sequence alignment for human PIP49/FAM69B built by HHblits. (TXT 701 kb)
About this article
Cite this article
Remmert, M., Biegert, A., Hauser, A. et al. HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment. Nat Methods 9, 173–175 (2012). https://doi.org/10.1038/nmeth.1818
This article is cited by
- Distance plus attention for binding affinity prediction. Journal of Cheminformatics (2024)
- Genome sequencing and functional analysis of a multipurpose medicinal herb Tinospora cordifolia (Giloy). Scientific Reports (2024)
- In-silico prediction and validation of Carica papaya protein domains interaction with the Papaya leaf curl virus and associated betasatellite encoded protein. Discover Applied Sciences (2024)
- Review and Comparative Analysis of Methods and Advancements in Predicting Protein Complex Structure. Interdisciplinary Sciences: Computational Life Sciences (2024)
- A comprehensive map of human glucokinase variant activity. Genome Biology (2023)