Gapped BLAST and PSI-BLAST: a new generation of protein database search programs
- PMID: 9254694
- PMCID: PMC146917
- DOI: 10.1093/nar/25.17.3389
Abstract
The BLAST programs are widely used tools for searching protein and DNA databases for sequence similarities. For protein comparisons, a variety of definitional, algorithmic and statistical refinements described here permits the execution time of the BLAST programs to be decreased substantially while enhancing their sensitivity to weak similarities. A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original. In addition, a method is introduced for automatically combining statistically significant alignments produced by BLAST into a position-specific score matrix, and searching the database using this matrix. The resulting Position-Specific Iterated BLAST (PSI-BLAST) program runs at approximately the same speed per iteration as gapped BLAST, but in many cases is much more sensitive to weak but biologically relevant sequence similarities. PSI-BLAST is used to uncover several new and interesting members of the BRCT superfamily.
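The position-specific score matrix idea in the abstract can be illustrated with a minimal sketch. It assumes a toy ungapped alignment, a uniform background distribution, and a single fixed pseudocount; the published program is considerably more elaborate (gapped alignments, statistical significance thresholds, and more careful frequency estimation), and none of the names or values below come from the paper.

```python
import math
from collections import Counter

# Hypothetical toy alignment: ungapped, equal-length rows (not data from the paper).
ALIGNMENT = ["MKVLA", "MKILA", "MRVLG", "MKVLS"]
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"
# Assumption: uniform background frequencies, chosen only to keep the sketch short.
BACKGROUND = {aa: 1.0 / len(ALPHABET) for aa in ALPHABET}


def build_pssm(alignment, pseudocount=1.0):
    """Return one dict per alignment column mapping residue -> log-odds score in bits."""
    n_rows = len(alignment)
    pssm = []
    for column in zip(*alignment):
        counts = Counter(column)
        scores = {}
        for aa in ALPHABET:
            # Observed count plus a pseudocount spread according to the background.
            freq = (counts[aa] + pseudocount * BACKGROUND[aa]) / (n_rows + pseudocount)
            scores[aa] = math.log2(freq / BACKGROUND[aa])
        pssm.append(scores)
    return pssm


def best_ungapped_hit(pssm, sequence):
    """Slide the matrix along a sequence of standard residues.

    Returns (best score, start offset), or (-inf, -1) if the sequence is
    shorter than the matrix.
    """
    width = len(pssm)
    best = (float("-inf"), -1)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(col[aa] for col, aa in zip(pssm, window))
        best = max(best, (score, start))
    return best


pssm = build_pssm(ALIGNMENT)
score, start = best_ungapped_hit(pssm, "GGSMKVLAQQ")
print(f"best window starts at {start} with score {score:.2f} bits")
```

In an iterated search, sequences that score significantly against the matrix would be added to the alignment and the matrix rebuilt before the next pass; the sketch covers only a single matrix construction and scan.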
Similar articles
- Large-scale comparison of protein sequence alignment algorithms with structure alignments. Proteins. 2000 Jul 1;40(1):6-22. doi: 10.1002/(sici)1097-0134(20000701)40:1<6::aid-prot30>3.0.co;2-7. PMID: 10813826
- Iterative sequence/secondary structure search for protein homologs: comparison with amino acid sequence alignments and application to fold recognition in genome databases. Bioinformatics. 2000 Nov;16(11):988-1002. doi: 10.1093/bioinformatics/16.11.988. PMID: 11159310
- Using CLUSTAL for multiple sequence alignments. Methods Enzymol. 1996;266:383-402. doi: 10.1016/s0076-6879(96)66024-8. PMID: 8743695
- Getting the most from PSI-BLAST. Trends Biochem Sci. 2002 Mar;27(3):161-4. doi: 10.1016/s0968-0004(01)02039-4. PMID: 11893514 Review.
- Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. Nucleic Acids Res. 2001 Jul 15;29(14):2994-3005. doi: 10.1093/nar/29.14.2994. PMID: 11452024 Free PMC article. Review.
Cited by
- Introduction to Integrated Proteogenomic Pipeline for Dealing with Pathogenic Missense SNPs. Methods Mol Biol. 2025;2859:93-107. doi: 10.1007/978-1-0716-4152-1_6. PMID: 39436598
- Multiobjective heuristic algorithm for de novo protein design in a quantified continuous sequence space. Comput Struct Biotechnol J. 2021 Apr 25;19:2575-2587. doi: 10.1016/j.csbj.2021.04.046. eCollection 2021. PMID: 34025944 Free PMC article.
- BLAST: a more efficient report with usability improvements. Nucleic Acids Res. 2013 Jul;41(Web Server issue):W29-33. doi: 10.1093/nar/gkt282. Epub 2013 Apr 22. PMID: 23609542 Free PMC article.
- Validating a Coarse-Grained Potential Energy Function through Protein Loop Modelling. PLoS One. 2013 Jun 18;8(6):e65770. doi: 10.1371/journal.pone.0065770. Print 2013. PMID: 23824634 Free PMC article.
- RiceXPro version 3.0: expanding the informatics resource for rice transcriptome. Nucleic Acids Res. 2013 Jan;41(Database issue):D1206-13. doi: 10.1093/nar/gks1125. Epub 2012 Nov 23. PMID: 23180765 Free PMC article.