Database of homology-derived protein structures and the structural meaning of sequence alignment
- PMID: 2017436
- DOI: 10.1002/prot.340090107
Abstract
The database of known protein three-dimensional structures can be significantly increased by the use of sequence homology, based on the following observations. (1) The database of known sequences, currently at more than 12,000 proteins, is two orders of magnitude larger than the database of known structures. (2) The currently most powerful method of predicting protein structures is model building by homology. (3) Structural homology can be inferred from the level of sequence similarity. (4) The threshold of sequence similarity sufficient for structural homology depends strongly on the length of the alignment. Here, we first quantify the relation between sequence similarity, structure similarity, and alignment length by an exhaustive survey of alignments between proteins of known structure and report a homology threshold curve as a function of alignment length. We then produce a database of homology-derived secondary structure of proteins (HSSP) by aligning to each protein of known structure all sequences deemed homologous on the basis of the threshold curve. For each known protein structure, the derived database contains the aligned sequences, secondary structure, sequence variability, and sequence profile. Tertiary structures of the aligned sequences are implied, but not modeled explicitly. The database effectively increases the number of known protein structures by a factor of five to more than 1800. The results may be useful in assessing the structural significance of matches in sequence database searches, in deriving preferences and patterns for structure prediction, in elucidating the structural role of conserved residues, and in modeling three-dimensional detail by homology.
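Observation (4) and the threshold curve can be made concrete with a small sketch. The abstract does not state the curve's functional form; the constants below (a power law between alignment lengths 10 and 80, and a plateau of 24.8% identity beyond that) are the form commonly quoted for the HSSP curve and should be treated as assumptions here, as should the function names, which are hypothetical.

```python
from typing import Optional

# ASSUMPTION: these constants are the commonly quoted form of the HSSP
# threshold curve. They are not given in the abstract above.
MIN_LENGTH = 10          # below this, no structural inference is attempted
PLATEAU_LENGTH = 80      # beyond this, the threshold is treated as constant
PLATEAU_IDENTITY = 24.8  # percent sequence identity on the plateau


def identity_threshold(alignment_length: int) -> Optional[float]:
    """Return the percent-identity threshold for a given alignment length.

    Returns None for alignments too short to support any inference.
    """
    if alignment_length < MIN_LENGTH:
        return None
    if alignment_length > PLATEAU_LENGTH:
        return PLATEAU_IDENTITY
    # Short alignments need much higher identity than long ones.
    return 290.15 * alignment_length ** -0.562


def implies_structural_homology(alignment_length: int,
                                percent_identity: float) -> bool:
    """True if the alignment lies on or above the assumed threshold curve."""
    threshold = identity_threshold(alignment_length)
    return threshold is not None and percent_identity >= threshold


if __name__ == "__main__":
    for length, identity in [(20, 30.0), (100, 30.0)]:
        verdict = implies_structural_homology(length, identity)
        print(f"length={length} identity={identity}% -> homologous: {verdict}")
```

Under these assumed constants, 30% identity over 100 aligned residues passes the threshold while 30% identity over 20 residues does not, which illustrates why the threshold has to depend on alignment length.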
Similar articles
- The HSSP database of protein structure-sequence alignments. Nucleic Acids Res. 1994 Sep;22(17):3597-9. PMID: 7937066. Free PMC article.
- Prediction of protein structure by evaluation of sequence-structure fitness. Aligning sequences to contact profiles derived from three-dimensional structures. J Mol Biol. 1993 Aug 5;232(3):805-25. doi: 10.1006/jmbi.1993.1433. PMID: 8355272.
- An integrated approach to the analysis and modeling of protein sequences and structures. III. A comparative study of sequence conservation in protein structural families using multiple structural alignments. J Mol Biol. 2000 Aug 18;301(3):691-711. doi: 10.1006/jmbi.2000.3975. PMID: 10966778.
- Secondary structure prediction and protein design. Biochem Soc Symp. 1990;57:11-24. PMID: 2099736. Review.
- Searching protein structure databases has come of age. Proteins. 1994 Jul;19(3):165-73. doi: 10.1002/prot.340190302. PMID: 7937731. Review.
Cited by
- Assessing the accuracy of template-based structure prediction metaservers by comparison with structural genomics structures. J Struct Funct Genomics. 2012 Dec;13(4):213-25. doi: 10.1007/s10969-012-9146-2. Epub 2012 Oct 20. PMID: 23086054. Free PMC article.
- AlphaFill: enriching AlphaFold models with ligands and cofactors. Nat Methods. 2023 Feb;20(2):205-213. doi: 10.1038/s41592-022-01685-y. Epub 2022 Nov 24. PMID: 36424442. Free PMC article.
- Algorithmic approaches to protein-protein interaction site prediction. Algorithms Mol Biol. 2015 Feb 15;10:7. doi: 10.1186/s13015-015-0033-9. eCollection 2015. PMID: 25713596. Free PMC article.
- T-RMSD: a web server for automated fine-grained protein structural classification. Nucleic Acids Res. 2013 Jul;41(Web Server issue):W358-62. doi: 10.1093/nar/gkt383. Epub 2013 May 28. PMID: 23716642. Free PMC article.
- Structural Exploration and Conformational Transitions in MDM2 upon DHFR Interaction from Homo sapiens: A Computational Outlook for Malignancy via Epigenetic Disruption. Scientifica (Cairo). 2016;2016:9420692. doi: 10.1155/2016/9420692. Epub 2016 Apr 17. PMID: 27213086. Free PMC article.