LPFC: an Internet library of protein family core structures
- PMID: 9007997
- PMCID: PMC2143520
- DOI: 10.1002/pro.5560060127
Abstract
As the number of protein molecules with known, high-resolution structures increases, it becomes necessary to organize these structures for rapid retrieval, comparison, and analysis. The Protein Data Bank (PDB) currently contains nearly 5,000 entries and is growing exponentially. Most new structures are similar structurally to ones reported previously and can be grouped into families. As the number of members in each family increases, it becomes possible to summarize, statistically, the commonalities and differences within each family. We reported previously a method for finding the atoms in a family alignment that have low spatial variance and those that have higher spatial variance (i.e., the "core" atoms that have the same relative position in all family members and the "non-core" atoms that do not). The core structures we compute have biological significance and provide an excellent quantitative and visual summary of a multiple structural alignment. In order to extend their utility, we have constructed a library of protein family cores, accessible over the World Wide Web at http://www-smi.stanford.edu/projects/helix/LPFC/. This library is generated automatically with publicly available computer programs requiring only a set of multiple alignments as input. It contains quantitative analysis of the spatial variation of atoms within each protein family, the coordinates of the average core structures derived from the families, and display files (in bitmap and VRML formats). Here, we describe the resource and illustrate its applicability by comparing three multiple alignments of the globin family. These three alignments are found to be similar, but with some significant differences related to the diversity of family members and the specific method used for alignment.
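The core versus non-core distinction described in the abstract can be illustrated with a minimal sketch. It is not the authors' published procedure: it assumes the family members are already superposed in a common frame with one atom per aligned position, it measures spatial variance as the mean squared distance of each atom from its family average, and it uses a fixed cutoff. The function names, the toy coordinates, and the 1.0 Å² threshold are all hypothetical.

```python
from statistics import fmean


def positional_variance(structures):
    """Return the average position and spatial variance of each aligned atom.

    `structures` is a list of family members; each member is a list of
    (x, y, z) tuples, one per aligned position. The members are assumed to
    be superposed in a common coordinate frame already (an assumption of
    this sketch, not a step it performs).
    """
    n_positions = len(structures[0])
    if any(len(s) != n_positions for s in structures):
        raise ValueError("all structures must cover the same aligned positions")

    averages, variances = [], []
    for i in range(n_positions):
        points = [s[i] for s in structures]
        # Average coordinate of this aligned atom across the family.
        mean = tuple(fmean(p[k] for p in points) for k in range(3))
        # Mean squared distance from the average position.
        var = fmean(sum((p[k] - mean[k]) ** 2 for k in range(3)) for p in points)
        averages.append(mean)
        variances.append(var)
    return averages, variances


def split_core(variances, threshold):
    """Split aligned positions into low-variance (core) and high-variance sets."""
    core = [i for i, v in enumerate(variances) if v <= threshold]
    non_core = [i for i, v in enumerate(variances) if v > threshold]
    return core, non_core


# Toy family of three members with three aligned atoms each (made-up data).
family = [
    [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)],
    [(0.1, 0.0, 0.0), (3.8, 0.1, 0.0), (7.6, 3.0, 0.0)],
    [(-0.1, 0.0, 0.0), (3.8, -0.1, 0.0), (7.6, -3.0, 0.0)],
]

averages, variances = positional_variance(family)
core, non_core = split_core(variances, threshold=1.0)  # hypothetical cutoff in Å^2
print("core positions:", core)
print("non-core positions:", non_core)
```

In this toy family the first two atoms barely move between members and fall into the core, while the third atom is displaced by 3 Å in two members and is classified as non-core. The averaged coordinates of the core positions play the role of the "average core structure" that the library stores for each family.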
Similar articles
- The FSSP database of structurally aligned protein fold families. Nucleic Acids Res. 1994 Sep;22(17):3600-9. PMID: 7937067. Free PMC article.
- OLDERADO: on-line database of ensemble representatives and domains. On Line Database of Ensemble Representatives And DOmains. Protein Sci. 1997 Dec;6(12):2628-30. doi: 10.1002/pro.5560061215. PMID: 9416612. Free PMC article.
- DMAPS: a database of multiple alignments for protein structures. Nucleic Acids Res. 2006 Jan 1;34(Database issue):D273-6. doi: 10.1093/nar/gkj018. PMID: 16381863. Free PMC article.
- The HSSP database of protein structure-sequence alignments and family profiles. Nucleic Acids Res. 1998 Jan 1;26(1):313-5. doi: 10.1093/nar/26.1.313. PMID: 9399862. Free PMC article.
- Average core structures and variability measures for protein families: application to the immunoglobulins. J Mol Biol. 1995 Aug 4;251(1):161-75. doi: 10.1006/jmbi.1995.0423. PMID: 7643385.
Cited by
- FoldMiner: structural motif discovery using an improved superposition algorithm. Protein Sci. 2004 Jan;13(1):278-94. doi: 10.1110/ps.03239404. PMID: 14691242. Free PMC article.
- CORA--topological fingerprints for protein structural families. Protein Sci. 1999 Apr;8(4):699-715. doi: 10.1110/ps.8.4.699. PMID: 10211816. Free PMC article.
- A database of macromolecular motions. Nucleic Acids Res. 1998 Sep 15;26(18):4280-90. doi: 10.1093/nar/26.18.4280. PMID: 9722650. Free PMC article.
- Overcoming sequence misalignments with weighted structural superposition. Proteins. 2012 Nov;80(11):2523-35. doi: 10.1002/prot.24134. Epub 2012 Jul 28. PMID: 22733542. Free PMC article.
- A structural census of the current population of protein sequences. Proc Natl Acad Sci U S A. 1997 Oct 28;94(22):11911-6. doi: 10.1073/pnas.94.22.11911. PMID: 9342336. Free PMC article.