BMC Bioinformatics. 2007 May 22;8:167.
doi: 10.1186/1471-2105-8-167.

Hinge Atlas: relating protein sequence to sites of structural flexibility

Samuel C Flores et al. BMC Bioinformatics. 2007.

Abstract

Background: Relating features of protein sequences to structural hinges is important for identifying domain boundaries, understanding structure-function relationships, and designing flexibility into proteins. Efforts in this field have been hampered by the lack of a proper dataset for studying characteristics of hinges.

Results: Using the Molecular Motions Database we have created a Hinge Atlas of manually annotated hinges and a statistical formalism for calculating the enrichment of various types of residues in these hinges.

Conclusion: We found various correlations between hinges and sequence features. Some of these are expected; for instance, we found that hinges tend to occur on the surface and in coils and turns, and to be enriched with small and hydrophilic residues. Others are less obvious. In particular, we found that hinges tend to coincide with active sites but, unlike active sites, are not at all conserved in evolution. We evaluate the potential for hinge prediction based on sequence.

Motions play an important role in catalysis and protein-ligand interactions, and hinge bending motions comprise the largest class of known motions. It is therefore important to relate the hinge location to sequence features such as residue type, physicochemical class, secondary structure, solvent exposure, evolutionary conservation, and proximity to active sites. To do this, we first generated the Hinge Atlas, a set of protein motions with the hinge locations manually annotated, and then studied the coincidence of these features with the hinge location. We found that all of the features have bearing on the hinge location. Most interestingly, we found that hinges tend to occur at or near active sites and yet, unlike active sites, are not conserved. Less surprisingly, we found that hinge residues tend to be small, not hydrophobic or aliphatic, and occur in turns and random coils on the surface. A functional sequence-based hinge predictor was built using some of the data generated in this study. The Hinge Atlas is made available to the community for further flexibility studies.
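To make the idea of residue enrichment in hinges concrete, the following is a minimal sketch, not the formalism used in the paper. It assumes enrichment is scored as the natural log of the ratio between a residue type's frequency among hinge residues and its frequency among all residues, and that significance is a one-sided hypergeometric tail probability. The function name, the scoring choice, and the counts in the example are all hypothetical.

```python
from math import comb, log


def hinge_enrichment(n_type_in_hinge, n_hinge, n_type_total, n_total):
    """Return (log-ratio score, one-sided p-value) for one residue type.

    Assumed scoring (not taken from the paper): log of the frequency of
    the residue type among hinge residues divided by its frequency among
    all residues. A score above 0 means over-representation in hinges.
    """
    freq_hinge = n_type_in_hinge / n_hinge
    freq_all = n_type_total / n_total
    score = log(freq_hinge / freq_all) if n_type_in_hinge > 0 else float("-inf")

    def pmf(k):
        # Hypergeometric probability of seeing exactly k residues of this
        # type when n_hinge residues are drawn without replacement.
        return (comb(n_type_total, k)
                * comb(n_total - n_type_total, n_hinge - k)
                / comb(n_total, n_hinge))

    # Probability of observing at least this many by chance alone.
    upper = min(n_type_total, n_hinge)
    p_over = sum(pmf(k) for k in range(n_type_in_hinge, upper + 1))
    return score, p_over


# Hypothetical counts: 80 of 1000 residues are of this type overall,
# and 8 of the 50 annotated hinge residues are of this type.
score, p_value = hinge_enrichment(8, 50, 80, 1000)
print(f"score={score:.3f}, p={p_value:.4f}")
```

A residue type that is twice as frequent in hinges as in the full set scores log(2), about 0.693, under this assumed definition; the p-value indicates how often such an excess would arise if hinge residues were drawn at random.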

Figures

Figure 1
Amino acids arranged in ascending order of Hinge Index (HI) (orange line). Low p-values (vertical bars) indicate high statistical significance. Legend information applies to similar graphs in this work.
Figure 2
Residues within four amino acid positions of the active site are significantly more likely to be in hinges.
Figure 3
Residues in alpha helices were less likely to occur in hinges, with very high significance. Turn and coil residues, on the other hand, were more likely to be in hinges, also with high statistical significance.
Figure 4
Size, aliphaticity, and hydrophobicity appear to account for much of the segregation of residues along physicochemical lines. In particular, the individually underrepresented residues (Gly, Ser, Ala) are classified as "tiny." Other underrepresented residue types (Leu, Val) are aliphatic, while still others (Phe, and again Val) are hydrophobic.
Figure 5
The least conserved 20% of residues are significantly more likely to appear in hinges.
Figure 6
Since active site residues are enriched in hinges, we performed a separate conservation check on hinge residues in the 94 Hinge Atlas proteins with CSA annotation. We found that even in this set, the least conserved 1/5th of amino acids in each protein tended to contain significantly more hinge residues. The fourth bin was sparse in hinge residues, but at a p-value of 0.043, the significance of this was marginal.
Figure 7
Hinge residues tend to be on the surface, since steric clashes would often prevent them from being in the core. We computed the solvent accessible surface area for the backbone atoms of all residues in the Hinge Atlas and binned the residues by this quantity. Bin #1 contains the 20% of all residues with the largest solvent accessible surface area, and bin #5 contains the 20% of residues with the smallest solvent accessible surface area. The first two bins (together representing the 40% of residues with highest surface area) are enriched with hinges in a highly significant manner. Conversely, the last two bins (lowest 40% ASA) are significantly low in hinges.
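The binning described in this caption can be sketched as follows. This is an illustrative sketch only: the function name and the toy data are hypothetical, and the paper's own procedure (including how it tests significance per bin) is not reproduced here. Residues are sorted by descending solvent accessible surface area, split into five equal-sized bins, and the fraction of hinge residues is reported per bin.

```python
def hinge_fraction_by_bin(values, is_hinge, n_bins=5):
    """Return the fraction of hinge residues in each equal-sized bin.

    values   -- one number per residue (e.g. backbone surface area)
    is_hinge -- one boolean per residue, True for annotated hinges
    Bin 1 (index 0) holds the residues with the largest values.
    """
    # Residue indices ordered from largest to smallest value.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    n = len(order)
    fractions = []
    for b in range(n_bins):
        # Integer arithmetic keeps bin sizes within one residue of each other.
        members = order[b * n // n_bins:(b + 1) * n // n_bins]
        hinge_count = sum(1 for i in members if is_hinge[i])
        fractions.append(hinge_count / len(members) if members else 0.0)
    return fractions


# Toy data (hypothetical): ten residues, three of them hinges,
# all among the most exposed.
area = [100, 90, 80, 70, 60, 50, 40, 30, 20, 10]
hinge = [True, True, False, True, False, False, False, False, False, False]
print(hinge_fraction_by_bin(area, hinge))  # [1.0, 0.5, 0.0, 0.0, 0.0]
```

Comparing each bin's hinge fraction with the overall hinge fraction shows which bins are enriched or depleted; a significance test would still be needed to decide whether a difference is meaningful.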
Figure 8
The thick red trace represents HingeSeq performance against the Hinge Atlas annotation in the test set of 53 proteins. The diagonal black line represents the performance of a completely random predictor, with area under the curve of 0.5. HingeSeq is seen to have substantial predictive power, since it encloses significantly greater area.
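The area under a ROC curve has a simple interpretation that can be computed directly: it equals the probability that a randomly chosen positive (here, a hinge residue) receives a higher score than a randomly chosen negative, with ties counted as one half. The sketch below uses that pairwise formulation. It is a generic illustration with hypothetical scores, not the evaluation code used for HingeSeq.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores -- predictor output per residue (higher = more hinge-like)
    labels -- truthy for annotated hinge residues, falsy otherwise
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one hinge and one non-hinge residue")

    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # hinge ranked above non-hinge
            elif p == q:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical scores for four residues, two of which are hinges.
print(roc_auc([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0]))  # 0.75
```

A predictor that assigns the same score to every residue yields 0.5 under this formulation, matching the diagonal line of a random predictor; a perfect ranking yields 1.0.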
Figure 9
To check for possible database bias, we computed the amino acid composition of MolMovDB and found that it follows that of the PDB, from which it is largely compiled.
Figure 10
Although the Hinge Atlas and the computer-annotated set have a significant overlap, they are statistically different sets. Importantly, the hinge residues within these sets are different from each other, despite sharing 106 residues.
Figure 11
Histogram of $\tilde{f}_j^{*}(a_{\mathrm{GLY}})$ (sample frequency of glycine among NON-hinge residues, blue trace) and $f_j^{*}(a_{\mathrm{GLY}})$ (sample frequency of glycine among hinge residues, dashed red trace). The sample frequency of glycine residues among NON-hinge residues in bins containing 1/8th of all Hinge Atlas proteins was found to average 0.078. The sample frequency of glycine among hinge residues in bins containing 1/8th of all Hinge Atlas proteins was found to average 0.124. The standard deviation was considerably larger for the hinge set, since this is a small subset of the Hinge Atlas.
