Proc Natl Acad Sci U S A. 2011 Jul 26;108(30):12301-6.
doi: 10.1073/pnas.1102727108. Epub 2011 Jul 7.

Maps of protein structure space reveal a fundamental relationship between protein structure and function

Margarita Osadchy et al. Proc Natl Acad Sci U S A. 2011.

Abstract

To study the protein structure-function relationship, we propose a method to efficiently create three-dimensional maps of structure space using a very large dataset of more than 30,000 Structural Classification of Proteins (SCOP) domains. In our maps, each domain is represented by a point, and the distance between any two points approximates the structural distance between their corresponding domains. We use these maps to study the spatial distributions of properties of proteins, and in particular those of local vicinities in structure space, such as structural density and functional diversity. These maps provide a unique broad view of protein space and thus reveal previously undescribed fundamental properties thereof. At the same time, the maps are consistent with previous knowledge (e.g., domains cluster by their SCOP class) and organize previous observations concerning specific protein folds into a unified, coherent representation. To investigate the structure-function relationship, we measure the functional diversity (using the Gene Ontology controlled vocabulary) in local structural vicinities. Our most striking finding is that functional diversity varies considerably across structure space: the space has a highly diverse region, and diversity abates when moving away from it. Interestingly, the domains in this region are mostly alpha/beta structures, which are known to be the most ancient proteins. We believe that our unique perspective of structure space will open previously undescribed ways of studying proteins, their evolution, and the relationship between their structure and function.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Maps of protein structure space. Each point represents a SCOP domain, and the distance between any two points approximates the structural distance between their corresponding domains. B–D show the map of the SCOP classes: as expected, the points are clustered. F–H show the structural density map, where the color of each point indicates the number of domains that lie in its vicinity of fixed distance (denoted Vfd). We see that the highest density is within the regions of the all-alpha domains, followed by a region in the alpha/beta domain and in the all-beta domain. Fig. S2 shows a similar density map when considering sequence nonredundant samples of the protein world.
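The structural density described in this caption (the number of domains within a fixed distance of each point) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `density_fixed_distance`, the toy coordinates, the radius, and the choice to exclude a point from its own vicinity are all assumptions made for illustration.

```python
import math


def density_fixed_distance(points, radius):
    """Count, for each point, the other points lying within `radius` of it.

    Assumption: a point is not counted as part of its own vicinity.
    """
    counts = []
    for i, p in enumerate(points):
        n = 0
        for j, q in enumerate(points):
            if i != j and math.dist(p, q) <= radius:
                n += 1
        counts.append(n)
    return counts


# Hypothetical 3D map: three tightly packed points and one isolated point.
toy_map = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0), (5.0, 5.0, 5.0)]
densities = density_fixed_distance(toy_map, radius=0.5)
print(densities)  # [2, 2, 2, 0]
```

The quadratic loop is adequate for a toy example; for a map with tens of thousands of points, a spatial index would likely be preferable.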
Fig. 2.
Structural density and functional diversity by SCOP class. We calculate the separate histograms of structural density (A) and functional diversity (B) of each of the SCOP classes and stack them one on top of the other. We see that the densest regions are populated by all-alpha domains, and the most functionally diverse regions by the alpha/beta domains. See Table S2 (listing the exact proportions of each of the SCOP classes, among the top 10%/20% most dense/functionally diverse domains) and Fig. S12 for supporting evidence.
Fig. 3.
Functional-diversity map of protein structure space. The color of a point indicates the degree of functional diversity, measured by the number of distinct GO-MF terms annotating the domains in its vicinity. Here, we use the Vsamp definition for a vicinity of a protein: a sample of fixed size from all domains that fall within a fixed distance from it. A–D show the functional diversity for the true data; E–H show the functional diversity of a random world, in which the proteins have the same structures, yet their functions are assigned at random. We see that when using the true functional annotations, there is a core of high functional diversity, and that functional diversity drops toward the periphery. In contrast, when the functions are assigned at random, there is no such core, and functional diversity is uniformly high. The figures in SI Appendix and Table S1 show that the results are qualitatively similar when using alternative datasets, scoring functions, and the more uniform (coarser) annotation graph GO slim.
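The diversity measure and the random-world control described in this caption can be sketched in a few lines of Python. This is a hedged illustration only, not the authors' code: the function names, the toy coordinates and term labels (`T1` to `T4`), the exclusion of a domain from its own vicinity, and the handling of vicinities smaller than the sample size (all members are kept) are assumptions.

```python
import math
import random


def vicinity_sample(points, i, radius, size, rng):
    """Return up to `size` indices sampled from the points within `radius` of point i."""
    near = [j for j, q in enumerate(points)
            if j != i and math.dist(points[i], q) <= radius]
    if len(near) <= size:
        return near  # assumption: keep all members when the vicinity is small
    return rng.sample(near, size)


def functional_diversity(points, terms, radius, size, seed=0):
    """Count the distinct terms annotating each point's sampled vicinity."""
    rng = random.Random(seed)
    scores = []
    for i in range(len(points)):
        distinct = set()
        for j in vicinity_sample(points, i, radius, size, rng):
            distinct.update(terms[j])
        scores.append(len(distinct))
    return scores


def shuffled_annotations(terms, seed=0):
    """Random-world control: same structures, annotation sets reassigned at random."""
    rng = random.Random(seed)
    permuted = list(terms)
    rng.shuffle(permuted)
    return permuted


# Hypothetical data: one cluster sharing a single term, one cluster with varied terms.
toy_map = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0),
           (3.0, 3.0, 3.0), (3.1, 3.0, 3.0), (3.0, 3.1, 3.0)]
toy_terms = [{"T1"}, {"T1"}, {"T1"}, {"T2"}, {"T3"}, {"T4"}]

true_scores = functional_diversity(toy_map, toy_terms, radius=0.5, size=2)
random_terms = shuffled_annotations(toy_terms, seed=1)
random_scores = functional_diversity(toy_map, random_terms, radius=0.5, size=2)
print(true_scores)  # [1, 1, 1, 2, 2, 2]
print(random_scores)
```

Sampling a fixed number of neighbours keeps the diversity score from simply tracking how crowded a region is, which is presumably why the caption distinguishes this vicinity from the fixed-distance one.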
Fig. 4.
SCOP folds that lie in the functionally diverse core. We highlight the location in structure space of specific SCOP folds and show histograms of the diversity of the domains of these folds; for comparison, A shows the full dataset (a copy of Fig. 3 A and B) outlined in black. B and C show two SCOP folds that are known to be functionally diverse, the TIM barrel fold (c.1) and the adenine nucleotide alpha hydrolase-like fold (c.26). Indeed, the domains of these two folds are located in the highly diverse core of structure space. There are, however, many other domains in the core. D–F show three more examples of SCOP folds that lie in the highly diverse core: phosphorylase/hydrolase-like (c.56), alpha/beta-Hydrolases (c.69), and protein kinase-like (PK-like, d.144), respectively. Table S3 lists the mean and average functional diversity scores for several SCOP folds that lie in the core.
