PLoS One. 2012;7(2):e31437.
doi: 10.1371/journal.pone.0031437. Epub 2012 Feb 8.

Composite structural motifs of binding sites for delineating biological functions of proteins


Akira R Kinjo et al. PLoS One. 2012.

Abstract

Most biological processes are described as a series of interactions between proteins and other molecules, and interactions are in turn described in terms of atomic structures. To annotate protein functions as sets of interaction states at atomic resolution, and thereby to better understand the relation between protein interactions and biological functions, we conducted exhaustive all-against-all atomic structure comparisons of all known binding sites for ligands including small molecules, proteins and nucleic acids, and identified recurring elementary motifs. By integrating the elementary motifs associated with each subunit, we defined composite motifs that represent context-dependent combinations of elementary motifs. It is demonstrated that function similarity can be better inferred from composite motif similarity compared to the similarity of protein sequences or of individual binding sites. By integrating the composite motifs associated with each protein function, we define meta-composite motifs each of which is regarded as a time-independent diagrammatic representation of a biological process. It is shown that meta-composite motifs provide richer annotations of biological processes than sequence clusters. The present results serve as a basis for bridging atomic structures to higher-order biological phenomena by classification and integration of binding site structures.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Examples of elementary and composite motifs.
A: Concrete examples of elementary motifs (corresponding to B). Several binding sites belonging to each elementary motif are superimposed. The binding site atoms that constitute the elementary motif are shown in ball-and-stick representation with CPK coloring and ligands are shown in green wireframes (non-polymers) or tubes (proteins). These binding sites include subunits shown in C. Non-polymer ligands are phenylalanine and its analogs (N1), FAD (N2), and polyamines (N3). B: In this example, the combinations of 3 non-polymer binding elementary motifs (cyan triangles labeled N1, N2 and N3) and 3 protein binding elementary motifs (orange rectangles labeled P1, P2 and P3) found in various protein subunits (black dots) define 3 distinct composite motifs (hexagons in magenta labeled C1, C2, and C3). Examples of each elementary motif are shown in molecular figures (A) right above the triangles or rectangles, and those of each composite motif are shown in molecular figures (C) right below the hexagons. Direct correspondence between elementary and composite motifs is indicated by thick edges in pale magenta. C: Concrete examples of composite motifs (corresponding to B). These 3 composite motifs share the same elementary motif for FAD binding (labeled N2 in B). Subunits (colored pink) containing the composite motifs (C1, C2, C3) are shown with elementary motifs in ball-and-stick representations (protein binding sites in orange, non-polymer binding sites in cyan) and with ligands in green (spacefill for non-polymers, cartoon for proteins). From left to right: L-amino acid oxidase (LAAO) from Calloselasma rhodostoma in homo-dimeric form (PDB ID: 1F8S, chain A); human lysine-specific histone demethylase 1 (KDM1) (PDB ID: 2IW5, chain A); polyamine oxidase (PAO) from Zea mays in putative homo-dimeric form (PDB ID: 3KU9, chain A, pdbx_struct_assembly.id 3). The protein figures were created using jV. The network diagrams (also in Figs. 5 and 6) were created using Cytoscape.
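The grouping described for panel B can be sketched in a few lines of Python. This is only an illustration, assuming that subunits carrying exactly the same set of elementary motifs fall into one composite motif; the criterion actually used in the paper is not given here and may be more permissive. The function name, subunit names and motif assignments are hypothetical, chosen to echo the N1 to N3 and P1 to P3 labels of the figure.

```python
from collections import defaultdict


def group_by_motif_set(subunit_motifs):
    """Group subunits that carry the same set of elementary motifs.

    Assumption: an identical motif set means the same composite motif.
    """
    groups = defaultdict(list)
    for subunit, motifs in subunit_motifs.items():
        groups[frozenset(motifs)].append(subunit)
    return dict(groups)


# Hypothetical assignments, not taken from the paper's data.
subunit_motifs = {
    "laao_A": {"N1", "N2", "P1"},
    "laao_B": {"N1", "N2", "P1"},
    "kdm1_A": {"N2", "P2"},
    "pao_A": {"N2", "N3", "P3"},
}

composite_motifs = group_by_motif_set(subunit_motifs)

# Elementary motifs common to every composite motif (the FAD site, N2, here).
shared = frozenset.intersection(*composite_motifs)
```

With these toy inputs the four subunits collapse into three groups, and the only elementary motif common to all three is N2, mirroring the shared FAD binding motif in panel C.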
Figure 2. Characterization of composite motifs.
A: Histogram of the number of elementary motifs comprising composite motifs. B: Histograms of the average and minimum sequence identities (%) between pairs of subunits within each composite motif. C: Composite motif similarity as a function of minimum sequence identity between pairs of composite motifs. Sequence identity between two composite motifs is defined as the sequence identity between two protein sequences, one from each motif.
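One reading of the minimum sequence identity in panel C is the smallest identity over all cross pairs of sequences, one drawn from each composite motif. The sketch below follows that reading. The helper `percent_identity` is a hypothetical stand-in that assumes the two sequences are already aligned and of equal length; a real analysis would compute identity from a proper alignment.

```python
from itertools import product


def percent_identity(seq_a, seq_b):
    """Percent of identical positions in two pre-aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def min_identity_between(motif_a, motif_b, identity=percent_identity):
    """Smallest identity over all pairs with one sequence from each motif."""
    return min(identity(s, t) for s, t in product(motif_a, motif_b))


# Toy sequences, purely illustrative.
motif_a = ["ACDE", "ACDF"]
motif_b = ["ACGF"]
lowest = min_identity_between(motif_a, motif_b)
```

Here the two cross pairs share 2 of 4 and 3 of 4 positions, so `lowest` is 50.0.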
Figure 3. Correspondence between composite motifs and protein functions.
A: Average UniProt function similarity as a function of similarity between subunits based on composite motifs, individual binding sites or sequence identity. Data points with an insufficient number of samples were discarded (see Materials and Methods). Error bars indicate the standard deviation of the average function similarity based on 10 bootstrap samplings. B: Same as A, except that only the UniProt functions of the Biological process category were used. C: Composite motifs with more than one elementary motif (n > 1) are compared with those with at least one elementary motif (n > 0); the latter are the same as in A. D: Same as C, except that only the UniProt functions of the Biological process category were used.
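The error bars in panel A can be illustrated with a small bootstrap sketch: resample the similarity values with replacement, average each resample, and take the standard deviation of those averages. The function name, the fixed seed and the use of the sample (rather than population) standard deviation are assumptions made for this example, and the input values are invented.

```python
import random
import statistics


def bootstrap_sd_of_mean(values, n_boot=10, seed=0):
    """Standard deviation of the mean over n_boot bootstrap resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    means = []
    for _ in range(n_boot):
        # Draw len(values) items with replacement, then average them.
        resample = rng.choices(values, k=len(values))
        means.append(statistics.fmean(resample))
    # Assumption: sample standard deviation (n - 1 in the denominator).
    return statistics.stdev(means)


# Invented function-similarity values for one bin of the plot.
similarities = [0.1, 0.4, 0.5, 0.9, 1.0]
error_bar = bootstrap_sd_of_mean(similarities)
```

With only 10 resamples, as in the caption, the estimate is itself noisy, so it is best read as a rough indication of spread.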
Figure 4. Examples of differences in composite motifs and functions.
Left column: superposition of common elementary motifs (pink and cyan) and their ligands (magenta and blue). Center column: the biological unit containing the subunit with the elementary motif shown in the left column in pink, with interacting molecules (other than that in the left column) in green and non-interacting molecules in grey. Right column: the biological unit containing the subunit with the elementary motif shown in the left column in cyan, with interacting molecules (other than that in the left column) in green and non-interacting molecules in grey. A: Glycine oxidase (center) and glycerol-3-phosphate dehydrogenase (right), sharing an FAD binding motif (left). B: D-3-phosphoglycerate dehydrogenase (center) and C-terminal binding protein 3 (right), sharing an NAD binding motif (left). C: [formula image]-trypsin (center) and coagulation factor VII (right), sharing a protease inhibitor binding motif (left). D: Cytochrome [formula image] (center) and glycolate oxidase (right), sharing an FMN binding motif (left).
Figure 5. Meta-composite motifs.
A: A meta-composite motif is defined as the set of all composite motifs (hexagons in magenta) associated with particular UniProt functions (green circles). The associations are defined through individual protein subunits (black dots); see text for the detailed definitions. Each composite motif is associated with elementary motifs for non-polymer (triangles in cyan), protein (rectangles in orange), or nucleic acid (diamonds in blue) binding sites (cf. Fig. 1). B: A simplified representation of the diagram shown in A. C: Average function similarity as a function of meta-composite motif similarity or meta-sequence motif (type-1 and type-2) similarity.
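The definition in panel A amounts to collecting, for each function, every composite motif reached through some annotated subunit. A minimal sketch follows. All names and data are hypothetical, and the Jaccard index is used only as a stand-in similarity measure; the measure behind panel C is not specified in this caption.

```python
from collections import defaultdict


def meta_composite_motifs(subunit_functions, subunit_composite):
    """Map each function to the set of composite motifs of its subunits."""
    meta = defaultdict(set)
    for subunit, functions in subunit_functions.items():
        composite = subunit_composite.get(subunit)
        if composite is None:
            continue  # subunit without a composite motif contributes nothing
        for function in functions:
            meta[function].add(composite)
    return dict(meta)


def jaccard(a, b):
    """Jaccard index of two sets (assumed stand-in similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical subunit annotations.
subunit_composite = {"s1": "C1", "s2": "C2", "s3": "C2", "s4": "C3"}
subunit_functions = {
    "s1": {"Transcription"},
    "s2": {"Transcription", "Oxidoreductase"},
    "s3": {"Oxidoreductase"},
    "s4": {"Oxidoreductase"},
}

meta = meta_composite_motifs(subunit_functions, subunit_composite)
similarity = jaccard(meta["Transcription"], meta["Oxidoreductase"])
```

In this toy case the two meta-composite motifs are {C1, C2} and {C2, C3}; they share one of three composite motifs overall.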
Figure 6. Network structure of the meta motif for biological process.
Examples of a meta-composite motif (A) and a type-1 meta-sequence motif (B) for the UniProt biological process “Transcription.” A: The meta-composite motif, i.e., the set of composite motifs (colored hexagons) associated with Transcription. B: The type-1 meta-sequence motif, i.e., the set of type-1 sequence clusters associated with the same keyword.
Figure 7. Characteristics of meta motif networks.
A: Average counts of composite motifs or sequence clusters (denoted CM/SC), connected components (CC) as well as edges representing sharing of common elementary motifs (CEM) for non-polymer, protein and nucleic acid binding sites, common sequences (CS) and protein-protein interactions (PPI). B: The same counts for nodes and various edges, but only for the meta motifs for the UniProt keyword “Transcription” (corresponding to the diagrams in Fig. 6).
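Two of the counts in this figure, edges for common elementary motifs (CEM) and connected components (CC), can be illustrated on a toy network. The sketch assumes that two composite motifs are linked whenever they share at least one elementary motif, and counts components with a small union-find; the function names and the four example motifs are hypothetical.

```python
from itertools import combinations


def shared_motif_edges(composite_motifs):
    """Pairs of composite motifs sharing at least one elementary motif."""
    names = sorted(composite_motifs)
    return [
        (a, b)
        for a, b in combinations(names, 2)
        if composite_motifs[a] & composite_motifs[b]
    ]


def count_components(nodes, edges):
    """Number of connected components, using union-find with path halving."""
    parent = {node: node for node in nodes}

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(node) for node in nodes})


# Hypothetical composite motifs and their elementary motifs.
composite_motifs = {
    "C1": {"N1", "N2", "P1"},
    "C2": {"N2", "P2"},
    "C3": {"N2", "N3", "P3"},
    "C4": {"N4"},
}

edges = shared_motif_edges(composite_motifs)
components = count_components(composite_motifs, edges)
```

C1, C2 and C3 are mutually linked through N2, giving three edges and one component, while C4 stands alone as a second component.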

