BMC Bioinformatics. 2005 Aug 25;6:210.
doi: 10.1186/1471-2105-6-210.

Variation in structural location and amino acid conservation of functional sites in protein domain families

Birgit Pils et al. BMC Bioinformatics. 2005.

Abstract

Background: The functional sites of a protein present important information for determining its cellular function and are fundamental in drug design. Accordingly, accurate methods for the prediction of functional sites are of immense value. Most available methods are based on a set of homologous sequences and structural or evolutionary information, and assume that functional sites are more conserved than the average. In the analysis presented here, we have investigated the conservation of location and type of amino acids at functional sites, and compared the behaviour of functional sites between different protein domains.

Results: Functional sites were extracted from experimentally determined structural complexes from the Protein Data Bank harbouring a conserved protein domain from the SMART database. In general, functional (i.e. interacting) sites whose location is more highly conserved are also more conserved in their type of amino acid. However, even highly conserved functional sites can present a wide spectrum of amino acids. The degree of conservation strongly depends on the function of the protein domain and ranges from highly conserved in location and amino acid to very variable. Differentiation by binding partner shows that ion binding sites tend to be more conserved than functional sites binding peptides or nucleotides.

Conclusion: The results gained by this analysis will help improve the accuracy of functional site prediction and facilitate the characterization of unknown protein sequences.

Figures

Figure 1
Distribution of interaction scores. The interaction score reflects the importance of a functional site in establishing an interaction. Surprisingly, only a few interacting sites are absolutely conserved in their location within the whole protein family and characterized by high interaction scores. The majority of interacting sites have low interaction scores, which shows that these sites are used by only a few sequences of the domain family for establishing an interaction; this can also be caused by the different nature of the ligands.
Figure 2
Interaction profile of the RICIN domain. Alignment of positions corresponding to an HMM match state. Sites interacting with saccharides are indicated in blue, peptide interactions in orange, and sites interacting with both ligands, saccharides and peptides, are indicated in purple. Light colours represent backbone interactions, darker colours involve side chain atoms. The amino acid conservation is visualized by green bars below the alignment. Sugar binding sites described in the literature are indicated by red arrows above the alignment [41]. Several positions (1, 3, 4, 22, 42, 58, 88, 90, 122) are located in the vicinity of a glycosylation site, but do not specifically interact with saccharides. The unrooted tree reflects the classification into three main subgroups with different interaction sites. Group II harbours two sugar-binding sites, group I and III originate from tandem RICIN domains, in which group I preserved the N-terminal sugar-binding site and group III the carboxy-terminal binding site. PDB identifiers from top to bottom: 1PC8 (B: 5–131), 1TFM (B: 5–131), 2MLL (B: 5–131), 1CE7 (B: 5–131), 1ONK (B: 9–135), 1PUM (B: 9–135), 1M2T (B: 257–383), 1OQL (B:13–139), 1ABR (B: 13–139), 2AAI (B: 8–134), 1HWO (B: 10–135), 1HWP (B: 10–135), 1HWN (B:10–135), 1HWM (B:3–266), 1V6U (A: 312–436), 1ISW (A:312–436), 1ISV (A:312–436), 1ITO (A:312–436), 1V6W (A: 312–436), 1V6X (A: 312–436), 1XYF (A:312–436), 1ISY (A: 312–436), 1ISZ (A:312–436), 1V6V (A:312–436), 1ISX (A:312–436), 1KNM (A:7–131), 1KNL (A:9–133), 1BFM1MC9(A:9–133), 1QXM (A: 29–157), 1PUM (B: 140–262), 1M2T (B: 390–510), 1ONK (B: 140–262), 1OQL (B: 140–262), 1PC8 (B: 136–254), 1TFM (B: 136–254), 2MLL (B: 136–254), 1CE7 (B:136–254), 2AAI (B: 138–261), 1ABR (B: 143–266), 1HWO (B: 138–262), 1HWP (B: 138–262), 1HWM (B: 138–262), 1HWN (B: 139–263), 1FWU (A: 3–123), 1DQG (A: 4–124), 1DQO (A: 4–124), 1FWV (A: 3–123)
Figure 3
Variable location of interacting amino acid residues in the HMG domain. Sequence-specific interaction by the high mobility group (SMART: HMG) domain (green) is achieved by an amino acid side chain (pink) pointing into the DNA double helix (blue). The interaction is achieved by a phenylalanine in figure 3a [28] or by a serine residue in figure 3b [27]. The sequence alignment (figure 3c) reveals that these two interacting residues are not located at corresponding positions.
Figure 4
Amino acid conservation of interacting and non-interacting sites. Non-interacting sites (yellow) are slightly more highly conserved than interacting sites (red) as shown by the shift to higher amino acid conservation of interacting sites.
Figure 5
Substrate-specific interaction by varying the type of amino acid. Substrate specificity in the zinc finger domain (SMART: ZnF_C2H2) is ensured by various amino acids that interact with the bases of the DNA. The protein domain is highlighted in green, the DNA chain in orange and the zinc atom in red.
Figure 6
Correlation of interaction scores and amino acid conservation. For better visualization of the correlation, the data were divided into five groups corresponding to the 0–20% quantile, the 20–40% quantile, etc. of the amino acid conservation scores. The median interaction score and the median amino acid conservation score were then calculated for each group and plotted as red dots. The correlation coefficient, p-value and population size are indicated for each ligand above the graphs. The correlation coefficient was calculated according to Pearson's method under the null hypothesis of no correlation (c = 0).
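The grouping described in the Figure 6 caption can be illustrated with a minimal sketch. This is not the authors' code: the function names `pearson_r` and `quintile_medians`, the rank-based way of cutting the data into five equal-sized groups, and the toy scores are all assumptions made for illustration. The p-value is not computed here; a library routine such as `scipy.stats.pearsonr` could supply it.

```python
from math import sqrt
from statistics import mean, median


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def quintile_medians(conservation, interaction, n_groups=5):
    """Rank sites by conservation score, cut the ranking into n_groups
    equal-sized groups (an assumed binning scheme), and return one
    (median conservation, median interaction) pair per group."""
    pairs = sorted(zip(conservation, interaction))
    n = len(pairs)
    points = []
    for g in range(n_groups):
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if chunk:  # skip empty groups when there are fewer sites than groups
            points.append((median(c for c, _ in chunk),
                           median(i for _, i in chunk)))
    return points


# Hypothetical toy scores for ten sites (not data from the paper).
conservation = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
interaction = [2 * c for c in conservation]

r = pearson_r(conservation, interaction)
dots = quintile_medians(conservation, interaction)
```

With real data, `dots` would hold the five points plotted in red, and `r` would be computed separately for each ligand type.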

References

    1. Lichtarge O, Sowa ME. Evolutionary predictions of binding surfaces and interactions. Curr Opin Struct Biol. 2002;12:21–27. doi: 10.1016/S0959-440X(02)00284-1.
    2. Campbell SJ, Gold ND, Jackson RM, Westhead DR. Ligand binding: functional site location, similarity and docking. Curr Opin Struct Biol. 2003;13:389–395. doi: 10.1016/S0959-440X(03)00075-7.
    3. Jones S, Thornton JM. Searching for functional sites in protein structures. Curr Opin Chem Biol. 2004;8:3–7. doi: 10.1016/j.cbpa.2003.11.001.
    4. Bhinge A, Chakrabarti P, Uthanumallian K, Bajaj K, Chakraborty K, Varadarajan R. Accurate detection of protein:ligand binding sites using molecular dynamics simulations. Structure (Camb). 2004;12:1989–1999. doi: 10.1016/j.str.2004.09.005.
    5. Laurie AT, Jackson RM. Q-SiteFinder: an energy-based method for the prediction of protein-ligand binding sites. Bioinformatics. 2005;21:1908–1916. doi: 10.1093/bioinformatics/bti315.