Proteins. 2014 Feb;82 Suppl 2(0 2):154-63.
doi: 10.1002/prot.24495.

Assessment of ligand binding site predictions in CASP10

Tiziano Gallo Cassarino et al. Proteins. 2014 Feb.

Abstract

The identification of amino acid residues in proteins involved in binding small molecule ligands is an important step for their functional characterization, as the function of a protein often depends on specific interactions with other molecules. The accuracy of computational methods aiming to predict such binding residues was evaluated within the "function prediction (prediction of binding sites, FN)" category of the critical assessment of protein structure prediction (CASP) experiment. In the last edition of the experiment (CASP10), 17 research groups participated in this category, and their predictions were evaluated on 13 prediction targets containing biologically relevant ligands. The results of this experiment indicate that several methods achieved an overall good performance, showing the usefulness of such methods in predicting ligand binding residues. As in previous years, methods based on a homology transfer approach dominated. In comparison to CASP9, a larger fraction of the top predictors are automated servers. However, due to the small number of targets and the characteristics of the prediction format, the differences observed among the first ten methods were not statistically significant, and it was also not possible to analyze differences in accuracy for different ligand types or overall structure difficulty. To overcome these limitations and to allow for a more detailed evaluation, in future editions of CASP, methods in the FN category will no longer be evaluated on the "normal" CASP targets, but assessed continuously by CAMEO (continuous automated model evaluation) based on weekly prereleased sequences from the PDB.

Keywords: CASP; active site; assessment; binding site; cofactor; evaluation; ligand; protein function; protein structure.

Figures

Figure 1. Binding sites and ligands of the assessed targets
Biologically relevant ligands are coloured in green and the residues included in the binding site are coloured in blue. Targets and ligands included here: (A) T0652: AMP, (B) T0675: ZN, (C) T0686: MG, (D) T0696: NA, (E) T0697: LLP, (F) T0706: MG, (G-H) T0720: SF4 and MN, (I) T0721: FAD, (J) T0726: ZN, (K) T0737: FAD, (L) T0744: FNR. Targets T0657 and T0659 are in Figure 7.
Figure 2. Number of predictions per group
Since only a small number of all CASP10 targets contained relevant ligands, only a few predictions could be used for the assessment (dark blue and dark green), while the majority of the predictions could not be evaluated (blue and green).
Figure 3. Target difficulty
Distribution of the predictors’ MCC values for each target, shown as box plots (1st quartile, median, and 3rd quartile), indicating how difficult each binding site was to predict.
Figure 4. Groups ranking by MCC
The predictors are ranked in decreasing order by the average value of the MCC, calculated on all the evaluated targets. Human predictors are shown in blue and server predictors in yellow.
Figure 5. Groups ranking robustness
Methods were ranked using the median value of the MCC distributions after 100 cycles of random sampling using 70% of the targets. Bars indicate best, median and worst ranking for each group.
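One way to read the resampling procedure in this caption is sketched below. It is not the assessors' code: the function name `rank_robustness`, the sampling without replacement, the use of the mean MCC within each cycle, and scoring a missing prediction as 0 are all assumptions made for illustration, and the input table is hypothetical.

```python
import random
import statistics

def rank_robustness(mcc, cycles=100, fraction=0.7, seed=0):
    """mcc maps group -> {target: MCC}. Returns group -> (best, median, worst) rank."""
    rng = random.Random(seed)
    targets = sorted({t for scores in mcc.values() for t in scores})
    k = max(1, round(fraction * len(targets)))  # e.g. 70% of the targets
    ranks = {group: [] for group in mcc}
    for _ in range(cycles):
        sample = rng.sample(targets, k)  # assumption: sampling without replacement
        # Assumption: a group's score in a cycle is its mean MCC on the sample,
        # with a missing prediction counted as 0.
        means = {
            group: statistics.mean([scores.get(t, 0.0) for t in sample])
            for group, scores in mcc.items()
        }
        ordered = sorted(means, key=lambda g: (-means[g], g))
        for position, group in enumerate(ordered, start=1):
            ranks[group].append(position)
    return {g: (min(r), statistics.median(r), max(r)) for g, r in ranks.items()}
```

Calling `rank_robustness(table)` on a dictionary of per-target MCC values returns, for each group, the three rank values that the bars in the figure summarize.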
Figure 6. FAD binding site in target T0737
Residues 176-292 (D1, blue) have been classified as “free modeling”. However, the N-terminal domain (grey), where the ligand FAD is bound, is covered by experimental structures. (Image generated with OpenStructure.)
Figure 7. Binding site prediction examples
Residues are colored as “correctly predicted” (true positive, green) and “wrongly predicted” (false positive, violet). (A) The Zn2+ binding site in a Btk-type zinc finger in the PH domain from the Tyrosine-protein kinase Tec (T0657) is formed by His 121, Cys 132, Cys 133, and Cys 143. Coloring according to prediction by group “Binding_Kihara” (FN231). (B) Structure of a hypothetical protein T0659 with a Zn2+ ion bound by three conserved cysteine residues (Cys 43, Cys 48, and Cys 63). Coloring according to predictions by group “SP-ALIGN” (FN326).
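The MCC (Matthews correlation coefficient) used in the rankings above can be computed from the true and false positive residue counts described in this caption. The following is a minimal sketch using the standard MCC formula. The function name `binding_site_mcc`, the predicted residue set, the chain length of 160, and the convention of returning 0 when the denominator vanishes are illustrative assumptions, not values or choices taken from the assessment.

```python
import math

def binding_site_mcc(predicted, observed, n_residues):
    """MCC for a binding site prediction, treating each residue as one binary call."""
    predicted, observed = set(predicted), set(observed)
    tp = len(predicted & observed)   # predicted and in the binding site
    fp = len(predicted - observed)   # predicted but not in the binding site
    fn = len(observed - predicted)   # binding residues that were missed
    tn = n_residues - tp - fp - fn   # all remaining residues
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumption: report 0 when the MCC is undefined (e.g. nothing predicted).
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Observed Zn2+ binding residues of T0657 as listed in the caption.
observed = {121, 132, 133, 143}
# Hypothetical prediction with three correct residues and one false positive.
predicted = {121, 132, 133, 140}
print(binding_site_mcc(predicted, observed, n_residues=160))  # 160 is a made-up length
```

Because binding residues are a small fraction of any chain, the true negative count is large, which is one reason a correlation-style score such as the MCC is more informative here than plain accuracy.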
