Review. Curr Opin Struct Biol. 2018 Jun:50:91-100.
doi: 10.1016/j.sbi.2017.12.004. Epub 2018 Jan 9.

So you think computational approaches to understanding glycosaminoglycan-protein interactions are too dry and too rigid? Think again!

Nehru Viji Sankaranarayanan et al. Curr Opin Struct Biol. 2018 Jun.

Abstract

Glycosaminoglycans (GAGs) play key roles in virtually all biologic responses through their interaction with proteins. A major challenge in understanding these roles is their massive structural complexity. Computational approaches are extremely useful in navigating this bottleneck and, in some cases, are the only avenue to gain comprehensive insight. We discuss the state of the art in computational approaches and present a flowchart to help answer most basic, and some advanced, questions on GAG-protein interactions. For example, firstly, does my protein bind to GAGs?; secondly, where does the GAG bind?; thirdly, does my protein preferentially recognize a particular GAG type?; fourthly, what is the optimal GAG chain length?; fifthly, what is the structure of the most favored GAG sequence?; and finally, is my GAG-protein system 'specific', 'non-specific', or a combination of both? Recent advances show the field is now poised to enable a non-computational researcher to perform advanced experiments through the availability of various tools and online servers.

Conflict of interest statement

  1. The authors have no conflicts of interest to declare

Figures

Figure 1
A) Prediction of the GAG binding site on a protein by the ClusPro docking server (http://cluspro.bu.edu) [38,39]. A generic heparin sequence predicted to bind to FGF2 (cyan sticks) is compared with the crystal-bound hexamer (magenta). B) Prediction of GAG binding modes through the DMD approach [31,33]. The study deduced the most populated clusters of six DMD simulations with different GAG types and chain lengths binding to IL-10. Shown is a schematic visualization of the two principally different GAG binding modes observed following DMD. C) Prediction of the most favored GAG sequence that binds to HCII using the dual-filter CVLS strategy, which identifies ‘high-affinity & high-specificity’ sequences [11]. D) Prediction of two ternary GAG–protein complexes (AT–HS–T and HCII–HS–T) using the CVLS approach [26]. Although the two serpins (AT and HCII) are strikingly similar, the position of T (thrombin) in the two is dramatically different, and this matches the 60° difference in the mode of HS binding onto the two proteins. Figures to be reproduced after permission from respective publishers.
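The dual-filter idea in panel C can be illustrated with a minimal sketch. It assumes, as one plausible reading only, that the first filter keeps the best-scoring docked sequences (a proxy for affinity) and the second keeps those whose replicate poses agree closely (a proxy for specificity). The `DockingResult` record, the `dual_filter` function, the "higher score is better" convention, and both thresholds are hypothetical and are not taken from the cited work.

```python
from dataclasses import dataclass


@dataclass
class DockingResult:
    sequence: str     # label of the docked GAG sequence (hypothetical)
    score: float      # docking score; assumed here that higher is better
    pose_rmsd: float  # spread (in angstroms) among replicate top poses


def dual_filter(results, top_fraction=0.1, rmsd_cutoff=2.5):
    """Keep top-scoring sequences, then those with reproducible poses.

    Both thresholds are illustrative assumptions, not published values.
    """
    if not results:
        return []
    # Filter 1: rank by score and keep the best fraction (at least one).
    ranked = sorted(results, key=lambda r: r.score, reverse=True)
    keep = max(1, int(len(ranked) * top_fraction))
    # Filter 2: among those, keep sequences whose poses converge.
    return [r for r in ranked[:keep] if r.pose_rmsd <= rmsd_cutoff]


if __name__ == "__main__":
    toy = [
        DockingResult("seq_a", 90.0, 1.0),
        DockingResult("seq_b", 85.0, 4.0),
        DockingResult("seq_c", 60.0, 0.5),
        DockingResult("seq_d", 40.0, 1.0),
    ]
    for hit in dual_filter(toy, top_fraction=0.5):
        print(hit.sequence, hit.score, hit.pose_rmsd)
```

In practice the score and pose spread would come from the docking program's output, and the cutoffs would need tuning for each system.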
Figure 2
A flowchart describing the use of computational approaches in addressing key questions on GAG–protein interactions (Panels A through D). Although shown in sequential format (A→B→C→D), it is not strictly necessary to follow this flowchart rigorously, especially if some information is already available for any of the steps.
➀ The sequence and atomic coordinates of a protein can be obtained from the Protein Data Bank (www.rcsb.org).
➁ A homology model of a protein of unknown structure can be generated using programs such as Modeller (https://salilab.org/modeller/), Swiss-Model (https://swissmodel.expasy.org), etc.
➂ Consensus sequences include -XBBXBX- and -XBBBXXBX- (B = basic residue and X = hydropathic residue) [14], TXXBXXTBXXXTBB (T = turn) [41], the CPC clip motif [43], and a clamp-like orientation of basic residues with beta sheet conformations [44].
➃ If a protein satisfies step ➁, then it is likely to bind GAGs.
➄ a) Electrostatic potential (ESP) can be calculated using tools such as APBS from PyMol (https://www.pymol.org/), DeepView-Swiss-PdbViewer (http://spdbv.vital-it.ch/) and others. GRID search refers to the protocol described by Goodford [50]. Site mapping follows the technique described in [30]. b) From step ➄a, basic site(s)/subsite(s) can be identified.
➅ Experimental evidence typically includes site-directed mutagenesis, NMR, congenital mutation information, etc. [–54].
➆ Putative GAG binding site(s) are identified based on results from ESP, GRID search, and site-mapping techniques [–20, 30].
➇ This includes protonation, addition of hydrogens, modeling of missing residues and minimization of the protein using modeling software. GAG structures can be built using CHIMERA (https://www.cgl.ucsf.edu/chimera/) or GLYCAM (http://glycam.org/tools/molecular-dynamics/oligosaccharide-builder/build-glycan?id=8).
➈ Perform initial docking to the binding site(s) identified in step ➆ for various GAGs (HP, HS, CS, DS) of various lengths (dp2, dp4 and dp6) using Autodock (http://autodock.scripps.edu), Autodock Vina (http://vina.scripps.edu), GOLD (https://www.ccdc.cam.ac.uk/solutions/csd-discovery/components/gold/), DOCK (http://dock.compbio.ucsf.edu/), MOE (https://www.chemcomp.com/MOE-Structure_Based_Design.htm), or other programs (refer to the supplementary information).
➉ Here GAG length, radius of the binding site, number of iterations, number of docking runs, type of docking program, etc. are evaluated, and the best protocol is implemented in the production run.
⑪ Perform repeated molecular docking using the optimized program and parameters from step ➉ for a library of GAG sequences. A library of GAG sequences can be obtained from the Desai lab (built using SPL scripts) [23,24]. Based on need, this library could have 1,000 to more than 100,000 unique sequences.
⑫ Analysis includes ranking the docked poses by RMSD, energy, score, non-bonded interactions, etc. and identifying the most favored GAG sequence(s).
⑬ Although typically not considered part of a computational program, validation in solution experiments of the results obtained in ⑫ is extremely important.
⑭ Utilize the most favored GAG–protein complex from ⑬ and prepare initial coordinates for MD, which includes selecting a force field, ensuring charge neutralization, immersing the complex in an explicit box of solvent molecules, and minimizing the system.
⑮ Equilibration implies allowing the system to reach physiological conditions, such as constant temperature and pressure, under the chosen ensemble (NPT/NVE).
⑯ This includes performing an MD run for ~1 ns to ~1 ms, based on need, and collecting trajectory data.
⑰ – ⑲ Analysis of trajectories may involve RMSD convergence, direct and water-mediated H-bond interactions and their occupancies, binding free energy calculations (MMPBSA/MMGBSA), FEP, LIE and single residue energy decomposition calculations.
⑳ This involves ascertaining that the computational deduction of thermodynamic stability on the basis of steps ⑰ through ⑲ is supported by some results in solution.
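The consensus-sequence check in step ➂ lends itself to a short illustration. The sketch below scans a protein sequence for the -XBBXBX- and -XBBBXXBX- patterns. It assumes that B means Lys or Arg only and that X can be approximated as any non-basic residue; both are simplifications, and the function names and the toy sequence are hypothetical.

```python
import re

# Assumption: only Lys (K) and Arg (R) are treated as basic (B).
BASIC = "KR"


def motif_to_regex(motif):
    """Translate a B/X consensus pattern into a compiled regex.

    X is approximated as "any non-basic residue" (a simplification of
    "hydropathic"). The lookahead lets overlapping hits be reported.
    """
    parts = {"B": "[" + BASIC + "]", "X": "[^" + BASIC + "]"}
    body = "".join(parts[ch] for ch in motif)
    return re.compile("(?=(" + body + "))")


def scan(sequence, motifs=("XBBXBX", "XBBBXXBX")):
    """Return (motif, 1-based start position, matched segment) tuples."""
    hits = []
    seq = sequence.upper()
    for motif in motifs:
        for match in motif_to_regex(motif).finditer(seq):
            hits.append((motif, match.start() + 1, match.group(1)))
    return hits


if __name__ == "__main__":
    toy_sequence = "MAKKAKAGARRRGGKGL"  # made-up sequence, not a real protein
    for motif, start, segment in scan(toy_sequence):
        print(motif, start, segment)
```

A hit only flags a candidate region. As the flowchart indicates, structural analysis and experimental evidence are still needed to establish GAG binding.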
Figure 3
The use of MD in understanding GAG–protein interactions. Analysis of MD trajectories over a timescale of a few picoseconds to hundreds of microseconds affords a wealth of thermodynamic and kinetic information on GAG–protein co-complexes. Shown are results for an exemplary system, heparin octasaccharide (HS08) binding to CXCL5 in water (A-F). A) The observed intermolecular hydrogen bonds (H-bonds, broken black lines) between donor and acceptor atoms of the GAG and side chains of amino acids at a given time frame. B) Percent occupancy of H-bonds between key amino acid residues and HS08 (higher (red) to lower (blue)). C) A representative MD frame showing the bed of heterogeneous water molecules surrounding the interacting regions. The distribution of water engineers GAG–water, water–water, protein–water, and protein–GAG–water H-bonds. Water molecules are represented as red spheres and interactions are shown as faint black dashed lines. D) A significant number of interactions arise from water-mediated (i.e., not direct) H-bonds between GAG and protein, as shown. E) The overall occupancy of water-mediated H-bond interactions between GAG and protein. F) Single residue energy decomposition (SRED) of interacting amino acids in the co-complex as deduced by the MM-PBSA/MM-GBSA method.
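The percent occupancies in panels B and E amount to counting, for each interaction, the fraction of trajectory frames in which it is present. The sketch below assumes the H-bonds have already been detected per frame by a trajectory analysis tool and are supplied as sets of (residue, kind) pairs, where kind is "direct" or "water". The function name, residue labels and data are hypothetical and are not results from the study.

```python
from collections import Counter


def hbond_occupancy(frames):
    """Percent of frames in which each H-bond interaction is observed.

    frames: list with one entry per MD frame; each entry is an iterable
    of hashable interaction keys, e.g. (residue_label, kind) tuples.
    """
    n_frames = len(frames)
    if n_frames == 0:
        return {}
    counts = Counter()
    for contacts in frames:
        # set() ensures an interaction is counted once per frame.
        counts.update(set(contacts))
    return {key: 100.0 * seen / n_frames for key, seen in counts.items()}


if __name__ == "__main__":
    # Four toy frames with made-up residue labels.
    toy_frames = [
        {("RES_A", "direct"), ("RES_B", "water")},
        {("RES_A", "direct")},
        set(),
        {("RES_A", "direct"), ("RES_B", "water")},
    ]
    occupancy = hbond_occupancy(toy_frames)
    for key in sorted(occupancy):
        print(key, occupancy[key])
```

Keeping the direct and water-mediated kinds as part of the key lets the two occupancy profiles be reported separately, in the spirit of panels B and E.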

References

    1. Balagurunathan K, Nakato H, Desai UR. Glycosaminoglycans. Chemistry and Biology. Methods Mol Biol. 2015;1229:1–625.
    2. Babik S, Samsonov SA, Pisabarro MT. Computational drill down on FGF1-heparin interactions through methodological evaluation. Glycoconj J. 2017;34:427–440.
    3. Xu D, Esko JD. Demystifying heparan sulfate-protein interactions. Annu Rev Biochem. 2014;83:129–157.
    4. Mulloy B, Hogwood J, Gray E, Lever R, Page CP. Pharmacology of Heparin and Related Drugs. Pharmacol Rev. 2016;68:76–141.
    5. Joseph PR, Mosier PD, Desai UR, Rajarathnam K. Solution NMR characterization of chemokine CXCL8/IL-8 monomer and dimer binding to glycosaminoglycans: structural plasticity mediates differential binding interactions. Biochem J. 2015;472:121–133. The paper presents a good combination of biophysical and computational experiments to understand the nature of interactions in monomer–dimer equilibrium. MD studies show that the GAG binding is plastic and pinpointed residues contributing to the dynamic nature of the process.
