Protein Sci. 2018 Mar;27(3):798-808.
doi: 10.1002/pro.3353. Epub 2017 Dec 8.

Homology-based hydrogen bond information improves crystallographic structures in the PDB

Bart van Beusekom et al. Protein Sci. 2018 Mar.

Abstract

The Protein Data Bank (PDB) is the global archive for structural information on macromolecules and a popular resource for researchers, teachers, and students, attracting more than one million unique users each year. Crystallographic structure models in the PDB (more than 100,000 entries) are optimized against the crystal diffraction data and geometrical restraints. This process of crystallographic refinement typically ignores hydrogen bond (H-bond) distances as a source of information. However, H-bond restraints can improve structures at low resolution, where diffraction data are limited. To improve low-resolution structure refinement, we present methods for deriving H-bond information either globally, from well-refined high-resolution structures in the PDB-REDO databank, or specifically, from sets of homologous high-resolution structures constructed on the fly. Refinement incorporating HOmology DErived Restraints (HODER) improves geometrical quality and the fit to the diffraction data for many low-resolution structures. To make these improvements readily available to the general public, we applied our new algorithms to all crystallographic structures in the PDB: using massively parallel computing, we constructed a new instance of the PDB-REDO databank (https://pdb-redo.eu). This resource is useful for researchers to gain insight into individual structures, into specific protein families (as we demonstrate with examples), and into general features of protein structure using data mining approaches on a uniformly treated dataset.
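The abstract does not spell out how HODER turns homolog information into restraints, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that each homolog contributes donor-acceptor distances for equivalent atom pairs, and that a restraint target is the median distance with a spread-based sigma. The function name, the `min_support` and `min_sigma` parameters, and the atom identifiers are all hypothetical.

```python
from statistics import median, pstdev


def derive_hbond_restraints(homolog_hbonds, min_support=2, min_sigma=0.05):
    """Derive (target distance, sigma) per H-bond from homologous structures.

    homolog_hbonds: one dict per homolog, mapping a (donor, acceptor) atom
    pair to the observed donor-acceptor distance in angstrom.
    This is an illustrative assumption, not the published HODER procedure.
    """
    # Collect all distances observed for each equivalent atom pair.
    observed = {}
    for hbonds in homolog_hbonds:
        for pair, distance in hbonds.items():
            observed.setdefault(pair, []).append(distance)

    restraints = {}
    for pair, distances in observed.items():
        # Only trust H-bonds seen in at least `min_support` homologs.
        if len(distances) >= min_support:
            sigma = max(pstdev(distances), min_sigma)
            restraints[pair] = (median(distances), sigma)
    return restraints


# Hypothetical input: three homologs, two candidate H-bonds.
pair_a = ("A/10/N", "A/6/O")
pair_b = ("A/11/N", "A/7/O")
homologs = [{pair_a: 2.9, pair_b: 3.0}, {pair_a: 3.1}, {pair_a: 3.0}]
restraints = derive_hbond_restraints(homologs)
```

In this toy input only `pair_a` is retained, because `pair_b` is observed in a single homolog. A real implementation would also need a sequence alignment to map atoms between homologs, which is omitted here.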

Keywords: PDB; PDB-REDO; X-ray crystallography; databank; high-throughput computing; homology; hydrogen bonds; refinement; restraints.

Figures

Figure 1
Comparison of PDB‐REDO runs with and without homology‐based H‐bond restraints for all 155 entries in the test set. Each arrow represents the scores from two rerefinements of a single PDB entry. Arrow tails indicate scores from refinement without restraints; arrowheads indicate scores from refinement with restraints. Blue and red arrows indicate improvement and deterioration of the score, respectively. The scores shown are R-free (top), calculated by Refmac5 [23], and the Ramachandran Z score (middle) and first‐generation packing Z score (bottom) from WHAT_CHECK [24]. Arrows at the same resolution have been shifted by up to 0.05 Å to reduce clutter. Packing Z score and Ramachandran Z score are not shown if they were not computed by WHAT_CHECK; R-free is not shown if a new R-free set was chosen by PDB‐REDO [17].
Figure 2
R-free and Ramachandran Z score as a function of crystallographic resolution for entries present in the PDB, in the PDB‐REDO databank prior to the introduction of homology‐derived H‐bond restraints (PDB‐REDO version 6.23), and in the PDB‐REDO databank calculated with version 7.00. Outliers are shown when they are located beyond 1.5 times the interquartile range. R-free for PDB entries was determined by PDB‐REDO for consistency.
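The outlier criterion in this caption can be illustrated with a short sketch. It assumes the usual box-plot convention, in which the 1.5 × IQR distance is measured from the first and third quartiles; the caption does not state which quartile method the authors used, so the `inclusive` method and the sample values below are assumptions.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    # quantiles(..., n=4) returns the three quartile cut points.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical R-free values for entries in one resolution bin.
r_free_values = [0.20, 0.22, 0.23, 0.24, 0.25, 0.26, 0.27, 0.45]
outliers = iqr_outliers(r_free_values)
```

For these made-up values the quartiles are 0.2275 and 0.2625, so the upper fence is 0.315 and only 0.45 is flagged.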
Figure 3
Network representations of H‐bond information transfer between homologs. The nodes represent structures in the PDB‐REDO databank. Node size corresponds to the number of incoming edges and node color to the resolution (darker is lower). The edge weight corresponds to the number of homologous chains. (A) Breast cancer 1 (BRCA1). (B) Alcohol dehydrogenase (ADH). (C) Modules detected in the largest network. Node size reflects module size. The three most frequent terms in PDB TITLE records (stripped of English articles, punctuation, etc.) of the structure members in the labeled modules are (1) lysozyme, carbonic, anhydrase; (2) Fab, antibody, fragment; (3) antibody, Fab, HIV; (4) trypsin, inhibitor, thrombin; (5) HLA, peptide, class; (6) hsp90, bound, inhibitor; (7) ubiquitin, nucleosome, histone; (8) binding, maltose, bound. The MutS community (orange; MutS, mismatch, coli) is linked to community 8. (D) The MutS community.
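The node and edge attributes described in this caption can be sketched as a small directed graph. The sketch assumes that edges point from the structure donating H-bond information to the structure receiving it; the entry labels and chain counts are hypothetical and do not come from the paper.

```python
from collections import Counter


def incoming_edge_counts(edges):
    """Count incoming edges per node in a directed donor -> target network.

    edges: iterable of (donor, target, n_homologous_chains) tuples, where
    the third element plays the role of the edge weight.
    """
    counts = Counter()
    for donor, target, _n_chains in edges:
        counts[target] += 1  # the target receives information
        counts[donor] += 0   # make sure pure donors appear with count 0
    return dict(counts)


# Hypothetical network: two high-resolution donors, two low-resolution targets.
edges = [
    ("high_res_1", "low_res_1", 2),
    ("high_res_2", "low_res_1", 1),
    ("high_res_1", "low_res_2", 4),
]
node_sizes = incoming_edge_counts(edges)
```

Under this assumption, nodes that only donate information have zero incoming edges and would be drawn smallest.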
Figure 4
(A) The 4.5 Å structure model of E. coli maltose transporter (PDB entry 3fh6 [29]) after PDB‐REDO with HODER. All colored residues (strand in red, helix in blue) in the full structure are residues that changed in secondary structure between PDB, PDB‐REDO, and PDB‐REDO with restraints from HODER. Secondary structure elements are defined by CCP4mg [30], which assigns secondary structure using the DSSP algorithm [18]. The secondary structure content is highest after using homology‐based H‐bond restraints, which coincides with improvements of quality scores compared to the PDB structure: R-free (0.338 vs 0.3770), Ramachandran Z score (−6.7 vs −7.8), first‐generation packing Z score (−2.2 vs −2.8) and MolProbity overall percentile (60.0 vs 6.0). The all‐atom rmsd is 0.9 Å and the largest coordinate shift is 5.6 Å. (B) The network neighborhood of homologous PDB entries that were used to define the restraints. The target entry, 3fh6, is shown in yellow. The node size corresponds to the number of incoming edges and edge thickness represents the number of homologous chains used. Small nodes are the high‐resolution homologs that only donate information. (C, top) Details of a β‐strand region are shown for PDB, PDB‐REDO and PDB‐REDO with HODER‐generated restraints. The regularity of the strand is improved by PDB‐REDO compared to the PDB and still further improved when restraints are used. (C, bottom) Details of an α‐helical region in the same structure models. At such a low resolution, PDB‐REDO requires the restraints from HODER to retain helical regularity. (D) The average absolute difference of φ/ψ torsion angles between 3fh6 chains and homologous chains for each homologous chain in the PDB, in PDB‐REDO and in the new version of PDB‐REDO with restraints from HODER. The chains A, B, C, and D are homologous mixed α/β domains and there are two pairs of homologous α‐helical domains: chains F and H and G and I, respectively. These three groups of homologous chains are shown separately. The mixed α/β domains in particular become much more similar to their homologous counterparts. All chains become still more similar to homologs when restraints are applied. Some homologs are clearly more similar in conformation to 3fh6 than others. All average angle differences fall in the range between 20° and 62° presented in the legend.
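The quantity plotted in panel D, an average absolute difference of φ/ψ torsion angles, can be sketched as follows. The caption does not say how the residues of two chains were matched or how angle wrap-around was handled, so this sketch assumes pre-aligned residue lists and the shortest angular distance on a 360° circle. The function names and example angles are hypothetical.

```python
def angle_diff(a, b):
    """Smallest absolute difference between two angles in degrees (0 to 180)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def mean_abs_torsion_diff(chain_a, chain_b):
    """Average absolute phi/psi difference over aligned residues.

    chain_a, chain_b: equally long lists of (phi, psi) tuples in degrees,
    assumed to be aligned residue by residue.
    """
    if len(chain_a) != len(chain_b) or not chain_a:
        raise ValueError("chains must be non-empty and of equal length")
    total = 0.0
    for (phi1, psi1), (phi2, psi2) in zip(chain_a, chain_b):
        total += angle_diff(phi1, phi2) + angle_diff(psi1, psi2)
    # Two angles (phi and psi) are compared per residue.
    return total / (2 * len(chain_a))


# Hypothetical torsion angles for two aligned two-residue fragments.
target = [(-60.0, -45.0), (170.0, 130.0)]
homolog = [(-70.0, -40.0), (-170.0, 140.0)]
average_difference = mean_abs_torsion_diff(target, homolog)
```

The wrap-around step matters because φ values of 170° and −170° are only 20° apart, not 340°.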
Figure 5
The MolProbity [31] score percentiles (top) and energy of folding (ΔG) from FoldX [32] (bottom) for each chain in the six investigated protein families. Data are shown for PDB and PDB‐REDO with restraints from HODER. For the MolProbity percentiles, a single data point is shown per entry; for ΔG, a score is shown per chain. The red and green horizontal bars indicate the median values.

References

    1. Kleywegt GJ, Jones TA (2002) Homo crystallographicus ‐ quo vadis? Structure 10:465–472.
    2. Engh RA, Huber R (1991) Accurate bond and angle parameters for X‐ray protein structure refinement. Acta Cryst A47:392–400.
    3. Parkinson G, Vojtechovsky J, Clowney L, Brünger AT, Berman HM (1996) New parameters for the refinement of nucleic acid‐containing structures. Acta Cryst D52:57–64.
    4. Mooij WTM, Cohen SX, Joosten K, Murshudov GN, Perrakis A (2009) “Conditional restraints”: restraining the free atoms in ARP/wARP. Structure 17:183–189.
    5. Headd JJ, Echols N, Afonine PV, Grosse‐Kunstleve RW, Chen VB, Moriarty NW, Richardson DC, Richardson JS, Adams PD (2012) Use of knowledge‐based restraints in phenix.refine to improve macromolecular refinement at low resolution. Acta Cryst D68:381–390.
