J Biomol NMR. 2011 Aug;50(4):371-95.
doi: 10.1007/s10858-011-9522-4. Epub 2011 Jun 25.

Protein side-chain resonance assignment and NOE assignment using RDC-defined backbones without TOCSY data

Jianyang Zeng et al. J Biomol NMR. 2011 Aug.

Abstract

One bottleneck in NMR structure determination lies in the laborious and time-consuming process of side-chain resonance and NOE assignments. Compared to the well-studied backbone resonance assignment problem, automated side-chain resonance and NOE assignments are relatively less explored. Most NOE assignment algorithms require nearly complete side-chain resonance assignments from a series of through-bond experiments such as HCCH-TOCSY or HCCCONH. Unfortunately, these TOCSY experiments perform poorly on large proteins. To overcome this deficiency, we present a novel algorithm, called NASCA (NOE Assignment and Side-Chain Assignment), to automate both side-chain resonance and NOE assignments and to perform high-resolution protein structure determination in the absence of any explicit through-bond experiment to facilitate side-chain resonance assignment, such as HCCH-TOCSY. After casting the assignment problem into a Markov Random Field (MRF), NASCA extends and applies combinatorial protein design algorithms to compute optimal assignments that best interpret the NMR data. The MRF captures the contact map information of the protein derived from NOESY spectra, exploits the backbone structural information determined by RDCs, and considers all possible side-chain rotamers. The complexity of the combinatorial search is reduced by using a dead-end elimination (DEE) algorithm, which prunes side-chain resonance assignments that are provably not part of the optimal solution. Then an A* search algorithm is employed to find a set of optimal side-chain resonance assignments that best fit the NMR data. These side-chain resonance assignments are then used to resolve the NOE assignment ambiguity and compute high-resolution protein structures. Tests on five proteins show that NASCA assigns resonances for more than 90% of side-chain protons, and achieves about 80% correct assignments. The final structures computed using the NOE distance restraints assigned by NASCA have backbone RMSD 0.8-1.5 Å from the reference structures determined by traditional NMR approaches.
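
The DEE-then-A* search described in the abstract can be illustrated with a minimal Python sketch on a two-resonance toy problem. Everything here is an assumption made for illustration: the resonance names (a, b), the proton labels (HB, HG, HD), the pseudo-energy values, and the helper names (pair_e, dee_prune, lower_bound, enumerate_assignments) are hypothetical, and the scoring is not NASCA's actual pseudo-energy function. The sketch uses the original DEE criterion and a standard admissible lower bound from combinatorial protein design, and it does not model the constraint that two resonances cannot share one proton label.

```python
import heapq
from itertools import count

def pair_e(pair, i, r, j, s):
    """Pairwise pseudo-energy of node i taking label r and node j taking label s."""
    if (i, j) in pair:
        return pair[(i, j)].get((r, s), 0.0)
    return pair.get((j, i), {}).get((s, r), 0.0)

def dee_prune(labels, unary, pair):
    """Original DEE criterion: drop label r at node i if another label t has a
    worst-case energy that is still lower than the best-case energy of r."""
    labels = {i: list(ls) for i, ls in labels.items()}

    def bound(i, r, pick):  # pick=min gives the best case, pick=max the worst case
        return unary[i][r] + sum(
            pick(pair_e(pair, i, r, j, s) for s in labels[j])
            for j in labels if j != i)

    changed = True
    while changed:
        changed = False
        for i in labels:
            for r in list(labels[i]):
                if any(bound(i, r, min) > bound(i, t, max)
                       for t in labels[i] if t != r):
                    labels[i].remove(r)
                    changed = True
    return labels

def lower_bound(order, k, partial, labels, unary, pair):
    """Admissible bound on the energy still to be added by nodes order[k:]."""
    total = 0.0
    for a in range(k, len(order)):
        j = order[a]
        total += min(
            unary[j][s]
            + sum(pair_e(pair, j, s, order[m], partial[m]) for m in range(k))
            + sum(min(pair_e(pair, j, s, order[n], t) for t in labels[order[n]])
                  for n in range(a + 1, len(order)))
            for s in labels[j])
    return total

def enumerate_assignments(labels, unary, pair):
    """A* over partial assignments; yields complete assignments in order of
    non-decreasing pseudo-energy."""
    order = sorted(labels)
    tie = count()  # tiebreaker so the heap never compares tuples of labels
    heap = [(0.0, next(tie), 0.0, ())]  # (f = g + h, tiebreak, g, partial)
    while heap:
        _, _, g, partial = heapq.heappop(heap)
        k = len(partial)
        if k == len(order):
            yield dict(zip(order, partial)), g
            continue
        i = order[k]
        for r in labels[i]:
            g2 = g + unary[i][r] + sum(
                pair_e(pair, i, r, order[m], partial[m]) for m in range(k))
            new = partial + (r,)
            h = lower_bound(order, k + 1, new, labels, unary, pair)
            heapq.heappush(heap, (g2 + h, next(tie), g2, new))

# Toy data (invented values): two unassigned resonances and their candidate labels.
labels = {"a": ["HB", "HG", "HD"], "b": ["HB", "HG"]}
unary = {"a": {"HB": 1.0, "HG": 2.0, "HD": 9.0}, "b": {"HB": 2.0, "HG": 1.0}}
pair = {("a", "b"): {("HB", "HB"): 3.0, ("HB", "HG"): 0.0,
                     ("HG", "HB"): 0.0, ("HG", "HG"): 3.0,
                     ("HD", "HB"): 0.0, ("HD", "HG"): 0.0}}

pruned = dee_prune(labels, unary, pair)
results = list(enumerate_assignments(pruned, unary, pair))
print(pruned["a"])   # label HD is pruned for resonance a
print(results[0])    # lowest pseudo-energy assignment and its energy
```

In this toy, DEE removes only the label HD for resonance a, and the generator then lists the remaining four assignments in order of increasing pseudo-energy, mirroring the kind of enumeration shown in panels (E) and (F) of Fig. 1.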

Figures

Figure 1
A toy example to illustrate the basic idea of the MRF framework. (A): Cartoon NOESY spectrum. Resonances are represented by lower case letters, and NOESY cross peaks are shown as blue circles. For clarity, symmetric and diagonal peaks are not shown. (B): The NOESY graph. Unassigned side-chain resonance nodes are represented by white squares, while assigned backbone resonance nodes are represented by red squares. (C): The proton labels. The backbone structure is shown as blue sticks, and side-chain rotamers are shown as blue lines. Each green circle represents a side-chain proton label. (D): The pairwise pseudo-energy matrix. (E): Complete enumeration of side-chain resonance assignments for nodes a and b using the A* algorithm after the DEE pruning. Assignments of resonance nodes a and b are represented by the branches in the first and second tiers respectively. The node marked with a red X is pruned by the DEE algorithm from further consideration. The number at the bottom of each leaf node is the pseudo-energy of the corresponding assignments. The minimum pseudo-energy of the optimal assignments is shown in boldface. (F): All resonance assignments in order of increasing pseudo-energy.
Figure 2
Schematic illustration of the four major steps of NASCA. (A): Construction of the NOESY graph. (B): Construction of proton labels. (C): The side-chain resonance assignment process. (D): The NOE assignment process. An example of Steps (A), (B) and (C) is described in Fig. 1.
Figure 3
Accuracies of resonance assignments for different types of side-chain protons, where R stands for the aromatic protons.
Figure 4
Accuracies of side-chain resonance assignments for different residue types. The 4-χ type includes arginine and lysine. The 3-χ type includes methionine, glutamine and glutamic acid. The 2-χ type includes aspartic acid, asparagine, isoleucine, leucine, histidine, phenylalanine, tryptophan and tyrosine. The 1-χ type includes proline, threonine, valine, serine and cysteine.
Figure 5
Accuracies of resonance assignments for side-chain protons with different solvent accessibilities. The solvent accessibility for each proton was computed using the software MOLMOL (Koradi et al. 1996) with a solvent radius of 2.0 Å.
Figure 6
Final NMR structures computed using our automatically assigned NOEs. Row 1: the ensemble of the 20 lowest-energy NMR structures. Row 2: ribbon view of one structure in the ensemble. Row 3: backbone overlay of the mean structures (blue) vs. the corresponding NMR reference structures (green) (PDB ID of GB1 (Kuszewski et al. 1999): 3GB1; PDB ID of ubiquitin (Cornilescu et al. 1998): 1D3Z; PDB ID of FF2: 2E71; PDB ID of hSRI (Li et al. 2005): 2A7O; PDB ID of pol η UBZ (Bomar et al. 2007): 2I5O).
Figure 7
Plots of energy vs. SSE backbone RMSD to the NMR reference structure for hSRI and FF2 structures computed by XPLOR-NIH, using the sparse data in Table 4. (A): Plot of hSRI. (B) and (C): Plots of FF2 with different starting structures used in the RDC refinement step. The 20 lowest-energy structures among the 100 structures computed by XPLOR-NIH are plotted here. The backbone RMSD between the mean coordinates and the NMR reference structure for SSE regions is 7.3 Å, 5.7 Å and 6.9 Å for plots (A), (B) and (C) respectively. These results show that XPLOR-NIH failed to bootstrap the initial global fold calculation using the sparse data in Table 4.
