J Biomol NMR. 2009 Feb;43(2):63-78.
doi: 10.1007/s10858-008-9288-5. Epub 2008 Nov 26.

De novo protein structure generation from incomplete chemical shift assignments

Yang Shen et al. J Biomol NMR. 2009 Feb.

Abstract

NMR chemical shifts provide important local structural information for proteins. Consistent structure generation from NMR chemical shift data has recently become feasible for proteins with sizes of up to 130 residues, and such structures are of a quality comparable to those obtained with the standard NMR protocol. This study investigates the influence of the completeness of chemical shift assignments on structures generated from chemical shifts. The Chemical-Shift-Rosetta (CS-Rosetta) protocol was used for de novo protein structure generation with various degrees of completeness of the chemical shift assignment, simulated by omission of entries in the experimental chemical shift data previously used for the initial demonstration of the CS-Rosetta approach. In addition, a new CS-Rosetta protocol is described that improves robustness of the method for proteins with missing or erroneous NMR chemical shift input data. This strategy, which uses traditional Rosetta for pre-filtering of the fragment selection process, is demonstrated for two paramagnetic proteins and also for two proteins with solid-state NMR chemical shift assignments.
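
The simulated incompleteness described above can be illustrated with a minimal sketch. It assumes a chemical shift table stored as a dictionary keyed by (residue number, atom name); the function names `omit_shifts` and `residues_from_ranges` and the toy shift values are hypothetical and are not taken from the CS-Rosetta software. The two calls mirror the two kinds of omission shown in the figures: keeping only certain nucleus types, and keeping only certain residue ranges.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

# (residue number, atom name) -> chemical shift in ppm (assumed layout)
ShiftTable = Dict[Tuple[int, str], float]


def residues_from_ranges(ranges: Iterable[Tuple[int, int]]) -> Set[int]:
    """Expand inclusive (first, last) residue ranges into a set of residue numbers."""
    kept: Set[int] = set()
    for first, last in ranges:
        kept.update(range(first, last + 1))
    return kept


def omit_shifts(
    shifts: ShiftTable,
    keep_atoms: Optional[Set[str]] = None,
    keep_residues: Optional[Set[int]] = None,
) -> ShiftTable:
    """Return a copy of the table with all entries outside the kept sets removed.

    A value of None for either argument means "no restriction".
    """
    return {
        (res, atom): value
        for (res, atom), value in shifts.items()
        if (keep_atoms is None or atom in keep_atoms)
        and (keep_residues is None or res in keep_residues)
    }


# Toy input; the values are made up for illustration only.
toy = {
    (19, "N"): 120.1,
    (19, "CA"): 56.2,
    (25, "N"): 118.4,
    (25, "CB"): 30.0,
    (30, "HA"): 4.3,
}

# Keep only amide N, amide H and C-alpha shifts.
backbone_only = omit_shifts(toy, keep_atoms={"N", "HN", "CA"})

# Keep all nuclei, but only for residues 1-20, 30-51 and 60-120.
partial = omit_shifts(
    toy, keep_residues=residues_from_ranges([(1, 20), (30, 51), (60, 120)])
)
```

Returning a new dictionary rather than editing in place keeps the complete data set available, so several reduced data sets can be derived from the same input.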

Figures

Figure 1
Flow chart of CS-Rosetta structure generation protocol. In the hybrid fragment selection procedure, shown in red, step 1 selects 200 fragments from an initial cohort of 2000 fragments which has been extracted from the structural database by standard Rosetta methods. In the standard CS-Rosetta method, step 1 takes its fragments directly from the 2,200,000 fragments present in the structural database.
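
The two-stage idea behind the hybrid selection in this caption can be sketched as follows. This is only an illustration under stated assumptions: both scoring functions are hypothetical placeholders where a lower score is better, and the function `select_fragments` is not part of Rosetta or CS-Rosetta. The defaults of 2000 and 200 are the cohort sizes given in the caption.

```python
import heapq
from typing import Callable, Iterable, List, TypeVar

Fragment = TypeVar("Fragment")


def select_fragments(
    candidates: Iterable[Fragment],
    prefilter_score: Callable[[Fragment], float],
    shift_score: Callable[[Fragment], float],
    n_prefilter: int = 2000,
    n_final: int = 200,
) -> List[Fragment]:
    """Two-stage selection: pre-filter first, then rank by chemical shift score.

    Both scores are assumed to be "lower is better".
    """
    # Stage 0: initial cohort chosen without chemical shifts (placeholder score).
    cohort = heapq.nsmallest(n_prefilter, candidates, key=prefilter_score)
    # Step 1: final fragments chosen from the cohort by chemical shift agreement.
    return heapq.nsmallest(n_final, cohort, key=shift_score)


# Toy example: integers stand in for fragments.
database = list(range(100))
hybrid = select_fragments(
    database,
    prefilter_score=lambda f: abs(f - 50),  # hypothetical pre-filter score
    shift_score=lambda f: f,                # hypothetical chemical shift score
    n_prefilter=9,
    n_final=3,
)
# Without the pre-filter, the shift score alone ranks the whole database.
standard = heapq.nsmallest(3, database, key=lambda f: f)
```

In the toy example the pre-filter confines the final choice to candidates 46 to 54, so a misleading shift score can no longer pull in fragments that the first stage considers implausible.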
Figure 2
CS-Rosetta structure generation for TM1442 with missing chemical shift assignments for certain types of nuclei. (A, B) Plots of accuracy of fragments selected using the MFR (blue) and hybrid (red) methods with the chemical shift inputs for δ15N, δ1HN and δ13Cα (as contained in the dataset Ih). Quality of three-residue (A) and nine-residue (B) fragments is represented by the average (bold lines) and lowest (lines with dots) rmsd of 200 selected fragments relative to the experimental coordinates of the corresponding TM1442 segment. (C, D) Plots of Rosetta all-atom energy, rescored by using the input chemical shifts (as contained in the dataset Ih), versus Cα rmsd relative to the experimental TM1442 structure, for CS-Rosetta models obtained using MFR (C) and hybrid (D) fragment selection methods. (E-H) CS-Rosetta fragment selection and structure generation for TM1442 using only δ13Cα and δ13Cβ (as contained in the dataset Ii).
Figure 3
CS-Rosetta structure for TM1442 with missing chemical shifts. (A, B) Plots of accuracy of fragment candidates selected using the MFR (blue) and hybrid (red) methods using chemical shift values δ15N, δ1HN, δ13Cα, δ13Cβ, δ13C' and δ1Hα for residues 1–20, 30–51 and 60–120 (as contained in the dataset IIe). For each three-residue (A) and nine-residue (B) segment of TM1442, 200 fragments were selected. Average (bold lines) and lowest (lines with dots) rmsd of these fragments relative to the experimental coordinates of the corresponding TM1442 segment are plotted with respect to the position of the first segment residue in the TM1442 sequence. The regions corresponding to the “unassigned” residues are shaded; the secondary structure elements are displayed at the top. (C, D) Plots of Rosetta all-atom energy, rescored by using the input chemical shifts (as contained in the dataset IIe), versus Cα rmsd relative to the experimental TM1442 structure, for CS-Rosetta models obtained using MFR (C) and hybrid (D) fragment selection methods.
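
The per-segment statistics plotted in panels (A) and (B) can be reproduced with a short sketch, assuming the rmsd of every selected fragment has already been computed elsewhere. The helper names `segment_starts` and `summarize_fragment_rmsds` and the toy rmsd values are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Sequence, Tuple


def segment_starts(n_residues: int, fragment_length: int) -> List[int]:
    """1-based start positions of every full-length segment in the sequence."""
    return list(range(1, n_residues - fragment_length + 2))


def summarize_fragment_rmsds(
    rmsds_by_start: Dict[int, Sequence[float]],
) -> Dict[int, Tuple[float, float]]:
    """Map each segment start position to (average rmsd, lowest rmsd)."""
    return {
        start: (mean(values), min(values))
        for start, values in sorted(rmsds_by_start.items())
    }


# Toy rmsd values (in angstroms) for two segment start positions.
toy_rmsds = {1: [0.5, 1.5, 1.0], 2: [2.0, 4.0]}
summary = summarize_fragment_rmsds(toy_rmsds)

# A 12-residue sequence has four nine-residue segments: 1-9, 2-10, 3-11, 4-12.
starts = segment_starts(12, 9)
```

Plotting the first and second tuple elements against the start position gives the average and lowest rmsd curves, respectively.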
Figure 4
CS-Rosetta structure generation of TM1442 with chemical shift errors. (A, B) Plots of accuracy of fragments selected using the MFR (blue) and hybrid (red) methods, with the inputs swapped for the δ15N, δ1HN, δ13Cα, δ13Cβ, δ13C' and δ1Hα assignments of dipeptides Ser52-Ser53 and Ser82-Ser83 (as contained in the dataset IIIb). For each three-residue (A) and nine-residue (B) segment of TM1442, 200 fragments were selected. Average (bold lines) and lowest (lines with dots) rmsd of these fragments relative to the experimental coordinates of the corresponding TM1442 segment are plotted with respect to the position of the first segment residue in the TM1442 sequence. The regions corresponding to the “mis-assigned” residues are shaded; secondary structure elements are displayed at the top. (C, D) Plots of Rosetta all-atom energy, rescored by using the input chemical shifts (as contained in the dataset IIIb), versus Cα rmsd relative to the experimental TM1442 structure, for CS-Rosetta models obtained using MFR (C) and hybrid (D) fragment selection methods.
Figure 5
CS-Rosetta fragment selection and structure generation for GB3 (A-C) and ubiquitin (D-F), using chemical shift assignments from solid-state NMR. (A, D) Plots of the lowest (upper panel) and average (lower panel) backbone coordinate rmsds (N, Cα and C') between query segment and two hundred 3-residue fragments, selected using the MFR (blue) and hybrid (red) methods, as a function of starting position in the sequence. (B, E) Same as (A, D), but for 9-residue fragments. (C, F) Plots of Rosetta all-atom energy, rescored by using the experimental ssNMR chemical shifts, versus Cα rmsd relative to the experimental NMR structures of GB3 and ubiquitin, for the CS-Rosetta all-atom models obtained using fragments selected by the MFR (upper panel, blue dots) and hybrid (lower panel, red dots) methods. The solid black lines in (C, F) represent the normalized number of structures found at a given Cα rmsd.
Figure 6
CS-Rosetta structure generation for paramagnetic calbindin (A-C) and ferredoxin (D-F). (A, D) Plots of the lowest (upper panel) and average (lower panel) backbone coordinate rmsds (N, Cα and C') between query segment and two hundred 3-residue fragment candidates, selected using the MFR (blue) and hybrid (red) methods, as a function of starting position in the sequence. The regions lacking chemical shift assignments are shaded. (B, E) Same as (A, D), but for 9-residue fragments. (C, F) Plots of Rosetta all-atom energy, rescored by the experimental chemical shifts, versus Cα rmsd of final all-atom models (including only residues located in elements of secondary structure) relative to the corresponding X-ray (calbindin) and NMR (ferredoxin) structure. Only results from CS-Rosetta all-atom models obtained by the hybrid fragment selection procedure are shown; when using fragments from the standard MFR method, Rosetta fails to converge. Residues included in the backbone rmsd calculation are 3–14, 25–40, 46–53 and 63–74 for calbindin, and 4–11, 15–22, 27–34, 54–56, 71–75 and 91–93 for ferredoxin.
Figure 7
Comparison of experimental (blue) and lowest-energy CS-Rosetta (red) structures for paramagnetic calbindin (A) and ferredoxin (B). Superposition is optimized for residues in secondary structure, defined in the caption to Fig. 6. (A) The sidechains of residues involved in metal binding, including their metal-ligating oxygen atoms, as well as the X-ray positions of the Ca2+ ions (cyan), are shown. Metal-ligating residues (atoms) include Ala14 (O), Glu17 (O), Asp19 (O), Gln22 (O), Glu27 (Oε1/Oε2), Asp54 (Oδ1), Asn56 (O), Asp58 (Oδ1), Glu60 (O) and Glu65 (Oε1/Oε2). (B) Backbone ribbon representation of the lowest-energy CS-Rosetta structure (red) superimposed on the experimental X-ray structure (blue) for ferredoxin, with superposition optimized for the residues in secondary structure (see caption to Fig. 6). The sidechain S atoms of Cys42, Cys47, Cys50 and Cys82, which coordinate the [2Fe-2S] cluster, are marked as solid spheres. Figures made using Molmol (Koradi et al. 1996).

References

    1. Agarwal V, Diehl A, Skrynnikov N, Reif B. High resolution H-1 detected H-1,C-13 correlation spectra in MAS solid-state NMR using deuterated proteins with selective H-1,H-2 isotopic labeling of methyl groups. J. Am. Chem. Soc. 2006;128:12620–12621.
    2. Altschul SF, Madden TL, Schaffer AA, Zhang JH, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25:3389–3402.
    3. Ando I, Kameda T, Asakawa N, Kuroki S, Kurosu H. Structure of peptides and polypeptides in the solid state as elucidated by NMR chemical shift. J. Mol. Struct. 1998;441:213–230.
    4. Andreini C, Bertini I, Rosato A. A hint to search for metalloproteins in gene banks. Bioinformatics. 2004;20:1373–1380.
    5. Asakura T, Demura M, Date T, Miyashita N, Ogawa K, Williamson MP. NMR study of silk I structure of Bombyx mori silk fibroin with N-15- and C-13-NMR chemical shift contour plots. Biopolymers. 1997;41:193–203.