Protein Sci. 2003 Feb;12(2):288-95.
doi: 10.1110/ps.0222303.

Accurate and automated classification of protein secondary structure with PsiCSI

Ling-Hong Hung et al. Protein Sci. 2003 Feb.

Abstract

PsiCSI is a highly accurate and automated method of assigning secondary structure from NMR data, which is a useful intermediate step in the determination of tertiary structures. The method combines information from chemical shifts and protein sequence using three layers of neural networks. Training and testing were performed on a suite of 92 proteins (9437 residues) with known secondary and tertiary structure. Using a stringent cross-validation procedure in which the target and homologous proteins were removed from the databases used for training the neural networks, an average per-residue Q3 accuracy of 89% was observed. This is an increase of 6.2 and 5.5 percentage points (representing 36% and 33% fewer errors) over methods that use chemical shifts (CSI) or sequence information (Psipred) alone. In addition, PsiCSI improves upon the translation of chemical shift information to secondary structure (Q3 = 87.4%) and is able to use sequence information as an effective substitute for sparse NMR data (Q3 = 86.9% without ¹³C shifts and Q3 = 86.8% with only Hα shifts available). Finally, errors made by PsiCSI almost exclusively involve the interchange of helix or strand with coil and not helix with strand (<2.5 occurrences per 10000 residues). The automation, increased accuracy, absence of gross errors, and robustness with regard to sparse data make PsiCSI ideal for high-throughput applications and should improve the effectiveness of hybrid NMR/de novo structure determination methods. A Web server is available for users to submit data and have the assignment returned.
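The accuracy figures quoted in the abstract can be related to one another with a little arithmetic. The sketch below is illustrative only and is not taken from PsiCSI: the function names, the single-letter labels (H for helix, E for strand, C for coil), and the toy strings are assumptions introduced here. It shows how a per-residue Q3 score is commonly computed, how a gain in accuracy translates into a relative reduction in errors, and how helix/strand interchanges might be counted per 10000 residues.

```python
def q3(predicted: str, observed: str) -> float:
    """Percentage of residues whose three-state label (H, E, C) matches."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


def relative_error_reduction(new_accuracy: float, gain: float) -> float:
    """Percent fewer errors when accuracy rises by `gain` points to `new_accuracy`."""
    old_error = 100.0 - (new_accuracy - gain)  # error rate of the baseline method
    return 100.0 * gain / old_error


def helix_strand_swaps_per_10k(predicted: str, observed: str) -> float:
    """Rate of helix<->strand interchanges, scaled to 10000 residues."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    swaps = sum({p, o} == {"H", "E"} for p, o in zip(predicted, observed))
    return 10000.0 * swaps / len(observed)


# Hypothetical toy labels, not data from the paper.
print(round(q3("HHHECC", "HHHCCC"), 1))               # 83.3
# Figures from the abstract: 89% Q3, gains of 6.2 and 5.5 points.
print(round(relative_error_reduction(89.0, 6.2)))     # 36
print(round(relative_error_reduction(89.0, 5.5)))     # 33
```

With the abstract's numbers, a baseline at 82.8% accuracy has a 17.2% error rate, so a 6.2-point gain removes about 36% of the errors; the same calculation for the 5.5-point gain gives about 33%, matching the values reported.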

Figures

Fig. 1.
Distribution of Q3 accuracies. (A) The distribution of Q3 accuracies for PsiCSI, CSI, and Psipred is shown. For CSI, consensus predictions were not available for two cases, and these were omitted. Not only is the increased accuracy readily apparent, but the consistency of PsiCSI is revealed by a tight and nearly symmetrical distribution of accuracies. Both CSI and Psipred have significant populations in which the methods do relatively poorly, in contrast to PsiCSI, for which no protein fares worse than 74% and the average Q3 accuracy is 89%. (B) The distribution of the differences between the Q3 accuracy of PsiCSI and that of CSI and Psipred for the same protein. As would be expected from the overall increased accuracy of PsiCSI, the distribution indicates that there are relatively few cases in which PsiCSI performs more poorly than CSI or Psipred, and then only marginally so. Conversely, the improvement observed when using PsiCSI can be very large, indicating that the method can still be effective in cases in which CSI or Psipred do very poorly.
