BMC Bioinformatics. 2009 Oct 31:10:366.
doi: 10.1186/1471-2105-10-366.

(PS)2-v2: template-based protein structure prediction server

Chih-Chieh Chen et al. BMC Bioinformatics.

Abstract

Background: Template selection and target-template alignment are critical steps for template-based modeling (TBM) methods. Identifying templates in the twilight zone of 15-25% sequence similarity between targets and templates remains difficult for template-based protein structure prediction. This study presents the (PS)2-v2 server, which builds on our original server with numerous enhancements and modifications to improve reliability and applicability.

Results: To detect homologous proteins with remote similarity, the (PS)2-v2 server uses the S2A2 matrix, a 60 × 60 substitution matrix based on the secondary structure propensities of the 20 amino acids, together with the position-specific sequence profile (PSSM) generated by PSI-BLAST. In addition, our server uses multiple templates and multiple models to build and assess models. Our method was evaluated on the Lindahl benchmark for fold recognition and the ProSup benchmark for sequence alignment. The evaluation indicated that our method outperforms sequence-profile approaches and performs comparably to structure-based methods on these benchmarks. Finally, we tested our method on the 154 TBM targets of the CASP8 (Critical Assessment of Techniques for Protein Structure Prediction) dataset. Experimental results show that (PS)2-v2 ranked 6th among 72 servers and is faster than the five top-ranked servers, which utilize ab initio methods.

Conclusion: Experimental results demonstrate that (PS)2-v2 with the S2A2 matrix is useful for template selection and target-template alignment because it blends amino acid and structural propensities. The multiple-template and multiple-model strategies significantly improve the accuracy of target-template alignments in the twilight zone. We believe that this server is useful in structure prediction and modeling, especially for detecting homologous templates with sequence similarity in the twilight zone.
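The 60 × 60 shape of the S2A2 matrix follows from pairing each of the 20 amino acids with a secondary structure state. The sketch below illustrates one way such residue-structure (RS) letters could be indexed and used to score an ungapped alignment. It is not the server's implementation: the three-state labels (H, E, C), the index layout, and the scores in `toy_matrix` are assumptions made for illustration, and none of the numbers are real S2A2 values.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
SS_STATES = "HEC"  # assumed labels: helix, strand, coil


def rs_index(residue, ss):
    """Map a (residue, secondary structure) pair to an index in 0..59."""
    return SS_STATES.index(ss) * len(AMINO_ACIDS) + AMINO_ACIDS.index(residue)


def encode_rs(sequence, secondary_structure):
    """Encode a sequence and its per-residue structure string as RS indices."""
    if len(sequence) != len(secondary_structure):
        raise ValueError("sequence and structure strings must have equal length")
    return [rs_index(r, s) for r, s in zip(sequence, secondary_structure)]


def toy_matrix():
    """Build a placeholder 60 x 60 matrix (hypothetical scores, not S2A2)."""
    size = len(AMINO_ACIDS) * len(SS_STATES)

    def score(i, j):
        ss_i, ss_j = i // len(AMINO_ACIDS), j // len(AMINO_ACIDS)
        if i == j:
            return 2   # identical RS letters score highest
        if ss_i == ss_j:
            return 1   # same secondary structure state
        if {ss_i, ss_j} == {0, 1}:
            return -2  # helix letter aligned with strand letter
        return 0

    return [[score(i, j) for j in range(size)] for i in range(size)]


def alignment_score(matrix, rs_a, rs_b):
    """Sum substitution scores over an ungapped, equal-length alignment."""
    return sum(matrix[i][j] for i, j in zip(rs_a, rs_b))


if __name__ == "__main__":
    m = toy_matrix()
    a = encode_rs("AC", "HH")
    b = encode_rs("AD", "HE")
    print(alignment_score(m, a, b))
```

The toy scores only mirror the qualitative pattern described for the matrix (high on the diagonal, low for helix against strand); they ignore residue similarity, which the real matrix also captures.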


Figures

Figure 1
The framework of the (PS)2-v2 server for protein structure prediction.
Figure 2
Overview of the (PS)2-v2 server. The protein sequence of telomere replication protein Est3 (UniProt Q03096) in Saccharomyces cerevisiae was used as the query. (A) Input format of the (PS)2-v2 server. (B) Search results for a query protein, comprising the target name, sequence, predicted secondary structure, a graph of the aligned regions and the hit list of templates for the query. (C) The selected template, target-template alignment and predicted structure of Est3. (D) Visualization of the predicted structure of Est3. (E) The model quality assessment.
Figure 3
The S2A2 substitution matrix. Scores are high when residue-structure (RS) letters with similar residue types and the same secondary structure are aligned (red blocks). When two identical RS letters are aligned (the diagonal entries), the substitution scores are very high. In contrast, scores are low when helix letters are aligned with strand letters (blue blocks).
Figure 4
Comparison of the (PS)2-v2 server with the (A) (PS)2-original and (B) (PS)2-CASP8 servers on the 154 TBM targets in CASP8. Among these 154 targets, (PS)2-v2 yields higher GDT_TS scores than (PS)2-original for 99 targets and higher scores than (PS)2-CASP8 for 34 targets. The three servers have similar GDT_TS scores when the sequence identity (SI) between the target and template is more than 30% (blue +). (PS)2-v2 outperforms our previous servers when SI is less than 20% (green ×).
Figure 5
Comparison of the (PS)2-v2 server with the (PS)2-original and (PS)2-CASP8 servers on the target T0504 in CASP8. The (PS)2-CASP8 server uses human spindlin1 (PDB code 2ns2) as the template; in contrast, (PS)2-v2 uses a multiple-template strategy and selects both the 53BP1 tandem tudor domains (PDB code 2g3r) and PHD finger protein 20-like 1 (PDB code 2eqm) as templates. (PS)2-v2 significantly outperforms (PS)2-CASP8 on the T0504-D1 and T0504-D3 domains.
Figure 6
(PS)2-v2 results for the single-model and multiple-model strategies on the 154 targets in CASP8, based on GDT_TS scores. When the multiple-model method is used, the GDT_TS score improves for 23 targets and decreases for 4 targets. For the other 127 targets, (PS)2-v2 obtains the same GDT_TS scores. The symbols "+", "▫" and "×" mark targets with sequence identity (SI) ≥ 30%, between 20% and 30%, and less than 20%, respectively.
Figure 7
(PS)2-v2 models the target T0471 in CASP8 using multiple models. The server selects the five top-ranking structures (PDB codes 2nwrA, 1peaA, 1nv8A, 1ufrA and 1v2dA) as templates using the S2A2 and PSSM scoring matrices. For each template, (PS)2-v2 generates 5 structures, and (D) the final model (1nv8) is identified by the program ProQ based on LGscore.
Figure 8
An example of the prediction results for the target T0409 from the (PS)2-v2 server: the alignment and predicted structure of the BIG_1156.2 domain of putative penicillin-binding protein MrcA from Nitrosomonas europaea ATCC 19718. (A) The alignment between the query and the selected template, translation initiation factor 5A protein (PDB code 1bkbA) from Pyrobaculum aerophilum. (B) Superposition of the native structure of T0409 (broad; PDB code 3d0f) and the predicted structure (thin). The green blocks are the regions where the predicted structure matches the native structure. The yellow and purple blocks indicate shift errors between the predicted and native structures, with Cα distances between them of <5 Å and >5 Å, respectively.
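The GDT_TS scores used in Figures 4 and 6 can be illustrated with a simplified calculation. GDT_TS averages the percentage of Cα atoms that lie within 1, 2, 4 and 8 Å of their native positions. The sketch below assumes that the model and native coordinates are already superimposed and matched residue by residue; the standard measure additionally searches over superpositions to maximize each percentage, which this example omits. The function name and input layout are hypothetical.

```python
import math


def gdt_ts(model_ca, native_ca, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS for pre-superimposed, residue-matched C-alpha coordinates.

    model_ca and native_ca are equal-length sequences of (x, y, z) tuples in
    angstroms. Returns a score between 0 and 100.
    """
    if len(model_ca) != len(native_ca) or not native_ca:
        raise ValueError("coordinate lists must be non-empty and of equal length")

    # Per-residue distance between predicted and native C-alpha positions.
    distances = [math.dist(m, n) for m, n in zip(model_ca, native_ca)]

    # Fraction of residues within each distance cutoff.
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances) for cutoff in cutoffs
    ]

    # Average over the cutoffs, expressed as a percentage.
    return 100.0 * sum(fractions) / len(cutoffs)


if __name__ == "__main__":
    native = [(0.0, 0.0, 0.0)] * 4
    model = [(0.5, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
    print(gdt_ts(model, native))
```

Because no superposition search is performed, this sketch can only underestimate or equal the score that a full GDT_TS implementation would report for the same pair of structures.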
