Functional analysis of the Escherichia coli genome using the sequence-to-structure-to-function paradigm: identification of proteins exhibiting the glutaredoxin/thioredoxin disulfide oxidoreductase activity
- PMID: 9743619
- DOI: 10.1006/jmbi.1998.2061
Abstract
The application of an automated method for the screening of protein activity based on the sequence-to-structure-to-function paradigm is presented for the complete Escherichia coli genome. First, the structure of the protein is identified from its sequence using a threading algorithm, which aligns the sequence to the best matching structure in a structural database and extends sequence analysis well beyond the limits of local sequence identity. Then, the active site is identified in the resulting sequence-to-structure alignment using a "fuzzy functional form" (FFF), a three-dimensional descriptor of the active site of a protein. Here, this sequence-to-structure-to-function concept is applied to analysis of the complete E. coli genome, i.e. all E. coli open reading frames (ORFs) are screened for the thiol-disulfide oxidoreductase activity of the glutaredoxin/thioredoxin protein family. We show that the method can identify the active sites in ten sequences that are known or proposed to exhibit this activity. Furthermore, oxidoreductase activity is predicted in two other sequences that have not been identified previously. This method distinguishes protein pairs with similar active sites from protein pairs that are just topological cousins, i.e. those having similar global folds, but not necessarily similar active sites. Thus, this method provides a novel approach for extraction of active site and functional information based on three-dimensional structures, rather than simple sequence analysis. Prediction of protein activity is fully automated and easily extendible to new functions. Finally, it is demonstrated here that the method can be applied to complete genome database analysis.
Copyright 1998 Academic Press.
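The second step described in the abstract, checking a threaded model against a three-dimensional active-site descriptor, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class and function names (`DistanceConstraint`, `FuzzyFunctionalForm`, `find_sites`), the representation of the descriptor as required residue types plus C-alpha distance ranges, and all numeric values. The residue types and distances below are placeholders and are not the published descriptor for the glutaredoxin/thioredoxin family. The threading step is not reproduced here; the sketch assumes it has already assigned a C-alpha coordinate (or `None` for unaligned positions) to each residue of the query sequence.

```python
import math
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class DistanceConstraint:
    i: int            # index into the descriptor's residue tuple
    j: int            # index into the descriptor's residue tuple
    target: float     # C-alpha distance in angstroms (placeholder value)
    tolerance: float  # allowed deviation; this is the "fuzzy" part


@dataclass(frozen=True)
class FuzzyFunctionalForm:
    residues: tuple     # required one-letter residue types
    constraints: tuple  # DistanceConstraint objects between those residues


def find_sites(fff, sequence, coords):
    """Return every tuple of sequence positions that satisfies the descriptor.

    `coords[k]` is the C-alpha coordinate assigned to residue k by the
    sequence-to-structure alignment, or None if residue k is unaligned.
    """
    # Candidate positions for each descriptor residue: right type, aligned.
    candidates = [
        [k for k, aa in enumerate(sequence) if aa == required and coords[k] is not None]
        for required in fff.residues
    ]
    hits = []
    for positions in product(*candidates):
        if len(set(positions)) != len(positions):
            continue  # one sequence position cannot fill two descriptor slots
        if all(
            abs(math.dist(coords[positions[c.i]], coords[positions[c.j]]) - c.target)
            <= c.tolerance
            for c in fff.constraints
        ):
            hits.append(positions)
    return hits


# Hypothetical descriptor: two cysteines and a proline with placeholder distances.
toy_fff = FuzzyFunctionalForm(
    residues=("C", "C", "P"),
    constraints=(
        DistanceConstraint(0, 1, target=5.0, tolerance=1.0),
        DistanceConstraint(0, 2, target=6.0, tolerance=1.0),
    ),
)

# Toy query sequence and made-up coordinates standing in for a threaded model.
toy_sequence = "ACGPCAP"
toy_coords = [
    (50.0, 50.0, 50.0),  # A
    (0.0, 0.0, 0.0),     # C
    (40.0, 40.0, 40.0),  # G
    (20.0, 0.0, 0.0),    # P, too far from the cysteines
    (5.0, 0.0, 0.0),     # C
    (60.0, 60.0, 60.0),  # A
    (0.0, 6.0, 0.0),     # P, near the first cysteine
]

toy_hits = find_sites(toy_fff, toy_sequence, toy_coords)
```

In a genome-wide screen of the kind the abstract describes, a loop would run such a check over the model produced for each ORF and report those with at least one hit. A sequence whose model has a similar global fold but no residue set satisfying the constraints would return an empty list, which corresponds to the "topological cousin" case.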