Comparing the predicted and observed properties of proteins encoded in the genome of Escherichia coli K-12
- PMID: 9298646
- DOI: 10.1002/elps.1150180807
Abstract
Mining the emerging abundance of microbial genome sequences for hypotheses is an exciting prospect of "functional genomics". At the forefront of this effort, we compared the predictions of the complete Escherichia coli genomic sequence with the observed gene products by assessing 381 proteins for their mature N-termini, in vivo abundances, isoelectric points, molecular masses, and cellular locations. Two-dimensional gel electrophoresis (2-DE) and Edman sequencing were combined to sequence Coomassie-stained 2-DE spots representing the abundant proteins of wild-type E. coli K-12 strains. More than 90% of the abundant proteins in the E. coli proteome lie in a small window of isoelectric point 4-7 and molecular mass 10-100 kDa. We identified several highly abundant proteins, YjbJ, YjbP, YggX, HdeA, and AhpC, which would not have been predicted from the genomic sequence alone. Of the 223 uniquely identified loci, 60% of the encoded proteins are proteolytically processed. As previously reported, the initiator methionine was efficiently cleaved when the penultimate amino acid was serine or alanine. In contrast, when the penultimate amino acid was threonine, glycine, or proline, cleavage was variable, and valine did not signal cleavage. Although signal peptide cleavage sites tended to follow predicted rules, the length of the putative signal sequence was occasionally greater than the consensus. For proteins predicted to be in the cytoplasm or inner membrane, the N-terminal amino acids were highly constrained compared to proteins localized to the periplasm or outer membrane. Although cytoplasmic proteins follow the N-end rule for protein stability, proteins in the periplasm or outer membrane do not follow this rule; several have N-terminal amino acids predicted to destabilize the proteins.
Surprisingly, 18% of the identified 2-DE spots represent isoforms in which protein products of the same gene have different observed pI and M(r), suggesting they are post-translationally processed. Although most of the predicted and observed values for isoelectric point and molecular mass show reasonable concordance, for several proteins the observed values significantly deviate from the expected values. Such discrepancies may represent either highly processed proteins or misinterpretations of the genomic sequence. Our data suggest that AhpC, CspC, and HdeA exist as covalent homomultimers, and that IcdA exists as at least three isoforms even under conditions in which covalent modification is not predicted. We enriched for proteins based on subcellular location and found several proteins in unexpected subcellular locations.
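The initiator-methionine observations reported in the abstract can be restated as a small lookup. The Python sketch below is illustrative only: the function name `met_cleavage_class` and the category labels are hypothetical and do not come from the paper, and residues the abstract does not mention are returned as `"not_stated"` rather than guessed.

```python
# Illustrative sketch only. The residue groupings restate the abstract;
# the function name and category labels are hypothetical.
EFFICIENT = {"S", "A"}      # serine, alanine: Met efficiently cleaved
VARIABLE = {"T", "G", "P"}  # threonine, glycine, proline: cleavage variable
NOT_CLEAVED = {"V"}         # valine: did not signal cleavage


def met_cleavage_class(sequence: str) -> str:
    """Classify initiator-Met cleavage from the residue following Met.

    `sequence` is a one-letter amino acid string for the predicted
    (unprocessed) protein, starting at the initiator methionine.
    """
    seq = sequence.strip().upper()
    if len(seq) < 2 or seq[0] != "M":
        return "not_applicable"
    penultimate = seq[1]  # the residue adjacent to the initiator Met
    if penultimate in EFFICIENT:
        return "efficient"
    if penultimate in VARIABLE:
        return "variable"
    if penultimate in NOT_CLEAVED:
        return "not_cleaved"
    # The abstract makes no statement about other residues.
    return "not_stated"
```

For example, `met_cleavage_class("MSKIV")` falls in the "efficient" group because serine follows the methionine. The lookup only summarizes the reported trend; it is not a predictor validated against the paper's data.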