Comparative Study
Gene. 1999 Sep 3;237(1):113-21.
doi: 10.1016/s0378-1119(99)00310-8.

How many potentially secreted proteins are contained in a bacterial genome?

G Schneider. Gene. 1999.

Abstract

Artificial neural networks were trained to predict the subcellular location of bacterial proteins. A cross-validated average prediction accuracy of 93% was reached for the distinction between cytoplasmic and non-cytoplasmic proteins, based on the analysis of protein amino-acid composition. Principal component analysis and self-organizing maps were used to create graphical representations of amino-acid sequence space. A clear separation of cytoplasmic, periplasmic, and extracellular proteins was observed. The neural network system was applied to predicting potentially secreted proteins in 15 complete genomes. For mesophilic bacteria, the predicted fractions of non-cytoplasmic proteins agree with previously published estimates, ranging between 15% and 30%. Characteristics of thermophilic genomes might lead presently available prediction systems to underestimate the fraction of secreted proteins. A self-organizing map was constructed from all 15 bacterial genomes. This technique can reveal additional sequence features independently of exhaustive pairwise sequence alignment. The Treponema pallidum and Mycobacterium tuberculosis data formed separate clusters, indicating unusual characteristics of these genomes.
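The abstract names amino-acid composition as the input to the classifier. As a minimal sketch of that feature only (not the author's implementation), the composition of a sequence can be computed as a 20-dimensional vector of relative frequencies. The function name `aa_composition`, the alphabet ordering, the handling of non-standard symbols, and the toy sequence are all assumptions made for illustration.

```python
from collections import Counter
from typing import List

# The 20 standard amino acids in one-letter code (ordering is arbitrary
# and chosen here only for illustration).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_composition(sequence: str) -> List[float]:
    """Return the relative frequency of each standard amino acid.

    Symbols outside the 20-letter alphabet (e.g. X, B, Z) are ignored.
    This is an assumption of the sketch; the abstract does not say how
    such symbols were treated.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        # Empty or fully non-standard input: return an all-zero vector.
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical toy sequence, not taken from any real protein.
example = "MKKLLAAL"
vector = aa_composition(example)
```

A vector of this kind, one per protein, is the sort of input that could then be passed to a classifier or projected with principal component analysis; those steps are not shown here.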
