Proc Natl Acad Sci U S A. 2009 Mar 17;106(11):4201-6.
doi: 10.1073/pnas.0811922106. Epub 2009 Feb 27.

Bimodal protein solubility distribution revealed by an aggregation analysis of the entire ensemble of Escherichia coli proteins

Tatsuya Niwa et al. Proc Natl Acad Sci U S A. 2009.

Abstract

Protein folding often competes with intermolecular aggregation, which in most cases irreversibly impairs protein function, as exemplified by the formation of inclusion bodies. Although it has been empirically determined that some proteins tend to aggregate, the relationship between the protein aggregation propensities and the primary sequences remains poorly understood. Here, we individually synthesized the entire ensemble of Escherichia coli proteins by using an in vitro reconstituted translation system and analyzed the aggregation propensities. Because the reconstituted translation system is chaperone-free, we could evaluate the inherent aggregation propensities of thousands of proteins in a translation-coupled manner. A histogram of the solubilities, based on data from 3,173 translated proteins, revealed a clear bimodal distribution, indicating that the aggregation propensities are not evenly distributed across a continuum. Instead, the proteins can be categorized into 2 groups, soluble and aggregation-prone proteins. The aggregation propensity is most prominently correlated with the structural classification of proteins, implying that the prediction of aggregation propensity requires structural information about the protein.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Schematic illustration of the experiment. Each ORF in the ASKA library, which has all of the E. coli ORFs, was amplified by PCR using 2 common primers to translate the gene in the cell-free translation system. The reconstituted cell-free translation system (the PURE system) contains no chaperones. After the 60-min translation, an aliquot of the translation mixture was centrifuged to obtain the soluble fraction. The uncentrifuged (Total) and supernatant (Sup) fractions were subjected to SDS/PAGE, and the translated products were quantified by autoradiography.
Fig. 2.
Solubility distribution for quantified proteins. (A) Histogram of solubility for the 3,173 quantified proteins. The proteins with solubilities <30% and >70% were defined as the aggregation-prone (Agg, colored pink) and soluble (Sol, colored blue) groups, respectively. (B) Histogram of solubility for 2,277 predicted cytoplasmic proteins. (C) Histogram of solubility for essential proteins. (D) The ratio of subcellular location (predicted) in all quantified (Total), Agg, and Sol groups. Cyto, cytoplasmic proteins; IMP, integral membrane proteins; Peri, periplasmic proteins; MA, membrane-anchored proteins; OM, outer membrane lipoproteins and β-barrel proteins.
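The grouping rule in panel A can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: it assumes that solubility is the supernatant (Sup) signal expressed as a percentage of the uncentrifuged (Total) signal, and the protein names and band intensities are hypothetical. Only the 30% and 70% thresholds come from the caption.

```python
from collections import Counter

# Thresholds taken from the Fig. 2 caption (percent solubility).
AGG_MAX = 30.0  # below this: aggregation-prone (Agg)
SOL_MIN = 70.0  # above this: soluble (Sol)


def solubility_percent(total_signal: float, sup_signal: float) -> float:
    """Return Sup as a percentage of Total (assumed definition of solubility)."""
    if total_signal <= 0:
        raise ValueError("total_signal must be positive")
    return 100.0 * sup_signal / total_signal


def classify(solubility: float) -> str:
    """Assign a protein to the Agg, Sol, or intermediate group."""
    if solubility < AGG_MAX:
        return "Agg"
    if solubility > SOL_MIN:
        return "Sol"
    return "Intermediate"


# Hypothetical band intensities: name -> (Total, Sup).
measurements = {
    "protA": (1000.0, 120.0),
    "protB": (800.0, 720.0),
    "protC": (500.0, 250.0),
}

groups = {
    name: classify(solubility_percent(total, sup))
    for name, (total, sup) in measurements.items()
}
counts = Counter(groups.values())
print(groups)
print(counts)
```

Proteins at exactly 30% or 70% fall into the intermediate group here, because the caption defines the two groups with strict inequalities.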
Fig. 3.
Correlation between solubility and physicochemical properties. (A) Histograms of molecular mass in the Total, Agg, and Sol groups. (B) Scatter plot of solubility versus isoelectric point. (C) Histograms of the relative contents of negatively charged residues (Asp and Glu) (Left) and hydrophobic residues (Val, Leu and Ile) (Right) in the Total, Agg, and Sol groups.
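The relative residue contents plotted in panel C can be computed from a primary sequence as follows. This is a rough sketch under stated assumptions: "relative content" is taken to mean the fraction of residues in the sequence that belong to the named set, and the example sequence is hypothetical. The residue sets (Asp and Glu; Val, Leu and Ile) are those named in the caption.

```python
# One-letter codes for the residue sets named in the Fig. 3 caption.
NEGATIVE = frozenset("DE")       # Asp, Glu
HYDROPHOBIC = frozenset("VLI")   # Val, Leu, Ile


def residue_fraction(sequence: str, residues: frozenset) -> float:
    """Fraction of positions in `sequence` occupied by any of `residues`."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(1 for aa in seq if aa in residues) / len(seq)


# Hypothetical 10-residue sequence, for illustration only.
example = "MDEVLIKDEA"

negative_content = residue_fraction(example, NEGATIVE)
hydrophobic_content = residue_fraction(example, HYDROPHOBIC)
print(f"negatively charged: {negative_content:.2f}")
print(f"hydrophobic (V, L, I): {hydrophobic_content:.2f}")
```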
Fig. 4.
Correlation between solubility and tertiary structure. (A) Histograms of solubility in the SCOP classes. SCOP class abbreviations: all α proteins (a); all β proteins (b); α and β proteins (α/β) (c); α and β proteins (α+β) (d). (B) The ratio of the Agg and Sol proteins in each SCOP fold. Details of each fold and the assigned number of proteins with statistical significance (P values) in each fold are described in Table S2. (C) Histograms of solubility for the GroEL substrate proteins. The classification of the substrates is according to Kerner et al. (27), in which Classes I, II, and III are spontaneously foldable, chaperone-dependent (but partially GroEL-dependent) and obligate GroEL/ES-dependent substrates, respectively.
