Proc Natl Acad Sci U S A. 2011 Jul 19;108(29):11954-8.
doi: 10.1073/pnas.1017361108. Epub 2011 Jul 5.

Reductive evolution of proteomes and protein structures

Minglei Wang et al. Proc Natl Acad Sci U S A. 2011.

Abstract

The lengths of orthologous protein families in Eukarya are almost double the lengths found in Bacteria and Archaea. Here we examine protein structures in 745 genomes and show that protein length differences between superkingdoms arise from much shorter prokaryotic nondomain linker sequences. Eukaryotic, bacterial, and archaeal linkers are 250, 86, and 73 aa residues in length, respectively, whereas folded domain sequences are 281, 280, and 256 residues, respectively. Cryptic domains match linkers (P < 0.0001) with probabilities ranging between 0.022 and 0.042; accordingly, they do not affect length estimates significantly. Linker sequences support intermolecular binding within proteomes and they are probably enriched in intrinsically disordered regions as well. Reductively evolved linker sequence lengths in growth-rate-maximized cells should be proportional to proteome diversity. By using the total in-frame coding capacity of a genome [i.e., coding sequence (CDS)] as a reliable measure of proteome diversity, we find that linker lengths of prokaryotes clearly evolve in proportion to CDS values, whereas those of eukaryotes are larger than expected and vary more randomly. Domain lengths scarcely change over the entire range of CDS values. Thus, the protein linkers of prokaryotes evolve reductively whereas those of eukaryotes do not.
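To make the domain/linker distinction concrete, the following is a minimal sketch of how a protein's residues could be partitioned into domain residues and N-terminal, internal, and C-terminal linker residues once domain boundaries are known. It is an illustration only, not the authors' pipeline: the function name `split_domains_and_linkers`, the 1-based inclusive coordinate convention, and the example boundaries are all assumptions introduced here.

```python
from typing import Dict, List, Tuple


def split_domains_and_linkers(
    protein_length: int, domains: List[Tuple[int, int]]
) -> Dict[str, object]:
    """Partition a protein into domain and linker residues.

    Hypothetical helper: `domains` holds (start, end) pairs in 1-based,
    inclusive coordinates and must not overlap. Every residue outside a
    domain is counted as linker.
    """
    if not domains:
        raise ValueError("at least one domain is required")

    domain_residues = 0
    n_terminal = 0
    internal: List[int] = []
    previous_end = 0

    for index, (start, end) in enumerate(sorted(domains)):
        # Reject inverted, overlapping, or out-of-range domain boundaries.
        if start > end or start <= previous_end or end > protein_length:
            raise ValueError(f"invalid domain boundaries: {(start, end)}")
        gap = start - previous_end - 1
        if index == 0:
            n_terminal = gap          # residues before the first domain
        elif gap > 0:
            internal.append(gap)      # residues between adjacent domains
        domain_residues += end - start + 1
        previous_end = end

    c_terminal = protein_length - previous_end  # residues after the last domain

    return {
        "domain_residues": domain_residues,
        "n_terminal_linker": n_terminal,
        "internal_linkers": internal,
        "c_terminal_linker": c_terminal,
        "linker_residues": n_terminal + sum(internal) + c_terminal,
    }


# Made-up example: a 500-residue protein with two 150-residue domains.
example = split_domains_and_linkers(500, [(51, 200), (301, 450)])
print(example)
```

Averaging such per-protein values over all proteins in a genome would give genome-level mean domain and linker lengths of the kind reported above, under whatever domain-assignment method supplies the boundaries.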

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Plot shows the average lengths (in amino acid residues) of domains and linkers in 745 genomes, including 215 Eukarya (blue circles), 478 Bacteria (pink circles), and 52 Archaea (brown circles). Mean values for proteins with different domain numbers within the same genome are well separated (dashed lines) because of the increasing aggregate lengths.
Fig. 2.
Three-dimensional plot illustrates distribution of average lengths of corresponding N-terminal, C-terminal, and internal linker sequences in every genome analyzed in Fig. 1. The coordinates of every genome (circle) were determined by plotting average lengths of linkers of all protein sequences in a genome.
Fig. 3.
Average domain and linker length plotted against CDS length for individual genomes.
