Cell. 2008 Jul 25;134(2):341-52.
doi: 10.1016/j.cell.2008.05.042.

Mistranslation-induced protein misfolding as a dominant constraint on coding-sequence evolution

D Allan Drummond et al. Cell.

Abstract

Strikingly consistent correlations between rates of coding-sequence evolution and gene expression levels are apparent across taxa, but the biological causes behind the selective pressures on coding-sequence evolution remain controversial. Here, we demonstrate conserved patterns of simple covariation between sequence evolution, codon usage, and mRNA level in E. coli, yeast, worm, fly, mouse, and human that suggest that all observed trends stem largely from a unified underlying selective pressure. In metazoans, these trends are strongest in tissues composed of neurons, whose structure and lifetime confer extreme sensitivity to protein misfolding. We propose, and demonstrate using a molecular-level evolutionary simulation, that selection against toxicity of misfolded proteins generated by ribosome errors suffices to create all of the observed covariation. The mechanistic model of molecular evolution that emerges yields testable biochemical predictions, calls into question the use of nonsynonymous-to-synonymous substitution ratios (Ka/Ks) to detect functional selection, and suggests how mistranslation may contribute to neurodegenerative disease.

Figures

Figure 1
Covariation of gene expression levels, patterns of codon usage, and rates of gene evolution are conserved across a bacterium, yeast, worm, fly, mouse, and human. A, All pairwise correlations between nonsynonymous and synonymous evolutionary rates (dN and dS), mRNA expression level, fraction of optimal codons (Fop), and transition-transversion ratio reveal conserved patterns of genome evolution across widely diverged taxa. Red lines show lowess-smoothed data. B, Correlation matrices and signs display a block structure. Correlation strengths (lower left of matrix) and signs (upper right of matrix) for each organism are shown; those with P > 0.05 after false-discovery-rate correction for multiple testing are shown overlaid with a black square. C, Human correlations controlled for intronic guanine and cytosine content recapitulate the structure conserved in other organisms, with the exception of a positive Fop–dS correlation. D, Similar correlations arise in a large-scale simulation involving selection against costs of mistranslation-induced protein misfolding (left), but not in the same simulation when mistranslation-induced misfolding imposes no cost (right).
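The masking rule in panel B (correlations with P > 0.05 after false-discovery-rate correction are covered with a black square) can be sketched as follows. The caption does not name the correction procedure; this sketch assumes the Benjamini-Hochberg step-up method, and the p-values are invented for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values, smallest first.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for five pairwise correlations (not from the paper).
raw = [0.001, 0.008, 0.039, 0.041, 0.27]
adj = benjamini_hochberg(raw)
# True marks a correlation that would be overlaid with a black square.
masked = [q > 0.05 for q in adj]
print(masked)
```

Note that two of the raw p-values fall below 0.05 yet are masked after adjustment, which is the point of correcting for multiple testing.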
Figure 2
Principal component analysis (PCA) of the correlation matrices in Figure 1 reveals a simple structure. A, The percentage of variance in all five variables explained by each of the first three principal components (of five; the remaining two are omitted for clarity). The dotted line indicates 20% of the variance, the cutoff for a component to have any meaningful explanatory value. B, Cluster analysis of the first three principal components reveals a tight cluster which contains the dominant principal components from all organisms and the misfolding-cost simulation (red box), but which excludes the dominant principal component from the no-cost simulation (green box).
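One way to read the 20% line in panel A: in a PCA of a correlation matrix of five variables, the eigenvalues sum to five, so each original variable accounts for 1/5 = 20% of the total variance, and a component below the line explains less than a single variable would. The sketch below computes the variance share of the dominant component by power iteration. The equicorrelated matrix is a made-up stand-in, not the paper's data, and the function name is hypothetical.

```python
import math


def dominant_eigenvalue(matrix, iterations=200):
    """Estimate the largest eigenvalue of a symmetric positive
    semidefinite matrix by power iteration."""
    n = len(matrix)
    v = [float(i + 1) for i in range(n)]  # arbitrary non-uniform start vector
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the converged unit vector.
    mv = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(v[i] * mv[i] for i in range(n))


n_vars = 5
r = 0.5  # hypothetical uniform correlation between every pair of variables
corr = [[1.0 if i == j else r for j in range(n_vars)] for i in range(n_vars)]

lam = dominant_eigenvalue(corr)
share = lam / n_vars        # fraction of total variance on the first component
threshold = 1.0 / n_vars    # the 20% line for five variables
print(share > threshold)
```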
Figure 3
The misfolding hypothesis. A, Outcomes of translation. Most proteins exit the ribosome (left) with no errors (bottom), but a substantial proportion contain at least one error (top). The probability of misfolding after correct translation is lower than after erroneous translation (center). Some proteins attain native state but then improperly unfold (right). Natural selection can act at four points: at 1), to reduce the frequency of translation errors in certain proteins; at 2), to reduce the proportion of error-containing proteins which misfold; at 3), to reduce the number of error-free proteins which misfold; and at 4), to reduce the number of proteins (with or without errors) which improperly unfold. B, Adaptations to higher misfolding costs constrain sequence evolution because adapted sequences are rare. So long as evolutionarily viable gene sequences are substantially more adapted than random sequences and adaptation levels are roughly bell-shaped in distribution, the number of alternative sequences (possible alleles) compatible with higher levels of adaptation (accuracy, robustness, etc.) declines rapidly. Adaptation to increasing misfolding costs therefore leads to increasing evolutionary constraint and slower sequence evolution.
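The argument in panel B can be made concrete with a small calculation. Assuming, purely for illustration, that adaptation levels across sequences follow a normal distribution, the fraction of sequences at or above a given adaptation level is the upper tail of that distribution, which shrinks rapidly as the required level rises. The normal form and the thresholds below are assumptions, not values from the paper.

```python
import math


def fraction_at_least(z):
    """Upper-tail probability of a standard normal distribution:
    the share of sequences whose adaptation level lies at least
    z standard deviations above the mean."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Hypothetical adaptation thresholds, in standard deviations above the mean.
for z in (0, 1, 2, 3, 4):
    print(z, fraction_at_least(z))
```

Each additional standard deviation of required adaptation removes a larger proportion of the remaining sequences, which is the sense in which higher misfolding costs leave fewer viable alleles.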
Figure 4
Correlations of per-tissue mRNA levels with dN, dS, and ts/tv ratio for fly (A), mouse (B), and human (C, controlled for intronic guanine+cytosine content) vary systematically across tissues. Tissues composed primarily of neurons are indicated with a red bar. Dotted lines indicate the minimum and maximum correlation strengths.
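The caption does not say how the human correlations were controlled for intronic guanine+cytosine content. One standard approach, assumed here, is the first-order partial correlation, which removes the part of a pairwise correlation that both variables share with a third. The input correlations below are invented for illustration.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    return (r_xy - r_xz * r_yz) / denom


# Hypothetical pairwise correlations (not from the paper):
r_expr_dn = -0.30   # mRNA level vs. dN
r_expr_gc = 0.40    # mRNA level vs. intronic GC content
r_dn_gc = -0.20     # dN vs. intronic GC content

controlled = partial_correlation(r_expr_dn, r_expr_gc, r_dn_gc)
print(controlled)
```

When the control variable is uncorrelated with both of the others, the partial correlation reduces to the ordinary one.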
Figure 5
Unexpected correlations between dS and other variables can be explained by selection against mistranslation-induced misfolding. A, An unexpected negative correlation between dS and transition/transversion ratio is strongest at third-codon-position sites and weakest at second-codon-position sites in all six organisms, and in the simulation under conditions where mistranslation-induced protein misfolding imposes a cost. For each organism, from left to right, the Spearman rank correlations between whole-gene dS and the transition/transversion ratio computed only from substitutions at the first, second, or third codon positions in each gene are shown. B, An unexpected positive correlation arises between dS and dN/dS in most organisms and in the simulation when mistranslation-induced misfolding is costly; Spearman rank dS–dN/dS correlations in each organism and the simulations are shown.
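For readers unfamiliar with the statistic, a Spearman rank correlation is the Pearson correlation of the ranks of two variables, with tied values given their average rank. A minimal sketch follows; the dS and ts/tv values are made up and the helper names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-gene values (not from the paper).
ds = [0.9, 0.7, 0.5, 0.3, 0.1]
ts_tv = [1.0, 1.5, 1.4, 2.2, 3.0]
rho = spearman(ds, ts_tv)
print(rho)
```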
Figure 6
Outcomes of translation in the simulation vary with expression level, and can be attributed to selection on the amino-acid sequence or the nucleotide sequence. A, Fraction of accurately translated polypeptides. B, Fraction of mistranslated polypeptides that fold properly, a measure of translational robustness. C, Fraction of truncated polypeptides. D, Fraction of folded proteins. Lines show the sliding-window medians for translation outcomes generated by genes evolved with a cost for misfolding (solid), by ensembles of genes with randomly chosen codons encoding the same protein sequences (dotted), and by genes evolved under no cost (dashed baseline). In C, a dash-dot line shows the proportion of mistranslated but full-length (not truncated) polypeptides that fold properly. E, Translational robustness is tightly linked to increases in thermodynamic stability (free energy of unfolding). F, Stability distributions of mistranslated proteins (containing at least one error) translated from the 100 lowest-expression (light gray) and 100 highest-expression (dark gray) genes. Arrows indicate the median free energy of each subpopulation. A free energy of unfolding of 5 kcal/mol (dotted line) is the minimum to be considered stably folded in the model.
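The sliding-window medians drawn as lines in panels A through D can be computed along these lines. The caption gives neither the window size nor the step, so both are assumptions in this sketch (a fixed number of genes per window, advancing one gene at a time), and the data are invented.

```python
from statistics import median


def sliding_window_medians(x, y, window):
    """Sort points by x, then return (median x, median y) for each
    run of `window` consecutive points."""
    pairs = sorted(zip(x, y))
    result = []
    for start in range(len(pairs) - window + 1):
        chunk = pairs[start:start + window]
        result.append((median(p[0] for p in chunk),
                       median(p[1] for p in chunk)))
    return result


# Hypothetical expression levels and fractions of accurately
# translated polypeptides for five genes (not from the paper).
expression = [5, 1, 3, 2, 4]
accuracy = [0.9, 0.5, 0.7, 0.6, 0.8]
trend = sliding_window_medians(expression, accuracy, window=3)
print(trend)
```

Medians are used rather than means so that a few extreme genes in a window do not dominate the trend line.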
