Biophysical Journal. 2020 Aug 12;119(6):1123–1134. doi: 10.1016/j.bpj.2020.06.037

Effect of Protein Structure on Evolution of Cotranslational Folding

Victor Zhao 1, William M Jacobs 2, Eugene I Shakhnovich 1
PMCID: PMC7499064  PMID: 32857962

Abstract

Cotranslational folding depends on the folding speed and stability of the nascent protein. It remains difficult, however, to predict which proteins cotranslationally fold. Here, we simulate evolution of model proteins to investigate how native structure influences evolution of cotranslational folding. We developed a model that connects protein folding during and after translation to cellular fitness. Model proteins evolved improved folding speed and stability, with proteins adopting one of two strategies for folding quickly. Low contact order proteins evolve to fold cotranslationally. Such proteins adopt native conformations early on during the translation process, with each subsequently translated residue establishing additional native contacts. On the other hand, high contact order proteins tend not to be stable in their native conformations until the full chain is nearly extruded. We also simulated evolution of slowly translating codons, finding that slower translation speeds at certain positions enhance cotranslational folding. Finally, we investigated real protein structures using a previously published data set that identified evolutionarily conserved rare codons in Escherichia coli genes and associated such codons with cotranslational folding intermediates. We found that protein substructures preceding conserved rare codons tend to have lower contact orders, in line with our finding that lower contact order proteins are more likely to fold cotranslationally. Our work shows how evolutionary selection pressure can cause proteins with local contact topologies to evolve cotranslational folding.

Significance

Substantial evidence exists for proteins folding as they are translated by the ribosome. Here, we developed a biologically intuitive evolutionary model to show that avoiding premature protein degradation or aggregation can be a sufficient evolutionary force to drive evolution of cotranslational folding. Furthermore, we find that whether a protein’s native fold consists of more local or more nonlocal contacts affects whether cotranslational folding evolves. Proteins with local contact topologies are more likely to evolve cotranslational folding through nonsynonymous mutations that strengthen native contacts as well as through synonymous mutations that provide sufficient time for cotranslational folding intermediates to form.

Introduction

Ribosomes synthesize proteins residue by residue. This ordered emergence of the polypeptide allows cotranslational formation of the protein native structure (1, 2, 3). Examples of cotranslational folding processes include forming folding intermediates (4, 5, 6, 7), domain-wise protein folding (8, 9, 10, 11, 12), and adoption of α-helices and other compact structures in the ribosome exit tunnel (12, 13, 14). Cotranslational folding has been shown to enhance protein folding yield by preventing misfolding and aggregation (6,15, 16, 17, 18).

Recent genomic studies from our group and others provide complementary evidence that cotranslational folding has been evolutionarily selected for (19,20). Specifically, examination of sequence-aligned homologous genes found that rare codons are evolutionarily conserved. Rare codons are translated at slower rates, and translational slowing along the transcript may facilitate formation of native structure (21). The study from our group examined Escherichia coli proteins in particular, finding that conserved rare codons are often located downstream of cotranslational folding intermediates that were identified using a native-centric model of cotranslational protein folding (20). In a subsequent computational study using a more realistic all-atom, sequence-based potential, we found that the positions of slowly translating, rare codons could correspond to nascent chain lengths that exhibit stable partly folded states as well as fast folding kinetics (22).

Although these findings suggest mechanistic reasons for the evolution of cotranslational folding, there is still no clear understanding of which proteins are likely to fold cotranslationally. Many proteins fold posttranslationally (11,23,24) or with the assistance of chaperones (25,26). Additionally, it is unclear to what degree sequences have evolved to optimize either cotranslational or posttranslational folding. To address these questions, we used an evolutionary modeling approach (27). We constructed a fitness function that depends on outcomes of protein translation and folding. We then simulated evolution of coarse-grained lattice proteins, whose folding performance we evaluate using Monte Carlo (MC) simulations of protein translation and folding.

Our evolutionary simulations investigate prototypical lattice proteins of varying contact orders (28). We find that evolved proteins with low contact order fold cotranslationally, forming native structure in a stepwise manner. Separately, we assessed the fitness effect of rare codons by simulating translation with a longer elongation interval for individual codons. We then performed a bioinformatics investigation using data from our previously published work, which associated rare codons with cotranslational folding intermediates in E. coli proteins (20). Our work mechanistically explains how proteins with local contact topologies can evolve cotranslational folding through nonsynonymous mutations that stabilize partial-length native states and synonymous mutations that provide additional time for such native states to form.

Methods

Model connecting protein translation and folding to cellular fitness

To simulate evolution of lattice protein sequences, we built a model relating outcomes of protein translation and folding to cellular fitness. A protein undergoing translation can reach its native state during or after translation (Fig. 1 A). After translation, a free protein not in its native state is vulnerable to degradation or aggregation with other proteins (29), processes assumed here to be irreversible; aggregation is effectively irreversible if disaggregation is slow (30). We consider the survival probability of a protein in the cellular environment over time. Let S(t) be the probability that a protein survives to time t, with t=0 being the time the protein leaves the ribosome and S(0)=1. For a protein not in its native state, there is some effective rate kd for the protein to be degraded or to aggregate. Based on this model, S(t) evolves according to the following differential equation:

$$\frac{dS}{dt} = -k_d\,[1 - \theta(t)]\,S, \qquad S(0) = 1, \tag{1}$$

$$S(t) = \exp\!\left(-k_d \int_0^t [1 - \theta(t')]\,dt'\right), \tag{2}$$

where θ(t) ∈ {0, 1} is an indicator function whose value is 1 when the protein is in its native state at time t. Under this model, S(t) does not decrease if the protein is in its native state. The half-lives of thermodynamically unstable proteins are as short as a few minutes (31, 32, 33), compared with hours or days for stable proteins (31,33,34), justifying this assumption. A protein thus has a higher survival probability if it folds quickly and remains stably folded.

Figure 1.

Connecting protein translation and folding to fitness and evolution. (A) Model of protein biogenesis: A protein undergoing translation may reach the native state before or after release from the ribosome. After it is released, the free protein is vulnerable to degradation or aggregation if it is not in its native state. The protein can only carry out its function when it is in the native state. (B) Example cotranslational folding trajectory in which the protein folds posttranslationally. The point at which the protein is complete but still tethered to the ribosome is indicated by the vertical dotted line. Top: folded/unfolded states depicted by number of native contacts formed. Middle: Survival probability S(t) (Eq. 2), which decreases after translation if the protein is not in its native state. Bottom: Cumulative protein activity (Eq. 4), which increases when the released protein is in its native state. (C) Evolutionary simulation scheme for evolving proteins. For each generation, a trial mutation is assessed via cotranslational folding simulations (as shown in (A)). The time to fold t and native state stability Pnat determine total protein activity (Eq. 5), which, averaged over multiple folding trajectories, determines cellular fitness. The mutation is fixed with probability π. To see this figure in color, go online.

We next model how S(t) affects the activity of the protein over time. We assume a protein can perform its biological function only when in its native state and if it has not been degraded. Based on these assumptions, the total cumulative activity of the protein (e.g., enzymatic output), Atotal, is described probabilistically by the following equation:

$$\frac{dA}{dt} = k_a\,S(t)\,\theta(t), \qquad A(0) = 0, \tag{3}$$

$$A_{\mathrm{total}} = k_a \int_0^T S(t)\,\theta(t)\,dt, \tag{4}$$

where ka is an activity rate constant (which we set to 1), and T is some long timescale corresponding to the period of time that the protein is biologically relevant, such as the length of the cell cycle.

Fig. 1 B illustrates the relationship between protein folding, survival, and activity for a single protein folding trajectory. In this particular trajectory, the protein folds posttranslationally. During translation, the protein is not vulnerable to degradation, and S(t) is 1. After release from the ribosome, S(t) decreases for every time unit the protein is not in the native state. S(t) decreases at a higher rate during the initial passage to the folded state and then decreases more slowly after the protein enters the native state energy basin and fluctuates in and out of the native state. Protein activity only begins to accumulate after the protein reaches the native state. Atotal corresponds to cumulative activity at a later time T.
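The bookkeeping described above for a single trajectory can be sketched numerically. The following is a minimal illustration, not the authors' code: it assumes the native-state indicator θ is sampled at a fixed interval `dt` starting at release from the ribosome, and the function name and the default `k_d = 5e-6` (the inverse of the 200,000 time unit degradation timescale quoted in the Results) are choices made here for illustration.

```python
import math

def survival_and_activity(theta, dt=1.0, k_d=5e-6, k_a=1.0):
    """Integrate Eqs. 1-4 over a post-release trajectory.

    theta: sequence of 0/1 flags (1 = native), one per time step of
           length dt, starting at release from the ribosome (t = 0).
    Returns (S, A): survival probability and cumulative activity
    after each step. Hypothetical helper, for illustration only.
    """
    S, A = [], []
    s, a = 1.0, 0.0  # S(0) = 1, A(0) = 0
    for th in theta:
        # Activity accrues only while native and not yet degraded (Eq. 3).
        a += k_a * s * th * dt
        # Survival decays only while non-native (Eq. 1); exact for a
        # theta that is constant within each step.
        s *= math.exp(-k_d * (1 - th) * dt)
        S.append(s)
        A.append(a)
    return S, A
```

With a trajectory that is non-native for the first 100 steps and native afterward, S falls during the first 100 steps and then stays flat, while A starts to grow only once the native state is reached, matching the qualitative picture in Fig. 1 B.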

To reduce the amount of computation required for evaluating the protein activity function (Eq. 4), we assume proteins fluctuate on fast timescales in and out of the native state, occupying the native state with probability Pnat = ⟨θ(t)⟩. After simplifications and applying this fast-fluctuation assumption (see Extended Methods in the Supporting Materials and Methods), Eq. 4 becomes the following equation:

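$$A_{\mathrm{total}} = e^{-k_d t}\,\frac{P_{\mathrm{nat}}}{k_d\,(1 - P_{\mathrm{nat}})}\left[1 - e^{-k_d\,(T - t)\,(1 - P_{\mathrm{nat}})}\right], \tag{5}$$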

where t is the first passage time to the native state. According to Eq. 5, Atotal decreases exponentially with t. On the other hand, the relationship between Pnat and Atotal is more complex. For realistic values of kd and with T − t ≈ T, kdT ≫ 1. If Pnat is close to 1 such that the entire argument of the exponential is small, Atotal is proportional to Pnat. Otherwise, Atotal is proportional to Pnat/(1 − Pnat).
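To make the two regimes concrete, the closed-form activity can be evaluated directly: a prefactor that decays exponentially with the first passage time, multiplied by Pnat/(kd(1 − Pnat)) and by a saturating term in kd(T − t)(1 − Pnat). The sketch below assumes ka = 1 as in the text; the function and argument names are illustrative, and the explicit handling of Pnat = 1 is a numerical convenience added here.

```python
import math

def total_activity(t_fp, p_nat, T, k_d):
    """Closed-form total activity (Eq. 5) with k_a = 1.

    t_fp:  first passage time to the native state (0 if the protein
           folds before release from the ribosome)
    p_nat: native-state occupancy after folding
    Hypothetical helper, for illustration only.
    """
    if t_fp >= T:
        return 0.0  # never folds within the relevant period
    x = k_d * (T - t_fp) * (1.0 - p_nat)  # argument of the exponential
    if x == 0.0:
        # Limit p_nat -> 1 (or k_d -> 0): no loss after folding.
        return math.exp(-k_d * t_fp) * p_nat * (T - t_fp)
    # -expm1(-x) equals 1 - exp(-x) and stays accurate for small x.
    saturation = -math.expm1(-x)
    return math.exp(-k_d * t_fp) * p_nat * saturation / (k_d * (1.0 - p_nat))
```

For example, with `k_d = 5e-6` and `T = 1e9` the exponential term is negligible unless `p_nat` is extremely close to 1, so the result tracks `p_nat / (1 - p_nat)`, while delaying folding by `t_fp` scales the result by `exp(-k_d * t_fp)`.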

Finally, we relate Atotal to cellular fitness, f. For convenience, we divide Atotal by total time T to rescale it to the range [0,1]. We treat the protein as essential to cellular growth, and therefore its activity is related to cellular fitness. We use a metabolic flux-type equation to relate total protein activity to f (35, 36, 37, 38):

$$f = \frac{A_{\mathrm{total}}/T}{A_{\mathrm{total}}/T + A_0}, \tag{6}$$

where A0 is a constant which sets the value of Atotal/T where fitness is half maximal. Together, Eqs. 5 and 6 formulate how protein folding kinetics and stability determine fitness in our model.
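As a small illustration of the saturating form of the fitness function, the sketch below averages per-trajectory activities (the text later notes that an average Atotal is used in Eq. 6), rescales by T, and applies the flux-type expression. The function name and the example values are assumptions made here.

```python
def fitness(activities, T, A0):
    """Fitness from per-trajectory total activities (Eq. 6).

    activities: A_total values from independent folding trajectories
    T:          period over which activity accumulates
    A0:         rescaled activity at which fitness is half maximal
    Hypothetical helper, for illustration only.
    """
    mean_activity = sum(activities) / len(activities)
    a = mean_activity / T  # rescaled to the range [0, 1]
    return a / (a + A0)
```

When the rescaled activity equals `A0`, the function returns 0.5, and further gains in activity yield diminishing fitness returns.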

Evolutionary simulations using a lattice protein model

To explore how proteins evolve under prototypical functional selection, we ran evolutionary simulations that fix or reject mutations in a protein sequence based on the measured fitness. The evolutionary simulation scheme is illustrated in Fig. 1 C. We simulate evolution according to a discrete-generation monoclonal model in which, in each generation, a single arising mutation either fixes (takes over the entire population) or is lost. Our model organism has only a single gene corresponding to the protein under investigation. In every generation, a mutation is made to the current sequence, and the fitness of the trial sequence, f', is evaluated using protein folding simulations. A selection coefficient is calculated as s = (f' − f)/f (27), and the mutation is fixed with probability

$$\pi = \frac{1 - \exp(-2s)}{1 - \exp(-2Ns)}, \tag{7}$$

where N is the population size. Eq. 7 comes from classical population genetics (39).
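The accept/reject step of each generation can be sketched as follows. This is a minimal illustration under stated assumptions: the selection coefficient is the relative fitness change (f' − f)/f, the fixation probability is (1 − exp(−2s))/(1 − exp(−2Ns)), and the neutral case s = 0 is handled by its limiting value 1/N. The function names and the overflow guard are choices made here, not details from the paper.

```python
import math
import random

def fixation_probability(f_old, f_new, N):
    """Fixation probability of a trial mutation (Eq. 7)."""
    s = (f_new - f_old) / f_old  # selection coefficient
    if s == 0.0:
        return 1.0 / N  # neutral limit of Eq. 7
    try:
        # expm1(-2s) / expm1(-2Ns) equals (1 - e^{-2s}) / (1 - e^{-2Ns}).
        return math.expm1(-2.0 * s) / math.expm1(-2.0 * N * s)
    except OverflowError:
        return 0.0  # strongly deleterious: denominator overflows

def mutation_fixes(f_old, f_new, N, rng=random):
    """Draw whether the trial mutation takes over the population."""
    return rng.random() < fixation_probability(f_old, f_new, N)
```

Beneficial mutations fix with probability close to 1 − exp(−2s) when N is large, whereas deleterious ones fix with probability below 1/N, which is why only occasional fitness-reducing substitutions are expected.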

In the fitness assessment, we use MC simulations of lattice model proteins undergoing translation. A lattice protein, as illustrated in Fig. 1 A, treats protein residues as a connected set of vertices on a cubic lattice. Our model uses 20 amino acid types whose interaction energy is given by a 20 × 20 interaction matrix (40).

The MC simulations of translation and folding have two phases: translation and posttranslation. During translation, MC dynamics alternates with elongation of the nascent chain at the C-terminus. The ribosome is not explicitly modeled; rather, the nascent chain C-terminus is simulated as if connected to a straight chain of infinite length (as illustrated in Fig. 1 A), representing unextruded residues in a ribosomal channel. There are no energetic interactions between the protein and the untranslated residues, but the channel does exclude a volume that is one lattice unit in width. The protein remains tethered for one additional elongation interval after the final residue is added, representing the ribosome release interval. This treatment of lattice protein translation matches what was used in a previous study (41). During the posttranslation phase, the protein is no longer tethered and has no conformational restrictions. Fig. S2 shows example folding trajectories for sequences studied in this work. Because MC simulations are stochastic, multiple trajectories are used to estimate t and Pnat, and an average Atotal is used in Eq. 6. Additional details are described in the Extended Methods in the Supporting Materials and Methods.

The fitness function, as defined by Eqs. 5 and 6, by selecting for protein activity, effectively includes selection on folding kinetics and stability of proteins undergoing translation. As controls, we also ran the same evolutionary simulations under two alternative evolutionary scenarios in which we changed how we assessed fitness. In the first alternative scenario, we skip lattice protein translation. Instead, fitness assessments use MC simulations that begin with full-length proteins in a fully extended conformation, mimicking in vitro refolding and thereby making in vitro folding speed a determinant of fitness. The second alternative scenario ignores first passage time to the native state. In this case, MC simulations begin with full-length proteins already in their native conformations. t in Eq. 5 is set to 0, and MC simulations only measure Pnat. Fitness in this scenario, therefore, depends on stability and does not depend on folding rate. We refer to sequences evolved without translation as “evolved, no translation,” and we refer to sequences starting in the native state as “evolved, no folding.” The degradation rate kd and other simulation parameters remained unchanged for these alternative evolutionary scenarios.

Simulation parameters and analysis

The simulation time unit, t, is defined as t = (MC steps)/(protein length), which accounts for the local nature of the MC move set. The key simulation parameters are the elongation interval and degradation rate kd. Simulation parameters were chosen so that ratios of timescales between translation, protein folding, and degradation are biologically reasonable. An explanation of how simulation parameters were selected is given in the Extended Methods, and a listing of the parameters is shown in Table S1.

A key characteristic of different lattice proteins simulated in this work is the topology of the native structure that the proteins fold to. Different structures differ in the degree to which residues forming native contacts are separated in primary sequence. Contact order is defined as

$$\mathrm{CO} = \frac{1}{L \times N} \sum_{\mathrm{contacts}}^{N} \Delta S_{i,j}, \tag{8}$$

where L is the length of the protein, N is the number of contacts, and ΔSi,j is the separation in primary sequence between contacting residues i and j (28).
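A worked sketch of the contact order calculation on a cubic lattice is given below. It assumes that two residues are in contact when they occupy adjacent lattice sites and are not neighbors along the chain; this contact definition, the function names, and the simple "snake" conformation used as an example are assumptions made here, not structures from the study.

```python
def native_contacts(coords):
    """Pairs (i, j) with j > i + 1 on adjacent cubic-lattice sites."""
    contacts = []
    for i in range(len(coords)):
        for j in range(i + 2, len(coords)):
            dist = sum(abs(a - b) for a, b in zip(coords[i], coords[j]))
            if dist == 1:
                contacts.append((i, j))
    return contacts

def contact_order(coords):
    """Contact order (Eq. 8): mean sequence separation of contacts / L."""
    contacts = native_contacts(coords)
    if not contacts:
        raise ValueError("conformation has no contacts")
    L, N = len(coords), len(contacts)
    return sum(j - i for i, j in contacts) / (L * N)

def snake_3x3x3():
    """Hypothetical compact 27-mer: a back-and-forth path filling a cube."""
    rows = []
    for z in range(3):
        ys = range(3) if z % 2 == 0 else range(2, -1, -1)
        rows.extend((y, z) for y in ys)
    coords = []
    for k, (y, z) in enumerate(rows):
        xs = range(3) if k % 2 == 0 else range(2, -1, -1)
        coords.extend((x, y, z) for x in xs)
    return coords
```

Any chain that fills a 3 × 3 × 3 cube has 54 adjacent site pairs, 26 of which are chain bonds, so `len(native_contacts(snake_3x3x3()))` gives 28 under this contact definition; structures differ only in how far apart in sequence those contacting residues are, which is what the contact order measures.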

Nine lattice protein native structures were selected from the representative 10,000 structure subset of 27-mers used in previous works (42,43). Each native structure arranges the chain in a 3 × 3 × 3 cubic native fold. Three each of low, medium, and high contact order structures were chosen. Initial sequences for evolution were designed to be thermodynamically stable in the selected native conformations via Z-score optimization (44, 45, 46); all initial sequence Z-scores were below −50. Table S2 shows unevolved and evolved sequences for the protein structures used in this study.

Key quantities measured in simulations are first passage time to the native state, t, folding stability, Pnat, and native contacts formed. Pnat is measured as the proportion of steps that the protein is in its native conformation, from the time that the protein reaches the native state until the end of the simulation. Proteins must be exactly in their native conformations to be considered native. For first passage time to the native state, t=0 is defined as the start of the posttranslational phase of simulation in which the protein chain has no conformational restrictions. Native contact counts are either normalized by the maximal possible number of native contacts at a particular chain length or by the number of native contacts in the full-length protein (28 for all lattice proteins in this work) and are typically reported as average values for each nascent chain length. The results presented in this work focus on nascent chain lengths of 15–27.
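The two per-trajectory quantities that enter the activity calculation can be extracted from a record of native-state flags, as sketched below. The sketch makes assumptions beyond the text: flags are taken from the posttranslational phase only, a trajectory that folded on the ribosome is assigned a first passage time of 0, and its occupancy is then measured over the whole posttranslational record. The function and argument names are hypothetical.

```python
def first_passage_and_pnat(post_release_flags, folded_on_ribosome=False, dt=1.0):
    """Return (first passage time, P_nat) for one trajectory.

    post_release_flags: 0/1 per sampled step after release (1 = exactly
                        in the native conformation)
    folded_on_ribosome: True if the native state was reached before release
    Returns (None, 0.0) if the native state is never reached.
    """
    flags = list(post_release_flags)
    if not flags:
        raise ValueError("trajectory has no posttranslational samples")
    if folded_on_ribosome:
        start = 0  # t = 0 is the start of the posttranslational phase
    elif 1 in flags:
        start = flags.index(1)
    else:
        return None, 0.0
    tail = flags[start:]  # from reaching the native state to the end
    return start * dt, sum(tail) / len(tail)
```

Averaging the resulting activities over many stochastic trajectories then gives the quantity used in the fitness function.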

Results

Proteins evolve improved stability and kinetics

Beginning with sequences designed for native state thermodynamic stability, we initiated evolutionary simulations. The evolutionary trajectories of the nine proteins are shown across Figs. S3 and S4. In each trajectory, over the course of 1000 mutation attempts, a number of mutations were fixed. Occasionally, deleterious mutations were fixed because fitness evaluations using MC simulations are stochastic. Nonetheless, the fitnesses, folding stabilities, and folding speeds of the evolved sequences are improved over the unevolved sequences. The remaining discussion focuses on the evolved sequences from the end of evolutionary simulation and comparison with unevolved sequences or evolved sequences from alternative evolutionary scenarios.

The overall outcomes of evolutionary simulations are shown in Fig. 2, which compares the fitnesses, folding stabilities, folding times, and native energies of the unevolved, initial sequences to those of the evolved sequences. The total protein activity (Eq. 5), which determines fitness (Eq. 6), is determined by the combination of stability (Pnat) and folding kinetics (the distribution of t). Here, protein folding stability is shown in terms of the two-state free energy, ΔF/kT = −ln[Pnat/(1 − Pnat)], to illustrate differences in folding stability for Pnat close to 1 more clearly. Because fitness calculations use the entire ensemble of first passage times, folding time distributions are illustrated using boxplots, with t=0 defined as the moment of release from the ribosome.

Figure 2.

Outcomes of evolutionary simulations. Shown is a comparison of properties of initial, unevolved sequences to those of sequences obtained by evolutionary simulation under selection for stability and kinetics (Eqs. 5 and 6). Three groups of three native structures of low, medium, and high contact orders (numbered 1–9, vertical dotted lines indicate grouping) were selected for simulation. The four plots show fitness, folding stability (as ΔF/kT = −ln[Pnat/(1 − Pnat)]), first passage time to the native state, and native state energy for unevolved sequences (blue) and evolved sequences (orange). First passage times are measured using MC simulations of translation and folding; boxplot whiskers show the 5th and 95th percentile values, the bold line indicates the median value, and t=0 is the moment of release from the ribosome. For unevolved sequences 6 and 8, 4 out of 900 and 27 out of 900 simulations failed to fold, respectively, within the posttranslation period of 10⁹ time units. To see this figure in color, go online.

Initial, unevolved sequences have moderate stabilities that improve with evolution. The unevolved sequences also fold slowly relative to the degradation timescale (1/kd) of 200,000 time units. All medium and high contact order unevolved sequences have nonzero median first passage times, meaning that proteins fold posttranslationally in the majority of folding trajectories. The longest first passage times for unevolved sequences are greater than 10⁹ time units, indicating long-lived unfolded or misfolded states. In comparison, evolved sequences have first passage times that mostly fall within the degradation timescale. Evolution of folding times to be within the protein degradation timescale has been predicted by a previous study (47). The only evolved sequence with a nonzero median first passage time is that of structure 4, at 9000 time units. This indicates that cotranslational folding, technically defined as reaching the native state before release from the ribosome, occurs in the majority of folding trajectories for evolved sequences. The section that follows (Two Opposing Strategies for Reaching the Native State) discusses folding during translation in greater detail.

To probe the effect of different selection pressures, evolutionary simulations were also run under two alternative scenarios. The first alternative scenario, “no translation,” simulates in vitro protein folding, which starts full-length proteins in fully extended conformations. The second alternative scenario, “no folding,” starts simulations with proteins already in their native conformations (effectively setting t in Eq. 5 to 0). The properties of sequences obtained from evolution under these two alternative fitness evaluation scenarios are shown in Fig. S5. Note that although evolved sequences were obtained under different evolutionary scenarios, for purposes of comparison, the fitnesses for all sequences shown in Fig. S5 were determined using the same, regular fitness evaluation involving MC simulations of translation and folding. Evolving for in vitro refolding kinetics (“no translation” scenario) results in proteins that fold in the context of translation just as fast as or even faster than the regular sequences that evolved with translation. The evolved “no translation” sequences (except that of structure 5), however, are less stable than sequences evolved with translation. On the other hand, the results of evolution under the “no folding” scenario show that long-lived kinetic traps that hamper folding are not eliminated under an evolutionary scenario where folding rates do not affect protein activity and fitness.

The bottom-most plot in Fig. 2 shows the native state energies of unevolved and evolved sequences. Evolved sequences for structures 1, 2, 3, and 6 have native energies that are about 25% lower than those of the evolved sequences for structures 4, 5, 7, 8, and 9. These native energies reflect how these proteins fold during translation—whether these proteins fold early on or late during translation—as we will discuss next.

Two opposing strategies for reaching the native state

Model proteins evolved sufficiently fast kinetics such that the majority of folding trajectories for evolved sequences achieve the native state before release from the ribosome (Fig. 2). To understand the nature of cotranslational folding in our model proteins, we examined folding trajectories and found that proteins have one of two different folding mechanisms. We first characterized trajectories by the fraction of possible native contacts formed at each nascent chain length, QL. This measure normalizes the number of native contacts formed at a particular chain length by the maximal number of native contacts that can possibly be formed at that chain length. Q is averaged over all trajectory samples at each chain length and over all trajectories, providing an ensemble-average folding trajectory.
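The normalization behind QL can be sketched as follows. This illustration assumes that the maximal number of native contacts at a given chain length equals the number of full-length native contacts whose residues have both been translated, and that a contact is formed when the two residues sit on adjacent lattice sites; both assumptions, along with the function names, are introduced here for illustration.

```python
def fraction_native(coords, native_pairs):
    """Q for one sampled nascent-chain conformation.

    coords:       lattice coordinates of the residues translated so far
    native_pairs: (i, j) contacts of the full-length native structure
    Returns None if no native contact is possible at this length.
    """
    L = len(coords)
    possible = [(i, j) for i, j in native_pairs if i < L and j < L]
    if not possible:
        return None
    formed = 0
    for i, j in possible:
        if sum(abs(a - b) for a, b in zip(coords[i], coords[j])) == 1:
            formed += 1
    return formed / len(possible)

def mean_q_by_length(samples, native_pairs):
    """Average Q over all samples, grouped by nascent chain length."""
    by_length = {}
    for coords in samples:
        q = fraction_native(coords, native_pairs)
        if q is not None:
            by_length.setdefault(len(coords), []).append(q)
    return {L: sum(qs) / len(qs) for L, qs in sorted(by_length.items())}
```

Because the denominator grows with chain length, a value near 1 at short lengths means the partial chain already holds nearly every native contact available to it, which is the signature of early cotranslational folding.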

Examination of QL shows that the behavior of evolved proteins falls into two groups, as illustrated in Fig. 3. Group 1 consists of low contact order structures 1, 2, and 3 as well as medium contact order structure 6. For evolved sequences in Group 1, QL is close to 1 for chain lengths beyond 16 (Fig. 3 A). This shows that Group 1 proteins have a cotranslational folding mechanism in which native-like conformations are established early on during translation. Group 2 consists of medium contact order structures 4 and 5 and high contact order structures 7, 8, and 9. Evolved sequences in Group 2 do not develop high QL values until nascent chains grow to about 25 residues in length (Fig. 3 B). Thus, Group 1 proteins fold early on during translation, and Group 2 proteins fold toward the end of translation.

Figure 3.

Proteins can be separated into two groups based on behavior during translation. Behavior of nascent chains during translation for each sequence is illustrated using average Q at each chain length, QL. Q is the fraction of native contacts formed out of the total possible number of native contacts. Only data for nascent chain lengths 15–27 residues are shown here. (A) QL for evolved protein sequences in Group 1 in which the native structure is stable early on during translation. (B) QL for evolved protein sequences in Group 2 in which the native structure is stable toward the end of translation. (C and D) Same protein structures as in (A) and (B), respectively, but showing differences in QL between evolved and unevolved sequences. To guide the eye, 0 is indicated by a horizontal line. All error bars indicate 95% confidence intervals obtained by bootstrap sampling per-trajectory values. To see this figure in color, go online.

There are also differences in how sequences evolved, as illustrated by differences in QL between evolved and unevolved sequences. For Group 1, QL either increased or remained unchanged as a result of evolution (Fig. 3 C). Minimal change in QL reflects cases in which QL is already close to 1 for unevolved sequences (structures 1 and 3). Interestingly, for Group 2, evolved sequences other than the sequence folding to structure 9 have lower QL values at intermediate nascent chain lengths 15–22 compared with those of unevolved sequences (Fig. 3 D) (Mann-Whitney U test between per-trajectory values at each length for each sequence, p < 0.0001). Evolved sequences form fewer native contacts at those lengths.

To further understand these two kinds of proteins, we examined individual folding trajectories for each structure. Here, we switch to quantifying folding by native contact counts to illustrate development of native structure during translation. Native contacts are normalized by the total number of native contacts for the full-length native conformation, which is 28 for all lattice proteins in this work. We focus on structures 2 and 7 as representatives of Group 1 and Group 2, respectively, because evolved sequences for structures 2 and 7 have the highest folding stability out of all nine evolved sequences. Individual folding trajectories for unevolved sequences folding to structures 2 and 7 are shown in Fig. 4, A and B, respectively, and the corresponding trajectories for evolved sequences are shown in Fig. 4, C and D, respectively. For each trajectory, native contacts are averaged over all samples collected at each nascent chain length. Folding trajectories for unevolved and evolved sequences of all nine native structures are shown in Fig. S6.

Figure 4.

Structures 2 and 7 as respective examples of proteins folding early (Group 1) or late (Group 2) during translation. Individual folding trajectories for sequences folding to structures 2 and 7 are shown by native contacts at nascent chain lengths 15–27. Native contacts are normalized by the total number of native contacts in full-length native structures, 28, and averaged over all samples collected at each nascent chain length. Each colored line is a single trajectory, and each panel shows 100 trajectories. The shade of a colored line or bar indicates whether a particular trajectory folded before (dark) or after (light) release from the ribosome. The solid black line in each panel indicates maximum number of native contacts at each nascent chain length. (A and B) Folding trajectories for unevolved sequences folding to structures 2 and 7, respectively. (C and D) Folding trajectories for evolved sequences folding to structures 2 and 7, respectively. Righthand side of each panel shows proportion of trajectories that reach the native state before (“cotrans”) or after (“posttrans”) release from the ribosome. To see this figure in color, go online.

Sequences folding to structure 2 can stably occupy native-like conformations beginning at a nascent chain length of 16 residues (Fig. 4, A and C). There is an apparent bimodality to the folding trajectories, with the majority of trajectories occupying native-like conformations during translation, and a minority of trajectories in partially-native states. Most trajectories reach native-like states when the nascent chain is between 15 and 21 residues in length. Such trajectories show a steady buildup of native contacts as protein length increases. Trajectories that fail to fold by length 21, however, mostly fold posttranslationally. For instance, for the evolved sequence, 82% of folding trajectories reach the native state by length 21, but of the remaining trajectories, 67.5% remain unfolded at the end of translation. This demonstrates kinetic partitioning (48); the additionally translated residues produce kinetic traps in the folding energy landscape. The difference between the unevolved and evolved sequences is that for the evolved sequence, a greater proportion of trajectories reach the native state before the nascent chain reaches 21 residues in length, resulting in a higher proportion of cotranslational folding. Other proteins in Group 1 are similar: a majority of trajectories (>75%) achieve the native state before release from the ribosome, with folding largely proceeding through a steady increase of native contacts during translation (Fig. S6).

In contrast, for sequences folding to structure 7, folding does not occur until the nascent chain reaches a length of 25 residues (Fig. 4, B and D). Compared with the unevolved sequence, the evolved sequence also forms fewer native contacts at nascent chain lengths below 23 residues. Once the nascent chain passes 25 residues in length, however, evolved sequence trajectories show rapid folding to native-like conformations. Other proteins in Group 2 similarly fold only toward the end of translation (Fig. S6). Upon folding, the full native fold, minus just a few contacts, is achieved. Thus, although many folding trajectories for sequences in this second group technically exhibit cotranslational folding (folding before the end of translation), the folding process is closer to what would occur for full-length proteins.

Contact order is not a perfect predictor of cotranslational folding, as seen by the split of our medium contact order structures among Group 1 and Group 2. In particular, the evolved sequence for medium contact order structure 6 folds early on during translation. To explore this, we constructed two-dimensional contact maps in which average frequencies of residue-residue contacts at particular nascent chain lengths during translation are averaged across all folding trajectories. Figs. S7–S9 show these contact map-based plots of folding trajectories for unevolved and evolved sequences folding to structures 2, 6, and 7, respectively. For sequences folding to structures 2 and 6 (Figs. S7 and S8), evolution strengthened native contacts and weakened nonnative contacts. Both structures 2 and 6 equally support forming 17 native contacts at a nascent chain length of 20. By this point during translation, nascent chains adopt native-like conformations, from which the rest of the native structure forms. These native-like conformations are cotranslational folding intermediates, stable states on the path to the full-length native structure, similar to intermediates predicted in our previous work (20). Contact order roughly predicts whether such cotranslational folding intermediates can form. In contrast, for structure 7 (Fig. S9), only 10 native contacts can be formed when the nascent chain is 20 residues long, and the nascent chain instead forms more nonnative contacts. For sequences folding to structure 7, evolution weakened both native and nonnative contacts at shorter nascent chain lengths. These observations explain how the evolved sequence for structure 7 forms fewer native contacts at intermediate nascent chain lengths than does the unevolved sequence (Fig. 4 B versus Fig. 4 D).

We find further distinctions between Group 1 and Group 2 proteins when examining energetics as a function of nascent chain length. The free energies of native conformations are shown in Fig. S10. Mirroring observations made from analyzing native contacts and contact maps, for Group 1, evolution increased or maintained the stability of native conformations. On the other hand, for Group 2, evolution destabilized native conformations at shorter nascent chain lengths. This destabilization of native conformations is reflected in native state energies as well. Fig. S10 also shows the energies of native conformations for nascent chain lengths 15–27. For proteins in Group 2, the C-terminal residues for the evolved sequences contribute a greater fraction of the stabilization of the native state than is the case for the unevolved sequences. This pattern explains why evolved native energies for structures folding either early or late during translation are different in magnitude (Fig. 2, bottom). Overall, we find that the native structure of a protein determines how many native contacts are available at a nascent chain length, which decides whether the nascent chain can stably fold. This in turn influences whether evolution strengthens or weakens contacts made by residues at particular nascent chain lengths.

Kinetic characterizations contrast cotranslational folding with in vitro folding

For Group 1 proteins, folding trajectories that do not reach the native conformation when the nascent chain is 15–20 residues long mostly fold posttranslationally (Figs. 4, A and C and S6). This suggests that folding kinetics slows with increasing nascent chain length for these proteins. To investigate this, we measured in vitro first passage times to the native state, for which the full-length protein starts in an extended conformation and translation is not modeled. We compare in vitro first passage times of unevolved and evolved sequences with those of sequences obtained from evolution under the “no translation” scenario (Fig. 5); the latter sequences, by design, are optimized for in vitro folding. Compared with the unevolved sequences, both the evolved and the evolved “no translation” sequences have faster folding kinetics. More significantly, the four evolved sequences in Group 1 (structures 1, 2, 3, and 6) have longer first passage times than those of their “no translation” counterparts and those of evolved sequences in Group 2 (structures 4, 5, 7, 8, and 9). The in vitro first passage times of the evolved sequences that fold early on during translation are only moderately improved compared to those of their initial, unevolved counterparts.

Figure 5.

Structures evolved to fold cotranslationally have slow in vitro folding kinetics. Mean first passage times for folding from full-length, extended conformations for unevolved sequences (blue), evolved sequences (orange), and sequences evolved in the “no translation” evolutionary scenario (magenta). Structures in Group 2 have been given a shaded background. Error bars indicate 95% confidence intervals obtained by bootstrap sampling. To see this figure in color, go online.
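As an illustration of the bootstrap error bars mentioned in the caption, the following is a minimal sketch of a percentile bootstrap for a mean first passage time. The percentile variant, the number of resamples, and the toy first passage times are assumptions made for this example; the exact bootstrap procedure used for Fig. 5 is not specified here.

```python
import random
import statistics

def bootstrap_mean_ci(samples, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean.

    Resamples the data with replacement, recomputes the mean each time,
    and reads the interval off the sorted resampled means.
    Returns (sample mean, lower bound, upper bound).
    """
    rng = random.Random(seed)
    n = len(samples)
    means = sorted(
        statistics.fmean(rng.choices(samples, k=n)) for _ in range(n_resamples)
    )
    lo_index = max(0, int(round((alpha / 2) * n_resamples)))
    hi_index = min(n_resamples - 1, int(round((1 - alpha / 2) * n_resamples)) - 1)
    return statistics.fmean(samples), means[lo_index], means[hi_index]

# Hypothetical first passage times in arbitrary Monte Carlo time units.
toy_times = [120, 340, 95, 410, 230, 180, 275, 150, 390, 205]
mean_fpt, ci_low, ci_high = bootstrap_mean_ci(toy_times)
```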

These differences in kinetics between sequences reflect different selection pressures during evolution. When undergoing translation, evolved sequences in Group 1 fold to stable, native-like conformations at lengths of 15–20 residues. Additional translated residues then add to an existing native structure. Any slow-folding intermediates that form when folding from the fully unfolded state are thereby avoided. One consequence demonstrated here is that proteins that have evolved to fold cotranslationally have slow in vitro folding kinetics. Vectorial synthesis reduces the selection pressure for fast folding kinetics when proteins can start folding cotranslationally.

We further characterized the kinetics of our sequences by measuring first passage times to native conformations at chain lengths of 15–27 residues. We fit our data to a simple three-state, three-parameter model (see Extended Methods and Fig. S1). The fitted kinetic parameters are shown in Fig. S11. These results show how proteins in Group 1 have fast folding kinetics at intermediate chain lengths and slower kinetics as chain length increases.

Proteins that fold early on during translation benefit from midsequence slow codons

Thus far, our studies have examined nonsynonymous sequence changes, but another aspect of protein translation is that codon identity influences translation rates, which can affect protein folding efficiency (49). Furthermore, slowly translating rare codons have been associated with cotranslational folding (19,20). We next investigated the effects of changing the elongation intervals for different codon positions. Here, we restrict our investigation only to evolved lattice protein sequences.

Estimated translation rates for different codons in E. coli differ by up to an order of magnitude (50). We performed MC simulations of translation and folding in which the elongation interval for individual residues was increased 10-fold and measured the proportion of trajectories in which proteins folded cotranslationally. Note that increasing the elongation interval for the Nth codon means that the nascent chain spends additional time at a length of N-1 residues. Results from these simulations for evolved sequences folding to structures 2 and 7 are shown in Fig. 6, A and B, respectively, and results for all evolved sequences are available in Fig. S12. For nearly all sequences, increasing the elongation interval of codons at the C-terminus increases the proportion of cotranslational folding (Fig. S12), with proteins in Group 2 showing more substantial increases in cotranslational folding. Increased cotranslational folding due to slowly translated C-terminal codons is a somewhat trivial effect under our model, however, because folding at a nascent chain length of 25 or 26 residues is not very different from folding as a full-length 27-residue protein. Only sequences folding to structures 2 and 6, from Group 1, show increases in cotranslational folding from increased elongation intervals at midsequence positions. These positions reflect nascent chain lengths at which cotranslational folding intermediates become stable and resemble how rare codons are positioned before putative cotranslational folding intermediates in real proteins (20).
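The relationship between a slowed codon and the time spent at a given nascent chain length can be made concrete with a small sketch. The function names and the unit base interval are hypothetical; only the 10-fold slowdown, the 27-residue chain length, and the rule that slowing codon N extends the time spent at length N-1 come from the text.

```python
def elongation_schedule(n_codons, base_interval, slow_codon=None, slowdown=10.0):
    """Per-codon elongation intervals; codon positions are 1-indexed.

    With slow_codon=None this is the flat schedule. Otherwise the interval
    of that single codon is multiplied by `slowdown`.
    """
    intervals = [base_interval] * n_codons
    if slow_codon is not None:
        intervals[slow_codon - 1] *= slowdown
    return intervals

def dwell_time_at_length(intervals, length):
    """Time spent at `length` residues, set by the interval of codon length + 1."""
    return intervals[length]  # zero-based index `length` is codon length + 1

# Slowing codon 21 of a 27-codon sequence extends the time spent at length 20.
flat = elongation_schedule(27, base_interval=1.0)
slowed = elongation_schedule(27, base_interval=1.0, slow_codon=21)
extra_time = dwell_time_at_length(slowed, 20) - dwell_time_at_length(flat, 20)
```

In this picture, a midsequence slow codon helps only if a stable intermediate exists at the chain length whose dwell time is extended.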

Figure 6.

Slowing translation at particular midsequence positions enhances cotranslational folding in lower contact order proteins. (A and B) Proportion of folding trajectories in which cotranslational folding occurs when translation is slowed at individual codon positions for evolved sequences folding to structures 2 and 7, respectively. 1500 folding simulations were performed for each slow codon position, and 900 folding simulations were performed for the original, flat translation schedule. Error bars indicate 95% confidence intervals calculated by Wilson score interval. Statistical significances of differences in fraction folding cotranslationally (compared with translation using a flat translation schedule) were evaluated using χ² tests. (C and D) The frequency that a synonymous mutation was fixed in synonymous mutation evolutionary simulations for evolved sequences folding to structures 2 and 7, respectively. Frequencies for positions 1–15 are omitted. 1800 independent evolutionary simulations were performed for each sequence. Error bars indicate 95% confidence intervals calculated using Goodman’s method. Statistical significances of deviations from the neutral expectation of 1/27 were evaluated using independent binomial tests. (E) The median contact order of protein substructures preceding evolutionarily conserved rare codons compared with the median contact order of substructures preceding random positions in genes without rare codons at different p-value thresholds for evolutionary conservation of the rare codons. The number of genes with conserved rare codons at each p-value threshold is indicated inside the bars; the remaining genes without rare codons were used to generate random substructures with lengths distributed according to a geometric distribution. Statistical significances between distributions were evaluated using the Mann-Whitney U test (two-sided). (F) The distributions of contact orders for substructures preceding rare codons and for substructures preceding random positions at the 10⁻³ p-value threshold for evolutionary conservation. For all panels, ∗p < 0.05, ∗∗p < 0.01, ∗∗∗p < 0.001, ∗∗∗∗p < 0.0001. To see this figure in color, go online.
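For readers unfamiliar with the Wilson score interval used for the error bars in panels A and B, a minimal sketch of the standard formula follows. The trial count of 900 matches the number of flat-schedule simulations stated in the caption, but the success count is a hypothetical value chosen for illustration.

```python
import math

def wilson_interval(successes, trials, z=1.959963984540054):
    """Wilson score interval for a binomial proportion.

    The default z is the two-sided 95% quantile of the standard normal.
    Returns (lower bound, upper bound).
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    denom = 1 + z * z / trials
    center = (p_hat + z * z / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / trials + z * z / (4 * trials * trials))
        / denom
    )
    return center - half_width, center + half_width

# Hypothetical count: 450 of 900 trajectories fold cotranslationally.
low, high = wilson_interval(450, 900)
```

Unlike the simple normal-approximation interval, the Wilson interval stays within the range 0 to 1 and remains informative when the observed proportion is close to either bound.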

To confirm that increases in cotranslational folding proportion could provide a fitness advantage and therefore be selected by evolution, we performed evolutionary simulations in which mutations had the effect of slowing translation at a specific position in the sequence, mimicking the effect of synonymous mutation to a rare codon. In these evolutionary simulations, simulation trajectories were stopped once a single mutation was fixed (see Extended Methods). The distributions of fixed “synonymous mutations” for evolved sequences folding to structures 2 and 7 are shown in Fig. 6, C and D, respectively; the distributions deviate significantly from the neutral expectation of a uniform distribution (χ² test, p < 0.0001). As predicted by the cotranslational folding proportions in Fig. 6, A and B, synonymous substitutions to slower codons are selected for by our evolutionary simulations.
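The per-position comparison against the neutral expectation can be sketched as an exact two-sided binomial test. This is a minimal illustration that assumes the common convention of summing the probabilities of all outcomes no more likely than the observed one; the fixation counts are hypothetical, and only the 1800 simulations and the neutral probability of 1/27 come from the text.

```python
import math

def binom_logpmf(k, n, p):
    """Log probability of k successes in n trials with success probability p."""
    return (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log1p(-p)
    )

def binom_two_sided_p(k, n, p):
    """Exact two-sided binomial p-value (requires 0 < p < 1).

    Sums the probabilities of all outcomes that are no more likely than
    the observed count k under the null hypothesis.
    """
    log_obs = binom_logpmf(k, n, p)
    tol = 1e-9  # tolerance for floating-point ties in log space
    total = 0.0
    for i in range(n + 1):
        lp = binom_logpmf(i, n, p)
        if lp <= log_obs + tol:
            total += math.exp(lp)
    return min(1.0, total)

# Hypothetical fixation counts at one codon position out of 1800 simulations.
p_neutral_like = binom_two_sided_p(67, 1800, 1 / 27)
p_enriched = binom_two_sided_p(150, 1800, 1 / 27)
```

Working in log space avoids overflow in the binomial coefficients, which would otherwise be a problem for 1800 trials.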

Our model results suggest that proteins that can fold early on during translation benefit from translational slowing at specific midsequence positions. Because our model proteins that fold early on during translation are also lower in contact order, we wondered whether a bioinformatic signature of cotranslational folding, conserved rare codons, would be more likely to be found in genes encoding proteins with lower contact orders. A recent study from our group identified conserved rare codons in E. coli (20). The study moreover examined structurally characterized E. coli proteins and found that conserved rare codons are frequently positioned downstream of predicted cotranslational folding intermediates. We measured the contact orders of protein substructures preceding rare codons identified in this previous study. Here, a substructure is defined as the portion of the native structure from the N-terminus to 30 residues before the location of a codon of interest, a length that accounts for the ribosome exit tunnel (51).
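The substructure definition above can be illustrated with a short sketch. It assumes the relative contact order of Plaxco et al. (28), that is, the mean sequence separation of contacting residue pairs divided by the chain length, and it takes the native contact list as given. How contacts are identified in real structures, the function names, and the toy contact list are assumptions made for this example.

```python
def substructure_length(codon_position, tunnel_offset=30):
    """Length of the substructure spanning residues 1..(codon_position - offset)."""
    return max(0, codon_position - tunnel_offset)

def relative_contact_order(contacts, length):
    """Relative contact order of the first `length` residues.

    contacts holds 1-indexed (i, j) residue pairs that are in contact in
    the native structure. Only pairs with both residues inside the
    substructure are counted. Returns None when no such contact exists.
    """
    inside = [(i, j) for i, j in contacts if i != j and i <= length and j <= length]
    if length == 0 or not inside:
        return None
    total_separation = sum(abs(i - j) for i, j in inside)
    return total_separation / (length * len(inside))

# Hypothetical contact list and a codon of interest at position 100.
toy_contacts = [(1, 5), (2, 10), (20, 60), (50, 100)]
length = substructure_length(100)  # residues 1..70
co = relative_contact_order(toy_contacts, length)
```

Contacts involving residues beyond the substructure boundary, such as the pair (50, 100) here, are excluded because those residues would not yet have emerged from the exit tunnel.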

For each gene with rare codons, we measured the contact order of the substructure corresponding to the first evolutionarily conserved rare codon (excluding N-terminal-rare codons). We then compared this distribution of contact orders to control distributions generated by measuring the contact orders of substructures preceding random positions in genes without conserved rare codons. This analysis was performed at multiple p-value thresholds for determining evolutionary conservation of rare codons (see Extended Methods). We found that protein substructures preceding rare codons have lower contact orders than those of protein substructures preceding randomly drawn positions (Fig. 6 E). Statistical significance declines with decreasing p-value threshold as the number of genes with qualifying rare codon regions decreases. The most statistically significant difference is found at a p-value threshold for rare codon conservation of 10⁻³ (Mann-Whitney U test (two-sided), p = 0.0019). The distributions of contact orders at this conservation threshold are shown in Fig. 6 F and have medians of 0.2038 and 0.2338 for substructures preceding rare codons and random substructures, respectively. Our simulation results suggest a mechanism for this observation: structures lower in contact order support evolution of a sequential, cotranslational folding pathway, facilitated by slowly translated rare codons.
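The distribution comparison can be sketched with a self-contained two-sided Mann-Whitney U test. This version uses the normal approximation with tie and continuity corrections, which is an assumption about the variant of the test; the contact order values below are hypothetical and are not the data analyzed in this work.

```python
import math

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test using the normal approximation.

    Applies average ranks for ties, a tie-corrected variance, and a
    continuity correction. Returns (U statistic for x, p-value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        t = j - i
        tie_term += t ** 3 - t
        i = j
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0:
        return u_x, 1.0  # all values tied; no evidence of a difference
    z = max(0.0, (abs(u_x - mean_u) - 0.5) / math.sqrt(var_u))
    return u_x, math.erfc(z / math.sqrt(2))

# Hypothetical contact orders for two small groups of substructures.
rare_codon_group = [0.17, 0.18, 0.19, 0.20, 0.21]
random_group = [0.22, 0.23, 0.24, 0.25, 0.26]
u_stat, p_value = mann_whitney_u(rare_codon_group, random_group)
```

A rank-based test is a natural choice here because contact order distributions need not be normal and the two groups differ in size.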

Discussion

Our results are summarized in Table 1. When a protein can fold early on during translation, the protein folds cotranslationally by first folding to a partial-length native structure consisting of the native contacts available at that length. Subsequent translated residues then add additional native contacts to this core structure. Evolution strengthens native contacts and weakens nonnative contacts to stabilize the native state. Four of nine model proteins—the three low contact order proteins and one medium contact order protein—follow this pattern. On the other hand, if native-like states are not stable until the protein is nearly fully translated, protein sequences evolve so that the nascent protein chain avoids making strong native and nonnative contacts that might trap the nascent chain in incompletely folded states. Folding occurs toward the end of translation, and whether the protein reaches the native state before release from the ribosome depends on folding kinetics versus translation speed. This pattern of evolution occurs for the remaining five model proteins.

Table 1.

Comparison between Proteins that Fold Early or Late during Translation

Folding Early during Translation | Folding toward the End of Translation
Lower contact order | Higher contact order
Evolution strengthens contacts | Evolution weakens nascent chain contacts
Folds cotranslationally | Folds cotranslationally if folding rate is faster than translation
Full-length protein may have slow folding kinetics | Folding pathways on and off the ribosome are likely to be similar
Rare codons midsequence or at the C-terminus | Rare codons at the C-terminus

As the structural analyses showed, contact order is a rough indicator of whether the protein native structure supports stable, partial-length conformations that facilitate cotranslational folding. We point out that the evolution of a cotranslational folding mechanism in low contact order proteins is not a consequence of faster folding kinetics in low contact order proteins (28); rather, it arises because low contact order topologies support native-like folding intermediates.

One consequence of evolution toward a cotranslational folding mechanism is that there is less selection pressure on the folding kinetics of the full-length chain because proteins fold via residue-by-residue cotranslational folding. For model proteins that fold early on during translation, in vitro folding times, measured from full-length extended conformations, are substantially longer than cotranslational folding times. Experiments on individual proteins have found that refolding from the denatured state is often less efficient than cotranslational folding in terms of folding rate or occurrence of irreversible aggregation (6,15,17,18). Our simulation results suggest one evolutionary factor for these phenomena: proteins that fold cotranslationally are not under selection to avoid forming slow-folding intermediates encountered when refolding from the denatured state. Consequently, we predict that proteins that fold cotranslationally are more prone to inefficient refolding from denatured states.

Interestingly, the observation of slow in vitro folding kinetics for evolved model proteins that fold early on during translation contradicts the expected relationship between contact order and folding speed (28). This difference may be because the study of contact order and folding speed has been limited to small proteins capable of in vitro refolding (28,52, 53, 54, 55). Indeed, many proteins are unable to refold once denatured in vitro (56, 57, 58, 59, 60). Although the model proteins in this study are admittedly short in length, their properties can still generalize to longer, real proteins. We speculate that fast folding kinetics for partial-length nascent chains and slow folding kinetics for full-length proteins provide cells with a route for efficient production of long-lived, kinetically stable proteins which, once folded, remain protected from transient unfolding by a high folding-unfolding barrier.

For model proteins that fold toward the end of translation, rapid folding to the native state commences once a sufficient number of residues are extruded. Although it would be difficult to test whether protein sequences are optimized to avoid forming strong interresidue interactions until native-like conformations are stable, it is known that cotranslational chaperones such as trigger factor prevent nascent chains from making aberrant interactions and alter folding pathways (24,61,62). A future study could investigate the relationship between native state topology and chaperone interaction.

The model proteins that fold toward the end of translation are higher contact order proteins. Although their folding kinetics are sufficiently fast to fold before the end of translation in our simulations, the folding pathways of such proteins while tethered are not likely to differ from in vitro folding pathways. Recent studies on two proteins, the Src SH3 domain and titin I27, observed that ribosome-nascent chain complex folding pathways are similar to off-ribosome folding pathways (63,64). We calculate the contact order of these two proteins to be 0.37 and 0.41, respectively; these values are much higher than the median contact order of the E. coli proteins used in our bioinformatics analysis, 0.21. Our simulation results predict that such high contact order proteins should fold toward the end of translation or posttranslationally, which agrees with the experimental findings.

Finally, we investigated the effect of changing the elongation interval for specific positions along evolved sequences to simulate the effect of substitution to rare, slowly translating synonymous codons. These results show that slowly translating codons increase folding efficiency and provide an example of evolutionary selection on synonymous codons. Our results support a recent study which showed that synonymous substitutions in a gene can diminish fitness by increasing protein degradation (65). Increasing the elongation interval at mid-sequence positions increases folding efficiency only for model proteins that can fold early on during the translation process. We used an existing data set of conserved rare codons in E. coli genes to probe whether contact order has any association with conserved rare codons (20). By comparing the contact orders of protein substructures preceding conserved rare codons to the contact orders of substructures preceding random positions from genes without conserved rare codons, we find that contact orders of the former are lower than the contact orders of the latter. Our findings show that native structure topology indeed influences whether a nascent chain is likely to cotranslationally fold, with protein substructures with more local topologies (and lower contact order) more likely to precede rare codon stretches in real genes.

In summary, our simulations use a simplified model of protein translation and folding to study how sequences evolve under selection pressure for functional protein, assuming that the folded, native state is the functional state. In our model, proteins can begin to fold during translation and are vulnerable to degradation or aggregation while free and unfolded in solution. We find that the point at which native-like conformations become thermodynamically stable during translation influences how proteins evolve to fold during translation, and we predict that cotranslational folding is more likely to occur in lower contact order proteins.

Author Contributions

Designed research, W.M.J., E.I.S., and V.Z.; developed theoretical models, W.M.J., E.I.S., and V.Z.; performed research, V.Z.; analyzed data, E.I.S. and V.Z.; wrote the manuscript, E.I.S. and V.Z.

Acknowledgments

All computations in this work were run on the FASRC Odyssey and Cannon clusters supported by the FAS Division of Science Research Computing Group at Harvard University, United States. Lattice protein renderings were produced using Tachyon (66) within VMD (67). We thank Rostam M. Razban and Mobolaji Williams for helpful discussions.

This work was supported by the National Institute of General Medical Sciences of the National Institutes of Health (RO1GM068670) and the National Science Foundation Graduate Research Fellowship Program (DGE1745303, awarded to V.Z.).

Editor: Amedeo Caflisch.

Footnotes

Supporting Material can be found online at https://doi.org/10.1016/j.bpj.2020.06.037.

Supporting Citations

References (68, 69, 70, 71, 72, 73, 74) appear in the Supporting Materials and Methods.

Supporting Material

Document S1. Supporting Materials and Methods, Figs. S1–S12, and Tables S1 and S2
Document S2. Article plus Supporting Material

References

  • 1. Komar A.A. Unraveling co-translational protein folding: concepts and methods. Methods. 2018;137:71–81. doi: 10.1016/j.ymeth.2017.11.007.
  • 2. Kramer G., Shiber A., Bukau B. Mechanisms of cotranslational maturation of newly synthesized proteins. Annu. Rev. Biochem. 2019;88:337–364. doi: 10.1146/annurev-biochem-013118-111717.
  • 3. Sharma A.K., O’Brien E.P. Non-equilibrium coupling of protein structure and function to translation-elongation kinetics. Curr. Opin. Struct. Biol. 2018;49:94–103. doi: 10.1016/j.sbi.2018.01.005.
  • 4. Cabrita L.D., Hsu S.-T.D., Christodoulou J. Probing ribosome-nascent chain complexes produced in vivo by NMR spectroscopy. Proc. Natl. Acad. Sci. USA. 2009;106:22239–22244. doi: 10.1073/pnas.0903750106.
  • 5. Clark P.L., King J. A newly synthesized, ribosome-bound polypeptide chain adopts conformations dissimilar from early in vitro refolding intermediates. J. Biol. Chem. 2001;276:25411–25420. doi: 10.1074/jbc.M008490200.
  • 6. Evans M.S., Sander I.M., Clark P.L. Cotranslational folding promotes β-helix formation and avoids aggregation in vivo. J. Mol. Biol. 2008;383:683–692. doi: 10.1016/j.jmb.2008.07.035.
  • 7. Wruck F., Katranidis A., Hegner M. Translation and folding of single proteins in real time. Proc. Natl. Acad. Sci. USA. 2017;114:E4399–E4407. doi: 10.1073/pnas.1617873114.
  • 8. Nicola A.V., Chen W., Helenius A. Co-translational folding of an alphavirus capsid protein in the cytosol of living cells. Nat. Cell Biol. 1999;1:341–345. doi: 10.1038/14032.
  • 9. Frydman J., Erdjument-Bromage H., Hartl F.U. Co-translational domain folding as the structural basis for the rapid de novo folding of firefly luciferase. Nat. Struct. Biol. 1999;6:697–705. doi: 10.1038/10754.
  • 10. Hsu S.-T.D., Fucini P., Christodoulou J. Structure and dynamics of a ribosome-bound nascent chain by NMR spectroscopy. Proc. Natl. Acad. Sci. USA. 2007;104:16516–16521. doi: 10.1073/pnas.0704664104.
  • 11. Eichmann C., Preissler S., Deuerling E. Cotranslational structure acquisition of nascent polypeptides monitored by NMR spectroscopy. Proc. Natl. Acad. Sci. USA. 2010;107:9111–9116. doi: 10.1073/pnas.0914300107.
  • 12. Holtkamp W., Kokic G., Rodnina M.V. Cotranslational protein folding on the ribosome monitored in real time. Science. 2015;350:1104–1107. doi: 10.1126/science.aad0344.
  • 13. Bhushan S., Gartmann M., Beckmann R. α-helical nascent polypeptide chains visualized within distinct regions of the ribosomal exit tunnel. Nat. Struct. Mol. Biol. 2010;17:313–317. doi: 10.1038/nsmb.1756.
  • 14. Nilsson O.B., Hedman R., von Heijne G. Cotranslational protein folding inside the ribosome exit tunnel. Cell Rep. 2015;12:1533–1540. doi: 10.1016/j.celrep.2015.07.065.
  • 15. Netzer W.J., Hartl F.U. Recombination of protein domains facilitated by co-translational folding in eukaryotes. Nature. 1997;388:343–349. doi: 10.1038/41024.
  • 16. Zhang G., Hubalewska M., Ignatova Z. Transient ribosomal attenuation coordinates protein synthesis and co-translational folding. Nat. Struct. Mol. Biol. 2009;16:274–280. doi: 10.1038/nsmb.1554.
  • 17. Ugrinov K.G., Clark P.L. Cotranslational folding increases GFP folding yield. Biophys. J. 2010;98:1312–1320. doi: 10.1016/j.bpj.2009.12.4291.
  • 18. Samelson A.J., Bolin E., Marqusee S. Kinetic and structural comparison of a protein’s cotranslational folding and refolding pathways. Sci. Adv. 2018;4:eaas9098. doi: 10.1126/sciadv.aas9098.
  • 19. Chaney J.L., Steele A., Clark P.L. Widespread position-specific conservation of synonymous rare codons within coding sequences. PLoS Comput. Biol. 2017;13:e1005531. doi: 10.1371/journal.pcbi.1005531.
  • 20. Jacobs W.M., Shakhnovich E.I. Evidence of evolutionary selection for cotranslational folding. Proc. Natl. Acad. Sci. USA. 2017;114:11434–11439. doi: 10.1073/pnas.1705772114.
  • 21. Koutmou K.S., Radhakrishnan A., Green R. Synthesis at the speed of codons. Trends Biochem. Sci. 2015;40:717–718. doi: 10.1016/j.tibs.2015.10.005.
  • 22. Bitran A., Jacobs W.M., Shakhnovich E. Cotranslational folding allows misfolding-prone proteins to circumvent deep kinetic traps. Proc. Natl. Acad. Sci. USA. 2020;117:1485–1495. doi: 10.1073/pnas.1913207117.
  • 23. Cabrita L.D., Cassaignau A.M.E., Christodoulou J. A structural ensemble of a ribosome-nascent chain complex during cotranslational protein folding. Nat. Struct. Mol. Biol. 2016;23:278–285. doi: 10.1038/nsmb.3182.
  • 24. Nilsson O.B., Müller-Lucks A., von Heijne G. Trigger factor reduces the force exerted on the nascent chain by a cotranslationally folding protein. J. Mol. Biol. 2016;428:1356–1364. doi: 10.1016/j.jmb.2016.02.014.
  • 25. Hartl F.U., Bracher A., Hayer-Hartl M. Molecular chaperones in protein folding and proteostasis. Nature. 2011;475:324–332. doi: 10.1038/nature10317.
  • 26. Kim Y.E., Hipp M.S., Hartl F.U. Molecular chaperone functions in protein folding and proteostasis. Annu. Rev. Biochem. 2013;82:323–355. doi: 10.1146/annurev-biochem-060208-092442.
  • 27. Serohijos A.W., Shakhnovich E.I. Merging molecular mechanism and evolution: theory and computation at the interface of biophysics and evolutionary population genetics. Curr. Opin. Struct. Biol. 2014;26:84–91. doi: 10.1016/j.sbi.2014.05.005.
  • 28. Plaxco K.W., Simons K.T., Baker D. Contact order, transition state placement and the refolding rates of single domain proteins. J. Mol. Biol. 1998;277:985–994. doi: 10.1006/jmbi.1998.1645.
  • 29. Goldberg A.L. Protein degradation and protection against misfolded or damaged proteins. Nature. 2003;426:895–899. doi: 10.1038/nature02263.
  • 30. Cho Y., Zhang X., Powers E.T. Individual and collective contributions of chaperoning and degradation to protein homeostasis in E. coli. Cell Rep. 2015;11:321–333. doi: 10.1016/j.celrep.2015.03.018.
  • 31. Belle A., Tanay A., O’Shea E.K. Quantification of protein half-lives in the budding yeast proteome. Proc. Natl. Acad. Sci. USA. 2006;103:13004–13009. doi: 10.1073/pnas.0605420103.
  • 32. Grossman A.D., Straus D.B., Gross C.A. Sigma 32 synthesis can regulate the synthesis of heat shock proteins in Escherichia coli. Genes Dev. 1987;1:179–184. doi: 10.1101/gad.1.2.179.
  • 33. Maurizi M.R. Proteases and protein degradation in Escherichia coli. Experientia. 1992;48:178–201. doi: 10.1007/BF01923511.
  • 34. McShane E., Sin C., Selbach M. Kinetic analysis of protein stability reveals age-dependent degradation. Cell. 2016;167:803–815.e21. doi: 10.1016/j.cell.2016.09.015.
  • 35. Bershtein S., Mu W., Shakhnovich E.I. Protein quality control acts on folding intermediates to shape the effects of mutations on organismal fitness. Mol. Cell. 2013;49:133–144. doi: 10.1016/j.molcel.2012.11.004.
  • 36. Bershtein S., Serohijos A.W.R., Shakhnovich E.I. Protein homeostasis imposes a barrier on functional integration of horizontally transferred genes in bacteria. PLoS Genet. 2015;11:e1005612. doi: 10.1371/journal.pgen.1005612.
  • 37. Dykhuizen D.E., Dean A.M., Hartl D.L. Metabolic flux and fitness. Genetics. 1987;115:25–31. doi: 10.1093/genetics/115.1.25.
  • 38. Rodrigues J.V., Bershtein S., Shakhnovich E.I. Biophysical principles predict fitness landscapes of drug resistance. Proc. Natl. Acad. Sci. USA. 2016;113:E1470–E1478. doi: 10.1073/pnas.1601441113.
  • 39. Kimura M. On the probability of fixation of mutant genes in a population. Genetics. 1962;47:713–719. doi: 10.1093/genetics/47.6.713.
  • 40. Miyazawa S., Jernigan R.L. Estimation of effective interresidue contact energies from protein crystal structures: quasi-chemical approximation. Macromolecules. 1985;18:534–552.
  • 41. Wang P., Klimov D.K. Lattice simulations of cotranslational folding of single domain proteins. Proteins. 2008;70:925–937. doi: 10.1002/prot.21547.
  • 42. Gilson A.I., Marshall-Christensen A., Shakhnovich E.I. The role of evolutionary selection in the dynamics of protein structure evolution. Biophys. J. 2017;112:1350–1365. doi: 10.1016/j.bpj.2017.02.029.
  • 43. Heo M., Maslov S., Shakhnovich E. Topology of protein interaction network shapes protein abundances and strengths of their functional and nonspecific interactions. Proc. Natl. Acad. Sci. USA. 2011;108:4258–4263. doi: 10.1073/pnas.1009392108.
  • 44. Abkevich V.I., Gutin A.M., Shakhnovich E.I. Improved design of stable and fast-folding model proteins. Fold. Des. 1996;1:221–230. doi: 10.1016/S1359-0278(96)00033-8.
  • 45. Dokholyan N.V., Shakhnovich E.I. Understanding hierarchical protein evolution from first principles. J. Mol. Biol. 2001;312:289–307. doi: 10.1006/jmbi.2001.4949.
  • 46. Faísca P.F., Nunes A., Shakhnovich E.I. Non-native interactions play an effective role in protein folding dynamics. Protein Sci. 2010;19:2196–2209. doi: 10.1002/pro.498.
  • 47. Zou T., Williams N., Ghosh K. Proteome folding kinetics is limited by protein halflife. PLoS One. 2014;9:e112701. doi: 10.1371/journal.pone.0112701.
  • 48. Thirumalai D., Klimov D.K., Woodson S.A. Kinetic partitioning mechanism as a unifying theme in the folding of biomolecules. Theor. Chem. Acc. 1997;96:14–22.
  • 49. Stein K.C., Frydman J. The stop-and-go traffic regulating protein biogenesis: how translation kinetics controls proteostasis. J. Biol. Chem. 2019;294:2076–2084. doi: 10.1074/jbc.REV118.002814.
  • 50. Ciryam P., Morimoto R.I., O’Brien E.P. In vivo translation rates can substantially delay the cotranslational folding of the Escherichia coli cytosolic proteome. Proc. Natl. Acad. Sci. USA. 2013;110:E132–E140. doi: 10.1073/pnas.1213624110.
  • 51. Chaney J.L., Clark P.L. Roles for synonymous codon usage in protein biogenesis. Annu. Rev. Biophys. 2015;44:143–166. doi: 10.1146/annurev-biophys-060414-034333.
  • 52. Ivankov D.N., Garbuzynskiy S.O., Finkelstein A.V. Contact order revisited: influence of protein size on the folding rate. Protein Sci. 2003;12:2057–2062. doi: 10.1110/ps.0302503.
  • 53. Rustad M., Ghosh K. Why and how does native topology dictate the folding speed of a protein? J. Chem. Phys. 2012;137:205104. doi: 10.1063/1.4767567.
  • 54. Zou T., Ozkan S.B. Local and non-local native topologies reveal the underlying folding landscape of proteins. Phys. Biol. 2011;8:066011. doi: 10.1088/1478-3975/8/6/066011.
  • 55. Dinner A.R., Karplus M. The roles of stability and contact order in determining protein folding rates. Nat. Struct. Biol. 2001;8:21–22. doi: 10.1038/83003.
  • 56. Sánchez-Ruiz J.M., López-Lacomba J.L., Mateo P.L. Differential scanning calorimetry of the irreversible thermal denaturation of thermolysin. Biochemistry. 1988;27:1648–1652. doi: 10.1021/bi00405a039.
  • 57. Nury S., Meunier J.C. Molecular mechanisms of the irreversible thermal denaturation of guinea-pig liver transglutaminase. Biochem. J. 1990;266:487–490. doi: 10.1042/bj2660487.
  • 58. Lyubarev A.E., Kurganov B.I., Orlov V.N. Irreversible thermal denaturation of uridine phosphorylase from Escherichia coli K-12. Biophys. Chem. 1998;70:247–257. doi: 10.1016/s0301-4622(97)00133-6.
  • 59.Gao Y.-S., Su J.-T., Yan Y.-B. Sequential events in the irreversible thermal denaturation of human brain-type creatine kinase by spectroscopic methods. Int. J. Mol. Sci. 2010;11:2584–2596. doi: 10.3390/ijms11072584. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60.Goyal M., Chaudhuri T.K., Kuwajima K. Irreversible denaturation of maltodextrin glucosidase studied by differential scanning calorimetry, circular dichroism, and turbidity measurements. PLoS One. 2014;9:e115877. doi: 10.1371/journal.pone.0115877. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61.Mashaghi A., Kramer G., Tans S.J. Reshaping of the conformational search of a protein by the chaperone trigger factor. Nature. 2013;500:98–101. doi: 10.1038/nature12293. [DOI] [PubMed] [Google Scholar]
  • 62.O’Brien E.P., Christodoulou J., Dobson C.M. Trigger factor slows co-translational folding through kinetic trapping while sterically protecting the nascent chain from aberrant cytosolic interactions. J. Am. Chem. Soc. 2012;134:10920–10932. doi: 10.1021/ja302305u. [DOI] [PubMed] [Google Scholar]
  • 63.Guinn E.J., Tian P., Marqusee S. A small single-domain protein folds through the same pathway on and off the ribosome. Proc. Natl. Acad. Sci. USA. 2018;115:12206–12211. doi: 10.1073/pnas.1810517115. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64.Tian P., Steward A., Best R.B. Folding pathway of an Ig domain is conserved on and off the ribosome. Proc. Natl. Acad. Sci. USA. 2018;115:E11284–E11293. doi: 10.1073/pnas.1810523115. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.Walsh I.M., Bowman M.A., Clark P.L. Synonymous codon substitutions perturb cotranslational protein folding in vivo and impair cell fitness. Proc. Natl. Acad. Sci. USA. 2020;117:3528–3534. doi: 10.1073/pnas.1907126117. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66.Stone J.E. University of Missouri–Rolla; 1998. An efficient library for parallel ray tracing and animation. Masters thesis. [Google Scholar]
  • 67.Humphrey W., Dalke A., Schulten K. VMD: visual molecular dynamics. J. Mol. Graph. 1996;14:33–38, 27–28. doi: 10.1016/0263-7855(96)00018-5. [DOI] [PubMed] [Google Scholar]
  • 68.Mann M., Maticzka D., Backofen R. Classifying proteinlike sequences in arbitrary lattice protein models using LatPack. HFSP J. 2008;2:396–404. doi: 10.2976/1.3027681. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69.Salmon J.K., Moraes M.A., Shaw D.E. ACM Press; 2011. Parallel random numbers: as easy as 1, 2, 3. In Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis on - SC ’11; pp. 16:1–16:12. [Google Scholar]
  • 70.Lesh N., Mitzenmacher M., Whitesides S. ACM Press; 2003. A complete and effective move set for simplified protein folding. In Proceedings of the Seventh Annual International Conference on Computational Molecular Biology - RECOMB ’03; pp. 188–195. [Google Scholar]
  • 71.Györffy D., Závodszky P., Szilágyi A. “Pull moves” for rectangular lattice polymer models are not fully reversible. IEEE/ACM Trans. Comput. Biol. Bioinform. 2012;9:1847–1849. doi: 10.1109/TCBB.2012.129. [DOI] [PubMed] [Google Scholar]
  • 72.Milo R., Phillips R. Garland Science, Taylor & Francis Group; New York, NY: 2016. Cell Biology by the Numbers. [Google Scholar]
  • 73.Dewachter L., Verstraeten N., Michiels J. An integrative view of cell cycle control in Escherichia coli. FEMS Microbiol. Rev. 2018;42:116–136. doi: 10.1093/femsre/fuy005. [DOI] [PubMed] [Google Scholar]
  • 74.Berman H.M., Westbrook J., Bourne P.E. The protein data bank. Nucleic Acids Res. 2000;28:235–242. doi: 10.1093/nar/28.1.235. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

Supplementary Materials

Document S1. Supporting Materials and Methods, Figs. S1–S12, and Tables S1 and S2
mmc1.pdf (4.1MB, pdf)
Document S2. Article plus Supporting Material
mmc2.pdf (5.4MB, pdf)

Articles from Biophysical Journal are provided here courtesy of The Biophysical Society
