J Mol Evol. 2023 Feb;91(1):33-45.
doi: 10.1007/s00239-022-10080-2. Epub 2022 Dec 3.

Consequences of Genetic Recombination on Protein Folding Stability

Roberto Del Amparo et al. J Mol Evol. 2023 Feb.

Abstract

Genetic recombination is a common evolutionary mechanism that produces molecular diversity. However, its consequences on protein folding stability have not attracted the same attention as in the case of point mutations. Here, we studied the effects of homologous recombination on the computationally predicted protein folding stability for several protein families, finding less detrimental effects than we previously expected. Although recombination can affect multiple protein sites, we found that the fraction of recombined proteins that are eliminated by negative selection because of insufficient stability is not significantly larger than the corresponding fraction of proteins produced by mutation events. Indeed, although recombination disrupts epistatic interactions, the mean stability of recombinant proteins is not lower than that of their parents. On the other hand, the difference of stability between recombined proteins is amplified with respect to the parents, promoting phenotypic diversity. As a result, at least one third of recombined proteins present stability between those of their parents, and a substantial fraction have higher or lower stability than those of both parents. As expected, we found that parents with similar sequences tend to produce recombined proteins with stability close to that of the parents. Finally, the simulation of protein evolution along the ancestral recombination graph with empirical substitution models commonly used in phylogenetics, which ignore constraints on protein folding stability, showed that recombination favors the decrease of folding stability, supporting the convenience of adopting structurally constrained models when possible for inferences of protein evolutionary histories with recombination.

Keywords: Molecular evolution; Protein evolution; Protein folding stability; Recombination; Substitution models of protein evolution.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1
Variation of folding free energy between parental and recombined proteins at varying selection levels. A mutation or recombination event was accepted if ∆Gs ≤ t∆Gr, where ∆Gs is the folding stability of the tested protein (i.e., generated by a mutation or recombination event), ∆Gr is the folding stability of the real protein (Table 1), and t is a user-specified selection threshold. We recombined proteins that were stable according to this criterion and considered all recombined proteins, whether stable or unstable. The plots show the difference in folding free energies between parent and recombined (descendant) protein sequences (y-axis) as a function of the selection threshold (x-axis). Top plot: difference of means. The mean folding free energy of the descendants differs only slightly from that of the parents (note the small scale of the y-axis). Bottom plot: difference of differences. The difference between the folding free energies of the descendants is much larger than the corresponding difference between the parents. Results are based on simulations of the DDL protein family. Error bars represent the 95% confidence interval of the mean, assuming that different protein pairs are independent.
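The acceptance rule and the two quantities plotted in Fig. 1 can be illustrated with a minimal Python sketch. It assumes the criterion reads ∆Gs ≤ t∆Gr (the comparison sign is an assumption), that more negative ∆G means a more stable protein, and that differences are taken as parents minus descendants, following the sign convention described for Fig. 5. All function names are hypothetical, and the ∆G values are taken as inputs from an external stability predictor rather than computed here.

```python
from statistics import mean


def is_accepted(dg_tested: float, dg_real: float, t: float) -> bool:
    """Return True if the tested protein passes the stability threshold.

    Assumed criterion: dG_s <= t * dG_r, with more negative dG = more stable.
    """
    return dg_tested <= t * dg_real


def difference_of_means(dg_parents, dg_descendants) -> float:
    """Mean dG of the parents minus mean dG of the descendants.

    Negative result: the parents are, on average, more stable (assumed sign).
    """
    return mean(dg_parents) - mean(dg_descendants)


def difference_of_differences(dg_parents, dg_descendants) -> float:
    """Stability gap between the descendants minus the gap between the parents.

    Positive result: recombination amplified the stability difference.
    """
    gap_descendants = abs(dg_descendants[0] - dg_descendants[1])
    gap_parents = abs(dg_parents[0] - dg_parents[1])
    return gap_descendants - gap_parents


# Illustrative (made-up) dG values for one recombination event.
parents = (-10.0, -8.0)
descendants = (-12.0, -5.0)
print(difference_of_means(parents, descendants))
print(difference_of_differences(parents, descendants))
```

In this toy event the means differ only slightly while the gap between descendants is much wider than the gap between parents, which is the qualitative pattern the caption describes.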
Fig. 2
Acceptance rates of mutated and recombined sequences in several protein families. A mutation or recombination event was accepted if ∆Gs ≤ t∆Gr, where ∆Gs is the folding stability of the tested protein (i.e., generated by a mutation or recombination event), ∆Gr is the folding stability of the real protein (Table 1), and t is a user-specified threshold. In this figure, the threshold is 0.95. The figure shows the acceptance rates of mutated sequences and recombined sequences, as well as the rates of recombination events in which only one or both recombined sequences were accepted. Error bars correspond to the standard error of the mean of the respective mutation or recombination events. Results for the same analysis restricted to recombination events with the breakpoint in the middle position of the sequences are shown in Fig. S16.
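As a rough illustration of how one recombination event yields two recombined sequences that are then accepted or rejected, the sketch below uses a single-breakpoint reciprocal crossover between two aligned parents of equal length. The crossover scheme, the function names, and the ∆G values are assumptions made for illustration; the caption does not specify the procedure, and the stability predictor is not reproduced here.

```python
def recombine(parent_a: str, parent_b: str, cut: int):
    """Single-breakpoint reciprocal crossover of two aligned sequences.

    Returns the two recombined (descendant) sequences.
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must be aligned and of equal length")
    if not 0 < cut < len(parent_a):
        raise ValueError("breakpoint must fall inside the sequence")
    child_1 = parent_a[:cut] + parent_b[cut:]
    child_2 = parent_b[:cut] + parent_a[cut:]
    return child_1, child_2


def count_accepted(dg_descendants, dg_real: float, t: float = 0.95) -> int:
    """Number of descendants (0, 1 or 2) meeting the assumed rule dG_s <= t * dG_r."""
    return sum(1 for dg in dg_descendants if dg <= t * dg_real)


# Breakpoint in the middle position of two toy sequences.
seq_a, seq_b = "AAAAAA", "CCCCCC"
print(recombine(seq_a, seq_b, len(seq_a) // 2))

# Made-up dG values for the two descendants of one event.
print(count_accepted((-9.8, -9.0), dg_real=-10.0))
```

Tallying events by the count returned (one versus two accepted descendants) gives the two event-level rates mentioned in the caption.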
Fig. 3
Acceptance rates of protein sequences derived from mutation and recombination events at variable selection levels. A mutation or recombination event was accepted if ∆Gs ≤ t∆Gr, where ∆Gs is the folding stability of the tested protein (i.e., generated by a mutation or recombination event), ∆Gr is the folding stability of the real protein (Table 1), and t is a user-specified threshold. The figure shows the acceptance rates of mutated sequences and recombined sequences, as well as the rates of recombination events in which only one or both recombined sequences were accepted. Results are based on simulations of the DDL protein family. Error bars correspond to the standard error of the mean of the respective mutation or recombination events.
Fig. 4
Rates of accepted mutated and recombined sequences that are more or less stable than their parent sequences at diverse selection levels. The figure shows the rate of mutated sequences that are more stable than their parent sequences and the rates of recombined (descendant) sequences that are more or less stable than both or one of the parental sequences. Results are based on simulations of the DDL protein family. Error bars indicate the standard error of the mean of the corresponding mutation and recombination events. This evaluation considers recombination events with breakpoints located at all positions. Results for the same analysis restricted to recombination events with the breakpoint in the middle position of the sequences are shown in Fig. S20.
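The categories in Fig. 4 amount to comparing a descendant's stability with those of its two parents. A minimal sketch follows, assuming that more negative ∆G means more stable; the function name and category labels are hypothetical.

```python
def classify_descendant(dg_child: float, dg_parent_a: float, dg_parent_b: float) -> str:
    """Place a descendant's dG relative to the dG of its two parents.

    Assumes more negative dG = more stable.
    """
    most_stable, least_stable = sorted((dg_parent_a, dg_parent_b))
    if dg_child < most_stable:
        return "more stable than both parents"
    if dg_child > least_stable:
        return "less stable than both parents"
    # Otherwise the descendant is more stable than one parent only.
    return "between parents"


# Made-up dG values for three descendants of parents with dG = -10 and -8.
for dg in (-11.0, -9.0, -7.0):
    print(dg, classify_descendant(dg, -10.0, -8.0))
```

Counting descendants per category and dividing by the number of accepted descendants would give rates of the kind plotted in the figure.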
Fig. 5
Influence of sequence identity between parental sequences on the change in folding free energy caused by recombination in the protein family DDL. The figure shows the folding free energy variation produced by recombination (∆∆G) between recombinant (parental) and recombined (descendant) sequences as a function of the sequence identity between the parental sequences (shown on the right by intervals). Negative values mean that the two sequences before recombination are, on average, more stable than the two sequences after recombination; positive values mean the opposite. Results are based on a selection threshold of 0.95. The upper plots refer to recombination events occurring at all breakpoint positions (mean), and the lower plots refer to recombination events with the breakpoint located only in the middle of the sequences. Results for other protein families are shown in Figs. S21–24.
Fig. 6
Folding free energy of DDL proteins simulated upon coalescent trees with diverse combinations of population substitution and recombination rates. Folding free energy (∆G) of proteins evolved along coalescent trees that were simulated under a variety of combinations of population substitution rate (θ) and population recombination rate (ρ), with the protein sequences evolving under the best-fitting empirical substitution model (Table 1). The dashed line corresponds to the ∆G of the extant protein structure of the protein family (Table 1). Error bars represent the 95% confidence interval of the mean across computer simulations. Results for other protein families are shown in Fig. S28.
