[Preprint]. 2024 Jun 16:2024.06.16.599206.
doi: 10.1101/2024.06.16.599206.

Synthetic genomes unveil the effects of synonymous recoding

Akos Nyerges et al. bioRxiv.

Abstract

Engineering the genetic code of an organism provides the basis for (i) making any organism safely resistant to natural viruses and (ii) preventing genetic information flow into and out of genetically modified organisms while (iii) allowing the biosynthesis of genetically encoded unnatural polymers [1-4]. Achieving these three goals requires the reassignment of several of the 64 codons nature uses to encode proteins. However, synonymous codon replacement (recoding) is frequently lethal, and how recoding impacts fitness remains poorly explored. Here, we explore these effects using whole-genome synthesis, multiplexed directed evolution, and genome-transcriptome-translatome-proteome co-profiling on multiple recoded genomes. Using this information, we assemble a synthetic Escherichia coli genome in seven sections using only 57 codons to encode proteins. By discovering the rules responsible for the lethality of synonymous recoding and developing a data-driven multi-omics-based genome construction workflow that troubleshoots synthetic genomes, we overcome the lethal effects of 62,007 synonymous codon swaps and 11,108 additional genomic edits. We show that synonymous recoding induces transcriptional noise, including new antisense RNAs, leading to drastic transcriptome and proteome perturbation. As the elimination of select codons from an organism's genetic code results in the widespread appearance of cryptic promoters, we show that synonymous codon choice may naturally evolve to minimize transcriptional noise. Our work provides the first genome-scale description of how synonymous codon changes influence organismal fitness and paves the way for the construction of functional genomes that provide genetic firewalls from natural ecosystems and safely produce biopolymers, drugs, and enzymes with an expanded chemistry.

Conflict of interest statement

The authors declare competing financial interests. A.N. is an inventor on a patent related to directed evolution with random genomic mutations (DIvERGE) (US10669537B2: Mutagenizing Intracellular Nucleic Acids) that has been outlicensed. Harvard Medical School has filed provisional patent applications related to this work on which A.N. and G.M.C. are listed as inventors. Q.Z., M.W., M.L., A.J., K.C., Z.L., and F.H. are employed by GenScript USA Inc., but the company had no role in designing or executing experiments. G.M.C. is a founder of GRO Biosciences and EnEvolv (now part of Ginkgo Bioworks), in which he has related financial interests. Other potentially relevant financial interests of G.M.C. are listed at http://arep.med.harvard.edu/gmc/tech.html.

Figures

Figure 1. Design of a synthetic 57-codon Escherichia coli genome.
(a) The design of Ec_Syn57’s compressed genetic code. Magenta marks codons and their corresponding tRNA genes and release factor I (RF1, encoded by prfA) selected for elimination during genome design. (b) The computational genome design of Ec_Syn57 synonymously recoded all known instances of the seven target codons highlighted in Figure 1a and streamlined the synthetic chromosome. Prior to synonymous recoding, (I.) overlapping genes that contain forbidden codons within their overlapping region were disentangled while preserving the downstream gene’s ribosomal binding site; next, (II.) protein-coding genes were recoded to confer the 57-codon genetic code of Figure 1a while minimizing local changes in GC% and mRNA folding differences near the 5’ end of genes. Next, we streamlined DNA synthesis and subsequent genome assembly steps by eliminating unstable repeats (III.), removing cut sites of AarI, BsaI, and BsmBI restriction enzymes (IV.), and eliminating sequences containing >8 consecutive As, Cs, Ts, or >5 consecutive Gs (V.). Finally, the refactored, recoded genome was divided into 86 ~50 kbp segments, and the entire genome was synthesized.
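The synthesis-oriented filters in steps IV and V can be illustrated with a minimal Python sketch. It assumes the standard recognition sequences of AarI (CACCTGC), BsaI (GGTCTC), and BsmBI (CGTCTC) and scans both strands. The function names and the toy sequence are hypothetical and are not taken from the authors' design pipeline.

```python
import re

# Recognition sequences of the three Type IIS enzymes named in the caption.
SITES = {"AarI": "CACCTGC", "BsaI": "GGTCTC", "BsmBI": "CGTCTC"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Runs the caption disallows: more than 8 consecutive A, C, or T,
# or more than 5 consecutive G.
_HOMOPOLYMER = re.compile(r"A{9,}|C{9,}|T{9,}|G{6,}")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_cut_sites(seq):
    """List (enzyme, 0-based start, strand) for every site on either strand."""
    seq = seq.upper()
    hits = []
    for enzyme, site in SITES.items():
        for strand, motif in (("+", site), ("-", reverse_complement(site))):
            start = seq.find(motif)
            while start != -1:
                hits.append((enzyme, start, strand))
                start = seq.find(motif, start + 1)
    return sorted(hits, key=lambda hit: hit[1])


def find_homopolymers(seq):
    """List (0-based start, run) for every disallowed homopolymer run."""
    return [(m.start(), m.group()) for m in _HOMOPOLYMER.finditer(seq.upper())]


# Hypothetical toy sequence: one BsaI site, nine As, one BsmBI site on the
# reverse strand, and six Gs.
toy = "ATG" + "GGTCTC" + "A" * 9 + "GAGACG" + "T" + "G" * 6 + "TAA"
cut_sites = find_cut_sites(toy)
homopolymers = find_homopolymers(toy)
```

A design tool would then replace each flagged position with a synonymous alternative and rescan, since a fix for one rule can introduce a violation of another.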
Figure 2. Genome construction with challenging recoding schemes.
(a) Without sequence-based counterselection against the parental genome (i.e., CRISPR/Cas9-cut within the parental copy, marked as ▲), genome construction methods that rely on induced crossover and the use of only counterselectable marker genes (marked in orange) to eliminate the parental copy frequently result in chimeric genomes and the reversion of challenging modifications and deleterious genome-design issues (marked as *), preventing the construction and troubleshooting of genomes with less-tolerated recoding schemes. (b) Sequential integration coupled with CRISPR/Cas9-cut assisted sequence-based counterselection against the parental locus prevents unwanted chimera formation and allows the generation of fitness-decreasing recoding schemes. (c) The workflow of SynOMICS-based genome construction and troubleshooting. SynOMICS utilizes recombination deficient (i.e., ΔrecA) parental strains, leading to increased CRISPR/Cas9 selection stringency and preventing unwanted recombination between the parental and recoded copies. Recoded chromosomal segments are delivered using electroporation or conjugation (I.), followed by CRISPR/Cas9-assisted Lambda-Red recombineering-based deletion of the parental segment copy using an antibiotic-resistance-conferring deletion cassette and a genome-targeting crRNA plasmid (II.). Fitness-decreasing synonymous codons and genomic design errors are discovered using multi-omics analyses (III.) and troubleshot using multiplexed genome editing and laboratory evolution (IV.). Finally, the extrachromosomal recoded segment is integrated by delivering a 6-plex sgRNA expression plasmid (V.) that site-specifically integrates the synthetic part (VI.) and eliminates the genome-targeting crRNA plasmid of Step II. SynOMICS cycles are repeatable without interruption due to the inducible elimination of its pINTsg integration plasmid.
Figure 3. Multi-omics-based detection and troubleshooting of genome design issues.
(a) Overview of multi-omics analyses utilized in this study to identify fitness-impacting changes on synthetic genomes. (b) Ribosomes stall at forbidden codons present in Ec_Syn61Δ3. The figure shows ribosome footprint coverage based on Ribo-seq along rpoS containing a TCA codon (marked with *) in Ec_Syn61Δ3 lacking tRNASer(UGA) and tRNASer(CGA) needed to translate TCA and TCG codons, respectively, and in E. coli MDS42 bearing the canonical genetic code. (c) Ribosome A-(aminoacyl)-site coverage of sense codons in Ec_Syn61Δ3. Ribosomes in the compressed genetic code accumulate at TCA (mRNA UCA) and TCG (mRNA UCG) codons (marked in magenta), which are present due to genome annotation errors and hypermutagenesis-induced mutations. Ribo-seq data were collected in n=3 independent replicates; error bars indicate standard deviation. (d) Synonymous recoding frequently reduces promoter activity. The figure shows the mRNA output of intragenic promoters that overlap with synonymous codon swaps in Ec_Syn57. r.c. marks the recoded promoter variant from Ec_Syn57, while w.t. marks the parental promoter variant. Experiments were performed in n=10 replicates. Error bars denote standard deviation; *** indicates a P value ≤ 0.001, while **** indicates P ≤ 0.0001 based on an unpaired two-sided Student’s t-test. (e) Troubleshooting synonymous recoding-induced lethality using multiplexed genome editing and laboratory evolution. The nonviable computational genome design (i) was troubleshot using a combination of (ii) targeted DIvERGE mutagenesis followed by (iii) ALE to identify optimal variants within the growth-essential promoter of the ribF-ispH operon.
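Panels b and c concern TCA and TCG codons that remain in Ec_Syn61Δ3 despite recoding. A minimal Python sketch of how such residual in-frame codons could be listed from an annotated coding sequence follows. The function name and the toy sequences are hypothetical, and the sketch assumes the input starts at the first base of the start codon, so an annotation error in the reading frame would hide the codon from this scan.

```python
# Codons that the caption describes as untranslatable in Ec_Syn61Δ3.
FORBIDDEN = frozenset({"TCA", "TCG"})


def residual_forbidden_codons(cds, forbidden=FORBIDDEN):
    """Return (codon index, codon) for each in-frame forbidden codon.

    Assumes `cds` is a DNA coding sequence in frame 0; trailing bases
    that do not complete a codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return [
        (i // 3, cds[i:i + 3])
        for i in range(0, usable, 3)
        if cds[i:i + 3] in forbidden
    ]


# Hypothetical toy sequences: the first has two in-frame forbidden codons,
# the second contains "TCA" only out of frame (ATG ATC AGC TAA).
in_frame = residual_forbidden_codons("ATGTCAGCTTCGTAA")
out_of_frame = residual_forbidden_codons("ATGATCAGCTAA")
```

Only in-frame matches are reported because an out-of-frame TCA or TCG is never read as a codon by the ribosome.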
Figure 4. Assembly of recoded chromosomal regions to create Ec_Syn57.
(a) The reversal of SynOMICS’s steps and its direction allows genome fission, fusion, and assembly from separately constructed synthetic chromosomal regions. Chromosome fission is achieved by delivering a recipient BAC (i.e., the pYES2L fission BAC, abbreviated as pY2L) into a cell line bearing a partially recoded chromosome (I.), and (II.) following the expression of Lambda-Red and Cas9 from pRedCas2 (not shown), chromosome fission is initiated by delivering pFISSIONsg, a four-plex nonrepetitive sgRNA expression plasmid, and a version of the SynOMICS deletion cassette that only contains terminal genomic homologies. (III.) CRISPR/Cas9 cuts liberate the recoded chromosomal region from the genome and linearize the recipient BAC, allowing the Lambda-Red-mediated terminal homology-directed transposition of the recoded chromosomal region into the recipient BAC while simultaneously sealing the genomic cut. Finally, (IV.) the fissioned chromosomal region is delivered into a new recipient cell where it is integrated using the standard SynOMICS workflow, shown in Figure 2c. (b) Construction of the synthetic genome of Ec_Syn57 from 11 simultaneously constructed synthetic sections. Numbers indicate the steps in which genomic sections are merged to assemble the final, fully synthetic genome of Ec_Syn57. Segments 45 and 50 (marked as ▲) were added following the merging of sections 1 to 5. To date, sections 1–5 have been combined, as indicated in Figure 4c, yielding 7 strains containing the synthetic genome of Ec_Syn57. (c) SynOMICS-based sequential assembly and troubleshooting of recoded chromosomal sections to generate Ec_Syn57. Following the construction of recoded chromosomal regions, we utilized SynOMICS (Figure 4a) to transfer and integrate recoded chromosomal regions in E. coli MDS42 ΔrecA containing Segments 36 and 37. Following the assembly of three chromosomal regions, we initiated genome-editing- and ALE-based troubleshooting to increase the fitness of partially recoded strains before the next chromosomal region was transferred. The pie chart displays the steps of genome assembly; colored sections mark synthetic recoded chromosomal regions transferred to obtain E. coli MDS42 ΔrecA containing Segments 9–18 and 36–59. For the detailed description of assembly steps, see Supplementary Methods. The bar graph shows final optical density at 600 nm (OD600) following aerobic growth in 2×YT broth, a rich bacterial growth medium, at 37 °C. Source data is available in this paper.
Figure 5. Synonymous recoding induces widespread changes in gene expression.
(a) The recoded genome of Ec_Syn61Δ3 and recoded chromosomal regions of Ec_Syn57 display marked differences in mRNA expression. The figure shows differential mRNA expression for all protein-coding genes along the recoded genome of Ec_Syn61Δ3 and partially recoded genomes of Ec_Syn57, compared to their parental variant. Recoded regions on partially recoded genomes are marked in magenta. Example operons displaying the most significant changes in mRNA expression are indicated. Bars indicate mean differential expression. Differential expression values were calculated using the EdgeR algorithm using three independent replicates (n=3). Source data is provided in Supplementary Data 3. (b) Synonymous recoding induces the widespread appearance of cryptic promoters. The figure shows the position and direction of primary transcripts based on primary transcriptome sequencing using Cappable-seq. Recoding-induced novel transcriptional start sites are marked with a star (*). Cappable-seq reads in magenta mark reverse transcript orientation, while green marks forward transcript orientation. Black triangles indicate recoded codons. Cappable-seq experiments are based on n=2 independent replicates. For the analysis of the same locus in the parental MDS42, see Supplementary Figure 14. (c-d) Synonymous recoding-induced promoters behind transcriptome changes of Ec_Syn61Δ3 and Ec_Syn57, highlighted in Figure 5a. The figure shows RNA-seq coverage within the ybfO-P locus of Ec_Syn61Δ3 (c) and the ydiT-fadK locus in Ec_Syn57 (d) and their corresponding parental copy. Green triangles indicate recoded codons. RNA-seq experiments were performed in n=3 independent replicates.

References

    1. Lajoie MJ, Rovner AJ, Goodman DB, Aerni H-R, Haimovich AD, Kuznetsov G, et al. Genomically Recoded Organisms Expand Biological Functions. Science 2013;342:357–60. doi: 10.1126/science.1241459.
    2. Nyerges A, Vinke S, Flynn R, Owen SV, Rand EA, Budnik B, et al. A swapped genetic code prevents viral infections and gene transfer. Nature 2023:1–8. doi: 10.1038/s41586-023-05824-z.
    3. Robertson WE, Funke LFH, Torre D de la, Fredens J, Elliott TS, Spinck M, et al. Sense codon reassignment enables viral resistance and encoded polymer synthesis. Science 2021;372:1057–62. doi: 10.1126/science.abg3029.
    4. Zürcher JF, Robertson WE, Kappes T, Petris G, Elliott TS, Salmond GPC, et al. Refactored genetic codes enable bidirectional genetic isolation. Science 2022;0:eadd8943. doi: 10.1126/science.add8943.
    5. Crick FHC, Barnett L, Brenner S, Watts-Tobin RJ. General Nature of the Genetic Code for Proteins. Nature 1961;192:1227–32. doi: 10.1038/1921227a0.
