Proc Natl Acad Sci U S A. 2012 Jul 24;109(30):12005-10.
doi: 10.1073/pnas.1205176109. Epub 2012 Jul 6.

Efficient and sequence-independent replication of DNA containing a third base pair establishes a functional six-letter genetic alphabet

Denis A Malyshev et al. Proc Natl Acad Sci U S A. 2012.

Abstract

The natural four-letter genetic alphabet, comprised of just two base pairs (dA-dT and dG-dC), is conserved throughout all life, and its expansion by the development of a third, unnatural base pair has emerged as a central goal of chemical and synthetic biology. We recently developed a class of candidate unnatural base pairs, exemplified by the pair formed between d5SICS and dNaM. Here, we examine the PCR amplification of DNA containing one or more d5SICS-dNaM pairs in a wide variety of sequence contexts. Under standard conditions, we show that this DNA may be amplified with high efficiency and greater than 99.9% fidelity. To more rigorously explore potential sequence effects, we used deep sequencing to characterize a library of templates containing the unnatural base pair as a function of amplification. We found that the unnatural base pair is efficiently replicated with high fidelity in virtually all sequence contexts. The results show that, for PCR and PCR-based applications, d5SICS-dNaM is functionally equivalent to a natural base pair, and when combined with dA-dT and dG-dC, it provides a fully functional six-letter genetic alphabet.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Unnatural (d5SICS-dNaM) and natural Watson–Crick (dC-dG) base pairs.
Fig. 2.
(A) PCR selection scheme. X = NaM (or when biotinylated, its analog MMO2; see Fig. S5) and Y = 5SICS. (B) Library design. The regions proximal to the unnatural base pair that were analyzed for biases are shown in red, and the distal regions used as a control are shown in green. Sublibrary-specific two-nucleotide barcodes that indicate the position of the unnatural base pair flank the randomized regions and are shown in italics. Primer binding regions are denoted as PBR (sequences in SI Appendix, Table S1).
Fig. 3.
Fraction of single-copy sequences (Upper) and normalized Shannon entropy (Lower) for amplification with 1- (Left) or 4-min (Right) extension times. The red lines correspond to the regions proximal to the unnatural base pair, and the green lines correspond to the distal control regions (Fig. 2B). Populations that retained or lost the unnatural base pair are represented with solid or dotted lines, respectively. Error bars were determined from the independent analysis of each of the three sublibraries.
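
The two quantities plotted in Fig. 3 can be illustrated with a minimal Python sketch. The exact definitions used by the authors are not given in this record, so the following are assumptions: the single-copy fraction is taken as the share of reads whose sequence occurs exactly once, and the Shannon entropy of the read distribution is normalized by the logarithm of the number of reads, so that a sample of all-distinct reads scores 1. The function names are hypothetical.

```python
import math
from collections import Counter


def single_copy_fraction(reads):
    """Share of reads whose sequence occurs exactly once (assumed definition)."""
    counts = Counter(reads)
    singles = sum(1 for c in counts.values() if c == 1)
    return singles / len(reads)


def normalized_shannon_entropy(reads):
    """Shannon entropy of the sequence distribution, divided by log2(number of reads).

    Assumed normalization: 1.0 when every read is distinct, 0.0 when all reads
    are identical. The paper may normalize differently.
    """
    n = len(reads)
    if n < 2:
        return 0.0  # entropy is undefined or trivially zero for fewer than two reads
    counts = Counter(reads)
    h = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return max(0.0, h / math.log2(n))


# Toy sample: one sequence seen twice, two sequences seen once.
reads = ["ACGT", "ACGT", "TTGA", "CCAG"]
print(single_copy_fraction(reads))        # 0.5
print(normalized_shannon_entropy(reads))  # 0.75
```

Under these assumed definitions, a library dominated by a few over-amplified sequences would show both values falling with amplification, while an unbiased library would keep them close to their initial levels.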
Fig. 4.
Analysis of amplification bias with 1-min extension time. In all cases, retained and lost refer to the populations that retained the unnatural base pair during amplification and the populations that lost it, respectively. (A) Single nucleotide bias. frelN(n) − 1 values are shown for each natural nucleotide (indicated along the top) as a function of position relative to dNaM in the amplified library. Amplification level is shown along the far right edge. (B) Normalized pairwise correlations C(n,n′). Only positive values of C(n,n′), which indicate amplification-dependent biases, are shown. For visualization, the discrete data are represented with continuous functions (surfaces). (C) 5′- and 3′-dinucleotide biases (frelNN(n,n′) − 1) are shown on the left and right, respectively, and are represented in a circular format, where the sequences read from the middle out, with X representing dNaM. For example, for each dinucleotide distribution, the upper-right quadrant corresponds to either 5′-NAX or XAN-3′, where N is (clockwise) A, C, G, or T. Correspondingly, the bottom-right quadrant corresponds to either 5′-NCX or XCN-3′, the bottom-left quadrant corresponds to either 5′-NGX or XGN-3′, and the top-left quadrant corresponds to either 5′-NTX or XTN-3′. Amplification level is indicated by gray shading, which is shown at the bottom.
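
The single nucleotide bias in panel A can be sketched as follows. This record does not define frelN(n), so the sketch assumes it is the frequency of nucleotide N at position n in the amplified library divided by the same frequency in a reference (for example, unamplified) library. Under that assumption a value of 0 for frelN(n) − 1 means no bias. The function names and the toy data are hypothetical.

```python
from collections import Counter


def position_frequencies(reads, alphabet="ACGT"):
    """Per-position nucleotide frequencies for equal-length reads."""
    length = len(reads[0])
    freqs = []
    for pos in range(length):
        counts = Counter(read[pos] for read in reads)
        freqs.append({base: counts[base] / len(reads) for base in alphabet})
    return freqs


def relative_bias(amplified, reference, alphabet="ACGT"):
    """Assumed form of frelN(n) - 1: ratio of amplified to reference frequency, minus 1.

    Returns one dict per position; a base absent from the reference yields NaN.
    """
    amp = position_frequencies(amplified, alphabet)
    ref = position_frequencies(reference, alphabet)
    return [
        {
            base: (a[base] / r[base] - 1.0) if r[base] > 0 else float("nan")
            for base in alphabet
        }
        for a, r in zip(amp, ref)
    ]


# Toy one-position libraries: A is enriched and T is lost after amplification.
reference = ["A", "C", "G", "T"]
amplified = ["A", "A", "C", "G"]
print(relative_bias(amplified, reference)[0])
```

The dinucleotide biases in panel C could be computed the same way by counting pairs of positions instead of single positions, again under the assumed ratio definition.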
