PLoS Genet. 2020 Aug 14;16(8):e1008991.
doi: 10.1371/journal.pgen.1008991. eCollection 2020 Aug.

Cryptic genetic variation enhances primate L1 retrotransposon survival by enlarging the functional coiled coil sequence space of ORF1p

Anthony V Furano et al. PLoS Genet. 2020.

Abstract

Accounting for the continual evolution of deleterious L1 retrotransposon families, which can contain hundreds to thousands of members, remains a major issue in mammalian biology. L1 activity generated upwards of 40% of some mammalian genomes, including that of humans, where L1 elements remain active and cause genetic defects and rearrangements. L1 encodes a coiled coil-containing protein that is essential for retrotransposition, and the emergence of novel primate L1 families has been correlated with episodes of extensive amino acid substitutions in the coiled coil. These results were interpreted as an adaptive response to maintain L1 activity; however, its mechanism remained unknown. Although an adventitious mutation can inactivate coiled coil function, its effect could be buffered by epistatic interactions within the coiled coil, which is more likely if the family contains a diverse set of coiled coil sequences, collectively referred to as the coiled coil sequence space. Amino acid substitutions that do not affect coiled coil function (i.e., its phenotype) could be "hidden" from (not subject to) purifying selection. The accumulation of such substitutions, often referred to as cryptic genetic variation, has been documented in various proteins. Here we report that this phenomenon was in effect during the latest episode of primate coiled coil evolution, which occurred 30-10 MYA during the emergence of the primate L1Pa7-L1Pa3 families. First, we experimentally demonstrated that while coiled coil function (measured by retrotransposition) can be eliminated by single epistatic mutations, it can nonetheless withstand extensive amino acid substitutions. Second, principal component and cluster analysis showed that the coiled coil sequence space of each of the L1Pa7-L1Pa3 families was notably increased by the presence of distinct, coexisting coiled coil sequences. Thus, sampling related networks of functional sequences, rather than traversing discrete adaptive states, characterized the persistence of L1 activity during this evolutionary event.
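The idea of distinct, coexisting groups of sequences within one family's sequence space can be illustrated with a small sketch. It is not the authors' analysis (the figure legends state that they used the bios2mds R package); it swaps in a simpler stand-in technique, pairwise p-distances followed by single-linkage grouping at a fixed threshold. The function names, the toy peptides, and the 0.25 threshold are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Fraction of aligned positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def cluster_by_threshold(seqs: list[str], max_dist: float) -> list[list[int]]:
    """Single-linkage grouping: link any two sequences within max_dist."""
    parent = list(range(len(seqs)))  # union-find forest, one node per sequence

    def find(i: int) -> int:
        # Follow parent pointers to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(seqs)), 2):
        if p_distance(seqs[i], seqs[j]) <= max_dist:
            parent[find(i)] = find(j)

    clusters: dict[int, list[int]] = {}
    for i in range(len(seqs)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Toy aligned peptides (made up; not real ORF1p coiled coil sequences).
toy = ["LKELMELK", "LKELMELR", "LKEIMELK", "MRDLQTLE", "MRDLQTLD"]
clusters = cluster_by_threshold(toy, max_dist=0.25)
print(clusters)  # [[0, 1, 2], [3, 4]]
```

In this toy case the five sequences fall into two groups, which is the kind of structure a larger sequence space would show when a family contains more than one coiled coil lineage. A real analysis would need a substitution-aware distance and a principled choice of cluster number.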

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Retrotransposition of ORF1p variants.
A: Generic L1 retrotransposon [1, 43]: 5’ UTR (untranslated regulatory region); ORF1 (open reading frame 1) with coiled coil domain (CC), which mediates the trimerization of ORF1p necessary for high-affinity nucleic acid binding and chaperone activity [, –46]; ORF2, which encodes the L1 replicase with endonuclease (EN) and reverse transcriptase (RT) domains; 3’ UTR with a conserved G-tetraplex-forming domain (G) [47, 48] and an A-rich tail (A). P, location of conserved phosphorylation sites in mammalian ORF1p required for retrotransposition [1, 49]. NTD, RRM, and CTD denote the N-terminal domain, RNA recognition motif, and C-terminal domain, respectively. Also depicted are schematics of 8 ORF1p sequences, 7 of which are mosaic structures consisting of the indicated regions of a modern active L1Pa1 (L1Hs) ORF1p (white, 111) and a resuscitated ancestral L1Pa5 ORF1p (black, 555) [26]. The names of the variants are given on the right, and the white numbers indicate their amino acid differences.
B: Alignment of the N-terminal 23 amino acids and the entire 14-heptad coiled coil domain (alternating green and yellow boxes). Note that the heptads are numbered 1–14 (only 8 and 9 are indicated) so as to be congruent with the amino-to-carboxy orientation of the protein. L1Pa1 ORF1p is the reference sequence; dots and letters indicate, respectively, identities and differences between it and the other variants. Four columns are listed on the left: Numbers, corresponding to the subset of the variants shown in S3 Fig that were mapped on the coiled coil sequence space shown in Fig 2; pro exp, green dots marking variants tested for protein expression in HeLa cells (both active and inactive proteins were expressed; S1 Fig); % ret, % retrotransposition activity in HeLa cells; clone, variant names. The a-g heptad amino acid positions are shown for heptads 8 and 9; stm indicates the stammer in heptad 6.
C: Box plots of retrotransposition assays of selected variants (bracketed in panel B), with representative stained G418-resistant foci. The numbers (n) of independent transfections (biological replicates) are indicated. Retrotransposition results for the other variants are shown in S2 Fig.
Fig 2. Principal component analysis of the ORF1p coiled coil.
A: The color code used for each family corresponds to one of the major clusters shown in Fig 3. B: Large and small circles correspond, respectively, to active and inactive ORF1p variants mapped on the sequence space of the coiled coil of L1Pa7-L1Pa1 and are numbered per Fig 1 and S3 Fig. Active variants (large circles) exhibited ~80–100% of L1Pa1 activity, and inactive ones (small circles) exhibited <5% of L1Pa1 activity.
Fig 3. Cluster analysis of coiled coil sequence space.
Panels A-F show the coiled coil clusters identified in L1Pa7-L1Pa1 by the bios2mds R package [50] as described in the Materials and Methods. Clusters are designated cLn.n, the cluster number followed by the family number; for example, cL3.7 is cluster 3 of the L1Pa7 family. Panel F shows the projection of L1Pa1 (cL1.1) on the sequence space of L1Pa2 (cL1.2) and cL1.3, using the mmds.project function of the bios2mds package as described in the Materials and Methods. Panels G-L show the projection of L1Pa6 or L1Pa4 clusters on the coiled coil sequence space of L1Pa5. Panel I shows 3 coiled coil clusters for L1Pa5: cL1.5mod, cL3.5mod, and cL2.5anc, which belong to the modern and ancestral versions of L1Pa5 (see text). The 50% consensus sequence of the cL3.5 cluster corresponds to the 555 ORF1p sequence, marked with an asterisk (*) in Figs 4 and 5.
Fig 4. Coiled coil phylogeny.
Maximum likelihood trees of the coiled coil clusters and C-termini were built on their 50% consensus sequences, with the amino acids encoded by CG-affected codons treated as missing data. Note the 10-fold lower scale of the branch lengths for the C-terminus tree. The colored circles at the tips of the coiled coil cluster tree correspond to the cluster colors in Fig 3, panels A-F. The number at each node gives its frequency as a percentage of 1000 bootstrap replicates. The ORF2 tree was generated from amino acid consensus sequences of the human version of our previously described collection of L1Pa2-L1Pa7 human/chimpanzee orthologues [51] and the currently active human L1Pa1 family (in particular, Ta1-d 5), represented here by the L1.3 element [52]. This tree is consistent with a previously described tree built from the 3’ 2 kb of nucleic acid sequence, which includes the 3’ UTR but consists mostly of ORF2 sequence (Figure 4A in [7]).
Fig 5. Coiled coil amino acid changes.
The 50% consensus sequence of cL1.7 is at the top of the alignment, and those of the other clusters are given below, arranged according to their position on the phylogenetic tree (Fig 4), with the ones sharing the same node bracketed. The number of sequences in each cluster is given in the right-hand column. Dots indicate amino acid identity, and letters indicate differences, capitalized upon their first appearance. The thin and heavy underlined positions of the cL1a.4 consensus indicate, respectively, the modern (111) residues that had already arisen in the coiled coil and those that first appeared here. The pink and blue columns highlight the emergence and replacement of the residues in heptads 8 and 9 that are negatively epistatic in the modern coiled coil (Fig 1). The red box shows the emergence of I77, which is negatively epistatic in the ancestral context. The sequence of 111, with its differences from the active 555 and m39b variants and the inactive m39 variant, is at the bottom of the alignment. S5 Fig shows the CG-less (_o) and CG-restored (rt) translation products of the coiled coil cluster consensus sequences. S6 Fig shows a LOGO plot of the CG-restored translation products, and S7–S11 Figs show various alignments of the coiled coil sequences that populate the cL1.3 (L1Pa3), L1Pa2 (cL1.2) and L1Pa1 (cL1.1) clusters (see Discussion).
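The two conventions used in the Fig 4 and Fig 5 legends, a 50% consensus sequence per cluster and an alignment that shows identities as dots, can be sketched as follows. This is a minimal illustration under stated assumptions: the legends do not say how ties or sub-threshold columns were handled, so the strict "greater than 50%" rule and the "X" placeholder here are assumptions, and the function names and toy peptides are hypothetical. Capitalization on first appearance is not implemented.

```python
from collections import Counter


def consensus(seqs: list[str], threshold: float = 0.5, unknown: str = "X") -> str:
    """Per-column majority residue, kept only if its frequency exceeds threshold."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    out = []
    for column in zip(*seqs):
        residue, count = Counter(column).most_common(1)[0]
        # Assumption: columns with no residue above the threshold become `unknown`.
        out.append(residue if count / len(seqs) > threshold else unknown)
    return "".join(out)


def dot_format(reference: str, other: str) -> str:
    """Show identities to the reference as dots and differences as letters."""
    if len(reference) != len(other):
        raise ValueError("sequences must be aligned to the same length")
    return "".join("." if r == o else o for r, o in zip(reference, other))


# Toy clusters (made up; not real coiled coil sequences).
cluster_a = ["LKEL", "LKEL", "LREL", "IKEL"]
cluster_b = ["LREI", "LREI", "LKEI"]

ref = consensus(cluster_a)
alt = consensus(cluster_b)
print(ref)                   # LKEL
print(alt)                   # LREI
print(dot_format(ref, alt))  # .R.I
```

Reading the dotted row against the reference row makes each substitution between cluster consensus sequences visible at a glance, which is how the Fig 5 alignment is meant to be read.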

References

    1. Boissinot S, Sookdeo A. The Evolution of Line-1 in Vertebrates. Genome biology and evolution. 2016:3485–507. doi: 10.1093/gbe/evw247
    2. IHGS-Consortium. Initial sequencing and analysis of the human genome. Nature. 2001;409(6822):860–921.
    3. Skowronski J, Fanning TG, Singer MF. Unit-length line-1 transcripts in human teratocarcinoma cells. Mol Cell Biol. 1988;8(4):1385–97. doi: 10.1128/mcb.8.4.1385
    4. Skowronski J, Singer MF. Expression of a cytoplasmic LINE-1 transcript is regulated in a human teratocarcinoma cell line. Proc Natl Acad Sci U S A. 1985;82(18):6050–4. doi: 10.1073/pnas.82.18.6050
    5. Boissinot S, Chevret P, Furano AV. L1 (LINE-1) retrotransposon evolution and amplification in recent human history. Mol Biol Evol. 2000;17(6):915–28. doi: 10.1093/oxfordjournals.molbev.a026372

Grants and funding

This work was funded by the Intramural Research Program of the National Institute of Diabetes, Digestive and Kidney Diseases, of the National Institutes of Health. The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.