J Virol. 2011 Nov;85(22):12043-52.
doi: 10.1128/JVI.00867-11. Epub 2011 Aug 24.

Gypsy and the birth of the SCAN domain

Ryan O Emerson et al. J Virol. 2011 Nov.

Abstract

SCAN is a protein domain frequently found at the N termini of proteins encoded by mammalian tandem zinc finger (ZF) genes, whose structure is known to be similar to that of retroviral gag capsid domains and whose multimerization has been proposed as a model for retroviral assembly. We report that the SCAN domain is derived from the C-terminal portion of the gag capsid (CA) protein from the Gmr1-like family of Gypsy/Ty3-like retrotransposons. On the basis of sequence alignments and phylogenetic distributions, we show that the ancestral host SCAN domain (ESCAN for extended SCAN) was exapted from a full-length CA gene from a Gmr1-like retrotransposon at or near the root of the tetrapod animal branch. A truncated variant of ESCAN that corresponds to the annotated SCAN domain arose shortly thereafter and appears to be the only form extant in mammals. The Anolis lizard has a large number of tandem ZF genes with N-terminal ESCAN or SCAN domains. We predict DNA binding sites for all Anolis ESCAN-ZF and SCAN-ZF proteins and demonstrate several highly significant matches to Anolis Gmr1-like sequences, suggesting that at least some of these proteins target retroelements. SCAN is known to mediate protein dimerization, and the CA protein multimerizes to form the core retroviral and retrotransposon capsid structure. We speculate that the SCAN domain originally functioned to target host ZF proteins to retroelement capsids.

Figures

Fig. 1.
SCAN and Gmr1-like CA alignment. Alignment of SCAN domains from Homo sapiens with Gmr1-like capsid sequences from various species. The SCAN domain shows strong similarity to the C-terminal portion of current Gmr1-like CA sequences, and human SCAN domains form a clade with respect to Gmr1-like CA sequences. The four helices in the known structure of the SCAN domain (23) are indicated below the sequence alignment.
Fig. 2.
Outgrouped tree of SCAN and Gmr1-like CA. Phylogeny of SCAN domain matches (blue) and Gmr1-like element CA domains (green) chosen from a variety of species to represent the phylogenetic diversity of each. One Ty3/Gypsy element capsid sequence was chosen for its high score to the SCAN domain profile and is included as an outgroup and shown in red. Host genomic SCAN domains form a clade within the capsid sequences of Gmr1-like elements with excellent branch support (>0.99 by the approximate likelihood ratio test [aLRT] and 188/200 bootstraps). The tree was constructed using phyml 3.0 (17) and was visualized with Dendroscope (21). Notably, the Anolis carolinensis lizard is alone in having a large number of Gmr1-like elements and host SCAN domains. Sequences from the following species are shown: elephant, Loxodonta africana (lafr); frog, Xenopus tropicalis (xtro); pufferfish, Tetraodon nigroviridis (tnig); human, Homo sapiens (hsap); lamprey, Petromyzon marinus (pmar); lizard, Anolis carolinensis (acar); medaka, Oryzias latipes (olat); mouse, Mus musculus (mmus); opossum, Monodelphis domestica (mdom); platypus, Ornithorhynchus anatinus (oana); sea urchin, Strongylocentrotus purpuratus (spur); stickleback, Gasterosteus aculeatus (gacu); tunicate, Ciona intestinalis (cint); zebra finch, Taeniopygia guttata (tgut); zebrafish, Danio rerio (drer).
Fig. 3.
Phylogenetic model of SCAN-ZF evolution. Evolution of the ESCAN and SCAN domains and mammalian SCAN-KRAB-ZF genes from ancestral Gmr1-like elements. Gmr1-like elements are ancient, and their capsid domains were likely exapted as SCAN domains near the root of the amniote branch. Domains descended from Gmr1-like CA sequences were fused to existing KRAB-ZF genes and later trimmed to include only the C-terminal portion to form the SCAN domain. The subsequent loss of Gmr1-like elements in at least two lineages is apparent. Species designated as containing Gmr1-like elements or SCAN domains include many examples of each, except for zebra finch, which has a single identifiable SCAN domain. Clear Gmr1-like retrotransposons were also found in the genomes of a Hemichordate acorn worm (Saccoglossus kowalevskii), a mollusc (Lottia gigantea), an annelid (Helobdella robusta), a sponge (Amphimedon queenslandica), the Placozoan Trichoplax adhaerens, and the deer tick (Ixodes scapularis), but not in hydra (Hydra magnipapillata), honeybee (Apis mellifera), mosquito (Anopheles gambiae), Daphnia pulex, Drosophila melanogaster, Caenorhabditis elegans, a Choanoflagellate (Monosiga brevicollis), any fungi, or any vascular plants (fungi and plant genomes available on NCBI nr database on 5 September 2010). The elements found had both a high-scoring RSCAN domain and the characteristic INT-RT domain order. Since the sponge and probably Trichoplax are basal metazoans, many protostomes (Arthropoda, Nematoda, etc.) have probably lost an ancestral Gmr1-like element.
Fig. 4.
Model of SCAN-ZF derivation from Gmr1-like retrotransposons. A proposed model for the evolution of SCAN-KRAB-ZF and SCAN-ZF genes from ancestral KRAB-ZF and Gmr1-like sequences. Initial fusion of a Gmr1-like CA domain to a KRAB-ZF gene by splicing is followed by gradual degradation of the surrounding Gmr1-like sequence and truncation of the Gmr1-like CA to the shorter SCAN domain. The occasional subsequent loss of KRAB domain exons generates SCAN-ZF genes.
Fig. 5.
Distribution of SCAN-ZF/Gmr1-like binding relationships. (Top) Distribution of the number of predicted Gmr1-like targets (out of 88 total) for each of the 61 SCAN-ZF profiles with at least 1 predicted Gmr1-like target. Most SCAN-ZF profiles are predicted to bind only 1 Gmr1-like element, but 6 profiles are predicted to bind 10 or more Gmr1-like elements each. (Bottom) Distribution of the number of SCAN-ZF profiles predicted to bind to each of the 88 identified Anolis Gmr1-like elements. Of 88 Gmr1-like elements, 78 are predicted to be bound by at least one SCAN-ZF profile: 29 Gmr1-like elements have one predicted binding partner, and 49 Gmr1-like elements are matched by more than one predicted SCAN-ZF profile. All predicted binding relationships and associated statistics can be found in Data S6 in the supplemental material.
Fig. 6.
Predicted binding profile of PSKZ95. (Top) Predicted binding profile of PSKZ95 (anoCar1 scaffold_401:302793–304163). (Bottom) Consensus sequence of predicted primer binding site (PBS)-Leu (AAG/TAG) sites from Anolis carolinensis, the predicted binding target for PSKZ95. The predicted binding profile suggests that PSKZ95 specifically targets the PBS-Leu elements of many Gmr1-like elements. The logo and consensus representation were created with WebLogo (8).
Fig. 7.
Binding targets of predicted SZF genes. A schematic of the canonical structure of Gmr1-like elements in the Anolis lizard. Boundaries of the LTR sequences and protein domains within the Gmr1 ORF are approximate and correspond to protein domain matches or LTR_FINDER results (see Materials and Methods). The predicted binding targets of SCAN-ZF genes are marked with a red arrow and represent position with respect to the canonical domain structure. One red arrow is shown per binding target: PSKZ36 and PSKZ172 are both predicted to bind between the protease and integrase domains but not at exactly the same position and not in the same set of Gmr1-like elements, while PSKZ141 and PSKZ156 bind to the same sequence feature and often in the same Gmr1-like elements.
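
The two distributions described in the Fig. 5 caption can be read as degree histograms of a bipartite graph that links SCAN-ZF profiles to Gmr1-like elements. The sketch below is only an illustration of that bookkeeping, not the authors' analysis code: the pair list, the names profile_A and element_1, and the helper degree_histogram are hypothetical placeholders. The only figures taken from the caption are the counts 88, 29, 49, and 78.

```python
from collections import Counter

# Hypothetical predicted (profile, element) pairs. These names are
# placeholders for illustration only, not data from the paper.
predicted_pairs = [
    ("profile_A", "element_1"),
    ("profile_A", "element_2"),
    ("profile_B", "element_2"),
    ("profile_C", "element_3"),
]


def degree_histogram(pairs, side):
    """Histogram of partner counts for one side of the bipartite graph.

    side=0 counts distinct elements per profile (top-panel style);
    side=1 counts distinct profiles per element (bottom-panel style).
    """
    partners = {}
    for pair in set(pairs):  # drop duplicate predictions
        partners.setdefault(pair[side], set()).add(pair[1 - side])
    return Counter(len(p) for p in partners.values())


targets_per_profile = degree_histogram(predicted_pairs, side=0)
profiles_per_element = degree_histogram(predicted_pairs, side=1)

# Consistency check on the counts stated in the Fig. 5 caption.
total_elements = 88
one_partner = 29
several_partners = 49
bound_elements = one_partner + several_partners  # caption states 78
unbound_elements = total_elements - bound_elements
```

With the caption's counts, the elements having one or several predicted partners sum to the stated 78, which leaves 10 of the 88 Gmr1-like elements with no predicted SCAN-ZF partner.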

References

    1. Allouch A., et al. 2011. The TRIM family protein KAP1 inhibits HIV-1 integration. Cell Host Microbe 9: 484–495
    2. Anisimova M., Gascuel O. 2006. Approximate likelihood-ratio test for branches: a fast, accurate and powerful alternative. Syst. Biol. 55: 539–552
    3. Barde I., et al. 2009. Regulation of episomal gene expression by KRAB/KAP1-mediated histone modifications. J. Virol. 83: 5574–5580
    4. Bellefroid E. J., et al. 1993. Clustered organization of homologous KRAB zinc-finger genes with enhanced expression in human T lymphoid cells. EMBO J. 12: 1363–1374
    5. Bellefroid E. J., et al. 1991. The evolutionarily conserved Krüppel-associated box domain defined a subfamily of eukaryotic multifingered proteins. Proc. Natl. Acad. Sci. U. S. A. 88: 3608–3612
