Microbiol Spectr. 2021 Dec 22;9(3):e0225421.
doi: 10.1128/spectrum.02254-21. Epub 2021 Dec 15.

Genome-Wide Characterization of Zebrafish Endogenous Retroviruses Reveals Unexpected Diversity in Genetic Organizations and Functional Potentials

Jun Bai et al. Microbiol Spectr.

Abstract

Endogenous retroviruses (ERVs) occupy a substantial fraction of mammalian genomes. However, whether ERVs are widespread in ancient vertebrates remains unexplored. Here, we performed a genome-wide characterization of ERVs in a zebrafish (Danio rerio) model. Approximately 3,315 ERV-like elements (DrERVs) were identified and assigned to the Gypsy, Copia, Bel, and class I–III groups. DrERVs accounted for approximately 2.3% of the zebrafish genome and were distributed across all 25 chromosomes, with a marked bias toward chromosome 4. Gypsy and class I are the two most abundant groups, with earlier insertion times. The vast majority of the DrERVs carry various structural defects. A total of 509 gag and 71 env genes with coding potential were detected. The env-coding elements were well characterized and classified into four subgroups. An ERV-E4.8.43-DanRer element shows high similarity with HERV9NC-int in humans, and analogous sequences were detected in species spanning from fish to mammals. RNA-seq data showed that hundreds of DrERVs were expressed in embryos and tissues under physiological conditions, and most of them exhibited stage and tissue specificity. Additionally, 421 DrERVs showed strong responsiveness to virus infection. A unique group of DrERVs carrying immune-relevant genes, such as fga, ddx41, ftr35, igl1c3, and tbk1, instead of intrinsic viral genes was identified. These DrERVs are regulated by transcription factors that bind at the long terminal repeats (LTRs). This study provides a survey of the composition, phylogeny, and potential functions of ERVs in a fish model, advancing the understanding of the evolutionary history of ERVs from fish to mammals.

IMPORTANCE Endogenous retroviruses (ERVs) are relics of past infection that constitute up to 8% of the human genome. Understanding the genetic evolution of the ERV family and the interplay of ERVs and their encoded RNAs and proteins with host functions has become a new frontier in biology. Fish, as the most primitive vertebrate hosts of retroviruses, are indispensable to such investigations. In the present study, we report the genome-wide characterization of ERVs in zebrafish, an attractive model organism of ancient vertebrates, from multiple perspectives, including composition, genomic organization, chromosome distribution, classification, phylogeny, insertion time, characterization of gag and env genes, and expression profiles in embryos and tissues. The results help uncover the evolutionarily conserved and fish-specific ERVs, as well as the immune-relevant ERVs that respond to virus infection. This study demonstrates the previously unrecognized abundance, diversification, and extensive activity of ERVs at the early stage of ERV evolution.

Keywords: endogenous retrovirus; evolution; expression; structure; zebrafish.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

FIG 1
Structural characterization and genome distribution of DrERVs. (A) Structural elements and proportion statistics of DrERVs. An LTR shown in parentheses, (LTR), indicates that only a single LTR is present, at either end. (B) Length statistics of DrERVs. (C) Chromosome distribution of DrERVs. The expected ERV number was calculated by multiplying the chromosome length by the whole-genome average density of DrERVs. The detected ERV number is the number of DrERVs actually identified. (D) Distribution of PCGs, DrERVs, and NLR elements on chromosome 4. The dotted line indicates the boundary between the long arm and the short arm.
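The expected-count calculation described for panel (C) can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the whole-genome average density is the total number of elements divided by the summed chromosome lengths, and the function name `expected_erv_counts` and the toy chromosome lengths are hypothetical.

```python
def expected_erv_counts(chrom_lengths, total_ervs):
    """Expected ERV count per chromosome = chromosome length x genome-wide density.

    Assumption: density = total ERV count / summed chromosome length.
    """
    genome_length = sum(chrom_lengths.values())
    density = total_ervs / genome_length  # elements per base pair, genome-wide
    return {chrom: length * density for chrom, length in chrom_lengths.items()}


# Hypothetical toy lengths in base pairs, NOT real zebrafish chromosome sizes.
toy_lengths = {"chrA": 60_000_000, "chrB": 40_000_000}
expected = expected_erv_counts(toy_lengths, total_ervs=100)

for chrom, count in expected.items():
    print(f"{chrom}: expected {count:.1f} elements")
```

Comparing these expected values with the detected counts per chromosome is what would expose an enrichment such as the chromosome 4 bias reported in the abstract.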
FIG 2
Phylogenetic tree of DrERV-like elements. (A) Phylogenetic analysis of DrERVs based on the RT region. (B) Schematic diagram of the numbers and relationships between XRV-related DrERV groups. (C) Phylogenetic tree constructed using retrotrans_gag. (D) Phylogenetic tree constructed using the TLV_coat (HR1–HR2) domain. In (C) and (D), the DrERVs that have been classified as class I, class III, and Gypsy by RT are annotated with purple, orange, and dark blue boxes, respectively; XRVs or ERVs from other species are annotated with light blue boxes; the newly identified DrERVs are annotated with red boxes.
FIG 3
Prediction of insertion times of DrERVs. (A) The insertion times of DrERVs with different LTR lengths. (B) The insertion times of DrERVs with different structures. (C) The insertion times of DrERVs in different families.
FIG 4
Phylogeny and predicted conserved domains of class I–III gag genes without a retrotrans_gag domain. The left panel shows the phylogenetic tree constructed with gag sequences that lack a predictable retrotrans_gag domain. Eight viruses or ERVs were used as references according to the BLAST results of class I–III gag genes without a retrotrans_gag domain. The right panel shows the composition of the five predicted conserved domains in each element. The length ratio between the lines reflects the actual sequence length ratio, and the positions of domains reflect their relative positions in the unaligned sequences.
FIG 5
Classification and characteristics of env genes in DrERVs. (A) Phylogenetic tree of 71 env genes and diagrams of the ISD of four env groups. (B) Annotation of DrEnv1–4 protein groups, in which the representative elements are selectively shown. The ERV-E18.4b.4-DanRer of DrEnv4 does not follow this feature. (C) Schematic diagram of the structure of a representative Env protein located in the cell membrane. (D) Schematic diagram of the transcription and splicing of the env gene of ERV-E5.1.38-DanRer.
FIG 6
Comparative analysis of DrERVs and HERVs. (A) Chromosomal distribution of BLAST hits in the human genome and queries in the zebrafish genome. (B) Composition of annotated genes that overlapped with RT BLAST hits. (C) Hit number, score, and E-value of BLAST hits generated by different DrERV classes. (D) Phylogenetic tree constructed from BLAST hits of HERV9NC-int and ERV-E4.8.43-DanRer in nine species. BLAST hits generated by RT sequences in HERV9NC-int and ERV-E4.8.43-DanRer are annotated by H and Z, respectively. (E) Comparison of the similarity between HERV9NC-int and ERV-E18.3.2-DanRer.
FIG 7
Transcriptional expression analysis of DrERVs during embryogenesis and in seven tissues. (A) Expression of DrERVs in four embryo developmental stages. (B) Venn diagram of the overlapping DrERVs among four embryonic stages. (C) Expression of DrERVs in seven tissues. (D) Venn diagram of the overlapping DrERVs among five tissues. Head kidney and liver tissues were excluded because none of the DrERVs expressed in these two tissues overlapped with the other tissues. (E) GO enrichment analysis of the genes co-expressed with heart-specific DrERVs. (F) Venn diagram of the overlapping DrERVs among four embryonic stages and adult tissues. Data from seven tissues were combined as a data set.
FIG 8
Transcriptional expression analysis of DrERVs, IFN, and ISG under SVCV stimulation. (A) Upregulated expression of IFN and ISG genes in the spleen, head kidney, and gut tissues of zebrafish in response to SVCV infection (**, P < 0.01; ***, P < 0.001). (B) Transcriptional expression analysis of DrERVs in the spleen, head kidney, and gut tissues of zebrafish upon SVCV infection.
FIG 9
Potential regulatory elements and transcriptional expression analysis of DrERVs upon SVCV infection. (A) Schematic diagram of the potential TF-binding sites at the LTRs of DrERVs inserted with different functional genes that were actively expressed in response to SVCV stimulation. These DrERVs are designated virus-responsive ERVs (VREs), and their associated functional genes are named VRE-aid genes. The positional relationship between VREs and VRE-aid genes is shown. (B) Transcriptional expression analysis of two representative VRE-aid genes in head kidney and gut tissues upon SVCV infection. (C) Examination of the TF-binding activity of two representative TFs (IRF1 and RelA) at the LTRs of two VREs by ChIP–qPCR analysis. (D) Comparison of the sequence diversity of LTRs among different VRE types. (E) Schematic diagram of two typical noncoding VREs and their transcripts. The gray box represents the additional LTRs detected inside the DrERVs. (F) Expression analysis of noncoding VREs upon SVCV stimulation (*, P < 0.05; ***, P < 0.001; ****, P < 0.0001; ns, no significant difference).
