Viruses. 2021 Jan 20;13(2):148.
doi: 10.3390/v13020148.

The Kaumoebavirus LCC10 Genome Reveals a Unique Gene Strand Bias among "Extended Asfarviridae"

Khalil Geballa-Koukoulas et al. Viruses.

Abstract

Kaumoebavirus infects the amoeba Vermamoeba vermiformis and has recently been described as a distant relative of the African swine fever virus. To characterize the diversity and evolution of this novel viral genus, we report here on the isolation and genome sequencing of a second strain of Kaumoebavirus, namely LCC10. Detailed analysis of the sequencing data suggested that its 362-kb genome is linear with covalently closed hairpin termini, so that the DNA forms a single continuous polynucleotide chain. Comparative genomic analysis indicated that although the two sequenced Kaumoebavirus strains share extensive gene collinearity, 180 predicted genes were either gained or lost in only one genome. As already observed in another distant relative, i.e., Faustovirus, which infects the same host, the center and extremities of the Kaumoebavirus genome exhibited a higher rate of sequence divergence and the major capsid protein gene was colonized by type-I introns. A possible role of the Vermamoeba host in the genesis of these evolutionary traits is hypothesized. The Kaumoebavirus genome exhibited a significant gene strand bias over two-thirds of the genome length, a feature not seen in the other members of the "extended Asfarviridae" clade. We suggest that this gene strand bias was induced by a putative single origin of DNA replication located near the genome extremity that imparted a selective force favoring the genes positioned on the leading strand.

Keywords: Kaumoebavirus; Nucleocytoviricota; Vermamoeba vermiformis; extended Asfarviridae; gene strand bias; nucleo-cytoplasmic large DNA virus.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Electron microscopy observation of negatively stained Kaumoebavirus LCC10.
Figure 2
Sequence structures at the ends of the KV-LCC10 chromosome. Schematic representation of the linear KV-LCC10 genome with covalently closed, incompletely base-paired hairpin termini flanked by TIRs. The 102 bp hairpin sequences and TIRs on each side of the genome are inverted and complementary. The minimum free energy (dG) fold of the two versions of the hairpin loop (flip/flop) was inferred using the MFOLD program. Please note that the positioning of the flip/flop versions relative to the left/right sides of the genome could not be determined using the current data and is therefore represented here arbitrarily.
Figure 3
Comparative genomics of KV-LCC10. (A) Taxonomic classification of the KV-LCC10 protein best matches in the TrEMBL database. Best matches were identified with MMSEQS and an E-value threshold of 1E-05. Matches to KV-Sc proteins were not included. (B) Phylogenetic tree of the DNA polymerase proteins. The multiple alignment was done with MAFFT and the phylogenetic tree was reconstructed with FastTree using default parameters. Branch supports indicated beside internal nodes were obtained using the SH-aLRT method as implemented in FastTree. The scale bar indicates the number of substitutions per amino acid site. (C) Protein families shared between the two KV strains. Numbers outside and inside brackets indicate the number of protein families involved in the category and the cumulative number of proteins in those families, respectively. For families shared between KV-LCC10 and KV-Sc, 423 is the number of proteins for KV-LCC10 and 371 is the number of proteins for KV-Sc. (D) Protein sequence similarity between reciprocal best matches of the two KV strains. The blue distribution represents proteins that have a significant match in the TrEMBL database (with exclusion of KV entries); the green distribution represents proteins with no significant matches in TrEMBL, which are therefore considered hypothetical proteins.
Figure 4
Type-1 introns in KV genes. Schematic representation of the MCP and RNAP genes in the two KV strains (KV-LCC10 above, KV-Sc below). Exons are represented by blue rectangles connected by blue lines showing splicing sites. Nucleotide sequence similarity is indicated by grey shaded areas. Homing endonuclease ORFs are shown by red rectangles, with numbers referring to the gene locus id in the respective GenBank records.
Figure 5
Genome conservation and collinearity between KV strains. Top panel: Levels of amino-acid similarity between the LCC10 and Sc genomes as measured by the tBLASTx program. The genomic coordinates refer to the LCC10 strain. Bottom panel: gene collinearity between the LCC10 (top) and Sc (bottom) genomes. Open and purple-filled boxes represent single-copy genes and genes engaged in a multi-gene family, respectively. Green asterisks indicate genes that are unique to the considered KV strain (i.e., without a BLASTP match with E-value < 1 × 10⁻⁵ against the other KV strain), whereas purple asterisks show strain-specific genes that are also engaged in a multi-gene family. Colored links associate the putative LCC10 and Sc orthologous genes identified by the reciprocal best BLAST hit criterion. The coloring of the links represents the level of amino-acid similarity between orthologous proteins according to the color scale given. The horizontal red lines mark the position of the MCP genes.
Figure 6
Cumulative CDS skew of the KV-LCC10 genome. The top panel shows the cumulative CDS skew curve along the genome. The horizontal dashed line marks the position of the minimum value of the cumulative skew, near coordinate 82 kb. The bottom panel shows the distribution of the coding sequences (in red) on the forward (F; upper lane) and reverse (R; lower lane) strands.
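
As an illustration of the kind of curve described for Figure 6, the following Python sketch computes a cumulative CDS skew from a list of gene coordinates. It is a minimal example under stated assumptions, not the authors' procedure: the skew definition used here (each forward-strand gene adds its length, each reverse-strand gene subtracts it), the function names, and the toy coordinates are all hypothetical and do not come from the KV-LCC10 annotation. The position of the curve's minimum is the kind of landmark that the caption places near 82 kb.

```python
from itertools import accumulate


def cumulative_cds_skew(genes):
    """Return [(gene_end, cumulative_skew), ...] for genes sorted by start.

    genes: iterable of (start, end, strand), 1-based inclusive coordinates,
    strand '+' (forward) or '-' (reverse).
    Assumed definition: +length for forward genes, -length for reverse genes.
    """
    ordered = sorted(genes, key=lambda g: g[0])
    signed_lengths = [
        (end - start + 1) * (1 if strand == "+" else -1)
        for start, end, strand in ordered
    ]
    running_totals = accumulate(signed_lengths)
    return [(gene[1], total) for gene, total in zip(ordered, running_totals)]


def skew_minimum(curve):
    """Return the (position, value) point where the cumulative skew is lowest."""
    return min(curve, key=lambda point: point[1])


# Toy coordinates (hypothetical): two reverse-strand genes, then two forward.
toy_genes = [(1, 300, "-"), (400, 900, "-"), (1000, 1600, "+"), (1700, 2600, "+")]
curve = cumulative_cds_skew(toy_genes)
print(curve)
print(skew_minimum(curve))
```

In this toy case the curve falls while reverse-strand genes dominate and rises once forward-strand genes take over, so the minimum marks the switch in strand preference.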
