Sci Rep. 2022 Feb 14;12(1):2420.
doi: 10.1038/s41598-022-06046-5.

The low abundance of CpG in the SARS-CoV-2 genome is not an evolutionarily signature of ZAP

Ali Afrasiabi et al. Sci Rep. 2022.

Abstract

The zinc finger antiviral protein (ZAP) is known to restrict viral replication by binding to the CpG-rich regions of viral RNA and subsequently inducing viral RNA degradation. ZAP has recently been shown to be capable of restricting SARS-CoV-2. These data have led to the hypothesis that the low abundance of CpG in the SARS-CoV-2 genome is due to evolutionary pressure exerted by the host ZAP. To investigate this hypothesis, we performed a detailed analysis of many coronavirus sequences and ZAP RNA binding preference data. Our analyses showed neither evidence for an evolutionary pressure acting specifically on CpG dinucleotides, nor a link between the activity of ZAP and the low CpG abundance of the SARS-CoV-2 genome.

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
PCA of viral motif representations. The D-values of all dinucleotide, trinucleotide and tetranucleotide motifs in all viral sequences form a matrix, which is used as the input for PCA. (A) The PC1-PC2 plot shows four clusters, one for each virus group: H1N1, coronaviruses (CoV), HBV, and HIV-1. (B) The PC1-PC2 plot separates coronaviruses into two clusters. All 664 SARS-CoV-2 sequences (including the reference sequence), Bat-RaTG13, RmYN02, 4 Bat-Coronaviridae viruses (MW703458, MW251308, MG772933, MG772934), and 10 Pangolin-Coronaviridae viruses (EPI_ISL_410721, EPI_ISL_410539, EPI_ISL_410542, EPI_ISL_410543, EPI_ISL_410538, EPI_ISL_410541, EPI_ISL_410540, MT040336.1, MT040333.1, MT121216.1) formed one cluster (the SARS-CoV-2-like group), which is separated from the rest of the coronavirus sequences, including Human coronavirus 229E, Bat-Coronaviridae, Human coronavirus HKU1, Murine coronavirus, MERS coronavirus, Human coronavirus NL63, Human coronavirus OC43, Primates-Coronaviridae, SARS coronavirus Tor2, SARS coronavirus Ubani, Viverrids-Coronaviridae, SARS coronavirus wtic_MB, and SARS coronavirus GZ02. SARS-CoV-2-like sequences are highlighted with a square.
Figure 2
Comparison of dinucleotide motif representations between the SARS-CoV-2-like and SARS-CoV groups. D-values of each dinucleotide were compared between the two viral groups, SARS-CoV-2-like and SARS-CoV. The Mann–Whitney test was used to examine the difference in the median D-value between the two coronavirus groups. The D-value (motif representation) is defined as the ratio of the observed frequency (Pobs) of a motif to its expected frequency (Pexp). Pobs is simply the observed relative frequency of the motif. Pexp is quantified using the frequency of the motif in the sequence and the frequencies of the smaller constituting motifs.
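
The D-value definition in the Figure 2 caption can be illustrated for the dinucleotide case. The sketch below is not the authors' code: it assumes that, for a dinucleotide, Pexp is the product of the two single-nucleotide frequencies, which is one common reading of "the frequencies of the smaller constituting motifs". The function name `dinucleotide_d_value` is hypothetical.

```python
from collections import Counter


def dinucleotide_d_value(seq: str, motif: str = "CG") -> float:
    """Return Pobs / Pexp for a dinucleotide motif (hypothetical helper).

    Assumption: Pexp is the product of the single-nucleotide frequencies
    of the two bases that make up the motif.
    """
    seq = seq.upper()
    if len(motif) != 2 or len(seq) < 2:
        raise ValueError("need a 2-letter motif and a sequence of length >= 2")
    motif = motif.upper()
    n = len(seq)
    base_counts = Counter(seq)

    # Observed relative frequency over all overlapping dinucleotide positions.
    hits = sum(1 for i in range(n - 1) if seq[i:i + 2] == motif)
    p_obs = hits / (n - 1)

    # Expected frequency if the two bases occurred independently.
    p_exp = (base_counts[motif[0]] / n) * (base_counts[motif[1]] / n)
    if p_exp == 0:
        return float("nan")  # motif cannot occur: ratio is undefined
    return p_obs / p_exp
```

Under this assumption, a D-value below 1 indicates that the motif is under-represented relative to base composition, and a value above 1 indicates over-representation.
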
Figure 3
Co-location of ZAP binding regions and CpG motifs. Overlay of the ZAP binding peaks and CpG densities in (A) the Japanese encephalitis virus (JEV) genome and (B) HIV-1. The ZAP binding peaks (density of reads aligned to the genome) are estimated using a 250 bp sliding window moving by 1 bp along the viral genomes. The CpG density was calculated using the same sliding window method, except that we used a 200 bp window sliding by 1 bp in JEV and a 250 bp window sliding by 1 bp in HIV-1. ZAP binding peaks and CpG densities are shown in green and red, respectively. # Location of CpGs. * Number of CpGs per 1 Kb.
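
The sliding-window CpG density described in the Figure 3 caption could be computed roughly as follows. This is a minimal sketch, not the authors' pipeline: the function name `cpg_density` is hypothetical, and scaling each window count to CpGs per 1 kb is an assumption based on the caption's footnote.

```python
def cpg_density(seq: str, window: int = 250, step: int = 1) -> list[float]:
    """CpG count per window, scaled to CpGs per 1 kb (hypothetical helper)."""
    if window < 2 or step < 1:
        raise ValueError("window must be >= 2 and step must be >= 1")
    seq = seq.upper()

    # prefix[i] = number of CpG dinucleotides starting at positions < i.
    prefix = [0]
    for i in range(len(seq) - 1):
        prefix.append(prefix[-1] + (1 if seq[i:i + 2] == "CG" else 0))

    densities = []
    for start in range(0, len(seq) - window + 1, step):
        # A CpG lies fully inside the window if it starts at
        # start .. start + window - 2.
        count = prefix[start + window - 1] - prefix[start]
        densities.append(count * 1000 / window)  # assumed per-kb scaling
    return densities
```

With `window=200` this corresponds to the JEV setting in the caption and with `window=250` to the HIV-1 setting. The prefix sum keeps the cost linear in genome length even with a 1 bp step.
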
Figure 4
Comparison of the abundance of the ZAP optimal binding motif C(n7)G(n)CG with the control motif C(n7)C(n)CG in viruses of the SARS-CoV-2-like group. The abundance of the ZAP optimal binding motif C(n7)G(n)CG was compared with that of C(n7)C(n)CG, which serves as the control, across the SARS-CoV-2-like group. The Mann–Whitney test was used to determine the difference in median abundance between these two motifs.
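
Counting the two motifs named in the Figure 4 caption can be sketched with regular expressions. The interpretation here is an assumption: `(n7)` is read as exactly seven arbitrary nucleotides and `(n)` as exactly one, and overlapping occurrences are counted. The names `ZAP_MOTIF`, `CONTROL_MOTIF` and `count_motif` are hypothetical and not taken from the paper.

```python
import re

# Assumed reading: C, 7 arbitrary bases, G (or C for the control), 1 base, CG.
# The zero-width lookahead lets finditer report overlapping occurrences.
ZAP_MOTIF = re.compile(r"(?=C[ACGTU]{7}G[ACGTU]CG)")
CONTROL_MOTIF = re.compile(r"(?=C[ACGTU]{7}C[ACGTU]CG)")


def count_motif(seq: str, pattern: re.Pattern) -> int:
    """Count (possibly overlapping) occurrences of a motif in a sequence."""
    return sum(1 for _ in pattern.finditer(seq.upper()))
```

Applying `count_motif` to each genome in a group with both patterns gives two samples of per-genome counts, which could then be compared with a Mann–Whitney test as the caption describes.
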
