mSystems. 2024 Oct 22;9(10):e0067324.
doi: 10.1128/msystems.00673-24. Epub 2024 Sep 16.

HCMV detection in Asian gastric cancer RNA-seq data sets and clinical validation in Indian GC patients reveals the HCMV-GC specific gene signatures

Pandikannan Krishnamoorthy et al. mSystems.

Abstract

Gastric cancer (GC) prevalence is very high in the Asian population. Oncogenic viruses play a crucial role in the progression of different types of cancers. Through reanalysis of clinical RNA-seq data sets derived from Asian GC patients, this study identified the presence of human cytomegalovirus (HCMV) in Asian GC tumors, next to the well-studied association of EBV. Clinical recruitment of the Indian GC cohort and screening for HCMV presence identified a 14.28% occurrence, similar to that observed in the bioinformatics analysis. A combinatorial approach of rank-based meta-analysis and ranking of groups based on an expectation-maximization algorithm identified that the upregulated LINC02864 and MAGEA10 correlated with poor survival of GC patients and downregulated tumor suppressor genes enriching for gastric acid secretion pathway to be associated with HCMV-positive GC patients, revealing the progressive role of HCMV infection in GC. Genes that discriminate between different stages of GC were identified through feature selection implemented in a machine-learning approach. LTF and KLK10 expressions were found to be specifically dysregulated by HCMV and can also indicate the GC stages. The results of this study will guide future studies to identify the functional role of these genes in the HCMV-associated GC.

IMPORTANCE: Nearly 75% of gastric cancer (GC) cases reported globally are from the Asian population. Most existing public databases, such as TCGA, comprise only a fractional portion of data derived from Asian ancestry. This study identified EBV and human cytomegalovirus (HCMV)'s higher detection in GC patients. The presence and role of EBV associated with GC are well-known, and the observation of HCMV prompted us to validate the findings in a small cohort of 40 Indian GC patients. We observed a 14.28% occurrence of HCMV in the Indian cohort, similar to that observed from next-generation sequencing. A combinatorial approach of rank-based meta-analysis and ranking of groups based on an expectation-maximization algorithm identified that the upregulated LINC02864 and MAGEA10 correlated with poor survival of GC patients and downregulated tumor suppressor genes enriching for gastric acid secretion pathway to be associated with HCMV-positive GC patients, revealing the progressive role of HCMV infection in GC.

Keywords: HCMV; RNA-seq; clinical screening; gastric cancer; meta-analysis.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig 1
Workflow for identifying HCMV detection and associated gene signature in Asian GC patients. (A) Schematic of the pipeline used to quantify the viral reads from the RNA-seq data sets. Raw fastq files were quality checked, adapter trimmed, and aligned to the human genome. The mapped reads were used to quantify the host gene expression, while the unmapped reads were aligned to 762 virus genomes to quantify the viral reads present in each sample using the VIRTUS2 pipeline. Tumor samples from 42 Indian GC patients were subjected to nested PCR to identify the presence of HCMV. Rank-based meta-analysis identifies genes associated with HCMVPosGC, and gene ranking based on posterior probability identifies HCMV_GC specific genes. The final part involves the identification of genes associated with HCMV_GC and different stages of GC progression.
Fig 2
Screening of viruses from high-throughput sequencing data sets derived from Asian GC patients. (A) Venn diagram represents the top viruses that are detected in more than 10 samples and 10 reads in the data sets GSE184336, GSE113255, and GSE122401. EBV and HCMV are the common viruses detected in all three data sets, while HPIV5 and HPV71 are detected only in GSE184336 and GSE113255, respectively. The mean V/H ratio (reads mapped to the viral genome divided by reads mapped to the human genome) across all samples for the top viruses is represented as bar plots for the data sets (B) GSE184336, (C) GSE113255, and (D) GSE122401. Heatmap denoting the viruses with a minimum of 100 reads in at least one sample and a total of 1,000 reads across all the samples in the data sets (E) GSE184336, (F) GSE113255, and (G) GSE122401. HPIV5, human parainfluenza virus 5; HCMV, human cytomegalovirus; EBV-4_wt, Epstein-Barr virus wild-type genome; HHV-4_com, human herpes virus-4 complete genome; and MLV, murine leukemia virus.
Fig 3
Screening of HCMV in Indian GC patients using nested PCR. (A) GC tumor tissues were collected from 42 GC patients from Mizoram, India, where GC is highly prevalent. (B) To avoid false-positive detection, nested PCR targeting the immediate early gene of HCMV was performed. (C) Nested PCR identified the presence of HCMV in the clinical samples visualized by agarose gel electrophoresis. (D) Sanger sequencing was performed to confirm the presence of HCMV in the clinical samples.
Fig 4
Significant genes enriched between HCMVPosGC and HCMVNegGC in the GSE184336 data set. (A) Differential expression analysis between HCMVPosGC (samples with more than 10 HCMV reads) and HCMVNegGC (samples with no reads of EBV and HCMV) was performed. PCA plot depicting the segregation of the groups. (B) Significantly upregulated and downregulated genes were visualized through a volcano plot. Over-representation analysis of Reactome pathways for (C) upregulated and (D) downregulated genes was represented in dot plots. (E) Significant pathways dysregulated and the log-fold change of associated genes were visualized through the circos plot. (F) The relation between the top genes and pathways was represented through the chord plot.
Fig 5
Rank-based meta-analysis identified the robust genes associated with HCMV-positive GC. (A) Differential expression analysis and over-representation analysis of the upregulated and (B) downregulated genes in all three data sets were visualized through the dot plot. (C) Heatmap representing the logFC of the robust genes in each data set. (D) Dotplot depicting the pathways enriched for the top robust genes identified through robust rank aggregation. Expression of top robust upregulated genes (E) LINC02864 and (F) MAGEA10 in the Indian GC cohort using RT-PCR. (G and H) Survival analysis identified that overexpression of these genes is associated with poor survival outcomes.
Fig 6
HCMV_GC specific signature identified by a ranking algorithm based on posterior probability. (A) Schematic of the approach implemented to identify the genes that are specifically dysregulated in the HCMV_GC group (minimum of 10 HCMV reads and no EBV reads) compared to HCMV_EBV_GC (presence of HCMV and EBV cumulative reads with the minimum sum of 10 reads) and EBV_GC groups. (B) Venn diagram depicting the number of samples in each group. (C) The PCA plot depicts the segregation of three groups. (D) The genes ranked and considered significant in each class are visualized, setting the posterior probability threshold to be more than 0.95. (E) Gene correlation and interaction network of the top 50 genes in the HCMV_GC group and (F) the HCMV_EBV_GC group identifies the gene clusters associated with the group.
Fig 7
Gastric acid secretion pathway genes are specific for HCMV-specific GC. (A) Comparison of the genes identified through rank-based meta-analysis and HCMV_GC specific ranked genes above posterior probability against EBV_GC and HCMV_EBV_GC groups identified gastric acid secretion pathway genes such as ATP4A, PGA5, ATP4B, and PGA3 to be significantly dysregulated by HCMV. (B) While LINC02864 is specifically dysregulated in HCMV_EBV_GC group, (C) ATP4A, (D) PGA5, (E) ATP4B, and (F) PGA3 were specifically downregulated in HCMV_GC group.
Fig 8
GC stage-wise discriminatory features identify genes associated with HCMV and late stage of GC. (A) The GSE184336 data set was segregated into groups with four different stages of GC as provided in metadata and was processed to identify the best set of features that can discriminate different stages of GC. (B) Feature selection is performed based on RRelief importance scores, and the top 50 genes are plotted. (C) The MDS plot depicts the segregation of samples based on the expression of the 160 genes identified as significant discriminatory features. (D) Venn diagram depicting the identified two genes common between ranked HCMV_GC and stage discriminators. (E) LTF expression was specifically downregulated, while (F) KLK10 expression was specifically upregulated in the HCMV_GC group. (G) The single-cell RNA-seq reanalysis of GSE134520 depicts that (H) LTF and (I) KLK10 expression is predominantly found in GC patients' stromal cells.
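
The read-count rules quoted in the captions of Fig 2, Fig 4, and Fig 6 can be illustrated with a minimal Python sketch. This is not the authors' code: the function names and default parameters are hypothetical, the example counts in the usage note are invented, and the threshold used for the EBV_GC group is an assumption because the captions do not define it.

```python
from typing import Sequence


def vh_ratio(viral_reads: int, human_reads: int) -> float:
    """V/H ratio (Fig 2): reads mapped to a viral genome divided by
    reads mapped to the human genome."""
    if human_reads <= 0:
        raise ValueError("human_reads must be positive")
    return viral_reads / human_reads


def passes_heatmap_filter(reads_per_sample: Sequence[int],
                          min_single: int = 100,
                          min_total: int = 1000) -> bool:
    """Heatmap rule (Fig 2E-G): keep a virus with at least `min_single`
    reads in at least one sample and at least `min_total` reads in total."""
    return (max(reads_per_sample) >= min_single
            and sum(reads_per_sample) >= min_total)


def assign_group(hcmv_reads: int, ebv_reads: int, min_reads: int = 10) -> str:
    """Assign a sample to a group following the Fig 4 and Fig 6 captions."""
    # HCMV_GC: minimum of 10 HCMV reads and no EBV reads (Fig 6).
    if hcmv_reads >= min_reads and ebv_reads == 0:
        return "HCMV_GC"
    # HCMV_EBV_GC: both viruses present, cumulative reads of at least 10 (Fig 6).
    if hcmv_reads > 0 and ebv_reads > 0 and hcmv_reads + ebv_reads >= min_reads:
        return "HCMV_EBV_GC"
    # EBV_GC: ASSUMPTION, mirrored from the HCMV_GC rule; not stated in the captions.
    if ebv_reads >= min_reads and hcmv_reads == 0:
        return "EBV_GC"
    # HCMVNegGC: no reads of EBV and HCMV (Fig 4).
    if hcmv_reads == 0 and ebv_reads == 0:
        return "HCMVNegGC"
    # Samples below every threshold are left out of the comparison in this sketch.
    return "unassigned"
```

For example, with invented counts, `assign_group(25, 0)` gives `"HCMV_GC"` and `assign_group(6, 8)` gives `"HCMV_EBV_GC"`. The Fig 4 caption uses "more than 10" HCMV reads for HCMVPosGC, while Fig 6 uses a minimum of 10, so the boundary case of exactly 10 reads depends on which definition is applied.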
