Detection of DNA viruses in prostate cancer

Smelov, Vitaly; Bzhalava, Davit; Arroyo Mühr, Laila Sara; Eklund, Carina; Komyakov, Boris; Gorelov, Andrey; Dillner, Joakim; Hultin, Emilie

doi:10.1038/srep25235

Article
Open access
Published: 28 April 2016

Scientific Reports volume 6, Article number: 25235 (2016)

A Corrigendum to this article was published on 31 May 2016

This article has been updated

Abstract

We tested prostatic secretions from men with and without prostate cancer (13 cases and 13 matched controls) or prostatitis (18 cases and 18 matched controls) with metagenomic sequencing. A large number (>200) of viral reads was only detected among four prostate cancer cases (1 patient each positive for Merkel cell polyomavirus, JC polyomavirus and Human Papillomavirus types 89 or 40, respectively). Lower numbers of reads from a large variety of viruses were detected in all patient groups. Our knowledge of the biology of the prostate may be furthered by the fact that DNA viruses are commonly shed from the prostate and can be readily detected by metagenomic sequencing of expressed prostate secretions.

Introduction

Among risk factors proposed for prostate cancer^1,2 are chronic inflammation of the prostate^3,4,5 and sexually transmitted infections^{6,7,8,9,10,11}. Expressed prostate secretions (EPS) are readily obtained and recommended samples for diagnosis of prostatic inflammation^12,13. EPS are also informative for viral epidemiological studies^14,15. Metagenomic sequencing (MGS) determines non-human sequences present in a sample without prior cloning and is fundamental for modern infectious disease epidemiology^16,17,18,19. We previously reported that JC polyomavirus was readily detectable in EPS using MGS²⁰. However, the MGS technology used at that time had a limited sequencing depth and some samples used also contained urine. To obtain a more complete picture of the viruses shed from the prostate, we wished to analyze pure EPS samples using modern MGS.

Results

Of the total 927 million reads, 13974 reads (0.0015%) mapped to viruses. Odds ratios (ORs) and 95% confidence intervals (CIs), computed based on the Wald approximation, did not reveal any statistically significant differences between patient groups regarding presence of viral sequences (Tables 1 and 2).

Table 1 Viruses found in prostate cancer cases and controls.

Full size table

Table 2 Viruses in Prostatitis cases and controls.

Full size table

Viral reads were detected in 7/13 cancer cases (2 to 7938 reads) and in 4/13 cancer controls (2 to 96 reads). There were 13654 viral reads in cancers, compared to only 170 reads in controls (Table 1). The study design considered individual patients and post hoc analyses of whether total number of reads differed between cases and controls were therefore not performed. There were only a few viral reads in 5/18 prostatitis cases (2 to 14 reads) and in 4/18 of their matched controls (4 to 46 reads) (Table 2).
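As an illustration of the odds ratio and Wald confidence interval mentioned in the Results, the following minimal Python sketch uses the detection counts reported for the cancer group (7/13 cases and 4/13 controls positive for any viral read). It treats the data as an unmatched 2 × 2 table, which is a simplifying assumption made here. The function name is hypothetical, and the printed values are not taken from Table 1.

```python
import math


def wald_odds_ratio(pos_cases, n_cases, pos_controls, n_controls, z=1.96):
    """Return (OR, lower, upper) using the Wald approximation on the log odds ratio.

    Assumes an unmatched 2 x 2 table; z = 1.96 gives an approximate 95% interval.
    """
    a = pos_cases                   # virus-positive cases
    b = n_cases - pos_cases         # virus-negative cases
    c = pos_controls                # virus-positive controls
    d = n_controls - pos_controls   # virus-negative controls
    if min(a, b, c, d) == 0:
        raise ValueError("Wald interval is undefined when a cell count is zero")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    return (
        odds_ratio,
        math.exp(log_or - z * se_log_or),
        math.exp(log_or + z * se_log_or),
    )


or_, lower, upper = wald_odds_ratio(7, 13, 4, 13)
print(f"OR = {or_:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

With these counts the interval includes 1, which is consistent with the absence of a statistically significant difference between cancer cases and controls.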

A patient with 7938 reads of Merkel cell polyomavirus generated most of the viral reads, followed by a patient with 4478 reads of HPV type 89. The only other subjects with >200 viral reads were one cancer case with 286 reads of JC polyomavirus and a fourth cancer case with 222 reads of HPV type 40. JC polyomavirus was also detected in one additional cancer case (12 reads). Interestingly, the cancer controls had far fewer viral reads. One control was positive for both Epstein Barr virus (EBV) (88 reads) and JC polyomavirus (8 reads), another was positive for only EBV (14 reads) and a third was positive for HPV43 (40 reads).

Discussion

We report that many prostate cancer patients shed virus DNA and that there were 4 subjects that had one clearly dominant viral species. Strengths of the study include novelty regarding use of modern, deep MGS with an average of 943 million reads per sample, at a 150 bp read length. This corresponds to a sequencing depth of the human genome of >40 times. The Illumina specification of the system promises only a 30 times sequencing depth of the human genome, suggesting that a virus needs to be present in a proportion of about 1 virus copy per 30 human genomes. Also, the fact that we studied a prostatic sample (EPS) that can be readily obtained also for large-scale epidemiological studies is a strength.
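The depth figure quoted above follows from simple arithmetic: number of reads times read length, divided by genome size. The sketch below reproduces that calculation. The human genome size of about 3.1 Gb is an assumption introduced here and is not stated in the article, and the function name is hypothetical.

```python
HUMAN_GENOME_BP = 3.1e9  # assumed approximate haploid human genome size


def sequencing_depth(n_reads, read_length_bp, genome_size_bp=HUMAN_GENOME_BP):
    """Average fold coverage if all sequenced bases were spread evenly over the genome."""
    return n_reads * read_length_bp / genome_size_bp


# Read count and read length as quoted in the Discussion.
depth = sequencing_depth(943e6, 150)
print(f"Approximate depth: {depth:.1f}x")
```

Under these assumptions the result is above the 40 times stated in the text.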

Limitations include the limited number of observations. In particular, the higher number of viral reads that we found in cancer cases was based on a small number of subjects and may thus be attributable to chance. The fact that MGS provides the exact sequence of the virus enables the study of viral subtypes in cases and controls, but the present study had too few positive observations for a meaningful analysis of subtypes. Also, the study was based on subjects that already had the disease. Thus, it is possible that presence of disease may cause presence of viruses rather than the opposite. For example, malignancy may have caused changes that are beneficial for viral replication. Also, since the control subjects also had PSA levels >4 ng/ml, there is a possibility that some of them are false-negative for prostate cancer even though the 18 core biopsy protocol used did not find any cancer²⁶. Larger studies, preferably prospective studies, would be required to elucidate whether viral infections are involved in prostatic disease.

Most previous studies have only studied one or a few infections at a time and/or have used samples (e.g. prostatic biopsies) that are difficult to obtain on a large scale and in an identical manner from cases and controls^{6,8,9,10,11,27}. Although many studies have used MGS for comprehensive detection of viruses in human specimens such as skin samples^16,18,19, serum¹⁷ or cervical cells²⁵, as far as we know only a single previous study, from our lab, has used MGS on prostatic specimens²⁰. That study was performed using a now outdated MGS technology²⁰, but is consistent with our results, for example regarding frequent detection of JC virus in EPS.

In summary, the knowledge that DNA viruses are commonly shed from the prostate and can be readily detected by MGS of EPS may further the possibilities to study the biology of the prostate.

Materials and Methods

Study design and patients

Thirteen men with prostate cancer were diagnosed among 100 visitors of an oncological dispensary. A standard TRUS-guided 18-core prostate biopsy with subsequent histopathological analysis was carried out in those with abnormal (>4.0 ng/ml) serum prostate-specific antigen (PSA). All cases included were newly diagnosed prostate cancer patients who had not received treatment and where diagnosis was based on the histopathological findings. Out of 100 men, 13 age-matched control patients histologically found to have non-malignant disorders of the prostate were included. The median age of the controls was 68 years (range 42–78), whereas the cases had a mean age of 70 years (range 55–79). EPS specimens were obtained 1–2 days before prostate biopsy. From 900 attendees of a urology unit at a genitourinary clinic, 18 patients with chronic (persisting for >3 months) inflammation of the prostate, without any microbiological agent detectable by standard methods, were age-matched with 18 healthy subjects from the same cohort^15,20. The median age of the controls was 32 years, of the prostatitis cases 36 years. The prostatitis controls and cases were highly sexually active (average of 15–20 lifetime sexual partners), whereas the prostate cancer cases and controls all reported having had only 1 partner for at least 10 years. Data on smoking, diet or exercising habits were not collected.

The institutional review boards, namely the Department of Clinical Investigations and Intellectual Property, of St. Petersburg Medical Academy of Postgraduate Studies (North-Western State Medical University named after I.I. Mechnikov since 2011) under the Federal Agency of Public Health and Social Development of Roszdrav approved the study. The methods were carried out in accordance with the approved guidelines. All participants provided written informed consent for this study.

DNA isolation and Metagenomic sequencing

After the EPS samples had been frozen and thawed, DNA was extracted by boiling 5 µl of EPS sample diluted in 95 µl of 1 × TE-buffer at 107 °C for 10 min. All 62 samples together with 2 negative controls were subjected to random whole genome amplification (WGA) using the Illustra Ready-2-Go GenomiPhi HY DNA Amplification kit (GE Healthcare, UK), following the manufacturer’s guidelines except that the incubation time was prolonged from the recommended 4 hours to 7 hours. Laboratory grade water (Sigma-Aldrich, St. Louis, MO, US) was used as negative control to assess false positive reads. After the amplification reaction, all samples were diluted 1:2 in water and quantified using QuantiFluor-ST (Promega, US)²⁰, a fluorometric assay quantifying dsDNA, according to the manufacturer’s user guide. DNA concentration after WGA ranged from 344 to 1288 ng/µl, with a median of 676 ng/µl. DNA libraries were prepared using the Nextera DNA Sample Preparation kit with the 96 index system according to the user guide revision B (Illumina), starting with 50 ng DNA in the tagmentation reaction. The library pools were quantified with the QuantiFluor system as above and the library sizes were checked using the Bioanalyzer High Sensitivity DNA chip (Agilent). DNA concentration after library prep ranged from 1.1 to 8.5 ng/µl, with a median of 3.3 ng/µl. Each library was normalized to 10 nM before pooling of all 64 libraries. The library pool was denatured and diluted to 2.6 pM and spiked with 1% PhiX control according to the Denaturing and Diluting Libraries for the NextSeq 500 Rev. B manual (Illumina) prior to paired-end sequencing of 151+151 cycles (2 × 151 bp read length) on the NextSeq instrument and NCS v1.2 using the NextSeq 500 High-Output Reagent Kit (Illumina), following the NextSeq 500 System User Guide Rev. E (Illumina). The sequencing flow cell cluster density was 227 K/mm² and the total yield was 147 Gb with 67% >Q30 and approximately 7.4 million paired-end reads passing filters per sample.
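Normalizing a library to 10 nM requires converting a mass concentration (ng/µl) into a molar concentration, which depends on the mean fragment length. The sketch below shows the standard conversion for double-stranded DNA, using roughly 660 g/mol per base pair. The 500 bp mean fragment length is a hypothetical value chosen for illustration, since the article does not report the library sizes measured on the Bioanalyzer, and the function names are likewise hypothetical.

```python
AVG_BP_MASS = 660.0  # approximate g/mol per base pair of dsDNA


def ng_per_ul_to_nM(conc_ng_per_ul, mean_fragment_bp):
    """Convert a dsDNA mass concentration (ng/µl) to molarity (nM)."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * mean_fragment_bp)


def dilution_volumes(stock_nM, target_nM, final_volume_ul):
    """Return (library µl, diluent µl) needed to reach target_nM in final_volume_ul."""
    if stock_nM < target_nM:
        raise ValueError("stock is already below the target concentration")
    library_ul = target_nM * final_volume_ul / stock_nM
    return library_ul, final_volume_ul - library_ul


# Highest reported library concentration, with an ASSUMED 500 bp mean fragment length.
stock = ng_per_ul_to_nM(8.5, 500)
library_ul, diluent_ul = dilution_volumes(stock, target_nM=10.0, final_volume_ul=20.0)
print(f"{stock:.1f} nM stock: mix {library_ul:.2f} µl library with {diluent_ul:.2f} µl diluent")
```

Under the same 500 bp assumption, the median library concentration of 3.3 ng/µl corresponds to about 10 nM, so libraries below that concentration could not be brought to 10 nM by dilution alone; the real fragment lengths may of course differ.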

Data analysis

Bioinformatic analyses used R (www.R-project.org) and Python (www.python.org) scripts run on a 40 core, 2 TB RAM Linux server. Short index sequences (part of the Illumina primers) were used to assign sequences to the originating sample²¹. Sequences were quality checked and trimmed according to Phred quality scores²². Reads were screened against the human reference genome hg19 using BWA-MEM²³ and reads with >95% identity over 75% of their length to human DNA were removed. Sequences were then normalized (http://ged.msu.edu/papers/2012-diginorm) to discard redundant data and reduce sampling variation and sequencing errors. The normalized dataset was assembled using Trinity²⁴, SOAPdenovo and SOAPdenovo-Trans (http://soap.genomics.org.cn/) into contiguous sequences (contigs). Reads before assembly were re-mapped to contigs and the result was used to calculate the number of reads for each contig. Several assembly algorithms and re-mapping of all singleton reads to assembled contigs were used to validate assembly results^19,25. Contigs that were assembled by at least one of the assemblers were retained. Overall, 76322 contigs were assembled. For 65443 of them (86%) there were at least 4 raw reads (2 paired-end reads) that were remapped to the contig and the contig was then considered to be valid. Assembled, valid contigs were taxonomically classified by comparison against the GenBank nucleotide database using Paracel (www.strikingdevelopment.com) blastn. To identify possible artifactual chimeras (contigs containing sequences originating from different DNA sequences) the sequence that aligned to its most closely related sequence in GenBank was divided into three equal segments. If at least one of the segments differed in similarity to the corresponding overlapping parts by more than 5% (for example, if segment 1 was 88% similar and segment 2 was 94% similar) the sequence was considered a possible chimera.
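The filtering rules described above (human read removal, contig validity and the three-segment chimera check) can be sketched as small Python predicates. This is an illustrative reading of the text, not the authors' scripts: all function names are hypothetical, "over 75% of their length" is interpreted as at least 75%, and "differed by more than 5%" is interpreted as more than 5 percentage points between the most and least similar segment.

```python
def is_human_read(identity_pct, aligned_len, read_len):
    """Read is removed if >95% identical to human DNA over at least 75% of its length."""
    return identity_pct > 95.0 and aligned_len >= 0.75 * read_len


def is_valid_contig(n_remapped_reads, min_reads=4):
    """Contig is considered valid if at least 4 raw reads re-map to it."""
    return n_remapped_reads >= min_reads


def segment_identities(query, subject, n_segments=3):
    """Percent identity per segment of a pairwise alignment given as two
    equal-length strings ('-' marks a gap). The last segment absorbs any
    remainder when the alignment length is not divisible by n_segments."""
    if len(query) != len(subject) or len(query) < n_segments:
        raise ValueError("aligned strings must have equal length >= n_segments")
    size = len(query) // n_segments
    identities = []
    for i in range(n_segments):
        start = i * size
        end = (i + 1) * size if i < n_segments - 1 else len(query)
        matches = sum(
            1 for q, s in zip(query[start:end], subject[start:end])
            if q == s and q != "-"
        )
        identities.append(100.0 * matches / (end - start))
    return identities


def is_possible_chimera(identities, max_diff=5.0):
    """Flag the contig if segment similarities differ by more than max_diff points."""
    return max(identities) - min(identities) > max_diff


# The example from the text: 88% versus 94% similarity differs by more than 5 points.
print(is_possible_chimera([88.0, 94.0, 94.0]))
```

In practice the per-segment similarities would be derived from the blastn alignment of each contig against its closest GenBank hit; here they are computed from two aligned strings to keep the sketch self-contained.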

Additional Information

How to cite this article: Smelov, V. et al. Detection of DNA viruses in prostate cancer. Sci. Rep. 6, 25235; doi: 10.1038/srep25235 (2016).

Change history

31 May 2016
A correction has been published and is appended to both the HTML and PDF versions of this paper. The error has been fixed in the paper.

References

1. Jemal, A. et al. Cancer statistics, 2009. CA Cancer J Clin. 59, 225–249 (2009).
2. Heidenreich, A. et al. EAU guidelines on prostate cancer. Part II: Treatment of advanced, relapsing, and castration-resistant prostate cancer. Eur Urol. 65, 467–479 (2014).
3. Dennis, L. K., Lynch, C. F. & Torner, J. C. Epidemiologic association between prostatitis and prostate cancer. Urology 60, 78–83 (2002).
4. Daniels, N. A. et al. Correlates and prevalence of prostatitis in a large community-based cohort of older men. Urology 66, 964–970 (2005).
5. Jiang, J. et al. The role of prostatitis in prostate cancer: meta-analysis. PLoS One 8, e85179 (2013).
6. Dillner, J. et al. Sero-epidemiological association between human-papillomavirus infection and risk of prostate cancer. Int J Cancer 75, 564–567 (1998).
7. Nelson, W. G., De Marzo, A. M. & Isaacs, W. B. Prostate cancer. N Engl J Med. 349, 366–381 (2003).
8. Taylor, M. L., Mainous, A. G. & Wells, B. J. Prostate cancer and sexually transmitted diseases: a meta-analysis. Fam Med. 37, 506–512 (2005).
9. Sutcliffe, S. et al. Gonorrhea, syphilis, clinical prostatitis, and the risk of prostate cancer. Cancer Epidemiol Biomark Prev. 15, 2160–2166 (2006).
10. Wagenlehner, F. M. E. et al. The role of inflammation and infection in the pathogenesis of prostate carcinoma. BJU Int. 100, 733–737 (2007).
11. Cheng, I. et al. Prostatitis, sexually transmitted diseases, and prostate cancer: the California Men’s Health Study. PLoS One 5, e8736 (2010).
12. Litwin, M. S. et al. The National Institutes of Health chronic prostatitis symptom index: development and validation of a new outcome measure. Chronic Prostatitis Collaborative Research Network. J Urol. 162, 369–375 (1999).
13. Fall, M. et al. EAU guidelines on chronic pelvic pain. Eur Urol. 57, 35–48 (2010).
14. Arroyo, L. S. et al. Next generation sequencing for human papillomavirus genotyping. J Clin Virol. 58, 437–442 (2013).
15. Smelov, V., Eklund, C., Bzhalava, D., Novikov, A. & Dillner, J. Expressed prostate secretions in the study of human papillomavirus epidemiology in the male. PLoS One 8, e66630 (2013).
16. Ekström, J., Bzhalava, D., Svenback, D., Forslund, O. & Dillner, J. High throughput sequencing reveals diversity of Human Papillomaviruses in cutaneous lesions. Int J Cancer 129, 2643–2650 (2011).
17. Bzhalava, D. et al. Phylogenetically diverse TT virus viremia among pregnant women. Virology 432, 427–434 (2012).
18. Foulongne, V. et al. Human skin microbiota: high diversity of DNA viruses identified on the human skin by high throughput sequencing. PLoS One 7, e38499 (2012).
19. Bzhalava, D. et al. Unbiased approach for virus detection in skin lesions. PLoS One 8, e65953 (2013).
20. Smelov, V. et al. Metagenomic sequencing of expressed prostate secretions. J Med Virol. 86, 2042–2048 (2014).
21. Bzhalava, D. Bioinformatics for Viral Metagenomics. J Data Min Genomics Proteomics 04 (2013).
22. Bokulich, N. A. et al. Quality-filtering vastly improves diversity estimates from Illumina amplicon sequencing. Nat Methods 10, 57–59 (2013).
23. Li, H. & Durbin, R. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics 26, 589–595 (2010).
24. Haas, B. J. et al. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat Protoc. 8, 1494–1512 (2013).
25. Meiring, T. L. et al. Next-generation sequencing of cervical DNA detects human papillomavirus types not detected by commercial kits. Virol J. 9, 164 (2012).
26. Pietzak, E. J. et al. Multiple repeat prostate biopsies and the detection of clinically insignificant cancer in men with large prostates. Urology 84, 380–385 (2014).
27. Singh, N. et al. Implication of high risk human papillomavirus HR-HPV infection in prostate cancer in Indian population–a pioneering case-control analysis. Sci Rep. 5, 7822 (2015).

Acknowledgements

This study was supported by the Swedish Cancer Society and the Swedish Research Council. In part, the work reported in this paper was undertaken during the tenure of a Postdoctoral Fellowship (V.S.) from the International Agency for Research on Cancer, partially supported by the European Commission FP7 Marie Curie Actions – People – Co-funding of regional, national and international programmes (COFUND).

Author information

Authors and Affiliations

Department of Laboratory Medicine, Karolinska Institutet, Stockholm, 141 86, Sweden
Vitaly Smelov, Davit Bzhalava, Laila Sara Arroyo Mühr, Carina Eklund, Joakim Dillner & Emilie Hultin
International Agency for Research on Cancer, World Health Organization, Screening Group, Lyon, 69372, France
Vitaly Smelov
Department of Urology and Andrology, North-Western State Medical University named after I.I. Mechnikov, St. Petersburg, 191015, Russia
Boris Komyakov
St. Petersburg State University, Faculty of Medicine, St. Petersburg, 199034, Russia
Andrey Gorelov

Contributions

V.S., E.H. and J.D. wrote the main manuscript text. V.S., E.H. and D.B. prepared Tables 1 and 2. D.B. performed bioinformatic analysis. L.S.A.M., C.E. and E.H. performed the experiments. V.S. and A.G. performed sampling. J.D. and B.K. provided administrative support. All authors reviewed the manuscript.

Corresponding author

Correspondence to Joakim Dillner.

Ethics declarations

Competing interests

Dr. VS’s work was partially undertaken during the tenure of a Postdoctoral Fellowship from the International Agency for Research on Cancer, partially supported by the European Commission FP7 Marie Curie Actions – People – Co-funding of regional, national and international programmes (COFUND). The other authors declare no competing financial interests.

Rights and permissions

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

About this article

Cite this article

Smelov, V., Bzhalava, D., Arroyo Mühr, L. et al. Detection of DNA viruses in prostate cancer. Sci Rep 6, 25235 (2016). https://doi.org/10.1038/srep25235

Received: 11 January 2016
Accepted: 08 April 2016
Published: 28 April 2016
DOI: https://doi.org/10.1038/srep25235
