mBio. 2011 Oct 4;2(5):e00180-11.
doi: 10.1128/mBio.00180-11. Print 2011.

Raw sewage harbors diverse viral populations

Paul G Cantalupo et al. mBio.

Abstract

At this time, about 3,000 different viruses are recognized, but metagenomic studies suggest that these viruses are a small fraction of the viruses that exist in nature. We have explored viral diversity by deep sequencing nucleic acids obtained from virion populations enriched from raw sewage. We identified 234 known viruses, including 17 that infect humans. Plant, insect, and algal viruses as well as bacteriophages were also present. These viruses represented 26 taxonomic families and included viruses with single-stranded DNA (ssDNA), double-stranded DNA (dsDNA), positive-sense ssRNA [ssRNA(+)], and dsRNA genomes. Novel viruses that could be placed in specific taxa represented 51 different families, making untreated wastewater the most diverse viral metagenome (genetic material recovered directly from environmental samples) examined thus far. However, the vast majority of sequence reads bore little or no sequence relation to known viruses and thus could not be placed into specific taxa. These results show that the vast majority of the viruses on Earth have not yet been characterized. Untreated wastewater provides a rich matrix for identifying novel viruses and for studying virus diversity.

Importance: At this time, virology is focused on the study of a relatively small number of viral species. Specific viruses are studied either because they are easily propagated in the laboratory or because they are associated with disease. The lack of knowledge of the size and characteristics of the viral universe and the diversity of viral genomes is a roadblock to understanding important issues, such as the origin of emerging pathogens and the extent of gene exchange among viruses. Untreated wastewater is an ideal system for assessing viral diversity because virion populations from large numbers of individuals are deposited and because raw sewage itself provides a rich environment for the growth of diverse host species and thus their viruses. These studies suggest that the viral universe is far more vast and diverse than previously suspected.

Figures

FIG 1
Raw sewage contains diverse viruses. (A) Raw sewage was obtained from three cities (P, Pittsburgh, Pennsylvania, United States; B, Barcelona, Spain; A, Addis Ababa, Ethiopia) on three different continents. Virion populations were concentrated by organic flocculation (31). Raw sewage metagenomes were obtained through pyrosequencing, and the sequences were classified by subsequent bioinformatic methods. 10 L, 10 liters; NA, nucleic acid. (Reproduced from Google—Map data ©2011 Geocentre Consulting, MapLink, Tele Atlas.) (B) Examination of raw sewage by electron microscopy reveals a diversity of virion morphologies. All black bars represent 100 nm, except the top bar, which represents 50 nm. (C) Total nucleic acid (DNA and reverse-transcribed RNA) was sequenced and binned according to taxa based on BLAST searches. Most sequences found within virions do not match the sequences in public databases.
FIG 2
Raw sewage contains many known and novel viruses. (A) Known sequences (n = 3,027) identified by BLAST are related to many different viral families. Families with <1% abundance were collapsed into the “Other” category. Only the prefixes of family names are shown (e.g., Virga for Virgaviridae). (B) Distribution of the hosts of the known eukaryotic virus reads (n = 1,748). Plant, human, and insect viruses are abundant in raw sewage. (C) Distribution of the hosts of the known bacteriophage reads (n = 1,279). (D) Novel sequences (n = 43,381) identified by BLAST are related to many different virus families. Families with <1% abundance were collapsed into the “Other” category. See Table S6 for a list of families and hosts in the “Other” category.
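The collapsing step described in panels A and D can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `collapse_rare_families`, its parameters, and the read counts are hypothetical and chosen only to show how families below a 1% share could be merged into a single "Other" bucket.

```python
from collections import Counter


def collapse_rare_families(family_counts, threshold=0.01, other_label="Other"):
    """Merge families whose share of all reads is below `threshold` into one bucket."""
    total = sum(family_counts.values())
    collapsed = Counter()
    for family, count in family_counts.items():
        # A family keeps its own name only if it reaches the abundance threshold.
        keep = total > 0 and count / total >= threshold
        collapsed[family if keep else other_label] += count
    return dict(collapsed)


# Hypothetical read counts per family (illustrative values, not data from the paper).
example_counts = {"Virgaviridae": 600, "Siphoviridae": 395, "Parvoviridae": 5}
print(collapse_rare_families(example_counts))
```

In this made-up example, only the family holding 0.5% of the reads falls below the threshold, so it is the only one folded into "Other"; the total read count is unchanged.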
FIG 3
Most virus-related pyrosequencing reads found in raw sewage represent previously unknown viruses. Diversity plots of selected viral families (only the prefix of a family name is shown) are organized by genomic content (dsDNA, ssDNA, ssRNA, and dsRNA). The four phage families are underlined. The rings of the plot represent bins of increasing percent identity (20%, 50%, and 100% are marked for orientation) relative to the GenBank reference sequence as identified by the top BLAST hit. The area of each circle is proportional to the number of virus family reads in that location. The geographic locations from which the sequence reads were obtained are indicated by color: blue, Addis Ababa, Ethiopia; green, Barcelona, Spain; red, Pittsburgh, PA, USA.
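The binning behind such a plot can be sketched in Python as follows. The caption does not state the bin width, so the 10-point bins here are an assumption, and the function names and the example hits are hypothetical rather than taken from the study.

```python
from collections import Counter


def identity_bin(percent_identity, bin_width=10):
    """Return the lower edge of the identity bin; 100% is placed in the top bin."""
    if not 0 <= percent_identity <= 100:
        raise ValueError("percent identity must be between 0 and 100")
    top_edge = 100 - bin_width
    return min(int(percent_identity // bin_width) * bin_width, top_edge)


def count_reads_per_bin(hits, bin_width=10):
    """Count reads per (family, location, identity bin) from top-BLAST-hit records."""
    counts = Counter()
    for family, location, percent_identity in hits:
        counts[(family, location, identity_bin(percent_identity, bin_width))] += 1
    return counts


# Hypothetical top hits: (family prefix, sampling location, percent identity).
example_hits = [
    ("Picorna", "Barcelona", 87.5),
    ("Picorna", "Barcelona", 82.0),
    ("Parvo", "Pittsburgh", 100.0),
]
for key, n_reads in sorted(count_reads_per_bin(example_hits).items()):
    print(key, n_reads)
```

Each resulting count would correspond to one circle in the plot, with the circle area scaled to the number of reads in that family, location, and identity ring.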
FIG 4
Novel virus analysis from selected virus families. (A to C) A selected set of novel assembled sequences from three different virus families that overlapped each other on a representative genome from each family was aligned with ClustalW2. The nucleotide alignment is shown graphically in the fragment recruitment plot (top) with vertical black broken lines marking the common alignment region against a selected reference genome from the virus family. Each assembled sequence was translated, and the resulting ORFs were aligned with ClustalW2. DNA and protein neighbor-joining (NJ) phylogenetic trees were constructed from homologous positions without any gaps. Metagenomic sequences (red circles) and GenBank sequences (black circles) are indicated. Metagenomic sequences that are labeled with a number represent different novel virus species in the raw sewage. (A) For the Parvovirinae, 4 novel assembled sequences were aligned with 9 selected reference Parvovirinae genomes, and the nonstructural (NS) gene from each genome was used for the protein alignment. (B) For the Picornaviridae, 3 novel assembled sequences were aligned with 12 selected reference Picornaviridae genomes, and the polyprotein from each genome was used for the protein alignment. Alignment is in the P-loop NTPase domain of the 2C ATPase mature peptide of the polyprotein. (C) For the Picobirnaviridae, for segment 1 (left), 5 novel assembled sequences were aligned with the 2 reference segment 1 sequences in GenBank (human and rabbit), and the segment 1 ORF from each genome was used for the protein alignment. For segment 2 (right), 6 novel assembled sequences were aligned with the 2 reference segment 2 sequences in GenBank (human and porcine), and 14 RdRp ORFs (13 human and 1 bovine) were used for the protein alignment. See Fig. S3 for the ORF alignments.
Virus abbreviations: PorPV, porcine parvovirus; AleutMDV, Aleutian mink disease virus; AAV5, adeno-associated virus 5; BovAAV, bovine AAV; CMV, canine minute virus; PorBV, porcine bocavirus; HepAV, hepatitis A virus; AvianEV, avian encephalomyelitis virus; HumPV1, human parechovirus 1; DuckHAV, duck hepatitis A virus; PorTV, porcine teschovirus; SaffoldV, Saffold virus; AichiV, Aichi virus; FootMDV, foot-and-mouth disease virus; EquineRBV, equine rhinitis B virus; SenecaVV, Seneca Valley virus; Rabbit PbV, rabbit picobirnavirus; HumPbV, human picobirnavirus; BPV, bovine picobirnavirus.
FIG 5
An assembled genome of non-A, non-B hepatitis virus from raw sewage shows that it belongs to the Inoviridae family. (A) BLASTN alignment of WW-nAnB and the non-A, non-B hepatitis virus with GenBank accession no. X53411 (X53411 sequence) is displayed as a dot matrix plot. The WW-nAnB sequence and the X53411 sequence run 5′ to 3′ on the x axis and y axis, respectively. The positions of the insertions (I) and deletion (D) are labeled. (B) Protein alignment of the X53411 ORF1 with the corrected gene C sequence from WW-nAnB. Identical amino acids (*), highly similar amino acids (:), and amino acids with low similarity (.) are indicated. (C) Forty-five cycles of PCR were performed with 0.1 and 1 µl of five different virion preparations from raw sewage. Shinola was used as a negative control (Neg. Con.). PCR products were visualized by EtBr on a 1.5% agarose gel. DNA ladder sizes are indicated in base pairs. The specific PCR product bands (373 bp) were excised and sequenced. The resulting nucleotide sequences were aligned (shown at the bottom of the panel), and a bootstrapped phylogenetic tree was generated based on the alignment (top right corner of panel). (D) WW-nAnB belongs to the Inoviridae family of bacteriophages. The genomic organization of WW-nAnB compared to non-A, non-B hepatitis virus (GenBank accession no. X53411), Propionibacterium phage phiB5 (B5) (GenBank accession no. AF428260), enterobacterial phage fd (GenBank accession no. J02451), and bacteriophage Pf3 (GenBank accession no. M19377) is shown. Unlabeled X53411 ORFs are homologous to the similarly located ORFs in WW-nAnB. DNA replication initiation proteins are shown in red, assembly proteins with an ATPase domain are shown in yellow, adsorption proteins are shown in blue, and all other identified ORFs are shown in green. Each tick mark in the ruler below each genome represents 100 bp. IG, noncoding intergenic region.
