CORE: a phylogenetically-curated 16S rDNA database of the core oral microbiome

doi:10.1371/journal.pone.0019051

PLoS One. 2011 Apr 22;6(4):e19051.

Ann L Griffen¹, Clifford J Beall, Noah D Firestone, Erin L Gross, James M Difranco, Jori H Hardman, Bastienne Vriesendorp, Russell A Faust, Daniel A Janies, Eugene J Leys

PMID: 21544197
PMCID: PMC3081323
DOI: 10.1371/journal.pone.0019051

Ann L Griffen et al. PLoS One. 2011.

Affiliation

¹ Division of Pediatric Dentistry, College of Dentistry, The Ohio State University, Columbus, Ohio, United States of America. griffen.1@osu.edu

Abstract

Comparing bacterial 16S rDNA sequences to GenBank and other large public databases via BLAST often provides results of little use for identification and taxonomic assignment of the organisms of interest. The human microbiome, and in particular the oral microbiome, includes many taxa, and accurate identification of sequence data is essential for studies of these communities. For this purpose, a phylogenetically curated 16S rDNA database of the core oral microbiome, CORE, was developed. The goal was to include a comprehensive and minimally redundant representation of the bacteria that regularly reside in the human oral cavity with computationally robust classification at the level of species and genus. Clades of cultivated and uncultivated taxa were formed based on sequence analyses using multiple criteria, including maximum-likelihood-based topology and bootstrap support, genetic distance, and previous naming. A number of classification inconsistencies for previously named species, especially at the level of genus, were resolved. The performance of the CORE database for identifying clinical sequences was compared to that of three publicly available databases, GenBank nr/nt, RDP and HOMD, using a set of sequencing reads that had not been used in creation of the database. CORE offered improved performance compared to other public databases for identification of human oral bacterial 16S sequences by a number of criteria. In addition, the CORE database and phylogenetic tree provide a framework for measures of community divergence, and the focused size of the database offers advantages of efficiency for BLAST searching of large datasets. The CORE database is available as a searchable interface and for download at http://microbiome.osu.edu.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

**Figure 1. Circular phylogenetic tree at level of genus.**
The tree was generated with RAxML and viewed in iTOL. Genera are color-coded by phylum, except for the Firmicutes and Proteobacteria, which are shown at the level of class.

**Figure 2. Cumulative distribution of clinical sequences against database entries.**
The frequency with which each sequence in CORE was encountered in the clinical datasets used for curation is shown as the cumulative percent of total sequences. Entries are ordered from most to least common. The majority of clinical sequences were accounted for by fewer than 1000 CORE entries.
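The curve described in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the dictionary of per-entry hit counts, and the example numbers are all hypothetical.

```python
from itertools import accumulate


def cumulative_percent(hit_counts):
    """Cumulative percent of all hits, with entries ordered most to least common.

    hit_counts maps a database entry ID to the number of clinical
    sequences that matched it (hypothetical input format).
    """
    ordered = sorted(hit_counts.values(), reverse=True)
    total = sum(ordered)
    if total == 0:
        return []
    return [100.0 * running / total for running in accumulate(ordered)]


# Hypothetical counts, for illustration only.
example_counts = {"entry_A": 50, "entry_B": 30, "entry_C": 15, "entry_D": 5}
curve = cumulative_percent(example_counts)
print(curve)
```

Reading off the index at which the curve first exceeds 50% gives the number of entries that account for the majority of sequences.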

**Figure 3. Numbers of S-OTUs by phylum in CORE.**
Number of S-OTUs assigned to each of the 14 phyla observed in the oral cavity and pharynx. A) Common phyla B) Rare phyla (<10 S-OTUs). The fraction of S-OTUs for which a cultivated member has not been reported is indicated.

**Figure 4. Plot of the variability of the 16S gene within the oral microbiome.**
A set of 668 full-length 16S sequences selected to comprehensively represent the oral microbiome was aligned. The Shannon entropy index (H’) was calculated for each base position, and the mean information entropy for primer-sized and amplicon-sized windows along the length of the sequence was plotted. Variable and conserved regions can be visualized. (Because of gaps inserted in the alignment, the numbering does not correspond directly to *E. coli* numbering.)
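A minimal sketch of the per-position entropy and sliding-window mean described in this caption is given below. It rests on assumptions the caption does not state: the logarithm is taken base 2, gap characters are counted as an ordinary symbol, and the window size in the example is arbitrary. The function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy H' of one alignment column (log base 2 is an assumption)."""
    counts = Counter(column)
    n = len(column)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def windowed_mean_entropy(alignment, window):
    """Mean per-position entropy for every window of the given width.

    alignment is a list of equal-length aligned sequences; gaps are
    treated like any other character (an assumption).
    """
    length = len(alignment[0])
    per_position = [
        column_entropy([seq[i] for seq in alignment]) for i in range(length)
    ]
    return [
        sum(per_position[start:start + window]) / window
        for start in range(length - window + 1)
    ]


# Toy alignment, for illustration only.
toy = ["ACGT", "ACGA", "ACCA", "ATCA"]
means = windowed_mean_entropy(toy, window=2)
print(means)
```

Low window means indicate conserved stretches (candidate primer sites); high means indicate variable stretches.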

**Figure 5. Position of 1st named match in BLAST results.**
A 1000-sequence test set of clinical sequences was BLAST searched against 4 databases. We ranked the results by sequence identity level (more appropriate than e-value because of the presence of truncated database sequences in some cases) and scanned the lists above the 98% similarity level to find the position of the 1st match that included a full Latin name (genus plus species). A) Bar graph showing the results for queries for which a named match was found in at least one of the 4 databases. B) Box and whisker plots of the position of the 1st named match for queries that returned a >98% identical named match for all databases. The lower limit, middle line, and upper limit of the blue box indicate the 25th, 50th and 75th percentiles of the data respectively. The whiskers are 1.5 times the inter-quartile distance, and jittered data points are shown. For CORE and HOMD, the boxes and whiskers are compressed at the 1 value because of the large number of named matches in the first result for these two databases.
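The ranking step in this caption can be illustrated with a short sketch. The rule used here to decide whether a label carries a full Latin name is a hypothetical heuristic, not the authors' criterion, and the hit labels and identities are invented for the example.

```python
def has_full_latin_name(label):
    """Heuristic (an assumption): first two tokens look like 'Genus species'."""
    tokens = label.split()
    if len(tokens) < 2:
        return False
    genus, species = tokens[0], tokens[1]
    if genus.lower() in {"uncultured", "unidentified"}:
        return False
    return (
        genus.isalpha()
        and genus[0].isupper()
        and genus[1:].islower()
        and species.isalpha()
        and species.islower()
        and species not in {"sp", "bacterium", "clone"}
    )


def first_named_position(hits, min_identity=98.0):
    """1-based rank of the first named hit above min_identity, or None.

    hits is a list of (label, percent_identity) pairs; they are ranked
    by identity rather than by e-value, as the caption describes.
    """
    ranked = sorted(hits, key=lambda hit: hit[1], reverse=True)
    for position, (label, identity) in enumerate(ranked, start=1):
        if identity <= min_identity:
            break  # everything below this point is under the cutoff
        if has_full_latin_name(label):
            return position
    return None


# Invented hits, for illustration only.
example_hits = [
    ("Streptococcus mitis", 99.1),
    ("Uncultured bacterium clone X1", 99.8),
    ("Streptococcus sp. oral taxon", 99.5),
    ("Streptococcus oralis", 97.0),
]
print(first_named_position(example_hits))
```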

**Figure 6. Completeness of databases.**
The percent of test sequences that failed to match any sequence is shown for each database for a range of similarity cut-offs.
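One way to compute such a completeness curve is sketched below, assuming the best percent identity per test sequence has already been extracted from the search results. The function name, input format, and numbers are hypothetical.

```python
def percent_unmatched(best_identities, cutoffs):
    """Percent of queries whose best hit falls below each identity cut-off.

    best_identities holds the top percent identity for each query,
    or None when the search returned no hit at all (assumed format).
    """
    n = len(best_identities)
    if n == 0:
        return {cutoff: 0.0 for cutoff in cutoffs}
    return {
        cutoff: 100.0
        * sum(1 for best in best_identities if best is None or best < cutoff)
        / n
        for cutoff in cutoffs
    }


# Invented values, for illustration only.
example_best = [99.9, 98.4, 97.2, None]
unmatched = percent_unmatched(example_best, cutoffs=[98.0, 99.0])
print(unmatched)
```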

**Figure 7. Ambiguity in databases.**
The mean number of species names that matched the test sequences is shown for each database for similarity thresholds from 98 to 99.5%.
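The ambiguity measure can be sketched as the mean count of distinct species names per query at a given threshold. Whether queries with no match at the threshold enter the average is not stated in the caption; this sketch excludes them, which is an assumption. Names and data are hypothetical.

```python
def mean_species_per_query(hits_by_query, threshold):
    """Mean number of distinct species names matched at or above threshold.

    hits_by_query maps a query ID to a list of (species_name, percent_identity)
    pairs. Queries with no hit at the threshold are skipped (an assumption).
    """
    counts = []
    for hits in hits_by_query.values():
        names = {name for name, identity in hits if identity >= threshold}
        if names:
            counts.append(len(names))
    return sum(counts) / len(counts) if counts else 0.0


# Invented hits, for illustration only.
example = {
    "query_1": [
        ("Streptococcus mitis", 99.6),
        ("Streptococcus oralis", 99.2),
        ("Streptococcus mitis", 98.5),
    ],
    "query_2": [("Veillonella parvula", 99.7)],
}
for threshold in (98.0, 99.5):
    print(threshold, mean_species_per_query(example, threshold))
```

A value close to 1 means most queries resolve to a single species name; larger values indicate that several names match equally well.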

Cited by

Guided Plasma Application in Dentistry-An Alternative to Antibiotic Therapy.
Gross T, Ledernez LA, Birrer L, Bergmann ME, Altenburger MJ. Antibiotics (Basel). 2024 Aug 5;13(8):735. doi: 10.3390/antibiotics13080735. PMID: 39200035.
Isolation, identification, and biological characterization of bacterial endophytes isolated from Gunnera perpensa L.
Mahlangu SG, Zulu N, Serepa-Dlamini MH, Tai SL. FEMS Microbiol Lett. 2024 Jan 9;371:fnae056. doi: 10.1093/femsle/fnae056. PMID: 39039013.
A shared group of bacterial taxa in the duodenal microbiota of undernourished Pakistani children with environmental enteric dysfunction.
Iqbal NT, Chen RY, Griffin NW, Hibberd MC, Khalid A, Sadiq K, Jamil Z, Ahmed K, Iqbal J, Hotwani A, Kabir F, Rahman N, Rizvi A, Idress R, Ahmed Z, Ahmed S, Umrani F, Syed S, Moore SR, Ali A, Barratt MJ, Gordon JI. mSphere. 2024 Jun 25;9(6):e0019624. doi: 10.1128/msphere.00196-24. Epub 2024 May 14. PMID: 38742887.
Microbial Symphony: Navigating the Intricacies of the Human Oral Microbiome and Its Impact on Health.
Bhandary R, Venugopalan G, Ramesh A, Tartaglia GM, Singhal I, Khijmatgar S. Microorganisms. 2024 Mar 13;12(3):571. doi: 10.3390/microorganisms12030571. PMID: 38543622. Review.
Evaluating Alterations of the Oral Microbiome and Its Link to Oral Cancer among Betel Quid Chewers: Prospecting Reversal through Probiotic Intervention.
Diwan P, Nirwan M, Bahuguna M, Kumari SP, Wahlang J, Gupta RK. Pathogens. 2023 Jul 30;12(8):996. doi: 10.3390/pathogens12080996. PMID: 37623956. Review.
