Review. Curr Opin Biotechnol. 2012 Feb;23(1):64-71.
doi: 10.1016/j.copbio.2011.11.028. Epub 2011 Dec 13.

Advancing analytical algorithms and pipelines for billions of microbial sequences

Antonio Gonzalez et al. Curr Opin Biotechnol. 2012 Feb.

Abstract

The vast number of microbial sequences resulting from sequencing efforts using new technologies requires us to re-assess currently available analysis methodologies and tools. Here we describe trends in the development and distribution of software for analyzing microbial sequence data. We then focus on one widely used set of methods, dimensionality reduction techniques, which allow users to summarize and compare these vast datasets. We conclude by emphasizing the utility of formal software engineering methods for the development of computational biology tools, and the need for new algorithms for comparing microbial communities. Such large-scale comparisons will allow us to fulfill the dream of rapid integration and comparison of microbial sequence data sets, in a replicable analytical environment, in order to describe the microbial world we inhabit.

Figures

Figure 1. Moving away from pie charts and trees, current analytical methods
With a few sequences from a small number of samples, pie charts and trees were sufficient for comparing microbial community samples. In contrast, with modern technologies that allow sequencing of large numbers of samples with millions of reads, the analysis “gold standard” has moved towards new tools. Here we show data from ref. (6) analyzed with several methods: TopiaryExplorer (41), which allows visualization of large trees in the context of per-sample data, in this example visualizing the GreenGenes reference tree colored by body site matches (red-stool, blue-oral, orange-skin), showing pie charts of the most abundant sequences and zooming into the different clades; a QIIME PCoA plot comparing all samples colored by body site (same colors as in TopiaryExplorer); PCoA with an explicit time axis and tracing to follow individuals over time in each body site, allowing visual inspection of the changes over time (female: red-gut, blue-oral, orange-skin; male: green-gut, purple-oral, yellow-skin); and semivariograms to assess temporal correlation of observations in the stool samples separated by sex (red-female, blue-male).
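
As a rough illustration of the dimensionality reduction step behind a PCoA plot like the one described in the caption, the following is a minimal sketch of classical principal coordinates analysis in Python with NumPy. It is not the QIIME implementation. The count table is made-up toy data, the function names bray_curtis and pcoa are hypothetical, and Bray-Curtis dissimilarity is swapped in here only because it needs no phylogenetic tree; a real analysis could use any distance measure between samples.

```python
import numpy as np

def bray_curtis(counts):
    """Pairwise Bray-Curtis dissimilarity between rows (samples) of a count table."""
    counts = np.asarray(counts, dtype=float)
    n = counts.shape[0]
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            num = np.abs(counts[i] - counts[j]).sum()
            den = (counts[i] + counts[j]).sum()
            dist[i, j] = dist[j, i] = num / den if den > 0 else 0.0
    return dist

def pcoa(dist, n_axes=2):
    """Classical PCoA: double-center the squared distances, then eigendecompose."""
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    # eigh returns ascending eigenvalues; reorder so the largest come first.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Non-Euclidean distances can give small negative eigenvalues; clip them.
    positive = np.clip(eigvals[:n_axes], 0.0, None)
    coords = eigvecs[:, :n_axes] * np.sqrt(positive)
    explained = positive / eigvals[eigvals > 0].sum()
    return coords, explained

# Hypothetical toy table: 4 samples (rows) x 4 taxa (columns).
# Samples 0-1 share one community profile, samples 2-3 another.
toy_counts = [
    [90, 10, 0, 0],
    [80, 20, 0, 0],
    [0, 0, 70, 30],
    [0, 0, 60, 40],
]

dist = bray_curtis(toy_counts)
coords, explained = pcoa(dist, n_axes=2)
print(np.round(coords, 3))
print(np.round(explained, 3))
```

Each row of coords gives one sample's position on the first two principal coordinates; in this toy table the first axis separates the two groups of samples, which is the kind of clustering by body site that the PCoA panels are meant to reveal.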

References

    1. Caporaso JG, Lauber CL, Walters WA, Berg-Lyons D, Lozupone CA, Turnbaugh PJ, et al. Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample. Proc Natl Acad Sci U S A. 2011 Mar 15;108(Suppl 1):4516–22. Demonstrates that the Illumina platform's short reads can be used successfully for 16S rRNA profiling in a wide range of environments.
    2. Bartram AK, Lynch MD, Stearns JC, Moreno-Hagelsieb G, Neufeld JD. Generation of multimillion-sequence 16S rRNA gene libraries from complex microbial communities by assembling paired-end Illumina reads. Appl Environ Microbiol. 2011 Jun;77(11):3846–52. Demonstrates the advances in sampling depth and quality by using Illumina technologies; also shows the possibility of reducing erroneous sequences by using paired-end reads.
    3. Frank DN, St Amand AL, Feldman RA, Boedeker EC, Harpaz N, Pace NR. Molecular-phylogenetic characterization of microbial community imbalances in human inflammatory bowel diseases. Proc Natl Acad Sci U S A. 2007 Aug 21;104(34):13780–5.
    4. Turnbaugh PJ, Hamady M, Yatsunenko T, Cantarel BL, Duncan A, Ley RE, et al. A core gut microbiome in obese and lean twins. Nature. 2009 Jan 22;457(7228):480–4.
    5. Costello EK, Lauber CL, Hamady M, Fierer N, Gordon JI, Knight R. Bacterial community variation in human body habitats across space and time. Science. 2009 Dec 18;326(5960):1694–7.