Abstract
High-throughput sequencing has revolutionized microbial ecology, but read quality remains a considerable barrier to accurate taxonomy assignment and α-diversity assessment for microbial communities. We demonstrate that high-quality read length and abundance are the primary factors differentiating correct from erroneous reads produced by Illumina GAIIx, HiSeq and MiSeq instruments. We present guidelines for user-defined quality-filtering strategies, enabling efficient extraction of high-quality data and facilitating interpretation of Illumina sequencing results.
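As a rough illustration of the two factors named above (high-quality read length and abundance), the following is a minimal sketch of a quality filter, not the authors' implementation. The function names, the Phred+33 encoding, and the thresholds `min_q`, `min_fraction` and `min_count` are all assumptions chosen for the example. Abundance is counted here per exact sequence as a stand-in for per-OTU abundance.

```python
from collections import Counter


def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_string]


def high_quality_prefix(seq, qual, min_q=20, offset=33):
    """Truncate a read just before its first base with a Phred score below min_q."""
    for i, q in enumerate(phred_scores(qual, offset)):
        if q < min_q:
            return seq[:i]
    return seq


def quality_filter(reads, min_q=20, min_fraction=0.75, min_count=2):
    """Keep truncated reads that retain enough high-quality length and are abundant enough.

    reads: iterable of (sequence, quality_string) pairs.
    Returns a dict mapping each retained sequence to its count.
    All thresholds are hypothetical defaults for illustration only.
    """
    kept = []
    for seq, qual in reads:
        trimmed = high_quality_prefix(seq, qual, min_q)
        # Length criterion: fraction of the original read that is high quality.
        if seq and len(trimmed) / len(seq) >= min_fraction:
            kept.append(trimmed)
    # Abundance criterion: discard sequences observed fewer than min_count times.
    counts = Counter(kept)
    return {s: n for s, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    example_reads = [
        ("ACGT", "IIII"),  # all bases Q40
        ("ACGT", "IIII"),  # all bases Q40
        ("ACGT", "I#II"),  # second base Q2, so only 1 of 4 bases survives
        ("GGCC", "IIII"),  # high quality but seen only once
    ]
    print(quality_filter(example_reads))
```

In this sketch the length test is applied per read before the abundance test, so a low-quality read cannot contribute to the count of an otherwise rare sequence.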
Acknowledgements
We thank G. Giannoukos (Broad Institute of MIT and Harvard), I. Rasolonjatovo (Illumina), M. Gebert (University of Colorado, Boulder) and L. Wegener Parfrey (University of Colorado, Boulder) for contributing mock community sequencing data used in this study, and S. Huse and A. Gonzalez for useful feedback and discussions of this manuscript. This work was supported in part by grants from the US National Institutes of Health (NIH DK78669 to J.I.G., NIH R01HD059127 to D.A.M. and NIH U54HG004969 to D.G.), the Juvenile Diabetes Research Fund (D.G.), the Crohn's and Colitis Foundation of America (J.I.G. and D.G.), and the Howard Hughes Medical Institute. N.A.B. was supported by the 2012–2013 Dannon Probiotics Fellow Program (The Dannon Company) and a Wine Spectator scholarship.
Author information
Contributions
N.A.B., J.G.C., D.A.M. and R.K. conceived and designed the experiments; N.A.B. performed the experiments and data analysis. All authors contributed sequencing data sets and wrote the manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–16, Supplementary Tables 1–9, Supplementary Note (PDF 21952 kb)
About this article
Cite this article
Bokulich, N., Subramanian, S., Faith, J. et al. Quality-filtering vastly improves diversity estimates from Illumina amplicon sequencing. Nat Methods 10, 57–59 (2013). https://doi.org/10.1038/nmeth.2276