mSphere. 2024 Jan 30;9(1):e0035523.
doi: 10.1128/msphere.00355-23. Epub 2023 Dec 6.

Waste not, want not: revisiting the analysis that called into question the practice of rarefaction

Patrick D Schloss. mSphere.

Abstract

In 2014, McMurdie and Holmes published the provocatively titled "Waste not, want not: why rarefying microbiome data is inadmissible." The claims of their study have significantly altered how microbiome researchers control for the unavoidable uneven sequencing depths that are inherent in modern 16S rRNA gene sequencing. Confusion over the distinction between the definitions of rarefying and rarefaction continues to cloud the interpretation of their results. More importantly, the authors made a variety of problematic choices when designing and analyzing their simulations. I identified 11 factors that could have compromised the results of the original study. I reproduced the original simulation results and assessed the impact of those factors on the underlying conclusion that rarefying data is inadmissible. Throughout, the design of the original study made choices that caused rarefying and rarefaction to appear to perform worse than they truly did. Most important were the approaches used to assess ecological distances, the removal of samples with low sequencing depth, and not accounting for conditions where sequencing effort is confounded with treatment group. Although the original study criticized rarefying for the arbitrary removal of valid data, repeatedly rarefying data many times (i.e., rarefaction) incorporates all the data. In contrast, it is the removal of rare taxa that would appear to remove valid data. Overall, I show that rarefaction is the most robust approach to control for uneven sequencing effort when considered across a variety of alpha and beta diversity metrics.

IMPORTANCE

Over the past 10 years, the best method for normalizing the sequencing depth of samples characterized by 16S rRNA gene sequencing has been contentious. An often cited article by McMurdie and Holmes forcefully argued that rarefying the number of sequence counts was "inadmissible" and should not be employed. However, I identified a number of problems with the design of their simulations and analysis that compromised their results. In fact, when I reproduced and expanded upon their analysis, it was clear that rarefaction was actually the most robust approach for controlling for uneven sequencing effort across samples. Rarefaction limits the rate of falsely detecting and rejecting differences between treatment groups. Far from being "inadmissible", rarefaction is a valuable tool for analyzing microbiome sequence data.
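The distinction the abstract draws between rarefying (a single random subsample to a common depth) and rarefaction (repeating that subsample many times and averaging the metric) can be illustrated with a minimal sketch. This is not the code used in the study; the function names (`rarefy`, `richness`, `rarefaction`), the toy count vectors, and the choice of 100 iterations are all assumptions made for illustration.

```python
import random
from statistics import mean


def rarefy(counts, depth, rng):
    """Rarefying: draw `depth` sequences without replacement from one sample.

    `counts` is a list of per-taxon sequence counts (hypothetical input format).
    Returns a new list of per-taxon counts that sums to `depth`.
    """
    # Expand the counts into one entry per sequence, labelled by taxon index.
    pool = [taxon for taxon, n in enumerate(counts) for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of sequences in the sample")
    rarefied = [0] * len(counts)
    for taxon in rng.sample(pool, depth):
        rarefied[taxon] += 1
    return rarefied


def richness(counts):
    """Number of taxa observed at least once."""
    return sum(1 for n in counts if n > 0)


def rarefaction(counts, depth, metric, iterations=100, seed=1):
    """Rarefaction: average `metric` over many independent rarefied draws."""
    rng = random.Random(seed)
    return mean(metric(rarefy(counts, depth, rng)) for _ in range(iterations))


# Toy samples with uneven sequencing depth (illustrative values only).
shallow = [50, 30, 15, 5]                     # 100 sequences
deep = [500, 300, 150, 40, 5, 3, 1, 1]        # 1,000 sequences

common_depth = 100
print(rarefaction(shallow, common_depth, richness))
print(rarefaction(deep, common_depth, richness))
```

Because each iteration draws a different random subset, every sequence in the deeper sample has a chance to contribute across iterations, which is the sense in which rarefaction "incorporates all the data" while a single rarefied draw does not.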

Keywords: 16S rRNA gene sequencing; amplicon sequencing; bioinformatics; microbial ecology; microbiome.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig 1
Rarefaction resulted in larger and less variable clustering accuracies. With the exception of Unweighted UniFrac distances, the improved performance by rarefaction was observed at smaller effect sizes. In the first row of panels, larger values mean that the accuracies by rarefaction were better than those of subsampling. In the second row of panels, larger values mean that the interquartile range (IQR) for rarefaction was larger than that of subsampling.
Fig 2
K-means clustering was consistently as good as or better than PAM or hierarchical clustering when comparing rarefaction to other normalization methods. Each point represents the percentage of 100 simulations where that clustering method performed as well as or better than the other methods for that normalization procedure.
Fig 3
When the median sequencing depth was 2,000 sequences or more, rarefaction of the entire data set performed better than removing the smallest 15% of samples when using K-means clustering. This figure is analogous to Fig. S4, except that K-means clustering was used instead of PAM.
Fig 4
K-means clustering of distances calculated with rarefaction was as good as or better than any other normalization method. This figure is analogous to Fig. S3, except that K-means clustering was used instead of PAM; rarefaction on the full data set was used instead of subsampling to the size of the sample at the 15th percentile; and DESeq Variance Stabilization normalized OTU counts were only used with Euclidean distances.
Fig 5
Clustering accuracies that used rarefaction were as good as or better than the other normalization procedures when there was a log-scaled distribution of sequencing depths. This figure is analogous to Fig. 4, except that the sequencing depths for each of the 80 samples in each simulation were drawn without replacement from a log-scaled distribution rather than from the GlobalPatterns sequencing depths.
Fig 6
Rarefaction was consistently as good as or better than all other normalization methods at assigning samples to the correct treatment group regardless of whether sequencing depth was confounded by treatment group. Because the clustering algorithms forced samples into one of two groups, the expected accuracy with an effect size of 1.00 was 0.51. With an effect size of 1.15, the expected accuracy was 1.00. Each point represents the median of 100 replicates, and the error bars represent the observed 95% confidence interval. Data are shown for a median sequencing depth (Ñ_L) of 10,000 sequences when individual sequencing depths were sampled with replacement from the GlobalPatterns data set or without replacement from the log-scaled distribution.
Fig 7
Rarefaction was consistently as good as or better than all other normalization methods at controlling for Type I error and maximizing power to detect differences in treatment group using adonis2 regardless of whether sequencing depth was confounded by treatment group. Type I errors were assessed as the fraction of 100 simulations that yielded a significant P value (i.e., less than or equal to 0.05) at an effect size of 1.00. Power was assessed as the fraction of 100 simulations that yielded a significant P value at an effect size of 1.15. Data are shown for a median sequencing depth (Ñ_L) of 10,000 sequences when individual sequencing depths were sampled with replacement from the GlobalPatterns data set or without replacement from the log-scaled distribution.
Fig 8
Rarefaction was consistently as good as or better than all other normalization methods at controlling for Type I error and maximizing power to detect differences in treatment groups using alpha-diversity metrics regardless of whether sequencing depth was confounded by treatment group when using sequencing depths drawn from the GlobalPatterns data set. Statistical comparisons of OTU richness and Shannon diversity were performed using the non-parametric Wilcoxon two-sample test. Type I errors were assessed as the fraction of 100 simulations that yielded a significant P value (i.e., less than or equal to 0.05) at an effect size of 1.00. Power was assessed as the fraction of 100 simulations that yielded a significant P value at an effect size of 1.15. Data are shown for the case when individual sequencing depths were sampled with replacement from the GlobalPatterns data set.
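
The captions for Figs 7 and 8 define Type I error and power as the fraction of 100 simulations with a P value of 0.05 or less, at a null effect and at a true effect respectively. The sketch below illustrates only that bookkeeping. It is not the study's pipeline: a simple permutation test on the difference in group means stands in for adonis2 and the Wilcoxon test, the simulated values are Gaussian draws rather than diversity metrics from simulated communities, and `shift`, `n_per_group`, and `n_perm` are hypothetical parameters.

```python
import random


def permutation_p_value(a, b, rng, n_perm=200):
    """Two-sided permutation test on the absolute difference in group means.

    Stand-in test for illustration; not the test used in the study.
    """
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        pa, pb = pooled[:len(a)], pooled[len(a):]
        if abs(sum(pa) / len(pa) - sum(pb) / len(pb)) >= observed:
            hits += 1
    # Add one to numerator and denominator so the P value is never zero.
    return (hits + 1) / (n_perm + 1)


def fraction_significant(shift, n_sims=100, n_per_group=20, alpha=0.05, seed=1):
    """Fraction of simulations with P <= alpha for a given group difference.

    With shift == 0 this estimates the Type I error rate;
    with shift > 0 it estimates power.
    """
    rng = random.Random(seed)
    significant = 0
    for _ in range(n_sims):
        a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        b = [rng.gauss(shift, 1.0) for _ in range(n_per_group)]
        if permutation_p_value(a, b, rng) <= alpha:
            significant += 1
    return significant / n_sims


type_i_error = fraction_significant(shift=0.0)   # no true difference
power = fraction_significant(shift=1.5)          # true difference present
print(type_i_error, power)
```

A well-behaved procedure should keep the first fraction near the nominal 0.05 while pushing the second toward 1.00, which is the pattern the captions attribute to rarefaction.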
