Zero is not absence: censoring-based differential abundance analysis for microbiome data
- PMID: 38331411
- PMCID: PMC10885211
- DOI: 10.1093/bioinformatics/btae071
Abstract
Motivation: Microbiome data analysis faces the challenge of sparsity, with many entries recorded as zeros. In differential abundance analysis, the presence of excessive zeros in data violates distributional assumptions and creates ties, leading to an increased risk of type I errors and reduced statistical power.
Results: We developed a novel normalization method, called censoring-based analysis of microbiome proportions (CAMP), for microbiome data by treating zeros as censored observations, transforming raw read counts into tie-free time-to-event-like data. This enables the use of survival analysis techniques, like the Cox proportional hazards model, for differential abundance analysis. Extensive simulations demonstrate that CAMP achieves proper type I error control and high power. Applying CAMP to a human gut microbiome dataset, we identify 60 new differentially abundant taxa across geographic locations, showcasing its usefulness. CAMP overcomes sparsity challenges, enabling improved statistical analysis and providing valuable insights into microbiome data in various contexts.
Availability and implementation: The R package is available at https://github.com/lapsumchan/CAMP.
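As a rough illustration of the censoring idea described in the Results, the sketch below maps one taxon's read counts to time-to-event-like pairs. The abstract does not give the exact transformation, so the details here are assumptions made for illustration only: "time" is taken as the negative log of the relative abundance, and a zero count is treated as right-censored at the negative log of the smallest proportion the sample could have detected (one read out of its sequencing depth). The function name `censoring_transform` and the toy data are hypothetical and are not part of the CAMP package.

```python
import math


def censoring_transform(counts, depths):
    """Map one taxon's read counts to (time, event) pairs.

    Assumptions (not stated in the abstract):
      * time = -log(count / depth) for a nonzero count (event = 1);
      * a zero count is right-censored (event = 0) at -log(1 / depth),
        i.e. at the sample's detection limit.
    """
    pairs = []
    for count, depth in zip(counts, depths):
        if count > 0:
            # Observed abundance: treated as an observed "event".
            pairs.append((-math.log(count / depth), 1))
        else:
            # Zero count: abundance is only known to be below 1 / depth.
            pairs.append((-math.log(1 / depth), 0))
    return pairs


# Hypothetical toy data: four samples with different sequencing depths.
counts = [0, 12, 0, 250]
depths = [10_000, 20_000, 35_000, 50_000]
pairs = censoring_transform(counts, depths)

for time, event in pairs:
    print(f"time={time:.3f} event={event}")
```

Under these assumptions, the two zero counts receive different censoring times because their samples have different depths, which is one way zeros can stop being tied. The resulting pairs have the shape expected by standard survival tools such as a log-rank test or a Cox proportional hazards model.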
© The Author(s) 2024. Published by Oxford University Press.
Conflict of interest statement
None declared.