RNA-seq differential expression studies: more sequence or more replication?
- PMID: 24319002
- PMCID: PMC3904521
- DOI: 10.1093/bioinformatics/btt688
Abstract
Motivation: RNA-seq is replacing microarrays as the primary tool for gene expression studies. Many RNA-seq studies have used insufficient biological replicates, resulting in low statistical power and inefficient use of sequencing resources.
Results: We show the explicit trade-off between more biological replicates and deeper sequencing in increasing power to detect differentially expressed (DE) genes. In the human cell line MCF7, adding sequencing depth beyond 10 M reads gives diminishing returns on power to detect DE genes, whereas adding biological replicates improves power significantly regardless of sequencing depth. We also propose a cost-effectiveness metric for guiding the design of large-scale RNA-seq DE studies. Our analysis showed that sequencing fewer reads and performing more biological replication is an effective strategy to increase power and accuracy in large-scale differential expression RNA-seq studies, and provided new insights into efficient experiment design of RNA-seq studies.
Availability and implementation: The code used in this paper is provided on: http://home.uchicago.edu/~jiezhou/replication/. The expression data is deposited in the Gene Expression Omnibus under the accession ID GSE51403.
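The replicate-versus-depth trade-off described in the Results can be illustrated with a rough back-of-the-envelope calculation. The sketch below is not the authors' analysis or their cost-effectiveness metric. It assumes a negative binomial count model, a delta-method approximation for the variance of the log mean, and a two-sided Wald test on the log fold change with no multiple-testing correction. The function name `approx_power` and all default values (expression level, dispersion, fold change) are hypothetical and chosen only for illustration.

```python
from math import log, sqrt
from statistics import NormalDist


def approx_power(depth_millions, n_reps, reads_per_million=5.0,
                 dispersion=0.1, fold_change=1.5, alpha=0.05):
    """Approximate power to detect one DE gene in a two-group comparison.

    Assumptions (illustrative, not taken from the paper):
    - counts are negative binomial with mean mu and variance mu + dispersion * mu**2
    - the log of a group mean has variance about (1/mu + dispersion) / n_reps
    - a normal (Wald) approximation is used for the log fold change
    """
    # Expected count for this gene at the given sequencing depth.
    mu = depth_millions * reads_per_million
    # Sampling noise (1/mu) shrinks with depth; biological noise (dispersion) does not.
    var_log_mean = (1.0 / mu + dispersion) / n_reps
    # The log fold change is a difference of two independent group estimates.
    se = sqrt(2.0 * var_log_mean)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # Power of the test, ignoring the negligible opposite tail.
    return NormalDist().cdf(abs(log(fold_change)) / se - z_crit)


if __name__ == "__main__":
    for depth, reps in [(10, 3), (30, 3), (10, 6)]:
        print(f"depth={depth}M reps={reps} power={approx_power(depth, reps):.2f}")
```

Under these assumptions, tripling depth only shrinks the 1/mu term, which is already small next to the dispersion once counts are moderate, whereas doubling the replicates halves the whole variance. That is the qualitative pattern the abstract reports; the actual numbers depend entirely on the assumed parameters.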