The Impact of Normalization Methods on RNA-Seq Data Analysis
- PMID: 26176014
- PMCID: PMC4484837
- DOI: 10.1155/2015/621690
Abstract
High-throughput sequencing technologies, such as the Illumina HiSeq, are powerful new tools for investigating a wide range of biological and medical problems. The massive and complex data sets produced by the sequencers create a need for statistical and computational methods that can handle the analysis and management of these data. Data normalization is one of the most crucial steps of data processing, and it must be considered carefully because it has a profound effect on the results of the analysis. In this work, we present a comprehensive comparison of five normalization methods related to sequencing depth that are widely used for transcriptome sequencing (RNA-seq) data, and we assess their impact on the results of gene expression analysis. Based on this study, we suggest a universal workflow for selecting the optimal normalization procedure for any particular data set. The workflow includes calculation of bias and variance values for control genes, the sensitivity and specificity of the methods, and classification errors, as well as generation of diagnostic plots. Combining this information facilitates selection of the most appropriate normalization method for the studied data sets and determines which methods can be used interchangeably.
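As a rough illustration of the first step of such a workflow, the sketch below applies two depth-related scaling schemes and then computes bias and variance for a set of control genes. The abstract does not name the five methods compared, so total-count and median scaling are assumptions chosen only because they are common. The function names, the toy count matrix, and the definition of bias (mean log2 fold change of control genes, which would ideally be zero) are also assumptions rather than the authors' implementation.

```python
from math import log2
from statistics import mean, median, pvariance


def total_count_normalize(counts):
    """Scale each sample (column) so its library size equals the mean library size."""
    n_samples = len(counts[0])
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    target = mean(lib_sizes)
    return [[row[j] * target / lib_sizes[j] for j in range(n_samples)] for row in counts]


def median_normalize(counts):
    """Scale each sample so the median of its nonzero counts equals the mean of those medians."""
    n_samples = len(counts[0])
    medians = [median([row[j] for row in counts if row[j] > 0]) for j in range(n_samples)]
    target = mean(medians)
    return [[row[j] * target / medians[j] for j in range(n_samples)] for row in counts]


def control_gene_bias_variance(norm, control_rows, group_a, group_b, pseudo=1.0):
    """Bias and variance of log2 fold changes (group B vs. group A) over control genes.

    Assumption: control genes are not differentially expressed, so their
    log2 fold change should be close to zero after a good normalization.
    """
    lfcs = []
    for i in control_rows:
        a = mean([norm[i][j] for j in group_a])
        b = mean([norm[i][j] for j in group_b])
        lfcs.append(log2((b + pseudo) / (a + pseudo)))
    return mean(lfcs), pvariance(lfcs)


# Hypothetical toy data: rows are genes, columns are samples.
# Samples 2 and 3 were sequenced at twice the depth of samples 0 and 1.
counts = [
    [10, 10, 20, 20],
    [20, 20, 40, 40],
    [30, 30, 60, 60],
    [40, 40, 80, 80],
]
control_rows = [0, 1]            # hypothetical control genes
group_a, group_b = [0, 1], [2, 3]

for name, method in [("total count", total_count_normalize), ("median", median_normalize)]:
    bias, var = control_gene_bias_variance(method(counts), control_rows, group_a, group_b)
    print(f"{name}: bias={bias:.3f}, variance={var:.3f}")
```

Under these assumptions, a method that leaves the control genes with bias and variance near zero would be preferred. The remaining criteria in the workflow (sensitivity, specificity, classification error, diagnostic plots) would need labelled data and are not sketched here.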