A protocol to evaluate RNA sequencing normalization methods
- PMID: 31861985
- PMCID: PMC6923842
- DOI: 10.1186/s12859-019-3247-x
Abstract
Background: RNA sequencing technologies have allowed researchers to gain a better understanding of how the transcriptome affects disease. However, sequencing technologies often unintentionally introduce experimental error into RNA sequencing data. To counteract this, normalization methods are routinely applied with the intent of reducing the non-biological variability inherent in transcriptomic measurements. However, the comparative efficacy of the various normalization techniques has not been tested in a standardized manner. Here we propose tests that evaluate numerous normalization techniques and apply them to a large-scale standard data set. Together, these tests form a protocol that allows researchers to measure the amount of non-biological variability that remains in any data set after normalization, a crucial step in assessing the biological validity of normalized data.
Results: In this study, we present two tests to assess the validity of normalization methods applied to a large-scale data set collected for systematic evaluation purposes. We tested various RNASeq normalization procedures and concluded that transcripts per million (TPM) was the best-performing normalization method, based on its preservation of biological signal compared with the other methods tested.
Conclusion: Normalization is vital to accurately interpreting the results of genomic and transcriptomic experiments. More work, however, is needed to optimize normalization methods for RNASeq data. The present effort helps pave the way for more systematic evaluations of normalization methods across different platforms. With our proposed schema, researchers can evaluate their own or future normalization methods to further improve the field of RNASeq normalization.
Keywords: Biological variability; Normalization; RNASeq; Standardization.
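The Results above name transcripts per million (TPM) as the best-performing method. As a minimal sketch of the standard TPM definition (not the authors' pipeline), the following divides each gene's read count by its length in kilobases and then rescales the resulting rates so they sum to one million per sample. The function name `tpm` and its parameters `counts` and `lengths_bp` are hypothetical names chosen for illustration.

```python
from typing import List, Sequence


def tpm(counts: Sequence[float], lengths_bp: Sequence[float]) -> List[float]:
    """Convert raw read counts for one sample to transcripts per million.

    Illustrative sketch only: `counts` holds per-gene read counts and
    `lengths_bp` holds the matching gene (or transcript) lengths in base pairs.
    """
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")
    if any(length <= 0 for length in lengths_bp):
        raise ValueError("all lengths must be positive")

    # Step 1: reads per kilobase, which corrects for gene length.
    rates = [c / (length / 1000.0) for c, length in zip(counts, lengths_bp)]

    # Step 2: rescale so the rates sum to one million within the sample,
    # which corrects for sequencing depth.
    total = sum(rates)
    if total == 0:
        return [0.0] * len(rates)
    return [rate / total * 1_000_000 for rate in rates]


# Three genes whose counts scale exactly with length get equal TPM values.
example = tpm([10, 20, 30], [1000, 2000, 3000])
```

Because the rescaling happens after the length correction, TPM values always sum to the same total in every sample, which makes a gene's share of the transcript pool comparable across samples.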
Conflict of interest statement
The authors declare that they have no competing interests.
Similar articles
- PARRoT - a homology-based strategy to quantify and compare RNA-sequencing from non-model organisms. BMC Bioinformatics. 2016 Dec 22;17(Suppl 19):513. doi: 10.1186/s12859-016-1366-1. PMID: 28155708 Free PMC article.
- Expression analysis of RNA sequencing data from human neural and glial cell lines depends on technical replication and normalization methods. BMC Bioinformatics. 2018 Nov 20;19(Suppl 14):412. doi: 10.1186/s12859-018-2382-0. PMID: 30453873 Free PMC article.
- A comparison of per sample global scaling and per gene normalization methods for differential expression analysis of RNA-seq data. PLoS One. 2017 May 1;12(5):e0176185. doi: 10.1371/journal.pone.0176185. eCollection 2017. PMID: 28459823 Free PMC article.
- [RNA-Seq and its applications: a new technology for transcriptomics]. Yi Chuan. 2011 Nov;33(11):1191-202. doi: 10.3724/sp.j.1005.2011.01191. PMID: 22120074 Review. Chinese.
- Single-cell RNA-sequencing: The future of genome biology is now. RNA Biol. 2017 May 4;14(5):637-650. doi: 10.1080/15476286.2016.1201618. Epub 2016 Jul 21. PMID: 27442339 Free PMC article. Review.
Cited by
- Multi-Omics Integrative Analyses Identified Two Endotypes of Hip Osteoarthritis. Metabolites. 2024 Sep 1;14(9):480. doi: 10.3390/metabo14090480. PMID: 39330487 Free PMC article.
- The link between gene duplication and divergent patterns of gene expression across a complex life cycle. Evol Lett. 2024 Jul 2;8(5):726-734. doi: 10.1093/evlett/qrae028. eCollection 2024 Sep. PMID: 39328286 Free PMC article.
- Extracellular vesicles carry transcriptional 'dark matter' revealing tissue-specific information. J Extracell Vesicles. 2024 Aug;13(8):e12481. doi: 10.1002/jev2.12481. PMID: 39148266 Free PMC article.
- Assessing RNA-Seq Workflow Methodologies Using Shannon Entropy. Biology (Basel). 2024 Jun 28;13(7):482. doi: 10.3390/biology13070482. PMID: 39056677 Free PMC article.
- Zero is not absence: censoring-based differential abundance analysis for microbiome data. Bioinformatics. 2024 Feb 1;40(2):btae071. doi: 10.1093/bioinformatics/btae071. PMID: 38331411 Free PMC article.