Mol Cell Biol. 2015 Dec 28;36(5):662-7.
doi: 10.1128/MCB.00970-14.

The Overlooked Fact: Fundamental Need for Spike-In Control for Virtually All Genome-Wide Analyses

Kaifu Chen et al. Mol Cell Biol. 2015.

Abstract

Genome-wide analyses of changes in gene expression, transcription factor occupancy on DNA, histone modification patterns on chromatin, genomic copy number variation, and nucleosome positioning have become popular in many modern laboratories, yielding a wealth of information during health and disease states. However, most of these studies have overlooked an inherent normalization problem that must be corrected with spike-in controls. Here we describe the reason why spike-in controls are so important and explain how to appropriately design and use spike-in controls for normalization. We also suggest ways to retrospectively renormalize data sets that were wrongly interpreted due to omission of spike-in controls.

Figures

FIG 1
Schematic showing why sequencing experiments require spike-in controls for accurate comparison between samples. Examples are shown for specific regions of the genome. (a) When the same degree of change happens everywhere on the genome, normalizing total sequencing reads to the same number hides the change, whereas normalizing spike-in reads to the same number reveals the global change in read density. (b) When signal increases happen at specific genomic regions, normalizing total sequencing reads between samples introduces artifactual reductions in the number of reads from other regions of the genome, which are then falsely interpreted as genuine reductions under the specific experimental condition. Such artifactual changes can be avoided by using spike-in controls as a reference for normalization. (c) Differences in copy numbers of methylated DNA, such as at repeat regions, can be detected accurately only with a spike-in reference, although the methylation ratio per se may be analyzed correctly without spike-in controls.
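Panels (a) and (b) can be illustrated with a small numeric sketch. The read counts below are hypothetical and are not taken from the article; the helper names `scale_to_total` and `scale_to_spike_in` are also illustrative. The sketch assumes that the same amount of spike-in was added to each sample, that the treated sample was sequenced at half the depth of the control, and that total-read normalization is applied to genomic reads only.

```python
def scale_to_total(counts, target_total):
    """Scale genomic read counts so that they sum to target_total."""
    total = sum(counts)
    return [c * target_total / total for c in counts]


def scale_to_spike_in(counts, spike_reads, target_spike):
    """Scale genomic read counts so that spike-in reads equal target_spike."""
    factor = target_spike / spike_reads
    return [c * factor for c in counts]


# Hypothetical raw reads at three genomic regions, plus spike-in reads.
control, control_spike = [100, 100, 100], 30

# (a) True signal doubles everywhere, but the sample is sequenced at half depth,
# so raw genomic reads look unchanged while spike-in reads are halved.
global_up, global_up_spike = [100, 100, 100], 15

# (b) True signal quadruples at region 1 only, again sequenced at half depth.
local_up, local_up_spike = [200, 50, 50], 15

# Total-read normalization: (a) hides the change, (b) invents losses at regions 2 and 3.
a_total = scale_to_total(global_up, sum(control))
b_total = scale_to_total(local_up, sum(control))

# Spike-in normalization: both scenarios are recovered on the control's scale.
a_spike = scale_to_spike_in(global_up, global_up_spike, control_spike)
b_spike = scale_to_spike_in(local_up, local_up_spike, control_spike)

print(a_total, a_spike)
print(b_total, b_spike)
```

With total-read normalization, scenario (a) is indistinguishable from the control and scenario (b) shows regions 2 and 3 at half the control level even though their true signal did not change. Scaling by spike-in reads restores the twofold global increase and the fourfold local increase.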
FIG 2
The power of the spike-in control. (a) Snapshot of genome track showing nucleosome occupancy determined by MNase-seq in young and old cells without spike-in control. (b) Snapshot of the same region of the genome showing nucleosome occupancy determined by MNase-seq in young and old cells with spike-in normalization. (c) Heat map showing gene expression fold change determined by RNA-seq in young and old cells without spike-in control normalization. (d) Heat map showing gene expression fold change determined by RNA-seq in young and old cells with spike-in control normalization. Here, we used global-scaling normalization to a spike-in control, which is ideal for normalizing for global changes between experimental conditions.
FIG 3
Schematic of normalization of sequencing data with spike-in controls. In step 1, the numbers of raw reads (top left, y axis) of each spike-in (top left and right, x axis) are normalized to be the same between experimental conditions. In step 2, by comparing the numbers of normalized reads (top right, y axis) to the numbers of raw reads, a normalization function is generated specifically for each experimental condition (middle). In step 3, these functions are used to normalize read numbers at each genomic position (from bottom left to bottom right). This is an example of global-scaling normalization to a spike-in control, which is ideal for normalizing for global changes between experimental conditions.
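The three steps in this caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the common spike-in target in step 1 is taken to be the mean across conditions, the per-condition normalization function in step 2 is assumed to be a single scale factor (a least-squares line through the origin), and all names and read counts are hypothetical.

```python
def fit_scale_factor(raw_spike, target_spike):
    """Step 2: least-squares slope through the origin mapping raw to target reads."""
    numerator = sum(r * t for r, t in zip(raw_spike, target_spike))
    denominator = sum(r * r for r in raw_spike)
    return numerator / denominator


def normalize_conditions(spike_reads, genomic_reads):
    """Global-scaling normalization of genomic reads to spike-in controls.

    spike_reads:   {condition: [raw reads for each spike-in]}
    genomic_reads: {condition: [raw reads at each genomic position]}
    """
    conditions = list(spike_reads)
    n_spikes = len(spike_reads[conditions[0]])

    # Step 1: choose a common target for each spike-in (assumed here: the mean
    # of its raw reads across conditions).
    target = [
        sum(spike_reads[c][i] for c in conditions) / len(conditions)
        for i in range(n_spikes)
    ]

    # Step 2: derive one scale factor per condition from its spike-ins.
    factors = {c: fit_scale_factor(spike_reads[c], target) for c in conditions}

    # Step 3: apply each condition's factor to every genomic position.
    normalized = {
        c: [x * factors[c] for x in genomic_reads[c]] for c in conditions
    }
    return factors, normalized


# Hypothetical counts: the "old" sample yields twice as many spike-in reads as
# the "young" sample for the same raw genomic reads.
spike_reads = {"young": [10, 20, 40], "old": [20, 40, 80]}
genomic_reads = {"young": [100, 200], "old": [100, 200]}

factors, normalized = normalize_conditions(spike_reads, genomic_reads)
print(factors)
print(normalized)
```

In this toy input the raw genomic reads are identical in both conditions, so the difference only becomes visible once each condition is rescaled by its spike-in factor. A single scale factor suits global changes; if the spike-in response were not proportional across the concentration range, a different function would be needed in step 2.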

