BMC Bioinformatics. 2012 Aug 10;13:199.
doi: 10.1186/1471-2105-13-199.

Normalization of ChIP-seq data with control

Kun Liang et al. BMC Bioinformatics. 2012.

Abstract

Background: ChIP-seq has become an important tool for identifying genome-wide protein-DNA interactions, including transcription factor binding and histone modifications. In ChIP-seq experiments, ChIP samples are usually coupled with their matching control samples. Proper normalization between the ChIP and control samples is an essential aspect of ChIP-seq data analysis.

Results: We have developed a novel method for estimating the normalization factor between the ChIP and the control samples. Our method, named NCIS (Normalization of ChIP-seq), can accommodate both low and high sequencing depth datasets. We compare the statistical properties of NCIS against existing methods in a diverse set of simulation settings, where NCIS achieves the best estimation precision. In addition, we illustrate the impact of the normalization factor on FDR control and show that NCIS yields the most power among methods that control FDR at nominal levels.

Conclusion: Our results indicate that the proper normalization between the ChIP and control samples is an important step in ChIP-seq analysis in terms of power and error rate control. Our proposed method shows excellent statistical properties and is useful in the full range of ChIP-seq applications, especially with deeply sequenced data.

Figures

Figure 1
ChIP/control ratio as a function of total count for C. elegans data. (a) Marginal ChIP/control ratio against total count, both on a log10 scale, from a C. elegans ChIP-seq dataset of transcription factor PHA-4 [18]. Sizes of the plotting circles are proportional to log10 of the number of reads. The vertical dashed line marks the total count selected by NCIS to estimate the normalization constant. The horizontal dashed line marks the normalization factor estimate from NCIS. (b) Normalization constant as a function of bin-width. The vertical dashed line marks the bin-width selected by NCIS to estimate the normalization constant. The horizontal dashed line marks the normalization factor estimate from NCIS.
Figure 2
Statistical properties of normalization factor estimators. Mean and MSE (log10) for estimating the normalization factor in simulation setting 1 (left), setting 2 (middle) and setting 3 (right) with c = 1. The true value of the normalization factor is 1.
Figure 3
FDR control and power. FDR control with the sample-swapping method. (a) Comparison of FDR levels with different normalization factor estimators. (b) Power comparison under FDR control at the 0.05 level with different normalization factor estimators.
Figure 4
ChIP vs control bin counts for yeast strain SEG1. ChIP versus control bin counts for yeast strain SEG1 plotted with bin-width of 500 bp. The upper black line represents the sequencing depth ratio, and the lower blue line the NCIS normalization factor estimate.
Figure 5
ChIP/control ratio as a function of total count for human NFκB data. NFκB marginal ChIP/control ratio against total count with a bin-width of 100 bp, both on a natural log scale. Sizes of the plotting symbols are proportional to log10 of the number of reads. The horizontal dashed line indicates the NCIS estimate of the normalization factor. The vertical dashed line represents the NCIS total count threshold (tw).
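
The captions above contrast two quantities: the sequencing depth ratio (Figure 4) and a ChIP/control ratio computed at or below a total count threshold (Figures 1 and 5). The Python sketch below illustrates that contrast on made-up bin counts. It is a minimal illustration, not the NCIS implementation: the function names are hypothetical, the threshold max_total is picked by hand here (NCIS selects its total count threshold and bin-width from the data, by a procedure this page does not describe), and restricting to low-count bins as a proxy for background is an assumption of this sketch.

```python
from typing import Sequence


def depth_ratio(chip: Sequence[int], control: Sequence[int]) -> float:
    """Sequencing depth ratio: total ChIP reads over total control reads."""
    return sum(chip) / sum(control)


def low_count_ratio(chip: Sequence[int], control: Sequence[int],
                    max_total: int) -> float:
    """ChIP/control ratio over bins whose total count is <= max_total.

    max_total is a hand-picked, hypothetical threshold; it stands in for
    the data-driven total count threshold mentioned in the captions.
    """
    chip_sum = 0
    control_sum = 0
    for x, y in zip(chip, control):
        if x + y <= max_total:
            chip_sum += x
            control_sum += y
    if control_sum == 0:
        raise ValueError("no control reads in the selected bins")
    return chip_sum / control_sum


# Made-up per-bin read counts; two bins (40 and 55) mimic enriched regions.
chip = [2, 3, 1, 2, 40, 2, 3, 55, 1, 2]
control = [2, 3, 2, 2, 3, 2, 3, 4, 2, 2]

print(depth_ratio(chip, control))         # ratio using all bins
print(low_count_ratio(chip, control, 6))  # ratio using low-count bins only
```

In this toy example the two enriched bins inflate the depth ratio, while the ratio restricted to low-count bins stays close to 1. This mirrors the gap between the two lines described in the Figure 4 caption.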

References

    1. Blow M, McCulley D, Li Z, Zhang T, Akiyama J, Holt A, Plajzer-Frick I, Shoukry M, Wright C, Chen F. ChIP-Seq identification of weakly conserved heart enhancers. Nat Genet. 2010;42(9):806–810. doi: 10.1038/ng.650.
    2. Ramagopalan S, Heger A, Berlanga A, Maugeri N, Lincoln M, Burrell A, Handunnetthi L, Handel A, Disanto G, Orton S. A ChIP-seq defined genome-wide map of vitamin D receptor binding: Associations with disease and evolution. Genome Res. 2010;20(10):1352. doi: 10.1101/gr.107920.110.
    3. Smagulova F, Gregoretti I, Brick K, Khil P, Camerini-Otero R, Petukhova G. Genome-wide analysis reveals novel molecular features of mouse recombination hotspots. Nature. 2011;472(7343):375–378. doi: 10.1038/nature09869.
    4. Park P. ChIP–seq: advantages and challenges of a maturing technology. Nat Rev Genet. 2009;10(10):669–680. doi: 10.1038/nrg2641.
    5. Xu H, Handoko L, Wei X, Ye C, Sheng J, Wei C, Lin F, Sung W. A signal–noise model for significance analysis of ChIP-seq with negative control. Bioinformatics. 2010;26(9):1199–1204. doi: 10.1093/bioinformatics/btq128.
