Epigenomics. 2010 Aug;2(4):587-98.
doi: 10.2217/epi.10.36.

Protocol matters: which methylome are you actually studying?

Mark D Robinson et al. Epigenomics. 2010 Aug.

Abstract

The field of epigenetics is now capitalizing on the vast number of emerging technologies, largely based on second-generation sequencing, which interrogate DNA methylation status and histone modifications genome-wide. However, getting an exhaustive and unbiased view of a methylome at a reasonable cost is proving to be a significant challenge. In this article, we take a closer look at the impact of the DNA sequence and the bias effects introduced to datasets by genome-wide DNA methylation technologies and, where possible, explore the bioinformatics tools that deconvolve them. There remains much to be learned about the performance of genome-wide technologies, the data we mine from these assays and how they reflect the actual biology. While there are several methods to interrogate DNA methylation status genome-wide, our opinion is that no single technique suitably covers the minimum criteria of high coverage and high resolution at a reasonable cost. In fact, the fraction of the methylome that is studied currently depends entirely on the inherent biases of the protocol employed. There is promise for this to change, as the third generation of sequencing technologies is expected to again 'revolutionize' the way that we study genomes and epigenomes.

Keywords: DNA methylation; epigenetics; high-throughput sequencing; tiling arrays.

Figures

Figure 1. Read count by GC content and mappability
(A) Number of uniquely mapped reads per 2-kb region of a normal prostate genome, stratified into 50 equally-sized groups based on GC content. (B) Number of uniquely mapped reads per 2-kb region of a normal prostate genome, stratified into 50 equally-sized groups based on ‘mappability’.
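The stratification described in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the rank-based grouping, and the use of a per-group mean are all assumptions made for the example.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def stratify(values, n_groups=50):
    """Rank regions by value and split the ranking into n_groups groups
    of (as near as possible) equal size. Returns lists of region indices."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = []
    for g in range(n_groups):
        lo = g * len(order) // n_groups
        hi = (g + 1) * len(order) // n_groups
        groups.append(order[lo:hi])
    return groups


def mean_reads_per_group(gc_values, read_counts, n_groups=50):
    """Mean read count per GC stratum; None for a stratum with no regions.
    The per-group mean is an assumed summary, chosen for illustration."""
    means = []
    for group in stratify(gc_values, n_groups):
        if group:
            means.append(sum(read_counts[i] for i in group) / len(group))
        else:
            means.append(None)
    return means
```

Here `gc_values` would hold one GC fraction per 2-kb region and `read_counts` the matching number of uniquely mapped reads; the same grouping could be applied to a mappability score in place of GC content.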
Figure 2. Mapping bisulfite-converted sequence to a reference genome
The example shown illustrates the potential methylation-status-specific bias that can be introduced when mapping bisulfite-converted sequences to a degenerate BS genome (left track) or to a 3-base bisulfite-converted genome (right track).
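As a rough illustration of what mapping to a 3-base bisulfite-converted genome involves, the sketch below converts both the read and the forward-strand reference in silico (every C becomes T) and then looks for exact matches. It is a toy under stated assumptions: the function names are hypothetical, only the forward strand is handled, and exact string matching stands in for a real aligner. It does not attempt to reproduce the bias shown in the figure.

```python
def c_to_t(seq):
    """Fully convert a sequence in silico: every C becomes T."""
    return seq.upper().replace("C", "T")


def find_three_base_hits(read, reference):
    """Return 0-based start positions at which the converted read matches
    the converted forward-strand reference exactly.
    Exact matching is a simplification used only for illustration."""
    conv_read = c_to_t(read)
    conv_ref = c_to_t(reference)
    hits = []
    start = conv_ref.find(conv_read)
    while start != -1:
        hits.append(start)
        start = conv_ref.find(conv_read, start + 1)
    return hits


reference = "AACGTTCGAA"
methylated_read = "ACGTTCG"    # methylated Cs are retained after bisulfite treatment
unmethylated_read = "ATGTTTG"  # unmethylated Cs are read as T

hits_methylated = find_three_base_hits(methylated_read, reference)
hits_unmethylated = find_three_base_hits(unmethylated_read, reference)
```

In this toy case both reads collapse to the same converted string, so they are placed at the same position; the methylation state would then be recovered by comparing the original read with the unconverted reference.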
Figure 3. Observed CpG coverage and ‘efficiency’ of various DNA methylation analyses
(A) Total number of genomic CpG sites interrogated at given coverage for several published DNA methylation mapping studies. Dotted lines represent enrichment-based techniques. Solid lines represent bisulfite-based techniques. (B) Observed ‘efficiency’ of methylation mapping (number of CpG sites interrogated per number of uniquely mapping reads). The maximum number of CpG sites interrogated and the number of uniquely mapping reads are shown. MeDIP-seq: Methylated DNA immunoprecipitation sequencing; MethylC-seq: Whole-genome bisulfite sequencing; MiGS: Methyl CpG binding domain-isolated genome sequencing; RRBS: Reduced representation bisulfite sequencing.
Figure 4. CpG-density-dependent coverage of various DNA methylation analyses
(A) Average coverage by local CpG density for several published DNA methylation mapping studies. Local CpG density is calculated as the number of CpG sites 200 bases upstream or downstream from every CpG site. (B) Genome-wide distribution of the local CpG density. MeDIP-seq: Methylated DNA immunoprecipitation sequencing; MethylC-seq: Whole-genome bisulfite sequencing; MiGS: Methyl CpG binding domain-isolated genome sequencing; RRBS: Reduced representation bisulfite sequencing.
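The local CpG density defined in this caption can be computed along the following lines. This is a sketch rather than the authors' implementation: the function names are hypothetical, and counting the focal CpG site itself in its own window is an assumption, since the caption does not say whether it is included.

```python
from bisect import bisect_left, bisect_right


def cpg_positions(seq):
    """0-based positions of the C in every CpG dinucleotide."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def local_cpg_density(positions, window=200):
    """For each CpG, count the CpG sites lying within `window` bases
    upstream or downstream. `positions` must be sorted.
    Assumption: the focal site is included in its own count."""
    densities = []
    for p in positions:
        lo = bisect_left(positions, p - window)
        hi = bisect_right(positions, p + window)
        densities.append(hi - lo)
    return densities


# Two CpGs close together, then a third more than 200 bases away.
toy_seq = "CG" + "A" * 10 + "CG" + "A" * 300 + "CG"
toy_positions = cpg_positions(toy_seq)
toy_density = local_cpg_density(toy_positions)
```

Because the positions are sorted, each window is resolved with two binary searches, which keeps the calculation practical for the tens of millions of CpG sites in a mammalian genome.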
Figure 5. Copy number bias in enrichment-based analyses of DNA methylation
(A) Copy number estimates of MCF7 breast cancer cell line analyzed by the PICNIC algorithm [57] across human chromosome 20 (data from Sanger Cancer Genome Project [102]). (B) Log-ratios of library-size-normalized read counts between input sequencing data for MCF7 and human mammary epithelial cell (HMEC) line at 20 kb nonoverlapping intervals (data from [13]). (C) Log-ratios of library-size-normalized read counts between methylated DNA immunoprecipitation-sequencing data for MCF7 and HMEC line at 20 kb nonoverlapping intervals (data from [13]).
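The log-ratios in panels (B) and (C) can be approximated with a calculation like the one below. It is a sketch under assumptions: the function names, the use of log base 2, normalization by total read count, and the small pseudocount that guards against empty bins are all choices made for this example and are not stated in the caption.

```python
import math
from collections import Counter


def bin_counts(read_starts, bin_size=20_000):
    """Count reads per nonoverlapping bin, keyed by bin index."""
    return Counter(pos // bin_size for pos in read_starts)


def log_ratios(sample_starts, control_starts, bin_size=20_000, pseudocount=0.5):
    """Per-bin log2 ratio of library-size-normalized read counts
    (sample over control). Both read lists must be non-empty.
    The pseudocount and log base are assumptions of this sketch."""
    sample = bin_counts(sample_starts, bin_size)
    control = bin_counts(control_starts, bin_size)
    sample_total = len(sample_starts)
    control_total = len(control_starts)
    ratios = {}
    for b in sorted(set(sample) | set(control)):
        s_norm = (sample[b] + pseudocount) / sample_total
        c_norm = (control[b] + pseudocount) / control_total
        ratios[b] = math.log2(s_norm / c_norm)
    return ratios
```

With read start coordinates from a single chromosome, `log_ratios(mcf7_starts, hmec_starts)` would return one value per 20 kb bin; the variable names in that call are placeholders. A run of bins with consistently positive or negative values in the input data points to a copy number difference rather than a methylation difference.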

References

Bibliography

    1. Clark SJ, Melki J. DNA methylation and gene silencing in cancer: which is the guilty party? Oncogene. 2002;21:5380–5387.
    2. Jones PA, Baylin SB. The epigenomics of cancer. Cell. 2007;128:683–692.
    3. Gardiner-Garden M, Frommer M. CpG islands in vertebrate genomes. J Mol Biol. 1987;196:261–282.
    4. Clark SJ, Harrison J, Frommer M. CpNpG methylation in mammalian cells. Nat Genet. 1995;10:20–27.
    5. Lister R, Pelizzola M, Dowen RH, et al. Human DNA methylomes at base resolution show widespread epigenomic differences. Nature. 2009;462:315–322.
       Note: First single-base resolution methylome using whole-genome shotgun bisulfite (BS) sequencing.

Websites

    101. IUPAC Codes. www.bioinformatics.org/sms/iupac.html
    102. Sanger Cancer Genome Project. www.sanger.ac.uk/genetics/CGP/
