Abstract
We report scM&T-seq, a method for parallel single-cell genome-wide methylome and transcriptome sequencing that allows for the discovery of associations between transcriptional and epigenetic variation. Profiling of 61 mouse embryonic stem cells confirmed known links between DNA methylation and transcription. Notably, the method revealed previously unrecognized associations between heterogeneously methylated distal regulatory elements and transcription of key pluripotency genes.
References
Shapiro, E., Biezuner, T. & Linnarsson, S. Nat. Rev. Genet. 14, 618–630 (2013).
Guo, H. et al. Genome Res. 23, 2126–2135 (2013).
Smallwood, S.A. et al. Nat. Methods 11, 817–820 (2014).
Farlik, M. et al. Cell Rep. 10, 1386–1397 (2015).
Levsky, J.M., Shenoy, S.M., Pezo, R.C. & Singer, R.H. Science 297, 836–840 (2002).
Yan, L. et al. Nat. Struct. Mol. Biol. 20, 1131–1139 (2013).
Macaulay, I.C. et al. Nat. Methods 12, 519–522 (2015).
Dey, S.S., Kester, L., Spanjaard, B., Bienko, M. & van Oudenaarden, A. Nat. Biotechnol. 33, 285–289 (2015).
Schübeler, D. Nature 517, 321–326 (2015).
Jones, P.A. Nat. Rev. Genet. 13, 484–492 (2012).
Singer, Z.S. et al. Mol. Cell 55, 319–331 (2014).
Kalmar, T. et al. PLoS Biol. 7, e1000149 (2009).
Chambers, I. et al. Nature 450, 1230–1234 (2007).
Singh, A.M., Hamazaki, T., Hankowski, K.E. & Terada, N. Stem Cells 25, 2534–2542 (2007).
Torres-Padilla, M.E. & Chambers, I. Development 141, 2173–2181 (2014).
Ficz, G. et al. Cell Stem Cell 13, 351–359 (2013).
Klein, A.M. et al. Cell 161, 1187–1201 (2015).
Kolodziejczyk, A.A. et al. Cell Stem Cell 17, 471–485 (2015).
Habibi, E. et al. Cell Stem Cell 13, 360–369 (2013).
Stadler, M.B. et al. Nature 480, 490–495 (2011).
Lee, H.J., Hore, T.A. & Reik, W. Cell Stem Cell 14, 710–719 (2014).
Papp, B. & Plath, K. EMBO J. 31, 4255–4257 (2012).
Whyte, W.A. et al. Cell 153, 307–319 (2013).
Krueger, F. & Andrews, S.R. Bioinformatics 27, 1571–1572 (2011).
Wu, T.D. & Nacu, S. Bioinformatics 26, 873–881 (2010).
Love, M.I., Huber, W. & Anders, S. Genome Biol. 15, 550 (2014).
Trapnell, C. et al. Nat. Biotechnol. 28, 511–515 (2010).
Bourgon, R., Gentleman, R. & Huber, W. Proc. Natl. Acad. Sci. USA 107, 9546–9551 (2010).
Acknowledgements
We thank A. Kolodziejczyk and S.A. Teichmann for providing a list of 86 ESC pluripotency and differentiation genes (ref. 18). We thank W. Haerty for his supervision and valuable advice to T.X.H. We thank the Wellcome Trust Sanger Institute sequencing pipeline team for assistance with Illumina sequencing. We thank the members of the Sanger–European Bioinformatics Institute (EBI) Single-Cell Genomics Centre for general advice. W.R. is supported by the UK Biotechnology and Biological Sciences Research Council (BBSRC), the Wellcome Trust and the EU. G.K. is supported by the BBSRC, the UK Medical Research Council (MRC) and the EU. C.P.P. is supported by the Wellcome Trust and the MRC. T.V. is supported by the Wellcome Trust and KU Leuven (SymBioSys, PFV/10/016). H.J.L. is supported by EU Network of Excellence EpiGeneSys. O.S. is supported by the European Molecular Biology Laboratory (EMBL), the Wellcome Trust and the EU.
Author information
Contributions
C.A. performed all statistical analyses of the data. H.J.L., I.C.M., S.J.C. and S.A.S. developed the protocol and performed experiments. H.J.L., I.C.M., C.A., S.J.C., O.S., W.R. and C.P.P. interpreted the results. M.J.T. contributed to method development. T.X.H. processed RNA-seq data. F.K. processed BS-seq data. W.R., G.K., I.C.M. and T.V. contributed protocols and reagents. H.J.L., I.C.M., W.R. and T.V. conceived the project. W.R., O.S., T.V. and G.K. jointly supervised the project. O.S., H.J.L., S.J.C., W.R. and I.C.M. wrote the paper with input from all other authors. Names of authors who contributed equally to this work are ordered alphabetically on the first page.
Ethics declarations
Competing interests
W.R. is a consultant and shareholder of Cambridge Epigenetix.
Integrated supplementary information
Supplementary Figure 1 Detailed flow chart of the scM&T-seq protocol.
Single cells are collected and lysed before poly-A RNA is captured on magnetic beads and physically separated from DNA. Amplified cDNA is generated from mRNA on beads whilst DNA is bisulfite converted and Illumina sequencing libraries are prepared from both components in parallel.
Supplementary Figure 2 Quality metrics of scRNA-seq data obtained from mouse ESCs profiled using scM&T-seq.
(a,b) Number of genes detected (y-axis) as a function of the expression cutoff (x-axis). In each cell, between 4,000 and 8,000 genes were expressed at TPM>1 (dashed line at x=1). High-quality cells generally have about 5,000 genes detectable at the cutoff of TPM>1, indicating high quality among the 61 serum ESCs (or the 14 2i ESCs). (c,d) Distribution of Pearson correlation coefficients calculated pairwise across the 61 serum ESCs (or the 14 2i ESCs). The observed correlation coefficients were mostly between 0.7 and 0.99, indicating a high degree of technical consistency in the measured transcriptomes and attesting to the high quality of the scRNA-seq data.
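The detection curve in panels a and b can be illustrated with a minimal sketch. The function name `genes_detected`, the gene names and the TPM values below are hypothetical and serve only to show how the number of detected genes is counted at a given cutoff; they are not taken from the study.

```python
def genes_detected(tpm_by_gene, cutoff=1.0):
    """Count genes whose TPM strictly exceeds the cutoff in one cell."""
    return sum(1 for tpm in tpm_by_gene.values() if tpm > cutoff)


# Hypothetical toy cell: gene name -> TPM (illustrative values only).
toy_cell = {"GeneA": 0.0, "GeneB": 0.5, "GeneC": 1.0, "GeneD": 3.2, "GeneE": 250.0}

# Detection curve: number of genes detected at each expression cutoff.
cutoffs = [0.1, 1.0, 10.0]
curve = [genes_detected(toy_cell, c) for c in cutoffs]
print(dict(zip(cutoffs, curve)))
```

Applied to each cell in turn, the value at the cutoff of 1 corresponds to the position of the dashed line in the figure.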
Supplementary Figure 3 Quality metrics of single-cell methylomes in serum ESCs profiled using alternative protocols.
Shown are quality metrics for 20 serum ESCs profiled with the scM&T-seq protocol, compared with 20 serum ESCs profiled with scBS-seq (Smallwood et al. 2014). (a) Read mapping efficiency. (b) Read duplication rate. (c) Genome-wide CpG and CHH methylation rate per cell. (d) Analysis of representation bias for different genomic contexts. (e) FASTQC report of adapter content from one representative single-cell bisulfite library (Read 1 of cell B06). A large proportion of sequenced fragments are concatemers of the primer used in first-strand synthesis, which substantially limits the alignment rates of these libraries. It may be possible to improve mapping efficiencies by reducing oligo concentrations or reaction times, but this is likely to result in reduced genomic coverage.
Supplementary Figure 4 Methylation coverage in different genomic contexts.
Shown is the percentage of genomic contexts of different classes (y-axis) that are covered for an increasing number of minimum cells (x-axis), considering both scBS-seq (Smallwood et al. 2014, green) and scM&T-seq (blue). Note that the total number of serum cells is 20 for scBS-seq and 61 for scM&T-seq.
Supplementary Figure 5 Genome-wide methylation coverage.
Shown is the percentage of genome-wide 10kb, 5kb, and 1kb windows covered (y-axis) by an increasing minimum number of cells (x-axis), for scBS-seq (Smallwood et al. 2014, green) and scM&T-seq (blue). Note that the total number of serum cells is 20 for scBS-seq and 61 for scM&T-seq.
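The coverage curves in Supplementary Figures 4 and 5 can be sketched as follows. This is an illustrative reconstruction, assuming each cell is represented by the set of window (or context) indices in which it has at least one covered CpG; the function name `coverage_curve` and the toy data are hypothetical.

```python
def coverage_curve(covered_per_cell, n_windows):
    """Percentage of windows covered in at least k cells, for k = 1..n_cells."""
    counts = [0] * n_windows
    for windows in covered_per_cell:
        for w in set(windows):  # count each window once per cell
            counts[w] += 1
    n_cells = len(covered_per_cell)
    return [
        100.0 * sum(1 for c in counts if c >= k) / n_windows
        for k in range(1, n_cells + 1)
    ]


# Hypothetical toy data: three cells, five genomic windows (indices 0..4).
toy_cells = [{0, 1, 2}, {1, 2}, {2, 3}]
curve = coverage_curve(toy_cells, n_windows=5)
print(curve)
```

Because the x-axis is an absolute number of cells, curves from data sets of different sizes (20 versus 61 serum cells here) are not directly comparable at the same x value.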
Supplementary Figure 6 Hierarchical clustering of DNA-methylation profiles generated by scM&T-seq and scBS-seq.
Shown is a joint hierarchical clustering of 61 serum and 16 2i cells profiled using scM&T-seq and 20 serum and 12 2i ESCs profiled by scBS-seq (Smallwood et al. 2014), together with the corresponding synthetic bulk samples and an independent bulk BS-seq sample from serum ESCs (Ficz et al. 2013). The clustering analysis was performed on gene-body methylation of the 500 genes with the largest epigenome heterogeneity.
Supplementary Figure 7 Correlation between single-cell methylomes and the methylome of a bulk cell population.
Shown is a scatter plot relating bulk gene-body methylation (Ficz et al. 2013) on the x-axis to synthetic bulk estimates of gene-body methylation derived using either scBS-seq (Smallwood et al. 2014, green) or scM&T-seq (blue) on the y-axis. Synthetic bulk methylation profiles are derived from averages of the single-cell methylation profiles. The true bulk methylation profile is concordant with both single-cell profiles; the scM&T-seq bulk estimates correlate slightly better (R=0.77) than the scBS-seq bulk estimates (R=0.69).
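A synthetic bulk profile of this kind can be sketched as a per-gene average over cells. The legend does not state how cells without coverage are handled or whether cells are weighted by read depth, so the unweighted mean over covered cells used below is an assumption, and `synthetic_bulk` and the toy values are hypothetical.

```python
def synthetic_bulk(cells):
    """Average per-gene methylation rates across cells.

    Each cell maps gene -> methylation rate in [0, 1]; a missing key or a
    None value means the gene was not covered in that cell (assumption).
    """
    genes = sorted({g for cell in cells for g in cell})
    bulk = {}
    for g in genes:
        values = [cell[g] for cell in cells if cell.get(g) is not None]
        bulk[g] = sum(values) / len(values) if values else None
    return bulk


# Hypothetical toy data: three cells with incomplete coverage.
toy_cells = [
    {"GeneA": 1.0, "GeneB": 0.0},
    {"GeneA": 0.5, "GeneB": None},
    {"GeneA": 0.0},
]
bulk = synthetic_bulk(toy_cells)
print(bulk)
```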
Supplementary Figure 8 Principal-component analysis of gene-body methylation and gene expression in serum-grown ESCs.
Shown are projections onto the first two principal components (left) alongside the percentage of variance explained by individual components (right) for both gene expression levels (a) and gene-body methylation (b). Cells are color-coded based on the clustering obtained using gene expression values, showing that the methylation principal components partially recapitulate the structure in the expression data.
Supplementary Figure 9 Scatter-plot matrix of principal components from methylation and gene expression profiles.
Shown are scatter plots between individual principal components of gene expression levels (y-axis) and corresponding gene body methylation (x-axis), using 61 serum cells profiled using scM&T-seq. Cells are color coded as in Supplementary Fig. 8. There is a strong correlation between the second principal component of DNA methylation and the corresponding component from gene expression, suggesting shared axes of variation between transcriptome and methylome profiles.
Supplementary Figure 10 Clustering analysis of transcriptome and methylation data from 61 serum ESCs.
Shown are heatmaps for the gene body methylation (left) and gene expression profiles (right) using the 300 most heterogeneous genes (based on gene expression). The order of genes was taken from an individual clustering analysis based on gene methylation whereas cells were clustered separately either using DNA methylation or expression data, showing unlinked clusters (colored clusters). The bar plots in the center show the heterogeneity in DNA methylation (left) and gene expression (right).
Supplementary Figure 11 Bootstrap robustness analysis of the gene-specific correlation analysis.
Shown are the absolute (a) and relative (b) reductions in the number of significant methylation-expression associations for different genomic contexts, as well as the root mean squared error of Pearson’s correlation coefficient (c), when the methylation-RNA correlation analysis is run on bootstrapped samples rather than the full dataset. Bootstrap samples were obtained from independent draws of 60%, 70%, or 80% of the total set of cells. As expected, a reduction in the number of analyzed cells resulted in reduced power to detect significant associations (a, b). Overall, only a relatively small number of linkages were affected and the concordance with the full dataset remained high (c).
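The robustness check can be illustrated for a single gene with the sketch below. It assumes that each draw selects a fraction of cells without replacement and that the error is measured against the full-data Pearson correlation; the legend does not specify the sampling scheme or the number of draws, so both are assumptions. The helper names and toy data are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def subsample_rmse(meth, expr, fraction, n_draws=100, seed=0):
    """RMSE between subsample correlations and the full-data correlation."""
    rng = random.Random(seed)
    n = len(meth)
    k = min(n, max(3, round(fraction * n)))  # cells per draw
    full_r = pearson(meth, expr)
    squared_errors = []
    for _ in range(n_draws):
        idx = rng.sample(range(n), k)  # draw cells without replacement
        r = pearson([meth[i] for i in idx], [expr[i] for i in idx])
        squared_errors.append((r - full_r) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical toy gene: expression falls linearly with methylation.
toy_meth = [i / 10 for i in range(1, 11)]
toy_expr = [10.0 - 8.0 * m for m in toy_meth]
for fraction in (0.6, 0.7, 0.8):
    print(fraction, subsample_rmse(toy_meth, toy_expr, fraction))
```

In the toy data the relationship is exactly linear, so every subsample reproduces the full-data correlation and the error is zero up to rounding; with noisy data the error would be expected to grow as the fraction of cells shrinks.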
Supplementary Figure 12 Correlation coefficients for associations between DNA-methylation profiles in alternative genomic contexts and gene expression levels.
Shown are boxplots of the correlation coefficient (Pearson r) between DNA methylation in different genomic contexts and corresponding gene expression levels (see Supplementary Table 2).
Supplementary Figure 13 Volcano plots for association tests between DNA-methylation profiles in alternative genomic contexts and gene expression levels.
For each context, shown is the correlation coefficient (Pearson r, x-axis) versus the adjusted p-value (Benjamini–Hochberg adjustment; y-axis). The blue horizontal line corresponds to the 10% FDR significance level. Each dot corresponds to a gene, and the dot size reflects the adjusted p-value of the association test. Genes colored in red correspond to known pluripotency genes (Supplementary Table 5). The vertical orange line denotes the average correlation coefficient across all genes for a given annotation.
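The Benjamini–Hochberg adjustment and the 10% FDR threshold named in this legend can be sketched as follows. This is the standard step-up procedure written out in plain Python, not the authors' code; the p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values from four gene-level association tests.
toy_p = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(toy_p)
significant = [q <= 0.10 for q in adjusted]  # 10% FDR threshold
print(adjusted, significant)
```

Genes whose adjusted p-value falls at or below 0.10 would lie beyond the blue line in the volcano plot.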
Supplementary Figure 15 Comparison of cell-specific correlation analysis with known covariates (CpG coverage).
For alternative genomic contexts, shown are scatter plots between cell-specific methylation-expression correlation coefficients and the (technical) CpG coverage in the corresponding cell. The lack of associations suggests that technical factors do not drive the heterogeneity in the coupling between methylation and expression between cells.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–15 and Supplementary Table 3 (PDF 2789 kb)
Supplementary Table 1
scRNA-seq and scBS-seq quality metrics. (XLSX 119 kb)
Supplementary Table 2
Genomic contexts considered for the methylation–gene expression association analyses. (XLSX 9 kb)
Supplementary Table 4
Gene-level results of the association tests between DNA-methylation variation in alternative genomic contexts and gene expression variation. (XLSX 21480 kb)
Supplementary Table 5
List of 86 literature-derived pluripotency genes. (XLS 33 kb)
Supplementary Table 6
Summary statistics obtained for the cell-specific association analysis correlating the methylome and the transcriptome in individual cells. (XLSX 51 kb)
Supplementary Software
scMT-seq software (ZIP 11 kb)
About this article
Cite this article
Angermueller, C., Clark, S., Lee, H. et al. Parallel single-cell sequencing links transcriptional and epigenetic heterogeneity. Nat Methods 13, 229–232 (2016). https://doi.org/10.1038/nmeth.3728
DOI: https://doi.org/10.1038/nmeth.3728