Mol Metab. 2020 Feb:32:109-121.
doi: 10.1016/j.molmet.2019.12.006. Epub 2019 Dec 20.

Single-cell ATAC-Seq in human pancreatic islets and deep learning upscaling of rare cells reveals cell-specific type 2 diabetes regulatory signatures

Vivek Rai et al. Mol Metab. 2020 Feb.

Abstract

Objective: Type 2 diabetes (T2D) is a complex disease characterized by pancreatic islet dysfunction, insulin resistance, and disruption of blood glucose levels. Genome-wide association studies (GWAS) have identified >400 independent signals that encode genetic predisposition. More than 90% of the associated single-nucleotide polymorphisms (SNPs) localize to non-coding regions and are enriched in chromatin-defined islet enhancer elements, indicating a strong transcriptional regulatory component to disease susceptibility. Pancreatic islets are a mixture of cell types that express distinct hormonal programs, so each cell type may contribute differentially to the underlying regulatory processes that modulate T2D-associated transcriptional circuits. Existing chromatin profiling methods such as ATAC-seq and DNase-seq, applied to islets in bulk, produce aggregate profiles that mask important cellular and regulatory heterogeneity.

Methods: We present genome-wide single-cell chromatin accessibility profiles in >1,600 cells derived from a human pancreatic islet sample using single-cell combinatorial indexing ATAC-seq (sci-ATAC-seq). We also developed a deep learning model based on the U-Net architecture to accurately predict open chromatin peak calls in rare cell populations.

Results: We show that sci-ATAC-seq profiles allow us to deconvolve alpha, beta, and delta cell populations and identify cell-type-specific regulatory signatures underlying T2D. In particular, T2D GWAS SNPs are significantly enriched in beta cell-specific open chromatin and in islet open chromatin shared across cell types, but not in alpha or delta cell-specific open chromatin. We also demonstrate, using the less abundant delta cells, that deep learning models can improve signal recovery and feature reconstruction of rarer cell populations. Finally, we use co-accessibility measures to nominate the cell-specific target genes at 104 non-coding T2D GWAS signals.

Conclusions: Collectively, we identify the islet cell type of action across genetic signals of T2D predisposition and provide higher-resolution mechanistic insights into genetically encoded risk pathways.

Keywords: Chromatin; Deep learning; Epigenomics; Islet; Single cell; Type 2 diabetes.

Figures

Figure 1
Schematic of the sci-ATAC-seq study. (A) Sci-ATAC-seq protocol for generating single-nuclei ATAC-seq data from a pancreatic islet sample. The data are then used to identify constituent cell types, and a deep learning model is used to predict peaks in the clusters with fewer nuclei. (B) ATAC-seq signal tracks for 10 bulk islet samples and the sci-ATAC-seq islet sample. Bottom tracks show the signal across a random subset of up to 400 single nuclei. Signal tracks are normalized to one million reads and scaled between 0 and 2. (C) Spearman's correlation between aggregate sci-ATAC-seq, 13 bulk islets, 3 adipose, 2 muscle, 2 CD4+ T-cells, and 1 GM12878 sample (see Table S1). (D) Distribution of aggregate sci-ATAC-seq TSS-proximal and TSS-distal peaks across bulk islet-derived ChromHMM segmentations.
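The track normalization mentioned for panel B can be illustrated with a minimal sketch. It assumes that "normalized to one million reads" means scaling per-bin counts to reads per million, and that "scaled between 0 and 2" refers to the displayed y-axis range. The function name, the per-bin input, and the toy numbers are hypothetical and are not taken from the authors' pipeline.

```python
def normalize_track(bin_counts, total_reads, lo=0.0, hi=2.0):
    """Scale per-bin read counts to reads per million, then clip for display.

    bin_counts  : raw read counts per genomic bin (hypothetical input)
    total_reads : total reads in the sample, used as the scaling denominator
    lo, hi      : display range (assumed to match the 0-2 range in the caption)
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    scale = 1_000_000 / total_reads
    # Clip each scaled value into [lo, hi] so tracks share one y-axis range.
    return [min(max(count * scale, lo), hi) for count in bin_counts]


# Toy example: a sample with 2 million reads, so each count is halved.
track = normalize_track([0, 1, 3, 10], total_reads=2_000_000)
print(track)
```

Clipping only affects how a track is drawn; any downstream comparison between samples would normally use the unclipped per-million values.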
Figure 2
Clustering and identification of cell-type clusters in sci-ATAC-seq data. (A) UMAP projection of 1,456 single islet nuclei, each represented by a point, grouped into four clusters identified by density-based clustering. (B) Enrichment of cells from each cluster relative to their expected population proportion across sequencing depth bins. Sequencing depth increases with the bin number. (C) Genome browser tracks showing signals at different cell-type marker loci: alpha (GCG), beta (INS-IGF2), delta (SST), and a housekeeping gene (GAPDH). Tracks are normalized to one million reads and scaled between 0 and 5. (D) Overview of the independent cluster verification scheme, which uses cell-type signature genes identified by the islet scRNA-seq study of Lawlor et al. (2017). (E) Plot of aggregate ATAC-seq signals (RPKM) at scRNA-seq-derived cell-type signature genes for alpha, beta, and delta cells. The number of signature genes for each cell type is indicated in the panel title.
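Panel E reports aggregate signal in RPKM (reads per kilobase per million mapped reads). As a reminder of how that unit is computed, here is a minimal sketch using the standard RPKM definition. The function name and the example numbers are hypothetical, and the source does not state which region around each signature gene was counted.

```python
def rpkm(read_count, region_length_bp, total_mapped_reads):
    """Reads per kilobase of region per million mapped reads.

    RPKM = reads * 1e9 / (region length in bp * total mapped reads)
    """
    if region_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("region length and total reads must be positive")
    return read_count * 1e9 / (region_length_bp * total_mapped_reads)


# Toy example: 50 reads over a 2 kb region in a 5-million-read aggregate.
value = rpkm(read_count=50, region_length_bp=2_000, total_mapped_reads=5_000_000)
print(value)
```

Because RPKM divides by both region length and library size, it allows signal at genes of different lengths to be compared across cell-type aggregates with different nuclei counts.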
Figure 3
Deep learning upscaling from sparse low-count nuclei clusters. (A) Schematic of the U-Net training scheme. Two models are depicted in the illustration: one trained on alpha cell data as input and the other trained on beta cell data as input. Delta cell peak predictions from both models are combined to obtain the final predictions (see Methods). (B) Precision-recall curve comparing peak calls from MACS2 on downscaled data (alpha cell type) with predicted peak calls from the 28-cell U-Net model (trained on beta, predicted on alpha). (C) Example loci illustrating peak upscaling with the model. For each cell type, four tracks are shown: full signal track, peak calls on full data, peak calls on subsampled data, and predicted peak calls. The predicted peak calls are obtained from a model trained on a different cell type. For delta predicted peak calls, the intersection of predictions from the alpha and beta models is shown. Signal tracks are normalized to one million reads and scaled between 0 and 2. (D) Fold enrichment (log2) of single-cell RNA-seq-derived signature genes (scRSGs) in 28-cell MACS2 and U-Net predicted peaks for three cell types. (E) Reproducibility of master peaks from bulk islet ATAC-seq across individual samples. (F) Fold enrichment (log2) of different sets of reproducible peaks from bulk islet ATAC-seq across 13 islet chromatin states. Genic enhancer is not shown because it showed no enrichment. (G) Overlap of cell-type peaks (alpha, beta, and predicted delta) with different sets of reproducible peaks from bulk islet ATAC-seq data.
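The precision-recall comparison in panel B can be sketched as follows. This is an illustrative sketch only: it assumes the genome is divided into fixed bins, that each method assigns a score to every bin, and that the reference set is the peak calls on the full (non-downscaled) data. The function name, bin identifiers, and scores are hypothetical, and the source does not specify how the authors computed the curve.

```python
def precision_recall_curve(scores, reference_peaks, thresholds):
    """Return (threshold, precision, recall) for each score threshold.

    scores          : dict mapping bin id -> predicted peak score (hypothetical)
    reference_peaks : set of bin ids treated as true peaks (assumed: full-data calls)
    thresholds      : iterable of score cutoffs to evaluate
    """
    curve = []
    for threshold in thresholds:
        called = {b for b, s in scores.items() if s >= threshold}
        true_positives = len(called & reference_peaks)
        # Convention: precision is 1.0 when nothing is called at this cutoff.
        precision = true_positives / len(called) if called else 1.0
        recall = true_positives / len(reference_peaks) if reference_peaks else 0.0
        curve.append((threshold, precision, recall))
    return curve


# Toy example with four bins, two of which are reference peaks.
scores = {"bin1": 0.9, "bin2": 0.8, "bin3": 0.4, "bin4": 0.1}
reference_peaks = {"bin1", "bin3"}
curve = precision_recall_curve(scores, reference_peaks, thresholds=[0.5, 0.3])
for threshold, precision, recall in curve:
    print(threshold, round(precision, 3), round(recall, 3))
```

Lowering the threshold calls more bins, which typically raises recall at the cost of precision; comparing two methods then amounts to comparing their curves against the same reference set.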
Figure 4
Enrichment of T2D GWAS signals in cell-type-specific chromatin and linking them to target genes. (A) Fold enrichment (log2) of T2D GWAS SNPs in cell-type peaks in single and conditional analysis modes using the fGWAS tool. For each cell type, three of the following enrichment values are shown with 95% confidence intervals: none (single-annotation mode), alpha (conditioned on alpha), beta (conditioned on beta), and delta (conditioned on delta). (B) Partitioning of alpha, beta, and predicted delta peaks into mutually exclusive sets of cell-type-specific peaks. The subplot (right) shows the total number of peaks for each cell type. (C) Distance-matched Fisher odds that beta cell co-accessibility links overlap islet Hi-C, islet pcHi-C, and ChIA-PET chromatin loops across different co-accessibility threshold bins. (D) Overlap of T2D GWAS credible set SNPs with cell-type-specific peaks. A bin is colored if there is at least one SNP (PPAg > 0.05) in the 99% genetic credible set of the T2D GWAS signal located within 1 kb of an ATAC-seq peak. The Cicero score columns are colored to indicate the score of the highest-scoring link to the target gene. (E) Viewpoint plot of alpha Cicero connections centered at rs7163757 for the C2CD4A/B locus. (F) Alpha Cicero connections centered at rs11708067 for the ADCY5 locus. (G) Beta Cicero connections centered at rs13262861 for the ANK1 locus. (H) Cicero connections for both alpha and beta centered at rs62059712 for the ATP1B2 locus. The viewpoint region spans ±1 kb from the variant.
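The "Fisher odds" in panel C refer to an odds ratio from a 2x2 contingency table. A minimal sketch of that calculation is given below, assuming a table of co-accessibility links above a threshold versus distance-matched background links, each split by whether they overlap a chromatin loop. The table layout, function name, and counts are hypothetical; the source does not describe the exact construction of the table or the distance matching.

```python
from math import comb


def fisher_odds(a, b, c, d):
    """Sample odds ratio and one-sided Fisher exact p-value for [[a, b], [c, d]].

    a: links above threshold overlapping a loop     b: links above threshold, no overlap
    c: background links overlapping a loop          d: background links, no overlap
    (This layout is an assumption made for illustration.)
    """
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    row1, col1, total = a + b, a + c, a + b + c + d
    # P(X >= a) under the hypergeometric distribution with fixed margins.
    tail = sum(
        comb(col1, k) * comb(total - col1, row1 - k)
        for k in range(a, min(row1, col1) + 1)
    )
    p_value = tail / comb(total, row1)
    return odds_ratio, p_value


# Toy counts: 8 of 10 thresholded links overlap loops vs 1 of 6 background links.
odds, p = fisher_odds(8, 2, 1, 5)
print(odds, round(p, 4))
```

An odds ratio above 1 indicates that links passing the co-accessibility threshold overlap loops more often than the matched background does.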
