BioData Min. 2024 Jul 11;17(1):21.
doi: 10.1186/s13040-024-00374-0.

Transcriptome- and DNA methylation-based cell-type deconvolutions produce similar estimates of differential gene expression and differential methylation

Emily R Hannon et al. BioData Min.

Abstract

Background: Changing cell-type proportions can confound studies of differential gene expression or DNA methylation (DNAm) from peripheral blood mononuclear cells (PBMCs). We examined how cell-type proportions derived from the transcriptome versus the methylome (DNAm) influence estimates of differentially expressed genes (DEGs) and differentially methylated positions (DMPs).

Methods: Transcriptome and DNAm data were obtained from PBMC RNA and DNA of Kenyan children (n = 8) before, during, and 6 weeks following uncomplicated malaria. DEGs and DMPs between time points were detected using models adjusted for cell-type proportions estimated either from the transcriptome with CIBERSORTx or from DNAm with IDOL.
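
As an illustration of what cell-type adjustment means here, the sketch below fits an ordinary least-squares model for a single gene with and without a cell-type proportion as a covariate. It is a minimal sketch only: the abstract does not name the modeling software, the data are synthetic, and the names `ols`, `solve_linear_system`, `timepoint`, and `cell_prop` are hypothetical. It also ignores the repeated-measures structure of the study.

```python
# Minimal sketch of cell-type adjusted modeling (assumption: plain OLS,
# one gene, one cell-type covariate; not the authors' actual pipeline).

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(design, y):
    """Return least-squares coefficients via the normal equations."""
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Synthetic data: expression depends on time point AND on a cell-type
# proportion that itself shifts between time points (the confounder).
timepoint = [0, 1, 0, 1, 0, 1]
cell_prop = [0.20, 0.50, 0.30, 0.60, 0.10, 0.35]
expression = [1.0 + 0.5 * t + 2.0 * p for t, p in zip(timepoint, cell_prop)]

# Unadjusted model: intercept + time point.
unadjusted = ols([[1.0, t] for t in timepoint], expression)
# Adjusted model: intercept + time point + cell-type proportion.
adjusted = ols([[1.0, t, p] for t, p in zip(timepoint, cell_prop)], expression)

print("time-point effect, unadjusted:", round(unadjusted[1], 4))
print("time-point effect, adjusted:  ", round(adjusted[1], 4))
```

In this toy example the unadjusted time-point coefficient absorbs part of the cell-type effect, while the adjusted model recovers the simulated value of 0.5. Swapping `cell_prop` for estimates from a different deconvolution method is the comparison the study makes.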

Results: Most major cell types and principal components had moderate to high correlation between the two deconvolution methods (r = 0.60-0.96). Estimates of cell-type proportions and DEGs or DMPs were largely unaffected by the method, with the greatest discrepancy in the estimation of neutrophils.
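
The agreement statistic reported above is the Pearson correlation between the two sets of proportion estimates for a given cell type. The sketch below shows that calculation on made-up values; the function name `pearson_r` and the numbers in `rna_derived` and `dnam_derived` are hypothetical and are not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical proportions of one cell type in four samples, estimated
# from the transcriptome and from DNA methylation.
rna_derived = [0.10, 0.20, 0.30, 0.40]
dnam_derived = [0.12, 0.18, 0.33, 0.41]

r = pearson_r(rna_derived, dnam_derived)
print("Pearson r:", round(r, 3))
```

A high r indicates that the two methods rank and scale samples similarly for that cell type; it does not by itself show that the absolute proportions agree.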

Conclusion: Variation in cell-type proportions is captured similarly by both transcriptomic and methylome deconvolution methods for most major cell types.

Keywords: Deconvolution; Gene expression; PBMC.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
Steps for data collection and analyses. Purple lines go from the raw samples to the co-extraction step. Blue lines represent DNA-methylation-derived deconvolution methods and red lines represent transcriptome-derived deconvolution methods. For RNA, steps included Illumina whole transcriptome followed by deconvolution with CIBERSORTx and the LM22 reference. For DNA, the EPIC chip was used to measure DNA methylation across 850K probe sites, and these data were then used for deconvolution with IDOL and the extended blood reference [11]. The results from each deconvolution were used as covariates to model differential gene expression and differential methylation, as represented by the crossing red and blue lines going from each deconvolution method to differential gene expression and DNA methylation.
Fig. 2
a. Cell percent estimates derived from the transcriptome (column labeled “RNA-derived”) and DNA methylome (column labeled “DNAm-derived”). Black bars indicate a statistically significant change in cell-type proportion. b. Scatter plots and linear model fits of the association between cell proportion estimates from transcriptome-derived methods and DNAm-derived methods. The Pearson correlation value r is labeled in the correlation plots. Each individual is shown in a different color.
Fig. 3
Correlation between the first and second principal components from RNA-derived and DNAm-derived deconvolution methods. Dotted lines represent the identity line. The percent of total variation captured by each respective principal component is in parentheses. The Pearson correlation coefficients, left to right, were 0.90 and 0.85 (both p < 0.001). The x-axes represent the values of the first (PC1) and second (PC2) principal components calculated with the transcriptome-derived deconvolution. The y-axes represent the corresponding PC1 and PC2 values from the DNAm-derived deconvolution.
Fig. 4
a. The number of differentially expressed genes detected by each model is on the y-axis at multiple p-value thresholds. The models are indicated by cell-type adjustment labeled across the top, contrast labeled along the right side, and deconvolution approach labeled along the x-axis. Transcriptome-derived and DNAm-derived cell-type adjustments are marked by RNA and DNAm, respectively. As colors become darker, the significance threshold becomes lower. b. Venn diagrams showing the overlap in DEGs detected with p-value < 0.003 by the unadjusted, transcriptome-derived principal components, and DNAm-derived principal components models.
Fig. 5
Log fold-change (logFC) estimates after transcriptome-derived cell-type adjustment are represented on the x-axis versus the corresponding logFC from the DNAm-derived model. Cell-type adjustments are labeled along the top and contrasts along the right side. Red points represent deconvolution-sensitive DEGs (genes whose estimates vary the most between the different deconvolutions), and their count is listed in the bottom right of each panel if n > 0. The dashed line represents the identity line.
Fig. 6
(a) The number of differentially methylated positions is on the y-axis at several p-value thresholds as detected by cell-type adjustment (labeled across the top), contrast (labeled along the right side), and deconvolution approach (along the x-axis). Transcriptome-derived and DNA-methylation-derived cell-type adjustments are marked by RNA and DNAm, respectively. As colors become darker, the significance threshold becomes higher; the thresholds are listed in the legend. “Prin. Comp.” refers to the principal component-adjusted models. (b) Venn diagrams depict the overlap in DMPs selected with p-value < 0.001 in the unadjusted, transcriptome-derived principal components, and DNAm-derived principal components models.
Fig. 7
Relative logFC estimates at high-interest CpG locations using transcriptome-derived vs. DNAm-derived cell-type adjustments by model (labeled along the top) and contrast (labeled along the right side). Red points are deconvolution-sensitive DMPs (the CpG sites whose estimates vary the most between deconvolution approaches), and their count is listed in the bottom right corner of each panel if n > 0. Dashed lines represent the identity line.

References

    1. Moncunill G, Scholzen A, Mpina M, Nhabomba A, Hounkpatin AB, Osaba L et al. Antigen-stimulated PBMC transcriptional protective signatures for malaria immunization. Sci Transl Med. 2020;12(543).
    2. Campbell KA, Colacino JA, Puttabyatappa M, Dou JF, Elkin ER, Hammoud SS et al. Placental cell type deconvolution reveals that cell proportions drive preeclampsia gene expression differences. Commun Biol. 2023;6(1).
    3. Jaffe AE, Irizarry RA. Accounting for cellular heterogeneity is critical in epigenome-wide association studies. Genome Biol. 2014;15(2).
    4. Patrick E, Taga M, Ergun A, Ng B, Casazza W, Cimpean M et al. Deconvolving the contributions of cell-type heterogeneity on cortical gene expression. PLoS Comput Biol. 2020;16(8).
    5. Qi L, Teschendorff AE. Cell-type heterogeneity: why we should adjust for it in epigenome and biomarker studies. Clin Epigenetics. 2022;14(1).