Diffusion maps for high-dimensional single-cell analysis of differentiation data
- PMID: 26002886
- DOI: 10.1093/bioinformatics/btv325
Abstract
Motivation: Single-cell technologies have recently gained popularity in cellular differentiation studies owing to their ability to resolve potential heterogeneities in cell populations. Analyzing such high-dimensional single-cell data poses its own statistical and computational challenges. Popular multivariate approaches are based on data normalization, followed by dimension reduction and clustering to identify subgroups. However, in the case of cellular differentiation, we would not expect clear clusters to be present but instead expect the cells to follow continuous branching lineages.
Results: Here, we propose the use of diffusion maps to address the problem of defining differentiation trajectories. We adapt this method to single-cell data through an adequate choice of kernel width and the inclusion of uncertainties or missing measurement values, which enables the establishment of a pseudotemporal ordering of single cells in a high-dimensional gene expression space. We expect this output to reflect cell differentiation trajectories, where the data originate from intrinsic diffusion-like dynamics. Starting from a pluripotent stage, cells move smoothly within the transcriptional landscape towards more differentiated states with some stochasticity along their path. We demonstrate the robustness of our method with respect to extrinsic noise (e.g. measurement noise) and sampling density heterogeneities on simulated toy data as well as on two single-cell quantitative polymerase chain reaction datasets (i.e. mouse haematopoietic stem cells and mouse embryonic stem cells) and an RNA-Seq dataset of human pre-implantation embryos. We show that diffusion maps perform considerably better than Principal Component Analysis and are advantageous over other techniques for non-linear dimension reduction such as t-distributed Stochastic Neighbour Embedding for preserving the global structures and pseudotemporal ordering of cells.
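To make the idea of a diffusion-map embedding concrete, the following is a minimal sketch of the generic textbook construction (Gaussian kernel, density normalization, eigendecomposition of the resulting Markov transition matrix). It is not the authors' Matlab implementation: the function name `diffusion_map`, the hand-picked kernel width `sigma`, and the synthetic toy trajectory are all assumptions made for illustration, and the sketch does not handle measurement uncertainties or missing values as the adapted method described above does.

```python
import numpy as np


def diffusion_map(X, sigma, n_components=2):
    """Generic diffusion map of the rows of X (cells x genes).

    Hypothetical helper for illustration; sigma is the Gaussian kernel width.
    Returns the leading non-trivial eigenvalues and diffusion components.
    """
    # Pairwise squared Euclidean distances between cells
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)

    # Gaussian kernel with width sigma
    K = np.exp(-d2 / (2.0 * sigma ** 2))

    # Density normalization, which reduces the influence of uneven sampling
    q = K.sum(axis=1)
    K = K / np.outer(q, q)

    # Symmetric matrix with the same eigenvalues as the row-normalized
    # Markov transition matrix T = D^{-1} K
    d = K.sum(axis=1)
    S = K / np.sqrt(np.outer(d, d))

    vals, vecs = np.linalg.eigh(S)
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]

    # Convert to right eigenvectors of T
    psi = vecs / np.sqrt(d)[:, None]

    # Drop the first, constant eigenvector (eigenvalue 1)
    return vals[1:n_components + 1], psi[:, 1:n_components + 1]


# Synthetic toy data (an assumption, not data from the paper):
# 200 "cells" along a curved one-dimensional path in a 10-dimensional space
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)           # hidden progression along the path
X = np.zeros((200, 10))
X[:, 0] = t
X[:, 1] = t ** 2
X += 0.01 * rng.standard_normal(X.shape)  # small extrinsic noise

vals, dc = diffusion_map(X, sigma=0.1, n_components=2)

# Ordering cells by the first diffusion component gives a pseudotemporal order
pseudotime_order = np.argsort(dc[:, 0])
corr = np.corrcoef(dc[:, 0], t)[0, 1]
print("correlation of first diffusion component with t:", round(float(corr), 3))
```

In this toy setting the first diffusion component varies monotonically along the path, so sorting cells by it recovers the hidden progression up to direction. The sign of an eigenvector is arbitrary, so the ordering may come out reversed.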
Availability and implementation: The Matlab implementation of diffusion maps for single-cell data is available at https://www.helmholtz-muenchen.de/icb/single-cell-diffusion-map.
Contact: fbuettner.phys@gmail.com, fabian.theis@helmholtz-muenchen.de
Supplementary information: Supplementary data are available at Bioinformatics online.
© The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.