Abstract
We present single-cell interpretation via multikernel learning (SIMLR), an analytic framework and software which learns a similarity measure from single-cell RNA-seq data in order to perform dimension reduction, clustering and visualization. On seven published data sets, we benchmark SIMLR against state-of-the-art methods. We show that SIMLR is scalable and greatly enhances clustering performance while improving the visualization and interpretability of single-cell sequencing data.
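As a rough illustration of what a kernel-based similarity measure between cells can look like, the sketch below averages several Gaussian kernels of different bandwidths over a toy expression matrix. It is not the SIMLR implementation: the function names, the bandwidths, the uniform kernel weights and the toy data are all assumptions made for illustration, whereas a multikernel learning method would learn the weights (and the similarity itself) from the data.

```python
import math

def sq_dists(X):
    """Pairwise squared Euclidean distances between the rows of X."""
    n = len(X)
    return [[sum((a - b) ** 2 for a, b in zip(X[i], X[j])) for j in range(n)]
            for i in range(n)]

def multi_kernel_similarity(X, sigmas=(0.5, 1.0, 2.0), weights=None):
    """Weighted average of Gaussian kernels with different bandwidths.

    Assumption: the weights are fixed (uniform by default). A multikernel
    learning method would optimize them instead of fixing them.
    """
    if weights is None:
        weights = [1.0 / len(sigmas)] * len(sigmas)
    D = sq_dists(X)
    n = len(X)
    S = [[0.0] * n for _ in range(n)]
    for w, s in zip(weights, sigmas):
        for i in range(n):
            for j in range(n):
                S[i][j] += w * math.exp(-D[i][j] / (2.0 * s * s))
    return S

# Hypothetical toy matrix: 4 cells x 3 genes, forming two obvious groups.
cells = [
    [5.0, 0.1, 0.0],
    [4.8, 0.0, 0.2],
    [0.1, 5.2, 4.9],
    [0.0, 5.0, 5.1],
]
S = multi_kernel_similarity(cells)
within = S[0][1]   # similarity between two cells of the same group
between = S[0][2]  # similarity between cells of different groups
```

In this toy case the similarity within a group is close to 1 and the similarity between groups is close to 0, which is the kind of block structure that downstream clustering and visualization rely on.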
Acknowledgements
The authors would like to thank G.X. Zheng, J. Terry and T. Mikkelsen from 10x Genomics for providing access to the PBMC data as well as suggestions for the manuscript and the in silico experiments. E.P. acknowledges support from an NDSEG Fellowship and a Hertz Fellowship. J.Z. acknowledges support from a Stanford Graduate Fellowship.
Author information
Contributions
B.W., J.Z., and S.B. conceived the study and planned experiments. B.W. designed the algorithm and implemented the software in MATLAB. D.R. and B.W. developed the software package in R. J.Z. and E.P. performed data analysis and implemented the simulation study. J.Z. and E.P. drafted the manuscript. B.W. and S.B. contributed to the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
S.B. is currently on a leave of absence from Stanford, and he is VP of Applied and Computational Biology at Illumina.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–29, Supplementary Tables 1–10 and Supplementary Notes 1–10 (PDF 18964 kb)
Supplementary Software 1
MATLAB and R implementations of SIMLR with four small-scale single-cell RNA-seq datasets (ZIP 161889 kb)
About this article
Cite this article
Wang, B., Zhu, J., Pierson, E. et al. Visualization and analysis of single-cell RNA-seq data by kernel-based similarity learning. Nat Methods 14, 414–416 (2017). https://doi.org/10.1038/nmeth.4207
DOI: https://doi.org/10.1038/nmeth.4207