Nat Biotechnol. 2015 May;33(5):495-502.
doi: 10.1038/nbt.3192. Epub 2015 Apr 13.

Spatial reconstruction of single-cell gene expression data

Rahul Satija et al. Nat Biotechnol. 2015 May.

Abstract

Spatial localization is a key determinant of cellular fate and behavior, but methods for spatially resolved, transcriptome-wide gene expression profiling across complex tissues are lacking. RNA staining methods assay only a small number of transcripts, whereas single-cell RNA-seq, which measures global gene expression, separates cells from their native spatial context. Here we present Seurat, a computational strategy to infer cellular localization by integrating single-cell RNA-seq data with in situ RNA patterns. We applied Seurat to spatially map 851 single cells from dissociated zebrafish (Danio rerio) embryos and generated a transcriptome-wide map of spatial patterning. We confirmed Seurat's accuracy using several experimental approaches, then used the strategy to identify a set of archetypal expression patterns and spatial markers. Seurat correctly localizes rare subpopulations, accurately mapping both spatially restricted and scattered groups. Seurat will be applicable to mapping cellular localization within complex patterned tissues in diverse systems.

Figures

Figure 1. Overview of Seurat
As input, Seurat takes single-cell RNA-seq data (1, left) from dissociated cells (e.g., cells A–C), where information about the original spatial context was lost during dissociation, and (2, right) in situ hybridization patterns for a series of landmark genes. To generate a binary spatial reference map, the tissue of interest is divided into a discrete set of user-defined bins, and the in situ data is binarized to reflect the detection of gene expression within each bin, as is shown for genes X, Y, and Z. (3) Seurat uses expression measurements across many correlated genes to ameliorate stochastic noise in individual measurements for landmark genes. As schematized, Seurat learns a model of gene expression for each of the landmark genes based on other variable genes in the dataset, reducing the reliance on a single measurement, and mitigating the effect of technical errors. Seurat then builds statistical models of gene expression in each bin (4) by relating the bimodal expression patterns of the RNA-seq estimates to the binarized in situ data. Shown are probability distributions for genes X, Y, and Z for three different embryonic bins. Finally, Seurat uses these models to infer the cell’s original spatial location (5), assigning posterior probability of origin (depicted in shades of purple) to each bin. Seurat can map exclusively to one bin (e.g., cell C), or assign probability to multiple bins in some cases (e.g., cells A & B).
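Steps 4 and 5 of the caption can be illustrated with a small sketch. It is not the authors' implementation: it assumes that each landmark gene has already been reduced to a per-cell probability of being 'on', that genes are independent, and that the prior over bins is uniform. The function name `bin_posteriors` and the toy values are hypothetical.

```python
import math

def bin_posteriors(reference, p_on, eps=1e-6):
    """Posterior probability of origin for one cell over all bins.

    reference: one row per bin, each row holding 0/1 per landmark gene
               (the binarized in situ reference map).
    p_on:      per-gene probability that the gene is 'on' in this cell.
    Assumptions (not from the paper): independent genes, uniform prior.
    """
    log_likes = []
    for row in reference:
        ll = 0.0
        for on_in_bin, p in zip(row, p_on):
            p = min(max(p, eps), 1.0 - eps)  # keep log() finite
            ll += math.log(p if on_in_bin else 1.0 - p)
        log_likes.append(ll)
    # Normalize in log space for numerical stability.
    top = max(log_likes)
    weights = [math.exp(ll - top) for ll in log_likes]
    total = sum(weights)
    return [w / total for w in weights]

# Toy reference map: three bins, three landmark genes (X, Y, Z).
reference = [
    [1, 0, 0],  # bin 1: only X detected
    [1, 1, 0],  # bin 2: X and Y detected
    [0, 0, 1],  # bin 3: only Z detected
]
cell_c = [0.05, 0.05, 0.95]  # confidently Z only
cell_a = [0.95, 0.50, 0.05]  # X on, Y ambiguous

post_c = bin_posteriors(reference, cell_c)
post_a = bin_posteriors(reference, cell_a)
```

In this toy case `post_c` concentrates on bin 3, while `post_a` splits evenly between bins 1 and 2 because the ambiguous Y measurement cannot distinguish them, which mirrors the single-bin and multi-bin assignments described in the caption.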
Figure 2. Single-cell RNA-seq from zebrafish embryos
(a) Cartoon schematic of the zebrafish embryo at 50% epiboly, depicting cell layers (enveloping layer, EVL; deep cell layer, DEL; yolk syncytial layer, YSL), important structures (the embryonic margin), and the two major spatial axes (animal–vegetal and dorsal–ventral). (b) To create the spatial reference map, we used 47 colorogenic in situ hybridization patterns (i.e., ‘landmark’ genes), which were previously published in the scientific literature. We subdivided the embryo into 64 bins and visually scored each landmark as ‘on’ or ‘off’ within each bin using in situs oriented in both lateral and animal views. Shown here is an in situ for ta/no tail and its resultant binary representation. (c) After dissection of the embryo, single cells were dissociated, plated and picked into microtiter plates, and profiled using a single-cell RNA-seq protocol that was modified to include unique molecule indices (Methods).
Figure 3. Seurat correctly infers the spatial position of cells
(a) Seurat maps cells throughout the embryo, consistent with the random dissociation of the tissue. Shown are cell centroids for randomly dissociated cells. (b) A smaller number of cells were prepared with a modified protocol that depletes for the animal cap (bin rows 6-8) (Supplementary Movie 2), and Seurat captures this depletion in its mapping of these cells. Shown are the fold-changes in localization percentages (Y axis) between the randomly dissociated and animal-depleted cells along the margin to animal axis (X axis, as bins). (c) A small number of ‘reference’ cells were manually picked under a dissecting microscope so that their original spatial location can be estimated (Supplementary Movie 3). Since at 50% epiboly dorsal-ventral specification is not morphologically apparent, a cluster of previously transplanted fluorescent cells is used as a fiducial mark to track where cells were taken from and to deduce the cell’s location once the dorsal-ventral axis becomes apparent at shield stage (Methods). (d–e) Evaluation of Seurat using ‘reference’ cells. (d) Representative examples of Seurat’s inferred location for reference cells (centroid: green, posterior bin probabilities: shades of purple, with the strongest color representing 100% posterior probability) vs. experimentally annotated locations (pink). The embryonic margin is also depicted in khaki. The experimentally annotated sphere is drawn with a larger radius to reflect degree of confidence in the experimental measurement. (e) Histogram of distance errors between inferred and measured location for reference cells. The median error is 2 bins, divided equally between the animal–vegetal and dorsal–ventral axes. (f) For each landmark gene, we removed the gene from the input reference map, and then re-inferred its in situ pattern from Seurat’s spatial patterns. Shown are representative examples (middle) compared to the binarized input pattern (top) and ROC scores (bottom). (g–h) ROC analysis of the accuracy of the inferred landmark in situs vs. the binarized input pattern (median ROC = 0.96).
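The ROC comparison in panels (f) to (h) can be sketched as follows, assuming the ROC score is the area under the ROC curve, with the binarized input pattern as the label and the re-inferred per-bin expression as the score. The function `roc_auc` and the six-bin toy data are hypothetical and only illustrate the calculation.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve by pairwise comparison.

    labels: 0/1 per bin (binarized in situ pattern for a held-out landmark).
    scores: inferred expression per bin for the same gene.
    Returns the fraction of (on-bin, off-bin) pairs ranked correctly,
    counting ties as half.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one 'on' bin and one 'off' bin")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy example with six bins (values are illustrative only).
binarized = [1, 1, 0, 0, 1, 0]
inferred = [0.9, 0.8, 0.3, 0.1, 0.4, 0.6]
auc = roc_auc(binarized, inferred)
```

A score of 1.0 means every 'on' bin received a higher inferred value than every 'off' bin; 0.5 corresponds to random ranking.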
Figure 4. Nine archetypal patterns discovered through spatial clustering
(a) We calculated imputed expression patterns based on Seurat’s spatial mapping for 290 highly variable genes (Methods). Genes were then clustered by their imputed spatial localization (Supplementary Fig. 5) into 9 ‘archetypes’ that broadly describe the patterns of multiple genes: RM, restricted to margin; VM, ventral margin; DEM, dorsally enriched margin; DRM, dorsally restricted margin; EM, extended margin; V, ventral; DA, dorsal animal; VA, ventral animal; A, animal. (b) Genes that did not have published expression patterns at 50% epiboly (assessed on Sep. 4, 2014) were selected from various archetypes and then analyzed by RNA in situ hybridization. Top to bottom: Seurat’s predicted expression pattern, a lateral view of the in situ (dorsal to the right), and an animal cap view of the in situ (dorsal to the right). Experimentally determined patterns exhibit high accord with Seurat’s predictions, as described in the main text. Genes are connected to the archetype with which they clustered by black lines. Scale bar represents 200 µm.
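One plausible way to compute an imputed spatial pattern from the mapping, shown here only as an assumption since the caption defers the details to the Methods, is a posterior-weighted average of each gene's expression across cells. The function `impute_bin_expression` and the values below are hypothetical.

```python
def impute_bin_expression(posteriors, expression):
    """Posterior-weighted mean expression of one gene in each bin.

    posteriors: one list per cell with that cell's probability for each bin.
    expression: the gene's measured expression in each cell (same order).
    Bins that receive no posterior mass are reported as 0.0.
    """
    n_bins = len(posteriors[0])
    imputed = []
    for b in range(n_bins):
        weight = sum(cell[b] for cell in posteriors)
        total = sum(cell[b] * x for cell, x in zip(posteriors, expression))
        imputed.append(total / weight if weight > 0 else 0.0)
    return imputed

# Toy example: three cells mapped over two bins.
posteriors = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
expression = [4.0, 2.0, 0.0]
pattern = impute_bin_expression(posteriors, expression)
```

Per-gene vectors such as `pattern` could then be compared and grouped by any standard clustering method to obtain archetypes; the choice of method is not specified in this caption.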
Figure 5. Seurat identifies and characterizes rare cell populations
(a) A cartoon depicting the prechordal plate progenitors (green) clustered at the dorsal margin, and endodermal progenitors (blue) scattered along the embryonic margin. (b) Violin plots of the distribution of expression of classical endoderm markers (sox32, cxcr4a), a classical prechordal plate marker (gsc) and a novel proposed prechordal plate marker (ripply1), in the cell populations determined by PCA: all marginal cells (“Margin”), endodermal progenitors (“Endo”), and prechordal plate progenitors (“PCP”). (c) Seurat localizes the endodermal progenitors (blue) and prechordal plate progenitors (green) to their characteristic locations. (d) Seurat’s predicted expression pattern (left) and in situ validation (right) of the expression of ripply1, a novel prechordal plate marker. (e) Double in situ for gsc (orange) and ripply1 (blue) confirming that ripply1 is expressed in the prechordal plate progenitors. (f) PCA of the entire embryo revealed a previously uncharacterized group of cells (magenta) distinguished by PC4, and expressing high levels of genes which are hallmarks of apoptosis. (g) Seurat projects these ‘apoptotic-like’ cells (magenta) to be scattered around the embryo, but enriched towards the animal pole. (h) Violin plots of the distribution of expression of isg15 and mat2al, markers of the ‘apoptotic-like’ population, in all cells and in the putative apoptotic-like cells. (i) In situ hybridization shows that four markers of the ‘apoptotic-like’ cells are expressed in similarly scattered patterns. Top: lateral view; bottom: animal pole view. (j) Double fluorescent in situ hybridization for aplnrb (magenta) and isg15 (green) reveals that these markers are co-expressed, as predicted by Seurat. Notably, cells appear to express high levels of either aplnrb or isg15 and lower levels of the other gene. Scale bars represent 100 µm.
