Nucleic Acids Res. 2019 Jun 4;47(10):e56.
doi: 10.1093/nar/gkz146.

Cell lineage inference from SNP and scRNA-Seq data

Jun Ding et al. Nucleic Acids Res.

Abstract

Several recent studies focus on inferring developmental and response trajectories from single-cell RNA-Seq (scRNA-Seq) data. A number of computational methods, often referred to as pseudo-time ordering, have been developed for this task. Recently, CRISPR has also been used to reconstruct lineage trees by inserting random mutations. However, both approaches suffer from drawbacks that limit their use. Here, we develop a method to detect significant, cell-type-specific sequence mutations from scRNA-Seq data. We show that only a few mutations are enough to reconstruct good branching models. Integrating these mutations with expression data further improves the accuracy of the reconstructed models. As we show, the majority of the mutations we identify are likely RNA editing events, indicating that such information can be used to distinguish cell types.

Figures

Figure 1.
TBSP Method Overview. (A) Cells used in the study. (B) Reads are mapped to the reference genome. (C and D) Reads are used to determine expression levels and to identify SNPs. (E) Cells are clustered based on identified SNPs. (F) Iterating between selecting a subset of key SNPs and clustering using selected SNPs. Once a set of key SNPs is established, it is combined with expression values to determine the branching model. (G) Final predicted trajectories (using SNPs and/or expression information).
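The loop in panels (E) and (F) can be illustrated with a minimal sketch. The caption does not state which clustering algorithm or SNP selection criterion is used, so everything here is an assumption for illustration: cells are rows of binary SNP calls, clustering is a tiny k-modes routine with Hamming distance, and key SNPs are those whose carrier frequency differs most between clusters. The function names (`cluster_cells`, `select_key_snps`, `iterate_selection`) and the toy matrix are hypothetical and are not taken from the TBSP software.

```python
def hamming(u, v, cols):
    """Number of mismatching SNP calls over the selected columns."""
    return sum(u[c] != v[c] for c in cols)


def cluster_cells(cells, cols, k, n_iter=10):
    """Tiny k-modes clustering on binary SNP calls (assumed, not TBSP's)."""
    # Deterministic farthest-first initialisation of the cluster centres.
    centres = [cells[0]]
    while len(centres) < k:
        centres.append(
            max(cells, key=lambda x: min(hamming(x, c, cols) for c in centres))
        )
    labels = [0] * len(cells)
    for _ in range(n_iter):
        labels = [
            min(range(k), key=lambda j: hamming(x, centres[j], cols)) for x in cells
        ]
        new_centres = []
        for j in range(k):
            members = [x for x, lab in zip(cells, labels) if lab == j]
            if not members:
                new_centres.append(centres[j])
                continue
            # Per-SNP majority vote among the cluster members.
            new_centres.append(
                [int(2 * sum(m[c] for m in members) >= len(members))
                 for c in range(len(cells[0]))]
            )
        if new_centres == centres:
            break
        centres = new_centres
    return labels


def select_key_snps(cells, labels, k, n_keep):
    """Keep the n_keep SNPs with the largest spread in per-cluster frequency."""
    scores = []
    for c in range(len(cells[0])):
        freqs = []
        for j in range(k):
            members = [x for x, lab in zip(cells, labels) if lab == j]
            if members:
                freqs.append(sum(m[c] for m in members) / len(members))
        scores.append((max(freqs) - min(freqs), c))
    ranked = sorted(scores, key=lambda s: (-s[0], s[1]))
    return sorted(c for _, c in ranked[:n_keep])


def iterate_selection(cells, k, n_keep, max_rounds=5):
    """Alternate between SNP selection and clustering until both are stable."""
    cols = list(range(len(cells[0])))
    labels = cluster_cells(cells, cols, k)
    for _ in range(max_rounds):
        new_cols = select_key_snps(cells, labels, k, n_keep)
        new_labels = cluster_cells(cells, new_cols, k)
        if new_cols == cols and new_labels == labels:
            break
        cols, labels = new_cols, new_labels
    return cols, labels


# Hypothetical toy matrix (rows: cells, columns: SNPs), not data from the paper.
# SNPs 0 and 1 separate two groups; SNPs 2 and 3 behave like noise.
toy_cells = [
    [1, 1, 0, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
    [0, 0, 1, 1],
]
key_snps, cell_labels = iterate_selection(toy_cells, k=2, n_keep=2)
print(key_snps, cell_labels)
```

On this toy input the loop settles on the two informative SNPs and splits the cells into two groups of three. A real pipeline would also need to handle missing calls and sequencing noise, which this sketch ignores.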
Figure 2.
Predicted models using SNP information. To use the SNP data for inferring trajectories, we relied on a well-established phylogenetic method: neighbor-joining. We interpret the resulting branching models as cell trajectories. In the figure, gray circles represent internal nodes in the phylogenetic tree that are not assigned any cells. The length of each line in the constructed lineage graph represents the phylogenetic distance between nodes (clusters). The circle size represents the number of cells within the cluster. (A) Predicted model for the Neuron data (35). The model correctly starts with Cluster 1 (mouse embryonic fibroblasts, MEF) and then continues to d2_intermediate (Cluster 2), d2_induced (Cluster 4), d5_intermediate and d5_failedReprog (Cluster 5), d5_earlyiN (Cluster 3) and Neuron (Cluster 6). This trajectory is very similar to the one presented in the original paper. (B) Predicted model for the Liver data (13). As with the mouse neuron data, cells are clustered well using only SNP information. As for the trajectory analysis, the original study (13) reported a bifurcation between the 2D and 3D trajectories. In the 2D culture, the iPSCs differentiate into mature hepatocyte-like (MH) cells, which differ from the liver bud (LB) and mesenchymal stem cell (MSC)-LB cells of the 3D differentiation. This is also the branching determined from the SNP information.
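For readers unfamiliar with neighbor-joining, the following is a minimal textbook sketch of the algorithm applied to a distance matrix between clusters. It is not the authors' code: the function name `neighbor_joining`, the internal node labels and the toy distance matrix are hypothetical, and how SNP-based distances between clusters are computed is not specified in the caption.

```python
def neighbor_joining(names, dist):
    """Return tree edges as (node_a, node_b, branch_length) tuples.

    names: list of leaf labels; dist: symmetric distance matrix (list of lists).
    """
    # Dict-of-dicts so that nodes can be added and retired by label.
    d = {a: {b: dist[i][j] for j, b in enumerate(names)}
         for i, a in enumerate(names)}
    active = list(names)
    edges = []
    next_internal = 0
    while len(active) > 2:
        n = len(active)
        # Row sums over the currently active nodes.
        r = {a: sum(d[a][b] for b in active if b != a) for a in active}
        # Pick the pair that minimises Q(a, b) = (n - 2) d(a, b) - r(a) - r(b).
        best = None
        for x in range(n):
            for y in range(x + 1, n):
                a, b = active[x], active[y]
                q = (n - 2) * d[a][b] - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        u = f"internal_{next_internal}"
        next_internal += 1
        # Branch lengths from the joined pair to the new internal node.
        len_a = 0.5 * d[a][b] + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        edges.append((a, u, len_a))
        edges.append((b, u, len_b))
        # Distances from the new internal node to every remaining node.
        d[u] = {}
        for c in active:
            if c not in (a, b):
                d[u][c] = d[c][u] = 0.5 * (d[a][c] + d[b][c] - d[a][b])
        active = [c for c in active if c not in (a, b)] + [u]
    a, b = active
    edges.append((a, b, d[a][b]))
    return edges


# Hypothetical toy distances between five clusters, not values from the paper.
toy_names = ["a", "b", "c", "d", "e"]
toy_dist = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
edges = neighbor_joining(toy_names, toy_dist)
for node_a, node_b, length in edges:
    print(node_a, node_b, length)
```

The internal nodes created by the joins play the role of the gray circles described in the caption: they connect clusters but hold no cells themselves.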
Figure 3.
Distribution of predicted SNPs in clusters. All clusters are ordered according to the inferred trajectory. In many cases, SNPs appear in contiguous clusters, which can explain their usefulness for reconstructing the trajectories of the different studies. Still, some SNPs (e.g. those in cluster 2 of the liver data) are very specific and are detected in only one cell type.
Figure 4.
Predicted SNPs may represent RNA-editing changes. (A) Predicted SNPs are enriched for A/G (A → G, G → A) or C/T (C → T, T → C) substitutions. Similar to several other studies that characterize RNA editing sites, we find that SNPs detected by TBSP are enriched for specific substitutions. (B) The predicted SNPs are enriched in 3’ UTR regions, which is also where RNA editing sites are enriched.
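The substitution tally behind panel (A) can be sketched as follows. This is an illustrative helper only: the function name `substitution_summary` and the toy SNP list are hypothetical, and the sketch assumes each SNP is available as a (reference base, alternate base) pair. It counts each substitution type and reports the share falling into the A/G and C/T classes named in the caption.

```python
from collections import Counter

# The A/G and C/T classes from the caption (purine<->purine, pyrimidine<->pyrimidine).
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def substitution_summary(snps):
    """Count each ref->alt substitution and return the A/G + C/T fraction."""
    counts = Counter((ref.upper(), alt.upper()) for ref, alt in snps)
    total = sum(counts.values())
    n_transitions = sum(n for pair, n in counts.items() if pair in TRANSITIONS)
    fraction = n_transitions / total if total else 0.0
    return counts, fraction


# Hypothetical toy input, not data from the paper.
toy_snps = [("A", "G"), ("A", "G"), ("T", "C"), ("C", "A"), ("G", "A")]
counts, fraction = substitution_summary(toy_snps)
print(counts.most_common(), fraction)
```

A high fraction alone does not demonstrate RNA editing; the caption pairs this tally with the 3’ UTR enrichment in panel (B).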
Figure 5.
Combining expression data with SNP information to improve the reconstruction of branching models. Expression-only and SNP-added trajectory inference for the mouse Neuron data. (Left) A model reconstructed using only expression information for the neuron development data. The lowest cluster (red) is a descendant of the cluster with a mixture of Fibroblast, Myocyte and Neuron cells. (Right) In contrast, when the SNPs are used, the neuron-dominant cluster descends from the d2_induced and d5_earlyiN dominant states. The d5_earlyiN cells are also relatively closer to the Neuron cells, which is more consistent with the trajectory reported in (35).
