Bioinformatics. 2012 Jun 15;28(12):i172-8.
doi: 10.1093/bioinformatics/bts236.

Xenome--a tool for classifying reads from xenograft samples

Thomas Conway et al. Bioinformatics. 2012.

Abstract

Motivation: Shotgun sequence read data derived from xenograft material contains a mixture of reads arising from the host and reads arising from the graft. Classifying the reads to separate the two allows more precise analysis.

Results: We present a technique, with an associated tool Xenome, which performs fast, accurate and specific classification of xenograft-derived sequence read data. We have evaluated it on RNA-Seq data from human, mouse and human-in-mouse xenograft datasets.

Availability: Xenome is available for non-commercial use from http://www.nicta.com.au/bioinformatics.

Figures

Fig. 1.
A Venn diagram showing the different classes that a given k-mer may belong to. The marginal host (and marginal graft) partitions are for those host (and graft) k-mers that are Hamming distance 1 from a k-mer in the graft (and host) reference
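The k-mer classes in the Fig. 1 caption can be illustrated with a small sketch. This is not Xenome's implementation: the function names, the toy sequences, k = 4 and the extra "neither" label are assumptions made for illustration, and reverse complements are ignored.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def neighbours(kmer):
    """Yield every k-mer at Hamming distance exactly 1 from kmer."""
    for i, base in enumerate(kmer):
        for alt in "ACGT":
            if alt != base:
                yield kmer[:i] + alt + kmer[i + 1:]


def classify_kmer(kmer, host, graft):
    """Assign a k-mer to one of the classes named in the Fig. 1 caption.

    host and graft are sets of reference k-mers. The "neither" label is
    an assumption for k-mers found in neither reference.
    """
    in_host, in_graft = kmer in host, kmer in graft
    if in_host and in_graft:
        return "both"
    if in_host:
        # Marginal host: a host k-mer one substitution away from a graft k-mer.
        near = any(n in graft for n in neighbours(kmer))
        return "marginal host" if near else "host"
    if in_graft:
        # Marginal graft: a graft k-mer one substitution away from a host k-mer.
        near = any(n in host for n in neighbours(kmer))
        return "marginal graft" if near else "graft"
    return "neither"


# Toy references (hypothetical), with k = 4 to keep the example readable.
host = kmers("ACGTACCC", 4)
graft = kmers("ACGTTC", 4)
labels = {km: classify_kmer(km, host, graft) for km in sorted(host | graft)}
```

In this toy example `ACGT` occurs in both references, `CGTA` is a host k-mer one substitution away from the graft k-mer `CGTT`, and `TACC` has no graft k-mer within Hamming distance 1.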
Fig. 2.
Summary of the results with Human cDNA. Each of the classes of reads is divided into those reads assigned to the class only by Xenome (Xenome), only by the Tophat analysis (Tophat) or by both Xenome and the Tophat analysis (Concordant)
Fig. 3.
Summary of the results with Murine cDNA
Fig. 4.
Summary of the results with BM18 xenograft cDNA
Fig. 5.
Validation of the in silico classification of xenograft RNA-Seq data with qRT-PCR. The horizontal axis shows log10(FPKM) for the Xenome-derived gene expression for the 18 test genes. The vertical axis shows the Ct value for each gene relative to the Ct of actin. There were two RNA-Seq samples processed (biological replicates), and four replicates of the qRT-PCR. For each gene, an ellipse is shown centered on the mean log10(FPKM) in the x-axis, and on the mean relative Ct in the y-axis. The horizontal and vertical radii show the variance in the samples
Fig. 6.
An in silico analysis showing the degree of ambiguity in HG19 refGene, according to the k-mer based analysis used by Xenome. In this analysis, k = 25
Fig. 7.
A plot showing the distribution of human genes with respect to the proportion of xenograft reads which are classed as both by the Tophat-based analysis and the Xenome analysis. The reads considered are only those mapped by Tophat since Xenome does not yield mappings, so cannot be used to assign reads to genes. Only genes for which at least 20 reads mapped were considered. The horizontal axis corresponds to the number of reads classified as both or ambiguous by Xenome as a proportion of all the reads that might possibly be human (i.e. both, ambiguous or human). The vertical axis corresponds to the number of reads classified as both by the Tophat-based analysis, once again, as a proportion of all the reads that might possibly be human
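The two per-gene proportions described in the Fig. 7 caption can be written out as a short sketch. The read counts, dictionary keys and function name below are hypothetical and do not come from the paper, and the filter on genes with at least 20 mapped reads is left out for brevity.

```python
def shared_fraction(counts, shared_classes):
    """Fraction of possibly-human reads that fall in shared_classes.

    counts maps a read class to the number of reads of one gene in that
    class. "Possibly human" means human, both or ambiguous, as in the caption.
    Returns None when the gene has no possibly-human reads.
    """
    possibly_human = sum(counts.get(c, 0) for c in ("human", "both", "ambiguous"))
    if possibly_human == 0:
        return None
    shared = sum(counts.get(c, 0) for c in shared_classes)
    return shared / possibly_human


# Hypothetical read counts for a single gene.
xenome_counts = {"human": 60, "both": 30, "ambiguous": 10, "mouse": 5}
tophat_counts = {"human": 75, "both": 25}

# Horizontal axis: reads Xenome classes as both or ambiguous.
x = shared_fraction(xenome_counts, ("both", "ambiguous"))
# Vertical axis: reads the Tophat-based analysis classes as both.
y = shared_fraction(tophat_counts, ("both",))
```

With these made-up counts the gene would be plotted at x = 40/100 and y = 25/100; mouse reads do not enter either denominator.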
