BMC Bioinformatics. 2017 Mar 7;18(1):155.
doi: 10.1186/s12859-017-1556-5.

The Repertoire Dissimilarity Index as a method to compare lymphocyte receptor repertoires

Christopher R Bolen et al. BMC Bioinformatics. 2017.

Abstract

Background: The B and T cells of the human adaptive immune system leverage a highly diverse repertoire of antigen-specific receptors to protect the human body from pathogens. The sequencing and analysis of immune repertoires is emerging as an important tool to understand immune responses, whether beneficial or harmful (in the case of autoimmunity). However, methods for studying these repertoires, and for directly comparing different immune repertoires, are lacking.

Results: In this paper, we present a non-parametric method for directly comparing sequencing repertoires, with the goal of rigorously quantifying differences in V, D, and J gene segment utilization. This method, referred to as the Repertoire Dissimilarity Index (RDI), uses a bootstrapped subsampling approach to account for variance in sequencing depth, and, coupled with a data simulation approach, allows for direct quantification of the average variation between repertoires. We use the RDI method to recapitulate known differences in the formation of the CD4+ and CD8+ T cell repertoires, and further show that antigen-driven activation of naïve CD8+ T cells is more selective than in the CD4+ repertoire, resulting in a more specialized CD8+ memory repertoire.

Conclusions: We conclude that the RDI is an accurate and versatile method for comparing immune repertoires. The method has been implemented as an R package, which is available for download through Bitbucket.

Keywords: Immunology; Nonparametric methods; Repertoire sequencing.
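
The subsampling step described in the Results can be illustrated with a minimal Python sketch. This is not the authors' R package or its API: the function names, the use of sampling without replacement, and the choice of plain Euclidean distance on normalized gene frequencies are assumptions made for illustration. Each repertoire is represented as a list of gene calls, both are repeatedly subsampled to the size of the smaller one, and the distance is averaged over iterations.

```python
import math
import random
from collections import Counter


def gene_frequencies(genes, all_genes):
    """Return the normalized usage frequency of each gene in all_genes."""
    counts = Counter(genes)
    total = len(genes)
    return [counts[g] / total for g in all_genes]


def subsampled_distance(rep_a, rep_b, n_iter=100, seed=0):
    """Mean Euclidean distance between gene-frequency vectors after
    repeatedly subsampling both repertoires to the smaller size.

    Hypothetical helper: sampling without replacement is an assumption.
    """
    rng = random.Random(seed)
    all_genes = sorted(set(rep_a) | set(rep_b))
    size = min(len(rep_a), len(rep_b))
    total = 0.0
    for _ in range(n_iter):
        sub_a = rng.sample(rep_a, size)
        sub_b = rng.sample(rep_b, size)
        freq_a = gene_frequencies(sub_a, all_genes)
        freq_b = gene_frequencies(sub_b, all_genes)
        total += math.dist(freq_a, freq_b)
    return total / n_iter
```

Calling `subsampled_distance(rep_a, rep_b)` on two lists of V gene calls of unequal length compares them at a common depth, so the result should not be inflated simply because one repertoire was sequenced more shallowly than the other.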

Figures

Fig. 1
Repertoire subsampling accurately controls for variance inflation. A simulated sequencing dataset was generated by drawing 30 replicate samples from a single pool containing 50 genes of varying prevalence. For each replicate, the number of sequences was chosen randomly, and the total count varied between 3000 and 12,000. (a) The frequency of each gene was tallied, and the Euclidean distance between each pair of replicates was calculated. (b) Each repertoire was subsampled to the size of the smallest repertoire (n = 3216), and the Euclidean distance was calculated from the normalized gene frequencies in the subsampled dataset. The distance measurement was then averaged across multiple subsampling steps. All distance measurements are shown against the original size of the smaller repertoire in each pair.
Fig. 2
The RDI metric scales with differences in gene frequency. Simulated datasets were generated by randomly drawing genes from a set of fixed probability vectors. Probabilities were generated by perturbing a constant baseline probability vector such that the absolute log2-fold difference in each gene was between 0 (no change) and 8 (256-fold increase or decrease in each gene) relative to baseline. Each perturbation vector was used to generate datasets containing varying numbers of sequences (n = 50 to 20,000), and a set of equally sized baseline datasets was generated and compared to the perturbed datasets using the RDI metric. (a) The average RDI score for each perturbed dataset (y axis) is shown against the true average absolute log-fold change (relative to baseline) of each perturbation vector (x axis). Spline models were fit to the data (dotted lines). (b) The mean and standard deviation of the RDI value were estimated from the spline model at multiple fold-change values and are plotted as probability density functions for a variety of repertoire sizes (y axis).
Fig. 3
RDI accounts for repertoire size heterogeneity. TRB sequences from a single donor in the Rubelt et al. dataset were randomly assigned to one of two unevenly sized groups. The smaller group contained 1000, 2500, 5000, 10,000, or 50,000 total sequences, and all remaining sequences were assigned to the second group. V gene frequencies from the two repertoires were compared using the RDI method. The distribution of RDI values across 1000 replicates (black histogram) was compared with simulated data (grey curves) with controlled levels of variance (average fold change of gene segments = 1, 1.2, or 1.5; indicated numbers).
Fig. 4
T cell repertoire differences are magnified by clonal expansion. Individual naïve and memory CD4+ and CD8+ V gene repertoires were tallied based on either the clonally collapsed (clonal) dataset or the full (molecular) dataset from Rubelt et al. Naïve and memory repertoires were then compared within each individual donor (n = 10), and log-fold change values were estimated from each RDI value. Individual log-fold change values (tick marks) and a kernel density plot (curved line) are shown for each group.
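
The data simulation described for Fig. 2 can be sketched in Python as follows. This is an illustrative reconstruction, not the authors' code: the uniform baseline, the random choice of direction for each gene, the renormalization step, and all function names are assumptions. Because the vector is renormalized after perturbation, the realized fold changes relative to baseline can differ somewhat from the nominal value.

```python
import random


def perturb(baseline, log2_fc, rng):
    """Scale each baseline probability up or down by 2**log2_fc
    (direction chosen at random per gene), then renormalize to sum to 1."""
    scaled = [p * 2 ** (rng.choice((-1, 1)) * log2_fc) for p in baseline]
    total = sum(scaled)
    return [s / total for s in scaled]


def draw_repertoire(probs, n, rng):
    """Draw n gene indices according to the probability vector probs."""
    return rng.choices(range(len(probs)), weights=probs, k=n)


rng = random.Random(1)
baseline = [1 / 50] * 50                 # assumed uniform baseline over 50 genes
perturbed = perturb(baseline, 1.0, rng)  # nominal 2-fold change per gene
sample = draw_repertoire(perturbed, 2000, rng)
reference = draw_repertoire(baseline, 2000, rng)  # equally sized baseline dataset
```

Pairs such as `sample` and `reference`, generated across a range of `log2_fc` values and dataset sizes, are the kind of input that a repertoire distance could be calibrated against.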

References

    1. Tonegawa S. Somatic generation of antibody diversity. Nature. 1983;302:575–81. doi: 10.1038/302575a0.
    2. Davis MM, Bjorkman PJ. T-cell antigen receptor genes and T-cell recognition. Nature. 1988;334:395–402. doi: 10.1038/334395a0.
    3. Schatz DG, Ji Y. Recombination centres and the orchestration of V(D)J recombination. Nat Rev Immunol. 2011;11:251–63. Available from: http://www.ncbi.nlm.nih.gov/pubmed/21394103.
    4. Rubelt F, Bolen CR, McGuire HM, Vander Heiden JA, Gadala-Maria D, Levin M, Euskirchen GM, Mamedov MR, Swan GE, Dekker CL, et al. Individual heritable differences result in unique cell lymphocyte receptor repertoires of naïve and antigen-experienced cells. Nat Commun. 2016;7:11112. Available from: https://www.ncbi.nlm.nih.gov/pubmed/27005435.
    5. Yaari G, Kleinstein SH. Practical guidelines for B-cell receptor repertoire sequencing analysis. Genome Med. 2015;7:121. doi: 10.1186/s13073-015-0243-2.
