Cell. 2015 Jan 29;160(3):420-32.
doi: 10.1016/j.cell.2015.01.020.

HIV-1 integration landscape during latent and active infection

Lillian B Cohn et al. Cell. 2015.

Abstract

The barrier to curing HIV-1 is thought to reside primarily in CD4+ T cells containing silent proviruses. To characterize these latently infected cells, we studied the integration profile of HIV-1 in viremic progressors, individuals receiving antiretroviral therapy, and viremic controllers. Clonally expanded T cells represented the majority of all integrations and increased during therapy. However, none of the 75 expanded T cell clones assayed contained intact virus. In contrast, the cells bearing single integration events decreased in frequency over time on therapy, and the surviving cells were enriched for HIV-1 integration in silent regions of the genome. Finally, there was a strong preference for integration into, or in close proximity to, Alu repeats, which were also enriched in local hotspots for integration. The data indicate that dividing clonally expanded T cells contain defective proviruses and that the replication-competent reservoir is primarily found in CD4+ T cells that remain relatively quiescent.

Figures

Figure 1. HIV-1 integration libraries. See also Figure S1
A) Diagram of integration library construction. B) Table of unique integrations identified in viremic controllers (C), viremic untreated progressors (V), and treated progressors (T). C) Proportion of integrations (Int) that are in genic or intergenic regions in C, V or T. D) Proportion of genic integrations located in introns in C, V or T. E) Proportion of integrations in genes with high, medium or low expression. P values refer to proportion of integrations in highly expressed genes. F) Transcriptional orientation of integrated HIV-1 relative to host gene in controllers, viremic or treated progressors. ns: not significant; *P<0.05; **P<0.01; ***P<0.0001, using two-proportion z-test.
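The two-proportion z-test named in the legends can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the authors' code: it uses the standard pooled, two-sided form of the test, and the function name and all counts in the usage lines are hypothetical.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled, two-sided two-proportion z-test.

    x1/n1 and x2/n2 are the success counts and totals of the two groups.
    Returns (z, p_value).
    """
    if n1 <= 0 or n2 <= 0:
        raise ValueError("group sizes must be positive")
    p1 = x1 / n1
    p2 = x2 / n2
    # Pooled proportion under the null hypothesis p1 == p2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        # Both groups are all-success or all-failure: no evidence of a difference
        return 0.0, 1.0
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: integrations in highly expressed genes out of all
# genic integrations, for two groups of samples.
z, p = two_proportion_z_test(80, 100, 50, 100)
print(f"z = {z:.2f}, p = {p:.2e}")
```

Whether the authors applied a pooled or unpooled variance, or a continuity correction, is not stated in the captions.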
Figure 2. Identification of clonally expanded cells bearing integrated HIV-1. See also Figure S2
A) Proportion of viral integrations (Int) that are clonally expanded, as identified by the same integration site with multiple shears in controllers (C), viremic (V) or treated progressors (T). B) Proportion of infected cells deriving from clonal expansion in C, V or T. C) Proportion of clonally expanded (CE) and single (S) viral integrations in genic or intergenic regions. D) Proportion of clonally expanded and single viral integrations in introns. E) Proportion of clonally expanded or single viral integrations in genes with high, medium or low expression. P values refer to proportion of integrations in highly expressed genes. F) Seven-parameter flow cytometry sorting strategy to identify CD4+ T cell subsets. CM, TM, and EM cell subsets were identified based on their CD45RA, CCR7, and CD27 expression. Shown is one representative sort. G) Proportion of viral integrations (Int) that are clonally expanded, as identified by the same integration site with multiple shears in sorted CD4+ T cell subsets from patient 9. H) Proportion of infected cells deriving from clonal expansion in sorted CD4+ T cell subsets from patient 9. ns: not significant; ***P<0.0001, using two-proportion z-test.
Figure 3. Clonally expanded viral integrations increase and single integrations decrease during therapy. See also Figure S3
Graphs show data from patients 1 (blue), 2 (red) and 3 (green) from longitudinal time points (Table S1). Time was normalized from 0 to 1 (727 days pre-therapy to 2617 days post-therapy). Dotted line at t = 0.21 marks therapy initiation. Trendline was determined by linear regression model. Solid lines indicate a significant change in proportion of events; dashed lines indicate a non-significant change in proportion of events. A) Proportion of clonally expanded viral integrations (Int). B) Proportion of single viral integrations. C) Proportion of genic clonally expanded viral integrations. D) Proportion of genic single viral integrations. E) Proportion of intergenic clonally expanded viral integrations. F) Proportion of intergenic single viral integrations.
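The time normalization and trendline described in this caption can be sketched as follows. This is a minimal illustration that assumes plain min-max scaling of days relative to therapy start and an ordinary least-squares fit; the function names and the sample proportions are hypothetical, and the significance test on the slope is not reproduced here.

```python
def normalize_time(day, first_day=-727, last_day=2617):
    """Map a day (relative to therapy start) onto [0, 1] by min-max scaling.

    The default bounds are the 727 days pre-therapy and 2617 days
    post-therapy quoted in the caption; the scaling rule is an assumption.
    """
    return (day - first_day) / (last_day - first_day)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical sampling days and proportions of clonally expanded integrations
days = [-727, 0, 900, 2617]
proportions = [0.10, 0.15, 0.30, 0.45]

times = [normalize_time(d) for d in days]
slope, intercept = fit_line(times, proportions)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

Under this assumed mapping, day 0 lands at about 0.217, while the caption places the dotted line at t = 0.21, so the authors' exact convention may differ slightly.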
Figure 4. Integrations in genes permissive for clonal expansion occur in multiple patients. See also Figure S4
A) Percent viral integrations present in more than one time point (persistent integrations) in patients 1, 2 and 3 (Table S1). B) Comparison of persistent (P) and clonally expanded (CE) viral integrations in genic or intergenic regions. C) Proportion of persistent and clonally expanded viral integrations in genes with high, medium or low expression. P values refer to proportion of integrations in highly expressed genes. D-F) Heatmap showing overlap between samples of genes containing clonally expanded or single viral integrations. Patients are indicated by P1-13. Multiple samples from one individual are marked by a bracket. The amount of overlap is denoted by color (see legend); Red = 100% overlap. G) Genes containing single or clonally expanded viral integrations were analyzed for their presence in multiple patients. Genes with integrations in more than one individual were classified as “overlapping”; genes with integrations in only one individual were classified as “unique.” Shown is the proportion of single and clonally expanded unique and overlapping viral integrations in genes with high, medium or low expression. P values refer to proportion of integrations in highly expressed genes. H) Genes with integrations were analyzed for their association with cancer. Proportions of cancer-associated genes are shown for single, clonally expanded and persistent viral integrations. The number indicates the total number of genes from each category. I) Graph shows proportion of integrations in cancer-related genes from patients 1 (blue), 2 (red) and 3 (green) from longitudinal time points (Table S1). Time was normalized from 0 to 1 (727 days pre-therapy to 2617 days post-therapy). Dotted line at t = 0.21 marks therapy initiation. Trendline was determined by linear regression model and indicates a significant change in proportion of events, p=0.023. ns: not significant; *P<0.05; **P<0.01; ***P<0.0001, using two-proportion z-test.
Figure 5. Large expanded clones are defective. See also Data S1
A) Sequence analysis of 5’ LTRs in clonally expanded integrations. Of 75 different clonally expanded integrations from 8 individuals, 24 showed fragmented 5’ LTRs, 43 did not have a recoverable 5’ LTR, and 8 contained intact 5’ LTRs. B) Strategy for HIV-1 sequencing. 8 proviruses were analyzed for intact viral sequence. Nested genomic primers and internal HIV primers were used in a PCR walking strategy to amplify fragments a-e from specific clonally expanded integrations. PCR products were sequenced directly. C) Summary of HIV-1 sequencing from large expanded clones. Sequences were aligned to HXB2 and examined for presence of large internal deletions. Intact sequences were analyzed for G → A hypermutation by Los Alamos Hypermut algorithm (Rose and Korber, 2000). Non-hypermutated products were analyzed for intact reading frames and frameshift mutations by Los Alamos HIVQC. Green dot: intact, non-hypermutated sequence. Red dot: no PCR product recovered. Red triangle: sequence with internal deletion. – : not done.
Figure 6. Identification of hotspots for HIV-1 integration
A) Number of hotspots identified by hot-scan in viremic controllers (C), viremic untreated (V) and treated progressors (T). B) Integrations in MKL2 from patients 10 and 11. Gray vertical arrows indicate sites of integration. Colored horizontal lines show fragments of DNA spanning the point of integration through the sheared end. Green: viruses integrated in the same orientation as the gene. Red: convergent orientation. Orange: viruses integrated in both orientations. C) HIV-1 gag was amplified from integrated proviruses in MKL2 from patients 10 and 11. PCR was performed using nested integration site-specific primers and HIV-1 gag primers. Sequences were clustered to assess DNA sequence similarity. The scale bar represents 0.007 substitutions per site. D) Proportion of virus integrations inside hotspots. E) Proportion of hotspots in genic and intergenic regions. F) Proportion of hotspots in introns. G) Proportion of hotspots in genes with high, medium or low expression. P values refer to proportion of integrations in highly expressed genes. H) Percentage of total single and clonally expanded viral integrations inside hotspots. Enrichment of clonally expanded viral integrations compared to single integrations is significant, p<0.0001. ns: not significant; *P<0.05; **P<0.01; ***P<0.0001, using proportion test.
Figure 7. Consensus motif for viral integration
A) 30bp sequence consensus motif (INT-motif). 100bp around all viral integration sites were analyzed for a consensus sequence by MEME (Bailey and Elkan, 1994). 444 integration sites were identified with the INT-motif. E-value: 6.4 × 10^-4071. The dotted line shows the preferred site of integration (see also (C)). B) Number of single (S) and clonally expanded (CE) integrations that were identified to contain the INT-motif within 100bp of the integration site. P<0.0001, using two-proportion z-test. C) Conserved integration site within INT-motif. Histogram maps the start site (5’ end) of INT-motif with respect to the integration site (dotted line). Peak shows the majority of integration sites occur 20bp from the 5’ end of the motif start site. Shaded region represents the location of the INT-motif relative to the majority of the integration sites. D) Location of integration preference and INT-motif inside Alu repeats is overlapping. Left, the locations of integration sites in Alu repeats were plotted relative to the midpoint of the repeat. Right, the location of the start site of INT-motifs within Alu repeats. E) Integrations are enriched inside Alu repeats. Total integrations identified inside Alu repeats were enumerated (red diamond) and compared to the expected value as defined by Monte Carlo simulation. The boxplot displays the variation of the number of random integrations identified inside Alu repeats by each iteration of the simulation. F) Integrations are near Alu repeats in genes and intergenic regions. Average distance to the nearest Alu repeat for all integrations inside genes or intergenic regions was calculated (red diamond) and compared to the expected distance as defined by Monte Carlo simulation. The boxplot displays the variation of the distance of random integrations from Alu repeats in genes or intergenic regions by each iteration of the simulation. G) Distance to Alu repeats from the center of highly, medium, low, trace or silently expressed genes. H) Distance to Alu repeats in highly, medium, low, trace or silently expressed genes. I) Positive correlation between Alu repeats and integrations inside hotspots. Graph shows number of Alu repeats (X axis) vs. integrations in hotspots (Y axis). Hotspots not containing Alu repeats were removed from this analysis. The scatter plot shows the linear relationship between the number of INT-motifs and integrations inside hotspots (Pearson's correlation, ρ = 0.86).
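The Monte Carlo comparison in panel E (observed integrations inside Alu repeats versus counts from randomly placed integrations) can be sketched as below. This is a toy illustration under stated assumptions, not the authors' pipeline: a single linear "genome", uniform random placement, sorted non-overlapping repeat intervals, and hypothetical function names and numbers throughout.

```python
import random
from bisect import bisect_right


def count_inside(positions, intervals):
    """Count positions falling in any half-open interval [start, end).

    `intervals` must be sorted by start and non-overlapping.
    """
    starts = [start for start, _ in intervals]
    inside = 0
    for pos in positions:
        # Index of the last interval whose start is <= pos
        i = bisect_right(starts, pos) - 1
        if i >= 0 and pos < intervals[i][1]:
            inside += 1
    return inside


def monte_carlo_enrichment(observed, n_sites, genome_length, intervals,
                           iterations=1000, seed=0):
    """Compare an observed count with counts from random placements.

    Returns (null_counts, p_value), where p_value is the one-sided
    empirical probability of a random count >= observed.
    """
    rng = random.Random(seed)
    null_counts = []
    for _ in range(iterations):
        simulated = [rng.randrange(genome_length) for _ in range(n_sites)]
        null_counts.append(count_inside(simulated, intervals))
    at_least = sum(1 for c in null_counts if c >= observed)
    # Add-one correction keeps the empirical p-value strictly above zero
    p_value = (at_least + 1) / (iterations + 1)
    return null_counts, p_value


# Hypothetical toy data: two "repeats" covering 20% of a 1000-bp genome
repeats = [(100, 200), (500, 600)]
null_counts, p = monte_carlo_enrichment(
    observed=40, n_sites=100, genome_length=1000, intervals=repeats
)
print(f"mean random count = {sum(null_counts) / len(null_counts):.1f}, p = {p:.4f}")
```

The list `null_counts` corresponds to what the boxplot summarizes, and `observed` plays the role of the red diamond. A real analysis would sample per chromosome and might restrict placement to mappable regions; the captions do not say which null model was used.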

References

    1. Bailey TL, Elkan C. Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Proceedings of the International Conference on Intelligent Systems for Molecular Biology (ISMB). 1994;2:28–36.
    2. Berry CC, Gillet NA, Melamed A, Gormley N, Bangham CR, Bushman FD. Estimating abundances of retroviral insertion sites from DNA fragment length data. Bioinformatics. 2012;28:755–762.
    3. Brady T, Agosto LM, Malani N, Berry CC, O'Doherty U, Bushman F. HIV integration site distributions in resting and activated CD4+ T cells infected in culture. AIDS. 2009;23:1461–1471.
    4. Buzon MJ, Sun H, Li C, Shaw A, Seiss K, Ouyang Z, Martin-Gayo E, Leng J, Henrich TJ, Li JZ, et al. HIV-1 persistence in CD4+ T cells with stem cell-like properties. Nature Medicine. 2014;20:139–142.
    5. Chomont N, El-Far M, Ancuta P, Trautmann L, Procopio FA, Yassine-Diab B, Boucher G, Boulassel MR, Ghattas G, Brenchley JM, et al. HIV reservoir size and persistence are driven by T cell survival and homeostatic proliferation. Nature Medicine. 2009;15:893–900.