PLoS Biol. 2004 Aug;2(8):E234.
doi: 10.1371/journal.pbio.0020234. Epub 2004 Aug 17.

Retroviral DNA integration: ASLV, HIV, and MLV show distinct target site preferences


Rick S Mitchell et al. PLoS Biol. 2004 Aug.

Abstract

The completion of the human genome sequence has made possible genome-wide studies of retroviral DNA integration. Here we report an analysis of 3,127 integration site sequences from human cells. We compared retroviral vectors derived from human immunodeficiency virus (HIV), avian sarcoma-leukosis virus (ASLV), and murine leukemia virus (MLV). Effects of gene activity on integration targeting were assessed by transcriptional profiling of infected cells. Integration by HIV vectors, analyzed in two primary cell types and several cell lines, strongly favored active genes. An analysis of the effects of tissue-specific transcription showed that it resulted in tissue-specific integration targeting by HIV, though the effect was quantitatively modest. Chromosomal regions rich in expressed genes were favored for HIV integration, but these regions were found to be interleaved with unfavorable regions at CpG islands. MLV vectors showed a strong bias in favor of integration near transcription start sites, as reported previously. ASLV vectors showed only a weak preference for active genes and no preference for transcription start regions. Thus, each of the three retroviruses studied showed unique integration site preferences, suggesting that virus-specific binding of integration complexes to chromatin features likely guides site selection.

Conflict of interest statement

The authors have declared that no conflicts of interest exist.

Figures

Figure 1. Relationship between Integration Sites and Transcriptional Intensity in the Human Genome
The human chromosomes are shown numbered. HIV integration sites from all datasets in Table 1 are shown as blue “lollipops”; MLV integration sites are shown in lavender; and ASLV integration sites are shown in green. Transcriptional activity is shown by the red shading on each of the chromosomes (derived from quantification of nonnormalized EST libraries, see text). Centromeres, which are mostly unsequenced, are shown as grey rectangles.
Figure 2. Integration Intensity in Genes and Intergenic Regions
Genes or intergenic regions were normalized to a common length and then divided into ten intervals to allow comparison. The number of integration sites in each interval was divided by the number of matched random control sites and the value plotted. A value of one indicates no difference between the experimental sites and the random controls. Viruses and cell types studied are as marked above each graph. The direction of transcription within each gene is from left to right. Note that our normalization method de-emphasizes favored MLV integration events just upstream of gene 5′ ends (outside transcription units), as reported by Wu et al. (2003). We carried out an analysis specially designed to identify this effect and confirmed that the regions just upstream of gene 5′ ends are favored for MLV integration when reanalyzed with the matched random control data (unpublished data).
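The normalization described in this caption can be illustrated with a minimal sketch. It assumes each site is given as a hypothetical tuple `(site, start, end, strand)` naming the gene or intergenic region that contains it; the function names and the record layout are illustrative and are not taken from the paper. The ratio is computed from raw counts, so if more than one control site is generated per experimental site the counts would first need rescaling.

```python
from collections import Counter


def interval_index(site, start, end, strand, n_intervals=10):
    """Map a site in [start, end) to an interval, oriented 5' to 3'."""
    frac = (site - start) / (end - start)  # fractional position along the region
    if strand == "-":
        frac = 1.0 - frac  # flip so transcription always runs left to right
    return min(int(frac * n_intervals), n_intervals - 1)


def interval_ratios(experimental, control, n_intervals=10):
    """Ratio of experimental to control site counts in each interval.

    A value of 1.0 means no difference from the random controls.
    Intervals with no control sites yield NaN instead of dividing by zero.
    """
    exp_counts = Counter(
        interval_index(*rec, n_intervals=n_intervals) for rec in experimental
    )
    ctl_counts = Counter(
        interval_index(*rec, n_intervals=n_intervals) for rec in control
    )
    ratios = []
    for i in range(n_intervals):
        c = ctl_counts[i]
        ratios.append(exp_counts[i] / c if c else float("nan"))
    return ratios
```

Normalizing every region to a common length is what allows genes of very different sizes to be pooled into the same ten intervals.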
Figure 3. Influence of Gene Activity on Integration Frequency
Expression levels were assayed using Affymetrix HU-95Av2 or HU-133A microarrays and scored by the average difference value as defined in the Affymetrix Microarray Suite 4.1 software package. All the genes assayed by the chip were divided into eight “bins” according to their relative level of expression (the leftmost bin in each panel is lowest expression levels and the rightmost the highest). Genes that hosted integration events were then distributed into the same bins, summed, and expressed as a percent of the total. The y-axis indicates the percent of all genes in the indicated bin. P values were determined using the Chi-square test for trends by comparison to a null hypothesis of no bias due to expression level. All average difference values were ranked prior to analysis, and the analysis was carried out on the ranked data. This was done to avoid possible complications due to differential normalization or other data processing differences arising during work up of the microarrays.
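The rank-based binning in this caption can be sketched as follows. This is an illustration under stated assumptions: `expression` is a hypothetical dictionary mapping gene identifiers to average difference values, ties are broken by gene name, and the Chi-square test for trends is not implemented here.

```python
def rank_bins(expression, n_bins=8):
    """Assign each assayed gene to a bin by expression rank (bin 0 = lowest)."""
    # Rank first, so only the ordering of values matters, not their scale.
    ordered = sorted(expression, key=lambda g: (expression[g], g))
    n = len(ordered)
    return {gene: rank * n_bins // n for rank, gene in enumerate(ordered)}


def percent_per_bin(hosting_genes, bins, n_bins=8):
    """Percent of integration-hosting genes that fall in each expression bin."""
    counts = [0] * n_bins
    for gene in hosting_genes:
        if gene in bins:  # skip genes not assayed on the chip
            counts[bins[gene]] += 1
    total = sum(counts)
    return [100.0 * c / total if total else 0.0 for c in counts]
```

Because the bins are built from ranks, each holds roughly the same number of assayed genes, so with no expression bias each of the eight bins would be expected to hold about 12.5% of the hosting genes.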
Figure 4. Effects of Tissue-Specific Transcription on Integration Site Selection in Different Cell Types
Genes hosting integration events by the HIV vector were analyzed for their expression levels in transcriptional profiling data from IMR90, PBMC, and SupT1 cells. For each gene hosting an integration event, the expression values from the three cell types were then ranked lowest (red), medium (orange), and highest (yellow). The values were summed and displayed separately for each set of integration sites: (A) IMR90 sites, (B) PBMC sites, and (C) SupT1 sites. In each case there was a significant trend for the cell type hosting the integration events to show the highest expression values relative to the other two (p < 0.05 for all comparisons).
Figure 5. Comparison of Transcriptional Intensity to Integration Intensity on Human Chromosome 11
All data were quantified in 2-Mb intervals. The top line shows summed EST data documenting the “transcriptional intensity” for each chromosomal interval (data from Mungall et al. [2003]). The bottom three lines show the summed frequency of integration site sequences in each interval. The numbers of ESTs (top) or integration sites (bottom three) are shown on the y-axis.
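Quantification in 2-Mb intervals amounts to counting features per fixed-width window. The sketch below assumes 0-based coordinates on a single chromosome; the function name and arguments are hypothetical, and the same routine could be applied to EST positions or to integration sites.

```python
from collections import Counter

BIN_SIZE = 2_000_000  # 2-Mb intervals, as stated in the caption


def counts_per_interval(positions, chrom_length, bin_size=BIN_SIZE):
    """Count positions falling in each fixed-width interval of a chromosome."""
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    counts = Counter(
        pos // bin_size for pos in positions if 0 <= pos < chrom_length
    )
    return [counts[i] for i in range(n_bins)]
```

Using the same interval boundaries for both data types lets the two tracks be compared interval by interval along the chromosome.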
Figure 6. The Effects of Proximity to CpG Islands Differ for HIV, MLV, and ASLV Integration
The viral vectors and cell types studied are indicated by color. A value of one indicates no bias, less than one indicates disfavored integration, and more than one indicates favored integration. The x-axis (from plus or minus 1 kb to 50 kb) indicates distance from the edge of a CpG island in either direction along the genome. The statistical analysis specifically removed the favorable effects of being in a gene and being in a region containing expressed genes to highlight the effects of CpG islands alone. When effects of gene density and activity are left in, HIV integration goes from disfavored at short distances (less than 1 kb) to favored at longer distances (more than 10 kb). This is because at longer distances the association with genes is significant: many CpG islands are within 10 kb of a gene, and genes are favored targets for HIV integration. To carry out this analysis, the numbers of experimentally determined and matched control sites were fitted according to whether they were near a CpG island, whether they were in genes, and the level of the expression density variable. Each variable contributes a “multiplier” for the ratio of the number of experimental to control sites. The multiplier for “near CpG island” is shown (see Protocol S2, pp. 9–12).
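One way to read the "multiplier" description is as a multiplicative (log-linear) model for the ratio of experimental to control sites. The notation below is an assumption made for illustration and is not taken from the paper; the fitted model itself is specified in Protocol S2. Here $E$ and $C$ are the numbers of experimental and control sites in a category, $I_{\mathrm{CpG}}$ and $I_{\mathrm{gene}}$ are 0/1 indicators for being near a CpG island and inside a gene, and $d$ is the level of the expression density variable.

```latex
\frac{E}{C}
  = \mu_0 \,
    \mu_{\mathrm{CpG}}^{\,I_{\mathrm{CpG}}} \,
    \mu_{\mathrm{gene}}^{\,I_{\mathrm{gene}}} \,
    \mu_{\mathrm{expr}}(d)
\qquad\Longleftrightarrow\qquad
\log\frac{E}{C}
  = \log\mu_0
  + I_{\mathrm{CpG}} \log\mu_{\mathrm{CpG}}
  + I_{\mathrm{gene}} \log\mu_{\mathrm{gene}}
  + \log\mu_{\mathrm{expr}}(d)
```

Under this reading, the plotted quantity corresponds to $\mu_{\mathrm{CpG}}$, estimated separately for each distance window, with the gene and expression density terms absorbing their own contributions.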
Figure 7. Frequency of Integration in Human Chromosomes
Human chromosome numbers are indicated at the bottom of the figure. The number of integration events detected in each chromosome was divided by the number expected from the matched random control. The line at one indicates the bar height expected if the observed number of integration events matched the expected number. Higher bars indicate favored integration, lower bars, disfavored integration. Most of the cell types studied were from human females; too little data were available for the Y chromosome for meaningful analysis.
