Nat Commun. 2021 Jul 23;12(1):4503.
doi: 10.1038/s41467-021-24462-5.

Stochastic pausing at latent HIV-1 promoters generates transcriptional bursting

Katjana Tantale et al. Nat Commun.

Abstract

Promoter-proximal pausing of RNA polymerase II is a key process regulating gene expression. In latent HIV-1 cells, it prevents viral transcription and is essential for latency maintenance, while in acutely infected cells the viral factor Tat releases paused polymerase to induce viral expression. Pausing is fundamental for HIV-1, but how it contributes to bursting and stochastic viral reactivation is unclear. Here, we performed single molecule imaging of HIV-1 transcription. We developed a quantitative analysis method that manages multiple time scales from seconds to days and that rapidly fits many models of promoter dynamics. We found that RNA polymerases enter a long-lived pause at latent HIV-1 promoters (>20 minutes), thereby effectively limiting viral transcription. Surprisingly and in contrast to current models, pausing appears stochastic and not obligatory, with only a small fraction of the polymerases undergoing long-lived pausing in absence of Tat. One consequence of stochastic pausing is that HIV-1 transcription occurs in bursts in latent cells, thereby facilitating latency exit and providing a rationale for the stochasticity of viral rebounds.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Single cell characterization of HIV-1 gene expression, with and without Tat.
A Schematic of HIV-1 transcriptional regulation. Left: in the absence of Tat, pTEFb is not recruited and polymerases bind NELF and DSIF and pause near the promoter. Right: in the presence of Tat, pTEFb, composed of Cyclin T1 and Cdk9 associated with the super-elongation complex, is recruited to the nascent TAR RNA. Cdk9 phosphorylates NELF, DSIF, and RNA polymerase II, thereby triggering pausing exit and processive elongation. B Schematic of the HIV-1 reporter construct. SD1: major HIV-1 splice site donor; SA7: last HIV-1 splice site acceptor; ψ: packaging signal; RRE: Rev-responsive element; LTR: long terminal repeat. C–E Expression of the 128xMS2 HIV-1 tagged reporter in cells expressing high levels of Tat. C Microscopy images of High Tat HeLa cells where the unspliced HIV-1 pre-mRNA is detected by smFISH with probes against the 128xMS2 tag. Cells bear a single copy of the reporter gene integrated with the Flp-in system. The bright spots in the nuclei correspond to nascent RNA at their transcription sites, while the dimmer spots correspond to single pre-mRNA molecules. Scale bar: 10 μm. This experiment has been done three times with similar results. D Distribution of the number of released HIV-1 pre-mRNAs per cell, in High Tat cells. Experimental RNA distributions are from smFISH data. x-axis: number of HIV-1 pre-mRNA molecules per cell; y-axis: number of cells; inset: mean number of HIV-1 pre-mRNAs per cell. E Distribution of the number of nascent HIV-1 pre-mRNAs per transcription site, in High Tat cells. Experimental RNA distribution is from smFISH data. x-axis: number of nascent HIV-1 pre-mRNA molecules per transcription site; y-axis: number of transcription sites; inset: mean number of nascent HIV-1 pre-mRNAs per cell. D, E Source data are provided as a Source Data file. F–H Expression of the 128xMS2 HIV-1 tagged reporter in cells expressing low levels of Tat. Legend as in (C–E), except that experiments are from Low Tat cells. This experiment has been done three times with similar results. G, H Source data are provided as a Source Data file. I–K Expression of the 128xMS2 HIV-1 tagged reporter in cells not expressing Tat. Legend as in (C–E), except that experiments are from No Tat cells. Image contrast adjustment is identical for panels C, F, and I. This experiment has been done four times with similar results. J, K Source data are provided as a Source Data file.
Fig. 2. Fluctuation of HIV-1 transcription over short time periods, with and without Tat.
A–F Fluctuations of HIV-1 transcription over 15–20 min periods, with one image stack recorded every 3 s. A, C, E Each graph is a single transcription site; the x-axis represents the time (in minutes) and the y-axis represents the intensity of transcription sites, expressed in equivalent numbers of full-length pre-mRNA molecules. B, D, F Each line is a cell and the transcription site intensity is color-coded (scale on the right). A, B High-Tat cells. C, D Low-Tat cells. E, F No-Tat cells. Source data are provided as a Source Data file. G Schematic of a polymerase convoy. Top: a polymerase convoy, with polymerases in orange and the gene represented as a black horizontal arrow. Npol: number of polymerases; tspace: spacing between successive RNA polymerases (in seconds); vel: elongation rate. Bottom: schematics describing the different phases of a transcription cycle (left) and the position of the polymerase convoy on the MS2 tagged gene (right; the green box is the MS2 tag). H Box-plots representing the parameter values of the best-fit models, measured for a set of isolated transcription cycles in each cell line (n = 89, 36, and 59 for High Tat, No Tat, and Low Tat, respectively). tproc is the 3′-end RNA processing time; Npol is the number of polymerases in the convoy; Vel is the elongation rate (in kb/min); tspace is the spacing between successive polymerases (in seconds). The bottom line displays the first quartile, the box corresponds to the second and third quartile, the top line to the last quartile, and the double circle is the median. Small circles are outliers (1.5 times the inter-quartile range above or below the upper and lower quartile, respectively). Source data are provided as a Source Data file.
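The convoy parameters named in panels G and H (Npol, tspace, vel, tproc) can be turned into an expected transcription-site signal. The following is a minimal sketch and not the authors' fitting code: it assumes each polymerase's signal rises linearly while it transcribes the MS2 tag, stays at one pre-mRNA equivalent until the transcript is released, and then drops to zero. The function names, the tag and downstream lengths, and the numerical values are all hypothetical.

```python
def polymerase_signal(t, vel_kb_min, tag_kb, post_tag_kb, t_proc_s):
    """Signal (0 to 1 pre-mRNA equivalent) of one polymerase, t seconds
    after it enters the MS2 tag. Piecewise-linear shape is an assumption."""
    if t < 0:
        return 0.0
    vel_kb_s = vel_kb_min / 60.0
    t_tag = tag_kb / vel_kb_s                          # time to transcribe the tag
    t_end = t_tag + post_tag_kb / vel_kb_s + t_proc_s  # release of the transcript
    if t < t_tag:
        return t / t_tag   # tag partially transcribed
    if t < t_end:
        return 1.0         # full tag present on the nascent RNA
    return 0.0             # RNA processed and released


def convoy_signal(t, n_pol, t_space_s, vel_kb_min, tag_kb, post_tag_kb, t_proc_s):
    """Sum the signals of n_pol polymerases spaced t_space_s seconds apart."""
    return sum(
        polymerase_signal(t - i * t_space_s, vel_kb_min, tag_kb, post_tag_kb, t_proc_s)
        for i in range(n_pol)
    )


# Hypothetical convoy: 5 polymerases, 4 s apart, 3 kb/min, 3 kb tag,
# 2 kb downstream of the tag, 100 s of 3'-end processing.
params = dict(n_pol=5, t_space_s=4.0, vel_kb_min=3.0,
              tag_kb=3.0, post_tag_kb=2.0, t_proc_s=100.0)
plateau = convoy_signal(100.0, **params)
```

With these illustrative values, the plateau of the cycle equals the number of polymerases in the convoy, which is why the site intensity is expressed in equivalent numbers of full-length pre-mRNA molecules.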
Fig. 3. Fluctuation of HIV-1 transcription over long time periods, with and without Tat.
A Fluctuations of HIV-1 transcription over 8 h, with one image stack recorded every 3 min. The x-axis represents the time (in hours) and y-axis represents the intensity of transcription sites, expressed in arbitrary units. Periods of HIV-1 promoter activity are colored in green, and periods of inactivity in red. B Active and inactive periods of the HIV-1 promoter, for the indicated cell lines. Each line is a cell and the activity of the HIV-1 promoter is color-coded (green: active; red: inactive), using the threshold shown in panel A. x-axis: time in hours. Source data are provided as a Source Data file.
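The color coding in panel B amounts to thresholding each intensity trace and merging consecutive frames of the same state. A minimal sketch of that step is given below, assuming a fixed threshold and the 3 min frame interval stated in the legend; the function name and the example trace are hypothetical, and how the threshold itself was chosen is not described here.

```python
def segment_activity(intensities, threshold, frame_interval_min=3.0):
    """Split a trace into consecutive (state, duration in minutes) periods.

    A frame counts as 'active' when its intensity exceeds the threshold."""
    periods = []
    for value in intensities:
        state = "active" if value > threshold else "inactive"
        if periods and periods[-1][0] == state:
            periods[-1][1] += frame_interval_min   # extend the current period
        else:
            periods.append([state, frame_interval_min])  # start a new period
    return [(state, duration) for state, duration in periods]


# Hypothetical trace in arbitrary units, one value per 3 min frame.
trace = [0.0, 0.0, 5.0, 6.0, 0.0, 7.0]
periods = segment_activity(trace, threshold=1.0)
```

The durations of the inactive periods obtained this way are the long waiting times that enter the analysis of the long movies.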
Fig. 4. Analysis and modeling strategy for the live cell transcriptional data.
A, B Determination of models for transcription initiation. A An example of a complex multiple state promoter model, describing the different steps leading to transcription initiation and their kinetic relationship. OFF: inactive promoter state; ON: active promoter state; orange ball: RNA polymerase. B The survival function (equal to one minus the cumulative function) describes the distribution of polymerase waiting times (delay between two successive initiation events). For multiple state models such as the one depicted on the left, the survival function can be fitted by a sum of exponentials, with the number of exponentials being equal to the number of promoter states. C Experimental and machine learning strategy to determine the survival function of polymerase waiting times. Left: signals of short movies made at high temporal resolution result from the convolution of the signal from a single polymerase and the sequence of temporal positions of initiation events. The sequence of initiation events can thus be reconstructed by a deconvolution numerical method (see Supplementary Note 2), provided that the signal of a single polymerase is known. This allows us to estimate the distribution of waiting times for waiting times shorter than the movie duration (i.e. a conditional distribution). Right: long movies made with a lower temporal resolution, in the order of the residency time of RNA polymerase on the gene (3 min), allow us to estimate the distribution of polymerase waiting times for waiting times greater than the temporal resolution. The two conditional survival functions, short and long, can then be combined to reconstitute the complete, multiple time scale survival function. The reconstitution uses affine transformations of the conditional survival functions, defined by two parameters ps and pl. pl is the probability that the waiting time is larger than the frame rate of the long movie. It is proportional to the number of waiting times hidden within active periods of the long movie, and is estimated from the number of inactive intervals and the cumulative duration of active periods of the long movie (see Supplementary Note 3). ps is the probability that the waiting time is larger than the short movie length and is fitted to minimize the distance between short and long parts of the distribution. Finally, the complete survival function is fitted with a sum of exponentials to determine the number of promoter states, the kinetics of transitions between them, and the initiation rate. Multiple models can be easily fitted to the same survival function and the most appropriate one is selected based on parsimony, parametric indeterminacy and consistency with complementary experiments.
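The two objects at the center of this legend, the empirical survival function of waiting times and its sum-of-exponentials model, can be written down compactly. The sketch below is illustrative only: the function names, amplitudes and rates are hypothetical, the amplitudes are assumed to sum to one so that S(0) = 1, and the affine recombination with ps and pl is not reproduced because its exact form is given in the Supplementary Notes rather than here.

```python
import bisect
import math


def empirical_survival(waiting_times):
    """Return (t, S(t)) pairs with S(t) = fraction of waiting times > t,
    evaluated at each distinct observed waiting time."""
    times = sorted(waiting_times)
    n = len(times)
    points = []
    for t in sorted(set(times)):
        n_not_greater = bisect.bisect_right(times, t)  # count of times <= t
        points.append((t, (n - n_not_greater) / n))
    return points


def multi_exp_survival(t, amplitudes, rates):
    """Sum-of-exponentials survival model: S(t) = sum_i A_i * exp(-rate_i * t)."""
    return sum(a * math.exp(-k * t) for a, k in zip(amplitudes, rates))


# Hypothetical waiting times (seconds) between successive initiation events.
points = empirical_survival([1.0, 2.0, 2.0, 4.0])

# Hypothetical two-exponential model: a fast and a slow component.
s_at_zero = multi_exp_survival(0.0, amplitudes=[0.7, 0.3], rates=[1.0, 0.01])
```

Fitting then consists of choosing the amplitudes and rates that bring `multi_exp_survival` closest to the empirical points, typically on a logarithmic scale so that the rare long waiting times still weigh in the fit.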
Fig. 5. Accuracy and robustness of the analysis and modeling pipeline.
A Accuracy and robustness of the deconvolution method. Left panels: simulation of short movies for an artificial set of polymerase initiation events, with noise added (bottom), or without (top). x-axis is time in minutes; y-axis is the intensity of transcription sites (expressed in number of RNA molecules). Right panels: positions of the transcription initiation events (vertical bars), for the original artificial data (black; bottom lines), the reconstructed data from the simulated short movies after the genetic algorithm (GA, red, middle lines), or the final reconstruction after both the GA and the local optimization (blue; top lines). x-axis is time in minutes. B–E Accuracy and robustness of the overall analysis pipeline. B The linear three-state promoter model used for Monte Carlo simulations. C Examples of artificial short movies (black lines), with various levels of noise added (red lines). Note that the experimentally measured noise level (resulting from the fitting deviations) corresponds to the 1x condition. x-axis is time in seconds; y-axis is the intensity of transcription sites expressed in number of RNA molecules. D Survival functions reconstructed from artificial short and long movies (red and green circles, respectively), and fitted to a sum of three exponentials (black line). The theoretical survival function obtained with the model parameters used for the simulation is shown for comparison (blue line). x-axis: time intervals between successive initiation events, in seconds and in log10 scale. y-axis: probability of Δt > x (log10 scale). E Accuracy of determining the model parameters. Graphs plot the parameters used to generate the artificial data (x-axis), against the parameter measured by the deconvolution and fitting procedure (y-axis). Vertical bars: confidence intervals estimated during the fitting procedure (see Methods).
Three parameter sets were used, corresponding to the values obtained with the experimental data from the High Tat cells (circles), Low Tat cells (crosses), and No Tat cells (triangles).
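A Monte Carlo simulation of a linear three-state promoter, as mentioned for panel B, can be sketched with a standard Gillespie-type scheme. The topology assumed below (OFF1 ⇄ OFF2 ⇄ ON, with initiation only from ON) is one reading of "linear three-state" and is an assumption, as are the rate names and values; none of them are the parameters used in the paper.

```python
import random


def simulate_waiting_times(n_events, rates, seed=0):
    """Gillespie-style simulation of waiting times between initiation events
    for an assumed OFF1 <-> OFF2 <-> ON promoter. Rates are in s^-1."""
    rng = random.Random(seed)
    state = "ON"          # arbitrary starting state; biases only the first wait
    clock = 0.0
    last_init = 0.0
    waits = []
    while len(waits) < n_events:
        if state == "OFF1":
            moves = [("OFF2", rates["off1_to_off2"])]
        elif state == "OFF2":
            moves = [("OFF1", rates["off2_to_off1"]), ("ON", rates["off2_to_on"])]
        else:
            moves = [("OFF2", rates["on_to_off2"]), ("INIT", rates["k_ini"])]
        total = sum(rate for _, rate in moves)
        clock += rng.expovariate(total)      # exponential time to the next event
        pick = rng.random() * total          # choose the event by its rate
        for target, rate in moves:
            if pick < rate:
                break
            pick -= rate
        if target == "INIT":
            waits.append(clock - last_init)  # promoter stays ON after initiating
            last_init = clock
        else:
            state = target
    return waits


# Hypothetical rates, chosen only to produce bursts separated by long gaps.
example_rates = {"off1_to_off2": 0.001, "off2_to_off1": 0.002,
                 "off2_to_on": 0.01, "on_to_off2": 0.02, "k_ini": 0.2}
waits = simulate_waiting_times(1000, example_rates, seed=1)
```

Building the survival function of such simulated waiting times and refitting it is a direct way to check whether known parameters are recovered, which is the kind of test panels D and E report.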
Fig. 6. A facultative pausing model reproduces the live cell transcription data and predicts a long-lived pause.
A Schematics of the different models used to fit the live cell HIV-1 transcriptional data. Polymerases are represented by small orange balls. B Fits of the experimental survival functions. Graphs represent the survival functions reconstructed from the live cell data for the High Tat, Low Tat, and No Tat conditions, with the part deriving from the short and long movies in red and green, respectively. Blue line: fit of the 3-state model with a facultative pause; “+”: fit of the 3-state model with an obligatory pause; “x”: fit with a facultative pause. x-axis: time intervals between successive initiation events, in seconds and in log10 scale. y-axis: probability of Δt > x (log10 scale). C Model scores. The graph depicts the score of each model (inverse of the minimal value of the fitted Objective Function), for each model and cell line. D, E Pausing characteristics predicted by the models. D Predicted pausing times, for the relevant models and cell lines (see text for details). E Predicted pausing frequencies (in %), for the indicated cell line and model. For the model with the facultative pause and systematic abortion, the two indicated values come from the two branches of the model that could each correspond to the paused state (see the symmetric representation of the model M2 when krelease = 0 in Supplementary Notes Fig. S5). F, G Features of the model with the facultative pause. F The graphs represent the number of mRNAs per cell measured by smFISH experiments (violet bars), or predicted from the model parameters (blue bars, with the center being the best fit value predicted from the model). Error bars are the standard deviation for the smFISH data (estimated from independent measurements; n = 3 for High Tat and Low Tat cells, and n = 4 for No Tat cells) and the confidence intervals for the prediction from the model (see Methods). Source data are provided as a Source Data file. G Estimated initiation rate (in s⁻¹), for the three cell lines (left), and the fraction of the cells with the promoter in the ON state (in %; right). The center is the best fit value predicted from the model and error bars are confidence intervals estimated during the fitting procedure (see Methods and Supplementary Note 3.3). Source data are provided as a Source Data file.
Fig. 7. Biochemical measurements indicate a long-lived paused state at the HIV-1 promoter.
A Residency time of RNA polymerase II at the HIV-1 promoter. The graph depicts the RNA polymerase II ChIP signals at the HIV-1 and GAPDH promoters during a Triptolide time course experiment, for the High Tat and No Tat cell lines. GAPDH TSS: transcription start site of the human GAPDH gene; HIV-1 TSS: transcription start site of the HIV-1 promoter; Control DNA: a non-transcribed genomic locus. ChIP signals were measured by qPCR and values are expressed as percent of input and normalized to the zero time point. For the control genomic regions (Control DNA), values are normalized to that of GAPDH TSS at time zero. Values are averaged from two independent experiments (± standard deviation) and source data are provided as a Source Data file. B Effect of pTEFb inhibition on the residency time of RNA polymerase II at the GAPDH promoter. Legend as in panel A, except that the KM sample was pretreated with the Cdk9 inhibitor KM05382 for 2 h before triptolide addition. Values are averaged from two independent experiments (± standard deviation) and source data are provided as a Source Data file. C Model depicting the dynamics of the HIV-1 promoter and highlighting the positive and negative effects of Tat. The numbers are from the facultative pausing model fitted to the High Tat and No Tat data (see Fig. 6C and S4; Supplementary Notes Table S4). The model with facultative pausing has two symmetrical branches (see model M2 in the Supplementary Notes Figure S5), and each branch of the model could correspond to the paused state. The values indicated attribute the pause state to the branch that is most affected by the presence of Tat.
Fig. 8. Bursting of the HIV-1 promoter in latently infected HeLa cells.
A Schematic of the HIV-1 reporter construct used to generate latent cells. SD1: major HIV-1 splice site donor; SA7: last HIV-1 splice site acceptor; ψ: packaging signal; RRE: Rev-responsive element; LTR: long terminal repeat; IRES: internal ribosome entry site; Hygro: hygromycin selectable marker; TK: herpes simplex thymidine kinase counter selectable marker. B Expression of HIV-1 in three latently infected HeLa clones. The histograms represent the distribution of the number of released HIV-1 128xMS2 pre-mRNAs per cell, in each of the three clones. Experimental RNA distributions are from smFISH data. x-axis: number of HIV-1 pre-mRNA molecules per cell; y-axis: number of cells. Red bars: untreated cells; blue bars: cells incubated with TNFα (50 ng/ml for 30 min); inset: cell treatment, with the mean number of HIV-1 pre-mRNAs per cell indicated in parentheses. Source data are provided as a Source Data file. C Active and inactive periods of the HIV-1 promoter, for the indicated cell lines. Each line is a cell and the activity of the HIV-1 promoter is color-coded (green: active; red: inactive), using the threshold shown in Fig. S4. x-axis: time in hours. D, E Fluctuations of HIV-1 transcription over 15–30 min periods, with one image stack recorded every 3 s in cells from clone 12. D Each graph is a single transcription site; the x-axis represents the time (in minutes) and the y-axis represents the intensity of transcription sites, expressed in equivalent numbers of full-length pre-mRNA molecules. E Each line is a cell and the transcription site intensity is color-coded (scale on the right). Source data are provided as a Source Data file. F Model scores. The graph depicts the score of each model (inverse of the minimal value of the fitted Objective Function), for clone 12 and for each of the models of Fig. 6A. G Pausing characteristics predicted by the model of facultative pausing for clone 12.
The two indicated values come from the two branches of the model that could each correspond to the paused state (see Fig. S4).
