Strand-specific, high-resolution mapping of modified RNA polymerase II

doi:10.15252/msb.20166869

Mol Syst Biol. 2016 Jun 10;12(6):874.

Laura Milligan¹, Vân A Huynh-Thu², Clémentine Delan-Forino¹, Alex Tuck³, Elisabeth Petfalski¹, Rodrigo Lombraña⁴, Guido Sanguinetti⁵, Grzegorz Kudla⁶, David Tollervey⁷

Affiliations

¹ Wellcome Trust Centre for Cell Biology, University of Edinburgh, Edinburgh, UK.
² School of Informatics, University of Edinburgh, Edinburgh, UK; Department of Electrical Engineering and Computer Science, University of Liège, Liège, Belgium.
³ Wellcome Trust Centre for Cell Biology, University of Edinburgh, Edinburgh, UK; Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland; European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Cambridge, UK.
⁴ MRC Human Genetics Unit, IGMM, University of Edinburgh, Edinburgh, UK.
⁵ School of Informatics, University of Edinburgh, Edinburgh, UK. gsanguin@inf.ed.ac.uk
⁶ MRC Human Genetics Unit, IGMM, University of Edinburgh, Edinburgh, UK. gkudla@gmail.com
⁷ Wellcome Trust Centre for Cell Biology, University of Edinburgh, Edinburgh, UK. d.tollervey@ed.ac.uk

PMID: 27288397
PMCID: PMC4915518
DOI: 10.15252/msb.20166869

Laura Milligan et al. Mol Syst Biol. 2016.

Abstract

Reversible modification of the RNAPII C-terminal domain links transcription with RNA processing and surveillance activities. To better understand this, we mapped the location of RNAPII carrying the five types of CTD phosphorylation on the RNA transcript, providing strand-specific, nucleotide-resolution information, and we used a machine learning-based approach to define RNAPII states. This revealed enrichment of Ser5P, and depletion of Tyr1P, Ser2P, Thr4P, and Ser7P in the transcription start site (TSS) proximal ~150 nt of most genes, with depletion of all modifications close to the poly(A) site. The TSS region also showed elevated RNAPII relative to regions further 3', with high recruitment of RNA surveillance and termination factors, and correlated with the previously mapped 3' ends of short, unstable ncRNA transcripts. A hidden Markov model identified distinct modification states associated with initiating, early elongating and later elongating RNAPII. The initiation state was enriched near the TSS of protein-coding genes and persisted throughout exon 1 of intron-containing genes. Notably, unstable ncRNAs apparently failed to transition into the elongation states seen on protein-coding genes.

Keywords: hidden Markov model; polymerase CTD phosphorylation; transcription; yeast.

Figures

**Figure EV1. Rpo21 crosslinking and purification during mCRAC**
The schematic indicates the major steps in the mCRAC protocol. The inset gel shows two replicates of the gel purification of Rpo21, with antibodies against the modifications indicated or a no antibody control (NAb), visualized by autoradiography of the 5′ [³²P] labeled, crosslinked RNA.

**Figure EV2. Distribution of RNAPII with respect to gene features**
RNAPII signals drop around the position of the translation termination codon. The graphs show the site of the termination codon as 0, plus 500 nt flanking regions. The density of crosslinking over all protein‐coding genes is shown for Rpo21 (total RNAPII), as well as for the poly(A)‐binding protein Pab1 and the cytoplasmic degradation factor Xrn1 (data from Tuck & Tollervey, 2013). The frequencies of the polyadenylation signals AWTAAA and TATATA are also indicated. Nucleosome distribution: the curve shows the fraction of transcripts where a nucleosome matches the corresponding position.

**Figure 2. Binding by RNA surveillance and degradation factors is strongly enriched close to the TSS on protein‐coding genes**
A–J
Each panel shows the hit density, normalized to the maximal binding value across all mRNA genes longer than 500 nt within each individual experiment. The two lines in each panel represent results from independent CRAC experiments. The TSS‐proximal 150 nt region is shaded. (A) Rpo21 (RNAPII); (B) Rrp44; (C) Rrp6; (D) Trf4; (E) Air2; (F) Nab3; (G) Mex67; (H) Distribution of the 3′ ends of short, promoter‐proximal, sense‐orientated ncRNA transcripts (S CUTs); (I) Nab2; (J) Hrp1. Sequence data sources: (A–E) this work; (F) Holmes *et al*, 2015; (G, I, J) Tuck & Tollervey, 2013; (H) Neil *et al*, 2009.
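The normalization described in this legend can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `normalize_to_max` and the toy hit densities are assumptions made for the example. Each experiment's profile is divided by its own maximum, so that replicates with different sequencing depths share a common 0 to 1 scale.

```python
def normalize_to_max(profile):
    """Scale a metagene profile so that its highest value becomes 1.0."""
    peak = max(profile)
    if peak <= 0:
        # No signal at all: return zeros rather than dividing by zero.
        return [0.0 for _ in profile]
    return [value / peak for value in profile]


# Hypothetical hit densities per position bin for two replicate experiments.
replicate_1 = [4.0, 20.0, 10.0, 5.0]
replicate_2 = [1.0, 8.0, 3.0, 2.0]

# Each replicate is normalized independently, using its own maximum.
normalized = [normalize_to_max(p) for p in (replicate_1, replicate_2)]
print(normalized)
```

Normalizing within each experiment keeps the shape of a profile comparable between replicates, but it discards absolute differences in binding between factors.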

**Figure 3. Profiles of RNAPII phosphorylation on mRNA and ncRNA genes can be generated using mCRAC**
A
Distribution of RNAPII phosphorylation across protein‐coding genes aligned at the TSS, as in Fig 1C, determined by mCRAC analyses on Rpo21‐HTP. Red color indicates depletion, and green color indicates enrichment of phosphorylation relative to total RNAPII.
B
As panel (A) but with genes aligned at the polyadenylation site.
C
Metagene analysis of RNAPII phosphorylation enrichment relative to the 5,000 strongest RNAPII pause sites in mRNA genes, identified by NET‐Seq (Churchman & Weissman, 2011).
D
Metagene analysis of RNAPII phosphorylation enrichment relative to transcription start sites, calculated for all mRNA genes. The TSS‐proximal 150 nt region, where Ser5P is enriched and Ser2P and Tyr1P are depleted, is indicated with a dashed line.
E
Metagene analysis of RNAPII phosphorylation on CUTs, as for (D).
F
Metagene analysis of RNAPII phosphorylation on SUTs, as for (D).
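Enrichment of a phosphorylated form relative to total RNAPII can be pictured as a per-position log ratio. The sketch below is only an illustration under stated assumptions: the base-2 logarithm, the pseudocount of 1, the function name `log_enrichment`, and the toy read counts are all hypothetical and are not taken from the paper.

```python
import math


def log_enrichment(modified, total, pseudocount=1.0):
    """Per-position log2 ratio of modified-RNAPII signal over total RNAPII.

    The pseudocount (an assumption of this sketch) avoids division by zero
    and damps ratios at positions with very few reads.
    """
    if len(modified) != len(total):
        raise ValueError("profiles must have the same length")
    return [
        math.log2((m + pseudocount) / (t + pseudocount))
        for m, t in zip(modified, total)
    ]


# Hypothetical read counts along one transcript, ordered 5' to 3'.
ser5p = [31.0, 15.0, 3.0]
total = [7.0, 15.0, 15.0]

scores = log_enrichment(ser5p, total)
print(scores)  # positive values: enrichment; negative values: depletion
```

With these toy counts the score is positive at the 5′ end and negative at the 3′ end, which corresponds to the green (enriched) and red (depleted) coloring used in the heat maps.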

**Figure EV3. Distribution of RNAPII phosphorylation across mRNAs and non‐coding RNA genes**
Distribution of RNAPII phosphorylation as in Fig 3A, across all protein‐coding genes, compared to the SUT, CUT, and snoRNA classes of ncRNA. Red color indicates depletion, and green color indicates enrichment of phosphorylation relative to total RNAPII. The graph below each panel shows a metagene analysis of RNAPII phosphorylation enrichment for all genes in each class.

**Figure EV4. Comparison of distributions of phosphorylation enrichment between different transcript classes**
A
Distributions of phosphorylation enrichment on individual mRNAs, CUTs, and SUTs. The boxes show the median and interquartile range of phosphorylation enrichment scores for mRNAs (N = 5,171), CUTs (N = 925), and SUTs (N = 847).
B
Distributions of phosphorylation enrichment in the first 500 nt of mRNAs, CUTs, and SUTs. Only genes with length greater than 500 nt were analyzed. The boxes show the median and interquartile range of phosphorylation enrichment scores for mRNAs (N = 5,040), CUTs (N = 294), and SUTs (N = 563).
Data information: The whiskers indicate 1.5 times the interquartile range, and the asterisks indicate statistical significance (P < 0.01, Wilcoxon test with Bonferroni correction, N = 30).

**Figure 4. HMM analyses of phosphorylation state distributions**
A
Metagene analysis of frequency distribution for each state on all protein‐coding genes. See expanded view for analyses of replicate datasets and statistical analyses.
B
Genome browser views showing the distribution of the 8 states in the HMM over an unspliced gene (*LCP5*) and a spliced gene (*RPS0B*).
C
Learned emission matrix of the 8‐state HMM. Each row shows the average log‐enrichment levels of the different phosphorylated forms of Rpo21 over total RNAPII in one of the states.
D
Comparison of the locations of the 3′ end of initiation state I1 with the 3′ boundary of nucleosome 1.
E
Comparison of the locations of the 5′ end of early elongation state EE with the 5′ boundary of nucleosome 2.
F–H
The presence of an intron is associated with displacement of phosphorylation state boundaries. The graphs show state fold enrichment for each state over protein‐coding genes lacking an intron (F), containing short exon 1 (< 100 nt) regions (G) or long (> 100 nt) exon 1 regions (H). For each panel, the length of each gene has been divided into ten bins to allow the combination of genes with different lengths. T, Transcript; E1, Exon 1; I, Intron; E2, Exon 2. Intron‐containing genes in yeast are generally highly expressed, and the top quartile of intronless genes was therefore taken for comparison.

**Figure EV5. HMM transition matrix, reproducibility of results, and state enrichment analysis**
A
Plot showing the mean squared error with respect to the number of states in the HMM.
B
Learned state transition matrix for the 8‐state HMM. Each entry (i,j) of the matrix shows the probability of transitioning from hidden state i to hidden state j at any position along the genome.
C
State distributions across the first quartile of protein‐coding genes from two independent mCRAC analyses.
D
State distributions across the first quartile of protein‐coding genes relative to the positions of nucleosomes 1–5 (N1–N5) on these genes.
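To show how a transition matrix of this kind is used, the sketch below decodes the most probable state path with the Viterbi algorithm, a standard technique for HMMs. It is a toy example under stated assumptions and does not reproduce the 8‐state model from the paper: the two states, the unit-variance Gaussian emissions, the function name `viterbi`, and every number are hypothetical, and the source does not state which decoding procedure the authors used.

```python
import math


def viterbi(observations, start, transition, means):
    """Most probable hidden-state path for a Gaussian-emission HMM.

    transition[i][j] is the probability of moving from state i to state j;
    means[k] is the emission mean vector of state k (unit variance assumed).
    """
    n_states = len(start)

    def log_emission(state, obs):
        # Log density of a unit-variance Gaussian, up to an additive constant.
        return -0.5 * sum((x - mu) ** 2 for x, mu in zip(obs, means[state]))

    # scores[k]: best log probability of any path ending in state k.
    scores = [math.log(start[k]) + log_emission(k, observations[0])
              for k in range(n_states)]
    backpointers = []
    for obs in observations[1:]:
        new_scores, pointers = [], []
        for j in range(n_states):
            best_i = max(range(n_states),
                         key=lambda i: scores[i] + math.log(transition[i][j]))
            pointers.append(best_i)
            new_scores.append(scores[best_i]
                              + math.log(transition[best_i][j])
                              + log_emission(j, obs))
        scores = new_scores
        backpointers.append(pointers)

    # Trace back from the best final state to recover the full path.
    state = max(range(n_states), key=lambda k: scores[k])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Hypothetical two-state model. State 0 is "initiation-like" (first mark
# high, second low); state 1 is "elongation-like" (the reverse).
start = [0.9, 0.1]
transition = [[0.8, 0.2],
              [0.1, 0.9]]
means = [[1.0, -1.0],
         [-1.0, 1.0]]
# Each observation is a pair of log-enrichment values at one position.
observations = [[1.2, -0.8], [0.9, -1.1], [-1.0, 0.7], [-0.8, 1.2]]

print(viterbi(observations, start, transition, means))
```

Working in log space avoids numerical underflow on long sequences, which matters when a model is run along whole chromosomes rather than four positions.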

**Figure EV6. Comparison of the effects of different state numbers in the HMM**
A
Emission matrices for 6‐, 8‐, and 10‐state HMM models.
B–D
(B) 6‐state HMM model; (C) 8‐state HMM model; (D) 10‐state HMM model. In each panel, the left graph (I) shows the top quartile of protein‐coding genes, the middle graph (II) shows spliced genes with short exon 1 regions (< 100 nt) and the right graph (III) shows spliced genes with long exon 1 regions (> 100 nt). In each model, preferential initiation and elongation states are clearly resolved. In all models, the initiation state is clearly extended further 3′ on intron‐containing genes.

**Figure EV7. The presence of an intron is associated with displacement of phosphorylation state boundaries**
A
Average state probabilities shown individually for each state over protein‐coding genes lacking an intron (intronless) or with short (< 100 nt) or long (> 100 nt) exon 1 regions, aligned by the transcription start site (TSS).
B, C
Average state probabilities shown individually for each state over protein‐coding genes with short (< 100 nt) or long (> 100 nt) exon 1 regions, aligned by 5′ splice site (5′SS; panel B) or 3′ splice site (3′SS; panel C).
D
Distribution of phosphorylated RNAPII relative to the 3′ splice site (3′SS) in intron‐containing genes.

**Figure 5. Metagene HMM analyses of phosphorylation states on protein‐coding and ncRNA genes**
A
Comparison of state frequencies over expression‐matched protein‐coding mRNAs, SUTs, and CUTs.
B
Comparison of state distributions over expression‐matched protein‐coding mRNAs and CUTs as in A.
C
The failure of lncRNAs to exit initiation state I1 is not a consequence of short length. The curves show the positions at which RNAPII exits initiation state I1 (red curve) and enters early elongation state EE (black curve), relative to the lengths of CUTs (green line) and SUTs (blue line).

**Figure 6. Model comparing the state transitions on coding and non‐coding RNA transcripts**
Yeast genes typically have a well‐ordered nucleosome close to the transcription start site. On both mRNA and ncRNA genes, RNAPII is generally in an initiation state (I1 and I2; shown as I1) while traversing the first nucleosome. This state favors recruitment of the nuclear RNA surveillance machinery, including the NNS termination complex and the TRAMP–exosome degradation system. On mRNAs, RNAPII then transitions into the early elongation state (EE), followed by the major elongation states (E1 to E3; shown as E1). On intron‐containing mRNAs, the transition from the initiation to the early elongation state is displaced further 3′. On ncRNAs, the initiation state persists, and the failure to transition to the elongation states (EE and E1) favors termination and transcript degradation.

Cited by

Nascent RNA and the Coordination of Splicing with Transcription.
Neugebauer KM. Cold Spring Harb Perspect Biol. 2019 Aug 1;11(8):a032227. doi: 10.1101/cshperspect.a032227. PMID: 31371351. Review.
A Nuclear Export Block Triggers the Decay of Newly Synthesized Polyadenylated RNA.
Tudek A, Schmid M, Makaras M, Barrass JD, Beggs JD, Jensen TH. Cell Rep. 2018 Aug 28;24(9):2457-2467.e7. doi: 10.1016/j.celrep.2018.07.103. PMID: 30157437.
Surveillance-ready transcription: nuclear RNA decay as a default fate.
Bresson S, Tollervey D. Open Biol. 2018 Mar;8(3):170270. doi: 10.1098/rsob.170270. PMID: 29563193. Review.
RNA Binding by Histone Methyltransferases Set1 and Set2.
Sayou C, Millán-Zambrano G, Santos-Rosa H, Petfalski E, Robson S, Houseley J, Kouzarides T, Tollervey D. Mol Cell Biol. 2017 Jun 29;37(14):e00165-17. doi: 10.1128/MCB.00165-17. Print 2017 Jul 15. PMID: 28483910.
Diverse and conserved roles of the protein Ssu72 in eukaryotes: from yeast to higher organisms.
Liu C, Zhang W, Xing W. Curr Genet. 2021 Apr;67(2):195-206. doi: 10.1007/s00294-020-01132-5. Epub 2020 Nov 26. PMID: 33244642. Review.
