Genome Res. 2012 Sep;22(9):1616-25.
doi: 10.1101/gr.134445.111.

Deep sequencing of subcellular RNA fractions shows splicing to be predominantly co-transcriptional in the human genome but inefficient for lncRNAs

Hagen Tilgner et al. Genome Res. 2012 Sep.

Abstract

Splicing remains an incompletely understood process. Recent findings suggest that chromatin structure participates in its regulation. Here, we analyze the RNA from subcellular fractions obtained through RNA-seq in the cell line K562. We show that in the human genome, splicing occurs predominantly during transcription. We introduce the coSI measure, based on RNA-seq reads mapping to exon junctions and borders, to assess the degree of splicing completion around internal exons. We show that, as expected, splicing is almost fully completed in cytosolic polyA+ RNA. In chromatin-associated RNA (which includes the RNA that is being transcribed), for 5.6% of exons, the removal of the surrounding introns is fully completed, compared with 0.3% of exons for which no intron-removal has occurred. The remaining exons exist as a mixture of spliced and fewer unspliced molecules, with a median coSI of 0.75. Thus, most RNAs undergo splicing while being transcribed: "co-transcriptional splicing." Consistent with co-transcriptional spliceosome assembly and splicing, we have found significant enrichment of spliceosomal snRNAs in chromatin-associated RNA compared with other cellular RNA fractions and other nonspliceosomal snRNAs. CoSI scores decrease along the gene, pointing to a "first transcribed, first spliced" rule, yet more downstream exons carry other characteristics, favoring rapid, co-transcriptional intron removal. Exons with low coSI values, that is, in the process of being spliced, are enriched with chromatin marks, consistent with a role for chromatin in splicing during transcription. For alternative exons and long noncoding RNAs, splicing tends to occur later, and the latter might remain unspliced in some cases.
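As a rough illustration of the idea behind coSI, the sketch below computes, for one internal exon, the fraction of informative reads that indicate completed splicing. The exact definition is given in Figure 1B of the paper and is not reproduced in this abstract, so the formula, the function name `cosi`, and its parameters are assumptions made for illustration only.

```python
from typing import Optional


def cosi(junction_reads: int, border_reads: int, min_reads: int = 1) -> Optional[float]:
    """Hypothetical, simplified completed-splicing index for one internal exon.

    junction_reads: reads mapping across exon-exon junctions
                    (taken here as evidence that flanking introns were removed).
    border_reads:   reads mapping across exon-intron borders
                    (taken here as evidence that an intron is still present).
    min_reads:      assumed minimum number of informative reads needed
                    before a score is reported.
    """
    if junction_reads < 0 or border_reads < 0:
        raise ValueError("read counts must be non-negative")
    total = junction_reads + border_reads
    if total < min_reads:
        # Too few informative reads to report a score.
        return None
    # Assumed form: share of informative reads that support completed splicing.
    return junction_reads / total


# Illustrative counts only, not data from the study.
print(cosi(30, 10))
```

Under this assumed form, a score of 1 corresponds to an exon whose surrounding introns are fully removed in every informative read, a score of 0 to an exon with no intron removal, and intermediate values to a mixture of spliced and unspliced molecules.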

Figures

Figure 1.
(A) Long RNA-seq data sets used in this analysis. (B) Definition of the completed splicing index (coSI) for each internal exon and each RNA-seq data set.
Figure 2.
Histogram of coSI values (left) and boxplots of coSI values in bins according to the distance from an exon to the annotated polyA site (intervals on x-axis give minimum and maximum distance in each bin; right) for the total chromatin-associated RNA fraction (A), the polyA− nuclear fraction (B), the polyA+ nuclear fraction (C), and the polyA+ cytosolic fraction (D). P-values were calculated by comparing the first and the last bin, using a two-sided Wilcoxon rank sum test. Numbers below boxplots indicate the median value of the corresponding distribution.
Figure 3.
RNA-seq profile plots using RNA-seq reads mapping to the genome only, aligned at the acceptor. At each aligned position, the average number of overlapping RNA-seq reads (mapping to the genome) for all exons in each bin (according to coSI values in the total chromatin-associated RNA fraction in all four subfigures) is plotted for sense (solid lines) and antisense strand (dashed lines). Here, only exons that are at least 150 bp away from any other exon are used. RNA-seq profiles for the total chromatin-associated RNA fraction (A), the polyA− nuclear fraction (B), the polyA+ nuclear fraction (C), and the polyA+ cytosolic fraction (D). (Dark gray area) Positions that are guaranteed to be covered only by reads that were not used for the coSI value calculation; (both gray areas) positions that are guaranteed to be intronic. Note that these profiles are not normalized for gene expression. We added profiles normalized for cytosolic polyA+ gene expression in Supplemental Figure S6.
Figure 4.
An RPM value was calculated based on short RNA-seq in each subcellular fraction: total chromatin fraction (CHR; red), total cytoplasmic fraction (CYT; yellow), total nucleoli fraction (NL; green), total nucleoplasmic fraction (NP; light blue), total nuclear fraction (NUC; purple), and total whole-cell fraction (WC; pink). The values were summed for all genes encoding U1 RNA (A), U2 RNA (B), U3 RNA (C), U4 RNA (D), U5 RNA (E), U6 RNA (F), U6atac (G), and non–U-RNA snoRNAs (H).
Figure 5.
Linear model connecting exon-coSI values to gene, exon, and chromatin structure variables. (A) Smoothed scatterplot and correlation between predicted coSI values and measured coSI values using the entire model. (B) Correlation of predicted coSI values and measured coSI values using four increasing subsets of variables and the entire model: model with distance to TSS and distance to polyA site (pos); model additionally including acceptor strength, donor strength, log-exon-length, log-upstream-intron-length, log-downstream-intron-length and exonic GC content (+struc); model additionally including gene RPKMs from polyA+ nuclear RNA (+GE); model additionally including ChIP-seq related variables (+chrom); model including all variables (entire). (C) Coefficients in the entire model of distance to the TSS and to the polyA-site. (D) Acceptor strength (accSc), donor strength (donSc), exonic GC content (GC), log-exon-length [lg(exLen)], log-upstream-intron-length [lg(upILen)], log-downstream-intron length [lg(doIlen)] and gene RPKMs from polyA+ nuclear RNA (GE). (E) MNase and histone modification values as described in Figure S10.
Figure 6.
(A) Clustering of subcellular RNA fractions and exons according to exonic coSI values using four RNA fractions. From left to right: total chromatin-associated RNA, polyA− nuclear RNA, polyA+ nuclear RNA, polyA+ cytosolic RNA. Note that the scale is only linear from coSI ≥ 0.5 on. (B) Overlap between exons with a tendency for post-transcriptional splicing (postTS) and entirely coding exons (CDS). (C) Overlap between cell type specifically included AS-exons and exons with a tendency for postTS. (D) The distribution of coSI scores for various exon sets is shown, based on calculations for chromatin total RNA. Information is plotted for the 4933 lncRNA exons and 372,306 protein-coding gene exons that have sufficient RNA-seq reads to calculate a confident coSI score. In addition, we extracted exon values for three known lncRNAs: H19 (18 exons), XIST (19 exons), U50HG-SNHG5 (22 exons). The difference between lncRNA and protein-coding exon coSI values is statistically significant (Wilcoxon test; P < 2.2 × 10^−16). (E) Gene-level coSI scores from chromatin total RNA are plotted for 92 lncRNAs and 4066 protein-coding genes. The difference between the distributions is statistically significant (Wilcoxon test; P < 2.2 × 10^−16). (F) Exon-level coSI scores from nuclear polyA+ RNA are plotted for 206 lncRNA exons and 32,496 protein-coding exons. The difference between the distributions is statistically significant (Wilcoxon test; P < 2 × 10^−16).
