Genome Biol. 2012 Nov 26;13(11):R107.
doi: 10.1186/gb-2012-13-11-r107.

Transposable elements reveal a stem cell-specific class of long noncoding RNAs

David Kelley et al. Genome Biol.

Abstract

Background: Numerous studies over the past decade have elucidated a large set of long intergenic noncoding RNAs (lincRNAs) in the human genome. Research since has shown that lincRNAs constitute an important layer of genome regulation across a wide spectrum of species. However, the factors governing their evolution and origins remain relatively unexplored. One possible factor driving lincRNA evolution and biological function is transposable element (TE) insertions. Here, we comprehensively characterize the TE content of lincRNAs relative to genomic averages and protein coding transcripts.

Results: Our analysis of the TE composition of 9,241 human lincRNAs revealed that, in sharp contrast to protein coding genes, 83% of lincRNAs contain a TE, and TEs comprise 42% of lincRNA sequence. lincRNA TE composition varies significantly from genomic averages: L1 and Alu elements are depleted and broad classes of endogenous retroviruses are enriched. TEs occur in biased positions and orientations within lincRNAs, particularly at their transcription start sites, suggesting a role in lincRNA transcriptional regulation. Accordingly, we observed a dramatic example of HERVH transcriptional regulatory signals correlating strongly with stem cell-specific expression of lincRNAs. Conversely, lincRNAs devoid of TEs are expressed at greater levels than lincRNAs with TEs in all tissues and cell lines, particularly in the testis.

Conclusions: TEs pervade lincRNAs, dividing them into classes, and may have shaped lincRNA evolution and function by conferring tissue-specific expression from extant transcriptional regulatory signals.

Figures

Figure 1
Transposable element composition of human lincRNAs. We intersected TE annotations with a catalog of 9,241 human lincRNAs. (a) TEs compose less lincRNA sequence than genomic background but much more than protein coding genes. Promoters for the two gene classes are more similar than the transcripts. (b) The lincRNA frequencies of many specific TE families differ significantly (based on a shuffling statistical test) from their genomic averages. Larger families are to the right. Enrichments are above zero on the y-axis, and depletions are below zero. ERV1 families (labeled in blue) are particularly enriched.
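The caption mentions a shuffling statistical test without describing it here. As a rough illustration only, the sketch below shows one common way such a test can be set up: transcript intervals are relocated to random positions and their TE overlap is compared with the observed overlap. The toy genome, the `te_mask` representation, and the function names are all assumptions made for this example, not the authors' procedure.

```python
import random

def overlap_bp(intervals, te_mask):
    """Count bases of half-open (start, end) intervals that fall on TE-annotated positions."""
    return sum(sum(te_mask[s:e]) for s, e in intervals)

def shuffle_test(intervals, te_mask, n_shuffles=1000, seed=0):
    """Return (observed overlap, mean shuffled overlap, empirical two-sided p-value).

    Hypothetical sketch: each interval keeps its length and is moved to a
    uniformly random position; shuffled intervals may overlap one another.
    """
    rng = random.Random(seed)
    genome_len = len(te_mask)
    observed = overlap_bp(intervals, te_mask)
    null = []
    for _ in range(n_shuffles):
        shuffled = []
        for s, e in intervals:
            length = e - s
            new_start = rng.randrange(0, genome_len - length + 1)
            shuffled.append((new_start, new_start + length))
        null.append(overlap_bp(shuffled, te_mask))
    mean_null = sum(null) / len(null)
    # Count shuffles at least as far from the null mean as the observed value.
    extreme = sum(1 for x in null if abs(x - mean_null) >= abs(observed - mean_null))
    p_value = (extreme + 1) / (n_shuffles + 1)  # add-one correction avoids p = 0
    return observed, mean_null, p_value

# Toy data (made up): a 1,000-base genome whose first half is TE sequence,
# and three transcripts that all sit in the TE-free half.
te_mask = [1] * 500 + [0] * 500
transcripts = [(600, 700), (750, 850), (900, 1000)]
observed, mean_null, p_value = shuffle_test(transcripts, te_mask)
```

In this toy case the observed overlap is zero while shuffled placements overlap TEs on average, which is the kind of depletion that would plot below zero on the y-axis of panel (b).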
Figure 2
Example lincRNAs with TE annotations. lincRNA exons are drawn above in blue, with introns colored lighter. TEs are colored by family, matching the legend in Figure 1a. (a) TUG1 serves as a typical example of a lincRNA containing multiple TE families. (b) Alternatively, HOTAIR and 1,531 (17%) of the lincRNAs in our catalog are devoid of TEs. (c) Linc-ROR is almost entirely composed of TEs, including its TSS in the LTR of a HERVH element. (d) BC026300 also initiates transcription in a HERVH. The images were created using the software AnnotationSketch [93].
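The counts in panel (b) can be checked against the percentages quoted in the abstract. The short calculation below uses only the two figures stated in the text (9,241 lincRNAs, 1,531 devoid of TEs); the variable names are chosen for this example.

```python
total_lincrnas = 9241   # catalog size stated in the text
te_free = 1531          # lincRNAs devoid of TEs, from the caption

with_te = total_lincrnas - te_free
pct_te_free = round(100 * te_free / total_lincrnas)   # about 16.6%, rounds to 17
pct_with_te = round(100 * with_te / total_lincrnas)   # about 83.4%, rounds to 83
```

The two rounded values agree with the 17% in this caption and the 83% reported in the Results.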
Figure 3
ERV1 LTRs associate with lincRNA TSSs. We plotted the coverage of various TE families approaching lincRNA TSSs. The prevalent L1 and Alu families are depleted in lincRNAs. Accordingly, their coverage drops throughout lincRNA promoters leading up to the TSS. Alternatively, ERV1 elements are enriched in lincRNAs, and coverage of the transcription-promoting ERV1 LTRs peaks at the TSS. This pattern was not observed for mRNAs (Figure S11 in Additional file 1).
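To make the idea of "coverage approaching the TSS" concrete, here is a minimal sketch of how a per-position coverage profile could be computed. It assumes a single toy sequence, a 0/1 `te_mask` for one TE family, and a list of (position, strand) TSSs; these inputs and the function name are hypothetical and are not taken from the paper's pipeline.

```python
def coverage_profile(tss_list, te_mask, window=3):
    """Fraction of TSSs covered by a TE at each offset from -window to 0 (0 is the TSS).

    Offsets are measured upstream in transcript orientation: toward lower
    coordinates on the '+' strand and toward higher coordinates on '-'.
    """
    counts = [0] * (window + 1)
    used = [0] * (window + 1)
    for pos, strand in tss_list:
        for i, offset in enumerate(range(-window, 1)):
            g = pos + offset if strand == "+" else pos - offset
            if 0 <= g < len(te_mask):  # skip positions that run off the sequence
                counts[i] += te_mask[g]
                used[i] += 1
    return [c / u if u else 0.0 for c, u in zip(counts, used)]

# Toy data (made up): TE bases at positions 9-10 and 15-16 of a 20-base sequence.
te_mask = [0] * 20
for p in (9, 10, 15, 16):
    te_mask[p] = 1

tss_list = [(10, "+"), (15, "-")]
profile = coverage_profile(tss_list, te_mask, window=3)
```

Here the profile rises from 0 to 1 at the last two offsets, mimicking coverage that peaks at the TSS as described for ERV1 LTRs; a depleted family would show the opposite trend.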
Figure 4
HERVH elements associate with stem cell-specific lincRNA expression. (a) HERVH is a primate-specific 9 kb endogenous retrovirus containing the group specific antigen (Gag), protease (Pro), polymerase (Pol), and envelope (Env) proteins, surrounded on both sides by transcription-promoting LTRs. (b) 127 lincRNAs (columns) contain HERVH elements and expression of these lincRNAs (measured as log2(FPKM + 0.25)) across cell types (rows) is highly specific to the pluripotent H1-hESCs and iPSCs. (c) HERVH-lincRNAs are expressed at much greater levels than lincRNAs devoid of HERVH (dHERVH-lincRNAs) in ESCs, displayed here as the cumulative distribution of FPKM + 0.25. (d) ChIP-Seq read coverage indicates that HERVH-lincRNAs are marked by the activating histone modification H3K4me3 in H1-hESCs but not GM12878 where expression is low. (e) The transcription factor SP1 was previously found to be required for HERVH transcription. Accordingly, ChIP-Seq read coverage shows SP1 occupies the TSSs of HERVH-lincRNAs in H1-hESC but not GM12878.
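Panels (b) and (c) rely on the transform log2(FPKM + 0.25) and on a cumulative distribution of expression values. The sketch below shows both steps on made-up FPKM values; only the 0.25 pseudocount comes from the caption, and the function names are illustrative.

```python
import math

PSEUDOCOUNT = 0.25  # offset stated in the caption; keeps FPKM = 0 finite after the log

def log_expr(fpkm):
    """Return log2(FPKM + 0.25)."""
    return math.log2(fpkm + PSEUDOCOUNT)

def ecdf(values):
    """Empirical cumulative distribution as (value, fraction of values <= value) pairs."""
    ordered = sorted(values)
    n = len(ordered)
    return [(v, (i + 1) / n) for i, v in enumerate(ordered)]

# Made-up FPKM values chosen so the transformed values are whole numbers.
fpkms = [0.0, 0.75, 3.75, 15.75]
logged = [log_expr(x) for x in fpkms]
curve = ecdf(logged)
```

With the pseudocount, an unexpressed transcript (FPKM of 0) maps to log2(0.25) = -2 rather than negative infinity, which sets the floor of the color scale in a heatmap like panel (b).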
Figure 5
Mouse lincRNAs share TE composition properties. (a) Similar to the human genome, in a catalog of 981 mouse lincRNAs, TEs are depleted overall relative to the genomic background frequency, but still a substantial 33% of sequence is TE-derived. (b) TEs also exhibit biased composition in mouse lincRNAs, with strong L1 depletion and ERV1 enrichment, matching observations in human.
