Genome Res. 2006 Jan;16(1):20-9.
doi: 10.1101/gr.4139206. Epub 2005 Dec 12.

Organization of the Caenorhabditis elegans small non-coding transcriptome: genomic features, biogenesis, and expression

Wei Deng et al. Genome Res. 2006 Jan.

Abstract

Recent evidence points to considerable transcription occurring in non-protein-coding regions of eukaryote genomes. However, the lack of conservation and of demonstrated function for these transcripts has created controversy over whether they are functional. Applying a novel cloning strategy, we have cloned 100 novel and 61 known or predicted Caenorhabditis elegans full-length ncRNAs. Studies of the genomic environment and transcriptional characteristics have shown that two-thirds of all ncRNAs, including many intronic snoRNAs, are independently transcribed under the control of ncRNA-specific upstream promoter elements. Furthermore, the transcription levels of at least 60% of the ncRNAs vary with developmental stage. We identified two new classes of ncRNAs, stem-bulge RNAs (sbRNAs) and snRNA-like RNAs (snlRNAs), both featuring distinct internal motifs, secondary structures, upstream elements, and high and developmentally variable expression. Most of the novel ncRNAs are conserved in Caenorhabditis briggsae, but only one homolog was found outside the nematodes. Preliminary estimates indicate that the C. elegans transcriptome contains approximately 2700 small non-coding RNAs, potentially acting as regulatory elements in nematode development.

Figures

Figure 1.
Clonal and functional distributions. (A) Distribution of sequenced library clones among different RNA species and categories. The E. coli RNAs are contaminants from food bacteria ingested by C. elegans. (tRNAscan) tRNAs detected by tRNAscan (Lowe and Eddy 1997); (mtRNA) mitochondrial RNA. (B) Functional distribution of all novel and known ncRNAs detected in this study. Sectors representing RNAs of undefined and novel functional classes are hatched or gridded. misc RNA includes RNase P RNA and Y RNA.
Figure 2.
The stem-bulge RNAs of C. elegans. (A) Sequential composition of the 5′-end (IM1; E = 1.2 × 10^-19) and 3′-end (IM2; E = 1.0 × 10^-20) motifs of the sbRNAs (see Supplemental material for details on the E-value). (B) Relative positions of IM1 and IM2 within each of the verified sbRNAs. (C) Predicted (Mfold) secondary structure of sbRNAs CeN73-1 and CeN76. (D) Relative expression of seven of the sbRNAs at heatshock (HS) and different developmental stages (Egg through Dauer; see complete list of abbreviations in Supplemental material).
Figure 3.
The snlRNAs of C. elegans. (A) Sequence of the IM3 motif of snlRNAs (E = 1.8 × 10^-37). (B) Internal position of IM3 in the snlRNAs. (C) Relative expression of five snlRNAs and two C. elegans cyclin mRNAs (T06E6.2 and F43D2.1; Wang and Kim 2003). (D) Comparison of the secondary structures of snlRNA CeN25-1 and snRNA U1. In addition to the Sm-binding site, both RNAs show a similar 3′-tail stem-loop structure, but the remainder of the IM3 motif is absent in U1 snRNA.
Figure 4.
Upstream motifs discovered at ncRNA loci. (A) Upstream motif 1 (UM1; E = 4.0 × 10^-521). (B) Upstream motif 2 (UM2; E = 7.3 × 10^-179). (C) Upstream motif 3 (UM3; E = 1.1 × 10^-38). (For explanation of E, see Fig. 2.) (D) Distribution of distances from motif position 1 of UM1, UM2, and UM3, respectively, to the 5′-end of the ncRNA transcripts. UM1 and UM3 have defined distances from the start of the motif to the 5′-end of the transcript. The two peaks for UM1 represent distances for loci with (smaller peak) and without (larger peak) an additional TATA-box. The distances between UM2 and transcript 5′-ends are more variable, possibly indicative of post-transcriptional 5′-end processing for this group of ncRNAs.
Figure 5.
Arrangements of transcriptional elements and genomic locations of small ncRNA loci, as inferred from genomic and experimental data. (A) TATA-less loci with UM1. This type of locus is characterized by the Upstream Motif 1 and is found both intergenically and intronically. Transcripts from TATA-less UM1 loci generally carry a 5′-end cap, are most likely transcribed by RNA polymerase II, and make up biogenesis group I-A, which comprises most spliceosomal snRNAs, a fraction of the SL RNAs, most snlRNAs, and a few C/D snoRNAs along with some unclassified transcripts. (B) Loci with UM1 and a TATA-box. This type of locus combines the UM1 with a TATA-box, and most often a tract of four or more Ts is found within 10 bp of the transcript 3′-terminus. Known RNA polymerase III transcripts like U6 snRNA and RNase P RNA are found at this type of locus. The transcripts may have a single methyl group added at the γ-phosphate post-transcriptionally, as is commonly found in U6 and 7SK snRNAs (Gupta et al. 1990). (C) Loci with UM2. This type of locus comprises a number of both intergenic and intronic snoRNA-like transcripts, along with a few uncharacterized ncRNAs, and makes up biogenesis group II. Transcripts are generally uncapped, and an oligo-T tract is found close to the 3′-terminus, indicating transcription by RNA polymerase III. FB (Front Box) and TB (Tail Box) are the most conserved 15-bp motifs within the 100-bp upstream sequence of these loci, and show strong resemblance to Box A and Box B of the tRNA promoter. A “possible tRNA transcription” initiation site has been indicated to account for the possibility that UM2 is transcribed as a part of the primary transcript (see Supplemental material for details). (D) Loci with UM3. This type of locus has only been found in sbRNAs, and is characterized by UM3, which contains a TATA-box preceded by a strongly conserved G residue. The loci are terminated by an oligo-T tract, and most transcripts are uncapped, suggesting transcription by RNA polymerase III. (E) SRP RNA loci. The C. elegans SRP RNA loci are characterized by a rudimentary TATA-box and a Box A element at ∼10-20 bp downstream of the transcription start, and are terminated by an oligo-T tract. (F) Independently transcribed intronic loci. This type of locus represents subgroups of locus types A-E, in which both the transcribed sequence and the corresponding control elements (promoter, terminator) are found within the intron of a protein-coding gene. This type of locus is found for all the above promoter elements, but is most common for UM1 and UM2 type loci. (G) Motif-less intronic loci. These loci are exclusively made up of snoRNA-like genes, and are often found within an intron of a ribosomal gene. The distance between the ncRNA locus and the preceding exon is generally short (<50 bp) and AT-rich. Transcription is initiated from the host gene promoter, and the snoRNA is processed either directly from the pre-mRNA, or from a spliced intron lariat.
