Genome Biol Evol. 2012;4(4):427-42.
doi: 10.1093/gbe/evs020. Epub 2012 Mar 8.

Identification and properties of 1,119 candidate lincRNA loci in the Drosophila melanogaster genome

Robert S Young et al. Genome Biol Evol. 2012.

Abstract

The functional repertoire of long intergenic noncoding RNA (lincRNA) molecules has begun to be elucidated in mammals. Determining the biological relevance and potential gene regulatory mechanisms of these enigmatic molecules would be expedited in a more tractable model organism, such as Drosophila melanogaster. To this end, we defined a set of 1,119 putative lincRNA genes in D. melanogaster using modENCODE whole transcriptome (RNA-seq) data. A large majority (1.1 of 1.3 Mb; 85%) of these bases were not previously reported by modENCODE as being transcribed. Significant selective constraint on the sequences of these loci predicts that virtually all have sustained functionality across the Drosophila clade. We observe biases in lincRNA genomic locations and expression profiles that are consistent with some of these lincRNAs being involved in the regulation of neighboring protein-coding genes with developmental functions. We identify lincRNAs that may be important in the developing nervous system and in male-specific organs, such as the testes. LincRNA loci were also identified whose positions, relative to nearby protein-coding loci, are equivalent between D. melanogaster and mouse. This study predicts that the genomes of not only vertebrates, such as mammals, but also an invertebrate (fruit fly) harbor large numbers of lincRNA loci. Our findings now permit exploitation of Drosophila genetics for the investigation of lincRNA mechanisms, including lincRNAs with potential functional analogues in mammals.

Figures

FIG. 1.
(A) Definition of genomically adjacent protein-coding gene model (FB.4119) and a novel putative lincRNA locus (lincRNA.626). The black boxes denote exons called by Cufflinks for this tissue, with arrowed lines representing introns separating exons within the same transcript. A histogram of read counts that support these models’ sequences is shown below (from embryonic tissues, 4–6 h after egg laying). Note that only Cufflinks transcripts >200 bp are displayed. At the foot of this UCSC genome browser snapshot (Kent et al. 2002) is the FlyBase annotation corresponding to FB.4119, supporting messenger RNAs and ESTs, and a PhastCons track showing genome sequence conservation across multiple arthropods. (B) Venn diagram showing strong overlap between modENCODE (Graveley et al. 2011) and gene model exons and a low degree (13%) of overlap between the lincRNA exons defined in this study and modENCODE exons. (C) Concordance of qRT–PCR data with stage-matched log2(FPKM) expression values from RNA-seq analysis for lincRNA.626. Mean log2(FPKM) values are calculated and plotted for qRT–PCR experiments which cover more than one modENCODE developmental time point. Error bars represent 95% confidence intervals for qRT–PCR.
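Panel C compares each qRT-PCR measurement with the mean log2(FPKM) over the modENCODE time points that the experiment spans. A minimal sketch of that averaging step follows, under stated assumptions: the FPKM values are hypothetical, the function name is illustrative, and the optional pseudocount is an assumption, since the caption does not say how zero FPKM values were handled.

```python
import math


def mean_log2_fpkm(fpkm_values, pseudocount=0.0):
    """Average log2(FPKM) over the time points spanned by one qRT-PCR experiment.

    `pseudocount` is a hypothetical parameter (not described in the caption);
    with the default of 0.0, a zero FPKM value raises ValueError in math.log2.
    """
    if not fpkm_values:
        raise ValueError("at least one FPKM value is required")
    logs = [math.log2(value + pseudocount) for value in fpkm_values]
    return sum(logs) / len(logs)


# Hypothetical FPKM values for two adjacent time points covered by one experiment.
window = [2.0, 8.0]
print(mean_log2_fpkm(window))  # log2(2) = 1 and log2(8) = 3, so the mean is 2.0
```

Averaging after the log transform, as done here, follows the caption's wording ("Mean log2(FPKM) values"); averaging raw FPKM values first would give a different number.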
FIG. 2.
Evidence for substantial purifying selection acting on putative lincRNA sequences. (A) Cumulative frequency distributions of exonic nucleotide substitution rates when aligned between Drosophila melanogaster and D. yakuba: Substitution rates of gene models are indicated in blue, and those for lincRNA loci are in red. The black line plots the cumulative substitution rates for untranscribed intergenic regions. The dashed line indicates the 50th percentile. (B) Cumulative frequency distributions of exonic nucleotide substitution rates when aligned between D. melanogaster and D. yakuba for lincRNA loci identified by modENCODE (red), novel lincRNAs with a maximum FPKM ≥1 (black), and novel lincRNAs with a maximum FPKM <1. (C) Enrichments or deficits of conserved sequence (indel-purified segments, in red, and MCS, in blue) within exonic sequences from gene models and lincRNA loci, and intergenic space, relative to genome-wide random expectations (***P < 0.001). Numbers of gene models and lincRNA loci overlapping each conserved sequence type are displayed in brackets.
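Panels A and B are read by comparing where each cumulative curve crosses the dashed 50th-percentile line: a curve that crosses at a lower substitution rate indicates slower-evolving sequence. A minimal sketch of that comparison follows; the substitution rates are hypothetical and the function names are illustrative, not the authors' code.

```python
def cumulative_frequency(rates):
    """Pair each sorted rate with the fraction of loci at or below it."""
    ordered = sorted(rates)
    n = len(ordered)
    return [(rate, (i + 1) / n) for i, rate in enumerate(ordered)]


def rate_at_fraction(rates, fraction=0.5):
    """Smallest rate whose cumulative frequency reaches `fraction`."""
    for rate, cumulative in cumulative_frequency(rates):
        if cumulative >= fraction:
            return rate
    raise ValueError("rates must be non-empty")


# Hypothetical per-locus substitution rates (not values from the study).
lincrna_rates = [0.08, 0.05, 0.12, 0.10]
intergenic_rates = [0.11, 0.15, 0.09, 0.13]

print(rate_at_fraction(lincrna_rates))     # 0.08
print(rate_at_fraction(intergenic_rates))  # 0.11
```

In this toy input the lincRNA curve reaches the 50th percentile at a lower rate than the intergenic curve, which is the pattern the caption describes as evidence of purifying selection.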
FIG. 3.
(A) Expression levels of gene models and putative lincRNA loci across 30 developmental time points. Summed log2(FPKM) values for each time point are plotted for gene models (left vertical axis) and lincRNA loci (right axis). (B) Box and whiskers plots of log2(substitution rates) for gene models (left) and lincRNA loci (right) for increasing breadth of expression across one or more of four developmental stages (linear regression, ***P < 0.001). Red lines indicate log2(mean substitution rate) for the genes examined here. Blue lines indicate the log2(mean substitution rate) for presumed neutrally evolving short introns. Note that only genes and lincRNAs that are expressed at greater than 1 FPKM in at least one developmental stage are graphed here. (C) Enrichments or deficits of different chromatin types within gene models, lincRNA loci, and untranscribed intergenic sequence relative to genome-wide random expectations (*P < 0.05, **P < 0.01, and ***P < 0.001). Numbers of gene models and lincRNA loci overlapping each chromatin type are displayed in brackets. Repressive (“Black”) chromatin is depleted approximately 8% for both lincRNAs and gene models and modestly (0.6%) enriched in intergenic regions. (D) GO terms with associated protein-coding gene territories, which contain a significantly greater than expected density of lincRNA loci using a genome-wide association test (P < 0.01, FDR < 0.6). The top two terms are “cellular component” terms, whereas “serine-type endopeptidase activity” is a “molecular function” term and remaining terms are drawn from the “biological process” ontology.
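Panel B groups loci by breadth of expression, that is, the number of the four developmental stages in which a locus is expressed, after keeping only loci above 1 FPKM in at least one stage. A minimal sketch of that grouping follows. The stage labels and FPKM values are hypothetical, and using the same 1 FPKM cutoff to decide whether a locus counts as expressed in a given stage is an assumption; the caption states the cutoff only for inclusion.

```python
def expression_breadth(stage_fpkm, threshold=1.0):
    """Count the stages in which FPKM exceeds `threshold`."""
    return sum(1 for value in stage_fpkm.values() if value > threshold)


def group_by_breadth(loci, threshold=1.0):
    """Map breadth (1..number of stages) to locus names, dropping breadth-0 loci."""
    groups = {}
    for name, stage_fpkm in loci.items():
        breadth = expression_breadth(stage_fpkm, threshold)
        if breadth >= 1:  # inclusion filter: above threshold in at least one stage
            groups.setdefault(breadth, []).append(name)
    return groups


# Hypothetical loci and stage labels (the caption does not name the four stages).
loci = {
    "locus_a": {"stage_1": 0.2, "stage_2": 3.5, "stage_3": 0.0, "stage_4": 0.4},
    "locus_b": {"stage_1": 2.1, "stage_2": 4.0, "stage_3": 1.5, "stage_4": 6.3},
    "locus_c": {"stage_1": 0.1, "stage_2": 0.3, "stage_3": 0.0, "stage_4": 0.9},
}

print(group_by_breadth(loci))  # {1: ['locus_a'], 4: ['locus_b']}
```

Here `locus_c` never exceeds the cutoff and is excluded, mirroring the caption's note that only loci expressed above 1 FPKM in at least one stage are graphed.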
FIG. 4.
(A) Example UCSC genome browser view of a spliced putative lincRNA locus in the vicinity of the mbl protein-coding gene which has read support for expression in one sex but not the other. The small exon at the right of the lincRNA (indicated by an arrow) is supported by messenger RNA but not EST evidence and has been annotated by Graveley et al. as a protein-coding gene (CG43108). Note that this annotation has not been added to the UCSC genome browser. (B) Cumulative distributions of the nucleotide substitution rate for gene models (left) and lincRNA loci (right) with different sex-specific expression profiles. Blue—male-specific; solid black—no sex specificity. The dashed line indicates the 50th percentile.
FIG. 5.
An example of positionally equivalent putative lincRNA loci in both Drosophila melanogaster and Mus musculus. The arrows within the protein-coding gene models and originating at the lincRNA transcriptional start sites indicate the shared orientation of transcription in both species. The boxed genomic regions indicate the orthologous protein-coding gene neighborhoods for D. melanogaster (fkh) and M. musculus (Foxa1). Note that only multiexonic transcripts are shown for the D. melanogaster gene models. The positionally equivalent lincRNA loci are indicated by the two-headed arrow.
