Review
Nat Rev Mol Cell Biol. 2021 Feb;22(2):96-118.
doi: 10.1038/s41580-020-00315-9. Epub 2020 Dec 22.

Gene regulation by long non-coding RNAs and its biological functions

Luisa Statello et al. Nat Rev Mol Cell Biol. 2021 Feb.

Erratum in

Abstract

Evidence accumulated over the past decade shows that long non-coding RNAs (lncRNAs) are widely expressed and have key roles in gene regulation. Recent studies have begun to unravel how the biogenesis of lncRNAs is distinct from that of mRNAs and is linked with their specific subcellular localizations and functions. Depending on their localization and their specific interactions with DNA, RNA and proteins, lncRNAs can modulate chromatin function, regulate the assembly and function of membraneless nuclear bodies, alter the stability and translation of cytoplasmic mRNAs and interfere with signalling pathways. Many of these functions ultimately affect gene expression in diverse biological and physiopathological contexts, such as in neuronal disorders, immune responses and cancer. Tissue-specific and condition-specific expression patterns suggest that lncRNAs are potential biomarkers and provide a rationale to target them clinically. In this Review, we discuss the mechanisms of lncRNA biogenesis, localization and functions in transcriptional, post-transcriptional and other modes of gene regulation, and their potential therapeutic applications.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Biogenesis and cellular fates of long non-coding RNAs.
a | Biogenesis of long non-coding RNAs (lncRNAs). Unlike mRNAs, many RNA polymerase II (Pol II)-transcribed lncRNAs are inefficiently processed and are retained in the nucleus (mechanisms of lncRNA nuclear retention are shown in parts b–e), whereas others are spliced and exported to the cytoplasm. The lncRNAs (and mRNAs) that contain one or only a few exons are exported to the cytoplasm by nuclear RNA export factor 1 (NXF1). b | Some lncRNAs are transcribed by dysregulated Pol II, remain on chromatin and, subsequently, are degraded by the nuclear exosome. c | Numerous lncRNAs with a certain U1 small nuclear RNA (U1 snRNA) binding motif can recruit the U1 small nuclear ribonucleoprotein (U1 snRNP) and through it associate with Pol II at various loci. d | In many lncRNAs, the sequence between the 3′ splice site and the branch point is longer and contains a shorter polypyrimidine tract (PPT) than in mRNAs, which results in inefficient splicing. e | Sequence motifs in cis and factors in trans coordinately contribute to nuclear localization of lncRNAs. A nuclear retention element (NRE) U1 snRNA-binding site and C-rich motifs can recruit U1 snRNP and heterogeneous nuclear ribonucleoprotein K (hnRNPK), respectively, to enhance lncRNA nuclear localization. Other, differentially expressed RNA-binding proteins (RBPs), such as peptidylprolyl isomerase E (PPIE), inhibit splicing of groups of lncRNAs, resulting in their nuclear retention. f | In the cytoplasm, lncRNAs usually interact with diverse RBPs. g | Many lncRNAs in the cytoplasm are associated with ribosomes through ‘pseudo’ 5′ untranslated regions (UTRs); ribosome-associated lncRNAs tend to have short half-lives owing to unknown mechanisms. h | Several lncRNAs are sorted into mitochondria by unknown mechanisms. For example, the RNA component of mitochondrial RNA-processing endoribonuclease (RMRP) is recruited to mitochondria and is stabilized by binding G-rich RNA sequence-binding factor 1 (GRSF1). i | Some lncRNAs are also found in other organelles, such as exosomes, probably by forming lncRNA–RBP complexes. m7G, 7-methyl guanosine 5′ cap; (A)n, poly(A) 3′ tail.
Fig. 2. Chromatin regulation mediated by long non-coding RNAs.
a | Long non-coding RNAs (lncRNAs) can interact with chromatin modifiers and recruit them to target-gene promoters in order to activate or suppress their transcription in cis or in trans at distant, often multiple, loci. For example, HOXA transcript at the distal tip (HOTTIP) acts in cis at the 5′ genes of the HOXA gene cluster, with which it interacts through chromatin looping. HOTTIP interacts with WD repeat-containing protein 5 (WDR5), thereby targeting the complex WDR5–myeloid/lymphoid or mixed-lineage leukaemia (MLL) to the promoters of the HOXA genes and promoting histone H3 Lys4 trimethylation (H3K4me3). b | lncRNAs can act as decoys of specific chromatin modifiers by sequestering them from the promoters of target genes. For example, p53-regulated and embryonic stem cell-specific lncRNA (lncPRESS1) supports the pluripotency of human embryonic stem cells by sequestering the histone deacetylase sirtuin 6 (SIRT6) from the promoters of numerous pluripotency genes. In this manner, lncPRESS1 maintains the active-gene modifications histone H3 acetylated at Lys56 (H3K56ac) and at Lys9 (H3K9ac) at its target genes, thereby preventing the switch to activation of differentiation genes. During p53-mediated differentiation or following depletion of lncPRESS1, SIRT6 localizes to the chromatin and removes these acetylation marks, which abolishes pluripotency. c | lncRNAs can interact with DNA and co-transcriptionally form RNA–DNA hybrids such as R-loops, which are recognized by chromatin modifiers that activate or inhibit target-gene transcription, or by transcription factors. The lncRNA TCF21 antisense RNA inducing demethylation (TARID) forms an R-loop upstream of the promoter of its target gene transcription factor 21 (TCF21). The R-loop is recognized by growth arrest and DNA damage inducible-α (GADD45A), which drives the demethylation of the TCF21 promoter DNA by interacting with thymine–DNA glycosylase (TDG) and ten–eleven translocation 1 (TET1). R-loops can also form in trans, with similar possible outcomes. For example, auxin-regulated promoter loop (APOLO) is responsible for the activation of auxin-responsive genes in Arabidopsis thaliana. APOLO and auxin target genes are normally silenced by H3K27me3 and the presence of chromatin loops maintained by the Polycomb factor like heterochromatin protein 1 (LHP1). Following transcriptional activation of APOLO in response to auxin, the lncRNA recognizes specific motifs at the promoters of its target genes, where it binds and generates R-loops that act as decoys of LHP1, thereby allowing target-gene expression. Pol II, RNA polymerase II.
Fig. 3. Transcription regulation by long non-coding RNAs.
a | Long non-coding RNAs (lncRNAs) can inhibit gene expression in a transcript-dependent and/or in a transcription-dependent (that is, transcript-independent) manner. In mouse extra-embryonic tissues, antisense of IGF2R non-protein coding RNA (Airn) functions in trans as it is guided through a specific 3D chromosome conformation (not shown) to the promoters of two distal imprinted target genes, solute carrier family 22 member 2 (Slc22a2) and Slc22a3. Once there, Airn recruits Polycomb repressive complex 2 (PRC2), which catalyses histone H3 Lys27 trimethylation (H3K27me3) and gene silencing. Airn also functions in cis, on its overlapping protein-coding gene insulin-like growth factor 2 receptor (Igf2r). Airn transcription causes steric hindrance for RNA polymerase II (Pol II) at the transcription start site of Igf2r, which is followed by promoter methylation (not shown) and Igf2r silencing. b | lncRNAs and enhancer RNAs (eRNAs) can promote the expression of protein-coding genes (PCGs) that are in close proximity to their enhancers through preformed chromatin loops (for example, the eRNA P53BER (p53-bound enhancer region) and the enhancer-associated lncRNA (elncRNA) SWINGN (SWI/SNF interacting GAS6 enhancer non-coding RNA)), thereby allowing recruitment of chromatin-activating complexes to the promoters of the PCGs. c | An important feature of some eRNAs and elncRNAs is their ability to regulate distant genes by directly promoting chromatin looping through the recruitment of looping factors. For example, following oestrogen receptor (ER) transcription activation, the NRIP1 enhancer (eNRIP) is bi-directionally transcribed into an eRNA, which recruits cohesin to form short-range (solid line) and long-range (dashed line) chromatin loops, thereby promoting contact between the NRIP1 enhancer and the promoters of NRIP1 and trefoil factor 1 (TFF1), two of the several genes activated in response to ER activation. d | lncRNAs can activate gene expression in a transcript-independent manner. Transcription of Bend4-regulating effects not dependent on the RNA (Bendr) is sufficient to activate enhancer elements (e) embedded in its locus, which promotes the formation of an active chromatin state (marked by H3K4me3) at the promoter of the proximal gene BEN domain containing protein 4 (Bend4). e | Example of a complex regulatory unit formed by the lncRNAs Upperhand (Uph) and Handsdown (Hdn) in regulating the PCG heart and neural crest derivatives expressed 2 (Hand2). An enhancer embedded in Uph activates the transcription of the proximal Hand2 gene when the lncRNA gene is transcribed, without requiring chromatin reorganization. By contrast, chromatin looping is necessary for Hdn function, as it puts its promoter in spatial proximity with Hand2-activating enhancers. When Hdn transcription is activated, the Hand2 enhancers become unavailable for Hand2 promoter activation, thereby inhibiting its expression. Removal of Hdn or reduction of its transcription leads to increased expression of Hand2. CTCF, CCCTC-binding factor; NRIP1, nuclear receptor interacting protein 1; TF, transcription factor.
Fig. 4. Roles of long non-coding RNAs in nuclear organization.
a | The long non-coding RNA (lncRNA) nuclear paraspeckle assembly transcript 1 (NEAT1) is essential for the formation of paraspeckles. NEAT1 sequesters numerous paraspeckle proteins to form a highly organized core–shell (dark and light purple, respectively) spheroidal nuclear body. The middle region of NEAT1 is localized in the centre of paraspeckles and the 3′ and 5′ regions are localized in the periphery. Different paraspeckle proteins are embedded by NEAT1 into the spheroidal structure in the core region (non-POU domain containing octamer binding (NONO), fused in sarcoma (FUS) and splicing factor, proline- and glutamine-rich (SFPQ)) or the shell region (RNA binding motif protein 14 (RBM14)). b | The lncRNA metastasis-associated lung adenocarcinoma transcript 1 (MALAT1) is localized at the periphery of nuclear speckles, and is involved in the regulation of pre-mRNA splicing. At the periphery, MALAT1 interacts with the U1 small nuclear RNA (U1 snRNA), whereas proteins such as SON DNA and RNA binding protein (SON) and splicing component, 35 kDa (SC35) are localized at the centre of nuclear speckles. c | 5′ small nucleolar RNA-capped and 3′-polyadenylated lncRNAs (SPAs) and small nucleolar RNA-related lncRNAs (sno-lncRNAs) accumulate at their transcription sites and interact with several splicing factors such as RNA binding protein fox-1 homologue 2 (RBFOX2), TAR DNA-binding protein 43 (TDP43) and heterogeneous nuclear ribonucleoprotein M (hnRNPM) to form a microscopically visible nuclear body that is involved in the regulation of alternative splicing. d | The perinucleolar compartment contains the lncRNA pyrimidine-rich non-coding transcript (PNCTR), which sequesters pyrimidine tract-binding protein 1 (PTBP1) and, thus, suppresses PTBP1-mediated pre-mRNA splicing elsewhere in the nucleoplasm. e | Functional intergenic RNA repeat element (Firre) is transcribed from the mouse X chromosome and interacts with the nuclear matrix factor hnRNPU to tether chromosomes (Chr) X, 2, 9, 15 and 17 into a nuclear domain. The size of each type of nuclear body is indicated in parts a–d.
Fig. 5. Post-transcriptional functions of trans-acting long non-coding RNAs.
A | trans-Acting long non-coding RNAs (lncRNAs) interact with RNA-binding proteins (RBPs) through sequence motifs or by forming unique structural motifs. Aa | Pyrimidine-rich non-coding transcript (PNCTR) sequesters pyrimidine tract-binding protein 1 (PTBP1) to the perinucleolar compartment (PNC) and, thus, suppresses PTBP1-mediated mRNA splicing elsewhere in the nucleoplasm. Ab | In the cytosol, non-coding RNA activated by DNA damage (NORAD) sequesters Pumilio (PUM) RBPs, which repress the stability and translation of mRNAs to which they bind. Ac | Human FOXD3 antisense transcript 1 (FAST) forms several structural modules that bind the E3 ligase β-transducin repeats-containing protein (β-TrCP), thereby blocking the degradation of its substrate β-catenin (β-cat), leading to activation of WNT signalling in human embryonic stem cells. B | trans-Acting lncRNAs directly interact with RNAs through base pairing. Ba | Terminal differentiation-induced ncRNA (TINCR) or half-STAU1-binding site RNAs (1/2-sbsRNAs) promote or suppress mRNA stability, respectively, by forming intermolecular duplexes that bind Staufen homologue 1 (STAU1), the key protein of Staufen-mediated mRNA decay. Bb | The SINEB2 repeat of mouse antisense to ubiquitin carboxy-terminal hydrolase L1 (AS-Uchl1) binds the Uchl1 mRNA through sequence complementarity and promotes its association with polysomes and its translation. C | Some abundant lncRNAs affect gene expression by functioning as competitive endogenous RNAs (ceRNAs). For example, lncRNA-PNUTS is generated by alternative splicing of the PNUTS pre-mRNA by heterogeneous nuclear ribonucleoprotein E1 (hnRNPE1). lncRNA-PNUTS contains seven miR-205 binding sites, which reduce the availability of miR-205 to bind and suppress the zinc finger E-box-binding homeobox 1 (ZEB1) and ZEB2 mRNAs. GSK3, glycogen synthase kinase 3; miR, microRNA; P, phosphate group; PNUTS, phosphatase 1 nuclear targeting subunit; PRE, Pumilio response element.
Fig. 6. The involvement of long non-coding RNAs in cancer.
a | Long non-coding RNAs (lncRNAs) located in the same (human or mouse) genomic region as the cyclin-dependent kinase inhibitor 1A (CDKN1A) gene are direct targets and effectors of p53 following DNA damage. Long intergenic non-coding RNA p21 (lincRNA-p21) functions in trans to recruit the transcription repressor heterogeneous nuclear ribonucleoprotein K (hnRNPK) to the promoter of target genes in response to p53 activation, or in cis, where it promotes activation of Cdkn1a in two possible ways: lincRNA-p21 can recruit hnRNPK to the promoter of Cdkn1a from its site of transcription; and another in vivo study has revealed the presence of multiple enhancers (green rectangles) in the lincRNA-p21 locus, which are responsible for transcript-independent regulation in cis of Cdkn1a. p21-associated ncRNA DNA damage-activated (PANDA) functions as a decoy for nuclear transcription factor Y subunit-α (NF-YA), thereby removing it from the promoters of its target genes and reducing apoptosis and cell senescence in a p53-dependent fashion. Damage-induced non-coding (DINO) interacts with p53 in the nucleus and promotes p53 tetramer stabilization (consequently reinforcing p53 signalling). Furthermore, DINO co-localizes with p53 at the promoters of several of its target genes, including CDKN1A. b | GUARDIN (also known as long non-coding transcriptional activator of miR34a) is activated by p53 following DNA damage and contributes to genome integrity through two separate activities. Part of the GUARDIN pool is exported to the cytoplasm, where it acts as a sponge of miR-23a, thus preventing the destabilization of its main mRNA target, telomeric repeat-binding factor 2 (TRF2), which encodes a factor involved in telomere capping and stability. In the nucleus, GUARDIN functions as a scaffold that enables the interaction of breast cancer type 1 susceptibility protein homologue (BRCA1) and BRCA1 associated RING domain 1 (BARD1), which is important for the recruitment of DNA double-strand break (DSB) repair machinery. c | MYC oncogene expression is tightly regulated by numerous non-coding RNAs, and relies on the function of several enhancers (green box labelled ‘e’) in the MYC genomic region. Among them, the super-enhancer lncRNA colon cancer associated transcript 1-long (CCAT1-L) promotes chromatin interactions between MYC enhancers and promoters through recruiting the DNA-binding protein CCCTC-binding factor (CTCF), thereby activating MYC expression. Furthermore, the 5′ end of CCAT1-L interacts with hnRNPK, and both interact with the MYC promoter and with the lncRNA plasmacytoma variant translocation 1 (PVT1) to coordinate their expression. PVT1 competes with the MYC promoter for the availability of enhancers; thus, when PVT1 is expressed, MYC levels are kept low. In the presence of PVT1-inactivating somatic mutations, which are frequent in some cancers, or when PVT1 expression is experimentally repressed using CRISPR interference (CRISPRi), MYC expression is favoured. HR, homologous recombination; miR, microRNA; NHEJ, non-homologous DNA end joining.
