BMC Genomics. 2008 Feb 5:9:67.
doi: 10.1186/1471-2164-9-67.

All and only CpG containing sequences are enriched in promoters abundantly bound by RNA polymerase II in multiple tissues

Julian M Rozenberg et al. BMC Genomics. 2008.

Abstract

Background: The promoters of housekeeping genes are well-bound by RNA polymerase II (RNAP) in different tissues. Although the promoters of these genes are known to contain CpG islands, the specific DNA sequences that are associated with high RNAP binding to housekeeping promoters have not been described.

Results: ChIP-chip experiments from three mouse tissues (liver, heart ventricles, and primary keratinocytes) indicate that 94% of promoters have similar RNAP binding, ranging from well-bound to poorly-bound in all tissues. Using all 8-base pair long sequences as a test set, we have identified the DNA sequences that are enriched in promoters of housekeeping genes, focusing on those DNA sequences which are preferentially localized in the proximal promoter. We observe a bimodal distribution. Virtually all sequences enriched in promoters with high RNAP binding values contain a CpG dinucleotide. These results suggest that only transcription factor binding sites (TFBS) that contain the CpG dinucleotide are involved in RNAP binding to housekeeping promoters, while TFBS that do not contain a CpG are involved in regulated promoter activity. Abundant 8-mers that are preferentially localized in the proximal promoters and exhibit the best enrichment in RNAP-bound promoters are all variants of six known CpG-containing TFBS: ETS, NRF-1, BoxA, SP1, CRE, and E-Box. The frequency of these six DNA motifs can predict housekeeping promoters as accurately as the presence of a CpG island, suggesting that they are the structural elements critical for CpG island function. Experimental EMSA results demonstrate that methylation of the CpG in the ETS, NRF-1, and SP1 motifs prevents DNA binding in nuclear extracts in both keratinocytes and liver.

Conclusion: In general, TFBS that do not contain a CpG are involved in regulated gene expression, while TFBS that contain a CpG are involved in constitutive gene expression, with some CpG-containing sequences also involved in inducible and tissue-specific gene regulation. These TFBS are not bound when the CpG is methylated. Unmethylated CpG dinucleotides in the TFBS in CpG islands allow the transcription factors to find their binding sites, which occur only in promoters, in turn localizing RNAP to promoters.
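The Results use all 8-base pair sequences as a test set and separate those that contain a CpG dinucleotide from those that do not. As a minimal sketch of that bookkeeping only (not the authors' pipeline), the following enumerates every 8-mer and splits the set by CpG content. The function names `all_kmers` and `contains_cpg` are hypothetical and introduced here for illustration.

```python
from itertools import product

BASES = "ACGT"

def all_kmers(k=8):
    """Return every DNA k-mer over A, C, G, T (4**k strings)."""
    return ["".join(p) for p in product(BASES, repeat=k)]

def contains_cpg(kmer):
    """True if the k-mer has a C immediately followed by a G."""
    return "CG" in kmer

# Hypothetical partition of the test set by CpG content.
kmers = all_kmers(8)
with_cpg = [k for k in kmers if contains_cpg(k)]
without_cpg = [k for k in kmers if not contains_cpg(k)]

print(len(kmers), len(with_cpg), len(without_cpg))
```

Because CG is its own reverse complement, an 8-mer contains a CpG on one strand exactly when its reverse complement does, so this split does not depend on which strand is read.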

Figures

Figure 1
A-C) RNAP binding to 14,790 promoters from ChIP-chip data in different mouse tissues with each spot representing a single promoter. A) keratinocytes versus heart ventricles (R = +0.76). B) keratinocytes versus liver (R = +0.73). C) heart ventricles versus liver (R = +0.76). D-F) RNAP binding to the 13,861 promoters with similar RNAP binding values in heart, liver and keratinocytes.
Figure 2
8-mer-association-with-RNAP for abundant 8-mers calculated for 13,861 common promoters between -1,000 bp and +500 bp. 8-mers that contain a CpG are noted in black.
Figure 3
A) Binding of RNAP vs. H3K9me2 (R = -0.50) in mouse tissue culture keratinocytes. B) 8-mer-association-with-H3K9me2 for 12,208 abundant 8-mers, calculated for 14,790 promoters between -1,000 bp and +500 bp; CpG-containing 8-mers are noted in black. C-E) 8-mer-association-with-RNAP vs. 8-mer-association-with-H3K9me2. C) All 8-mers. The association-with-RNAP and the association-with-H3K9me2 for the core promoter elements at their unique position in promoters are presented for TATA (TATAWAAR), INR (YYANWYY) and DPE (RGWYV). D) 8-mers without a CpG. E) 8-mers with a CpG.
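The core promoter elements in this caption are written with IUPAC degenerate nucleotide codes (for example W = A or T, R = A or G, Y = C or T, V = A, C or G, N = any base). A minimal sketch of how such consensus strings can be expanded into regular expressions and located in a sequence follows. The helper names `iupac_to_regex` and `find_motif` are hypothetical, and this is not presented as the method used in the paper.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def iupac_to_regex(motif):
    """Translate an IUPAC consensus such as TATAWAAR into a regex string."""
    return "".join(IUPAC[base] for base in motif.upper())

def find_motif(motif, seq):
    """Return 0-based start positions of all (possibly overlapping) matches."""
    # A lookahead lets overlapping occurrences be reported as well.
    pattern = re.compile("(?=" + iupac_to_regex(motif) + ")")
    return [m.start() for m in pattern.finditer(seq.upper())]

# Made-up example sequence, for illustration only.
print(find_motif("TATAWAAR", "GGTATAAAAGCC"))
```

This scans only the strand given; scanning the reverse complement as well would be a separate design choice.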
Figure 4
A) RNAP binding to promoters vs. mRNA expression for 4,522 promoters with common identifiers. B) 8-mer-association-with-RNAP vs. 8-mer-association-with-mRNA-expression for abundant 8-mers calculated using the 4,522 promoters graphed in (A). CpG-containing 8-mers are noted in black.
Figure 5
8-mer-association-with-RNAP vs. 8-mer enrichment in 356 liver-specific promoters for abundant 8-mers. Highlighted 8-mers contain TATA sequences (STable 1 in Additional file 4) and the liver-specific HNF4 binding sites (8-mers containing TGACCT). The CpG-containing 8-mers are plotted in black.
Figure 6
A) A measure of non-random distribution termed a Clustering Factor (CF) is plotted in the most populated bin for 8-mers with at least 20 members in the most populated 20 bp bin (abundant 8-mers). Note the dots between -100 bp and the TSS with large CF values representing 8-mers that are more abundant near the TSS than elsewhere. B) A probability term P for the 8-mers in (A). A P value of 24 means that the distribution of the 8-mer has a less than 10^-24 chance of being random. C) Non-random distribution of 8-mers (Clustering Factor) vs. 8-mer-association-with-RNAP for abundant 8-mers.
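The caption's two concrete rules, the "abundant" threshold (at least 20 members in the most populated 20 bp bin) and the reading of P as a negative base-10 logarithm of a probability, can be sketched as follows. The caption does not give the Clustering Factor formula, so it is deliberately not implemented here. The function names are hypothetical, and treating the TSS as position 0 is an assumption.

```python
import math
from collections import Counter

BIN_WIDTH = 20      # bp per bin, as stated in the caption
MIN_MEMBERS = 20    # "abundant" threshold, as stated in the caption

def bin_start(position, width=BIN_WIDTH):
    """Left edge of the bin holding a position (assumed: TSS at 0)."""
    return (position // width) * width

def most_populated_bin(positions, width=BIN_WIDTH):
    """Return (bin_start, count) for the fullest bin, or None if empty."""
    if not positions:
        return None
    counts = Counter(bin_start(p, width) for p in positions)
    return counts.most_common(1)[0]

def is_abundant(positions, width=BIN_WIDTH, min_members=MIN_MEMBERS):
    """True if the fullest bin holds at least min_members occurrences."""
    best = most_populated_bin(positions, width)
    return best is not None and best[1] >= min_members

def p_term(probability):
    """P such that probability = 10**(-P), e.g. 1e-24 gives P of about 24."""
    return -math.log10(probability)

# Made-up occurrence positions of one 8-mer relative to the TSS.
print(most_populated_bin([-95, -90, -85, 10]))
```

Floor division keeps negative (upstream) positions in the correct bin, so -95 falls in the bin starting at -100 rather than -80.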
Figure 7
A) Fraction of promoters that contain particular sequences between -200 bp and TSS: 1) CpG island, 2) two or more of six CpG-containing motifs (SP1: CCCGCC, CCGCCC, CGCCCC; ETS: CCGGAA, GCGGAA; NRF-1: CGCATGCG, CGCGTGCG, CGCCTGCG; BoxA: TCTCGCG, CTCGCGA; CRE: ACGTCA; E-Box: CACGTG), 3) three or more of the six motifs. B) Fraction of promoters that contain particular motifs: top 20% of common RNAP promoters (Const), liver specific (LS), heart ventricle specific (HS), and keratinocyte specific (KS) promoters. Average RNAP binding for each class is presented.
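The motif lists in this caption are explicit enough to sketch the "two or more" and "three or more of the six motifs" checks. The sketch below makes two assumptions that the caption does not settle: it counts distinct motif families rather than individual occurrences, and it scans both strands by default. The input is assumed to be the -200 bp to TSS region already extracted as a string, and the function names are hypothetical.

```python
# Motif sequences copied from the Figure 7 caption.
MOTIFS = {
    "SP1": ["CCCGCC", "CCGCCC", "CGCCCC"],
    "ETS": ["CCGGAA", "GCGGAA"],
    "NRF-1": ["CGCATGCG", "CGCGTGCG", "CGCCTGCG"],
    "BoxA": ["TCTCGCG", "CTCGCGA"],
    "CRE": ["ACGTCA"],
    "E-Box": ["CACGTG"],
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]

def families_present(region, both_strands=True):
    """Set of motif families with at least one exact match in the region."""
    region = region.upper()
    strands = [region]
    if both_strands:  # assumption: a site on either strand counts
        strands.append(reverse_complement(region))
    return {
        family
        for family, variants in MOTIFS.items()
        if any(v in strand for v in variants for strand in strands)
    }

def has_at_least(region, n, both_strands=True):
    """True if at least n distinct motif families occur in the region."""
    return len(families_present(region, both_strands)) >= n

# Made-up region containing one SP1 and one ETS variant.
example = "TTCCCGCCTTCCGGAATT"
print(sorted(families_present(example)), has_at_least(example, 2))
```

Counting occurrences instead of families, or restricting the scan to the sense strand, would change which promoters pass the thresholds, so either choice should be checked against the full text of the paper.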
Figure 8
EMSA using keratinocyte and liver nuclear extracts and pure HU protein with 28 bp double-stranded oligonucleotides containing on the sense strand a canonical SP1 (GGGGCGGG), ETS (CCGGAA), and NRF-1 (GCGVTGCG) site where the cytosine in the CpG is unmethylated (-/-), hemi-methylated (-/+), hemi-methylated (+/-), or methylated (+/+).
