Bioinformatics. 2010 Nov 15;26(22):2826-32.
doi: 10.1093/bioinformatics/btq546. Epub 2010 Sep 23.

Identification of context-dependent motifs by contrasting ChIP binding data

Mike J Mason et al. Bioinformatics.

Abstract

Motivation: DNA binding proteins play crucial roles in the regulation of gene expression. Transcription factors (TFs) activate or repress genes directly while other proteins influence chromatin structure for transcription. Binding sites of a TF exhibit a similar sequence pattern called a motif. However, a one-to-one map does not exist between each TF and motif. Many TFs in a protein family may recognize the same motif with subtle nucleotide differences leading to different binding affinities. Additionally, a particular TF may bind different motifs under certain conditions, for example in the presence of different co-regulators. The availability of genome-wide binding data of multiple collaborative TFs makes it possible to detect such context-dependent motifs.

Results: We developed a contrast motif finder (CMF) for the de novo identification of motifs that are differentially enriched in two sets of sequences. Applying this method to a number of TF binding datasets from mouse embryonic stem cells, we demonstrate that CMF achieves substantially higher accuracy than several well-known motif finding methods. By contrasting sequences bound by distinct sets of TFs, CMF identified two different motifs that may be recognized by Oct4 dependent on the presence of another co-regulator and detected subtle motif signals that may be associated with potential competitive binding between Sox2 and Tcf3.

Availability: The software CMF is freely available for academic use at www.stat.ucla.edu/~zhou/CMF.

Figures

Fig. 1.
Motivational example for CMF. (A) The lowess smoothed distributions of log LR(s) in Chen Oct4 bound sequences and control sequences scanned by the Oct4 PWM. Each distribution was normalized by the median and the SD of the control dataset. The vertical dashed line indicates τ = 100. (B) A zoomed-in view of the right tails of the distributions.
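The scoring described in this caption can be illustrated with a minimal sketch. It assumes that log LR(s) is the log likelihood ratio of a site under a position weight matrix (PWM) versus a background model, that each sequence is summarized by its best-scoring window, and that "normalized by the median and the SD" means subtracting the control median and dividing by the control SD. The PWM, background frequencies, toy sequences, and function names below are all hypothetical; none of them come from the paper.

```python
import math
import statistics

# Hypothetical 4-position PWM (one dict of base probabilities per position).
# Illustrative values only; this is NOT the Oct4 PWM used in the paper.
PWM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
]
# Assumed uniform background model.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def log_lr(site):
    """Log likelihood ratio of one site: PWM model versus background."""
    return sum(math.log(col[b] / BACKGROUND[b]) for col, b in zip(PWM, site))


def best_log_lr(seq):
    """Highest log LR over all windows of a sequence (forward strand only)."""
    w = len(PWM)
    return max(log_lr(seq[i:i + w]) for i in range(len(seq) - w + 1))


def normalize(scores, control_scores):
    """Center by the control median and scale by the control SD (assumed reading)."""
    med = statistics.median(control_scores)
    sd = statistics.stdev(control_scores)
    return [(x - med) / sd for x in scores]


# Toy sequences standing in for bound and control sets.
bound = ["CCATGCAA", "TATGCGGA", "GGGATGCT"]
control = ["CCCCCCCC", "ATATATAT", "GGTTGGTT", "ACGTACGT"]

bound_scores = [best_log_lr(s) for s in bound]
control_scores = [best_log_lr(s) for s in control]
print(normalize(bound_scores, control_scores))
```

Normalizing both sets against the control distribution puts them on a common scale, which is what makes the right tails of the two distributions directly comparable.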
Fig. 2.
Context-dependent motifs recognized by Oct4. (A) The consensus Oct4 motif. (B) pGCAT motif found by CMF when contrasting sequences cobound by Oct4 and Sox2 against sequences bound solely by Oct4. Boxes indicate positions that change from the consensus. (C) Proportions of cofactor binding within 500 bp of the Oct4-motif peaks and the pGCAT peaks in the Chen study with corresponding P-values (−log10 p) from difference of proportions tests. There are 1107 Oct4-motif peaks and 224 pGCAT peaks.
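A difference of proportions test of the kind mentioned in panel (C) might look like the sketch below. The caption does not say which variant was used, so this assumes a two-sided, pooled two-sample z-test. Only the two group sizes (1107 and 224) come from the caption; the cofactor-bound counts are made-up placeholders, and the function name is hypothetical.

```python
import math


def two_proportion_test(x1, n1, x2, n2):
    """Pooled two-sample z-test for a difference of proportions (two-sided).

    Assumes the pooled proportion is strictly between 0 and 1, so the
    standard error is nonzero.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Group sizes from the caption; the counts 600 and 60 are HYPOTHETICAL
# placeholders, not results from the study.
n_oct4_motif, n_pgcat = 1107, 224
z, p = two_proportion_test(600, n_oct4_motif, 60, n_pgcat)
print(f"z = {z:.2f}, -log10(p) = {-math.log10(p):.1f}")
```

Reporting −log10 p rather than p keeps very small P-values readable on a plot axis, with larger values indicating stronger evidence of a difference.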
Fig. 3.
Context-dependent motifs of Tcf3 found when contrasting sequences bound by Tcf3 and Sox2 against those only bound by Tcf3. (A) The Sox2 consensus motif found enriched in the sequences bound by both Tcf3 and Sox2. (B) The motif found in sequences bound by Tcf3 but not Sox2. Corresponding positions between (A) and (B) with different nucleotide distributions are indicated (boxed and circled).

References

    1. Barash Y, et al. A simple hyper-geometric approach for discovering putative transcription factor binding sites. Proc. WABI. 2001;1:278–293.
    2. Beer MA, Tavazoie S. Predicting gene expression from sequence. Cell. 2004;117:185–198.
    3. Chen X, et al. Integration of external signaling pathways with the core transcriptional network in embryonic stem cells. Cell. 2008;133:1106–1117.
    4. Chen G, Zhou Q. Heterogeneity in DNA multiple alignments: modeling, inference, and applications in motif finding. Biometrics. 2010;66:694–704.
    5. Cole M, et al. Tcf3 is an integral component of the core regulatory circuitry of embryonic stem cells. Genes Dev. 2008;22:746–755.