Comparative Study
Nucleic Acids Res. 2019 Jul 26;47(13):6632-6641.
doi: 10.1093/nar/gkz540.

Co-SELECT reveals sequence non-specific contribution of DNA shape to transcription factor binding in vitro

Soumitra Pal et al. Nucleic Acids Res. 2019.

Abstract

Understanding the principles of DNA binding by transcription factors (TFs) is of primary importance for studying gene regulation. Recently, several lines of evidence have suggested that both DNA sequence and shape contribute to TF binding. However, the following compelling question is yet to be considered: in the absence of any sequence similarity to the binding motif, can DNA shape still increase binding probability? To address this challenge, we developed Co-SELECT, a computational approach to analyze the results of in vitro HT-SELEX experiments for TF-DNA binding. Specifically, Co-SELECT leverages the presence of motif-free sequences in late HT-SELEX rounds; their enrichment in weak binders allows Co-SELECT to detect evidence for the role of DNA shape features in TF binding. Our approach revealed that, even in the absence of the sequence motif, TFs have a propensity to bind to DNA molecules whose shape is consistent with motif-specific binding. This provides the first direct evidence that shape features that accompany the preferred sequence motifs also bestow an advantage for weak, sequence non-specific binding.


Figures

Figure 1.
Competition between binders using simulated and real HT-SELEX data. (A) Simulation of eight rounds of HT-SELEX using AptaSim (31), where the initial pool contained 10 million unique oligos divided into five groups. Binding probabilities within each group are sampled from a differently parametrized normal distribution: very strong binders (mean 0.90, SD 0.1), strong binders (mean 0.50, SD 0.30), medium binders (mean 0.25, SD 0.25), weak binders (mean 0.1, SD 0.1) and non-binders (mean 0.025, SD 0.025). The plot shows the distributions of the four groups of oligos as a function of the HT-SELEX rounds. (B) The selection dynamics for the transcription factor MAX. The oligos are categorized into three groups depending on the best match to the core motif CACGTG. The motif-free group is expected to be a mixture of non-binders and weak background binders.
Figure 2.
Discretization of shape features. The heat map of the distribution of shape features in the input data (initial pool over all experiments) and the discretization thresholds. For more details, we refer the reader to Supplementary Section S3.
Figure 3.
Overview of the Co-SELECT method. (A) First, motif-containing (blue) and motif-free (red) aptamers are identified within the sequence pool from the final round of selection. The remaining shapemers that might contain a partial motif are disregarded. Next, the shapemers that potentially contribute to binding in the motif-containing group (core shapemers) and in the motif-free aptamers are identified. (B) Significant overlap of enriched core and motif-free shapemers suggests that the same shape features contribute to motif-specific and motif-free binding.
Figure 4.
The design of control Co-SELECT experiments. Given two transcription factors (TF1, TF2) from two different families, we test for the significance of the overlap between enriched core shapemers from the selection experiment for TF1 and enriched motif-free shapemers from the selection experiment for TF2. We expect the number of co-selected shapemers in such a control experiment to be smaller than in the corresponding true experiment.
Figure 5.
Comparison of P-value histograms for the original experiment (dark brown) and the control experiments (light brown). ECR0.05 > 1 indicates that the number of results with a P-value below 0.05 was higher for the original experiment than for the control. We note that while ECR0.05 < 1 for MGW in the homeodomain family, ECR0.005 > 1 for very small P-values (<0.005).
Figure 6.
Differential shape preferences of two homeodomain proteins: PITX3 and HMX2. (A) Crystal structures of PITX3 bound to DNA. (B) MGW shape preference of PITX3. (C) From top to bottom: logos of motif-containing sequences (including flanking regions) in the initial pool, logos of motif-containing sequences (including flanking regions) in the initial pool that contain the most-enriched motif-free shapemer, sequence logos of the most-enriched motif-free shapemer in the initial and final pools. The letters in the box below the logos show the corresponding shapemers (the difference is shown in red). For each protein, the similarity between the second pair of logos, which differ from the sequence motif logo, indicates that shape rather than sequence contributed to the motif-free enrichment.
Figure 7.
Principal Component Analysis of shapemers enriched in the motif-free pool. Each point corresponds to a TF and is color-coded based primarily on the binding domain and then on the core of the binding motif.
Figure 8.
Promiscuity of shapemers. The shapemers are sorted in decreasing order of their promiscuity. The inset shows the most promiscuous shapemers, using the 30% threshold as the cutoff for high promiscuity.
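
The competition between binder groups described for Figure 1A can be illustrated with a small toy model. The sketch below is not AptaSim and does not reproduce its algorithm; it only reuses the group means and SDs quoted in the caption. Everything else is an assumption made for illustration: equal group sizes, clipping of sampled probabilities to [0, 1], a deterministic expected-value selection step in which each oligo's share of the pool is multiplied by its binding probability every round, and the hypothetical names `GROUPS`, `sample_probability` and `simulate`.

```python
import random

# (mean, SD) of the binding-probability distribution for each group,
# as quoted in the Figure 1A caption.
GROUPS = {
    "very strong": (0.90, 0.10),
    "strong": (0.50, 0.30),
    "medium": (0.25, 0.25),
    "weak": (0.10, 0.10),
    "non-binder": (0.025, 0.025),
}


def sample_probability(rng, mean, sd):
    """Draw a binding probability from N(mean, sd), clipped to [0, 1] (assumption)."""
    return min(1.0, max(0.0, rng.gauss(mean, sd)))


def simulate(oligos_per_group=1000, rounds=8, seed=0):
    """Return one {group: fraction of pool} dict per round; index 0 is the initial pool.

    Assumed toy model: every oligo starts with weight 1, and each selection
    round multiplies its weight by its binding probability. Fractions are
    obtained by normalizing the weights, which stands in for amplification.
    """
    rng = random.Random(seed)
    probs = {
        name: [sample_probability(rng, mean, sd) for _ in range(oligos_per_group)]
        for name, (mean, sd) in GROUPS.items()
    }
    weights = {name: [1.0] * oligos_per_group for name in GROUPS}

    history = []
    for _ in range(rounds + 1):
        # Record the current composition of the pool.
        total = sum(sum(w) for w in weights.values())
        history.append({name: sum(w) / total for name, w in weights.items()})
        # Selection step: expected surviving share of each oligo.
        weights = {
            name: [w * p for w, p in zip(weights[name], probs[name])]
            for name in GROUPS
        }
    return history


if __name__ == "__main__":
    for round_index, fractions in enumerate(simulate()):
        summary = ", ".join(f"{name}: {frac:.3f}" for name, frac in fractions.items())
        print(f"round {round_index}: {summary}")
```

Under these assumptions the strongest binders take over the pool within a few rounds while the weakest groups shrink, which is the qualitative behaviour the caption describes. A stochastic version with explicit copy numbers would be closer to a real HT-SELEX simulation, but the deterministic form keeps the sketch short.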

References

    1. Stormo G.D. DNA binding sites: representation and discovery. Bioinformatics. 2000; 16:16–23.
    2. Hippel P.H.V., Berg O.G. On the specificity of DNA-protein interactions. Proc. Natl. Acad. Sci. U.S.A. 1986; 83:1608–1612.
    3. Badis G., Berger M.F., Philippakis A.A., Talukder S., Gehrke A.R., Jaeger S.A., Chan E.T., Metzler G., Vedenko A., Chen X. et al. Diversity and complexity in DNA recognition by transcription factors. Science. 2009; 324:1720–1723.
    4. Nutiu R., Friedman R.C., Luo S., Khrebtukova I., Silva D., Li R., Zhang L., Schroth G.P., Burge C.B. Direct measurement of DNA affinity landscapes on a high-throughput sequencing instrument. Nat. Biotechnol. 2011; 29:659–664.
    5. Rohs R., West S.M., Sosinsky A., Liu P., Mann R.S., Honig B. The role of DNA shape in protein-DNA recognition. Nature. 2009; 461:1248–1253.
