Review
Genes (Basel). 2020 Aug 25;11(9):995.
doi: 10.3390/genes11090995.

Beyond Trees: Regulons and Regulatory Motif Characterization

Xuhua Xia. Genes (Basel).

Abstract

Trees and their seeds regulate their germination, growth, and reproduction in response to environmental stimuli. These stimuli, through signal transduction, trigger transcription factors that alter the expression of various genes, leading to the unfolding of the genetic program. A regulon is conceptually defined as a set of target genes that a transcription factor regulates by physically binding to their regulatory motifs to accomplish a specific biological function, such as the CO-FT regulon for flowering timing and fall growth cessation in trees. Only with a clear characterization of regulatory motifs can candidate target genes be experimentally validated, but motif characterization represents the weakest feature of regulon research, especially in tree genetics. I review here relevant experimental and bioinformatics approaches to characterizing transcription factors and their binding sites, outline problems in tree regulon research, and demonstrate how transcription factor databases can be effectively used to aid the characterization of tree regulons.

Keywords: Gibbs sampler; comparative genomics; gene expression; regulatory motifs; regulon; transcription factor.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Illustration of a regulon network of two interacting regulons. (A) tfA-regulon with target genes T1, T2, and tfB, and tfB-regulon with target genes T4 and T5. (B) Legends of graphic elements. TFBS: transcription factor binding site. (C) A simplified representation of the two regulons. (D) A partial regulon network [3] for coping with cold stress in apple (Malus domestica), in which TFBSs remain poorly characterized.
Figure 2
Construction and use of position weight matrix (PWM) to discover new transcription factor binding sites (TFBS). (A) Site-specific compilation of experimentally validated TFBS for AtHB1, retrieved from PlantPAN [57]. (B) The consensus sequence from nine strongly informative sites in (A). (C) PWM obtained from data in (A). (D) PWM score (PWMS) for 9mers along the 2000 nt upstream of micro-RNA gene miR164a which starts at site 2001. The expected PWMS for random sequences is 0. The 9mer with the highest PWMS is CAATCATTA (at Site 343) which was verified to be an effective TFBS for AtHB1. PWM computation and sequence scanning were done with software DAMBE [66,67].
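The PWM construction and scanning described in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the DAMBE implementation: the aligned sites, the pseudocount of 0.5, the uniform background of 0.25 per base, and the function names are all assumptions made for the example. With log2-odds scores against the background, the expected score of a random k-mer is near or below 0, and k-mers that resemble the compiled sites score well above it.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=0.5, background=0.25):
    """Build a log2-odds PWM (one dict per column) from equal-length aligned sites."""
    n = len(sites)
    width = len(sites[0])
    pwm = []
    for i in range(width):
        column = [site[i] for site in sites]
        scores = {}
        for base in BASES:
            # Pseudocounts keep unseen bases from producing log(0).
            freq = (column.count(base) + pseudocount) / (n + 4 * pseudocount)
            scores[base] = math.log2(freq / background)
        pwm.append(scores)
    return pwm


def score_kmer(pwm, kmer):
    """PWM score of one k-mer: the sum of its per-column log-odds."""
    return sum(column[base] for column, base in zip(pwm, kmer))


def scan(pwm, sequence):
    """Score every k-mer in the sequence; positions are reported 1-based."""
    k = len(pwm)
    return [
        (i + 1, sequence[i:i + k], score_kmer(pwm, sequence[i:i + k]))
        for i in range(len(sequence) - k + 1)
    ]


def best_hit(pwm, sequence):
    """Return the (position, k-mer, score) tuple with the highest PWM score."""
    return max(scan(pwm, sequence), key=lambda hit: hit[2])


# Hypothetical aligned sites and upstream sequence, for illustration only.
toy_sites = ["CAATCATTA", "CAATTATTA", "CAATCATTG", "CAATAATTA"]
toy_upstream = "GGGTTTCCCAATCATTAGGGCCC"
toy_pwm = build_pwm(toy_sites)
position, kmer, score = best_hit(toy_pwm, toy_upstream)
print(position, kmer, round(score, 2))
```

In practice the background frequencies would be estimated from the genome being scanned rather than fixed at 0.25.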
Figure 3
Examples of leucine zipper transcription factors with heptad repeats, with the seven amino acids in each heptad labelled as abcdefg. Hydrophobic residues at positions a and d are shown in bold. (A) GCN4 (sites 250 to 277) in yeast Saccharomyces cerevisiae. (B) XBP1 (sites 95 to 136) in human. (C) Relative position of the seven amino acids in an α helix. (D) Top view of a leucine zipper homodimer contacting at their hydrophobic side.
Figure 4
Protein isoelectric point (pI) profile showing different transcription factors having a positively charged DNA-binding domain, with AtHB1 from Arabidopsis thaliana, Hac1 from Saccharomyces cerevisiae and XBP1 from human (a functional homologue of Hac1). Window-specific pI was computed with software DAMBE [66,67] with window size of 80 and step size of 5.
Figure 5
Illustration of the ChIP-Seq method, highlighting its problems and improvements. (A) DNA genome with gene1, gene2, gene4, and gene7 sharing the same TFBS bound to the same TF of interest (red-colored squares). (B) Cross-linking fixes TF onto TFBS, and sonication is optimized to minimize the sequences flanking TFBS. (C) A specific anti-TF antibody is used to bind to the TF of interest. (D) A general-purpose protein A/G coupled to agarose beads is used to pull down the TFBS-TF-antibody complex. (E) Reversal of cross-linking frees TFBS-containing fragments for sequencing. (F) High-throughput sequencing generates millions or even billions of sequence reads that may contain a TFBS.
Figure 6
Reducing background sequences flanking TFBS. (A) Paired-end reading of a TFBS-containing sequence fragment. (B) Paired-end reads that do not overlap, from an SRA file (ACCN: SRR11577050 in BioProject PRJNA544746) generated by Cut&Run. (C) Paired-end reads overlap, from the same file as in (B). (D) Paired-end reads from an SRA file (ACCN: SRR11806558 in BioProject PRJNA633509) generated by ChIP-seq. The forward or reverse read is reverse-complemented to generate the sequence alignment.
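The reverse-complement step mentioned in this caption, and the distinction between overlapping and non-overlapping read pairs, can be illustrated with a short Python sketch. It assumes error-free reads and an exact-match overlap of at least `min_overlap` bases; the function names and the toy fragment are hypothetical, and real read mergers also tolerate mismatches and use base qualities.

```python
# Translation table for complementing DNA, keeping N and lowercase bases.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_pair(forward, reverse, min_overlap=5):
    """Merge a read pair into one fragment if the reads overlap exactly.

    The reverse read is reverse-complemented so that both reads are on the
    same strand. The longest exact suffix/prefix overlap is then used to
    join them. Returns None when no overlap of at least min_overlap exists,
    which corresponds to a non-overlapping pair.
    """
    rc = reverse_complement(reverse)
    longest = min(len(forward), len(rc))
    for overlap in range(longest, min_overlap - 1, -1):
        if forward[-overlap:] == rc[:overlap]:
            return forward + rc[overlap:]
    return None


# Toy fragment (hypothetical): reads of 15 nt taken from each end.
fragment = "ACGTTGCAATCATTAGGCTAACG"
forward_read = fragment[:15]
reverse_read = reverse_complement(fragment[8:])
print(merge_pair(forward_read, reverse_read))
```

A merged pair spans the whole sequenced fragment, so a short fragment with overlapping reads carries less flanking background around the TFBS than a long one.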
Figure 7
Identifying TFBS by mapping ChIP-Seq reads to genome. (A) Two genes on the genome (colored yellow) are not homologous but share a TFBS (colored red), each with many reads mapped to them. (B) The count of each site represented in reads, with each site in TFBS expected to have higher counts than flanking sequences.
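The per-site counts in panel (B) amount to a coverage profile over the genome. The sketch below is one simple way to compute it, assuming each mapped read is given as a hypothetical `(start, length)` pair with a 0-based start inside the genome; it is not taken from any particular read mapper. The helper `peak_interval` further assumes that the sites with maximal coverage form one contiguous block.

```python
def site_coverage(genome_length, mapped_reads):
    """Count how many mapped reads cover each genome site.

    mapped_reads is an iterable of (start, length) pairs with 0-based
    starts. A difference array makes the cost linear in genome length
    plus number of reads.
    """
    diff = [0] * (genome_length + 1)
    for start, length in mapped_reads:
        diff[start] += 1
        diff[min(start + length, genome_length)] -= 1
    coverage = []
    running = 0
    for delta in diff[:genome_length]:
        running += delta
        coverage.append(running)
    return coverage


def peak_interval(coverage):
    """Return (start, end, depth) of the most covered sites, end exclusive."""
    depth = max(coverage)
    positions = [i for i, count in enumerate(coverage) if count == depth]
    return positions[0], positions[-1] + 1, depth


# Three toy reads of 8 nt on a 20-nt genome (hypothetical data).
toy_reads = [(2, 8), (5, 8), (7, 8)]
toy_coverage = site_coverage(20, toy_reads)
print(toy_coverage)
print(peak_interval(toy_coverage))  # (7, 10, 3)
```

Sites shared by all reads have the highest counts, which is why the TFBS is expected to stand out against its flanking sequences.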
Figure 8
Using Gibbs sampler to extract TFBS. (A) Six reads, each mapped to a different gene on the genome (represented by the blue line). (B) FASTA file for the six reads as input to Gibbs sampler. (C) An optimal position weight matrix (PWM) that one can use to scan the genome for new locations of the TFBS. (D) The aligned motif that generates the most informative PWM in (C). (C) and (D) are generated from DAMBE [61] based on the FASTA input file in (B).
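For readers unfamiliar with the method, the core loop of a Gibbs site sampler can be sketched as follows. This is a bare-bones illustration under stated assumptions, not the DAMBE procedure: it assumes exactly one motif occurrence per read, a fixed motif width, a uniform background of 0.25 per base, and sequences containing only A, C, G, and T. The function names, parameters, and toy reads are hypothetical. It returns the motif start positions from the final sweep rather than tracking the best-scoring alignment seen.

```python
import math
import random

BASES = "ACGT"


def column_counts(sequences, starts, width, skip, pseudocount):
    """Per-column base counts of the current motif, leaving out one sequence."""
    counts = [{base: pseudocount for base in BASES} for _ in range(width)]
    for index, (seq, start) in enumerate(zip(sequences, starts)):
        if index == skip:
            continue
        for j in range(width):
            counts[j][seq[start + j]] += 1
    return counts


def gibbs_sample(sequences, width, iterations=200, pseudocount=0.5, seed=0):
    """Sample one motif start per sequence; returns the final start positions."""
    rng = random.Random(seed)
    starts = [rng.randrange(len(seq) - width + 1) for seq in sequences]
    total = len(sequences) - 1 + 4 * pseudocount
    for _ in range(iterations):
        for i, seq in enumerate(sequences):
            # Build the motif model from all sequences except the current one.
            counts = column_counts(sequences, starts, width, i, pseudocount)
            weights = []
            for start in range(len(seq) - width + 1):
                # Likelihood ratio of the window under motif vs. background.
                log_ratio = sum(
                    math.log(counts[j][seq[start + j]] / total / 0.25)
                    for j in range(width)
                )
                weights.append(math.exp(log_ratio))
            # Draw a new start for this sequence in proportion to the ratios.
            starts[i] = rng.choices(range(len(weights)), weights=weights)[0]
    return starts


# Toy reads (hypothetical), each with a CAATCATTA-like site embedded.
toy_reads = [
    "GGCTACAATCATTAGTCC",
    "TTCAATCATTAGGCGATC",
    "ACGGTCCAATTATTACGT",
    "CCGTACGTCAATCATTGA",
]
toy_starts = gibbs_sample(toy_reads, width=9, iterations=100, seed=1)
print([read[s:s + 9] for read, s in zip(toy_reads, toy_starts)])
```

Because the sampler is stochastic, different seeds can settle on different alignments; in practice several runs are compared and the alignment yielding the most informative PWM is retained.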

References

    1. Romero I., Fuertes A., Benito M.J., Malpica J.M., Leyva A., Paz-Ares J. More than 80 R2R3-MYB regulatory genes in the genome of Arabidopsis thaliana. Plant J. 1998;14:273–284. doi: 10.1046/j.1365-313X.1998.00113.x.
    2. Stracke R., Werber M., Weisshaar B. The R2R3-MYB gene family in Arabidopsis thaliana. Curr. Opin. Plant Biol. 2001;4:447–456. doi: 10.1016/S1369-5266(00)00199-0.
    3. Xie Y., Chen P., Yan Y., Bao C., Li X., Wang L., Shen X., Li H., Liu X., Niu C., et al. An atypical R2R3 MYB transcription factor increases cold hardiness by CBF-dependent and CBF-independent pathways in apple. New Phytol. 2017;218:201–218. doi: 10.1111/nph.14952.
    4. Maas W.K., Clark A. Studies on the mechanism of repression of arginine biosynthesis in Escherichia coli. J. Mol. Biol. 1964;8:365–370. doi: 10.1016/S0022-2836(64)80200-X.
    5. Koornneef M., Hanhart C.J., Van Der Veen J.H. A genetic and physiological analysis of late flowering mutants in Arabidopsis thaliana. Mol. Genet. Genom. 1991;229:57–66. doi: 10.1007/BF00264213.
