Abstract
Phylogenetic footprinting is a method for discovering regulatory elements in a set of homologous regulatory regions, usually collected from multiple species. It does so by identifying the most conserved motifs in those homologous regions. Two families of methods, alignment-based and motif-based, are generally employed for phylogenetic footprinting. However, few serious efforts have been made to develop a tool exclusively for phylogenetic footprinting based on either family. Nevertheless, a number of software tools exist that can be applied to predict phylogenetic footprints, with varying degrees of success. The output of these tools may be affected by a number of factors associated with the current state of knowledge, techniques, and other available resources. Here we present a critical appraisal of various phylogenetic footprinting approaches with reference to prokaryotes, outlining the available resources and discussing the factors that affect footprinting, in order to give a clear idea of the proper use of this approach in prokaryotes.
Acknowledgments
The authors acknowledge the DBT center for bioinformatics facility at the Department of Bioscience and Biotechnology, Banasthali University, Banasthali, India, for support.
Disclosure statement
No competing financial interests exist.
Additional information
Handling Editor: Peter Nick
Cite this article
Katara, P., Grover, A. & Sharma, V. Phylogenetic footprinting: a boost for microbial regulatory genomics. Protoplasma 249, 901–907 (2012). https://doi.org/10.1007/s00709-011-0351-9
DOI: https://doi.org/10.1007/s00709-011-0351-9