Abstract
Elucidating the human transcriptional regulatory network [1] is a challenge of the post-genomic era. Technical progress so far is impressive, including detailed understanding of regulatory mechanisms for at least a few genes in multicellular organisms [2,3,4], rapid and precise localization of regulatory regions within extensive regions of DNA by means of cross-species comparison [5,6,7], and de novo determination of transcription-factor binding specificities from large-scale yeast expression data [8]. Here we address two problems involved in extending these results to the human genome: first, it has been unclear how many model organism genomes will be needed to delineate most regulatory regions; and second, the discovery of transcription-factor binding sites (response elements) from expression data has not yet been generalized from single-celled organisms to multicellular organisms. We found that 98% (74/75) of experimentally defined sequence-specific binding sites of skeletal-muscle-specific transcription factors are confined to the 19% of human sequences that are most conserved in the orthologous rodent sequences. We also found that, using this restriction, the binding specificities of all three major muscle-specific transcription factors (MYF, SRF and MEF2) can be computationally identified.
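The conservation filter described in the abstract can be illustrated with a minimal sketch. The abstract does not say how conservation was scored, so everything here is an assumption made for illustration: a precomputed gapped human-mouse alignment, fixed non-overlapping windows, percent identity as the score, and the hypothetical helper names `window_identity`, `top_conserved_windows` and `site_is_covered`. The toy sequences are invented and are not data from the paper.

```python
import math
from typing import List, Tuple


def window_identity(human: str, mouse: str, window: int) -> List[Tuple[int, float]]:
    """Score non-overlapping windows of a gapped pairwise alignment.

    Returns (window start, fraction of identical non-gap columns).
    Assumption: the two strings are rows of an existing alignment.
    """
    if len(human) != len(mouse):
        raise ValueError("aligned sequences must have equal length")
    scores = []
    for start in range(0, len(human) - window + 1, window):
        pairs = zip(human[start:start + window].upper(),
                    mouse[start:start + window].upper())
        identical = sum(1 for h, m in pairs if h == m and h != "-")
        scores.append((start, identical / window))
    return scores


def top_conserved_windows(scores: List[Tuple[int, float]],
                          keep_fraction: float) -> List[int]:
    """Return the start positions of the best-scoring fraction of windows."""
    n_keep = max(1, math.ceil(keep_fraction * len(scores)))
    # Highest identity first; ties broken by position for determinism.
    ranked = sorted(scores, key=lambda s: (-s[1], s[0]))
    return sorted(start for start, _ in ranked[:n_keep])


def site_is_covered(site_start: int, site_end: int,
                    kept_starts: List[int], window: int) -> bool:
    """True if the half-open site [site_start, site_end) lies in one kept window."""
    return any(w <= site_start and site_end <= w + window for w in kept_starts)


# Toy alignment (invented): only the second 10-column window is identical.
HUMAN = "ACGTACGTAC" + "CAGCTGTTAA" + "GGGGCCCCAA" + "TTTTAAAACC"
MOUSE = "ACTTACCTAG" + "CAGCTGTTAA" + "GAGGTCCCTA" + "TATTCAAAGC"

scores = window_identity(HUMAN, MOUSE, window=10)
kept = top_conserved_windows(scores, keep_fraction=0.25)
```

With a list of experimentally defined site coordinates, the fraction for which `site_is_covered` returns true would play the role of the 74/75 figure, and `keep_fraction` the role of the 19% figure. Non-overlapping windows keep the sketch short; a sliding window would be a natural refinement.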
References
Kadonaga, J.T. Eukaryotic transcription: an interlaced network of transcription factors and chromatin-modifying machines. Cell 92, 307–313 (1998).
Orkin, S.H. Regulation of globin gene expression in erythroid cells. Eur. J. Biochem. 231, 271–281 (1995).
Qin, W. et al. Molecular characterization of the creatine kinases and some historical perspectives. Mol. Cell. Biochem. 184, 153–167 (1998).
Yuh, C.H., Bolouri, H. & Davidson, E.H. Genomic cis-regulatory logic: experimental and computational analysis of a sea urchin gene. Science 279, 1896–1902 (1998).
Aparicio, S. et al. Organization of the Fugu rubripes Hox clusters: evidence for continuing evolution of vertebrate Hox complexes. Nature Genet. 16, 79–83 (1997).
Brickner, A.G., Koop, B.F., Aronow, B.J. & Wiginton, D.A. Genomic sequence comparison of the human and mouse adenosine deaminase gene regions. Mamm. Genome 10, 95–101 (1999).
Hardison, R.C., Oeltjen, J. & Miller, W. Long human-mouse sequence alignments reveal novel regulatory elements: a reason to sequence the mouse genome. Genome Res. 7, 959–966 (1997).
Tavazoie, S., Hughes, J.D., Campbell, M.J., Cho, R.J. & Church, G.M. Systematic determination of genetic network architecture. Nature Genet. 22, 281–285 (1999).
Fickett, J.W. & Wasserman, W.W. Discovery and modeling of transcriptional regulatory regions. Curr. Opin. Biotechnol. 11, 19–24 (2000).
Stormo, G.D. & Fields, D.S. Specificity, free energy and information content in protein-DNA interactions. Trends Biochem. Sci. 23, 109–113 (1998).
Werner, T. Models for prediction and recognition of eukaryotic promoters. Mamm. Genome 10, 168–175 (1999).
Fickett, J.W. & Hatzigeorgiou, A.G. Eukaryotic promoter recognition. Genome Res. 7, 861–878 (1997).
Tronche, F., Ringeisen, F., Blumenfeld, M., Yaniv, M. & Pontoglio, M. Analysis of the distribution of binding sites for a tissue-specific transcription factor in the vertebrate genome. J. Mol. Biol. 266, 231–245 (1997).
Duret, L. & Bucher, P. Searching for regulatory elements in human noncoding sequences. Curr. Opin. Struct. Biol. 7, 399–406 (1997).
Koop, B.F. Human and rodent DNA sequence comparisons: a mosaic model of genomic evolution. Trends Genet. 11, 367–371 (1995).
Wasserman, W.W. & Fickett, J.W. Identification of regulatory regions which confer muscle-specific gene expression. J. Mol. Biol. 278, 167–181 (1998).
Battey, J., Jordan, E., Cox, D. & Dove, W. An action plan for mouse genomics. Nature Genet. 21, 73–75 (1999).
Sonnhammer, E.L. & Durbin, R. A dot-matrix program with dynamic threshold control suited for genomic DNA and protein sequence analysis. Gene 167, GC1–10 (1995).
Huang, X.Q., Hardison, R.C. & Miller, W. A space-efficient algorithm for local similarities. Comput. Appl. Biosci. 6, 373–381 (1990).
Zhu, J., Liu, J.S. & Lawrence, C.E. Bayesian adaptive sequence alignment algorithms. Bioinformatics 14, 25–39 (1998).
Lania, L., Majello, B. & De Luca, P. Transcriptional regulation by the Sp family proteins. Int. J. Biochem. Cell Biol. 29, 1313–1323 (1997).
Scherf, M., Klingenhoff, A. & Werner, T. Highly specific localization of promoter regions in large genomic sequences by PromoterInspector: a novel context analysis approach. J. Mol. Biol. 297, 599–606 (2000).
Lawrence, C.E. et al. Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment. Science 262, 208–214 (1993).
Liu, J.S., Neuwald, A.F. & Lawrence, C.E. Bayesian models for multiple local sequence alignment and Gibbs sampling strategies. J. Am. Stat. Assoc. 90, 1156–1170 (1995).
Spellman, P.T. et al. Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol. Biol. Cell. 9, 3273–3297 (1998).
Sankoff, D. & Cedergren, R.J. A test for nucleotide sequence homology. J. Mol. Biol. 77, 159–164 (1973).
Altschul, S.F., Gish, W., Miller, W., Myers, E.W. & Lipman, D.J. Basic local alignment search tool. J. Mol. Biol. 215, 403–410 (1990).
Agarwal, P. & States, D.J. A Bayesian evolutionary distance for parametrically aligned sequences. J. Comput. Biol. 3, 1–17 (1996).
Liu, J.S. & Lawrence, C.E. Bayesian inference on biopolymer models. Bioinformatics 15, 38–52 (1999).
Wootton, J.C. & Federhen, S. Analysis of compositionally biased regions in sequence databases. Methods Enzymol. 266, 554–571 (1996).
Acknowledgements
We thank our colleagues at SmithKline Beecham and the Wadsworth Center for input, and the Computational Molecular Biology Core at the Wadsworth Center and I. Auger for assistance. This work was supported by grants from the NIH to J.W.F. (NHGRI R01 HG00981-03) and C.E.L. (NHGRI R01 HG01257).
Cite this article
Wasserman, W., Palumbo, M., Thompson, W. et al. Human-mouse genome comparisons to locate regulatory sites. Nat Genet 26, 225–228 (2000). https://doi.org/10.1038/79965