Abstract
The systematic comparison of genomic sequences from different organisms represents a central focus of contemporary genome analysis. Comparative analyses of vertebrate sequences can identify coding [1–6] and conserved non-coding [4,6,7] regions, including regulatory elements [8–10], and provide insight into the forces that have rendered modern-day genomes [6]. As a complement to whole-genome sequencing efforts [3,5,6], we are sequencing and comparing targeted genomic regions in multiple, evolutionarily diverse vertebrates. Here we report the generation and analysis of over 12 megabases (Mb) of sequence from 12 species, all derived from the genomic region orthologous to a segment of about 1.8 Mb on human chromosome 7 containing ten genes, including the gene mutated in cystic fibrosis. These sequences show conservation reflecting both functional constraints and the neutral mutational events that shaped this genomic region. In particular, we identify substantial numbers of conserved non-coding segments beyond those previously identified experimentally, most of which are not detectable by pair-wise sequence comparisons alone. Analysis of transposable element insertions highlights the variation in genome dynamics among these species and confirms the placement of rodents as a sister group to the primates.
References
Batzoglou, S., Pachter, L., Mesirov, J. P., Berger, B. & Lander, E. S. Human and mouse gene structure: comparative analysis and application to exon prediction. Genome Res. 10, 950–958 (2000)
Roest Crollius, H. et al. Estimate of human gene number provided by genome-wide analysis using Tetraodon nigroviridis DNA sequence. Nature Genet. 25, 235–238 (2000)
International Human Genome Sequencing Consortium. Initial sequencing and analysis of the human genome. Nature 409, 860–921 (2001)
Chen, R., Bouck, J. B., Weinstock, G. M. & Gibbs, R. A. Comparing vertebrate whole-genome shotgun reads to the human genome. Genome Res. 11, 1807–1816 (2001)
Aparicio, S. et al. Whole-genome shotgun assembly and analysis of the genome of Fugu rubripes. Science 297, 1301–1310 (2002)
Mouse Genome Sequencing Consortium. Initial sequencing and comparative analysis of the mouse genome. Nature 420, 520–562 (2002)
Dubchak, I. et al. Active conservation of noncoding sequences revealed by three-way species comparisons. Genome Res. 10, 1304–1306 (2000)
Gottgens, B. et al. Analysis of vertebrate SCL loci identifies conserved enhancers. Nature Biotechnol. 18, 181–186 (2000)
Hardison, R. C. Conserved noncoding sequences are reliable guides to regulatory elements. Trends Genet. 16, 369–372 (2000)
Pennacchio, L. A. & Rubin, E. M. Genomic strategies to identify mammalian regulatory sequences. Nature Rev. Genet. 2, 100–109 (2001)
Rommens, J. M. et al. Identification of the cystic fibrosis gene: chromosome walking and jumping. Science 245, 1059–1065 (1989)
Felsenfeld, A., Peterson, J., Schloss, J. & Guyer, M. Assessing the quality of the DNA sequence from The Human Genome Project. Genome Res. 9, 1–4 (1999)
Schwartz, S. et al. Human–mouse alignments with BLASTZ. Genome Res. 13, 103–107 (2003)
Schwartz, S. et al. MultiPipMaker and supporting tools: alignments and analysis of multiple genomic DNA sequences. Nucleic Acids Res. 31, 3518–3524 (2003)
Murphy, W. J. et al. Resolution of the early placental mammal radiation using Bayesian phylogenetics. Science 294, 2348–2351 (2001)
Poux, C., Van Rheede, T., Madsen, O. & de Jong, W. W. Sequence gaps join mice and men: phylogenetic evidence from deletions in two proteins. Mol. Biol. Evol. 19, 2035–2037 (2002)
Huelsenbeck, J. P., Larget, B. & Swofford, D. A compound Poisson process for relaxing the molecular clock. Genetics 154, 1879–1892 (2000)
Cooper, G. M. et al. Quantitative estimates of sequence divergence for comparative analyses of mammalian genomes. Genome Res. 13, 813–820 (2003)
Siepel, A. & Haussler, D. Proc. 7th Annual Int. Conf. Research in Computational Molecular Biology (ACM, New York, 2003)
Hardison, R. C. et al. Covariation in frequencies of substitution, deletion, transposition, and recombination during eutherian evolution. Genome Res. 13, 13–26 (2003)
Green, P. et al. Transcription-associated mutational asymmetry in mammalian evolution. Nature Genet. 33, 514–517 (2003)
Frazer, K. A. et al. Genomic DNA insertions and deletions occur frequently between humans and nonhuman primates. Genome Res. 13, 341–346 (2003)
Britten, R. J. Divergence between samples of chimpanzee and human DNA sequences is 5%, counting indels. Proc. Natl Acad. Sci. USA 99, 13633–13635 (2002)
Springer, M. S., Murphy, W. J., Eizirik, E. & O'Brien, S. J. Placental mammal diversification and the Cretaceous/Tertiary boundary. Proc. Natl Acad. Sci. USA 100, 1056–1061 (2003)
Li, W. H., Ellsworth, D. L., Krushkal, J., Chang, B. H. & Hewett-Emmett, D. Rates of nucleotide substitution in primates and rodents and the generation-time effect hypothesis. Mol. Phylogenet. Evol. 5, 182–187 (1996)
Kumar, S. & Subramanian, S. Mutation rates in mammalian genomes. Proc. Natl Acad. Sci. USA 99, 803–808 (2002)
Shizuya, H. et al. Cloning and stable maintenance of 300-kilobase-pair fragments of human DNA in Escherichia coli using an F-factor-based vector. Proc. Natl Acad. Sci. USA 89, 8794–8797 (1992)
Thomas, J. W. et al. Parallel construction of orthologous sequence-ready clone contig maps in multiple species. Genome Res. 12, 1277–1285 (2002)
Burge, C. & Karlin, S. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 268, 78–94 (1997)
Kent, W. J. et al. The human genome browser at UCSC. Genome Res. 12, 996–1006 (2002)
Acknowledgements
We thank J. Weissenbach and H. Roest Crollius for Tetraodon BACs; M. Diekhans for computational expertise; N. Goldman and Z. Yang for advice on phylogenetic analyses; and F. Collins and J. Mullikin for critically reading the manuscript. We acknowledge the support of the National Human Genome Research Institute (National Institutes of Health) and the Howard Hughes Medical Institute.
Ethics declarations
Competing interests
The authors declare that they have no competing financial interests.
Cite this article
Thomas, J., Touchman, J., Blakesley, R. et al. Comparative analyses of multi-species sequences from targeted genomic regions. Nature 424, 788–793 (2003). https://doi.org/10.1038/nature01858
DOI: https://doi.org/10.1038/nature01858