Aligning multiple genomic sequences with the threaded blockset aligner
- PMID: 15060014
- PMCID: PMC383317
- DOI: 10.1101/gr.1933104
Abstract
We define a "threaded blockset," which is a novel generalization of the classic notion of a multiple alignment. A new computer program called TBA (for "threaded blockset aligner") builds a threaded blockset under the assumption that all matching segments occur in the same order and orientation in the given sequences; inversions and duplications are not addressed. TBA is designed to be appropriate for aligning many, but by no means all, megabase-sized regions of multiple mammalian genomes. The output of TBA can be projected onto any genome chosen as a reference, thus guaranteeing that different projections present consistent predictions of which genomic positions are orthologous. This capability is illustrated using a new visualization tool to view TBA-generated alignments of vertebrate Hox clusters from both the mammalian and fish perspectives. Experimental evaluation of alignment quality, using a program that simulates evolutionary change in genomic sequences, indicates that TBA is more accurate than earlier programs. To perform the dynamic-programming alignment step, TBA runs a stand-alone program called MULTIZ, which can be used to align highly rearranged or incompletely sequenced genomes. We describe our use of MULTIZ to produce the whole-genome multiple alignments at the Santa Cruz Genome Browser.
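The projection idea in the abstract can be illustrated with a toy sketch. This is not TBA's implementation or its file format; the `Block` representation, the function names `project` and `orthologous_pairs`, and the three tiny sequences are all assumptions made for illustration. Each block is a small local alignment, and projecting onto a reference simply keeps the blocks that contain that species, ordered by its coordinates. Because every projection reads from the same set of blocks, the orthology predictions agree regardless of the reference chosen.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy representation: species -> (0-based start, gapped row).
Block = Dict[str, Tuple[int, str]]


def project(blocks: List[Block], ref: str) -> List[Block]:
    """Keep the blocks that contain `ref`, ordered by their start in `ref`."""
    return sorted((b for b in blocks if ref in b), key=lambda b: b[ref][0])


def orthologous_pairs(blocks: List[Block], a: str, b: str) -> Set[Tuple[int, int]]:
    """Collect (pos_a, pos_b) pairs that share an alignment column."""
    pairs = set()
    for blk in blocks:
        if a not in blk or b not in blk:
            continue
        (pa, row_a), (pb, row_b) = blk[a], blk[b]
        for ca, cb in zip(row_a, row_b):
            if ca != "-" and cb != "-":
                pairs.add((pa, pb))
            # Advance a coordinate only when its row has a base, not a gap.
            pa += ca != "-"
            pb += cb != "-"
    return pairs


# Made-up data: every position of each sequence lies in exactly one block.
blocks: List[Block] = [
    {"human": (0, "ACG-T"), "mouse": (0, "ACGGT")},
    {"mouse": (5, "TTA"), "fugu": (0, "TCA")},
    {"human": (4, "GGC"), "fugu": (3, "GAC")},
]

human_view = project(blocks, "human")
mouse_view = project(blocks, "mouse")

# The human-to-mouse pairs seen from the human projection match the
# mouse-to-human pairs seen from the mouse projection, with roles swapped.
from_human = orthologous_pairs(human_view, "human", "mouse")
from_mouse = {(h, m) for (m, h) in orthologous_pairs(mouse_view, "mouse", "human")}
```

In this sketch `from_human` and `from_mouse` hold the same pairs, which is the consistency property the abstract attributes to projections of a threaded blockset.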
WEB SITE REFERENCES
- http://bio.cse.psu.edu/; TBA, simulated test data, and the Gmaj visualization tool.
- http://genome.ucsc.edu; MULTIZ and HUMOR alignments.