Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis
- PMID: 10742046
- DOI: 10.1093/oxfordjournals.molbev.a026334
Abstract
The use of some multiple-sequence alignments in phylogenetic analysis, particularly those that are not very well conserved, requires the elimination of poorly aligned positions and divergent regions, since they may not be homologous or may have been saturated by multiple substitutions. A computerized method that eliminates such positions and at the same time tries to minimize the loss of informative sites is presented here. The method is based on the selection of blocks of positions that fulfill a simple set of requirements with respect to the number of contiguous conserved positions, lack of gaps, and high conservation of flanking positions, making the final alignment more suitable for phylogenetic analysis. To illustrate the efficiency of this method, alignments of 10 mitochondrial proteins from several completely sequenced mitochondrial genomes belonging to diverse eukaryotes were used as examples. The percentages of removed positions were higher in the most divergent alignments. After removing divergent segments, the amino acid composition of the different sequences was more uniform, and pairwise distances became much smaller. Phylogenetic trees show that topologies can be different after selecting conserved blocks, particularly when there are several poorly resolved nodes. Strong support was found for the grouping of animals and fungi but not for the position of more basal eukaryotes. The use of a computerized method such as the one presented here reduces to a certain extent the necessity of manually editing multiple alignments, makes the automation of phylogenetic analysis of large data sets feasible, and facilitates the reproduction of the final alignment by other researchers.
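The block-selection idea described in the abstract can be illustrated with a minimal sketch. This is not the published implementation: the function names, the conservation threshold, and the limits `max_nonconserved_run` and `min_block_len` are hypothetical values chosen for illustration, and the rules are a simplified reading of the three stated requirements (contiguous conserved positions, lack of gaps, and conserved flanking positions).

```python
from collections import Counter


def classify_columns(alignment, conserved_frac=0.5, gap_char="-"):
    """Label each column 'C' (conserved), 'N' (nonconserved) or 'G' (has a gap).

    conserved_frac is an assumed threshold, not the published one.
    """
    n_seqs = len(alignment)
    labels = []
    for column in zip(*alignment):
        if gap_char in column:
            labels.append("G")
            continue
        top_count = Counter(column).most_common(1)[0][1]
        labels.append("C" if top_count / n_seqs > conserved_frac else "N")
    return labels


def select_blocks(labels, max_nonconserved_run=2, min_block_len=4):
    """Return half-open (start, end) column ranges that pass the simplified rules."""
    n = len(labels)
    keep = [lab != "G" for lab in labels]  # rule: no gap columns

    # Rule: reject runs of nonconserved columns longer than the allowed maximum.
    i = 0
    while i < n:
        if labels[i] != "N":
            i += 1
            continue
        j = i
        while j < n and labels[j] == "N":
            j += 1
        if j - i > max_nonconserved_run:
            keep[i:j] = [False] * (j - i)
        i = j

    # Rule: each block must start and end on a conserved column and be long enough.
    blocks = []
    i = 0
    while i < n:
        if not keep[i]:
            i += 1
            continue
        j = i
        while j < n and keep[j]:
            j += 1
        start, end = i, j
        while start < end and labels[start] != "C":
            start += 1
        while end > start and labels[end - 1] != "C":
            end -= 1
        if end - start >= min_block_len:
            blocks.append((start, end))
        i = j
    return blocks


def trim_alignment(alignment, blocks):
    """Concatenate the selected blocks for every sequence."""
    return ["".join(seq[s:e] for s, e in blocks) for seq in alignment]


# Toy alignment (made-up sequences, for illustration only).
example = [
    "MKTAYW-LVQRS",
    "MKTAYC-LVQRS",
    "MKTAYHALVQRS",
    "MKSAYDGLVQRT",
]
example_labels = classify_columns(example)
example_blocks = select_blocks(example_labels)
print("".join(example_labels))
print(example_blocks)
print(trim_alignment(example, example_blocks))
```

In this toy case the gapped column and the adjacent variable column are dropped, while the conserved stretches on either side are retained. Because the selection depends only on the alignment and a few explicit parameters, the same trimmed alignment can be regenerated by anyone who has both.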