BMC Res Notes. 2016 Jan 22;9:39.
doi: 10.1186/s13104-016-1847-3.

SeqTools: visual tools for manual analysis of sequence alignments

Gemma Barson et al. BMC Res Notes. 2016.

Abstract

Background: Manual annotation is essential to create high-quality reference alignments and annotation. Annotators need to be able to view sequence alignments in detail. The SeqTools package provides three tools for viewing different types of sequence alignment: Blixem is a many-to-one browser of pairwise alignments, displaying multiple match sequences aligned against a single reference sequence; Dotter provides a graphical dot-plot view of a single pairwise alignment; and Belvu is a multiple sequence alignment viewer, editor, and phylogenetic tool. These tools were originally part of the AceDB genome database system but have been completely rewritten to make them generally available as a standalone package of greatly improved function.

Findings: Blixem is used by annotators to give a detailed view of the evidence for particular gene models. Blixem displays the gene model positions and the match sequences aligned against the genomic reference sequence. Annotators use this for many reasons, including to check the quality of an alignment, to find missing/misaligned sequence and to identify splice sites and polyA sites and signals. Dotter is used to give a dot-plot representation of a particular pairwise alignment. This is used to identify sequence that is not represented (or is misrepresented) and to quickly compare annotated gene models with transcriptional and protein evidence that putatively supports them. Belvu is used to analyse conservation patterns in multiple sequence alignments and to perform a combination of manual and automatic processing of the alignment. High-quality reference alignments are essential if they are to be used as a starting point for further automatic alignment generation.

Conclusions: While there are many different alignment tools available, the SeqTools package provides unique functionality that annotators have found to be essential for analysing sequence alignments as part of the manual annotation process.

Figures

Fig. 1
Blixem in protein mode. Blixem showing SwissProt alignments for human chromosome 4. The three-frame translation of the reference sequence is shown in yellow. The match sequence residues are highlighted in cyan for an exact match, blue for a conserved match and grey for a mismatch. Insertions are indicated by a vertical purple bar and deletions by a dot
Fig. 2
Dotter in nucleotide mode. Top left: the dot-plot. Top right: the grey-ramp tool, used for dynamically adjusting the stringency cut-offs. Bottom: the alignment tool, showing the sequence data at the current cross-hair position; both strands of the horizontal sequence are displayed, and residues are coloured according to how well they match
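
The idea of a dot-plot with an adjustable stringency cut-off can be sketched as follows. This is a simplified illustration in Python, not Dotter's own scoring scheme: it assumes plain identity counting over a fixed-length diagonal window, and the names `dot_matrix`, `apply_cutoff` and `render` are hypothetical.

```python
def dot_matrix(seq_h, seq_v, window=3):
    """Score each cell (i, j) as the number of identical residues in the
    diagonal window starting at seq_v[i] and seq_h[j].

    Assumption: simple identity counting; real tools may use a
    substitution matrix instead.
    """
    rows = len(seq_v) - window + 1
    cols = len(seq_h) - window + 1
    scores = [[0] * max(cols, 0) for _ in range(max(rows, 0))]
    for i in range(rows):
        for j in range(cols):
            scores[i][j] = sum(
                seq_v[i + k] == seq_h[j + k] for k in range(window)
            )
    return scores


def apply_cutoff(scores, cutoff):
    """Keep only the cells whose window score reaches the cut-off."""
    return [[score >= cutoff for score in row] for row in scores]


def render(dots):
    """Draw the thresholded matrix as text: '*' for a dot, '.' otherwise."""
    return "\n".join(
        "".join("*" if dot else "." for dot in row) for row in dots
    )


if __name__ == "__main__":
    scores = dot_matrix("ACGTACGT", "ACGTTCGT", window=3)
    # Raising the cut-off removes weaker diagonals; lowering it adds noise.
    print(render(apply_cutoff(scores, cutoff=3)))
```

Because the scores are computed once and the cut-off is applied separately, the threshold can be changed without rescoring, which is the property an interactive stringency control relies on.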
Fig. 3
Belvu colour-by-conservation mode. Multiple sequence alignment in Belvu with residues coloured by average BLOSUM62 similarity. Cyan indicates conservation of 3.0 or greater, blue 1.5, and grey 0.5. Colours and thresholds are user-configurable
Fig. 4
Belvu in colour-by-residue mode. Multiple sequence alignment in Belvu with residues coloured by residue type. Colours are user-configurable and thresholds can be specified to only colour residues with a given percent identity or higher
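
A percent-identity threshold of this kind can be illustrated with a short Python sketch. It is not Belvu's implementation. It assumes that identity is measured per column as the share of sequences carrying the most common residue, that gaps count in the denominator, and that only residues matching the most common one are coloured. The names `column_identity` and `colour_mask` are hypothetical.

```python
from collections import Counter

GAP_CHARS = "-."  # assumption: both characters denote gaps


def column_identity(column):
    """Return (most common residue, percent of rows carrying it).

    Gaps are ignored when picking the residue but still count in the
    denominator (an assumption made for this sketch).
    """
    residues = [c.upper() for c in column if c not in GAP_CHARS]
    if not residues:
        return None, 0.0
    residue, count = Counter(residues).most_common(1)[0]
    return residue, 100.0 * count / len(column)


def colour_mask(alignment, min_identity):
    """Mark the residues to colour: those equal to the column's most
    common residue, in columns that reach min_identity percent."""
    n_cols = len(alignment[0])
    mask = [[False] * n_cols for _ in alignment]
    for j in range(n_cols):
        column = [row[j] for row in alignment]
        residue, pct = column_identity(column)
        if residue is None or pct < min_identity:
            continue
        for i, c in enumerate(column):
            mask[i][j] = c.upper() == residue
    return mask


if __name__ == "__main__":
    alignment = ["ACD", "ACE", "A-F", "GCD"]
    for row, flags in zip(alignment, colour_mask(alignment, 60.0)):
        # Upper case marks a coloured residue, lower case an uncoloured one.
        print("".join(c if f else c.lower() for c, f in zip(row, flags)))
```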
Fig. 5
Belvu conservation profile. The maximum conservation for each column is plotted against the column number. The red line shows the average conservation, and a sliding-window of three has been applied for smoothing
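
The smoothing step described in this caption can be sketched as a centred moving average. The Python below is a minimal illustration under stated assumptions, not Belvu's code: the window is centred on each column and truncated at the ends of the alignment, and the function names `smooth` and `average` are hypothetical.

```python
def smooth(values, window=3):
    """Centred moving average over per-column conservation values.

    Assumption: at the alignment ends the window is truncated rather
    than padded, so edge columns average fewer values.
    """
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        chunk = values[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def average(values):
    """Mean conservation across all columns (the reference line)."""
    return sum(values) / len(values)


if __name__ == "__main__":
    profile = [3.0, 0.0, 3.0, 6.0]  # made-up per-column conservation values
    print(smooth(profile, window=3))
    print(average(profile))
```

A window of three damps single-column spikes while keeping conserved blocks visible; a wider window smooths more at the cost of resolution.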
Fig. 6
Belvu phylogenetic tree. Neighbour-joining tree using Scoredist correction
