The Subread aligner: fast, accurate and scalable read mapping by seed-and-vote
- PMID: 23558742
- PMCID: PMC3664803
- DOI: 10.1093/nar/gkt214
Abstract
Read alignment is an ongoing challenge for the analysis of data from sequencing technologies. This article proposes an elegantly simple multi-seed strategy, called seed-and-vote, for mapping reads to a reference genome. The new strategy chooses the mapped genomic location for the read directly from the seeds. It uses a relatively large number of short seeds (called subreads) extracted from each read and allows all the seeds to vote on the optimal location. When the read length is <160 bp, overlapping subreads are used. More conventional alignment algorithms are then used to fill in detailed mismatch and indel information between the subreads that make up the winning voting block. The strategy is fast because the overall genomic location has already been chosen before the detailed alignment is done. It is sensitive because no individual subread is required to map exactly, nor are individual subreads constrained to map close by other subreads. It is accurate because the final location must be supported by several different subreads. The strategy extends easily to find exon junctions, by locating reads that contain sets of subreads mapping to different exons of the same gene. It scales up efficiently for longer reads.
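The voting step described above can be illustrated with a minimal sketch. This is a toy model under stated assumptions, not the Subread implementation: the function names, the subread length `k`, the number of subreads `n_subreads` and the threshold `min_votes` are all hypothetical choices made for illustration. Each subread is looked up by exact match in a hash index, and each hit votes for the read start position it implies. Indel-tolerant voting and the detailed alignment that follows the vote are omitted.

```python
from collections import Counter, defaultdict


def build_index(genome, k):
    """Map every k-mer in the genome to the list of positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(genome) - k + 1):
        index[genome[pos:pos + k]].append(pos)
    return index


def seed_and_vote(read, index, k, n_subreads=10, min_votes=3):
    """Return (read_start, votes) for the best-supported location, or None.

    All parameters are illustrative assumptions, not values from the article.
    """
    last = len(read) - k
    if last < 0:
        return None  # read is shorter than one subread
    # Spread subread start offsets evenly over the read.
    # On short reads the subreads overlap.
    if n_subreads > 1:
        offsets = sorted({round(i * last / (n_subreads - 1))
                          for i in range(n_subreads)})
    else:
        offsets = [0]
    votes = Counter()
    for off in offsets:
        # Every exact hit of this subread votes for the read start it implies.
        for pos in index.get(read[off:off + k], ()):
            votes[pos - off] += 1
    if not votes:
        return None
    start, count = votes.most_common(1)[0]
    # Accept the location only if several different subreads support it.
    return (start, count) if count >= min_votes else None
```

In this sketch a read with a single mismatch still maps, because only the subreads that cover the mismatch fail to vote for the true location while the remaining subreads agree on it. That corresponds to the abstract's point that no individual subread is required to map.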