SeqPurge: highly-sensitive adapter trimming for paired-end NGS data
- PMID: 27161244
- PMCID: PMC4862148
- DOI: 10.1186/s12859-016-1069-7
Abstract
Background: Trimming of adapter sequences from short read data is a common preprocessing step during NGS data analysis. When performing paired-end sequencing, the overlap between the forward and reverse reads can be used to identify excess adapter sequences. This is exploited by several previously published adapter trimming tools. However, our evaluation on amplicon-based data shows that most current tools are unable to remove all adapter sequences and that adapter contamination may even lead to spurious variant calls.
Results: Here we present SeqPurge (https://github.com/imgag/ngs-bits), a highly-sensitive adapter trimmer that uses a probabilistic approach to detect the overlap between forward and reverse reads of Illumina sequencing data. SeqPurge can detect very short adapter sequences, even if only one base long. Compared to other adapter trimmers specifically designed for paired-end data, we found that SeqPurge achieves a higher sensitivity. The number of remaining adapter bases after trimming is reduced by up to 90%, depending on the compared tool. In simulations with different error rates, we found that SeqPurge is also the most error-tolerant adapter trimmer in the comparison.
Conclusion: SeqPurge achieves a very high sensitivity and a high error-tolerance, combined with a specificity and runtime that are comparable to other state-of-the-art adapter trimmers. The very good adapter trimming performance, complemented with additional features such as quality-based trimming and basic quality control, makes SeqPurge an excellent choice for the pre-processing of paired-end NGS data.
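The overlap-based detection described in the Results can be illustrated with a small sketch. This is not SeqPurge's implementation: the abstract does not specify its probabilistic model, so the sketch assumes a simple binomial chance-match model (each base matches by chance with probability 0.25). The function names, the thresholds `max_p` and `min_identity`, and the example sequences are all hypothetical choices made for illustration.

```python
from math import comb

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def chance_match_probability(matches: int, length: int, p_match: float = 0.25) -> float:
    """Binomial tail: P(at least `matches` of `length` bases agree by chance)."""
    return sum(
        comb(length, k) * p_match**k * (1 - p_match) ** (length - k)
        for k in range(matches, length + 1)
    )


def find_insert_length(fwd: str, rev: str, max_p: float = 1e-6, min_identity: float = 0.8):
    """Return the insert length if the reads run past the insert, else None.

    Assumed model (hypothetical, for illustration only): for an insert of
    length n shorter than the reads, fwd[:n] equals the reverse complement
    of rev[:n], and everything after position n is adapter sequence.
    """
    best_length, best_p = None, max_p
    for n in range(1, min(len(fwd), len(rev))):
        mate = reverse_complement(rev[:n])
        matches = sum(a == b for a, b in zip(fwd[:n], mate))
        if matches < min_identity * n:
            continue  # too many mismatches to be a plausible overlap
        p = chance_match_probability(matches, n)
        if p <= best_p:  # keep the least chance-like candidate
            best_length, best_p = n, p
    return best_length


# Made-up 20-base insert followed by 10 adapter bases in each read.
insert = "ACGTTGCATGCCATGACTGA"
adapter = "AGATCGGAAG"  # illustrative adapter prefix
fwd = insert + adapter
rev = reverse_complement(insert) + adapter

n = find_insert_length(fwd, rev)  # 20
print(n, fwd[:n], rev[:n])  # reads trimmed back to the insert
```

Because the insert boundary is located from the read overlap rather than by matching the adapter sequence itself, a sketch of this kind can locate adapter remnants as short as a single base, and the identity threshold leaves room for sequencing errors inside the overlap.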
Similar articles
- Sequence-matching adapter trimmers generate consistent quality and assembly metrics for Illumina sequencing of RNA viruses. BMC Res Notes. 2024 Oct 14;17(1):308. doi: 10.1186/s13104-024-06951-0. PMID: 39402647. Free PMC article.
- Skewer: a fast and accurate adapter trimmer for next-generation sequencing paired-end reads. BMC Bioinformatics. 2014 Jun 12;15:182. doi: 10.1186/1471-2105-15-182. PMID: 24925680. Free PMC article.
- Atropos: specific, sensitive, and speedy trimming of sequencing reads. PeerJ. 2017 Aug 30;5:e3720. doi: 10.7717/peerj.3720. eCollection 2017. PMID: 28875074. Free PMC article.
- An Efficient Trimming Algorithm based on Multi-Feature Fusion Scoring Model for NGS Data. IEEE/ACM Trans Comput Biol Bioinform. 2020 May-Jun;17(3):728-738. doi: 10.1109/TCBB.2019.2897558. Epub 2019 Feb 5. PMID: 30736001.
- PEAT: an intelligent and efficient paired-end sequencing adapter trimming algorithm. BMC Bioinformatics. 2015;16 Suppl 1(Suppl 1):S2. doi: 10.1186/1471-2105-16-S1-S2. Epub 2015 Jan 21. PMID: 25707528. Free PMC article.
Cited by
- Sequence-matching adapter trimmers generate consistent quality and assembly metrics for Illumina sequencing of RNA viruses. BMC Res Notes. 2024 Oct 14;17(1):308. doi: 10.1186/s13104-024-06951-0. PMID: 39402647. Free PMC article.
- Tumour-informed liquid biopsies to monitor advanced melanoma patients under immune checkpoint inhibition. Nat Commun. 2024 Oct 9;15(1):8750. doi: 10.1038/s41467-024-52923-0. PMID: 39384805. Free PMC article.
- Characterizing the allele-specific gene expression landscape in high hyperdiploid acute lymphoblastic leukemia with BASE. Sci Rep. 2024 Oct 5;14(1):23181. doi: 10.1038/s41598-024-73743-8. PMID: 39369032. Free PMC article.
- Genetic diversity accelerates canine distemper virus adaptation to ferrets. J Virol. 2024 Aug 20;98(8):e0065724. doi: 10.1128/jvi.00657-24. Epub 2024 Jul 15. PMID: 39007615.
- Developing a platform for secretion of biomolecules in Mycoplasma feriruminatoris. Microb Cell Fact. 2024 Apr 30;23(1):124. doi: 10.1186/s12934-024-02392-3. PMID: 38689251. Free PMC article.