A practical comparison of de novo genome assembly software tools for next-generation sequencing technologies
- PMID: 21423806
- PMCID: PMC3056720
- DOI: 10.1371/journal.pone.0017915
Abstract
The advent of next-generation sequencing technologies has been accompanied by the development of many whole-genome sequence assembly methods and software tools, especially for de novo fragment assembly. Because little is known about the applicability and performance of these tools, choosing a suitable assembler is a difficult task. Here, we provide information on the adaptability of each program and, above all, compare the performance of eight distinct tools on eight groups of simulated datasets from the Solexa sequencing platform. Considering computational time, maximum random access memory (RAM) occupancy, assembly accuracy, and integrity, our study indicates that string-based assemblers and overlap-layout-consensus (OLC) assemblers are well suited to very short reads and longer reads of small genomes, respectively. For large datasets of more than a hundred million short reads, De Bruijn graph-based assemblers would be more appropriate. In terms of software implementation, string-based assemblers are superior to graph-based ones; among the latter, creating the configuration file for SOAPdenovo is complex. Our comparison study will assist researchers in selecting a well-suited assembler and offers essential information for the improvement of existing assemblers or the development of novel ones.