Unicycler: Resolving bacterial genome assemblies from short and long sequencing reads
- PMID: 28594827
- PMCID: PMC5481147
- DOI: 10.1371/journal.pcbi.1005595
Abstract
The Illumina DNA sequencing platform generates accurate but short reads, which can be used to produce accurate but fragmented genome assemblies. Pacific Biosciences and Oxford Nanopore Technologies DNA sequencing platforms generate long reads that can produce complete genome assemblies, but the sequencing is more expensive and error-prone. There is significant interest in combining data from these complementary sequencing technologies to generate more accurate "hybrid" assemblies. However, few tools exist that truly leverage the benefits of both types of data, namely the accuracy of short reads and the structural resolving power of long reads. Here we present Unicycler, a new tool for assembling bacterial genomes from a combination of short and long reads, which produces assemblies that are accurate, complete and cost-effective. Unicycler builds an initial assembly graph from short reads using the de novo assembler SPAdes and then simplifies the graph using information from short and long reads. Unicycler uses a novel semi-global aligner to align long reads to the assembly graph. Tests on both synthetic and real reads show Unicycler can assemble larger contigs with fewer misassemblies than other hybrid assemblers, even when long-read depth and accuracy are low. Unicycler is open source (GPLv3) and available at github.com/rrwick/Unicycler.
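To illustrate what "semi-global" means in the alignment step mentioned above, here is a minimal sketch of a textbook ends-free alignment between two linear sequences. It is not Unicycler's aligner, which works against an assembly graph and is not described in detail in this abstract. The function name `semi_global_score` and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for illustration.

```python
def semi_global_score(read, ref, match=1, mismatch=-1, gap=-1):
    """Best ends-free (semi-global) alignment score of read against ref.

    Illustrative only: scoring values are assumed, and both inputs are
    plain strings rather than paths through an assembly graph.
    """
    rows, cols = len(read) + 1, len(ref) + 1
    # Row 0 and column 0 stay at zero, so an unaligned overhang at the
    # start of either sequence costs nothing.
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if read[i - 1] == ref[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + step,  # align read[i-1] with ref[j-1]
                score[i - 1][j] + gap,       # gap in ref
                score[i][j - 1] + gap,       # gap in read
            )
    # Taking the maximum over the last row and last column makes an
    # overhang at the end of either sequence free as well.
    return max(max(score[-1]), max(row[-1] for row in score))


# A read contained in the reference scores one point per matched base.
print(semi_global_score("ACGT", "TTACGTTT"))   # 4
# A read that runs off the start of the reference is not penalised
# for the overhanging bases.
print(semi_global_score("GGGACG", "ACGTTT"))   # 3
```

The free end gaps are the point of the sketch: a long read that extends past the end of a contig can still align along the part that overlaps, which a fully global alignment would penalise.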
Conflict of interest statement
The authors have declared that no competing interests exist.