5 weeks ago
chemokine-1
Is it possible to use SPAdes to assemble a specific bacterial strain from stool sequencing data, and what steps or modifications to the standard pipeline are required to optimize strain-level assembly?
Check an old answer. The thing is: if your strain of interest does not differ much from the published genomic sequence, simply mapping the reads (against multiple bacterial genomes) may be good enough. But if it does differ, e.g. it has large insertions that are not present in the published genome of that strain, it is hard, close to impossible, to enrich for the reads derived from such inserts before assembly.
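To make the mapping idea concrete, here is a minimal sketch in Python that tallies how many reads map to each reference sequence in a SAM file. It assumes you have already aligned the reads against a combined set of candidate genomes with a mapper of your choice; the function name and the file name in the usage note are hypothetical. Note that the counts are per reference sequence (contig or replicon), so multi-contig genomes would need their counts summed.

```python
from collections import Counter


def count_mapped_reads(sam_lines):
    """Count primary mapped alignments per reference sequence name.

    Takes an iterable of SAM text lines (header lines included).
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # SAM header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # not a valid alignment record
        flag = int(fields[1])
        # 0x4 = unmapped, 0x100 = secondary, 0x800 = supplementary
        if flag & 0x4 or flag & 0x100 or flag & 0x800:
            continue
        counts[fields[2]] += 1  # column 3 is the reference name (RNAME)
    return counts
```

Usage would look something like `with open("reads_vs_refs.sam") as fh: print(count_mapped_reads(fh).most_common(5))`. The reference that collects the most reads is only a rough hint at the closest published genome, since raw counts are not normalized for genome length.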
I think I wasn't clear enough in my question. The stool contains many bacteria, and I am interested in a new strain that may be present among them. This new strain shouldn't be very different from the bacterium it derives from (I believe). I think that for sequencing data from a single (monoclonal) strain, SPAdes, QUAST, and Prokka should be enough. How can I differentiate the new strain from well-known genomes?
Do you have any a priori ideas about where differences in the genome might arise between the strains, or is this a fishing expedition?
It sounds like you'll need to do some annotating with tools like Kraken2 or QIIME 2, and then pull out your species of interest from the rest of the metagenomic community. After that, you can check for variants and how they segregate, if you have enough data left over. Hard to say without more information about the data you have.
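The "pull out your species of interest" step could be sketched roughly as below. This assumes Kraken2's standard per-read output (tab-separated: C/U status, read ID, taxonomy ID, length, k-mer mapping) and plain four-line FASTQ records. The function names are hypothetical, and the sketch only matches the exact taxids you pass in, so strain-level taxids below your species would have to be listed explicitly.

```python
import re

# With --use-names, Kraken2 writes the third column as "Name (taxid N)".
_TAXID_RE = re.compile(r"\(taxid (\d+)\)\s*$")


def parse_taxid(field):
    """Return the integer taxid from a Kraken2 taxonomy column."""
    field = field.strip()
    match = _TAXID_RE.search(field)
    return int(match.group(1)) if match else int(field)


def read_ids_for_taxa(kraken_lines, wanted_taxids):
    """Collect IDs of classified reads assigned to any of the wanted taxids."""
    ids = set()
    for line in kraken_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or fields[0] != "C":
            continue  # skip malformed lines and unclassified reads
        if parse_taxid(fields[2]) in wanted_taxids:
            ids.add(fields[1])
    return ids


def filter_fastq(fastq_lines, keep_ids):
    """Yield the lines of four-line FASTQ records whose read ID is in keep_ids."""
    record = []
    for line in fastq_lines:
        record.append(line)
        if len(record) == 4:
            read_id = record[0][1:].split()[0]
            # Assumption: mates may carry a /1 or /2 suffix that the
            # classifier output does not.
            if read_id.endswith(("/1", "/2")):
                read_id = read_id[:-2]
            if read_id in keep_ids:
                yield from record
            record = []
```

For example, passing `{562}` (the NCBI taxid for Escherichia coli) as `wanted_taxids` keeps reads assigned at the species level. The filtered reads could then go into SPAdes, with the caveat that reads from novel insertions may not be classified to your species at all and would be lost at this step.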