Hello all,
I downloaded and installed Kraken2 locally on WSL Ubuntu. After installing, I built a standard database with the command:
kraken2-build --standard --db databasename
Here is the output:
Downloading nucleotide gb accession to taxon map... done.
Downloading nucleotide wgs accession to taxon map... done.
Downloaded accession to taxon map(s)
Downloading taxonomy tree data... done.
Uncompressing taxonomy data... done.
Untarring taxonomy tree data... done.
Step 1/2: Performing rsync file transfer of requested files
Rsync file transfer complete.
Step 2/2: Assigning taxonomic IDs to sequences
Processed 572 projects (1049 sequences, 1.64 Gbp)... done.
All files processed, cleaning up extra sequence files... done, library complete.
Masking low-complexity regions of downloaded library... done.
Step 1/2: Performing rsync file transfer of requested files
Rsync file transfer complete.
Step 2/2: Assigning taxonomic IDs to sequences
Processed 41215 projects (97145 sequences, 172.10 Gbp)... done.
All files processed, cleaning up extra sequence files... done, library complete.
Masking low-complexity regions of downloaded library... done.
Step 1/2: Performing rsync file transfer of requested files
Rsync file transfer complete.
Step 2/2: Assigning taxonomic IDs to sequences
Processed 14972 projects (18639 sequences, 549.88 Mbp)... done.
All files processed, cleaning up extra sequence files... done, library complete.
Masking low-complexity regions of downloaded library... done.
Downloading plasmid files from FTP...xargs: warning: options --max-args and --replace/-I/-i are mutually exclusive, ignoring previous --max-args value
xargs: warning: options --max-args and --replace/-I/-i are mutually exclusive, ignoring previous --max-args value
done.
Masking low-complexity regions of downloaded library... done.
mv: replace 'assembly_summary.txt', overriding mode 0555 (r-xr-xr-x)?
Note the question at the end. This took all night to run, so I don't want to accidentally overwrite the main 172 Gbp of 16S sequences that have already been properly processed and complexity-masked. What is this prompt asking, and what should I answer? Also, is there anything I should know before trying to determine taxonomy via 16S on my assembled fastqs of shotgun metagenome data?
Thank you so much.