How to format a FASTA file (based on an ASV table) for clustering in Swarm?
2.9 years ago
Ellen ▴ 20

Dear all,

After a lot of Googling without success, I decided to post my question on this forum.

I have run the dada2 pipeline in R on my HTS data from 24 samples to obtain an ASV table, which contains the read abundance of each ASV per sample. Now I want to cluster these ASVs using Swarm (Mahé et al. 2021, Mahé et al. 2014), but I am wondering how to properly convert the ASV table created with dada2 into a FASTA file for clustering with Swarm. I managed to obtain a FASTA file whose headers contain the ASV name and its total read abundance (summed over all samples), but of course this file does not contain any per-sample information (ASV read abundance per sample).

[Screenshot: FASTA file]
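For readers following along, here is a minimal Python sketch of how such a FASTA file could be produced. It is an illustration under stated assumptions, not the method used in the post (which was done in R): the function name `write_swarm_fasta`, the ASV names, the sequences, and the counts are all hypothetical, and the header format `>name_abundance` assumes Swarm's default underscore-separated abundance annotation. Sorting by decreasing abundance is only a convention chosen here.

```python
import io


def write_swarm_fasta(sequences, counts, handle):
    """Write one FASTA record per ASV with a 'name_abundance' header.

    sequences: dict mapping ASV name -> nucleotide sequence (hypothetical data)
    counts:    dict mapping ASV name -> {sample name: read count}
    handle:    any writable text file object
    """
    # Total abundance of each ASV, summed over all samples.
    totals = {name: sum(counts[name].values()) for name in sequences}
    # Most abundant ASVs first; ties are broken by name for reproducibility.
    for name in sorted(totals, key=lambda n: (-totals[n], n)):
        if totals[name] == 0:
            continue  # skip ASVs with no reads at all
        handle.write(f">{name}_{totals[name]}\n{sequences[name]}\n")


# Toy example with two ASVs and two samples (made-up values).
sequences = {"ASV1": "ACGTACGT", "ASV2": "ACGTACGA"}
counts = {
    "ASV1": {"sample1": 10, "sample2": 5},
    "ASV2": {"sample1": 0, "sample2": 30},
}

buffer = io.StringIO()
write_swarm_fasta(sequences, counts, buffer)
fasta_text = buffer.getvalue()
print(fasta_text)
```

The per-sample counts are deliberately left out of the FASTA file; they stay in the ASV table and can be joined back to the clusters by ASV name after clustering.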

How can I solve this issue? Should I prepare a separate FASTA file per sample (I don't think so, based on Frédéric Mahé's pipeline: https://github.com/frederic-mahe/swarm/wiki/Fred's-metabarcoding-pipeline)? Or should I "break down" all ASVs per sample, i.e. ASV1_sample1, ASV1_sample2, ..., ASV1_sample24? But then it seems quite tedious to end up with an OTU table containing the read abundance of each cluster (= OTU) per sample.
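One possible way around both options, sketched below in Python, is to cluster a single FASTA file with total abundances and then rebuild the per-sample OTU table afterwards by looking each cluster member up in the original ASV table. This is only a sketch under assumptions: it assumes Swarm's default output, in which each line is one cluster listing its amplicon identifiers separated by spaces with the seed first, and it assumes headers of the form `name_abundance`. The function name `otu_table_from_swarm` and all data are hypothetical.

```python
from collections import defaultdict


def otu_table_from_swarm(swarm_lines, counts):
    """Sum per-sample ASV counts within each Swarm cluster.

    swarm_lines: iterable of cluster lines, e.g. "ASV2_30 ASV1_15"
    counts:      dict mapping ASV name -> {sample name: read count}
    Returns a dict mapping the seed ASV name of each cluster (OTU)
    to its {sample name: summed read count}.
    """
    otu_table = {}
    for line in swarm_lines:
        members = line.split()
        if not members:
            continue  # ignore blank lines
        # Drop the trailing "_abundance" annotation to recover ASV names.
        names = [member.rsplit("_", 1)[0] for member in members]
        per_sample = defaultdict(int)
        for name in names:
            for sample, n_reads in counts[name].items():
                per_sample[sample] += n_reads
        otu_table[names[0]] = dict(per_sample)  # first member is the seed
    return otu_table


# Toy ASV table (made-up values) and two made-up cluster lines.
counts = {
    "ASV1": {"sample1": 10, "sample2": 5},
    "ASV2": {"sample1": 0, "sample2": 30},
    "ASV3": {"sample1": 4, "sample2": 0},
}
swarm_lines = ["ASV2_30 ASV1_15", "ASV3_4"]

otu_table = otu_table_from_swarm(swarm_lines, counts)
for otu, per_sample in otu_table.items():
    print(otu, per_sample)
```

With this approach no per-sample FASTA files or per-sample header names are needed, since the sample information never leaves the ASV table.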

Note that I have practically zero scripting experience, so perhaps this is really straightforward for those who do.

Any advice would be greatly appreciated! Note that I have also contacted the developer of Swarm. If I get a faster answer directly from him, I will post it here to help others!

Thanks, Ellen

dada2 SWARM clustering ASV • 1.2k views