I am working on a metagenomics pipeline and would like to compare my taxonomic predictions to a known (simulated) control, along with the results from various other metagenomics pipelines. I enjoyed reading the CAMI comparative metagenomics paper and its profiling of various taxonomic identification methods and projects. I understand that I can download the CAMI datasets, but I don't think I can obtain the answer key to score my own output.
Therefore, I am looking for a simulated or very well-defined metagenomics dataset on which I can attempt unguided metagenomics profiling and compare my results to the known key. I would also like to find the results of other metagenomics methods on the same dataset, so that I can score mine against them. Is there such a resource available, such as a public database? CAMI seems to be exactly this, but unless I'm missing something, the key isn't available even though their phase 1 competition is over and published.
Thanks for any suggestions.
Hi there, I see that your post is over 4 years old, but I am at exactly the same point now as you were then. I'm about to start testing some metagenomic profiling workflows with CAMISIM and the workflow they described, but I'm curious whether you found any datasets like the ones you describe for testing workflows. I'm especially interested in datasets of nanopore reads.