While playing around with gapseq and BacArena, I noticed that I was unable to replicate some results that I previously got while doing community simulations. After some troubleshooting, I found the culprit: newly generated models from the same input genome, but using a different gapseq version, looked very different from the ones generated previously.
At first, I suspected that it might have to do with changes in the gapseq code/databases, or perhaps that there were bugs related to running on different operating systems or CPU architectures. To assess whether this was the case, I gathered FASTA files for three organisms: Syntrophaceticus schinkii, Candidatus Syntrophopropionicum ammoniitolerans, and Acetomicrobium mobile. For all three, I downloaded the genomes in both nucleotide and protein FASTA formats. For the Candidatus species, I also downloaded the cds_from_genomic file, which should be equivalent to the nucleotide FASTA for this purpose.
I then generated models from these input files using three gapseq versions: from March 16 2023 (5a8e985) (HEAD), from October 30 2022 (f715432) (oct22), and from January 4 2023 (e2f3209) (jan23). I ran the script both on Mac/ARM64 and on Linux/x86_64. Draft models were gapfilled on ALLmed, MM_glu, and MM_anaerobic_Acetate media (media definitions were taken from the most current gapseq version).
In total, 126 models were generated (7 input files × 3 gapseq versions × 2 platforms × 3 media). Although most generated models were similar, as would be expected, there were several outliers. As far as I could see, there were no systematic effects of OS, gapseq version, or FASTA type.
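For anyone who wants to check the count, here is a minimal sketch that enumerates the combinations described above. The file-type labels and the `design_matrix` helper are illustrative names of my own, not anything from gapseq.

```python
from itertools import product

# Input files per organism as described in the post.
# The file-type labels are illustrative, not actual file names.
inputs = {
    "Syntrophaceticus schinkii": ["nucleotide", "protein"],
    "Candidatus Syntrophopropionicum ammoniitolerans": [
        "nucleotide", "protein", "cds_from_genomic",
    ],
    "Acetomicrobium mobile": ["nucleotide", "protein"],
}
versions = ["HEAD", "oct22", "jan23"]
platforms = ["Mac/ARM64", "Linux/x86_64"]
media = ["ALLmed", "MM_glu", "MM_anaerobic_Acetate"]


def design_matrix():
    """Return one tuple per model: (organism, file type, version, platform, medium)."""
    files = [(org, ftype) for org, ftypes in inputs.items() for ftype in ftypes]
    return [
        (org, ftype, version, platform, medium)
        for (org, ftype), version, platform, medium
        in product(files, versions, platforms, media)
    ]


runs = design_matrix()
print(len(runs))  # 7 files x 3 versions x 2 platforms x 3 media = 126
```

Listing the runs this way also makes it easy to tabulate outliers against each factor later on.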
Here is an overview graph illustrating the presence of outliers:
And here is a close-up view of S. schinkii, illustrating the apparent randomness in their occurrence:
S. schinkii is an acetate oxidizer, producing CO2 and H2 from acetate. This was predicted in the high-growth models. In all of the outlier models, the organism instead produces hexanoate, with only some CO2 and H2.
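One simple way to flag such outliers is to compare each model's predicted growth rate with the median of its group. The sketch below assumes the growth rates have already been collected into a dictionary; `flag_low_growth`, the 0.5 cutoff, and the example values are hypothetical placeholders, not results from the runs above.

```python
from statistics import median


def flag_low_growth(rates, fraction=0.5):
    """Return the run labels whose growth rate is below `fraction` of the group median."""
    if not rates:
        return []
    cutoff = fraction * median(rates.values())
    return sorted(label for label, rate in rates.items() if rate < cutoff)


# Placeholder values for illustration only, not measured results.
example = {"run_a": 1.00, "run_b": 0.98, "run_c": 1.02, "run_d": 0.30}
print(flag_low_growth(example))  # ['run_d']
```

The median is used rather than the mean so that a handful of low-growth models does not drag the reference value down.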
Is this type of variability expected? And do you have any suggestions on how to deal with or reduce the variability in the reconstruction process?