Quickly search, compare, and analyze genomic and metagenomic data sets.
Usage:
sourmash sketch dna *.fq.gz
sourmash compare *.sig -o distances.cmp -k 31
sourmash plot distances.cmp
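To give a feel for what an all-by-all comparison at a given k-mer size produces, here is a minimal conceptual sketch in plain Python. It is not sourmash's code: sourmash works on compressed sketches (signatures) rather than full k-mer sets, and the helper names `kmers`, `jaccard`, and `compare_all` are hypothetical, chosen only for this illustration. The sketch computes exact Jaccard similarity between the k-mer sets of a few toy sequences and collects the results into a symmetric matrix.

```python
from itertools import combinations


def kmers(seq, k):
    """Return the set of all length-k substrings of seq (uppercased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity: shared items divided by total distinct items."""
    if not a and not b:
        return 1.0  # convention chosen for this sketch: two empty sets match
    return len(a & b) / len(a | b)


def compare_all(seqs, k):
    """Build a symmetric all-by-all similarity matrix.

    seqs maps a name to a sequence string. Returns (names, matrix), where
    matrix is a dict keyed by (name, name) pairs.
    """
    names = sorted(seqs)
    sets = {name: kmers(seqs[name], k) for name in names}
    matrix = {(name, name): 1.0 for name in names}
    for x, y in combinations(names, 2):
        matrix[(x, y)] = matrix[(y, x)] = jaccard(sets[x], sets[y])
    return names, matrix


if __name__ == "__main__":
    # Toy sequences and a small k, for illustration only.
    toy = {"a": "ACGTACGTAC", "b": "ACGTACGTTT", "c": "GGGGCCCCAA"}
    names, matrix = compare_all(toy, k=4)
    for x in names:
        print(x, " ".join(f"{matrix[(x, y)]:.2f}" for y in names))
```

With real data the k-mer sets are far too large to compare exactly, which is why the commands above sketch the input files first and compare the resulting signatures.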
sourmash 1.0 is published on JOSS; please cite that paper if you use sourmash (doi: 10.21105/joss.00027).
The latest major release is sourmash v4, which has several command-line and Python incompatibilities with previous versions. Please visit our migration guide to upgrade!
The name is a riff off of Mash, combined with @ctb's love of whiskey. (Sour mash is used in making whiskey.)
Primary authors: C. Titus Brown (@ctb) and Luiz C. Irber, Jr (@luizirber).
sourmash was initially developed by the Lab for Data-Intensive Biology at the UC Davis School of Veterinary Medicine, and now includes contributions from the global research and developer community.
We recommend using bioconda to install sourmash:
conda install -c conda-forge -c bioconda sourmash
This will install the latest stable version of sourmash 4.
You can also use pip to install sourmash:
pip install sourmash
A quickstart tutorial is available.
sourmash runs under Python 3.7 and later. The base requirements are screed, cffi, numpy, matplotlib, and scipy. Conda (see below) will install everything necessary, and is our recommended installation method.
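If you are unsure whether an existing environment meets these requirements, a short check like the following may help. This is a hedged sketch, not part of sourmash: the function name `check_environment` is hypothetical, and it assumes each requirement is importable under the same name as the package listed above. It only tests whether the modules can be found, not which versions are installed.

```python
import sys
from importlib.util import find_spec

# Values taken from the requirements stated above.
MIN_PYTHON = (3, 7)
BASE_REQUIREMENTS = ["screed", "cffi", "numpy", "matplotlib", "scipy"]


def check_environment(modules=BASE_REQUIREMENTS, min_python=MIN_PYTHON):
    """Return (python_ok, missing).

    python_ok is True when the running interpreter is at least min_python;
    missing lists the module names that cannot be found for import.
    """
    python_ok = sys.version_info[:2] >= tuple(min_python)
    missing = [name for name in modules if find_spec(name) is None]
    return python_ok, missing


if __name__ == "__main__":
    python_ok, missing = check_environment()
    print("Python version OK:", python_ok)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

An empty "missing" list means the base requirements are importable; installing with conda or pip as described here should take care of any gaps.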
Bioconda is a channel for the conda package manager with a focus on bioinformatics software. After installing conda, you can install sourmash by running:
$ conda create -n sourmash_env -c conda-forge -c bioconda sourmash python=3.7
$ conda activate sourmash_env
$ sourmash --help
which will install the latest released version.
For questions, please open an issue on GitHub, or ask in our chat.
Development happens on GitHub at sourmash-bio/sourmash.
sourmash is developed in Python and Rust, and you will need a Rust environment to build it; see the developer notes for our suggested development setup.
After installation, sourmash is the main command-line entry point; you can also run it with python -m sourmash, or run pip install -e /path/to/repo to do a developer install in a virtual environment.
The sourmash/ directory contains the Python library and command-line interface code.
The src/core/ directory contains the Rust library implementing core functionality.
Tests require py.test and can be run with make test.
Please see the developer notes for more information on getting set up with a development environment.
Please note that this repository is participating in a study into sustainability of open source projects. Data will be gathered about this repository for approximately the next 12 months, starting from 2021-06-11.
Data collected will include number of contributors, number of PRs, time taken to close/merge these PRs, and issues closed.
For more information, please visit our informational page or download our participant information sheet.
CTB Feb 2021