Releases: marbl/Mash
Mash v2.3
- Improvements to Screen function:
  - Up to 4x speed improvement due to Robin Hood hashing
  - Additional taxonomy-aware `taxscreen` command, for use with taxmash, due to Florian Breitwieser
- Sketching optimization due to Torsten Seemann
- Manpages due to Fabian Klötzl
Mash v2.2
Mash v2.1.1
- `screen` p-values fixed for amino acid queries
- Sketching optimizations
Mash v2.1
Features
- Triangular matrix - A new `triangle` command computes a lower triangular distance matrix in relaxed Phylip format. This streamlines all-pairs distance commands and avoids computational redundancy.
- Custom IDs - The `ID` and `Comment` fields of a sketch can now be set with `-I` and `-C`. Only applies to the first sketch for multi-sketch files.
- Read pooling - If multiple input files are given in read mode (`-r`), e.g. paired ends as in `mash sketch -r read1.fq read2.fq`, they will now pool to the same sketch, avoiding the need for concatenation.
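
As an illustration of consuming the lower triangular output, the following is a minimal Python sketch that expands such a matrix into a full symmetric one. It assumes the layout is a first line holding the number of entries, followed by one line per entry with the name and then tab-separated distances to all earlier entries. The function name `parse_lower_triangle` and the sample data are hypothetical, not taken from Mash.

```python
def parse_lower_triangle(text):
    """Expand a lower-triangular relaxed Phylip matrix into names and a full matrix.

    Assumed layout (not verified against any particular Mash version):
    line 1 is the entry count, each following line is
    name<TAB>d(row,0)<TAB>d(row,1)... with no diagonal value.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    count = int(lines[0].strip())
    if len(lines) - 1 < count:
        raise ValueError("fewer rows than the declared entry count")

    names = []
    matrix = [[0.0] * count for _ in range(count)]
    for row, line in enumerate(lines[1:count + 1]):
        fields = line.split("\t")
        names.append(fields[0])
        values = [float(v) for v in fields[1:]]
        if len(values) != row:
            raise ValueError(f"row {row} should hold {row} distances")
        for col, dist in enumerate(values):
            # Mirror each value so callers can index in either order.
            matrix[row][col] = dist
            matrix[col][row] = dist
    return names, matrix


# Made-up sample data for illustration only.
example = "\t3\ngenomeA\ngenomeB\t0.02\ngenomeC\t0.05\t0.04\n"
names, matrix = parse_lower_triangle(example)
```

Storing only the lower triangle halves the output size, and mirroring on load restores convenient random access.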
Mash v2.0
Mash's first major version increment focuses on a new top-level command, `screen`, which estimates containment within (rather than distance to) a read set for many sketches simultaneously.
Features
- Screen - A new command that estimates how well sketches are contained within a set of reads.
- Hash seed parameter - The seed of the hash function can now be set with `-S`. Note that if it is changed from the default (42), any sketches created will not work with older versions of Mash (they will appear to old versions to have no sketches, causing an error or empty output).
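
The idea behind containment and the hash seed can be sketched in a few lines of Python. This is a simplified illustration, not Mash's implementation: it uses `hashlib.blake2b` as a stand-in hash with the seed passed as a salt, ignores reverse complements, and the helper names (`kmer_hashes`, `sketch`, `containment`) and sequences are hypothetical.

```python
import hashlib


def kmer_hashes(seq, k, seed=42):
    """Hash every k-mer of seq to a 64-bit integer, mixing in the seed."""
    salt = seed.to_bytes(8, "little")  # stand-in for a seeded hash function
    hashes = set()
    for i in range(len(seq) - k + 1):
        digest = hashlib.blake2b(
            seq[i:i + k].encode(), digest_size=8, salt=salt
        ).digest()
        hashes.add(int.from_bytes(digest, "little"))
    return hashes


def sketch(seq, k, size, seed=42):
    """Bottom-s sketch: the `size` smallest k-mer hashes of the sequence."""
    return sorted(kmer_hashes(seq, k, seed))[:size]


def containment(reference_sketch, read_hashes):
    """Fraction of the reference sketch's hashes that also occur in the reads."""
    if not reference_sketch:
        return 0.0
    shared = sum(1 for h in reference_sketch if h in read_hashes)
    return shared / len(reference_sketch)


# Made-up toy data: two overlapping reads that together cover the genome.
genome = "ACGTTGCAAGGCTTAACCGGATCGATCG"
unrelated = "TTTTTTTTTTTTTTT"
reads = [genome[:20], genome[10:]]
k = 5

read_hashes = set()
for read in reads:
    read_hashes |= kmer_hashes(read, k)

present = containment(sketch(genome, k, 10), read_hashes)
absent = containment(sketch(unrelated, k, 10), read_hashes)

# A different seed yields different hash values, so the sketches are not comparable.
same_genome_other_seed = sketch(genome, k, 10, seed=7)
```

Because sketches built with different seeds share no meaningful hashes, comparing them would report no similarity even for identical sequences, which is why a non-default seed must be used consistently.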
Fixes
Mash v1.1.1
Features
- JSON sketch dumps - Sketches can now be converted to text in JSON format
for interoperability with other tools. Metadata, such as k-mer and hash function
information, are included with the hashes themselves, which are represented as
unsigned integers.
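
As a rough example of the interoperability this enables, the Python sketch below reads a JSON dump and compares two sketches. The field names (`kmer`, `sketches`, `name`, `hashes`) are assumptions about the dump layout and should be checked against real output. The Jaccard computation is a plain set ratio rather than Mash's own estimator, and the sample document is made up.

```python
import json
import math


def load_sketches(json_text):
    """Return (k, {name: set_of_hashes}) from a JSON sketch dump.

    Field names are assumed, not confirmed against a specific Mash version.
    """
    doc = json.loads(json_text)
    return doc["kmer"], {s["name"]: set(s["hashes"]) for s in doc["sketches"]}


def jaccard(a, b):
    """Plain set Jaccard index of two hash sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mash_distance(j, k):
    """Mash distance D = -(1/k) * ln(2j / (1 + j)); defined as 1 when j is 0."""
    if j == 0:
        return 1.0
    return max(0.0, -math.log(2 * j / (1 + j)) / k)


# Made-up dump with tiny sketches, for illustration only.
dump = json.dumps({
    "kmer": 21,
    "sketches": [
        {"name": "a.fna", "hashes": [1, 2, 3, 4]},
        {"name": "b.fna", "hashes": [3, 4, 5, 6]},
    ],
})

k, sketches = load_sketches(dump)
j = jaccard(sketches["a.fna"], sketches["b.fna"])
d = mash_distance(j, k)
```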
Fixes
- Fix for stdin sketch input (Issue #32)
Mash v1.1
Features
- Read sketching
  - Minimum k-mer copy number (`-m`) for more precise and flexible filtering than Bloom filter.
  - Genome size and coverage estimation for improved p-values and optional termination at sufficient coverage (`-c`).
- Parallelism
  - Parallel sketching (`-p`), if more than one sketch is being created.
  - Parallel distance for all comparisons, not just multiple files.
- Alphabets
  - Amino acid (`-a`) and arbitrary alphabets (`-z`).
  - Case sensitivity option (`-Z`), which allows lowercase masking.
- Information
  - A new `bounds` command for printing expected accuracy for various parameters and distances.
  - K-mer copy number histogram available from `mash info` (`-c`) for sketches made with this version.
  - Tabular mode (`-t`) and more header information (`-H`) for `mash info`.
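
The minimum k-mer copy number filter (`-m`) listed under Read sketching can be pictured with the following Python sketch. It is an illustration of the concept only: it counts k-mers exactly with a `Counter`, which is not necessarily how Mash does it. The function name `filtered_kmers` and the reads are hypothetical.

```python
from collections import Counter


def filtered_kmers(reads, k, min_copies):
    """Keep only k-mers seen at least `min_copies` times across all reads.

    K-mers that occur once are often sequencing errors, so requiring
    repeated observations removes most of them before sketching.
    """
    counts = Counter(
        read[i:i + k] for read in reads for i in range(len(read) - k + 1)
    )
    return {kmer for kmer, n in counts.items() if n >= min_copies}


# Made-up reads: the last base of the second read simulates an error.
reads = ["ACGTAC", "ACGTAG", "TTGTAC"]
kept = filtered_kmers(reads, k=4, min_copies=2)
unfiltered = filtered_kmers(reads, k=4, min_copies=1)
```

Unlike a Bloom filter, which can only answer whether a k-mer was probably seen before, an explicit copy-number threshold lets the cutoff be raised for deeper coverage.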
Fixes
Mash v1.0.2
- Fix for false k-mer size warnings (Issue #17)
Mash v1.0.1
- Fix for a buffer overflow that causes a "stack smashing" error with recent GCC versions (Issue #15)