`krust`

krust is a k-mer counter, a bioinformatics 101 tool for counting the frequency of substrings of length k within strings of DNA data. It's written in Rust and run from the command line. It takes a FASTA file of DNA sequences and outputs all canonical k-mers (the double helix means each k-mer has a reverse complement) and their frequency across all records in the given FASTA file.
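
To make "canonical k-mer" concrete, here is a minimal sketch using only the Rust standard library. It is not krust's actual code (krust builds on rust-bio, rayon, and dashmap). It assumes the common convention that the canonical form is the lexicographically smaller of a k-mer and its reverse complement, and it assumes that windows containing anything other than A, C, G, or T are skipped. The function names are hypothetical.

```rust
use std::collections::HashMap;

/// Reverse complement of a DNA sequence (A<->T, C<->G).
/// Any other byte (e.g. N) is passed through unchanged.
fn reverse_complement(seq: &[u8]) -> Vec<u8> {
    seq.iter()
        .rev()
        .map(|&b| match b {
            b'A' => b'T',
            b'C' => b'G',
            b'G' => b'C',
            b'T' => b'A',
            other => other,
        })
        .collect()
}

/// Assumed convention: the canonical form is the lexicographically
/// smaller of the k-mer and its reverse complement.
fn canonical(kmer: &[u8]) -> Vec<u8> {
    let rc = reverse_complement(kmer);
    if rc.as_slice() < kmer {
        rc
    } else {
        kmer.to_vec()
    }
}

/// Count canonical k-mers across all records.
/// Assumption: windows containing non-ACGT bytes are skipped.
fn count_canonical_kmers(records: &[&str], k: usize) -> HashMap<Vec<u8>, u64> {
    let mut counts = HashMap::new();
    if k == 0 {
        return counts; // `windows(0)` would panic
    }
    for record in records {
        let seq = record.to_ascii_uppercase().into_bytes();
        for window in seq.windows(k) {
            if window.iter().all(|b| matches!(b, b'A' | b'C' | b'G' | b'T')) {
                *counts.entry(canonical(window)).or_insert(0) += 1;
            }
        }
    }
    counts
}
```

For example, `count_canonical_kmers(&["ACGT"], 2)` counts `AC` twice, because `GT` is the reverse complement of `AC`, and counts `CG` once, because `CG` is its own reverse complement.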

Run krust on the test data* in the krust GitHub repo, searching for k-mers of length 5, like this:

cargo run --release 5 your/local/path/to/cerevisiae.pan.fa > output.tsv

or, searching for k-mers of length 21:

cargo run --release 21 your/local/path/to/cerevisiae.pan.fa > output.tsv

krust prints to stdout, writing each k-mer's frequency and sequence on alternating lines:

>{frequency}  
{canonical k-mer}
>{frequency}  
{canonical k-mer}  
...
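
The alternating-line format above is easy to produce and to read back. The following is a hedged sketch using only the standard library, not krust's own code. `write_counts` and `parse_counts` are hypothetical names, and the sorted `BTreeMap` order is an assumption made to keep the example deterministic. Nothing here implies that krust sorts its output.

```rust
use std::collections::BTreeMap;
use std::io::{self, Write};

/// Write each (k-mer, count) pair as two lines:
/// ">{frequency}" followed by the k-mer itself.
fn write_counts<W: Write>(out: &mut W, counts: &BTreeMap<String, u64>) -> io::Result<()> {
    for (kmer, freq) in counts {
        writeln!(out, ">{}", freq)?;
        writeln!(out, "{}", kmer)?;
    }
    Ok(())
}

/// Parse text in the same format back into (k-mer, frequency) pairs.
/// Blank lines and surrounding whitespace are ignored.
fn parse_counts(text: &str) -> Result<Vec<(String, u64)>, String> {
    let mut lines = text.lines().map(str::trim).filter(|l| !l.is_empty());
    let mut pairs = Vec::new();
    while let Some(header) = lines.next() {
        let freq = header
            .strip_prefix('>')
            .ok_or_else(|| format!("expected '>' header, got {header:?}"))?
            .trim()
            .parse::<u64>()
            .map_err(|e| format!("bad frequency in {header:?}: {e}"))?;
        let kmer = lines.next().ok_or("header without a k-mer line")?;
        pairs.push((kmer.to_string(), freq));
    }
    Ok(pairs)
}
```

Passing the contents of `output.tsv` to `parse_counts` would give a list of pairs that can be sorted or filtered by frequency, assuming the file follows the format shown above.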

krust uses the rust-bio, rayon, and dashmap Rust libraries.

*Unusual, yes, to provide this data in the repo, but it's helped me spread the word about what I'm doing.

suchapalaver/krust

Folders and files

Latest commit

History

Repository files navigation

krust

About

Topics

Resources

License

Stars

Watchers

Forks

Languages

`krust`