k-mer API
Teo Lemane edited this page Jul 29, 2021
kmtricks exposes a 2-bit representation of k-mers using the encoding A=0, C=1, T=2, G=3. The Kmer class supports any k-mer size and has specializations for short k-mers:
- Kmer<32> uses uint64_t
- Kmer<64> uses __uint128_t (if available)
- Kmer<K> (any larger K) uses uint64_t[(K+31)/32]
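The encoding and storage sizes listed above can be illustrated with a minimal sketch. The helpers `encode_nt`, `decode_nt`, `pack_kmer` and `words_for` are hypothetical names written for this page and are not part of kmtricks. The bit layout (first nucleotide in the most significant used bits) is an assumption and may differ from the library's internal layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper, not part of kmtricks: maps an uppercase nucleotide
// to its 2-bit code with the convention A=0, C=1, T=2, G=3.
// Bits 1 and 2 of the ASCII codes of 'A', 'C', 'T', 'G' are exactly 0, 1, 2, 3.
inline std::uint8_t encode_nt(char c)
{
  return static_cast<std::uint8_t>((c >> 1) & 3);
}

// Hypothetical helper: inverse of encode_nt.
inline char decode_nt(std::uint8_t v)
{
  return "ACTG"[v & 3];
}

// Packs a k-mer of at most 32 nucleotides into one 64-bit word.
// Assumed layout: the first nucleotide ends up in the highest used bits.
inline std::uint64_t pack_kmer(const std::string& s)
{
  assert(s.size() <= 32);
  std::uint64_t word = 0;
  for (char c : s)
    word = (word << 2) | encode_nt(c);
  return word;
}

// Number of 64-bit words needed to store k nucleotides at 2 bits each.
constexpr std::size_t words_for(std::size_t k)
{
  return (k + 31) / 32;
}
```

Since each 64-bit word holds 32 nucleotides, `words_for(33)` is 2, which matches the `uint64_t[(K+31)/32]` storage of the general case.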
kmtricks also provides hash utilities and a runtime implementation selector for working with these k-mers.
Warning: k-mers of different sizes that share the same specialization cannot be used at the same time, because the k-mer size is stored in a static variable.
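The reason behind this warning can be pictured with a toy type. `ToyKmer` below is a hypothetical stand-in and not the kmtricks class. It only assumes what the warning states: the size is a static variable, so it is shared by all objects of one specialization.

```cpp
#include <cstddef>

// Hypothetical stand-in for a k-mer type, not the kmtricks implementation.
// The k-mer size is a static member, so there is one copy per specialization.
template <std::size_t MAX_K>
struct ToyKmer
{
  static std::size_t s_size;

  // Constructing a k-mer overwrites the size shared by the whole specialization.
  explicit ToyKmer(std::size_t k) { s_size = k; }

  std::size_t length() const { return s_size; }
};

// Out-of-class definition of the static member for every specialization.
template <std::size_t MAX_K>
std::size_t ToyKmer<MAX_K>::s_size = 0;
```

In this sketch, creating a `ToyKmer<32>` of size 12 and then one of size 20 makes both report a length of 20, while a `ToyKmer<64>` keeps its own independent size.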
#include <cstdint>
#include <fstream>
#include <iostream>
#include <kmtricks/public.hpp>

using namespace km;

int main(int argc, char* argv[])
{
  Kmer<32> kmer("ACGTACGTACGT");
  Kmer<32> kmer2("TACTACTACTAC");

  // k-mer operations
  Kmer<32> rev = kmer.rev_comp();
  Kmer<32> cano = kmer.canonical();

  // comparisons
  bool a = kmer == kmer2;
  bool b = kmer != kmer2;
  bool c = kmer < kmer2;
  bool d = kmer > kmer2;

  // string representation
  std::cout << kmer.to_string() << std::endl;
  std::cout << kmer.to_bit_string() << std::endl;

  // io: each stream is scoped so that the file is closed before it is reopened
  {
    std::ofstream out("kmer_file", std::ios::out | std::ios::binary);
    kmer.dump(out);
  }
  {
    std::ifstream in("kmer_file", std::ios::in | std::ios::binary);
    Kmer<32> loaded(12); // kmer_size
    loaded.load(in);
  }

  // access
  const uint64_t* kmer_data = kmer.get_data64();
  const uint8_t* kmer_data8 = kmer.get_data8();
  uint8_t value = kmer.at2bit(0); // 0 (A)
  char nt = kmer.at(0); // 'A'

  // Hash
  // KmerHashers<0> uses the folly hash, KmerHashers<1> uses xxHash.
  // For xxHash, add #define WITH_XXHASH before including kmtricks and link against xxHash.
  using HType = KmerHashers<0>::Hasher<32>;
  HType hasher;
  uint64_t hash = hasher(kmer);
}
Kmer also supports many comparison, arithmetic, bitwise and assignment operators; see kmer.hpp for the full list. A standalone implementation is also provided in kmercpp.
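To give an intuition for what rev_comp() and canonical() compute under the A=0, C=1, T=2, G=3 encoding, here is a minimal sketch on a single 64-bit word. The functions `rev_comp_2bit` and `canonical_2bit` are hypothetical and are not the kmtricks implementation. The sketch assumes that the first nucleotide sits in the highest used bits and that the canonical form is the smaller of a k-mer and its reverse complement when compared as integers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical sketch, not the kmtricks implementation.
// kmer holds k nucleotides (1 <= k <= 32) at 2 bits each.
inline std::uint64_t rev_comp_2bit(std::uint64_t kmer, unsigned k)
{
  assert(k >= 1 && k <= 32);
  std::uint64_t rc = 0;
  for (unsigned i = 0; i < k; ++i)
  {
    // With A=0, C=1, T=2, G=3, complementing swaps A<->T and C<->G,
    // which is an XOR with binary 10. Reading from the low end and
    // appending to rc reverses the order at the same time.
    rc = (rc << 2) | ((kmer & 3) ^ 2);
    kmer >>= 2;
  }
  return rc;
}

// Assumed definition of canonical: the smaller of the two strands.
inline std::uint64_t canonical_2bit(std::uint64_t kmer, unsigned k)
{
  return std::min(kmer, rev_comp_2bit(kmer, k));
}
```

One convenient property of this encoding is that complementing a nucleotide is a single XOR, so no lookup table is needed.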