
Implement Rank/Select #14

Open
daniel-j-h opened this issue Jul 21, 2021 · 8 comments

Comments

daniel-j-h (Member) commented Jul 21, 2021

We should look into implementing rank/select structures for the bitset / Elias-Fano work in #13.

For example, have a look at

"Space-efficient, high-performance rank and select structures on uncompressed bit sequences" by Zhou, Andersen, and Kaminsky

which seems simple and practical.

cc @ucyo

daniel-j-h (Member Author) commented:
I have added the building blocks for succinct rank/select data structures:

```c
/*
 * Efficient bits and broadword manipulation.
 *
 * We need this for our succinct rank/select
 * implementations, mostly.
 */
TINYGRAPH_WARN_UNUSED
uint32_t tinygraph_bits_count(uint64_t v);

TINYGRAPH_WARN_UNUSED
uint32_t tinygraph_bits_find(uint64_t v, uint32_t n);

TINYGRAPH_WARN_UNUSED
uint32_t tinygraph_bits_leading0(uint64_t v);

TINYGRAPH_WARN_UNUSED
uint32_t tinygraph_bits_trailing0(uint64_t v);
```

The implementation uses the BMI2 instruction set on modern processors for uint64 rank/select.

From here we can implement e.g. the poppy approach from the paper above, or a different one.

daniel-j-h (Member Author) commented Jul 29, 2021

See also the paper describing https://github.com/ot/succinct, and the thesis that goes along with it.

daniel-j-h (Member Author) commented:
The paper https://arxiv.org/abs/1706.00990 describes the machine word select we are using for succinct data structures:

```c
#ifdef TINYGRAPH_HAS_BMI2
uint32_t tinygraph_bits_select(uint64_t v, uint32_t n) {
  TINYGRAPH_STATIC_ASSERT(sizeof(uint64_t) == sizeof(unsigned long long));
  TINYGRAPH_ASSERT(n < tinygraph_bits_count(v));

  return tinygraph_bits_trailing0_u64(_pdep_u64(UINT64_C(1) << n, v));
}
#else // TINYGRAPH_HAS_BMI2
```
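Without BMI2 the same select can be computed portably by scanning the word; a minimal sketch of such a fallback (not necessarily what the project's `#else` branch actually does) could be:

```c
#include <stdint.h>

/* Portable fallback: return the position of the n-th (0-based) set
 * bit in v by scanning bit by bit. A sketch only; the project's
 * real non-BMI2 implementation may differ. */
static uint32_t bits_select_portable(uint64_t v, uint32_t n) {
  for (uint32_t i = 0; i < 64; ++i) {
    if (v & (UINT64_C(1) << i)) {
      if (n == 0) {
        return i;
      }
      n -= 1;
    }
  }
  return 64; /* n >= popcount(v): no such bit */
}
```

This is O(64) per query instead of a handful of instructions, but it has no instruction set requirements and is easy to verify against the `_pdep_u64` version in tests.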

daniel-j-h (Member Author) commented:
The paper http://www.cs.cmu.edu/~dga/papers/zhou-sea2013.pdf - termed "poppy" - from

"Space-efficient, high-performance rank and select structures on uncompressed bit sequences" by Zhou, Andersen, and Kaminsky

sounds very practical keeping in mind the realities of hardware architectures and cache hierarchies; they

  • get down to 3.2% space overhead for rank
  • get down to 0.39% space overhead for select

More importantly, they also provide a basic rank sketch that is simple to implement, has 12.5% space overhead, and allows us to iterate easily in the future. The first iteration builds up a single index; the second iteration improves on this to get down to 3.2% space overhead for rank.

Their main insights (for rank, since that's what I have read so far) are as follows:

  • cache misses matter far more than optimizing instruction counts
  • a processor cache line is 64 bytes nowadays; use 64-byte "basic blocks"
  • the popcount instruction is the fastest way to count bits in these basic blocks
  • build up an index storing the cumulative sums for each basic block

This means we create an index and store one 64-bit uint for every 64 bytes (= 512 bits) in the original bit vector.

At rank query time we combine the pre-computed index with a popcount over the remaining bits in the last basic block.
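The basic 12.5% sketch could be put into code roughly like this (hypothetical helper names, not tinygraph's API; one cumulative 64-bit count per 512-bit basic block):

```c
#include <stdint.h>
#include <stdlib.h>

/* Build cumulative popcounts: index[i] = number of set bits in basic
 * blocks 0..i-1. One 64-bit entry per 512-bit (64-byte) basic block
 * gives the 12.5% space overhead of the basic rank sketch. */
static uint64_t *rank_index_build(const uint64_t *bits, size_t nblocks) {
  uint64_t *index = malloc(nblocks * sizeof(uint64_t));
  if (!index) {
    return NULL;
  }
  uint64_t sum = 0;
  for (size_t i = 0; i < nblocks; ++i) {
    index[i] = sum;
    for (size_t w = 0; w < 8; ++w) { /* 8 x 64-bit words = 512 bits */
      sum += (uint64_t)__builtin_popcountll(bits[i * 8 + w]);
    }
  }
  return index;
}

/* rank(n): number of set bits in positions [0, n). Combines the
 * pre-computed index with popcounts over the last basic block. */
static uint64_t rank_query(const uint64_t *bits, const uint64_t *index,
                           uint64_t n) {
  const uint64_t block = n / 512;
  const uint64_t word = n / 64;
  uint64_t sum = index[block];
  for (uint64_t w = block * 8; w < word; ++w) {
    sum += (uint64_t)__builtin_popcountll(bits[w]);
  }
  const uint64_t rem = n % 64;
  if (rem) {
    sum += (uint64_t)__builtin_popcountll(bits[word] << (64 - rem));
  }
  return sum;
}
```

Iterating towards the full poppy layout then means replacing the flat cumulative array with its three-level (upper/lower/basic block) counters, without changing the query interface.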

daniel-j-h (Member Author) commented:
Adding here: we already provide the machine word popcount and rank functionality in the bits module

```c
uint32_t tinygraph_bits_count(uint64_t v) {
  TINYGRAPH_STATIC_ASSERT(sizeof(uint64_t) == sizeof(unsigned long long));

  return __builtin_popcountll(v);
}

uint32_t tinygraph_bits_rank(uint64_t v, uint32_t n) {
  TINYGRAPH_ASSERT(n <= 64);

  /* guard n == 0: shifting a uint64_t by 64 is undefined behavior */
  return n == 0 ? 0 : tinygraph_bits_count(v << (64 - n));
}
```

and memory alignment to a 64-byte cache line we can simply get with posix_memalign(&p, 64, size).
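For example, a cache-line aligned allocation for the bit vector could look like this (a sketch with hypothetical names; POSIX only):

```c
#include <stdint.h>
#include <stdlib.h>

/* Allocate nblocks 64-byte basic blocks, aligned to a 64-byte cache
 * line so every 512-bit basic block spans exactly one cache line. */
static uint64_t *bitvector_alloc(size_t nblocks) {
  void *p = NULL;

  /* posix_memalign: alignment must be a power of two multiple of
   * sizeof(void *); 64 qualifies on all common platforms. */
  if (posix_memalign(&p, 64, nblocks * 64) != 0) {
    return NULL;
  }

  return (uint64_t *)p;
}
```

Memory from posix_memalign is released with plain free(). C11's aligned_alloc(64, size) would be a portable alternative, with the caveat that it requires size to be a multiple of the alignment, which the 64-byte block granularity here satisfies anyway.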

daniel-j-h (Member Author) commented:
Since opening this issue there has been research improving on the cs-poppy approach we wanted to start with here.

An improvement on poppy, called "pasta" (see "pasta-flat" in the paper):

Engineering Compact Data Structures for Rank and Select Queries on Bit Vectors

> In practice, the smallest (uncompressed) rank and select data structure cs-poppy has a space overhead of ≈ 3.51 % [Zhou et al., SEA '13]. Using the same overhead, we present a data structure that can answer queries up to 8 % (rank) and 16.5 % (select) faster compared with cs-poppy.

(screen grabs from the paper: pasta1, pasta2)

An improvement on poppy and pasta altogether:

SPIDER: Improved Succinct Rank and Select Performance

> However, rank and select query performance still incurs a tradeoff between query time and space. For example, Vigna [27] gives a data structure for rank queries using 25% space that is roughly 19% faster than pasta-flat, and a data structure for select queries using 12.2% space, which is roughly 65% faster than pasta-flat.

and a few more tricks.

(screen grab from the paper: spider1)

daniel-j-h (Member Author) commented:
I have a first version ready in #41.
