index: Verify the block filter hash when reading the filter from disk. #24832

kcalvinalvin · 2022-04-12T10:38:26Z

This PR picks up the abandoned #19280

BlockFilterIndex was depending on GolombRiceDecode() during the filter decode to sanity check that the filter wasn't corrupt. However, we can check for corruption by ensuring that the encoded blockfilter's hash matches up with the one stored in the index database.
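
The idea can be sketched outside of Bitcoin Core as follows. This is a minimal illustration, not the PR's code: `PlaceholderHash` and `VerifyEncodedFilter` are hypothetical names, and FNV-1a stands in for the double-SHA256 (`CHash256`) that the index actually uses, only so that the snippet compiles without extra dependencies.

```cpp
#include <cstdint>
#include <vector>

// Placeholder for the real hash. Bitcoin Core uses CHash256 (double SHA-256);
// FNV-1a is used here purely to keep the sketch dependency-free.
inline std::uint64_t PlaceholderHash(const std::vector<unsigned char>& data)
{
    std::uint64_t h = 14695981039346656037ULL; // FNV-1a 64-bit offset basis
    for (unsigned char b : data) {
        h ^= b;
        h *= 1099511628211ULL; // FNV-1a 64-bit prime
    }
    return h;
}

// Returns true when the bytes read back from disk hash to the value that was
// recorded when the filter was written. No decoding of the filter is needed.
inline bool VerifyEncodedFilter(const std::vector<unsigned char>& encoded,
                                std::uint64_t stored_hash)
{
    return PlaceholderHash(encoded) == stored_hash;
}
```

The check is a single pass over the encoded bytes, whereas the decode-based check has to walk every Golomb-Rice coded element, which is presumably why the benchmarks below differ so much.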

Benchmarks that were added in #19280 showed that checking the hash is much faster.

The benchmarks were ported to nanobench. The relevant results are below and show a clear win for the hash-check method.

|             ns/elem |              elem/s |    err% |        ins/elem |       bra/elem |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|---------------:|--------:|----------:|:----------
|              531.40 |        1,881,819.43 |    0.3% |        3,527.01 |         411.00 |    0.2% |      0.01 | `DecodeCheckedGCSFilter`
|          258,220.50 |            3,872.66 |    0.1% |    2,990,092.00 |     586,706.00 |    1.7% |      0.01 | `DecodeGCSFilter`
|           13,036.77 |           76,706.09 |    0.3% |       64,238.24 |         513.04 |    0.2% |      0.01 | `BlockFilterGetHash`

jonatack

Concept ACK

A few suggestions regarding filter_checked

diff --git a/src/blockfilter.cpp b/src/blockfilter.cpp
index 59f34bc54e..0ab89fdbdb 100644
--- a/src/blockfilter.cpp
+++ b/src/blockfilter.cpp
@@ -47,7 +47,7 @@ GCSFilter::GCSFilter(const Params& params)
     : m_params(params), m_N(0), m_F(0), m_encoded{0}
 {}
 
-GCSFilter::GCSFilter(const Params& params, std::vector<unsigned char> encoded_filter, const bool filter_checked)
+GCSFilter::GCSFilter(const Params& params, std::vector<unsigned char> encoded_filter, bool filter_checked)
     : m_params(params), m_encoded(std::move(encoded_filter))
 {
     SpanReader stream{GCS_SER_TYPE, GCS_SER_VERSION, m_encoded};
@@ -221,7 +221,7 @@ static GCSFilter::ElementSet BasicFilterElements(const CBlock& block,
 }
 
 BlockFilter::BlockFilter(BlockFilterType filter_type, const uint256& block_hash,
-                         std::vector<unsigned char> filter, const bool filter_checked)
+                         std::vector<unsigned char> filter, bool filter_checked)
     : m_filter_type(filter_type), m_block_hash(block_hash)
 {
     GCSFilter::Params params;
diff --git a/src/blockfilter.h b/src/blockfilter.h
index af054dbf34..5de5679834 100644
--- a/src/blockfilter.h
+++ b/src/blockfilter.h
@@ -59,7 +59,7 @@ public:
     explicit GCSFilter(const Params& params = Params());
 
     /** Reconstructs an already-created filter from an encoding. */
-    GCSFilter(const Params& params, std::vector<unsigned char> encoded_filter, const bool filter_checked=false);
+    GCSFilter(const Params& params, std::vector<unsigned char> encoded_filter, bool filter_checked = false);
 
     /** Builds a new filter from the params and set of elements. */
     GCSFilter(const Params& params, const ElementSet& elements);
@@ -122,7 +122,7 @@ public:
 
     //! Reconstruct a BlockFilter from parts.
     BlockFilter(BlockFilterType filter_type, const uint256& block_hash,
-                std::vector<unsigned char> filter, const bool filter_checked=false);
+                std::vector<unsigned char> filter, bool filter_checked = false);
 
     //! Construct a new BlockFilter of the specified type from a block.
     BlockFilter(BlockFilterType filter_type, const CBlock& block, const CBlockUndo& block_undo);
diff --git a/src/index/blockfilterindex.cpp b/src/index/blockfilterindex.cpp
index bad766ebe1..0a4728f571 100644
--- a/src/index/blockfilterindex.cpp
+++ b/src/index/blockfilterindex.cpp
@@ -159,7 +159,7 @@ bool BlockFilterIndex::ReadFilterFromDisk(const FlatFilePos& pos, BlockFilter& f
         uint256 result;
         CHash256().Write(encoded_filter).Finalize(result);
         if (result != hash) return error("Checksum mismatch in filter decode.");
-        filter = BlockFilter(GetFilterType(), block_hash, std::move(encoded_filter), true);
+        filter = BlockFilter(GetFilterType(), block_hash, std::move(encoded_filter), /*filter_checked=*/true);
     }

jonatack · 2022-04-12T11:26:03Z

src/blockfilter.cpp

    // Verify that the encoded filter contains exactly N elements. If it has too much or too little
    // data, a std::ios_base::failure exception will be raised.
-    BitStreamReader<SpanReader> bitreader{stream};
+    BitStreamReader<SpanReader> bitreader(stream);


No need to change this.

kcalvinalvin · 2022-04-12T12:27:44Z

A few suggestions regarding filter_checked

I've applied all the suggested changes :)

jonatack

Approach ACK, a few iterative further comments on second look.

jonatack · 2022-04-12T15:36:52Z

src/index/blockfilterindex.h

@@ -31,7 +31,7 @@ class BlockFilterIndex final : public BaseIndex
    FlatFilePos m_next_filter_pos;
    std::unique_ptr<FlatFileSeq> m_filter_fileseq;

-    bool ReadFilterFromDisk(const FlatFilePos& pos, BlockFilter& filter) const;
+    bool ReadFilterFromDisk(const FlatFilePos& pos, BlockFilter& filter, const uint256& hash) const;


It might be nice to order the ReadFilterFromDisk() arguments with in-params first, then the out-param. There are only four lines to change of the ones touched in this pull.

Suggested change

bool ReadFilterFromDisk(const FlatFilePos& pos, BlockFilter& filter, const uint256& hash) const;

bool ReadFilterFromDisk(const FlatFilePos& pos, const uint256& hash, BlockFilter& filter) const;

jonatack · 2022-04-12T16:39:56Z

src/bench/gcs_filter.cpp

+    });
+}
+
+static void BlockFilterGetHash(benchmark::Bench& bench)


It would be handy to name all of these benchmarks with a common prefix to group them together in the output and that is easy to filter for, i.e. ./src/bench/bench_bitcoin -filter=EvictionProtection*.*

All the GCSFilter benchmarks are now changed to have the prefix GCSFilter.

jonatack · 2022-04-12T16:41:18Z

src/bench/gcs_filter.cpp

+        GCSFilter::Element element(32);
+        element[0] = static_cast<unsigned char>(i);
+        element[1] = static_cast<unsigned char>(i >> 8);
+        elements.insert(std::move(element));


Optional: it looks like there is a fair amount of duplicate boilerplate in each of these benchmarks that you could extract out to a common setup helper in a separate commit, if you like. See EvictionProtectionCommon() in src/bench/peer_eviction.cpp for an example.

Since most of the duplicate code is just generating the elements, how's separating that out to a function like so:

static const GCSFilter::ElementSet GenerateGCSTestElements()
{
    GCSFilter::ElementSet elements;
    for (int i = 0; i < 10000; ++i) {
        GCSFilter::Element element(32);
        element[0] = static_cast<unsigned char>(i);
        element[1] = static_cast<unsigned char>(i >> 8);
        elements.insert(std::move(element));
    }
    return elements;
}

And having each of the benchmarks call that function?

I've applied the change I suggested above.

jonatack · 2022-04-12T16:42:34Z

src/blockfilter.h

@@ -59,7 +59,7 @@ class GCSFilter
    explicit GCSFilter(const Params& params = Params());

    /** Reconstructs an already-created filter from an encoding. */
-    GCSFilter(const Params& params, std::vector<unsigned char> encoded_filter);
+    GCSFilter(const Params& params, std::vector<unsigned char> encoded_filter, bool filter_checked=false);


Clang format for each of these defaults

Suggested change

GCSFilter(const Params& params, std::vector<unsigned char> encoded_filter, bool filter_checked=false);

GCSFilter(const Params& params, std::vector<unsigned char> encoded_filter, bool filter_checked = false);

No longer needed as the default value was removed

jonatack · 2022-04-12T16:53:00Z

src/bench/gcs_filter.cpp

+    auto encoded = filter.GetEncoded();
+
+    bench.unit("elem").run([&] {
+        GCSFilter filter({0, 0, BASIC_FILTER_P, BASIC_FILTER_M}, encoded, true);


Suggested change

GCSFilter filter({0, 0, BASIC_FILTER_P, BASIC_FILTER_M}, encoded, true);

GCSFilter filter({0, 0, BASIC_FILTER_P, BASIC_FILTER_M}, encoded, /*filter_checked=*/true);

mzumsande

Concept ACK

As far as I can see, the sanity check that this PR makes optional is now only invoked from BlockFilter::Unserialize(), which is used only in test code: as a full node, Bitcoin Core has no use case for accepting unchecked serialized block filters over p2p; it only sends self-generated ones to peers.
So I wonder whether it might make sense to just remove the old sanity check instead of making it optional?

jonatack · 2022-04-13T14:06:28Z

@mzumsande good point, and I think it also highlights that it may alternatively be better to drop the default false value for filter_checked here as follows to make that more explicit. Then (here or in a follow-up), if BlockFilter::Unserialize() is removed or moved to the test code, the sanity check here could be removed.

drop default false filter_checked value

diff --git a/src/bench/gcs_filter.cpp b/src/bench/gcs_filter.cpp
index 89fa07f602..56f3cff0c1 100644
--- a/src/bench/gcs_filter.cpp
+++ b/src/bench/gcs_filter.cpp
@@ -52,7 +52,7 @@ static void DecodeGCSFilter(benchmark::Bench& bench)
     auto encoded = filter.GetEncoded();
 
     bench.unit("elem").run([&] {
-        GCSFilter filter({0, 0, BASIC_FILTER_P, BASIC_FILTER_M}, encoded);
+        GCSFilter filter({0, 0, BASIC_FILTER_P, BASIC_FILTER_M}, encoded, /*filter_checked=*/false);
     });
 }
 
@@ -66,7 +66,7 @@ static void BlockFilterGetHash(benchmark::Bench& bench)
         elements.insert(std::move(element));
     }
     GCSFilter filter({0, 0, BASIC_FILTER_P, BASIC_FILTER_M}, elements);
-    BlockFilter block_filter(BlockFilterType::BASIC, {}, filter.GetEncoded());
+    BlockFilter block_filter(BlockFilterType::BASIC, {}, filter.GetEncoded(), /*filter_checked=*/false);
 
     bench.unit("elem").run([&] {
         block_filter.GetHash();
@@ -86,7 +86,7 @@ static void DecodeCheckedGCSFilter(benchmark::Bench& bench)
     auto encoded = filter.GetEncoded();
 
     bench.unit("elem").run([&] {
-        GCSFilter filter({0, 0, BASIC_FILTER_P, BASIC_FILTER_M}, encoded, true);
+        GCSFilter filter({0, 0, BASIC_FILTER_P, BASIC_FILTER_M}, encoded, /*filter_checked=*/true);
     });
 }
 BENCHMARK(BlockFilterGetHash);
diff --git a/src/blockfilter.h b/src/blockfilter.h
index 21d6a295a2..f823354989 100644
--- a/src/blockfilter.h
+++ b/src/blockfilter.h
@@ -59,7 +59,7 @@ public:
     explicit GCSFilter(const Params& params = Params());
 
     /** Reconstructs an already-created filter from an encoding. */
-    GCSFilter(const Params& params, std::vector<unsigned char> encoded_filter, bool filter_checked=false);
+    GCSFilter(const Params& params, std::vector<unsigned char> encoded_filter, bool filter_checked);
 
     /** Builds a new filter from the params and set of elements. */
     GCSFilter(const Params& params, const ElementSet& elements);
@@ -122,7 +122,7 @@ public:
 
     //! Reconstruct a BlockFilter from parts.
     BlockFilter(BlockFilterType filter_type, const uint256& block_hash,
-                std::vector<unsigned char> filter, bool filter_checked=false);
+                std::vector<unsigned char> filter, bool filter_checked);
 
     //! Construct a new BlockFilter of the specified type from a block.
     BlockFilter(BlockFilterType filter_type, const CBlock& block, const CBlockUndo& block_undo);
@@ -164,7 +164,7 @@ public:
         if (!BuildParams(params)) {
             throw std::ios_base::failure("unknown filter_type");
         }
-        m_filter = GCSFilter(params, std::move(encoded_filter));
+        m_filter = GCSFilter(params, std::move(encoded_filter), /*filter_checked=*/false);
     }
 };
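
The effect of dropping the default can be seen in a small standalone sketch. Everything here is hypothetical and reduced for illustration: `Filter` stands in for GCSFilter, the decode check itself is elided, and `FromVerifiedBytes` / `FromUntrustedBytes` are made-up callers.

```cpp
#include <utility>
#include <vector>

// Hypothetical stand-in for GCSFilter, reduced to the constructor under discussion.
class Filter
{
public:
    // No default value for filter_checked: every caller has to state its intent.
    Filter(std::vector<unsigned char> encoded, bool filter_checked)
        : m_encoded(std::move(encoded)), m_decode_check_ran(!filter_checked)
    {
        // The real constructor would run the full decode check here
        // whenever filter_checked is false; it is elided in this sketch.
    }

    bool DecodeCheckRan() const { return m_decode_check_ran; }

private:
    std::vector<unsigned char> m_encoded;
    bool m_decode_check_ran;
};

// Caller that has already verified the bytes, e.g. through a hash comparison.
inline Filter FromVerifiedBytes(std::vector<unsigned char> encoded)
{
    return Filter(std::move(encoded), /*filter_checked=*/true);
}

// Caller that handles bytes of unknown integrity.
inline Filter FromUntrustedBytes(std::vector<unsigned char> encoded)
{
    return Filter(std::move(encoded), /*filter_checked=*/false);
}
```

With the default removed, a call such as `Filter(bytes)` no longer compiles, so a call site cannot silently pick one behaviour or the other.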

kcalvinalvin · 2022-04-13T15:42:34Z

Sure I can just get rid of the option. I'll move BlockFilter::Unserialize() to the test files in a follow-up.

Quick side-question: my initial thought was that it'd be helpful to leave an option in there in case anyone is using the code as a library. In general, does Bitcoin Core development think about potential use cases of the code being used as a library somewhere else?

jonatack · 2022-04-13T19:38:12Z

@kcalvinalvin maybe decided case-by-case; I don't recall a recent specific example of leave-it-in-for-reuse-as-a-library being invoked as a rationale for not moving code used only for tests out to the test code.

Verified that removing BlockFilter::Unserialize() compiles with the following test-only deletions:

code that depends on BlockFilter::Unserialize()

diff --git a/src/blockfilter.h b/src/blockfilter.h
index 21d6a295a2..764395c370 100644
--- a/src/blockfilter.h
+++ b/src/blockfilter.h
@@ -148,24 +148,6 @@ public:
           << m_block_hash
           << m_filter.GetEncoded();
     }
-
-    template <typename Stream>
-    void Unserialize(Stream& s) {
-        std::vector<unsigned char> encoded_filter;
-        uint8_t filter_type;
-
-        s >> filter_type
-          >> m_block_hash
-          >> encoded_filter;
-
-        m_filter_type = static_cast<BlockFilterType>(filter_type);
-
-        GCSFilter::Params params;
-        if (!BuildParams(params)) {
-            throw std::ios_base::failure("unknown filter_type");
-        }
-        m_filter = GCSFilter(params, std::move(encoded_filter));
-    }
 };
 
 #endif // BITCOIN_BLOCKFILTER_H
diff --git a/src/test/blockfilter_tests.cpp b/src/test/blockfilter_tests.cpp
index 8eb4dbc592..b99d9a32c3 100644
--- a/src/test/blockfilter_tests.cpp
+++ b/src/test/blockfilter_tests.cpp
@@ -106,23 +106,6 @@ BOOST_AUTO_TEST_CASE(blockfilter_basic_test)
     for (const CScript& script : excluded_scripts) {
         BOOST_CHECK(!filter.Match(GCSFilter::Element(script.begin(), script.end())));
     }
-
-    // Test serialization/unserialization.
-    BlockFilter block_filter2;
-
-    CDataStream stream(SER_NETWORK, PROTOCOL_VERSION);
-    stream << block_filter;
-    stream >> block_filter2;
-
-    BOOST_CHECK_EQUAL(block_filter.GetFilterType(), block_filter2.GetFilterType());
-    BOOST_CHECK_EQUAL(block_filter.GetBlockHash(), block_filter2.GetBlockHash());
-    BOOST_CHECK(block_filter.GetEncodedFilter() == block_filter2.GetEncodedFilter());
-
-    BlockFilter default_ctor_block_filter_1;
-    BlockFilter default_ctor_block_filter_2;
-    BOOST_CHECK_EQUAL(default_ctor_block_filter_1.GetFilterType(), default_ctor_block_filter_2.GetFilterType());
-    BOOST_CHECK_EQUAL(default_ctor_block_filter_1.GetBlockHash(), default_ctor_block_filter_2.GetBlockHash());
-    BOOST_CHECK(default_ctor_block_filter_1.GetEncodedFilter() == default_ctor_block_filter_2.GetEncodedFilter());
 }
 
 BOOST_AUTO_TEST_CASE(blockfilters_json_test)
diff --git a/src/test/fuzz/deserialize.cpp b/src/test/fuzz/deserialize.cpp
index ed6f172a2a..3793921e86 100644
--- a/src/test/fuzz/deserialize.cpp
+++ b/src/test/fuzz/deserialize.cpp
@@ -95,12 +95,6 @@ void DeserializeFromFuzzingInput(FuzzBufferType buffer, T& obj, const std::optio
             throw invalid_fuzzing_input_exception();
         }
     }
-    try {
-        ds >> obj;
-    } catch (const std::ios_base::failure&) {
-        throw invalid_fuzzing_input_exception();
-    }
-    assert(buffer.empty() || !Serialize(obj).empty());
 }
 
 template <typename T>
diff --git a/src/test/fuzz/util.h b/src/test/fuzz/util.h
index 6c91844633..e8e8e57394 100644
--- a/src/test/fuzz/util.h
+++ b/src/test/fuzz/util.h
@@ -145,11 +145,6 @@ template <typename T>
     const std::vector<uint8_t> buffer = ConsumeRandomLengthByteVector(fuzzed_data_provider, max_length);
     CDataStream ds{buffer, SER_NETWORK, INIT_PROTO_VERSION};
     T obj;
-    try {
-        ds >> obj;
-    } catch (const std::ios_base::failure&) {
-        return std::nullopt;
-    }
     return obj;
 }

kcalvinalvin · 2022-04-14T08:45:13Z

Addressed all of the changes requested except for removing BlockFilter::Unserialize() and getting rid of the option to sanity check. I'll do those in a follow up.

jonatack · 2022-04-14T13:42:10Z

ACK 3e6d98e

modulo:

ordering the gcs_filter benchmark methods in the same order as the benchmarks in that file
maybe naming the BlockFilterGetHash bench as GCSBlockFilterGetHash so that ./src/bench/bench_bitcoin -filter=GCS*.* can invoke it with the others in that file

Here is a patch on top of your last push, if helpful, that removes BlockFilter::Unserialize: https://github.com/jonatack/bitcoin/commits/remove-BlockFilter-Unserialize. Though it may be more prudent to move it to the test code and leave the tests in place.

jonatack · 2022-04-19T15:51:52Z

ACK 0b976a6

Sanity check:

Running ./src/bench/bench_bitcoin -filter=GCS*.* on this branch (debug build and not tuned for benchmarking)

|             ns/elem |              elem/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|          179,394.67 |            5,574.30 |    0.5% |      0.01 | `GCSBlockFilterGetHash`
|            3,376.66 |          296,150.75 |    2.1% |      0.38 | `GCSFilterConstruct`
|        5,163,985.00 |              193.65 |    2.5% |      0.06 | `GCSFilterDecode`
|            1,926.40 |          519,102.50 |    1.9% |      0.01 | `GCSFilterDecodeChecked`
|          577,446.00 |            1,731.76 |    1.0% |      0.01 | `GCSFilterMatch`

and with the operative codebase changes reverted (diff)

diff --git a/src/blockfilter.cpp b/src/blockfilter.cpp
index 08e2e6f72e..9d76d2b60e 100644
--- a/src/blockfilter.cpp
+++ b/src/blockfilter.cpp
@@ -59,7 +59,7 @@ GCSFilter::GCSFilter(const Params& params, std::vector<unsigned char> encoded_fi
     }
     m_F = static_cast<uint64_t>(m_N) * static_cast<uint64_t>(m_params.m_M);
 
-    if (filter_checked) return;
+    // if (filter_checked) return;
 
     // Verify that the encoded filter contains exactly N elements. If it has too much or too little
     // data, a std::ios_base::failure exception will be raised.
diff --git a/src/index/blockfilterindex.cpp b/src/index/blockfilterindex.cpp
index 1471c29a85..f83b883c8f 100644
--- a/src/index/blockfilterindex.cpp
+++ b/src/index/blockfilterindex.cpp
@@ -156,9 +156,6 @@ bool BlockFilterIndex::ReadFilterFromDisk(const FlatFilePos& pos, const uint256&
     std::vector<uint8_t> encoded_filter;
     try {
         filein >> block_hash >> encoded_filter;
-        uint256 result;
-        CHash256().Write(encoded_filter).Finalize(result);
-        if (result != hash) return error("Checksum mismatch in filter decode.");
         filter = BlockFilter(GetFilterType(), block_hash, std::move(encoded_filter), /*filter_checked=*/true);

...GCSFilterDecodeChecked is far slower and equivalent to GCSFilterDecode as expected

|             ns/elem |              elem/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|          165,774.17 |            6,032.30 |    0.5% |      0.01 | `GCSBlockFilterGetHash`
|            3,775.71 |          264,850.84 |    0.5% |      0.42 | `GCSFilterConstruct`
|        4,960,766.00 |              201.58 |    0.9% |      0.05 | `GCSFilterDecode`
|        4,968,437.00 |              201.27 |    1.4% |      0.06 | `GCSFilterDecodeChecked`
|          584,923.00 |            1,709.63 |    3.7% |      0.01 | `GCSFilterMatch`

kcalvinalvin · 2022-04-20T06:48:58Z

ACK 0b976a6

Thank you for the speedy review!

I pushed 04008e2 on top of 0b976a6 which does mostly what you did in https://github.com/jonatack/bitcoin/commits/remove-BlockFilter-Unserialize but it keeps the serialize/unserialize test in src/test/blockfilter_tests.cpp with the function

template <typename Stream>
static BlockFilter UnserializeBlockFilter(Stream& s) {
    std::vector<unsigned char> encoded_filter;
    uint8_t filter_type_uint8;
    uint256 block_hash;

    s >> filter_type_uint8
      >> block_hash
      >> encoded_filter;

    BlockFilterType filter_type = static_cast<BlockFilterType>(filter_type_uint8);
    BlockFilter block_filter(filter_type, block_hash, std::move(encoded_filter), /*filter_checked=*/false);
    return block_filter;
}

I think this is a good tradeoff between keeping the serialization in the unit tests but still keeping the maintenance overhead low.

kcalvinalvin · 2022-04-20T07:38:31Z

Looks like you are removing the fuzz test as well?

@MarcoFalke Ah my first instinct was to get rid of the entire test as ConsumeDeserializable would no longer be usable. I'll push a change with the fuzz tests kept in.

jonatack · 2022-04-21T18:00:35Z

@kcalvinalvin if you move BlockFilter::Unserialize() to the test code in this pull, it will be a smaller change as the first commit rather than last one, WDYT? Edit: maybe a first commit that adds the check in ReadFilterFromDisk() and rename the commit, then the one that moves Unserialize() to the test code.

kcalvinalvin · 2022-04-22T07:33:32Z

@kcalvinalvin if you move BlockFilter::Unserialize() to the test code in this pull, it will be a smaller change as the first commit rather than last one, WDYT? Edit: maybe a first commit that adds the check in ReadFilterFromDisk() and rename the commit, then the one that moves Unserialize() to the test code.

I'm ok with that if it'd lower the cost of reviewing. However, I wonder whether the cost would be even lower if commit 447174d went into a separate PR: getting rid of Unserialize() and replacing the GolombRiceDecode() check seem like two separate things. Let me know which you think would lower the cost of reviewing :)

jonatack · 2022-04-22T08:10:33Z

@kcalvinalvin SGTM.

kcalvinalvin · 2022-04-22T16:03:53Z

Removed the last commit. The latest commit is now back to 0b976a6. Will make a follow-up PR after this one.

This benchmark allows us to compare the differences between doing the sanity check for corruption via GolombRiceDecode() vs checking the hash of the encoded block filter.

Element count used in the GCSFilter benchmarks is increased from 10,000 to 100,000. Testing the benchmarks with different element counts showed that a filter with 100,000 elements resulted in the same ns/op. This is desirable because it lets us reason about how long a single filter element takes to process, and so easily calculate how long a filter with N elements (where N > 100,000) would take to process.

The GCSFilterConstruct benchmark is now called without batch. This makes intra-bench results more intuitive, as all benchmarks are in ns/op instead of a custom unit. There are no downsides to this change: testing showed no observable difference in error rates in the benchmarks when calling without batch.
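
As a rough illustration of that reasoning, the per-element arithmetic can be written down as below. This assumes the per-element cost really is constant at these sizes, which is the claim made above and not something the sketch verifies. `NsPerElement` and `EstimateNs` are hypothetical helper names, and the numbers in the usage note are made up for the example rather than taken from any benchmark run.

```cpp
#include <cstdint>

// Per-element cost derived from one whole-filter timing (ns/op), where each
// op processes elements_per_op elements.
constexpr double NsPerElement(double ns_per_op, std::uint64_t elements_per_op)
{
    return ns_per_op / static_cast<double>(elements_per_op);
}

// Linear extrapolation to a filter with n elements.
constexpr double EstimateNs(double ns_per_element, std::uint64_t n)
{
    return ns_per_element * static_cast<double>(n);
}
```

For instance, a made-up result of 4,000,000 ns/op over 100,000 elements gives `NsPerElement(4000000.0, 100000)`, i.e. 40 ns per element, and `EstimateNs(40.0, 250000)` then estimates 10,000,000 ns for a 250,000-element filter.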

kcalvinalvin · 2022-05-22T05:42:19Z

4 things changed in the latest push:

Renamed the GCSFilterGetHash benchmark back to GCSBlockFilterGetHash.
Renamed GCSFilterDecodeChecked to GCSFilterDecodeSkipCheck as filter_checked argument was renamed to skip_decode_check. Changing the name of the benchmark as well seemed fitting.
Added a commit where GenerateGCSTestElements generates 100,000 elements instead of 10,000. The reasoning was based off of my findings in testing the benchmarks. I've included an explanation in the code as a comment and in the commit message.
In the same added commit, I've changed the GCSFilterConstruct to be called without batch and the custom unit. This was done as it makes comparing intra-bench results more intuitive. Calling with batch didn't result in lower error rates either so there were no tradeoffs made.

mzumsande

Code Review ACK e734228

I'm not sure if it really matters a lot how big we choose the number of elements used in the test, I just think that if we pick a custom measure other than "ops", the benchmark should be linear in that - so I like that everything uses ns/op now.
In any case, the important feature of this PR is the improvement of the filter verification.

theStack · 2022-07-04T13:33:36Z

Concept ACK

theStack

Code-review ACK e734228

FWIW, these are the benchmark results on my machine:

$ ./src/bench/bench_bitcoin -filter=GCS.*

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|          719,504.00 |            1,389.85 |    2.4% |      0.01 | `GCSBlockFilterGetHash`
|       28,431,064.00 |               35.17 |    4.0% |      0.31 | `GCSFilterConstruct`
|        3,641,177.00 |              274.64 |    3.4% |      0.04 | `GCSFilterDecode`
|           13,994.04 |           71,458.99 |    1.5% |      0.01 | `GCSFilterDecodeSkipCheck`
|          424,500.50 |            2,355.71 |    2.2% |      0.01 | `GCSFilterMatch`

stickies-v

ACK e734228

Non-controversial and significant performance improvement when loading block filters from disk, which can be a frequent process depending on peers or RPC users. I've left some style/readability suggestions but none of them are blocking for me.

If BlockFilter::Unserialize() is removed in a follow up, further code cleanup is possible by removing the skip_decode_check parameter and the check branch.

$ ./src/bench/bench_bitcoin -filter=GCS.*
yields

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|          766,333.00 |            1,304.92 |    0.8% |      0.01 | `GCSBlockFilterGetHash`
|       60,891,167.00 |               16.42 |    0.3% |      0.67 | `GCSFilterConstruct`
|       15,802,125.00 |               63.28 |    0.3% |      0.17 | `GCSFilterDecode`
|        1,697,875.00 |              588.97 |    0.2% |      0.02 | `GCSFilterDecodeSkipCheck`
|        1,687,000.00 |              592.77 |    0.1% |      0.02 | `GCSFilterMatch`

stickies-v · 2022-06-23T23:27:56Z