-
Notifications
You must be signed in to change notification settings - Fork 36.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
index: Fix backwards search for bestblock #23365
Conversation
Concept ACK |
The proposed change would also mean that it would now be possible to prune up to the best block of an index without needing to reindex (+1 block compared to master). I hope I'm not missing something, but I don't see why we still need the full data of |
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
Moved out of draft and added steps to reproduce to OP. Note that the last commit of #21726 would partly "fix" the issue too by just not running the affected code unless |
Concept ACK |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK a36e23b (jamesob/ackr/23365.1.mzumsande.index_fix_backwards_sear
)
Thanks for looking into this and figuring out a good fix. Once this is merged, I can update #23289 to cover interrupting init after index initialization.
I built this locally and confirmed that the attached test fails if the index/base.cpp
changes are reverted. Logically the change makes sense - if the index's best block is the genesis, we should be able to continue indexation from that point (which is all the rephrased while
condition does).
The block = block->pprev;
does make me a little nervous (although as-written the code would never cause any problems given how we initialize block_to_test
), so if you wanted to you could include an assertion like
diff --git a/src/index/base.cpp b/src/index/base.cpp
index 88c74ec543..8525dcbfa0 100644
--- a/src/index/base.cpp
+++ b/src/index/base.cpp
@@ -96,6 +96,7 @@ bool BaseIndex::Init()
prune_violation = false;
break;
}
+ assert(block->pprev);
block = block->pprev;
}
}
Show signature data
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512
ACK a36e23b183a6b8d1c60598618f2a5c8f31da02e0 ([`jamesob/ackr/23365.1.mzumsande.index_fix_backwards_sear`](https://github.com/jamesob/bitcoin/tree/ackr/23365.1.mzumsande.index_fix_backwards_sear))
-----BEGIN PGP SIGNATURE-----
iQIzBAEBCgAdFiEEGNRVI1NPYuZCSIrGepNdrbLETwUFAmGbxWwACgkQepNdrbLE
TwVIdg/8CKYBVyw9DvvSx/G2E31/z7avW4uwj5cq10HMmjtsT/cCHhFtofDtVvIG
9binTF/dlYhyzQmv7FduzQkgpZtehokrrBugoktyh6zeJpRV+mPUxZvFJ0VUCK1w
+dVJ2zdbDmDuogJR7+ZZbGgBGxPKjS9t/Q+/LQIP221lgAVi1MPJ8oYuyg4+R6hn
ueS3piEtXmHnQt0F05PUhJDZZidB/3hRp1bu9Z0H79TrL65wnA1bnDf3+IKbGeQL
CPsYxZR4VZRiIqkSESsKuIbIVE2ELNJcQHP5pKoA389dBDbDlMfw1Lo/N+opB1tm
kbkrCt0ZB4HEAwsbyi3KWOue5rYkx5zuz6/7QW6sChkfGLwE+F+UfyvzCzy2P8FI
Fzd8lx/74O/+0NtcIpY+kSlPxztdQCuqCZHmx8cuYvnfPDIWkg2TrgX77JPoGJHO
x8mJBGWrTkxYoEM2brLWBKK24pPbafrx860NWEzJEoJ5xreZOfAzsBZPLtEZtymq
hggWvPjgGljGUFnKakjhP3ahMgDBNyQxoXM/n/XIE3NSepvRE7WOc5yxj2gGScM8
h6oIYZVmlgbbImA1asNDgXRk4acrA/N+3dZdUCAyjPz0lhx3/g2XznCZsfRyZdjA
cftewu0RF0l/XnyRZ37K5xvTUsjDYqb2omufwnmgy2giD9QnHdA=
=5ydW
-----END PGP SIGNATURE-----
Show platform data
Tested on Linux-5.10.0-9-amd64-x86_64-with-glibc2.28
Configured with ./configure LDFLAGS=-L/home/james/src/bitcoin/db4/lib/ CPPFLAGS=-I/home/james/src/bitcoin/db4/include/ CXXFLAGS=-fPIE -pipe -O2 -g -Wthread-safety-analysis -Wall -Werror=sign-compare -Wsign-compare -Werror=thread-safety-analysis --enable-wallet --enable-debug --with-daemon --enable-natpmp-default CXX=/usr/bin/clang++ CC=/usr/bin/clang PYTHONPATH= --disable-shared --with-pic --enable-benchmark=no --enable-module-recovery --enable-module-schnorrsig --enable-experimental --disable-openssl-tests --disable-shared --with-pic --enable-benchmark=no --enable-module-recovery --enable-module-schnorrsig --enable-experimental --disable-openssl-tests --no-create --no-recursion
Compiled with /usr/bin/ccache /usr/bin/clang++ -std=c++17 -mavx -mavx2 -fPIE -pipe -O2 -g -Wthread-safety-analysis -Wall -Werror=sign-compare -Wsign-compare -Werror=thread-safety-analysis -O0 -g3 -ftrapv -fdebug-prefix-map=$(abs_top_srcdir)=. -Wstack-protector -fstack-protector-all -fcf-protection=full -fstack-clash-protection -msse4 -msha -msse4.1 -msse4.2 i
Compiler version: Debian clang version 13.0.1-++20211019122913+8a93745a7121-1~exp1~20211019003538.10
This allows filters to be reconstructed when the best known block is the Genesis block without needing to reindex. It fixes Init errors seen in bitcoin#23289.
a36e23b
to
9600ea0
Compare
Thanks - I added the assertion as suggested, and rebased. Note that adding the assert only changes the kind of error: Before, if |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK.
if (block_to_test == block) { | ||
prune_violation = false; | ||
break; | ||
} | ||
assert(block->pprev); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may be worth adding a comment here on why block->pprev
must exist (because we're walking backwards from the active chain tip to a block index in the active chain. If a block index that we encounter doesn't have a backwards pointer, then the chain is malformed).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't actually a correct explanation (the genesis block is in the active chain and doesn't have pprev
set). The reason this assertion will hold is because of how block
and block_to_test
are chosen; the conditional above this line is guaranteed to break at-or-higher-than genesis because of the few lines of logic above (which, if summarized in a comment, would just end up repeating the fairly-straightforward code above).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're saying the same thing as I am. Because of the way block
and block_to_test
are chosen, if we walk backwards from block
, we're guaranteed to encounter block_to_test
(either at or before the genesis block).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I didn't get to this earlier, would have added a comment, do people think that this warrants a follow-up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mzumsande sounds like it could be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done in #23777
pruneheight_2 = self.nodes[0].pruneblockchain(1000) | ||
assert_equal(pruneheight_2, 998) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a specific reason why calling pruneblockchain(1000)
prunes up to exactly block 998? Is it because that's where the block file happens to wrap (i.e. blocks 999 and 1000 are in the same file as later blocks and so can't be pruned). If so, that seems a bit brittle (e.g. if the default block file size of disk serialization changes, then this test will fail), and definitely seems worthy of a comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it because that's where the block file happens to wrap
Yes, I believe that is the reason, and I agree it's brittle (although the test relied before on this, e.g. at assert_equal(pruneheight, 248)
- I'm not sure though if there is a better way to test pruning in a detailed way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comment added in #23777
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Partial code review ACK 9600ea0 for the code change, not the test changes. (Test changes are indirect and little over my head.) It seems obvious that previous code prune_violation = true, while (block->pprev)
would incorrectly detect a prune violation at the genesis block, and the fix here make sense and looks correct.
Now that bitcoin#23365 is merged.
9600ea0 test: Add edge case of pruning up to index height (Martin Zumsande) 698c524 index: Fix backwards search for bestblock (Martin Zumsande) Pull request description: This PR attempts to fix an intermittent Init issue encountered during the stress testing of bitcoin#23289, which relates to the pruning-compatible filter reconstruction logic introduced in bitcoin#15946. The problem would occur when the node starts with `-txindex=1` but `ThreadSync` is interrupted after it sets `m_best_block_index` to Genesis, and before it gets do any further work. In that case, during the next restart of the node, an Init error would be thrown because `BaseIndex::Init()` tries to backtrack from the tip to the last block which has been successfully indexed (here: Genesis), but the backtracking logic didn't work properly in this case: The loop `while (block_to_test && block->pprev && (block->pprev->nStatus & BLOCK_HAVE_DATA))` checks if a predecessor exists **before** performing the check `block_to_test == block` and then possbily setting `prune_violation = false` If `block_to_test` and `block` are the Genesis block this check will not be reached because `block->pprev` does not exist. To reproduce this bug on regtest: 1) start a node with a fresh datadir using `-txindex=1` (or any other index) 2) stop and restart without any index 3) mine a block 3) stop and restart again with the index enabled ->InitError `Error: txindex best block of the index goes beyond pruned data. (...)` Fix this by requiring that we have the data for the block of the current iteration `block` (instead of requiring it for the predecessor `block->pprev`) That way, the check for `block_to_test == block` is also reached when `block_to_test` is the Genesis block. No longer requiring the data of `block->pprev` also means that we can now prune up to `m_best_block_index` height without requiring a reindex (one block more than before). I added this edge case to `feature_blockfilterindex_prune.py`, the new version should fail on master. ACKs for top commit: ryanofsky: Partial code review ACK 9600ea0 for the code change, not the test changes. (Test changes are indirect and little over my head.) It seems obvious that previous code `prune_violation = true, while (block->pprev)` would incorrectly detect a prune violation at the genesis block, and the fix here make sense and looks correct. Tree-SHA512: c717f372cee8fd49718b1b18bfe237aa6ba3ff4468588c10e1272d7a2ef3981d10af4e57de51dec295e2ca72d441bc6c2812f7990011a94d7f818775e3ff1a38
These were suggested in review of bitcoin#23365
These were suggested in review of bitcoin#23365 Co-authored-by: John Newbery <john@johnnewbery.com>
These were suggested in review of bitcoin#23365 Co-authored-by: John Newbery <john@johnnewbery.com>
…ards search) e4a8d56 doc: add explanations for assert in index and magic numbers in test (Martin Zumsande) Pull request description: This adds two explanations suggested in the review of #23365, that I didn't manage to address before that PR was merged: bitcoin/bitcoin#23365 (comment) bitcoin/bitcoin#23365 (comment) ACKs for top commit: jnewbery: ACK e4a8d56 Tree-SHA512: 0500c8abb37bb3e3694463ad5e74b2e1483615ccf1d7529b0d5faa694652ada17d242dc7fda6d995733766c627d54178a2c8fa21a570cdf13292f64ff5425b56
…rds search) e4a8d56 doc: add explanations for assert in index and magic numbers in test (Martin Zumsande) Pull request description: This adds two explanations suggested in the review of bitcoin#23365, that I didn't manage to address before that PR was merged: bitcoin#23365 (comment) bitcoin#23365 (comment) ACKs for top commit: jnewbery: ACK e4a8d56 Tree-SHA512: 0500c8abb37bb3e3694463ad5e74b2e1483615ccf1d7529b0d5faa694652ada17d242dc7fda6d995733766c627d54178a2c8fa21a570cdf13292f64ff5425b56
These were suggested in review of bitcoin#23365 Co-authored-by: John Newbery <john@johnnewbery.com>
Now that bitcoin#23365 is merged.
Now that bitcoin#23365 is merged.
These were suggested in review of bitcoin#23365 Co-authored-by: John Newbery <john@johnnewbery.com>
Now that bitcoin#23365 is merged.
Summary: PR description: > Maintaining the blockfilterindexes in prune mode is possible and may lead to efficient p2p based rescans of wallets (restore backups, import/sweep keys) beyond the prune height (rescans not part of that PR). > > This PR allows running the blockfilterindex(es) in conjunction with pruning. > > - Bitcoind/Qt will shutdown during startup when missing block data has been detected ([re]enable -blockfilterindex when we already have pruned) > - manual block pruning is disabled during blockfilterindex sync > - auto-pruning is delayed during blockfilterindex sync ---- > Allow blockfilter in conjunction with prune bitcoin/bitcoin@6abe9f5 ---- > index: Fix backwards search for bestblock bitcoin/bitcoin@698c524 ---- > Avoid accessing nullpointer in BaseIndex::GetSummary() bitcoin/bitcoin@00d57ff ---- > Avoid pruning below the blockfilterindex sync height bitcoin/bitcoin@5e11226 ---- > Add debug startup parameter -fastprune for more effective pruning tests bitcoin/bitcoin@00d57ff ---- > Add functional test for blockfilterindex in prune-mode bitcoin/bitcoin@ab3a0a2 ---- > Fix several bugs in feature_blockfilterindex_prune.py [[bitcoin/bitcoin#21230 | core#21230]] ---- > test: Intermittent issue in feature_blockfilterindex_prune [[bitcoin/bitcoin#21252 | core#21252]] ---- > test: improve assertions in feature_blockfilterindex_prune.py > test: remove unneeded node from feature_blockfilterindex_prune.py [[bitcoin/bitcoin#21297 | core#21297]] ---- > test: Add edge case of pruning up to index height bitcoin/bitcoin@9600ea0 This is a backport of [[bitcoin/bitcoin#15946 | core#15946]], [[bitcoin/bitcoin#21230 | core#21230]], [[bitcoin/bitcoin#21252 | core#21252]], [[bitcoin/bitcoin#21297 | core#21297]] and [[bitcoin/bitcoin#23365 | core#23365]] Backport notes: - the test was buggy and ugly code after the first commit, so I had to squash all these commits to reach an acceptable quality. - I had to use different numbers of blocks that are generated and pruned, because Bitcoin ABC can fit more blocks into each blk?????.dat file than core because in this test the witness data of the coinbase transaction makes core blocks larger. Comments were added to explain this where needed. - I used the shortcut `node = self.nodes[0]` to make the code a bit more readable (shorter lines) Test Plan: `ninja all check-all` Run IBD with pruning enabled ` src/bitcoind -prune=2000 -blockfilterindex=1` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Subscribers: Fabien Differential Revision: https://reviews.bitcoinabc.org/D11299
Summary: > Now that [[bitcoin/bitcoin#23365 | core#23365]] is merged. [[bitcoin/bitcoin#23365 | core#23365]] was backported in D11299 This concludes backport of [[bitcoin/bitcoin#23737 | core#23737]] bitcoin/bitcoin@8904f17 Depends on D12614 Test Plan: `ninja check-functional` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Differential Revision: https://reviews.bitcoinabc.org/D12615
This PR attempts to fix an intermittent Init issue encountered during the stress testing of #23289, which relates to the pruning-compatible filter reconstruction logic introduced in #15946.
The problem would occur when the node starts with
-txindex=1
butThreadSync
is interrupted after it setsm_best_block_index
to Genesis, and before it gets do any further work.In that case, during the next restart of the node, an Init error would be thrown because
BaseIndex::Init()
tries to backtrack from the tip to the last block which has been successfully indexed (here: Genesis), but the backtracking logic didn't work properly in this case:The loop
while (block_to_test && block->pprev && (block->pprev->nStatus & BLOCK_HAVE_DATA))
checks if a predecessor exists before performing the checkblock_to_test == block
and then possbily settingprune_violation = false
If
block_to_test
andblock
are the Genesis block this check will not be reached becauseblock->pprev
does not exist.To reproduce this bug on regtest:
-txindex=1
(or any other index)->InitError
Error: txindex best block of the index goes beyond pruned data. (...)
Fix this by requiring that we have the data for the block of the current iteration
block
(instead of requiring it for the predecessorblock->pprev
)That way, the check for
block_to_test == block
is also reached whenblock_to_test
is the Genesis block.No longer requiring the data of
block->pprev
also means that we can now prune up tom_best_block_index
height without requiring a reindex (one block more than before). I added this edge case tofeature_blockfilterindex_prune.py
, the new version should fail on master.