-
Notifications
You must be signed in to change notification settings - Fork 36.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test: fix non-determinism in p2p_headers_sync_with_minchainwork.py #25978
test: fix non-determinism in p2p_headers_sync_with_minchainwork.py #25978
Conversation
The test for node3's chaintips needs some sort of synchronization in order to be reliable.
I checked out this PR and the tests run for me fine both with and without your changes. However, when I drop in some debugger breakpoints to understand things better, I am able to observe failures in the first test maybe 25% of the time (this is with the changes that are in this PR). The failure is that the Is there a known race condition here? Am I observing the behavior that this PR is aiming to fix? Here are my test steps:
|
While a debugger is attached, you may run into (unrelated) test timeouts, because the debugger doesn't prevent your system clock from advancing. You may increase the timeout with the timeout-factor command line option, but this will also make it harder to reproduce this intermittent issue, because you may have to wait longer for the timeout to happen if the issue occurs. Usually intermittent issues are observed when the system running them puts a thread to sleep for a longer time. Thus, I find it easiest to deterministically reproduce the failures by inserting the sleep myself. For example, you can modify the source code to insert a |
Haven't had the chance to look into this yet, but just pointing out that the test currently fails intermittently in the CI, which this PR should fix: |
Code Review ACK 88e7807
Yes, there is a race condition: While node 0 mines some blocks, and we wait for node 1 and 2 to complete the headers-sync (and decide to reject the headers because of insufficient work), there was no such check for node 3 before, so that it would be possible to call |
@MarcoFalke thank you for the valuable insight on the debugger's relationship (or lack thereof) with the system clock. I like what you said about intentionally inserting sleeps to reproduce intermittent timing failures. That's clever! @mzumsande thank you for explaining that race condition so thoroughly! It was really helpful to have the CI logs with the failure to reference. Based on these two comments I'm convinced I was introducing instability with the debugger, especially as I added additional breakpoints. Anyways I got confused about something and ended up running the tests 10 times (without the debugger) and never saw any failures. ACK 88e7807 |
…th_minchainwork.py 88e7807 test: fix non-determinism in p2p_headers_sync_with_minchainwork.py (Suhas Daftuar) Pull request description: The test for node3's chaintips (added in PR25960) needs some sort of synchronization in order to be reliable. ACKs for top commit: mzumsande: Code Review ACK 88e7807 satsie: ACK 88e7807 Tree-SHA512: 5607c5b1a95d91e7cf81b695eb356b782cbb303bcc7fd9044e1058c0c0625c5f9e5fe4f4dde9d2bffa27a80d83fc060336720f7becbba505ccfb8a04fcc81705
The test for node3's chaintips (added in PR25960) needs some sort of synchronization in order to be reliable.