-
Notifications
You must be signed in to change notification settings - Fork 36.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test: Implicitly sync after generate* to preempt races and intermittent test failures #20362
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
Concept ACK: thanks for doing this - intermittent test failures are the worst! Bike shed nit: |
faf0319
to
fae7684
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK.
How about for now:
try_dispatch_with_sync
try
dispatch
except
sync_all
dispatch
log with print_stack
@@ -296,9 +296,27 @@ def wait_for_cookie_credentials(self): | |||
time.sleep(1.0 / poll_per_s) | |||
self._raise_assertion_error("Unable to retrieve cookie credentials after {}s".format(self.rpc_timeout)) | |||
|
|||
def generate(self, nblocks, maxtries=1000000): | |||
def generateblock(self, *args, sync_fun=True, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An alternative to "overload" these generate* is to add a generic _rpc_dispatch_and_sync
(__getattr__
-> _rpc_dispatch_and_sync
-> _rpc_dispatch
).
It would be cool to keep the last generate* extract_stack
that when a related error occurs we could print it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would make calling code tremendously verbose, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I am misunderstanding. Do you have a working diff that I can steal?
Concept ACK! very nice, thank you |
Concept ACK, rooting out a cause of intermittent issues (due to test flakiness) is great. |
fa7ef91
to
faf6073
Compare
Concept ACK |
What do you think about adding a |
That would require changing all calls to |
I think it's worth it, to avoid adding an implicit dependency from TestNode onto TestFramework. |
-BEGIN VERIFY SCRIPT- perl -0777 -pi -e 's/(generate[^\n]*\)[^\n]*)(\n|\s)+self.sync_.*\(\)\n/\1\n/g' $(git grep -l generate ./test) -END VERIFY SCRIPT-
🐙 This pull request conflicts with the target branch and needs rebase. Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a "draft". |
The most frequent failure in functional tests are intermittent races. Fixing such bugs is cumbersome because it involves:
self.sync_all()
where it was forgottenAlso, writing a linter to catch those is not possible, nor is review effective in finding these bugs prior to merge.
Fix all future intermittent races caused by a missing sync_block call by calling
sync_all
implicitly after eachgenerate*
, unless opted out. This ensures that the code is race-free (with regards to blocks) when the tests pass once, instead of our current approach where the code can never be guaranteed to be race-free.