-
Notifications
You must be signed in to change notification settings - Fork 9.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add more gofailpoints to validate critical parts of the code #14735
Comments
I will provide a list next week. |
Supplement:
|
@serathius I've checked ConsistenIndex and I think the most interesting use case for failpoint is UnsafeSave, but it's indirectly covered by PreCommitHook failpoints. |
@ahrtr Checked Moveleader and it seams most of the work is happening inside raft lib. I think it will be useful to add MoveLeader call to DefaultTraffic. But I don't see a good place inside etcd repo for a new failpoint. Do you have something else in mind for MoveLeader failpoint? |
No, list is to ensure we give some thought to it. Thanks for confirming. |
Sorry for the late response. We can use newLeader to check whether the leader changed, and add failpoint accordingly.
This seems to be a good suggestion. But I still do not think it can reliably reproduce the duplicated event issue. Submitted a PR #15272. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions. |
If no one else is working on this issue, is it fine if I work on it @serathius ? |
This issue requires some research into best places to put failpoints. This might require knowledge from maintainers, that might not be available and not able to provide timely responses. @mounilKshah Could you look into #17817 ? It's very similar issue, but about backporting existing failpoints, so there is no need for maintainers help and should allow you to dig into the problem. |
Yeah sure, thanks for the quick response! |
What would you like to be added?
Followup from #14733. Thanks to gofailpoint in defrag code we were able to find an issue with data inconsistency.
Based on this we should increase coverage of other processes similar to defrag. TODO:
etcd/server/storage/backend/backend.go
Line 504 in c83b1ad
etcd/tests/linearizability/failpoints.go
Lines 35 to 44 in 2f558ca
Let's start from creating the list:
Feel free to suggest other processes that require a gofailpoint. cc @ahrtr @ptabor
Why is this needed?
This should allow us to better detect issues with data inconsistency like #14733
The text was updated successfully, but these errors were encountered: