-
Notifications
You must be signed in to change notification settings - Fork 301
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
issue #200: update leader timeouts after disk write #201
Conversation
|
||
// reset election timer to avoid punishing the leader for our own | ||
// long writes | ||
stepDown(request.term()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to call stepDown again here.
// This request is a sign of life from the current leader. Update | ||
// our term and convert to follower if necessary; reset the | ||
// election timer. set it here in case we exit early, we will set | ||
// it again after the write |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"exit early" -> "exit this function early"
"write" -> "disk write"
Cool. stepDown is there (above) to ensure this server is a follower in the same term (it might not have participated in the election). |
Probably also worth updating copyright header and release notes. |
ok, re-pushed with those changes. is this issue able to repro'd somewhere in the logcabin unit tests? |
@@ -15,6 +15,10 @@ See [RELEASE-PROCESS.md](RELEASE-PROCESS.md). | |||
Version 1.2.0-alpha.0 (In Development) | |||
====================================== | |||
|
|||
Bug fixes (low severity): | |||
|
|||
- #200: reset leader election timeout in follower after disk io completes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's an improvement, not a bug fix :)
Let me think about the unit testing question. At least we should be able to find a clean way to check that the code is doing what we expect (whitebox). |
in cases where disk writes can take longer than the leader election, a follower can call for an election when one is not necessary. this change resets the election related timers after all disk writes.
pushed another. |
issue #200: update leader timeouts after disk write
in cases where disk writes can take longer than the leader election,
a follower can call for an election when one is not necessary. this
change resets the election related timers after all disk writes.
I don't fully understand what stepDown is doing or if we should call it again, but it does seem related to the other calls so I duplicated it, so I left in. In testing, this change is helping prevent the stated problem, which makes sense. I'll make whatever changes are requested.