Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

issue #200: update leader timeouts after disk write #201

Merged
merged 1 commit into from
Feb 11, 2016

Conversation

nhardt
Copy link
Contributor

@nhardt nhardt commented Dec 7, 2015

in cases where disk writes can take longer than the leader election,
a follower can call for an election when one is not necessary. this
change resets the election related timers after all disk writes.

I don't fully understand what stepDown is doing or if we should call it again, but it does seem related to the other calls so I duplicated it, so I left in. In testing, this change is helping prevent the stated problem, which makes sense. I'll make whatever changes are requested.


// reset election timer to avoid punishing the leader for our own
// long writes
stepDown(request.term());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to call stepDown again here.

// This request is a sign of life from the current leader. Update
// our term and convert to follower if necessary; reset the
// election timer. set it here in case we exit early, we will set
// it again after the write
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"exit early" -> "exit this function early"
"write" -> "disk write"

@ongardie
Copy link
Member

ongardie commented Dec 7, 2015

Cool. stepDown is there (above) to ensure this server is a follower in the same term (it might not have participated in the election).

@ongardie
Copy link
Member

ongardie commented Dec 7, 2015

Probably also worth updating copyright header and release notes.

@nhardt
Copy link
Contributor Author

nhardt commented Dec 7, 2015

ok, re-pushed with those changes. is this issue able to repro'd somewhere in the logcabin unit tests?

@@ -15,6 +15,10 @@ See [RELEASE-PROCESS.md](RELEASE-PROCESS.md).
Version 1.2.0-alpha.0 (In Development)
======================================

Bug fixes (low severity):

- #200: reset leader election timeout in follower after disk io completes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's an improvement, not a bug fix :)

@ongardie
Copy link
Member

ongardie commented Dec 7, 2015

Let me think about the unit testing question. At least we should be able to find a clean way to check that the code is doing what we expect (whitebox).

in cases where disk writes can take longer than the leader election,
a follower can call for an election when one is not necessary. this
change resets the election related timers after all disk writes.
@nhardt
Copy link
Contributor Author

nhardt commented Dec 7, 2015

pushed another.

ongardie added a commit that referenced this pull request Feb 11, 2016
issue #200: update leader timeouts after disk write
@ongardie ongardie merged commit bca155c into logcabin:master Feb 11, 2016
@nhardt nhardt deleted the issue200 branch October 5, 2016 20:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants