Fix consensus sync peers waiting on wrong value #5350

Merged
merged 4 commits into from
Nov 4, 2024

@timvisee timvisee commented Nov 4, 2024

When synchronizing the consensus commit and peer, we were waiting on the wrong commit value.

We waited on the commit in hard state, but we should wait on the last applied entry instead. The last applied entry better reflects what collection metadata we're actually seeing, while the hard state commit might be somewhere in the future.
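
As a minimal sketch of the distinction described above (the type and field names here are hypothetical and not taken from the actual change), the value handed to other peers could be picked like this, using the last applied index and falling back to 0 when nothing has been applied yet:

```rust
/// Hypothetical, simplified view of a node's Raft progress.
/// These names are illustrative assumptions, not the project's real types.
#[derive(Debug, Clone, Copy, Default)]
pub struct RaftProgress {
    /// Commit index stored in the hard state; it may be ahead of what has
    /// actually been applied locally.
    pub hard_state_commit: u64,
    /// Index of the last entry applied to the local state, if any.
    pub last_applied: Option<u64>,
}

impl RaftProgress {
    /// Index we ask other peers to wait for when syncing consensus.
    /// Uses the last applied entry and falls back to 0 if nothing is applied.
    pub fn sync_target(&self) -> u64 {
        self.last_applied.unwrap_or(0)
    }
}
```

With `hard_state_commit: 12` and `last_applied: Some(9)`, `sync_target()` yields 9, which matches the collection metadata this node is actually seeing.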

All Submissions:

  • Contributions should target the dev branch. Did you create your branch from dev?
  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same update/change?

New Feature Submissions:

  1. Does your submission pass tests?
  2. Have you formatted your code locally using cargo +nightly fmt --all command prior to submission?
  3. Have you checked your code using cargo clippy --all --all-features command?

@timvisee timvisee requested review from generall and ffuugoo November 4, 2024 10:58

@KShivendu KShivendu left a comment

Seems correct code-wise. But I have the same doubt as Andrey: why is it not better to wait for a higher commit value? In the worst case there will be a timeout, but even then it would have waited for the same amount of time 🤔

Maybe we should pick one of the scenarios (one with snapshot or WAL delta transfer) to see if this actually makes a difference?

@timvisee timvisee force-pushed the fix-consensus-sync-peers branch from 06a205e to b7ab085 Compare November 4, 2024 11:22

timvisee commented Nov 4, 2024

> But I have the same doubt as Andrey: why is it not better to wait for a higher commit value? In the worst case there will be a timeout, but even then it would have waited for the same amount of time 🤔

I'm quite confused by this as well, and have no answer to this yet. What I can say is that making this change did have a clear effect on my repro.

Even though this is not entirely clear yet, I do believe it's better to change it, because this is how it should have been from the start.

> Maybe we should pick one of the scenarios (one with snapshot or WAL delta transfer) to see if this actually makes a difference?

If you mean testing this in chaos testing, then yes, I definitely vote for that.

@generall generall left a comment

I got it. This value is what we ask other nodes to wait for, not what other nodes use to wait on.
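
To illustrate that reading, here is a rough sketch of the receiving side, assuming a hypothetical `AppliedTracker` type that is not part of the actual code: the peer compares the requested index against its own applied index and gives up after a timeout. Under this assumption, a requested index that is ahead of what has been applied could leave the wait running until the timeout.

```rust
use std::sync::{Condvar, Mutex};
use std::time::Duration;

/// Hypothetical tracker of the last applied index on the receiving peer.
/// Illustrative only; the real implementation may look quite different.
pub struct AppliedTracker {
    applied: Mutex<u64>,
    changed: Condvar,
}

impl AppliedTracker {
    pub fn new(applied: u64) -> Self {
        Self {
            applied: Mutex::new(applied),
            changed: Condvar::new(),
        }
    }

    /// Record that entries up to `index` have been applied and wake waiters.
    pub fn advance(&self, index: u64) {
        let mut applied = self.applied.lock().unwrap();
        if index > *applied {
            *applied = index;
            self.changed.notify_all();
        }
    }

    /// Block until the local applied index reaches `target` or `timeout`
    /// elapses. Returns `true` if the target was reached.
    pub fn wait_for(&self, target: u64, timeout: Duration) -> bool {
        let guard = self.applied.lock().unwrap();
        let (guard, _timed_out) = self
            .changed
            .wait_timeout_while(guard, timeout, |applied| *applied < target)
            .unwrap();
        *guard >= target
    }
}
```

A peer whose applied index is 5 returns immediately from `wait_for(5, ..)`, while `wait_for(6, ..)` only succeeds once `advance` has moved the applied index to 6 or beyond.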

@timvisee timvisee merged commit 4bc765e into dev Nov 4, 2024
17 checks passed
@timvisee timvisee deleted the fix-consensus-sync-peers branch November 4, 2024 13:25
timvisee added a commit that referenced this pull request Nov 8, 2024
* When syncing consensus across peers, wait on last applied index

* Update comment

* Reformat

* Fall back to 0
@timvisee timvisee mentioned this pull request Nov 8, 2024