Hotbackup Deadlock and Consistency fixes. #18928
Conversation
Review not yet complete, but already some comments, publishing for earlier visibility.
  if (!result.ok()) {
-   unlockDBServerTransactions(pool, backupId, lockedServers);
+   unlockServersTrxCommit(pool, backupId, lockedServers);
I know that this is not a new change of this PR, but what if a lock request is sent and runs into a timeout? Then this particular coordinator might still be locked, and we do not even try to roll back the lock here, since we are only contacting the lockedServers. Should we not, as in the cases further down, try to unlock all coordinators again?
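For illustration, a rough sketch of the safer rollback suggested here: contact every coordinator that was asked to lock, not only those that acknowledged it. All names (ClusterPool, unlockServersTrxCommit, allCoordinators) are placeholders loosely modelled on the hunk above, not the real signatures in the codebase.

```cpp
// Hypothetical sketch: unlock every coordinator that received a lock request,
// not only those in `lockedServers`. Types and helpers are stand-ins.
#include <iostream>
#include <string>
#include <vector>

struct ClusterPool {};  // stand-in for the real connection-pool type

// Stand-in for the real unlock fan-out; here it only prints the targets.
void unlockServersTrxCommit(ClusterPool& /*pool*/, std::string const& backupId,
                            std::vector<std::string> const& servers) {
  for (auto const& server : servers) {
    std::cout << "abort lock " << backupId << " on " << server << "\n";
  }
}

void rollbackCoordinatorLocks(ClusterPool& pool, std::string const& backupId,
                              std::vector<std::string> const& allCoordinators) {
  // Contact *every* coordinator that was sent a lock request: a request that
  // timed out on our side may still have succeeded remotely, so restricting
  // the rollback to the acknowledged servers could leave a coordinator locked.
  unlockServersTrxCommit(pool, backupId, allCoordinators);
}

int main() {
  ClusterPool pool;
  rollbackCoordinatorLocks(pool, "backup-123", {"CRDN-1", "CRDN-2", "CRDN-3"});
}
```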
I'm against changing this in this PR:
- This PR is already big enough, and it is not a new issue.
- The code in question is a huge mess. I am more likely to break it than to fix anything useful. It has to be reimplemented from the ground up.
- The existing logic is good enough for DBServers, so I would argue it is good enough for coordinators. There was probably a reason for doing it like that.
However, I'm not saying that this should never be changed. The hotbackup code (in particular the coordinator code) needs a rewrite, especially since the assumptions have changed over time, and chances are high we could come up with simpler code that works more reliably.
Can we address this in a BTS follow-up?
Maybe I've missed it, but do we test here that all ArangoSearch view documents are present (and nothing is duplicated) after restoring a hotbackup?
Apart from the things I have so far reported: LGTM. I will now move on to the enterprise part.
LGTM, the open point(s) can imo be addressed later
Afaik it was
Scope & Purpose
Before this PR, when the user requested a HotBackup, the ArangoDB cluster waited for all in-progress transactions to be committed. This led to deadlocks.
Also, the HotBackup snapshot of the RocksDB database only included a portion of the WAL and was not guaranteed to include enough of the WAL to recover MerkleTrees. This would lead to MerkleTrees being inconsistent with the database and to potential data loss from replication.
After this PR, requesting a HotBackup merely prevents transactions from being committed.
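As a minimal sketch of that idea, assuming a plain std::shared_mutex as the guard (the actual mechanism in ArangoDB may look different): transactions take the lock in shared mode only for the duration of their commit, while the HotBackup takes it exclusively around the snapshot. Running transactions are never waited for; they are only briefly prevented from committing.

```cpp
// Sketch only: block commits during a HotBackup instead of waiting for all
// transactions to finish. The guard type and method names are assumptions.
#include <mutex>
#include <shared_mutex>

class CommitGuard {
 public:
  // Called by every transaction right before it commits. Many transactions
  // can commit concurrently; they only exclude the HotBackup.
  void commitTransaction(/* Transaction& trx */) {
    std::shared_lock<std::shared_mutex> lock(_commitLock);
    // ... write the commit marker / apply the commit here ...
  }

  // Called by the HotBackup: while `takeSnapshot` runs, no commit can start,
  // but already running transactions keep executing, so nothing deadlocks
  // waiting for them to complete.
  template <typename SnapshotFn>
  void withCommitsBlocked(SnapshotFn&& takeSnapshot) {
    std::unique_lock<std::shared_mutex> lock(_commitLock);
    takeSnapshot();
  }

 private:
  std::shared_mutex _commitLock;
};
```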
The HotBackup process has to take two snapshots: One of the RocksDB store, and one of the ArangoSearch store.
Consistency is achieved by taking a snapshot of RocksDB with a large enough portion of the WAL so that the recovery process after the DBServer restart can replay all necessary operations on ArangoSearch and MerkleTrees.
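For context, a rough sketch of how a RocksDB snapshot that keeps the WAL can be taken with the vanilla rocksdb::Checkpoint API: passing a very large log_size_for_flush makes RocksDB link/copy the live WAL files instead of flushing memtables. The PR states that this alone was not sufficient and that a small patch to the RocksDB fork is needed (see below), so this only illustrates the general mechanism, not the patched behaviour.

```cpp
// Sketch, using the stock rocksdb::Checkpoint API: create a checkpoint that
// prefers copying the WAL over flushing memtables, so recovery after a
// restore can replay those operations (e.g. for ArangoSearch and MerkleTrees).
#include <cstdint>
#include <limits>
#include <memory>
#include <string>

#include <rocksdb/db.h>
#include <rocksdb/utilities/checkpoint.h>

rocksdb::Status createHotBackupSnapshot(rocksdb::DB* db,
                                        std::string const& targetDir) {
  rocksdb::Checkpoint* raw = nullptr;
  rocksdb::Status s = rocksdb::Checkpoint::Create(db, &raw);
  if (!s.ok()) {
    return s;
  }
  std::unique_ptr<rocksdb::Checkpoint> checkpoint(raw);

  // A log_size_for_flush larger than the current WAL size tells RocksDB to
  // include the live WAL files in the checkpoint instead of flushing.
  uint64_t sequenceNumber = 0;
  return checkpoint->CreateCheckpoint(
      targetDir, std::numeric_limits<uint64_t>::max(), &sequenceNumber);
}
```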
Unfortunately, taking a snapshot of RocksDB that includes the WAL files requires a small patch to our RocksDB fork. Future work may include upstreaming that patch.
Enterprise: https://github.com/arangodb/enterprise/pull/1267