GCE PD Detach fails if node no longer exists #29358
Comments
So is the issue here that we made a bad assumption about how the GCE API will behave for detach when the node doesn't exist?
Not so much a bad assumption, more a missed case.
The fix for this, PR #29485, missed one location where the node is fetched. The existing fix handles the case where the actual node is physically deleted, but it does not handle the case where the node API object is deleted. This means that detach can still sometimes fail due to a missing node API object.
Automatic merge from submit-queue: Skip safe to detach check if node API object no longer exists. Fixes #29358
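To illustrate the idea behind skipping the safe-to-detach check, here is a minimal sketch in Go. It is not the code from the PR: the `node` type, the `safeToDetach` function, and the map used as a node lookup are all hypothetical stand-ins, and a missing map entry is assumed to model a deleted node API object.

```go
package main

import "fmt"

// node is a hypothetical, minimal stand-in for a node API object.
// volumesInUse records which volumes the node reports as still in use.
type node struct {
	volumesInUse map[string]bool
}

// safeToDetach reports whether volumeName may be detached from nodeName.
// nodes maps node names to their API objects; a missing entry models a
// node API object that has been deleted (an assumption of this sketch).
func safeToDetach(nodes map[string]*node, nodeName, volumeName string) bool {
	n, ok := nodes[nodeName]
	if !ok {
		// The node API object no longer exists, so nothing can report
		// the volume as in use. Skip the check instead of failing.
		return true
	}
	return !n.volumesInUse[volumeName]
}

func main() {
	nodes := map[string]*node{
		"node-a": {volumesInUse: map[string]bool{"pd-1": true}},
	}
	fmt.Println(safeToDetach(nodes, "node-a", "pd-1"))    // false: still in use
	fmt.Println(safeToDetach(nodes, "node-gone", "pd-1")) // true: node object deleted
}
```

The design choice is to treat "node object not found" as a reason to proceed rather than as an error, so the detach path is not blocked by a lookup that can never succeed.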
Automatic merge from submit-queue: Add test to detach a pd whose node was deleted.
**What this PR does / why we need it**: A test for the following issue: if a node with a GCE PD attached is deleted (before the volume is detached), subsequent attempts by the attach/detach controller to detach it should not fail.
**Bonus**: Added additional code to ensure that the pd can still be attached to a different node. Edit: Removed it as it was making the test much slower.
#29358
Problem:
If a node with a GCE PD attached is deleted (before the volume is detached), subsequent attempts by the attach/detach controller to detach it continuously fail, and prevent the controller from attaching the volume to another node.
Repro steps:
Logs:
Workarounds:
-or-
Proposed Fix:
If GCE PD detach fails with `instance not found`, assume successful detach.