bug(dra): after deleting resourceclaimtemplate, pod can't run again #129362
Comments
This issue is currently awaiting triage. |
/wg device-management |
Do we need to add a field to ResourceClaimTemplate to track which pods are using it, or add a finalizer? |
Can you reproduce this with the DRA example driver? |
Was the ResourceClaim already created for the pod? This event indicates otherwise:
Without the ResourceClaim, scheduling the pod cannot proceed, and without the ResourceClaimTemplate, the ResourceClaim cannot be created. Once the ResourceClaim exists, it should be safe to remove the ResourceClaimTemplate. There is no concept of "ResourceClaimTemplate is in use". It's used only for very brief moments in time when creating a ResourceClaim for a pod. |
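The dependency chain described in this comment can be illustrated with a small toy model. This is only a sketch in plain Python, not Kubernetes code: `Cluster`, `reconcile_pod`, and `delete_pod` are hypothetical names made up for the illustration, and the model assumes that the generated ResourceClaim is removed together with its pod.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy in-memory stand-in for the API server (hypothetical, not a real client)."""
    templates: set = field(default_factory=set)  # ResourceClaimTemplate names
    claims: dict = field(default_factory=dict)   # pod name -> generated ResourceClaim name


def reconcile_pod(cluster: Cluster, pod: str, template: str) -> str:
    """Return the pod's state after one pass of claim creation and scheduling."""
    if pod not in cluster.claims:
        # The template is consulted only here, at claim-creation time.
        if template not in cluster.templates:
            # No template -> no claim -> scheduling cannot proceed.
            return "Pending"
        cluster.claims[pod] = f"{pod}-{template}"
    # Once the claim exists, the template is no longer needed.
    return "Schedulable"


def delete_pod(cluster: Cluster, pod: str) -> None:
    """Delete a pod; its generated claim goes with it (assumption of this model)."""
    cluster.claims.pop(pod, None)


cluster = Cluster(templates={"gpu-template"})
first = reconcile_pod(cluster, "trainer", "gpu-template")      # claim gets created
cluster.templates.discard("gpu-template")                      # template deleted
second = reconcile_pod(cluster, "trainer", "gpu-template")     # existing claim is reused
delete_pod(cluster, "trainer")                                 # pod restart drops the claim
third = reconcile_pod(cluster, "trainer", "gpu-template")      # no template to create from
print(first, second, third)
```

In this model, deleting the template is harmless while the pod and its claim exist, but a recreated pod has nothing to create its claim from, which matches the Pending state reported in the issue.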
What happened?
The pod stays in Pending indefinitely.
What did you expect to happen?
After the ResourceClaimTemplate is deleted, the pod should still run successfully when it is restarted. Alternatively, deleting a ResourceClaimTemplate that is in use should be rejected.
How can we reproduce it (as minimally and precisely as possible)?
Apply this sample YAML, delete the ResourceClaimTemplate, and then delete the pod.
Anything else we need to know?
I'm not sure whether this is by design. In practice, though, a ResourceClaimTemplate can be deleted by accident, and if the training job is then restarted, the problem above can occur. 🤔
Kubernetes version
Cloud provider
OS version
Install tools
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, ...) and versions (if applicable)