DRA: Using All allocation mode will schedule to nodes with zero devices #129310
Labels
kind/bug
priority/important-soon
triage/accepted
wg/device-management
What happened?
I created a resource claim template to get "All" GPUs on a node:
I then created a Deployment whose Pod used that claim. The Pod was scheduled to a node even though my DRA driver on that node was not running, so there were no ResourceSlices for that node.
What did you expect to happen?
I expected the pod to not schedule, since there were no available devices meeting the request. "All" should mean "at least one".
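To make the expectation concrete, here is a minimal toy model of the semantics I have in mind. It is not the scheduler's actual code: `Device`, `allocate_all`, the `allow_empty` flag, and the `gpu.example.com` class name are all hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Device:
    """Hypothetical stand-in for a device published in a resource slice."""
    name: str
    device_class: str


def allocate_all(
    devices: List[Device],
    device_class: str,
    allow_empty: bool = False,
) -> Optional[List[Device]]:
    """Toy model of an "All" request against one node's devices.

    Returns the matching devices, or None if the node cannot satisfy
    the request. allow_empty=True models the behavior reported here
    (zero matches still counts as success); allow_empty=False models
    the expected behavior ("All" means "at least one").
    """
    matching = [d for d in devices if d.device_class == device_class]
    if not matching and not allow_empty:
        return None  # node should be rejected: nothing to allocate
    return matching


if __name__ == "__main__":
    no_driver_node: List[Device] = []  # driver not running, no slices
    gpu_node = [Device("gpu-0", "gpu.example.com"),
                Device("gpu-1", "gpu.example.com")]

    print(allocate_all(no_driver_node, "gpu.example.com", allow_empty=True))
    print(allocate_all(no_driver_node, "gpu.example.com"))
    print(len(allocate_all(gpu_node, "gpu.example.com")))
```

With `allow_empty=True` the node without a driver is accepted with an empty allocation, which matches what I observed. With the default, the same node is rejected, which is what I expected.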
How can we reproduce it (as minimally and precisely as possible)?
Create the resource claim template as shown and a Deployment that uses it, with no DRA driver running on the node. The Pod will still be scheduled.
Anything else we need to know?
/wg device-management