Disruption controller handles PodDisruptionBudget with empty selector incorrectly #95083
Description
What happened:
The disruption controller that enforces PodDisruptionBudgets interprets an empty label selector incorrectly—in conflict with how the API machinery treats such an empty selector—defeating attempts to cover all pods in a namespace with a PDB.
In a namespace with at least one pod running, create two PodDisruptionBudgets:
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
name: inconseqential
spec:
# NB: The "selector" field is absent.
minAvailable: 1
---
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
name: total
spec:
selector: {}
minAvailable: 1
I named these "inconsequential" and "total" to clarify their intended interpretation and effect:
- PDB "inconsequential" has no selector, and selects no pods.
Since thePodDisruptionBudget.Selector
field is optional, the validation procedure allows its omission. - PDB "total" has an empty selector, selecting all pods.
An empty selector—one that imposes no restricting constraints—selects everything.
The disruption controller uses the PDB's selector to find pods covered by the budget. There, it mistakenly interprets the v1.Labels.Empty
function returning true as meaning that the selector covers no pods.
sel, err := metav1.LabelSelectorAsSelector(pdb.Spec.Selector)
if sel.Empty() {
return []*v1.Pod{}, nil
}
Note that labels.Selector.Empty
means something different. Quoting the documentation:
Empty returns true if this selector does not restrict the selection space.
That is, the empty label selector selects the universe. Adding restrictions—taking it from empty to populated—narrows its scope, selecting fewer target objects. The labels.nothingSelector
returned by labels.Nothing
is not empty. It is as full as can be, with so many restrictions that no target can satisfy it. By contrast, the selector returned by labels.Everything
is empty, because it contains no restrictions.
Bringing this back to PodDisruptionBudget, a budget with no selector (PodDisruptionBudget.Selector
is nil) uses labels.Nothing
. A budget with a present but empty selector (PodDisruptionBudget.Selector
is non-nil, but an empty map) uses lables.Everything
. Even though lables.Everything
is empty, that does not mean that it selects nothing. That means that it imposes no restrictions, and therefore selects the entire universe.
Looking at the kubectl describe poddisruptionbudgets output, we can see this mistaken interpretation:
Name: inconseqential
Namespace: test
Min available: 1
Selector: <unset>
Status:
Allowed disruptions: 0
Current: 0
Desired: 1
Total: 0
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal NoPods 17s (x3 over 39s) controllermanager No matching pods found
Name: total
Namespace: test
Min available: 1
Selector: <none>
Status:
Allowed disruptions: 0
Current: 0
Desired: 1
Total: 0
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal NoPods 3s (x5 over 38s) controllermanager No matching pods found
Observe that PDB "total" matches no pods, even though there are pods running in that namespace. Its selector is described as "<none>", which is how the API machinery's v1.FormatLabelSelector
shows a selector with no restrictions. There, "none" means "nothing less than the universe".
What you expected to happen:
PDB "total" defined above should have covered all the pods in the namespace, instead of none of them.
How to reproduce it (as minimally and precisely as possible):
- Choose a namespace with at least one running pod.
- Create a PodDisruptionBudget similar to "total" above, with its "selector" field set to an empty YAML sequence or JSON object (
{}
). - Inspect the PodDisruptionBudget's status via kubectl describe or kubectl get poddisruptionbudget total --output=jsonpath='{.status.expectedPods}'.
- Observe that the "expected pods" count is zero, instead of the count of the pods running in the namespace (assuming use of the
PodDisruptionBudget.MinAvailable
field with an integer value).
Anything else we need to know?:
This code in the disruption controller came in with commit d60ba3c as part of #25921. @davidopp noted in review twice (#25921 (comment) and #25921 (comment)) that an empty selector matches everything. @mml responded (#25921 (comment)) that he was mimicking behavior from elsewhere, and that we'd trap empty selectors during validation, but as of today we don't. There's also a comment in the code that clarifies that @mml intended an empty selector to select nothing, rather than everything.
// If a PDB with a nil or empty selector creeps in, it should match nothing, not everything.
If we don't intend for PDBs with empty selectors to be meaningful, we should make the PodDisruptionBudget.Selector
field mandatory, and validate that it's not empty upon creation or update. See kubernetes/enhancements#904 for the latest graduation criteria for PodDisruptionBudget.
Environment:
- Kubernetes version (use
kubectl version
):
Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.2", GitCommit:"f5743093fd1c663cb0cbc89748f730662345d44d", GitTreeState:"clean", BuildDate:"2020-09-16T21:51:49Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.2", GitCommit:"f5743093fd1c663cb0cbc89748f730662345d44d", GitTreeState:"clean", BuildDate:"2020-09-16T13:32:58Z", GoVersion:"go1.15", Compiler:"gc", Platform:"linux/amd64"}
- Cloud provider or hardware configuration:
AWS EC2 - OS (e.g:
cat /etc/os-release
):
NAME="Flatcar Container Linux by Kinvolk"
ID=flatcar
ID_LIKE=coreos
VERSION=2605.4.0
VERSION_ID=2605.4.0
BUILD_ID=2020-09-14-1802
PRETTY_NAME="Flatcar Container Linux by Kinvolk 2605.4.0 (Oklo)"
ANSI_COLOR="38;5;75"
HOME_URL="https://flatcar-linux.org/"
BUG_REPORT_URL="https://issues.flatcar-linux.org"
FLATCAR_BOARD="amd64-usr"
- Kernel (e.g.
uname -a
):
Linux ip-10-3-2-117.us-east-2.compute.internal 5.4.65-flatcar #1 SMP Mon Sep 14 17:27:17 -00 2020 x86_64 Intel(R) Xeon(R) Platinum 8175M CPU @ 2.50GHz GenuineIntel GNU/Linux
- Install tools:
kops - Network plugin and version (if this is a network-related bug):
Calico version 3.15.3