
Overlapping labels can lead to HPA matching incorrect pods #124307

Open
adrianmoisey opened this issue Apr 14, 2024 · 16 comments
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. sig/autoscaling Categorizes an issue or PR as relevant to SIG Autoscaling.

Comments

@adrianmoisey
Member

adrianmoisey commented Apr 14, 2024

What happened?

When an HPA targets a Deployment whose label selector matches Pods that don't belong to it (overlapping labels, for example), those "other" Pods are considered by the HPA to be part of its scale target.

What did you expect to happen?

I have always been led to believe that this behaviour is correct and that the user should use labels that are specific enough not to overlap with other Deployments.
This is stated in the Deployment documentation:

Do not overlap labels or selectors with other controllers (including other Deployments and StatefulSets). Kubernetes doesn't stop you from overlapping, and if multiple controllers have overlapping selectors those controllers might conflict and behave unexpectedly.

However, the VPA recently fixed the same behaviour in kubernetes/autoscaler#6460, which makes me wonder whether the HPA should also check the owner reference and only target that Deployment's Pods.

How can we reproduce it (as minimally and precisely as possible)?

  1. Create 2 deployments, both with the same labels/selectors
  2. In one Deployment, set resource requests
  3. Deploy an HPA (and metrics-server)
  4. Watch the HPA emit error events about Pods that are not under the control of the HPA's target Deployment
    1. e.g.: Warning FailedGetResourceMetric 87s (x3 over 117s) horizontal-pod-autoscaler failed to get cpu utilization: missing request for cpu in container ubuntu of Pod other-64886557cb-ldtnt
YAML files
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: workload
  name: workload
spec:
  replicas: 1
  selector:
    matchLabels:
      app: other
  template:
    metadata:
      labels:
        app: other
    spec:
      containers:
      - command:
        - /bin/bash
        - -c
        - sha256sum /dev/zero
        image: ubuntu
        imagePullPolicy: Always
        name: nginx
        resources:
          requests:
            cpu: 10m
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: other
  name: other
spec:
  replicas: 1
  selector:
    matchLabels:
      app: other
  template:
    metadata:
      labels:
        app: other
    spec:
      containers:
      - command:
        - /bin/bash
        - -c
        - sleep infinity
        image: ubuntu
        imagePullPolicy: Always
        name: ubuntu
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: workload
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: workload
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50

Anything else we need to know?

No response

Kubernetes version

$ kubectl version
Client Version: v1.29.3
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.2

Cloud provider

  • kind
  • GKE
  • kOps (AWS)

OS version

No response

Install tools

kind

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

@adrianmoisey adrianmoisey added the kind/bug Categorizes issue or PR as related to a bug. label Apr 14, 2024
@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Apr 14, 2024
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Apr 14, 2024
@adrianmoisey
Member Author

/sig autoscaling

@k8s-ci-robot k8s-ci-robot added sig/autoscaling Categorizes an issue or PR as relevant to SIG Autoscaling. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Apr 14, 2024
@dbenque

dbenque commented Apr 15, 2024

Since the spec uses a scaleTargetRef and not a labelSelector, I think it makes sense to scope the HPA to the Deployment and not to the set of Pods matching that Deployment's labelSelector. This is what is done in the VPA and also in the Deployment controller.

That being said, the documentation clearly says:

The controller manager finds the target resource defined by the scaleTargetRef, then selects the pods based on the target resource's .spec.selector labels,...

So fixing this would be a change in behaviour, as discussed in the SIG. Does it really make sense to have Deployments with overlapping selectors? It is technically possible, yes, but is it good practice? Probably not. It leads to many problems, such as overlapping PDBs, which result in 500s when you try to evict Pods. If there is a real need for using a labelSelector to define the scope of an HPA, maybe the next HPA spec should explicitly introduce that capability by offering both scaleTargetRef and labelSelector in the spec (with only one usable at a time). I am clearly not in favour of such a change, but if the need really exists, that would be a clear way to address it.

I would be in favour of providing a means to fix this issue and align the behaviour of all the controllers (VPA, Deployment). If there is consensus on fixing it, then to protect backward compatibility, maybe we can propose gating the fix at the controller (or HPA object) level using a new parameter (or an annotation on the HPA). With that new parameter the user can decide on the behaviour: "strict scaleTargetRef using ownerRef" versus "labelSelector".
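
For illustration only, a hedged sketch of what such a per-HPA opt-in could look like. The annotation name below is purely hypothetical and invented for this example; nothing like it exists today:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: workload
  annotations:
    # hypothetical annotation, invented for this example: opt in to
    # owner-reference-based Pod matching instead of label-selector matching
    autoscaling.alpha.kubernetes.io/pod-selection: owner-ref
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: workload
  minReplicas: 1
  maxReplicas: 10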

@adrianmoisey
Member Author

I think it makes sense to scope the HPA to the deployment and not to the set of pods matching the labelSelector of that deployment. This is what is done in the VPA and also in the Deployment Controller.

As an end user, I think this is what one would "expect" when starting out on Kubernetes.
I also really like the consistency that this would add.

@sftim
Contributor

sftim commented Apr 22, 2024

I think we might need a KEP to change the way that Kubernetes interprets the spec of a HorizontalPodAutoscaler.

VerticalPodAutoscaler is not part of the formal Kubernetes API (yet), whereas HorizontalPodAutoscaler already is.

@OmarHawk

OmarHawk commented Apr 26, 2024

Just ran into the same issue I believe.

We have the following situation:

  • one namespace, app
  • two Deployments in that namespace, app-frontend and app-backend
  • both Deployments, their ReplicaSets, and their Pods have overlapping labels like this, since everything sort of belongs together:
    -- app: app
  • at the same time, each Deployment/ReplicaSet/Pod also has its own distinguishing label (just FYI):
    -- Deployment 1: app-role: app-frontend
    -- Deployment 2: app-role: app-backend
  • both have the following selector in their Deployment spec:
  selector:
    matchLabels:
      app: app
  • they have the following default replica count:
    -- app-frontend: 2
    -- app-backend: 1
  • and the following resource requests / limits (cpu) per Pod:
    -- app-frontend: 200m / 1000m
    -- app-backend: 600m / 1200m

The backend performs heavy operations in the background, keeping its single Pod at nearly 100% utilization all the time (which is fine for us), while the frontend sits at near-zero load, since it only generates load on a per-request basis.

We have configured an HPA that has scaleTargetRef like this:

scaleTargetRef:
    kind: Deployment
    name: app-frontend
    apiVersion: apps/v1

So my assumption would be that metrics defined like this:

  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80

would only apply to the single app-frontend Deployment, since that is also the target.
From what I observe, the metrics seem to also take the heavy-load app-backend Pods into account, leading to undesired scale-up operations on the app-frontend Deployment. In my opinion this is documented nowhere; I tried to find an explanation for what I observed, but haven't found anything. Additionally, I think this is undesired behaviour.

By the way, scaling itself happens only on app-frontend, not on app-backend, which is fine.

@adrianmoisey
Member Author

What are the label selectors for each deployment?

@OmarHawk

OmarHawk commented Apr 26, 2024

I've updated my description.

@OmarHawk

After adjusting the matchLabels selectors accordingly, it actually seems to work properly.
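
For anyone else hitting this, a sketch of what non-overlapping selectors could look like for the two Deployments described above (label values taken from the earlier description; the exact change made may differ, and note that a Deployment's spec.selector is immutable, so the Deployments may need to be recreated):

app-frontend Deployment:
  selector:
    matchLabels:
      app: app
      app-role: app-frontend

app-backend Deployment:
  selector:
    matchLabels:
      app: app
      app-role: app-backend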

@elvis-cai

elvis-cai commented Jun 3, 2024

Same here: when I created different Deployments with the same labels and pointed HPAs at the different Deployments, the HPA-reported usage was wrong (the targets were always the same). After making sure each referenced target Deployment's labels were unique, the issue was gone.
Before

kl get hpa 
NAME                REFERENCE                      TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
danswer-api         Deployment/danswer-api         87%/80%   1         3         3          15m
danswer-indexing    Deployment/danswer-indexing    87%/80%   1         3         3          59m
danswer-inference   Deployment/danswer-inference   87%/80%   1         3         3          59m

After

kl get hpa 
NAME                REFERENCE                      TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
danswer-api         Deployment/danswer-api         28%/80%   1         3         1          14m
danswer-indexing    Deployment/danswer-indexing    98%/80%   1         3         3          79m
danswer-inference   Deployment/danswer-inference   68%/80%   1         3         3          79m

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 1, 2024
@adrianmoisey
Member Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 2, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 1, 2024
@adrianmoisey
Member Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 1, 2024
@adrianmoisey
Member Author

I think I'll take a stab at writing a KEP for this
/assign

@adrianmoisey
Member Author

I've been digging into this more.
It seems that the existing behaviour of matching Pods outside of the target Deployment doesn't actually work correctly: the HPA artificially inflates the replica count. I think it's due to this:

https://github.com/kubernetes/kubernetes/blob/v1.32.0/pkg/controller/podautoscaler/replica_calculator.go#L103

readyPodCount can be too high, since additional Pods are matched.
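
Roughly, and as a simplified illustration only (the real calculator also handles missing metrics, unready Pods, and the tolerance band), the replica count for a resource metric is computed along the lines of:

  desiredReplicas = ceil( usageRatio * readyPodCount )

So if the target Deployment has 1 ready Pod and a usage ratio of 0.8 (i.e. below target), but 2 Pods from an unrelated Deployment also match the selector and are counted as ready, the result becomes ceil(0.8 * 3) = 3 instead of ceil(0.8 * 1) = 1, and the HPA scales out even though the target's own utilization is under target.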
