
Expected behavior for Deployment replicas with HPA during update #25238

Open
verdverm opened this issue May 6, 2016 · 98 comments
Labels
area/app-lifecycle area/workload-api/deployment kind/bug Categorizes issue or PR as related to a bug. kind/documentation Categorizes issue or PR as related to documentation. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/backlog Higher priority than priority/awaiting-more-evidence. sig/autoscaling Categorizes an issue or PR as relevant to SIG Autoscaling. sig/cli Categorizes an issue or PR as relevant to SIG CLI.

Comments

verdverm commented May 6, 2016

I am seeing unexpected behavior from a user standpoint when running kubectl apply -f mydeployment.yml.

The existing Deployment has an HPA associated with it, and the current replica count is 3. When the deployment is accepted, the number of replicas jumps to 9, then scales down to 6, the value in the Deployment spec. Eventually, metrics from the new pods start flowing in and the HPA does its thing and scales back to satisfaction.

The behavior I expected is that the rollout would use the current replicas from the HPA and the strategy parameters from the Deployment to keep replica count in line with the current target set by the HPA.

k8s version: 1.2.3

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: mydeployment
spec:
  replicas: 6
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: "40%"
      maxSurge: "50%"
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: "someimage"
---
apiVersion: extensions/v1beta1
kind: HorizontalPodAutoscaler
metadata:
  name: mydeployment
  namespace: default
spec:
  scaleRef:
    kind: Deployment
    name: mydeployment
    subresource: scale
  minReplicas: 3
  maxReplicas: 12
  cpuUtilization:
    targetPercentage: 50
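The 3 → 9 jump is consistent with apply resetting spec.replicas to 6 and the RollingUpdate surge allowance on top of that. A quick sketch of the arithmetic (rollout_bounds is a hypothetical helper, not a Kubernetes API; maxSurge rounds up and maxUnavailable rounds down, as the Deployment controller does):

```python
import math

def rollout_bounds(replicas, max_surge_pct, max_unavailable_pct):
    # maxSurge is rounded up, maxUnavailable is rounded down.
    surge = math.ceil(replicas * max_surge_pct / 100)
    unavailable = math.floor(replicas * max_unavailable_pct / 100)
    return replicas - unavailable, replicas + surge

# apply resets spec.replicas to 6; maxSurge=50%, maxUnavailable=40%:
low, high = rollout_bounds(6, 50, 40)
print(low, high)  # 4 9 -> matches the observed spike to 9 pods
```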
@DirectXMan12 (Contributor)

cc @kubernetes/autoscaling

The scaling of a deployment is based on the replicas field. When the HPA scales a deployment, it also updates the replicas field. AFAICT from your description, what's going on is that the apply operation ends up updating the replicas field, and thus is temporarily scaling the deployment manually until the HPA's next sync loop iteration (when the HPA notices the extra replicas, and corrects the replica count). You should be able to check this by checking (via kubectl get) if the replicas field on the deployment changes temporarily after the apply.

@0xmichalis (Contributor)

@kubernetes/deployment

verdverm (Author) commented May 6, 2016

This is more an issue when we have 20 replicas under load and roll a new Deployment and the replicas go from 20 -> 26 -> 6 -> ... -> 20

@roberthbailey (Contributor)

/cc @janetkuo

@bgrant0607 (Member)

@verdverm

There are multiple issues here.

Do not set the replicas field in Deployment if you're using apply and HPA. As mentioned by @DirectXMan12, apply will interfere with HPA and vice versa. If you don't set the field in the yaml, apply should ignore it.

Also, I'm not sure HPA can be expected to be stable right now with large maxUnavailable and maxSurge values. Maybe try maxUnavailable=0 and maxSurge=1.

Finally, Deployment doesn't do a good job of detecting scaling behavior right now, since it's fairly tricky to do robustly. What I'd like it to do is to distinguish scaling from other changes and then scale proportionally. #20754

@bgrant0607 bgrant0607 added priority/backlog Higher priority than priority/awaiting-more-evidence. area/app-lifecycle sig/autoscaling Categorizes an issue or PR as relevant to SIG Autoscaling. team/control-plane labels May 9, 2016
@rosskukulinski (Contributor)

I agree with @verdverm - this was unexpected behavior for me as well. At a minimum, I think the Deployment docs should call this out.

I would like to be able to use the same deployment.yml file for both the initial deploy and updates (all of this is done through CI/CD). It seems like the 'proper' way is to not specify replicas in the Deployment (which defaults to 1?) and to apply the HPA profile when launching a new service.

@verdverm (Author)

@bgrant0607 @DirectXMan12

Your suggested changes make the observed behavior match what we were expecting.
An update to the docs making this more explicit would be helpful.

Thanks for all your help!

@bgrant0607 bgrant0607 added the kind/documentation Categorizes issue or PR as related to documentation. label May 24, 2016
bgrant0607 (Member) commented May 28, 2016

I wrote an explanation of apply behavior here:
#15894

itajaja commented Aug 21, 2016

If you don't set the field in the yaml, apply should ignore it.

@bgrant0607 I am not seeing this. It seems that when I apply, the replica count is set to 0 if I don't specify the number in the file.

@bgrant0607 bgrant0607 added the kind/bug Categorizes issue or PR as related to a bug. label Aug 22, 2016
@calebamiles calebamiles modified the milestone: v1.6 Mar 8, 2017
@jszczepkowski (Contributor)

Are we sure we want to deliver this for 1.6? Not much time left.

As I understand it, the problem here lies in kubectl apply, and no modifications to the HPA are needed. Please correct me if I'm wrong.

@jszczepkowski (Contributor)

CC @jszczepkowski

@0xmichalis (Contributor)

The milestone was probably added for triaging. I don't think we can do anything for 1.6.

dsalaza4 commented Sep 24, 2021

I ended up implementing an hpa_replicas function and substituting its returned value into the deployment YAML.

@ldemailly

@pinowthebird

I did some basic testing but it seems like not specifying replicas in my yaml file simply defaults them to 1 after an apply

It's a bit more complicated than that: yes, not specifying replicas will default the value to 1 if nothing else happens; but once it is unspecified and managed by the HPA, further applies will indeed do the right thing and not reset the pod count to 1 (you can verify this with kubectl diff).

Note that if you had replicas: N in your yaml and want to remove it, you have to hack/edit the kubectl.kubernetes.io/last-applied-configuration annotation to remove it from the JSON there, to avoid a scale-down to 1 when you first fix your deployment.
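To illustrate that hack (a hedged sketch; the annotation payload below is a made-up minimal example, not a real dump), the edit amounts to deleting spec.replicas from the JSON stored in the kubectl.kubernetes.io/last-applied-configuration annotation:

```python
import json

# Made-up minimal last-applied-configuration payload; the real
# annotation carries the full manifest that was last applied.
last_applied = json.loads(
    '{"apiVersion":"apps/v1","kind":"Deployment",'
    '"spec":{"replicas":20,"template":{"spec":{}}}}'
)

# Drop spec.replicas so the next apply of a replicas-free manifest
# does not treat the field as deleted and scale the Deployment to 1.
last_applied["spec"].pop("replicas", None)

print(json.dumps(last_applied))
```

kubectl also ships an apply edit-last-applied subcommand that opens this annotation in an editor, so the same deletion can be made by hand.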

philipsu522 commented Dec 27, 2021

I'm trying to follow the recommendation in https://github.com/kubernetes/website/pull/29739/files for migrating deployments to HPA, specifically recommendation 1. Note that the patch verb exhibits the same behavior of scaling the replica count down to 1, although removing the spec.replicas field from the manifest and then applying doesn't.
Specifically, this command causes the scale-to-1 behavior:

kubectl patch -n psu-test-hpa deployment/sample-app --type=json -p='[{"op": "remove", "path": "/spec/replicas"}]'

@bgrant0607 (Member)

@apelisse Is there a way with SSA to specify an initial value of a field, such as spec.replicas, but not take ownership of it?

Discussion with @tamalsaha:
https://twitter.com/tsaha/status/1489720742300160004

apelisse (Member) commented Feb 9, 2022

We wrote this piece of documentation to describe how one should do that. I'd love to hear feedback, and whether or not it works for you!

dudicoco commented Feb 9, 2022

@apelisse thank you for the documentation, it's very clear.

However, I don't think it is usable in most real-world scenarios, where we store our yamls in git and use helm/CI/controllers to apply new configs.

apelisse (Member) commented Feb 9, 2022

However, I don't think it is usable in most real-world scenarios, where we store our yamls in git and use helm/CI/controllers to apply new configs.

Because they rely on multiple steps? i.e. "commit this, apply that, commit that, apply this" kind of sequence?

dudicoco commented Feb 9, 2022

However, I don't think it is usable in most real-world scenarios, where we store our yamls in git and use helm/CI/controllers to apply new configs.

Because they rely on multiple steps? i.e. "commit this, apply that, commit that, apply this" kind of sequence?

@apelisse because special logic would need to be added to any tool/system so that it adds the flag only when adding an HPA to an existing deployment.

@apelisse (Member)

What special logic needs to be built? I guess I must be missing that part.

@dudicoco

The logic is:

  1. Check whether the deployment exists without an HPA.
  2. If 1 is true, add the flag --field-manager=handover-to-hpa.

Please correct me if I misunderstood how this new feature works.
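Stubbed out, that decision could look like the following (a hypothetical sketch; in a real pipeline the two booleans would come from kubectl get calls):

```python
def choose_field_manager(deployment_exists: bool, hpa_exists: bool) -> str:
    # Hand ownership of spec.replicas over to the HPA only on the
    # apply that introduces an HPA to an already-existing Deployment.
    if deployment_exists and not hpa_exists:
        return "handover-to-hpa"
    return "kubectl"

print(choose_field_manager(True, False))   # handover-to-hpa
print(choose_field_manager(True, True))    # kubectl
```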

@rehevkor5

We wrote this piece of documentation to describe how one should do that. I'd love to hear feedback, and whether or not it works for you!

That is helpful if you're using kubectl, but it doesn't appear to matter if you use Helm, sadly.

rbroggi commented Mar 9, 2022

I wanted to contribute my two cents:

We have faced the issue of defining spec.replicas while having HPA enabled. In our case the situation was even more confusing: at a point when we had around 11 pods (set by the HPA), we started a deployment, and instead of seeing the number of ready replicas drop to 3 (the spec.replicas defined in our deployment.yaml), we saw it go all the way down to 0. After investigating, we realized the remaining 3 pods belonging to the old replica set were so overwhelmed with traffic, which had been handled by 11 instances until then, that the Kubernetes readiness probes started failing, and because of this very issue we faced a downtime.

@victorboissiere

We had the same problem; we ended up implementing a mutating webhook with Kyverno to make sure the current replicas value is not overridden.

Example:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: general-deployment-hpa-replicas
  annotations:
    policies.kyverno.io/title: HPA replicas spec
    policies.kyverno.io/category: Standard
    policies.kyverno.io/severity: high
    policies.kyverno.io/subject: Deployment
    policies.kyverno.io/description: >-
      Prevent spec.replicas from being overridden when the HPA handles the deployment replicas
spec:
  background: false
  schemaValidation: false # Issue: https://github.com/kyverno/kyverno/issues/2748
  rules:
  - name: Deployment spec replicas
    match:
      resources:
        kinds:
          - Deployment
    preconditions:
      all:
      - key: "{{ request.object.metadata.labels.hpa || '' }}"
        operator: Equals
        value: "true"
      - key: "{{request.object.spec.replicas }}" # Allow to scale down to zero without changes
        operator: GreaterThan
        value: 0
      - key: "{{request.operation}}"
        operator: Equals
        value: UPDATE
    mutate:
      patchStrategicMerge:
        spec:
          replicas: '{{request.object.status.replicas}}'

@dudicoco

@victorboissiere assuming that your manifests/helm charts/anything else are stored as code in git, wouldn't that create a diff when using kubectl diff/helm diff?

@shubhamsre

In our case (deployment via Helm chart) we just removed the replicas field from the deployment manifest and let the HPA handle all of it.

@kustodian

@apelisse as far as I understand, this now works properly since 1.22 if you use server-side apply, which means that to make it work for Helm and others, they need to implement server-side apply? Is that correct?

@apelisse (Member)

If you use server-side apply, and remove the value from your manifest before you apply, then yes, this should work properly.
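For concreteness, a minimal sketch of such a manifest (all names are placeholders): spec.replicas is simply absent, so a server-side apply never claims ownership of it.

```yaml
# Applied with: kubectl apply --server-side -f deployment.yaml
# spec.replicas is intentionally omitted so the HPA's field manager
# keeps ownership of the field.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mydeployment
spec:
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: someimage
```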

@kustodian

Thanks for confirming 👍

This is good if you use kubectl, but for anyone using GitOps and tools like Helm this won't work. What still needs to be done so that this issue can be resolved in general and works correctly with all tools?

@ravikyada

Thank you for this awesome discussion @kustodian.

I just fixed this issue temporarily with kubectl. Please let me know if there are any workarounds to use this flag with Helm and ArgoCD.

Thanks

madibaT commented Jul 4, 2023

I wanted to contribute my two cents:

We have faced the issue of defining spec.replicas while having HPA enabled. In our case the situation was even more confusing: at a point when we had around 11 pods (set by the HPA), we started a deployment, and instead of seeing the number of ready replicas drop to 3 (the spec.replicas defined in our deployment.yaml), we saw it go all the way down to 0. After investigating, we realized the remaining 3 pods belonging to the old replica set were so overwhelmed with traffic, which had been handled by 11 instances until then, that the Kubernetes readiness probes started failing, and because of this very issue we faced a downtime.

@rbroggi, how did you work around this? We're seeing this issue in our environment lately.

rbroggi commented Jul 4, 2023

I wanted to contribute my two cents:

We have faced the issue of defining spec.replicas while having HPA enabled. In our case the situation was even more confusing: at a point when we had around 11 pods (set by the HPA), we started a deployment, and instead of seeing the number of ready replicas drop to 3 (the spec.replicas defined in our deployment.yaml), we saw it go all the way down to 0. After investigating, we realized the remaining 3 pods belonging to the old replica set were so overwhelmed with traffic, which had been handled by 11 instances until then, that the Kubernetes readiness probes started failing, and because of this very issue we faced a downtime.

@rbroggi, how did you work around this? We're seeing this issue in our environment lately.

Hey @madibaT, it was a while ago, so take this with a grain of salt. I think you need to:

  1. Remove spec.replicas (or set it to the default value, which I don't remember exactly);
  2. Redefine your rolling update strategy according to the behavior you would like to have when you have a lot of traffic.
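For step 2, one conservative shape for the strategy (illustrative values only, in line with @bgrant0607's earlier maxUnavailable=0 / maxSurge=1 suggestion):

```yaml
# Illustrative fragment: never remove capacity during a rollout
# (maxUnavailable: 0) and add at most one surge pod at a time.
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0
      maxSurge: 1
```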

@taliastocks

If you use server-side apply, and remove the value from your manifest before you apply, then yes, this should work properly.

I'm still seeing this issue if the HPA is added and spec.replicas removed in the same step, even with server-side apply. Is the suggested workaround to add the HPA first, then remove spec.replicas after the HPA has its first scaling event?
