
Expected behavior for Deployment replicas with HPA during update #25238

Open
verdverm opened this issue May 6, 2016 · 98 comments
Labels
area/app-lifecycle area/workload-api/deployment kind/bug Categorizes issue or PR as related to a bug. kind/documentation Categorizes issue or PR as related to documentation. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/backlog Higher priority than priority/awaiting-more-evidence. sig/autoscaling Categorizes an issue or PR as relevant to SIG Autoscaling. sig/cli Categorizes an issue or PR as relevant to SIG CLI.

Comments

verdverm commented May 6, 2016

I am seeing unexpected behavior from a user standpoint when running kubectl apply -f mydeployment.yml.

The existing Deployment has an HPA associated with it, and the current replica count is 3. When the deployment is accepted, the number of replicas jumps to 9, then scales down to 6, the value in the Deployment spec. Eventually, metrics from the new pods start flowing in and the HPA does its thing and scales back to satisfaction.

The behavior I expected is that the rollout would use the current replicas from the HPA and the strategy parameters from the Deployment to keep replica count in line with the current target set by the HPA.

k8s version: 1.2.3

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: mydeployment
spec:
  replicas: 6
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: "40%"
      maxSurge: "50%"
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: "someimage"
---
apiVersion: extensions/v1beta1
kind: HorizontalPodAutoscaler
metadata:
  name: mydeployment
  namespace: default
spec:
  scaleRef:
    kind: Deployment
    name: mydeployment
    subresource: scale
  minReplicas: 3
  maxReplicas: 12
  cpuUtilization:
    targetPercentage: 50
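The 3 → 9 jump is consistent with apply resetting spec.replicas to 6 and the RollingUpdate surge allowance on top of that. A quick sketch of the arithmetic (rollout_bounds is a hypothetical helper, not a Kubernetes API; maxSurge rounds up and maxUnavailable rounds down, as the Deployment controller does):

```python
import math

def rollout_bounds(replicas, max_surge_pct, max_unavailable_pct):
    # maxSurge is rounded up, maxUnavailable is rounded down.
    surge = math.ceil(replicas * max_surge_pct / 100)
    unavailable = math.floor(replicas * max_unavailable_pct / 100)
    return replicas - unavailable, replicas + surge

# apply resets spec.replicas to 6; maxSurge=50%, maxUnavailable=40%:
low, high = rollout_bounds(6, 50, 40)
print(low, high)  # 4 9 -> matches the observed spike to 9 pods
```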
@DirectXMan12 (Contributor)

cc @kubernetes/autoscaling

The scaling of a deployment is based on the replicas field. When the HPA scales a deployment, it also updates the replicas field. AFAICT from your description, what's going on is that the apply operation ends up updating the replicas field, and thus is temporarily scaling the deployment manually until the HPA's next sync loop iteration (when the HPA notices the extra replicas, and corrects the replica count). You should be able to check this by checking (via kubectl get) if the replicas field on the deployment changes temporarily after the apply.

@0xmichalis (Contributor)

@kubernetes/deployment

verdverm (Author) commented May 6, 2016

This is more an issue when we have 20 replicas under load and roll a new Deployment and the replicas go from 20 -> 26 -> 6 -> ... -> 20

@roberthbailey (Contributor)

/cc @janetkuo

@bgrant0607 (Member)

@verdverm

There are multiple issues here.

Do not set the replicas field in Deployment if you're using apply and HPA. As mentioned by @DirectXMan12, apply will interfere with HPA and vice versa. If you don't set the field in the yaml, apply should ignore it.

Also, I'm not sure HPA can be expected to be stable right now with large maxUnavailable and maxSurge values. Maybe try maxUnavailable=0 and maxSurge=1.

Finally, Deployment doesn't do a good job of detecting scaling behavior right now, since it's fairly tricky to do robustly. What I'd like it to do is to distinguish scaling from other changes and then scale proportionally. #20754

@bgrant0607 bgrant0607 added priority/backlog Higher priority than priority/awaiting-more-evidence. area/app-lifecycle sig/autoscaling Categorizes an issue or PR as relevant to SIG Autoscaling. team/control-plane labels May 9, 2016
@rosskukulinski (Contributor)

I agree with @verdverm - this was unexpected behavior for me as well. At a minimum, I think the Deployment docs should call this out.

I would like to be able to use the same deployment.yml file for both the initial deploy and updates (all of this is done through CI/CD). It seems like the 'proper' way is to not specify replicas in the Deployment (which defaults to 1?) and to apply the HPA profile when launching a new service.

@verdverm (Author)

@bgrant0607 @DirectXMan12

Your suggested changes make the observed behavior match what we were expecting.
An update to the docs making this more explicit would be helpful.

Thanks for all your help!

@bgrant0607 bgrant0607 added the kind/documentation Categorizes issue or PR as related to documentation. label May 24, 2016
bgrant0607 (Member) commented May 28, 2016

I wrote an explanation of apply behavior here:
#15894

itajaja commented Aug 21, 2016

If you don't set the field in the yaml, apply should ignore it.

@bgrant0607 I am not seeing this. It seems that when I apply, the replica count is set to 0 if I don't specify the number in the file.

@bgrant0607 bgrant0607 added the kind/bug Categorizes issue or PR as related to a bug. label Aug 22, 2016
@calebamiles calebamiles modified the milestone: v1.6 Mar 8, 2017
@jszczepkowski (Contributor)

Are we sure we want to deliver this for 1.6? Not much time left.

As I understand it, the problem here lies in kubectl apply, and no modifications to the HPA are needed. Please correct me if I'm wrong.

@jszczepkowski (Contributor)

CC @jszczepkowski

@0xmichalis (Contributor)

The milestone was probably added for triaging. I don't think we can do anything for 1.6.

dsalaza4 commented Sep 24, 2021

I ended up implementing an hpa_replicas function and substituting its returned value into the deployment YAML.

@ldemailly

@pinowthebird

I did some basic testing but it seems like not specifying replicas in my yaml file simply defaults them to 1 after an apply

It's a bit more complicated than that: yes, not specifying replicas will default the value to 1 if nothing else happens; but once it is unspecified and managed by the HPA, further applies will indeed do the right thing and not reset the pod count to 1 (you can verify this with kubectl diff).

Note that if you had replicas: N in your yaml and want to remove it, you have to hack/edit the kubectl.kubernetes.io/last-applied-configuration annotation to remove it from the JSON there, to avoid a scale-down to 1 when you first fix your deployment.
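To illustrate that hack (a hedged sketch; the annotation payload below is a made-up minimal example, not a real dump), the edit amounts to deleting spec.replicas from the JSON stored in the kubectl.kubernetes.io/last-applied-configuration annotation:

```python
import json

# Made-up minimal last-applied-configuration payload; the real
# annotation carries the full manifest that was last applied.
last_applied = json.loads(
    '{"apiVersion":"apps/v1","kind":"Deployment",'
    '"spec":{"replicas":20,"template":{"spec":{}}}}'
)

# Drop spec.replicas so the next apply of a replicas-free manifest
# does not treat the field as deleted and scale the Deployment to 1.
last_applied["spec"].pop("replicas", None)

print(json.dumps(last_applied))
```

kubectl also ships an apply edit-last-applied subcommand that opens this annotation in an editor, so the same deletion can be made by hand.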

philipsu522 commented Dec 27, 2021

I'm trying to follow the recommendation in https://github.com/kubernetes/website/pull/29739/files for migrating deployments to HPA, specifically recommendation 1. Note that the patch verb exhibits the same behavior of scaling the replica count down to 1, although removing the spec.replicas field from the manifest and then applying doesn't.
Specifically, this command causes the scale-to-1 behavior:

kubectl patch -n psu-test-hpa deployment/sample-app --type=json -p='[{"op": "remove", "path": "/spec/replicas"}]'

@bgrant0607 (Member)

@apelisse Is there a way with SSA to specify an initial value of a field, such as spec.replicas, but not take ownership of it?

Discussion with @tamalsaha:
https://twitter.com/tsaha/status/1489720742300160004

apelisse (Member) commented Feb 9, 2022

We wrote this piece of documentation to describe how one should do that. I'd love to hear feedback, and whether or not it works for you!

dudicoco commented Feb 9, 2022

@apelisse thank you for the documentation, it's very clear.

However, I don't think it is usable in most real-world scenarios, where we store our yamls in git and use helm/CI/controllers to apply new configs.

apelisse (Member) commented Feb 9, 2022

However, I don't think it is usable in most real-world scenarios, where we store our yamls in git and use helm/CI/controllers to apply new configs.

Because they rely on multiple steps? i.e. "commit this, apply that, commit that, apply this" kind of sequence?

dudicoco commented Feb 9, 2022

However, I don't think it is usable in most real-world scenarios, where we store our yamls in git and use helm/CI/controllers to apply new configs.

Because they rely on multiple steps? i.e. "commit this, apply that, commit that, apply this" kind of sequence?

@apelisse because special logic would need to be added to any tool/system so that it adds the flag only when adding an HPA to an existing deployment.

@apelisse (Member)

What special logic needs to be built? I guess I must be missing that part.

@dudicoco

The logic is:

  1. Check whether the deployment exists without an HPA.
  2. If 1 is true, add the flag --field-manager=handover-to-hpa.

Please correct me if I misunderstood how this new feature works.
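Stubbed out, that decision could look like the following (a hypothetical sketch; in a real pipeline the two booleans would come from kubectl get calls):

```python
def choose_field_manager(deployment_exists: bool, hpa_exists: bool) -> str:
    # Hand ownership of spec.replicas over to the HPA only on the
    # apply that introduces an HPA to an already-existing Deployment.
    if deployment_exists and not hpa_exists:
        return "handover-to-hpa"
    return "kubectl"

print(choose_field_manager(True, False))   # handover-to-hpa
print(choose_field_manager(True, True))    # kubectl
```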

@rehevkor5

We wrote this piece of documentation to describe how one should do that. I'd love to hear feedback, and whether or not it works for you!

That is helpful if you're using kubectl, but it doesn't appear to matter if you use Helm, sadly.

rbroggi commented Mar 9, 2022

I wanted to contribute my two cents:

We have faced the issue of defining spec.replicas while having HPA enabled. In our case the situation was even more confusing: at a point when we had around 11 pods (set by the HPA), we started a deployment, and instead of seeing the number of ready replicas drop to 3 (the spec.replicas defined in our deployment.yaml), we saw it go all the way down to 0. After investigating, we realized the remaining 3 pods belonging to the old replica set were so overwhelmed with traffic, which had been handled by 11 instances until then, that the Kubernetes readiness probes started failing, and because of this very issue we faced a downtime.

@victorboissiere

We had the same problem; we ended up implementing a mutating webhook with Kyverno to make sure the current replicas value is not overridden.

Example:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: general-deployment-hpa-replicas
  annotations:
    policies.kyverno.io/title: HPA replicas spec
    policies.kyverno.io/category: Standard
    policies.kyverno.io/severity: high
    policies.kyverno.io/subject: Deployment
    policies.kyverno.io/description: >-
      Prevent spec.replicas from being overridden when the HPA handles the deployment replicas
spec:
  background: false
  schemaValidation: false # Issue: https://github.com/kyverno/kyverno/issues/2748
  rules:
  - name: Deployment spec replicas
    match:
      resources:
        kinds:
          - Deployment
    preconditions:
      all:
      - key: "{{ request.object.metadata.labels.hpa || '' }}"
        operator: Equals
        value: "true"
      - key: "{{request.object.spec.replicas }}" # Allow to scale down to zero without changes
        operator: GreaterThan
        value: 0
      - key: "{{request.operation}}"
        operator: Equals
        value: UPDATE
    mutate:
      patchStrategicMerge:
        spec:
          replicas: '{{request.object.status.replicas}}'

@dudicoco

@victorboissiere assuming that your manifests/helm charts/anything else are stored as code in git, wouldn't that create a diff when using kubectl diff/helm diff?

@shubhamsre

In our case (deployment via Helm chart) we just removed the replicas field from the deployment manifest and let the HPA handle all of it.

@kustodian

@apelisse as far as I understand, this now works properly since 1.22 if you use server-side apply, which means that to make it work for Helm and others, they need to implement server-side apply? Is that correct?

@apelisse (Member)

If you use server-side apply, and remove the value from your manifest before you apply, then yes, this should work properly.
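For concreteness, a minimal sketch of such a manifest (all names are placeholders): spec.replicas is simply absent, so a server-side apply never claims ownership of it.

```yaml
# Applied with: kubectl apply --server-side -f deployment.yaml
# spec.replicas is intentionally omitted so the HPA's field manager
# keeps ownership of the field.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mydeployment
spec:
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: someimage
```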

@kustodian

Thanks for confirming 👍

This is good if you use kubectl, but for anyone using GitOps and tools like Helm this won't work. What still needs to be done so that this issue can be resolved in general and works correctly with all tools?

@ravikyada

Thank you for this awesome discussion @kustodian.

I just fixed this issue temporarily with kubectl. Please let me know if there are any workarounds to use this flag with Helm and ArgoCD.

Thanks

madibaT commented Jul 4, 2023

I wanted to contribute my two cents:

We have faced the issue of defining spec.replicas while having HPA enabled. In our case the situation was even more confusing: at a point when we had around 11 pods (set by the HPA), we started a deployment, and instead of seeing the number of ready replicas drop to 3 (the spec.replicas defined in our deployment.yaml), we saw it go all the way down to 0. After investigating, we realized the remaining 3 pods belonging to the old replica set were so overwhelmed with traffic, which had been handled by 11 instances until then, that the Kubernetes readiness probes started failing, and because of this very issue we faced a downtime.

@rbroggi, how did you work around this? We're seeing this issue in our environment lately.

rbroggi commented Jul 4, 2023

I wanted to contribute my two cents:

We have faced the issue of defining spec.replicas while having HPA enabled. In our case the situation was even more confusing: at a point when we had around 11 pods (set by the HPA), we started a deployment, and instead of seeing the number of ready replicas drop to 3 (the spec.replicas defined in our deployment.yaml), we saw it go all the way down to 0. After investigating, we realized the remaining 3 pods belonging to the old replica set were so overwhelmed with traffic, which had been handled by 11 instances until then, that the Kubernetes readiness probes started failing, and because of this very issue we faced a downtime.

@rbroggi, how did you work around this? We're seeing this issue in our environment lately.

Hey @madibaT, it was a while ago, so take this with a grain of salt. I think you need to:

  1. Remove spec.replicas (or set it to the default value, which I don't remember exactly);
  2. Redefine your rolling update strategy according to the behavior you would like to have when you have a lot of traffic.
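For step 2, one conservative shape for the strategy (illustrative values only, in line with @bgrant0607's earlier maxUnavailable=0 / maxSurge=1 suggestion):

```yaml
# Illustrative fragment: never remove capacity during a rollout
# (maxUnavailable: 0) and add at most one surge pod at a time.
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0
      maxSurge: 1
```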

@taliastocks

If you use server-side apply, and remove the value from your manifest before you apply, then yes, this should work properly.

I'm still seeing this issue if the HPA is added and spec.replicas removed in the same step, even with server-side apply. Is the suggested workaround to add the HPA first, then remove spec.replicas after the HPA has its first scaling event?
