Make the waiting time before "autoscale-up/down" becomes effective programmable #56335
Comments
/sig autoscaling
Hi @rsd35410! Cool idea! Just a couple of clarifying questions to make sure I'm understanding. Currently, it's possible to specify the upscale/downscale forbidden windows only globally, on the controller manager.

Curious on your thoughts @DirectXMan12 and @MaciekPytel? I personally think allowing window specification on a per-HPA basis could be a useful change (especially given our discussions on autoscaler performance, @MaciekPytel). The main downside I see is that it requires changing the public API. Fortunately, I don't think it would need to be a breaking change, as we could always fall back to the controller values. I'm happy to take a stab at implementing these changes if we do go down this path.

For your second suggestion, do you mind providing an example of a time when "when the "autoscale-up" is performed, it is not directly scaled to the max number defined inside the HPA"? Do you mean that if you have 3 nodes operating at 120% of their desired capacity, the number of desired nodes is the same, regardless of whether maxReplicas is 10 or 100?
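For reference, those controller values are set globally today via kube-controller-manager flags. A rough sketch of what that looks like, assuming a kubeadm-style static-pod manifest and the 1.8/1.9-era flag names (the image tag is illustrative):

```yaml
# Sketch only: kube-controller-manager static-pod manifest with the global HPA windows.
apiVersion: v1
kind: Pod
metadata:
  name: kube-controller-manager
  namespace: kube-system
spec:
  containers:
  - name: kube-controller-manager
    image: k8s.gcr.io/kube-controller-manager:v1.9.0   # illustrative tag
    command:
    - kube-controller-manager
    # Forbidden window applied to scale-up decisions (cluster-wide).
    - --horizontal-pod-autoscaler-upscale-delay=3m0s
    # Forbidden window applied to scale-down decisions (cluster-wide).
    - --horizontal-pod-autoscaler-downscale-delay=5m0s
```

Because these are controller-manager flags, every HPA in the cluster shares the same windows, which is exactly the limitation this issue is about.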
Yes, my idea is to have multiple autoscalers in the same cluster with possibly different custom metrics, and so with different upscale and downscale forbidden windows for each.

For my second suggestion, I will try to explain my use case: I have workers consuming jobs from a queue and I scale them on the number of items waiting in the queue. When a backlog builds up, I don't want the HPA to jump straight to a very large number of replicas in one step. That's why I'm asking for a kind of "step", in order to have this programmable.

I hope my explanation is clear enough.
Hi @rsd35410, @mattjmcnaughton,
Yeah, we've discussed this quite a bit in the past, and the conclusion has always been that the forbidden window is an implementation detail that we really shouldn't need at all, so we shouldn't expose it as a knob. There's an argument to be made for a pragmatic approach, but even then, it probably shouldn't be an API field (at most an annotation).

As for the step field, I'm not certain how that helps your use case, but I'm not certain I understand exactly what you're describing.

As for the use case itself, it's probably a better idea to scale on the ratio between incoming and processed messages, and then weight it a little to process additional backlog. The problem with directly scaling on the number of jobs in the queue is that picking a good target number is strange (do you actually want to always have 3 items in the queue, or would you really prefer to just be able to process items as they're coming in?), and beyond that, increasing the number of replicas proportionally to the number of jobs in the queue is not necessarily the best way to scale: what if jobs are coming in just as fast as you are processing them, but there's a 10-item backlog? Then, if you have a target of 1 backlog item, you'll get 10 times as many pods as you have now, which is probably not what you want -- you probably just want 1 or 2 extra pods to process down the backlog.
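As a rough sketch of what scaling on such a ratio could look like with the autoscaling/v2beta1 API: the per-pod metric name queue_incoming_to_processed_ratio and the workload names below are hypothetical, and it assumes each worker pod exposes that ratio through the custom metrics API:

```yaml
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: queue-worker
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: queue-worker
  minReplicas: 2
  maxReplicas: 20
  metrics:
  - type: Pods
    pods:
      # Hypothetical per-pod metric: incoming messages / processed messages.
      metricName: queue_incoming_to_processed_ratio
      # Target slightly below 1.0 (900m = 0.9) so the HPA keeps a little
      # headroom to work down any backlog, rather than just keeping up.
      targetAverageValue: 900m
```

Setting the average target slightly below 1 means the deployment runs one or two extra pods at steady state and gradually works down a backlog, instead of multiplying the replica count by the backlog size.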
@MaciekPytel @DirectXMan12 that makes total sense! I agree with not wanting to "double down" on adding customization to an implementation detail that you feel shouldn't really be necessary. I'll give some thought to whether there is a good alternative and we can sync after the 1.9 release. Good luck getting 1.9 out the door :)
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/remove-lifecycle stale
I'm not convinced we want to expose the fields as requested in this issue, but the problem of arbitrary forbidden periods and the 2x limit on scale-up is still there and needs to be addressed.
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
For future readers of this issue, we're brainstorming improvements here: https://docs.google.com/document/d/1Gy90Rbjazq3yYEUL-5cvoVBgxpzcJC9vcfhAkkhMINs/edit#heading=h.9oka059ig9n5
/remove-lifecycle stale
Specifying the HPA windows for up-scale and down-scale on a per-HPA basis makes a lot of sense. We have several HPAs and some of them need aggressive scale-up and some of them do not (and should not). @DirectXMan12, is this specific use of instance-specific cooldown also being discussed as part of the effort you linked?
@foxish yes, it's on the list of things we're discussing.
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/remove-lifecycle stale
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Hey there, I'm working on an RFC to add some configuration parameters to the HPA. Will the following RFC help solve your issues?
Are the "scale-up" & "scale-down" waiting time ideas still being discussed in this issue? If not can someone suggest a pointer of where to look, the google doc seems to have gone quiet. I'd be happy to open a new issue / doc, I have an app that I'd like to scale via HPA, but is rather sensitive to thrashing. |
@dturn: if you're talking about my document, the work is being done here:
I'm also interested in having "scale-up" & "scale-down" waiting times and a "scale-step", because the HPA adds as many pods as it can rather than scaling relative to how far the metric is over the threshold.
@sqerison: This is covered by my KEP.
@gliush, I see, thanks. But it is still not merged and not released. Is there some possibility to add new pods smoothly, for example 10 at a time? Currently the HPA goes from 100 to 200 pods in just one minute when the metric is only 10-20% above the threshold.
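For reference, the configuration proposed in that KEP later shipped as the behavior field of autoscaling/v2beta2 (Kubernetes 1.18). A rough sketch of the "add at most 10 pods per minute" case, assuming that API version is available in the cluster and using illustrative workload names:

```yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: worker
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: worker
  minReplicas: 10
  maxReplicas: 200
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  behavior:
    scaleUp:
      # Add at most 10 pods per minute instead of jumping in a single step.
      policies:
      - type: Pods
        value: 10
        periodSeconds: 60
    scaleDown:
      # Only scale down after the recommendation has been lower for 5 minutes.
      stabilizationWindowSeconds: 300
```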
Rotten issues close after 30d of inactivity. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
@fejta-bot: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Is this a BUG REPORT or FEATURE REQUEST?:
Kind feature
@kubernetes/sig-autoscaling-feature-requests
@kubernetes/sig/autoscaling
What happened:
There is no way to change the waiting time before the "autoscale-up/down" is effective.
In addition, when the "autoscale-up" is performed, it is not directly scaled to the max number defined inside the HPA.
What you expected to happen:
Having these parameters configurable in the template definition of the HPA, for example:
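Something along these lines (the fields upscaleForbiddenWindow, downscaleForbiddenWindow, and scaleUpStep below are purely illustrative and not part of any existing Kubernetes API):

```yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: my-worker
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-worker
  minReplicas: 2
  maxReplicas: 100
  targetCPUUtilizationPercentage: 70
  # Hypothetical fields -- not part of any existing API, shown only to
  # illustrate what this issue is asking for:
  upscaleForbiddenWindow: 1m     # wait this long after a scale-up before scaling again
  downscaleForbiddenWindow: 10m  # wait this long after a scale-down before scaling again
  scaleUpStep: 10                # add at most this many replicas per scaling event
```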
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
Environment:
- Kubernetes version (use `kubectl version`):
- Kernel (e.g. `uname -a`):