Ability to enable/disable replication controller #37086
Comments
@kubernetes/sig-apps @kubernetes/deployment pausing a Deployment foo does not stop the deployment controller from managing replicas for foo. Not sure if we can have pausing for a ReplicationController foo-1 that would make the RC manager stop managing its replicas.
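For context, a minimal sketch of the asymmetry being described here, assuming a Deployment named foo and an RC named foo-1 (names illustrative):

```sh
# Deployments can have their rollout paused, but the controller still
# reconciles replica counts for the underlying ReplicaSets / pods:
kubectl rollout pause deployment/foo

# There is no equivalent knob for a ReplicationController today; something
# like the following is what this issue asks for (hypothetical, does not exist):
# kubectl rollout pause rc/foo-1
```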
I'm not sure this is a use case that we intended to support, although I understand the desire to do it in some cases. Can you describe some of the details about the underlying use case that matter for you?
This relates to my thread here: openshift/origin#11954. The setup is as follows using a blue/green deployment strategy with an additional abstraction of stage/active. Before (dc = deployment config / deployment):
- Stage Y
- Promote Y (delete dc X)
At the "promote Y" stage the pods created by dc Y will still have the label of status:stage. We were investigating how we could update those labels on both the RC and the pre-existing pods without creating new pods or fighting with the RC autoscaler. This doesn't seem possible at the moment. One of the ideas we had around this was we could perform promotion on a pod level instead of carrying a dc through the stage -> active promotion.
- Stage Y
- Promote stg -> act in the following steps:
A "PodSwitch" strategy could be written as a custom deployment strategy (openshift concept but I think it has trickled back into kubernetes with the deployment object). It would have an input of a source dc / rc:
Another use case: Having pods in "standby mode". This can be helpful for quick scaling when pod startup is slow. Example:
Problems happen with step 2. The RC replica count needs to change at exactly the same time as the RC gains more pods. If the replica count changes too soon, the RC may try to spin up spurious pods. If it changes too late, the RC may try to spin down pods that were just added to it.
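To make the race concrete, a sketch of that second step with illustrative names (an rc web currently at 3 replicas, a pre-warmed pod standby-pod-1):

```sh
# Hand an already-running standby pod to the RC by giving it the RC's label...
kubectl label pod standby-pod-1 app=web --overwrite

# ...and grow the desired count to match. The two calls cannot land atomically:
kubectl scale rc web --replicas=4

# If the scale arrives late, the RC briefly sees 4 pods against replicas=3 and
# may delete a victim; if it arrives early, the RC creates a brand-new pod
# instead of adopting the standby one.
```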
Independent of the merits of pause on the RC, I did want to mention the
One could always delete the RCs and then re-create them. I think the most compelling case is scaling down and choosing a victim, which some other systems (e.g., Marathon) support. For that, a simple grace period before choosing a victim would suffice. We've discussed that previously in a number of issues.
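A sketch of that delete-and-recreate workaround; the orphaning flag and the saved manifest name are assumptions on my part, not something the comment spells out:

```sh
# Delete the RC without deleting its pods (the RC manifest is assumed to be
# kept in rc.yaml):
kubectl delete rc foo --cascade=false    # --cascade=orphan on newer kubectl

# Reshuffle / relabel / delete pods freely here; nothing is reconciling them.

# Re-create the RC with the corrected replica count; it adopts the pods that
# still match its selector.
kubectl apply -f rc.yaml
```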
Deleting the RCs could leave you in a bad state should you come across a failure. I had considered it, but it also means that if something died along the way I could be left without an RC, which could make a mess.
I am interested in what cases you have for this kind of setup. We have a lot of issues with overlapping controllers and I am not sure we want to allow ways of getting this any more complex than it already is. Managing labels in general is hard for average users. Wouldn't #36897 be helpful for you?
@Kargakis yes #36897 would be helpful. The case I was thinking of was a rolling strategy that reused old pods when the content of the pods did not change. Hence, instead of new pods and old pods being spun up / down one-by-one, they would be transferred over one-by-one and then any additional scaling that is needed would be performed. At present my desire for this is to try to work around not being able to atomically change RC selectors and pod labels at the same time (openshift/origin#11954, #36897).
/remove-lifecycle stale
@MarkRx: Reopened this issue.
@fejta-bot: Closing this issue.
At present there isn't any way to temporarily disable a replication controller. As a result, attempting to manage pod counts outside of a replication controller can be infeasible. Adding the ability to enable/disable a replication controller would pause / unpause its reconciliation of the replica count. While paused, pods could be manually created / deleted / moved around without interference from the replication controller.
Examples:
In all of the above there is, at present, a contention between the replication controller trying to maintain a set number of pods and the user performing manual work. If a pod is added to a replication controller, the controller will see that there are too many pods and scale one down. If a pod is removed from a replication controller, the controller will see that there are not enough pods and create one. In both cases the replica count could be modified to reflect the true desired number of pods, but since removing a pod and changing the count cannot be done atomically there is a catch-22.
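As an illustration of that catch-22, with hypothetical names (an rc web with replicas=3 and a pod web-abcde carrying the selector label app=web):

```sh
# "Remove" a pod from the RC by stripping the selector label...
kubectl label pod web-abcde app-

# ...then correct the desired count. In the window between these two calls the
# RC sees only 2 matching pods against replicas=3 and creates a replacement,
# so the count can never be fixed up cleanly.
kubectl scale rc web --replicas=2
```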
With the ability to temporarily disable replication controllers, pods could be reshuffled while the controllers are disabled, and the controllers could be updated to reflect the new state before being re-enabled.
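One possible shape for such a switch, purely hypothetical (no paused field exists on ReplicationController today):

```sh
# Disable reconciliation (hypothetical field):
kubectl patch rc foo -p '{"spec":{"paused":true}}'

# ...manually create / delete / relabel pods...

# Fix the desired count to match reality and re-enable in a single update:
kubectl patch rc foo -p '{"spec":{"replicas":5,"paused":false}}'
```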