
In-place rolling updates #9043

Closed

davidopp opened this issue May 31, 2015 · 59 comments · Fixed by #102884
Labels: area/app-lifecycle, lifecycle/frozen, priority/backlog, sig/apps

Comments

@davidopp
Member

Our current rolling update scheme requires deleting a pod and creating a replacement. We should also support rolling "in-place" updates of both containers and pods. There are two kinds of in-place update:

  1. [only applies to container update] pod stays in place, but container doesn't - for example, change container image without killing pod
  2. [applies to pod or container update] pod stays in place, and container stays in place - for example, resource limit updates (IIRC we don't implement this yet, or at least don't implement it in-place)

The motivation is that there's no reason to kill the pod if you don't need to. The user may have local data stored in it that they don't want blown away by an update.
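As a minimal sketch of kind 1 (on recent Kubernetes versions; the pod, container, and image names below are placeholders): spec.containers[*].image is one of the few mutable fields of a running Pod, so patching it on a bare pod makes the kubelet restart only that container, while the pod object, its IP, and its emptyDir volumes survive.

```shell
# Hedged sketch of kind 1: patch only the image of a running (bare) pod.
# The kubelet restarts just that container in place; the Pod is not recreated.
kubectl patch pod my-pod --type='json' \
  -p='[{"op": "replace", "path": "/spec/containers/0/image", "value": "nginx:1.25"}]'

# Confirm the pod survived: restartCount increments, but the Pod UID and IP stay the same.
kubectl get pod my-pod -o jsonpath='{.status.containerStatuses[0].restartCount}'
```

(A Deployment or ReplicaSet, by contrast, still rolls by deleting and recreating pods, which is exactly the gap this issue describes.)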

@davidopp davidopp added priority/backlog Higher priority than priority/awaiting-more-evidence. team/master sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. labels May 31, 2015
@erictune erictune added this to the v1.0-post milestone Jun 1, 2015
@bgrant0607
Member

I agree we'll eventually want this, but...

There are very few in-place, non-disruptive updates that we can actually do right now. For instance, Docker doesn't support resource changes, so we'll need to work around that using cgroup_parent. Resource changes will also complicate admission control in kubelet.

Decoupling the lifetime of local storage from the pod is discussed in #7562 and #598; that would reduce the number of circumstances in which in-place updates are required.

@soundofjw

This would be wildly valuable - specifically for Kubernetes + Elasticsearch, where data nodes really shouldn't be losing data.

@bgrant0607
Member

Data needs a lifetime independent of pods. See, for example, #598 and #7562.

@bgrant0607 bgrant0607 removed this from the v1.0-post milestone Jul 24, 2015
@ghost ghost added team/control-plane and removed team/master labels Aug 20, 2015
@bgrant0607 bgrant0607 added team/ux and removed sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. labels Sep 2, 2015
@hurf
Contributor

hurf commented Sep 15, 2015

It seems #6099 is also talking about this.

@bgrant0607
Member

@hurf #6099 is not related to this. That issue is about nodes; this one is about pods.

@chengyli
Contributor

What's the difference between this proposal and "kubectl replace/patch"?

@alfred-huangjian
Contributor

@kubernetes/huawei

@davidopp
Member Author

davidopp commented Feb 5, 2016

https://blog.docker.com/2016/02/docker-1-10/
(Docker 1.10 announcement)

Live update container resource constraints: When setting limits on what resources containers can use (e.g. memory usage), you had to restart the container to change them. You can now update these resource constraints on the fly with the new docker update command.
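For reference, the command introduced in that release looks like the following (the container name is a placeholder); it rewrites the container's cgroup limits on the fly, without a restart:

```shell
# Hedged example of the `docker update` command from the Docker 1.10 announcement:
# adjust memory and CPU weight of a running container without restarting it.
docker update --memory 512m --memory-swap 1g --cpu-shares 512 my-container
```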

@davidopp
Member Author

An /inplaceupdate subresource on Pod might be a good way to express pod updates (resources as well as container image).

@dts

dts commented May 25, 2016

An important use case for me is mentioned in #13488: updating secrets. Specifically, if we have a load balancer referencing certificates in a Secret, we want to refresh those certificates in a rolling fashion across all pods in the deployment, etc.

@mqliang
Contributor

mqliang commented Jun 13, 2016

@davidopp May I ask what benefit we get from an in-place rolling update? Is it about scheduling? IIRC, if we update pod.spec, the kubelet will sync the pod. So what we really need is just to update the spec of a running pod; the kubelet will watch the change and do the remaining work. Why, then, do we need an /inplaceupdate subresource on Pod? Is it because /inplaceupdate would update the spec as well as the status, so we can wait for the kubelet to sync the new spec and report the latest status?

@xiaods

xiaods commented Feb 21, 2019

so any update on this feature request?

@vinaykul
Member

vinaykul commented Mar 6, 2019

so any update on this feature request?

@xiaods We have started a new merged design KEP and I'm working on it. It has taken me longer than I had initially hoped because some other tasks took priority. I'll update the new KEP document with my suggested flow-control mechanism from that discussion and push it out for review in a day or two.

@xiaods

xiaods commented Mar 11, 2019

@vinaykul Yes, there are very complicated concerns around how to implement in-place updates. I will look out for the review docs. Thanks a lot.

@michelgokan

michelgokan commented Sep 3, 2019

@vinaykul Is there any quick "hack" to resize a particular container without killing it? Something like disabling the scheduler for a short while, performing "docker update", doing whatever I want to do (in my case some benchmarks), and then turning the scheduler back on (I don't care if the scheduler kills it after I've finished my task). I need something like this for a specific resource-tuning project.

@vinaykul
Member

vinaykul commented Sep 3, 2019

@vinaykul Is there any quick "hack" to resize a particular container without killing it? Something like disabling the scheduler for a short while, performing "docker update", doing whatever I want to do (in my case some benchmarks), and then turning the scheduler back on (I don't care if the scheduler kills it after I've finished my task). I need something like this for a specific resource-tuning project.

We have an implementation for v1.11 based on our original design. You may have to port it to the version you're interested in, and it may need bug fixes - we are no longer maintaining it, as the design currently being considered for upstream is significantly different. Hope this helps.
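For anyone looking for the manual "hack" asked about above, a rough, untested sketch (all names are placeholders) is to keep the scheduler from placing new work on the node and then adjust the backing container's cgroups directly; note that the kubelet is unaware of the change and will restore the limits from the Pod spec whenever it recreates the container.

```shell
# Untested sketch of the manual resize hack discussed above; all names are placeholders.

# 1. Stop the scheduler from placing new pods on this node while experimenting.
kubectl cordon my-node

# 2. On the node, find the Docker container backing the pod's container
#    (dockershim names them k8s_<container>_<pod>_<namespace>_...).
docker ps --filter "name=k8s_my-container_my-pod" --format '{{.ID}}'

# 3. Bump its limits in place; the kubelet reverts this when it recreates the container.
docker update --cpu-shares 2048 --memory 2g --memory-swap 2g <container-id>

# 4. Run the benchmarks, then allow scheduling again.
kubectl uncordon my-node
```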

@michelgokan

michelgokan commented Sep 4, 2019

@vinaykul Very nice. Just two more questions: does this in-place resizing preserve the QoS classes (Guaranteed, Burstable, BestEffort)? Also, is it compatible with the static CPU manager policy?

@vinaykul
Member

vinaykul commented Sep 4, 2019

@vinaykul Very nice. Just two more questions: does this in-place resizing preserve the QoS classes (Guaranteed, Burstable, BestEffort)? Also, is it compatible with the static CPU manager policy?

Resizing applies to the Guaranteed and Burstable QoS classes, and changing the QoS class of a running Pod is not allowed. That code does not check the CPU manager policy, so it would incorrectly update non-integral requests on nodes with the static policy.

@krmayankk

@bgrant0607 @davidopp @kow3ns is this still in the plans?

@vinaykul
Member

vinaykul commented Jan 13, 2020

@bgrant0607 @davidopp @kow3ns is this still in the plans?

Yes. @thockin reviewed and approved the KEP. I'm working on resolving a couple of changes requested in the KubeCon API review session, and on fleshing out the test plan and GA criteria sections, in order to get the KEP into the implementable stage before the Jan 28 deadline for the 1.18 release.

@andyxning
Member

@vinaykul What is the status of the in-place Pod resource change? We have requirements similar to yours. Hope you can give us some info.

@vinaykul
Member

@vinaykul What is the status of the in-place Pod resource change? We have requirements similar to yours. Hope you can give us some info.

We are revisiting the API changes that we previously thought were good. For details, please see kubernetes/enhancements#1883

I hope to find some time in the coming weeks to refactor PR #92127 as per the above discussion and check the implementation for robustness (I'm about halfway done, but I have some higher-priority work).

@yarncraft

OpenKruise (https://openkruise.io/en-us/index.html) offers a well-documented solution through the use of CustomResourceDefinitions. It would be valuable to include these controllers in vanilla Kubernetes, since they seem to resolve the challenges mentioned above.

@liupeng0518
Member

liupeng0518 commented Jan 5, 2021

  1. [only applies to container update] pod stays in place, but container doesn't - for example, change container image without killing pod

@vinaykul Hi,
Is there any quick "hack" to replace the image without killing the pod?

@yangjunsss

OpenKruise (https://openkruise.io/en-us/index.html) offers a well-documented solution through the use of CustomResourceDefinitions. It would be valuable to include these controllers in vanilla Kubernetes, since they seem to resolve the challenges mentioned above.

It seems Kruise solves updating the container in-place, but not the container's requests & limits.

@FillZpp

FillZpp commented Dec 24, 2021

It seems Kruise solves updating the container in-place, but not the container's requests & limits.

Yeah, OpenKruise v1.0 now supports updating images and env/command/args (via the Downward API) in-place (https://openkruise.io/docs/core-concepts/inplace-update), but it cannot modify requests & limits, which would break the logic of the scheduler and kubelet.
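For concreteness, a hedged sketch of what that in-place update looks like with an OpenKruise CloneSet (names and images are placeholders): with updateStrategy.type set to InPlaceIfPossible, changing only the image field restarts the container inside the existing pods instead of recreating the pods.

```yaml
# Hedged sketch of an OpenKruise CloneSet configured for in-place updates.
apiVersion: apps.kruise.io/v1alpha1
kind: CloneSet
metadata:
  name: sample
spec:
  replicas: 3
  selector:
    matchLabels:
      app: sample
  updateStrategy:
    type: InPlaceIfPossible   # falls back to recreate when a change cannot be applied in place
  template:
    metadata:
      labels:
        app: sample
    spec:
      containers:
      - name: app
        image: nginx:1.25     # changing only this field is applied in place
```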

@jonyhy96
Contributor

FYI: PR #102884 tries to implement in-place Pod vertical scaling, which can change a pod's requests & limits.
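As a hedged sketch of the API that PR #102884 (KEP-1287, behind the InPlacePodVerticalScaling feature gate) introduces, each container gets a per-resource resize policy telling the kubelet whether a resize requires a container restart; names and values below are placeholders.

```yaml
# Hedged sketch of the resizePolicy field added by the in-place vertical scaling work.
apiVersion: v1
kind: Pod
metadata:
  name: resizable
spec:
  containers:
  - name: app
    image: nginx:1.25
    resizePolicy:
    - resourceName: cpu
      restartPolicy: NotRequired       # CPU can be resized without restarting the container
    - resourceName: memory
      restartPolicy: RestartContainer  # memory changes restart only this container
    resources:
      requests:
        cpu: 500m
        memory: 256Mi
      limits:
        cpu: "1"
        memory: 256Mi
```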

@vinaykul
Member

/assign vinaykul

@surarchita-matam

Is there any hack we can use to read updated ConfigMap values without restarting a pod?

@shinebayar-g

Is there any hack we can use to read updated ConfigMap values without restarting a pod?

Doesn't a ConfigMap volume already do that (if you're not using subPath)?
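For context, a hedged sketch of the distinction (all names are placeholders): a ConfigMap mounted as a whole volume is refreshed by the kubelet within its sync period, so the container sees the new files without a restart; environment variables sourced from a ConfigMap and subPath mounts are not refreshed until the container restarts.

```yaml
# Hedged sketch: whole-volume ConfigMap mount, which does pick up updates in place.
apiVersion: v1
kind: Pod
metadata:
  name: cm-reader
spec:
  containers:
  - name: app
    image: busybox:1.36
    command: ["sh", "-c", "while true; do cat /etc/app/config.yaml; sleep 30; done"]
    volumeMounts:
    - name: config
      mountPath: /etc/app   # no subPath: updated keys appear here after the kubelet syncs
  volumes:
  - name: config
    configMap:
      name: app-config      # editing this ConfigMap propagates without a pod restart
```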

@surarchita-matam

surarchita-matam commented Feb 22, 2023

Doesn't a ConfigMap volume already do that (if you're not using subPath)?

No, it's not working.
