Rolling restart of pods #13488
Comments
What's the use case for restarting the pods without any changes to the spec? Note that there won't be any way to roll back the change if pods start failing when they are restarted.
Whenever services get into some wedged or undesirable state (maxed out connections and now stalled, bad internal state, etc.). It's usually one of the first troubleshooting steps if a service is seriously misbehaving. If the first pod fails as it is restarted, I would expect the rollout to either stop or keep retrying to start the pod.
Also, a rolling restart with no spec change reallocates pods across the cluster. However, I would also like the ability to do this without rescheduling the pods.

Clayton Coleman | Lead Engineer, OpenShift
@smarterclayton Is that like my option 2 listed above? Though why would labels be changed?

I suppose this would be more for a situation where the pod is alive and responding to checks but still needs to be restarted. One example is a service with an in-memory cache or internal state that gets corrupted and needs to be cleared. I feel like asking for an application to be restarted is a fairly common use case, but maybe I'm incorrect.

Corruption would just be one pod, which could just be killed and replaced by the RC. The other case mentioned offline was to re-read configuration. That's dangerous to do implicitly, because restarts for any reason would cause containers to load the new configuration. It would be better to do a rolling update to push a new versioned config reference (e.g. in an env var) to the pods. This is similar to what motivated #1353.
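That versioned-config-reference idea could look something like this sketch (it uses a modern Deployment pod template for illustration; CONFIG_VERSION and all the names are made up, not from this thread):

```yaml
# Sketch: a versioned config reference in the pod template.
# Bumping CONFIG_VERSION changes the template, so a rolling
# update replaces the pods and they re-read the new config.
# All names here are illustrative.
spec:
  template:
    spec:
      containers:
      - name: my-app
        image: example.com/my-app:1.4.2
        env:
        - name: CONFIG_VERSION
          value: "v42"
```

Because the config version lives in the spec, rolling back the Deployment also rolls back to the old config reference.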
@bgrant0607 have we decided that we don't want to do this?

@gmarek Nothing, for now. Too many things are underway already.
Can we have a
I would be a fan of this feature as well; you don't want to be forced to switch tags for every minor update you want to roll out.
I'm a fan of this feature. Use case: easily upgrade all the pods to use a newly-pushed docker image (with
Another use case: updating secrets.
I'd really like to see this feature. We run node apps on kubernetes and currently have certain use cases where we restart pods to clear in app pseudo caching. Here's what I'm doing for now:
This deletes pods 10 at a time and works well in a replication controller setup. It does not address concerns like pod allocation or new pods failing to start. It's a quick solution when needed.
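A batch-delete script of that kind might be sketched like this (assuming a label selector identifies the pods; the function name, selector, batch size, and pause are all illustrative):

```shell
# Sketch: delete pods matching a selector in batches, letting the
# RC/Deployment recreate them. Names and sizes are illustrative.
rolling_restart() {
  selector="$1"; batch="${2:-10}"; pause="${3:-30}"
  count=0
  for pod in $(kubectl get pods -l "$selector" -o name); do
    kubectl delete "$pod"
    count=$((count + 1))
    # Pause after each batch so replacements have time to become ready.
    if [ $((count % batch)) -eq 0 ]; then
      sleep "$pause"
    fi
  done
}
# Usage (requires a cluster): rolling_restart app=myservice 10
```

As noted above, this gives no rollback and no readiness gating between batches, so it is a blunt instrument compared to a real rolling restart.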
I would really like to be able to do a rolling restart.
Yes, there are a lot of cases when you really want to restart a pod/container without changes inside...
Small work around (I use deployments and I want to change configs without having real changes in image/pod):
k8s will see that the definition of the deployment has been changed and will start the process of replacing the pods.
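One common shape for that workaround is patching a throwaway timestamp into the pod template; any template change triggers a rolling replacement. This is a sketch (the deployment name is a placeholder, and the annotation key mirrors the one newer kubectl versions write for rollout restart):

```shell
# Build a patch that only touches a pod-template annotation; changing
# the template is what triggers the rolling replacement of pods.
patch="$(printf '{"spec":{"template":{"metadata":{"annotations":{"kubectl.kubernetes.io/restartedAt":"%s"}}}}}' \
  "$(date -u +%Y-%m-%dT%H:%M:%SZ)")"
echo "$patch"
# Requires a cluster; "my-app" is a placeholder deployment name:
# kubectl patch deployment my-app -p "$patch"
```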
Thank you @paunin |
@paunin That's exactly the case where we need it currently: we have to change ConfigMap values that are very important to the services and need to be rolled out to the containers within minutes up to some hours. If no deployment happens in the meantime, the containers will all fail at the same time and we will have partial downtime of at least some seconds.
Our GKE cluster on "rapid" release channel has upgraded itself to Kubernetes 1.16 and now
@nikhiljindal asked a while ago about the use case for updating the deployments without any changes to the specs. Maybe we're doing it in a non-optimal way, but here it is: our pre-trained ML models are loaded into memory from Google Cloud Storage. When model files get updated on GCS, we want to rollout restart our K8S deployment, which pulls the models from GCS. I appreciate we aren't able to roll back the deployment with previous model files easily, but that's the trade-off we adopted to bring models as close as possible to the app and avoid a network call (as some might suggest).
hey @dimileeh Do you happen to know what version of kubectl you're using now, and what version you used before? I'd love to know if there was a regression, but at the same time I'd be surprised if the feature had entirely disappeared. With regard to the GCS thing, knowing very little about your use case (so sorry if it makes no sense): I would suggest that the GCS models get a different name every time they are modified (maybe suffixed with their hash), and that the name be included in the deployment. Updating the deployment to use the new files would automatically trigger a rollout. This gives you the ability to roll back to a previous deployment/model, have a better understanding of the changes happening to the models, etc.
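The hash-suffix naming suggested above might be sketched like this at model-publish time (the function, file names, and extension are illustrative, not from this thread):

```shell
# Sketch: derive a content-addressed object name for a model file,
# e.g. "model-2cf24dba.pkl". Uploading under this name and writing it
# into the Deployment spec makes every model change a spec change,
# which is itself the rollout (and rollback) trigger.
versioned_name() {
  file="$1"; base="${2:-model}"
  echo "${base}-$(sha256sum "$file" | cut -c1-8).pkl"
}
```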
hi @apelisse, thank you for your response! When I run
When I tried to upgrade kubectl via

The Kubernetes documentation for 1.17, 1.16 and 1.15 has nothing about

Thank you for your suggestion on model versioning, it makes perfect sense. We thought about that, but since we retrain our models every day, we thought we'd start accumulating too many models (and they are quite heavy). Of course, we could use a script to clean up old versions after some time, but so far we've decided to keep it simple, relying on
I can see the docs here: |
Ah, thank you, I was looking here:
https://v1-16.docs.kubernetes.io/docs/reference/kubectl/cheatsheet/

Thank you very much for that link, I'll make sure it gets updated!
@dimileeh PTAL kubernetes/website#18224 (I'll cherry-pick in relevant branches once this gets merged). |
@dimileeh I think I figured out what's wrong with your kubectl version, we'll be working on it. |
Yes, we also have a use case of restarting pods without a code change, after updating the ConfigMap. This is to update an ML model without redeploying the service.
@anuragtr With the latest versions you can run kubectl rollout restart deployment <name>.
I was using a custom command for that [1], glad it is now in standard kubectl! Thanks
As I understand it, if I have a newer docker image tagged :latest and a deployment using the image tagged :latest, is there a way to ask Kubernetes to check whether the image has really changed and restart pods ONLY if the image differs from the one used in the pods? I am migrating services that are managed by docker-compose, and currently I run
@shoce I believe you may misunderstand how Kubernetes and this :latest tag concept work. Simply put, if you use a dynamic tag (like :latest) that can point to multiple different images at any given time, you can never guarantee you are always using the same version of that tag (aka the same image). Kubernetes doesn't do a "lookup", like you seem to assume, to check what sha sum the current deployment needs to pull from your container registry. This is all true whether or not you have an image pull policy set.

If an example helps: let's say you have 3 nodes and a deployment of your "widgets" service with 3 replicas specified, and your image pull policy is Always. Let's say you trigger an update to your service (although I don't know how, since your image tag didn't change; so let's say you do something silly, which I've seen before, like setting the current date into an annotation). The second this triggers on the first node, it will try to bring up a new pod with your latest :latest, but before that gets healthy, let's say your CI system or a dev pushes a new :latest. Then your first pod gets healthy and a new pod on the second node tries to come up. This one would now be using the newer :latest.

TL;DR: Do not use the :latest tag for anything, ever, for the most part. It's really bad practice. All registries I know of have a feature you can enable which disallows pushing over an existing tag, exactly for this reason. It's bad. There are simple use cases for :latest (e.g. internal CI images and tooling, and it can be useful in Dockerfiles), but you should understand when to use it and when not to. Deploying something into Kubernetes with a :latest tag is generally viewed as "doing something wrong/funny" in my experience.
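One way to get latest-like convenience without the ambiguity described above is to pin images by digest rather than by tag; this fragment is a sketch (the repository name and digest are placeholders, not real values):

```yaml
# Sketch: pinning by digest. The pod always runs exactly this image,
# and changing the digest in the spec is itself the rollout trigger.
# Repository and digest below are placeholders.
containers:
- name: widgets
  image: registry.example.com/widgets@sha256:0000000000000000000000000000000000000000000000000000000000000000
```

A CI pipeline would typically resolve the tag it just pushed to its digest and write that digest into the deployment, so every deploy is both reproducible and rollback-able.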
@AndrewFarley thanks, your explanation helped me a lot, and the example is something I could not see before. Actually, what I do is use :develop and :master tagging of docker images built against the corresponding git branches, and deploy them to dev and staging environments as soon as possible. As my dev and staging environments are hosted on a single host, I did not worry much. After reading many discussions on :latest tags and Kubernetes, I finally agree to drop :latest tagging with Kubernetes even for dev and staging environments. It was simple to use with docker-compose but is not okay with Kubernetes. I have two related questions now and would appreciate any leads. It might seem off topic, but I believe these are the issues blocking people from dropping :latest tags.
@shoce You should google this problem for your registry provider (e.g. "Automatically delete old images on REGISTRY_PROVIDER" or perhaps "Delete untagged images on REGISTRY_PROVIDER"). Each registry tends to have its own API and/or tools to handle this. AWS, for example, has a built-in Lifecycle Policy which you can configure to automatically delete images without requiring much effort on your part. In the past, on "simpler" registries, I've been known to write simple scripts to query the old images and delete them. Good luck!
kubectl rolling-update is useful for incrementally deploying a new replication controller. But if you have an existing replication controller and want to do a rolling restart of all the pods that it manages, you are forced to do a no-op update to an RC with a new name and the same spec. It would be useful to be able to do a rolling restart without needing to change the RC or to provide the RC spec, so anyone with access to kubectl could easily initiate a restart without worrying about having the spec locally, making sure it's the same/up to date, etc. This could work in a few different ways:

1. kubectl rolling-restart that takes an RC name and incrementally deletes all the pods controlled by the RC and allows the RC to recreate them.
2. kubectl rolling-update with a flag that lets you specify an old RC only, and it follows the logic of either 1 or 2.
3. kubectl rolling-update with a flag that lets you specify an old RC only, and it auto-generates a new RC based on the old one and proceeds with normal rolling update logic.

All of the above options would need the MaxSurge and MaxUnavailable options recently introduced (see #11942), along with readiness checks along the way, to make sure that the restarting is done without taking down all the pods.
@nikhiljindal @kubernetes/kubectl