
Kubectl rolling update in 1.2 not behaving the same as 1.1 #23497

Closed
nyxcharon opened this issue Mar 25, 2016 · 22 comments
Labels
area/kubectl priority/backlog

@nyxcharon

I have a Kubernetes cluster running version 1.2 and the latest version of kubectl installed. If I perform a rolling update on it the same way as I did with the 1.1 client, it fails to update. Is this intended behavior, i.e. do I need to invoke this update differently now, or is this a regression/bug?

Client Info:

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"2+", GitVersion:"v1.2.0-alpha.7", GitCommit:"c0fd002fbb25d6a6cd8427d28b8ec78379c354a0", GitTreeState:"clean"}
Server Version: version.Info{Major:"1", Minor:"2", GitVersion:"v1.2.0", GitCommit:"5cb86ee022267586db386f62781338b0483733b3", GitTreeState:"clean"}
$ kubectl-v1.1.1 version
Client Version: version.Info{Major:"1", Minor:"1", GitVersion:"v1.1.1", GitCommit:"92635e23dfafb2ddc828c8ac6c03c7a7205a84d8", GitTreeState:"clean"}
Server Version: version.Info{Major:"1", Minor:"2", GitVersion:"v1.2.0", GitCommit:"5cb86ee022267586db386f62781338b0483733b3", GitTreeState:"clean"}

Invoking rolling update with 1.2:

$ kubectl --server=http://myclusterurl.com:8080 rolling-update sensu-client --image=sstarcher/sensu:latest
error: Specified --image must be distinct from existing container image
See 'kubectl rolling-update -h' for help and examples.

Whereas in 1.1 it worked fine:

$ kubectl-v1.1.1 --server=http://myclusterurl.com:8080 rolling-update sensu-client --image=sstarcher/sensu:latest
Created sensu-client-59ea6198da5cb1fe4df053b249be7f78
Scaling up sensu-client-59ea6198da5cb1fe4df053b249be7f78 from 0 to 3, scaling down sensu-client from 3 to 0 (keep 3 pods available, don't exceed 4 pods)
Scaling sensu-client-59ea6198da5cb1fe4df053b249be7f78 up to 1
Scaling sensu-client down to 2
Scaling sensu-client-59ea6198da5cb1fe4df053b249be7f78 up to 2
Scaling sensu-client down to 1
Scaling sensu-client-59ea6198da5cb1fe4df053b249be7f78 up to 3
Scaling sensu-client down to 0
Update succeeded. Deleting old controller: sensu-client
Renaming sensu-client-59ea6198da5cb1fe4df053b249be7f78 to sensu-client
replicationcontroller "sensu-client" rolling updated
@nyxcharon changed the title from "Kubectl Rolling update in 1.2 not behaving the same as 1.1" to "Kubectl rolling update in 1.2 not behaving the same as 1.1" on Mar 25, 2016
@zhouhaibing089
Contributor

Checked against the v1.2 code: https://github.com/kubernetes/kubernetes/blob/master/pkg/kubectl/cmd/rollingupdate.go#L236-L267

if oldRc.Spec.Template.Spec.Containers[0].Image == image {
    return cmdutil.UsageError(cmd, "Specified --image must be distinct from existing container image")
}

While in v1.1 there is no such check: https://github.com/kubernetes/kubernetes/blob/release-1.1/pkg/kubectl/cmd/rollingupdate.go#L216-L240.

So basically, I would suggest that the image not be considered the same when its tag is latest.

@janetkuo
Member

@nikhiljindal

@pascal-hofmann

Will there be a bugfix release containing the fix for this?

Edit: As a workaround I am now using kubectl from the 1.1.8 release, which works fine.
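(In case it helps anyone else: the older client can be fetched straight from the release bucket, roughly as below; the URL pattern is assumed from the standard Kubernetes release layout, adjust OS/arch as needed.)

$ curl -LO https://storage.googleapis.com/kubernetes-release/release/v1.1.8/bin/linux/amd64/kubectl
$ chmod +x kubectl
$ ./kubectl version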

@zhouhaibing089
Contributor

The fix is merged into master, though not into release-1.2.

@bgrant0607 added the priority/backlog label on Mar 30, 2016
@0xmichalis
Contributor

Edit: As a workaround I am now using kubectl from the 1.1.8 release, which works fine.

See #15908 (comment) for a proposed workaround on 1.2.
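In short, it boils down to running rolling-update against a renamed copy of the rc spec instead of using --image. A rough sketch, reusing the names from the original report; sensu-client-edge, the app/deployment labels, and the replica count are placeholders, and the new rc is assumed to need a distinct name plus at least one changed selector value:

$ cat sensu-client-edge.yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: sensu-client-edge          # must not reuse the existing rc name
spec:
  replicas: 3
  selector:
    app: sensu-client
    deployment: edge               # at least one selector value must differ
  template:
    metadata:
      labels:
        app: sensu-client
        deployment: edge
    spec:
      containers:
      - name: sensu-client
        image: sstarcher/sensu:latest
        imagePullPolicy: Always    # re-pull even though the tag is unchanged
$ kubectl --server=http://myclusterurl.com:8080 rolling-update sensu-client -f sensu-client-edge.yaml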

@pascal-hofmann

Just tried it; the proposed workaround from #15908 (comment) does not work for me:

error: Must specify the controller to update
See 'kubectl rolling-update -h' for help and examples.

And if I specify a controller:

error: my_rc.yaml cannot have the same name as the existing ReplicationController my_controller_name
See 'kubectl rolling-update -h' for help and examples.

@sfaujour

sfaujour commented Apr 4, 2016

@phofmann-trust Same issue here; the workaround doesn't work for me either.

@sfaujour

sfaujour commented Apr 6, 2016

@phofmann-trust I replaced all :latest tags with a new one, e.g. 'edge'.
Now rolling-update no longer fails!
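In other words, something along these lines (a rough outline only; the image and rc names are taken from the example above, and 'edge' is just an arbitrary tag):

$ docker tag sstarcher/sensu:latest sstarcher/sensu:edge
$ docker push sstarcher/sensu:edge
$ kubectl rolling-update sensu-client --image=sstarcher/sensu:edge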

@bgrant0607
Member

There's no way to predictably cause Kubernetes to pull a ":latest" image exactly when you'd like. I recommend not using ":latest" for anything other than short-duration pods. See also #1697.

@srcspider

Workarounds are not the correct way to go about this; this illogical change needs to be reverted.

An image, when starting, will potentially configure itself, and there are plenty of cases where you would want to update to the SAME EXACT IMAGE to force everything to re-configure itself for updated settings or environment changes. It's also a good way to extremely quickly apply a temporary fix for some once-in-a-blue-moon server error (push the same image as a "reset"). Sometimes you might also temporarily push images as placeholders while you perform some complex operations; there's no point in preventing you from pushing a (let's say) "maintenance" image over another "maintenance" image just because it happens to be the same (why should you have to upload 100 versions of the same exact maintenance image?).

Even if we were talking about JPEG images (which is how this fix seems to treat Docker images), if for some reason the image gets corrupted on disk, why should I be prevented from copying a good copy over it? What's the harm here?

There should be tests to make sure this is always possible.

@zhouhaibing089
Contributor

@srcspider I basically agree with and understand your point, but consider the integrity of rolling update and rollback: rollback would be impossible if using the same tagged image were permitted. That's the only concern I am aware of.

Please correct me if my thinking is wrong.

@srcspider

@zhouhaibing089 I'm not sure I understand your scenario and problem here. You'll have to explain it to me. Rollback of what? What exactly is "impossible"? And did you mean truly impossible, or simply "harder"?

If we're talking about a normal application where you find out you did something stupid and then press a button to "turn back time," so to speak, then "rollbacks" (for everything I manage) are not allowed as an acceptable solution. They're an anti-pattern as far as I'm concerned. It's not possible to implement the powers of a time lord. It's much simpler, and saner, to implement "revert" strategies that move you forward, removing the undesired changes with knowledge of the consequences (and compromises).

I honestly don't see why kubernetes has to concern itself in any way with this. This pointless restriction, for example (with no opt-out), has only made it harder to implement properly. An image is just the "start" for a container. And a container that has been running for a decent period of time is potentially (by design or unintentionally) not equivalent to a container just starting. So if you think about it in terms of the containers and not the images (which you should), then using the same image makes perfect sense, since it produces a "previous" known-good state. In fact, it's one of the only states about which you can make reasonably safe guarantees as a state you can "roll back" to, since it is by design supposed to work regardless of when it starts, for the given "system state" (and in the context of everything else interacting with that system).

How exactly can kubernetes even model a problem such as "X state has these irreversible data changes that can be reverted only via this imperfect strategy, and Y state has these specific communication changes that will need careful coordination with independent parties"? Just reaching states X and Y is hard enough; going back to X from Y, if you initially went the other way, is far harder to do even manually, much less solve "magically."

To make a simple analogy...

The way I think of "rollbacks" (the anti-pattern) is as putting out a fire by blowing on it. It will work almost 100% of the time for candles and matches (small applications, no foreign dependencies, no data, no nothing). It might work for something a bit bigger, like a piece of paper that's on fire (large but relatively easy to grasp applications, no foreign dependencies, throwaway data, easily contained, ...easily excused). But for anything that's actually even remotely close to a serious fire trying to blow on it (or "beat it to death") will likely only make a small problem into a disaster (your bank lost your salary, that shop you ordered from lost your payment, your accounting software lost a week of records, your cloud backup service corrupted your backups, etc). If you want to put out a fire, with minimal damage, you invest into a specialized tool for the job and protocol, like... a fire extinguisher—hopefully one that doesn't have undesirable consequences in the context of where you expect to have a fire.

Going back to the issue at hand: the extinguisher is a real strategy that involves more than pressing a button and kubernetes blindly moving images around. It's of course fine if that's all it ends up being after following the right steps (just as it's fine to blow out a candle). But as mentioned previously, I find it unwise for the implementation of those steps to be kubernetes' concern.

As this issue illustrates, it's inconvenient for kubernetes to concern itself with this, since whichever way it does it, there will be bias. And as this issue shows, it's quite annoying for kubernetes to be biased for the sake of "features" that, to users, introduce limitations with no benefit; practically speaking, these "features" are indistinguishable from "bugs".

@zhouhaibing089
Contributor

zhouhaibing089 commented Apr 17, 2016

/cc @jlowdermilk

@srcspider thanks for this long explanation, it makes sense to me.

@nikhiljindal
Contributor

I agree with @jlowdermilk's comment in #15908 (comment).
Users running kubectl rolling-update will expect that new containers will come up with the new image (which might not happen). We should first ensure that that happens and then we can allow this.

@srcspider If I understand correctly, you just want to restart your pods with the same image? Is #13488 what you are looking for?
There is also #22368 which talks about how to rollout configuration changes to pods.

@zhouhaibing089
Contributor

@nikhiljindal I think what @srcspider wants is to use the same tagged image (the actual image may still be different).

A rolling-update of all rc-managed pods to the same image is not guaranteed to have the desired effect, because image caching on the node may result in the new pods using the old "version" of the image.

We could guarantee that by setting imagePullPolicy accordingly.
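For example (a sketch only, assuming the rc's container is named sensu-client), the policy can be set on the pod template so that the pods created by the next rolling-update always re-pull:

$ kubectl patch rc sensu-client -p '{"spec":{"template":{"spec":{"containers":[{"name":"sensu-client","imagePullPolicy":"Always"}]}}}}'

Existing pods are left untouched; the policy only applies to pods created afterwards.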

@srcspider

srcspider commented Apr 19, 2016

@zhouhaibing089

From reading the comment linked by @nikhiljindal, if I'm understanding correctly, is the restriction meant to prevent misuse of the special :latest tag? (For the record, I've never used that one.) Personally, I assign all images actual exact versions and rc tags. So if I'm deploying 1.2.3-rc7, it's pretty cut and dried. There is only ever one 1.2.3-rc7. The QA servers also just get 1.2.3-rc7. The production servers also get 1.2.3-rc7.

I take it the fix suggested in the comment is that the "solution" is to replace the entire rc? It seems a bit odd in that case that the use case is recognized but you have to do it in a completely roundabout, hackish way. It's like saying "we have this restriction, but here's an equivalent path where we totally forgot/ignored the restriction."

Your suggestion of being able to push the same image with the same name and have kubernetes handle the update transparently is still very much welcome. The opaqueness of what exactly kubernetes does with images on the nodes (garbage collection of images, what images each node knows about, how to force nodes to cache images pre-deploy/update) is a big pain point with kubernetes.

Probably the most annoying conversation to have whenever I talk about kubernetes (and also the most common one I have to have) is how the nodes need to be treated as imbeciles when it comes to handling the images. Anything that can make node image handling more human-accessible would be greatly appreciated.

off-topic: This is not related to the problem at hand, but since you mentioned it... There is indeed occasionally a need to create temporary namespaces for testing (services, pods, etc.), and it's a bit annoying not to be able to safely re-use names for those cases with a flag. Just to be clear, these are mainly for testing changes to kubernetes objects (new rcs, pods, services, etc.), though image changes are also required for obvious reasons. There's a lot of iteration involved, and the way I've had to mitigate that has been by tacking on various meta information, usually a random hash. This unfortunately results in the annoyance that there's a bunch of images with random garbage as names.

However, at least for me, re-use of images is a separate issue, since not reusing names for anything that has to go through the steps towards production is just simpler from a human-understanding point of view (if anyone says version 1.2.3-rc7, it's really troublesome if we then have to think about different versions of that). Even after cleaning images from the GCE registry there's something silly like 400 images; I can't imagine managing something like that with ambiguity like "latest" thrown in.

@nikhiljindal

@srcspider If i understand correctly, you just want to restart your pods with the same image? Is #13488 what you are looking for? There is also #22368 which talks about how to rollout configuration changes to pods.

Yes and no. The problem with having this "rolling-restart" way of doing it is that it simply adds pointless complexity. Now every time a rolling-update call happens, I have to somehow figure out whether it's the same image or a different image, for the sole purpose of deciding whether I call kubernetes rolling-update or kubernetes rolling-restart. (I also assume the rolling-restart one would require not providing the image flag.)

There should just be a flag, say --allow-restarts; that would make implementations of abstractions and automation scripts a lot simpler.

You could also have a flag that's just --restart, if the semantics of "only restart" are needed and you want a flag that lets you skip providing an image entirely. Personally I find that to be the niche case here, since it's much easier to provide the version I expect, and derive all images from that, than to figure out "is this the same version that's already there? do I need to update this, restart that, not update that?" for every little thing, especially when we get into intermediate images, multiple pods and so on.

Also, as mentioned previously, the image is just a means to an end; the container is the actual thing that matters. So even if the image is identical, kubernetes deciding on its own "hey, that's the same image, so I'm going to leave this container untouched" is not desirable behavior, because a container restart may result in a different container state than the container continuing to run as-is. Any update should result in a restart, and the opposite behavior should be what's behind a flag.

@j3ffml
Contributor

j3ffml commented Apr 19, 2016

@srcspider the primary reason for disallowing rolling-update with the same image tag is that its behavior is not reliable without imagePullPolicy:Always on the container spec. Kubelet caches images locally and has no way of knowing if the image it has is the same as the one requested in the case of re-using image tags (e.g. :latest). It sounds like you never reuse image tags, so I'm guessing you just want the rolling restart behavior?

To make sure we're on the same page, the behavior of rolling update is to delete and recreate pods one at a time. All containers in the pod get restarted, possibly on a different node (unless they are restricted to same-node), and kubelet will use the imagePullPolicy to determine whether it should reuse a locally cached image with the given name:tag or refetch from registry.

Seems there are 2 use cases for same image tag rolling-update:

  1. reusing an image name:tag for an actual update, e.g. :latest for dev, and
  2. rolling-restart with same (actually same) image

My original thought was to require imagePullPolicy:Always, but that only meets use case 1). So perhaps a better solution is to add a --image-pull-policy flag, and make it required when doing a rolling-update to the same image:tag. It's a bit redundant, since presumably you've already specified it in the rc spec, but it solves the ambiguous intent problem.

@bgrant0607 is this something that was thought about / handled for deployments?

@bgrant0607
Member

@jlowdermilk Deployment does not have explicit support for :latest. It is declarative, so a user would need to make another change to their pod template in order to cause a rollout.

We're going to advocate against use of :latest.
kubernetes/website#395

@bgrant0607
Member

Deployment issue is kubernetes/website#316

@srcspider

@jlowdermilk

It sounds like you never reuse image tags, so I'm guessing you just want the rolling restart behavior?

Correct. I expect rolling-update to handle everything exactly the same for all cases.

To make sure we're on the same page, the behavior of rolling update is to delete and recreate pods one at a time. All containers in the pod get restarted, possibly on a different node (unless they are restricted to same-node), and kubelet will use the imagePullPolicy to determine whether it should reuse a locally cached image with the given name:tag or refetch from registry.

This may be a bad habit from being an early adopter of kubernetes, but I assume the nodes will only ever pull the image once and every other time it's probably cached. For all intents and purposes I ignore the imagePullPolicy. I design abstractions to work regardless of what it does.

The reason I do this is that, even though I have production and (most) development nodes on separate clusters (final stages before production still have to be on the production cluster due to potential kubernetes cluster version bugs), it's unclear what happens when, say, one namespace/project/whatever requests that the cached version be refreshed while another one doesn't, and some time later some scaling happens.

To keep things reasonably understandable and avoid bugs from containers being "different" but the same and such, everything is unique, everything gets restarted, and the imagePullPolicy key is never used (I leave it at the kubernetes defaults; but maybe I shouldn't). I also ever so slightly derive a new image from the base. The only change is to the text in a file whose only purpose is to store the name of the intended use of the image. I do this to make debugging easy should kubernetes tangle the images/containers, as well as to make sure the images are 99.9999% identical but "unique" as far as kubernetes and docker are concerned. So any mess that might happen doesn't ever propagate to production.

There's no real overhead to doing it this way, from my experience, aside from maybe an extra layer that needs to be downloaded for the one inconsequential minor file change. This translates to every image having exactly one base plus one tag per stage of development (not much effect on size) and an extra pull and check here and there. So if the other version was cached, kubernetes (assuming it's trying to cache) shouldn't really have to do more than pull a tiny bit extra.

As of right now, the biggest performance overhead in pod deployment is still mainly kubernetes' inability to determine pod "readiness" via a customizable docker exec -it containerId /hey-container-did-you-finish-your-startup-scripts-yet command, instead of the time-interval method, which always results in the "estimate time required as if the apocalypse happened".

@bgrant0607
Member

@srcspider Thanks for the explanation. Please file a separate issue about readiness probe suggestions/improvements.

@srcspider

@bgrant0607 done #24712

k8s-github-robot pushed a commit that referenced this issue May 4, 2016
Automatic merge from submit-queue

kubectl rolling-update support for same image

Fixes #23497.

Enables `kubectl rolling-update --image` to the same image, adding a `--image-pull-policy` flag to remove ambiguity. This allows rolling-update to behave as an "update and/or restart" (#23497 (comment)), or as a forced update when the same tag can mean multiple versions (e.g. `:latest`). cc @janetkuo @nikhiljindal
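With that in place, the original invocation should work again once the pull policy is stated explicitly, along the lines of the following (a sketch based on the PR description, reusing the names from the original report):

$ kubectl rolling-update sensu-client --image=sstarcher/sensu:latest --image-pull-policy=Always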