
Kubectl rolling update in 1.2 not behaving the same as 1.1 #23497

Closed
nyxcharon opened this issue Mar 25, 2016 · 22 comments
Labels
area/kubectl priority/backlog

@nyxcharon

I have a Kubernetes cluster running version 1.2 and the latest version of kubectl installed. If I perform a rolling update on it the same way as I did with the 1.1 client, it fails to update. Is this intended behavior, i.e. do I need to invoke this update differently now, or is this a regression/bug?

Client Info:

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"2+", GitVersion:"v1.2.0-alpha.7", GitCommit:"c0fd002fbb25d6a6cd8427d28b8ec78379c354a0", GitTreeState:"clean"}
Server Version: version.Info{Major:"1", Minor:"2", GitVersion:"v1.2.0", GitCommit:"5cb86ee022267586db386f62781338b0483733b3", GitTreeState:"clean"}
$ kubectl-v1.1.1 version
Client Version: version.Info{Major:"1", Minor:"1", GitVersion:"v1.1.1", GitCommit:"92635e23dfafb2ddc828c8ac6c03c7a7205a84d8", GitTreeState:"clean"}
Server Version: version.Info{Major:"1", Minor:"2", GitVersion:"v1.2.0", GitCommit:"5cb86ee022267586db386f62781338b0483733b3", GitTreeState:"clean"}

Invoking rolling update with 1.2:

$ kubectl --server=http://myclusterurl.com:8080 rolling-update sensu-client --image=sstarcher/sensu:latest
error: Specified --image must be distinct from existing container image
See 'kubectl rolling-update -h' for help and examples.

Whereas in 1.1 it worked fine:

$ kubectl-v1.1.1 --server=http://myclusterurl.com:8080 rolling-update sensu-client --image=sstarcher/sensu:latest
Created sensu-client-59ea6198da5cb1fe4df053b249be7f78
Scaling up sensu-client-59ea6198da5cb1fe4df053b249be7f78 from 0 to 3, scaling down sensu-client from 3 to 0 (keep 3 pods available, don't exceed 4 pods)
Scaling sensu-client-59ea6198da5cb1fe4df053b249be7f78 up to 1
Scaling sensu-client down to 2
Scaling sensu-client-59ea6198da5cb1fe4df053b249be7f78 up to 2
Scaling sensu-client down to 1
Scaling sensu-client-59ea6198da5cb1fe4df053b249be7f78 up to 3
Scaling sensu-client down to 0
Update succeeded. Deleting old controller: sensu-client
Renaming sensu-client-59ea6198da5cb1fe4df053b249be7f78 to sensu-client
replicationcontroller "sensu-client" rolling updated
@nyxcharon changed the title from "Kubectl Rolling update in 1.2 not behaving the same as 1.1" to "Kubectl rolling update in 1.2 not behaving the same as 1.1" on Mar 25, 2016
@zhouhaibing089
Contributor

Checked against the v1.2 code: https://github.com/kubernetes/kubernetes/blob/master/pkg/kubectl/cmd/rollingupdate.go#L236-L267

if oldRc.Spec.Template.Spec.Containers[0].Image == image {
    return cmdutil.UsageError(cmd, "Specified --image must be distinct from existing container image")
}

While in v1.1 there is no such check: https://github.com/kubernetes/kubernetes/blob/release-1.1/pkg/kubectl/cmd/rollingupdate.go#L216-L240.

So basically, I would suggest that the image not be considered the same when its tag is latest.

@janetkuo
Member

@nikhiljindal

@pascal-hofmann

Will there be a bugfix release containing the fix for this?

Edit: As a workaround I am now using kubectl from the 1.1.8 release, which works fine.
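(In case it helps anyone else: the older client can be fetched straight from the release bucket, roughly as below; the URL pattern is assumed from the standard Kubernetes release layout, adjust OS/arch as needed.)

$ curl -LO https://storage.googleapis.com/kubernetes-release/release/v1.1.8/bin/linux/amd64/kubectl
$ chmod +x kubectl
$ ./kubectl version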

@zhouhaibing089
Contributor

The fix is merged into master, though not into release-1.2.

@bgrant0607 added the priority/backlog label on Mar 30, 2016
@0xmichalis
Contributor

Edit: As a workaround I am now using kubectl from the 1.1.8 release, which works fine.

See #15908 (comment) for a proposed workaround on 1.2.
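In short, it boils down to running rolling-update against a renamed copy of the rc spec instead of using --image. A rough sketch, reusing the names from the original report; sensu-client-edge, the app/deployment labels, and the replica count are placeholders, and the new rc is assumed to need a distinct name plus at least one changed selector value:

$ cat sensu-client-edge.yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: sensu-client-edge          # must not reuse the existing rc name
spec:
  replicas: 3
  selector:
    app: sensu-client
    deployment: edge               # at least one selector value must differ
  template:
    metadata:
      labels:
        app: sensu-client
        deployment: edge
    spec:
      containers:
      - name: sensu-client
        image: sstarcher/sensu:latest
        imagePullPolicy: Always    # re-pull even though the tag is unchanged
$ kubectl --server=http://myclusterurl.com:8080 rolling-update sensu-client -f sensu-client-edge.yaml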

@pascal-hofmann

Just tried it; the proposed workaround from #15908 (comment) does not work for me:

error: Must specify the controller to update
See 'kubectl rolling-update -h' for help and examples.

And if I specify a controller:

error: my_rc.yaml cannot have the same name as the existing ReplicationController my_controller_name
See 'kubectl rolling-update -h' for help and examples.

@sfaujour

sfaujour commented Apr 4, 2016

@phofmann-trust Same issue here; the workaround doesn't work for me either.

@sfaujour

sfaujour commented Apr 6, 2016

@phofmann-trust I replaced all :latest tags with a new one, e.g. 'edge'.
Now rolling-update no longer fails!
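In other words, something along these lines (a rough outline only; the image and rc names are taken from the example above, and 'edge' is just an arbitrary tag):

$ docker tag sstarcher/sensu:latest sstarcher/sensu:edge
$ docker push sstarcher/sensu:edge
$ kubectl rolling-update sensu-client --image=sstarcher/sensu:edge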

@bgrant0607
Member

There's no way to predictably cause Kubernetes to pull a ":latest" image exactly when you'd like. I recommend not using ":latest" for anything other than short-duration pods. See also #1697.

@srcspider

Workarounds are not the correct way to go about this; this illogical change needs to be reverted.

An image, when starting, will potentially configure itself, and there are plenty of cases where you would want to update to the SAME EXACT IMAGE to force everything to re-configure itself for updated settings or environment changes. It's also a good way to extremely quickly apply a temporary fix for some once-in-a-blue-moon server error (push the same image as a "reset"). Sometimes you might also temporarily push images as placeholders while you perform some complex operations; there's no point in preventing you from pushing a (let's say) "maintenance" image over another "maintenance" image just because it happens to be the same (why should you have to upload 100 versions of the same exact maintenance image?).

Even if we were talking about JPEG images (which is how this fix seems to treat Docker images), if for some reason the image gets corrupted on disk, why should I be prevented from copying a good copy over it? What's the harm here?

There should be tests to make sure this is always possible.

@zhouhaibing089
Contributor

@srcspider I basically agree with and understand your point, but consider the integrity of rolling update and rollback: rollback would be impossible if using the same tagged image were permitted. That's the only concern I am aware of.

Please correct me if my thinking is wrong.

@srcspider

@zhouhaibing089 I'm not sure I understand your scenario and problem here. You'll have to explain it to me. Rollback of what? What exactly is "impossible"? And did you mean truly impossible, or simply "harder"?

If we're talking about a normal application where you find out you did something stupid and then press a button to "turn back time," so to speak, then "rollbacks" (for everything I manage) are not allowed as an acceptable solution. They're an anti-pattern as far as I'm concerned. It's not possible to implement the powers of a time lord. It's much simpler, and saner, to implement "revert" strategies that move you forward, removing the undesired changes with knowledge of the consequences (and compromises).

I honestly don't see why kubernetes has to concern itself in any way with this. This pointless restriction, for example (with no opt-out), has only made it harder to implement properly. An image is just the "start" for a container. And a container that has been running for a decent period of time is potentially (by design or unintentionally) not equivalent to a container just starting. So if you think about it in terms of the containers and not the images (which you should), then using the same image makes perfect sense, since it produces a "previous" known-good state. In fact, it's one of the only states about which you can make reasonably safe guarantees as a state you can "roll back" to, since it is by design supposed to work regardless of when it starts, for the given "system state" (and in the context of everything else interacting with that system).

How exactly can kubernetes even model a problem such as "X state has these irreversible data changes that can be reverted only via this imperfect strategy, and Y state has these specific communication changes that will need careful coordination with independent parties"? Just reaching states X and Y is hard enough; going back to X from Y, if you initially went the other way, is far harder to do even manually, much less solve "magically."

To make a simple analogy...

The way I think of "rollbacks" (the anti-pattern) is as putting out a fire by blowing on it. It will work almost 100% of the time for candles and matches (small applications, no foreign dependencies, no data, no nothing). It might work for something a bit bigger, like a piece of paper that's on fire (large but relatively easy to grasp applications, no foreign dependencies, throwaway data, easily contained, ...easily excused). But for anything that's actually even remotely close to a serious fire trying to blow on it (or "beat it to death") will likely only make a small problem into a disaster (your bank lost your salary, that shop you ordered from lost your payment, your accounting software lost a week of records, your cloud backup service corrupted your backups, etc). If you want to put out a fire, with minimal damage, you invest into a specialized tool for the job and protocol, like... a fire extinguisher—hopefully one that doesn't have undesirable consequences in the context of where you expect to have a fire.

Going back to the issue at hand: the extinguisher is a real strategy that involves more than pressing a button and kubernetes blindly moving images around. It's of course fine if that's all it ends up being after following the right steps (just as it's fine to blow out a candle). But as mentioned previously, I find it unwise for the implementation of those steps to be kubernetes' concern.

As this issue illustrates, it's inconvenient for kubernetes to concern itself with this, since whichever way it does it, there will be bias. And as this issue shows, it's quite annoying for kubernetes to be biased for the sake of "features" that, to users, introduce limitations with no benefit; practically speaking, these "features" are indistinguishable from "bugs".

@zhouhaibing089
Contributor

zhouhaibing089 commented Apr 17, 2016

/cc @jlowdermilk

@srcspider thanks for this long explanation, it makes sense to me.

@nikhiljindal
Contributor

I agree with @jlowdermilk's comment in #15908 (comment).
Users running kubectl rolling-update will expect that new containers will come up with the new image (which might not happen). We should first ensure that that happens and then we can allow this.

@srcspider If I understand correctly, you just want to restart your pods with the same image? Is #13488 what you are looking for?
There is also #22368 which talks about how to rollout configuration changes to pods.

@zhouhaibing089
Contributor

@nikhiljindal I think what @srcspider wants is to use the same tagged image (the actual image may still be different).

A rolling-update of all rc-managed pods to the same image is not guaranteed to have the desired effect, because image caching on the node may result in the new pods using the old "version" of the image.

We could guarantee that by setting imagePullPolicy accordingly.
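For example (a sketch only, assuming the rc's container is named sensu-client), the policy can be set on the pod template so that the pods created by the next rolling-update always re-pull:

$ kubectl patch rc sensu-client -p '{"spec":{"template":{"spec":{"containers":[{"name":"sensu-client","imagePullPolicy":"Always"}]}}}}'

Existing pods are left untouched; the policy only applies to pods created afterwards.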

@srcspider

srcspider commented Apr 19, 2016

@zhouhaibing089

From reading the comment linked by @nikhiljindal, if I'm understanding correctly, is the restriction meant to prevent misuse of the special :latest tag? (For the record, I've never used that one.) Personally, I assign all images actual exact versions and rc tags. So if I'm deploying 1.2.3-rc7, it's pretty cut and dried. There is only ever one 1.2.3-rc7. The QA servers also just get 1.2.3-rc7. The production servers also get 1.2.3-rc7.

I take it the fix suggested in the comment is that the "solution" is to replace the entire rc? It seems a bit odd in that case that the use case is recognized but you have to do it in a completely roundabout, hackish way. It's like saying "we have this restriction, but here's an equivalent path where we totally forgot/ignored the restriction."

Your suggestion of being able to push the same image with the same name and have kubernetes handle the update transparently is still very much welcome. The opaqueness of what exactly kubernetes does with images on the nodes (garbage collection of images, what images each node knows about, how to force nodes to cache images pre-deploy/update) is a big pain point with kubernetes.

Probably the most annoying conversation to have whenever I talk about kubernetes (and also the most common one I have to have) is how the nodes need to be treated as imbeciles when it comes to handling the images. Anything that can make node image handling more human-accessible would be greatly appreciated.

off-topic: This is not related to the problem at hand, but since you mentioned it... There is indeed occasionally a need to create temporary namespaces for testing (services, pods, etc.), and it's a bit annoying not to be able to safely re-use names for those cases with a flag. Just to be clear, these are mainly for testing changes to kubernetes objects (new rcs, pods, services, etc.), though image changes are also required for obvious reasons. There's a lot of iteration involved, and the way I've had to mitigate that has been by tacking on various meta information, usually a random hash. This unfortunately results in the annoyance that there's a bunch of images with random garbage as names.

However, at least for me, re-use of images is a separate issue, since not reusing names for anything that has to go through the steps towards production is just simpler from a human-understanding point of view (if anyone says version 1.2.3-rc7, it's really troublesome if we then have to think about different versions of that). Even after cleaning images from the GCE registry there's something silly like 400 images; I can't imagine managing something like that with ambiguity like "latest" thrown in.

@nikhiljindal

@srcspider If i understand correctly, you just want to restart your pods with the same image? Is #13488 what you are looking for? There is also #22368 which talks about how to rollout configuration changes to pods.

Yes and no. The problem with having this "rolling-restart" way of doing it is that it simply adds pointless complexity. Now every time a rolling-update call happens, I have to somehow figure out whether it's the same image or a different image, for the sole purpose of deciding whether I call kubernetes rolling-update or kubernetes rolling-restart. (I also assume the rolling-restart one would require not providing the image flag.)

There should just be a flag, say --allow-restarts; that would make implementations of abstractions and automation scripts a lot simpler.

You could also have a flag that's just --restart, if the semantics of "only restart" are needed and you want a flag that lets you skip providing an image entirely. Personally I find that to be the niche case here, since it's much easier to provide the version I expect, and derive all images from that, than to figure out "is this the same version that's already there? do I need to update this, restart that, not update that?" for every little thing, especially when we get into intermediate images, multiple pods and so on.

Also, as mentioned previously, the image is just a means to an end; the container is the actual thing that matters. So even if the image is identical, kubernetes deciding on its own "hey, that's the same image, so I'm going to leave this container untouched" is not desirable behavior, because a container restart may result in a different container state than the container continuing to run as-is. Any update should result in a restart, and the opposite behavior should be what's behind a flag.

@j3ffml
Contributor

j3ffml commented Apr 19, 2016

@srcspider the primary reason for disallowing rolling-update with the same image tag is that its behavior is not reliable without imagePullPolicy:Always on the container spec. Kubelet caches images locally and has no way of knowing if the image it has is the same as the one requested in the case of re-using image tags (e.g. :latest). It sounds like you never reuse image tags, so I'm guessing you just want the rolling restart behavior?

To make sure we're on the same page, the behavior of rolling update is to delete and recreate pods one at a time. All containers in the pod get restarted, possibly on a different node (unless they are restricted to same-node), and kubelet will use the imagePullPolicy to determine whether it should reuse a locally cached image with the given name:tag or refetch from registry.

Seems there are 2 use cases for same image tag rolling-update:

  1. reusing an image name:tag for an actual update, e.g. :latest for dev, and
  2. rolling-restart with same (actually same) image

My original thought was to require imagePullPolicy:Always, but that only meets use case 1). So perhaps a better solution is to add a --image-pull-policy flag, and make it required when doing a rolling-update to the same image:tag. It's a bit redundant, since presumably you've already specified it in the rc spec, but it solves the ambiguous intent problem.

@bgrant0607 is this something that was thought about / handled for deployments?

@bgrant0607
Member

@jlowdermilk Deployment does not have explicit support for :latest. It is declarative, so a user would need to make another change to their pod template in order to cause a rollout.

We're going to advocate against use of :latest.
kubernetes/website#395

@bgrant0607
Member

Deployment issue is kubernetes/website#316

@srcspider

@jlowdermilk

It sounds like you never reuse image tags, so I'm guessing you just want the rolling restart behavior?

Correct. I expect rolling-update to handle everything exactly the same for all cases.

To make sure we're on the same page, the behavior of rolling update is to delete and recreate pods one at a time. All containers in the pod get restarted, possibly on a different node (unless they are restricted to same-node), and kubelet will use the imagePullPolicy to determine whether it should reuse a locally cached image with the given name:tag or refetch from registry.

This may be a bad habit from being an early adopter of kubernetes, but I assume the nodes will only ever pull the image once and every other time it's probably cached. For all intents and purposes I ignore the imagePullPolicy. I design abstractions to work regardless of what it does.

The reason I do this is that, even though I have production and (most) development nodes on separate clusters (final stages before production still have to be on the production cluster due to potential kubernetes cluster version bugs), it's unclear what happens when, say, one namespace/project/whatever requests that the cached version be refreshed while another one doesn't, and some time later some scaling happens.

To keep things reasonably understandable and avoid bugs from containers being "different" but the same and such, everything is unique, everything gets restarted, and the imagePullPolicy key is never used (I leave it at the kubernetes defaults; but maybe I shouldn't). I also ever so slightly derive a new image from the base. The only change is to the text in a file whose only purpose is to store the name of the intended use of the image. I do this to make debugging easy should kubernetes tangle the images/containers, as well as to make sure the images are 99.9999% identical but "unique" as far as kubernetes and docker are concerned. So any mess that might happen doesn't ever propagate to production.

There's no real overhead to doing it this way, from my experience, aside from maybe an extra layer that needs to be downloaded for the one inconsequential minor file change. This translates to every image having exactly one base plus one tag per stage of development (not much effect on size) and an extra pull and check here and there. So if the other version was cached, kubernetes (assuming it's trying to cache) shouldn't really have to do more than pull a tiny bit extra.

As of right now, the biggest performance overhead in pod deployment is still mainly kubernetes' inability to determine pod "readiness" via a customizable docker exec -it containerId /hey-container-did-you-finish-your-startup-scripts-yet command, instead of the time-interval method, which always results in the "estimate time required as if the apocalypse happened".

@bgrant0607
Member

@srcspider Thanks for the explanation. Please file a separate issue about readiness probe suggestions/improvements.

@srcspider

@bgrant0607 done #24712

k8s-github-robot pushed a commit that referenced this issue May 4, 2016
Automatic merge from submit-queue

kubectl rolling-update support for same image

Fixes #23497.

Enables `kubectl rolling-update --image` to the same image, adding a `--image-pull-policy` flag to remove ambiguity. This allows rolling-update to behave as an "update and/or restart" (#23497 (comment)), or as a forced update when the same tag can mean multiple versions (e.g. `:latest`). cc @janetkuo @nikhiljindal
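With that in place, the original invocation should work again once the pull policy is stated explicitly, along the lines of the following (a sketch based on the PR description, reusing the names from the original report):

$ kubectl rolling-update sensu-client --image=sstarcher/sensu:latest --image-pull-policy=Always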