Rolling-update by node #18450
No, there is no way to do that right now. I'm curious why you want to do a rolling update by node?
I can't run instances with different versions on the same node, because my processes use shared memory.
In general, we should do rolling updates by the failure domains pods are spread across. To clarify: these pods are communicating via shared memory? How? Why not put all the containers in the same pod? I don't see how this would work without hard affinity (#18265).
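(For illustration, a minimal sketch of the single-pod idea suggested above: two containers in one pod sharing memory through a tmpfs-backed emptyDir mounted at /dev/shm. All names and images here are made up.)

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shm-demo              # hypothetical name
spec:
  volumes:
  - name: shm
    emptyDir:
      medium: Memory          # tmpfs; shared by all containers in the pod
  containers:
  - name: worker-a            # hypothetical container/image
    image: example/worker:v1
    volumeMounts:
    - name: shm
      mountPath: /dev/shm
  - name: worker-b
    image: example/worker:v1
    volumeMounts:
    - name: shm
      mountPath: /dev/shm
```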
@nikhiljindal I just made an implementation of rolling update by node, using labels on nodes.
@titilambert Have you seen
@bgrant0607 Hello! |
Sorry, I just saw this issue. Would #9043 solve your problem?
Hello!
@titilambert If you use a hostPort in your pods, only one can be scheduled per node. We also have some anti-affinity features coming that may help:
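(As an illustration of the hostPort trick: two pods requesting the same hostPort can never land on the same node, so at most one replica schedules per node. A sketch, with made-up names and port:)

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: one-per-node          # hypothetical name
spec:
  containers:
  - name: app
    image: example/app:v1     # hypothetical image
    ports:
    - containerPort: 8080
      hostPort: 8080          # only one pod per node can bind this host port
```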
@bgrant0607 Thanks for your reply! BTW, I'm pretty sure the anti-affinity feature will help make this PR better (maybe getting this feature without using a nodeSelector?).
@titilambert I still don't understand why you want to stop all pods on a given node at the same time. However, this sounds like a fairly niche use case. Maybe there is something we could do to make this easier to implement outside of Kubernetes?
Additionally, as I mentioned in the PR, we're trying to reduce the amount of logic in kubectl (#12143).
Hi Brian, let me try to explain the use case here in more detail. We have a single-threaded service that requires a lot of RAM: a large data set is loaded into shared memory. Since the process is single-threaded, we run multiple instances of the service on each node, all attached to that shared memory. In k8s, we have managed to migrate the service into a single docker container per instance, but instances of different versions can't coexist on the same node, since they would attach to the same shared memory. So the only way right now to fix this (at the infra level) is to stop all instances on a node before starting the new version there; going one node at a time allows no downtime of the service. We do understand that this is not aligned with micro-services best practices. Hopefully this niche case is now clearer for you. Regards! Sylvain
In 1.3 we're adding "pod affinity", which lets you say "when you're trying to schedule a pod from [service, RC, RS, Job, whatever] X, only do so on a node that is already running at least one pod from [service, RC, RS, Job, whatever] Y." There is a variant of this (that we're not implementing in 1.3, but might later) that says "in addition, if the pod from Y stops running, then kill X." If you really only have two services, then this variant (that we're not implementing in 1.3) sounds like it would solve your problem. In particular:
I'm not saying this is the best way to address your problem, and of course it's hard to compare one nonexistent solution to other nonexistent solutions, but I thought I'd mention it, as this at least fits in with something we're building.
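(For concreteness, here is a sketch of what the pod-affinity rule described above looks like in the affinity API that eventually shipped; all labels, names, and images are hypothetical. A pod from X only schedules onto nodes already running a pod from Y:)

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: x-replica                    # hypothetical pod from "X"
  labels:
    app: service-x
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: service-y           # match pods from "Y"
        topologyKey: kubernetes.io/hostname   # co-locate on the same node
  containers:
  - name: app
    image: example/service-x:v1      # hypothetical image
```

Note that the "IgnoredDuringExecution" suffix is exactly the caveat mentioned above: the rule is only enforced at scheduling time, so the X pod is not killed if the Y pod later goes away.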
Hi Daniel. Thanks for the feedback. Regards
Hi @djsly. Thanks for the clarification. Now I understand: you want to "roll" one node at a time rather than one replica at a time, and you want to ensure that no updated replica starts on the node until all of the old replicas on the node have been killed. There's no automated way to do what you're asking, but here's an approach that might be good enough. Let's say rc1 is the ReplicationController that's managing the current replicas, rc1's PodSpec has a nodeSelector "version=1", and all the nodes in the cluster start out labeled "version=1". First, you create rc2, a ReplicationController that will manage the new version; it is identical to rc1 except that it uses the image name you're upgrading to and it has nodeSelector "version=2" instead of "version=1" (and its name is rc2 instead of rc1, of course). Then, one node at a time, you relabel the node from "version=1" to "version=2", delete the old pods on it (rc1 can't reschedule replacements there, since the node no longer matches its nodeSelector), and scale rc1 down and rc2 up by that node's share of replicas, so the new pods land on relabeled nodes.
Once you're done upgrading the nodes, you can delete rc1. I realize this isn't perfect, but I think it's the closest you can get without writing your own controller.
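(A rough shell sketch of that per-node loop. Node names, replica counts, and the "version=1" pod label are all assumptions for illustration:)

```bash
#!/usr/bin/env bash
# Sketch: roll one node at a time from rc1 (nodeSelector version=1)
# to rc2 (nodeSelector version=2). Assumes 3 nodes, 2 replicas per node.
set -e

kubectl create -f rc2.yaml   # new version, nodeSelector version=2, starts at 0 replicas

old=6; new=0
for node in node-1 node-2 node-3; do          # hypothetical node names
  # Move the node into the new "version" domain; rc1 can no longer schedule here.
  kubectl label node "$node" version=2 --overwrite

  # Shift this node's share of capacity away from the old controller first,
  # so rc1 doesn't recreate the deleted pods on the remaining version=1 nodes.
  old=$((old - 2))
  kubectl scale rc rc1 --replicas="$old"

  # Kill any old replicas still on this node; rc1 cannot put them back here.
  kubectl delete pods -l version=1 --field-selector "spec.nodeName=$node"

  # Grow the new version; its pods can only land on version=2 nodes.
  new=$((new + 2))
  kubectl scale rc rc2 --replicas="$new"
done

kubectl delete rc rc1   # all nodes upgraded; retire the old controller
```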
Hi David, thanks for the proposal! This is indeed exactly what we coded in PR #22442, with the only exception that we kept the rest of the node usable, to allow other RCs to deploy replicas that are independent of the /dev/shm, hence keeping the node's resources available for other types of services. What we would like to do in the end is provide upstream with the changes needed to support this scenario, so that we can stop relying on our own fork of the project and eventually get back to using the official releases. We understand that this should be coded server side, which makes a lot of sense, and we would like guidance on what you guys would prefer, to ensure that we can get a future PR accepted. Thanks! Sylvain
@davidopp, if we are interested in resuming this work by migrating the previous PR to the Deployment object, where would be the best place to start in terms of a proposal? Is #sig-apps the right venue for the initial design discussion?
Yes, sig-apps is probably the right place.
Hi, sorry we did not get a chance to talk in person at KubeCon. Is it possible to do this using your own client? We now have a Go client. If you want to see this built into Deployment, you should write a proposal and discuss it with sig-apps.
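(For anyone exploring the client route, a minimal sketch using the Go client; it assumes the kubernetes/client-go library and an out-of-cluster kubeconfig, and the label and node name are hypothetical. The per-node relabel-and-delete loop from the earlier comment could be driven the same way:)

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Load credentials from a local kubeconfig (out-of-cluster config).
	config, err := clientcmd.BuildConfigFromFlags("", "/path/to/kubeconfig")
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// List the old-version pods on one node -- the per-node unit that a
	// custom "rolling update by node" controller would operate on.
	pods, err := clientset.CoreV1().Pods("default").List(context.TODO(), metav1.ListOptions{
		LabelSelector: "version=1",            // hypothetical pod label
		FieldSelector: "spec.nodeName=node-1", // hypothetical node name
	})
	if err != nil {
		panic(err)
	}
	for _, p := range pods.Items {
		fmt.Println(p.Name)
	}
}
```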
@titilambert I guess we can close this, now that we have coded the logic on the client side for now. Eventually we will look at using either Operators from CoreOS or ThirdPartyResources directly.
@djsly mind sharing your implementation if it's open-sourced?
Hello,
I would like to do a rolling update by node.
Questions: is there currently a way to do this? If not, I can patch it to get something like:

```
kubectl rolling-update frontend-v1 -f frontend-v2.json --by-node
```

Thanks!