
Provide a way to relabel controllers and their pods #36897

Closed
0xmichalis opened this issue Nov 16, 2016 · 23 comments
Labels
area/workload-api/replicaset
lifecycle/frozen (Indicates that an issue or PR should not be auto-closed due to staleness.)
priority/backlog (Higher priority than priority/awaiting-more-evidence.)
sig/apps (Categorizes an issue or PR as relevant to SIG Apps.)

Comments

@0xmichalis
Contributor

0xmichalis commented Nov 16, 2016

Currently, the deployment controller relabels replica sets and pods when it needs to adopt a replica set for a deployment. It would be more convenient to have a replica set endpoint that would force the replica set controller to do all the relabeling (change pod template labels, update old pods, change selector). The implementation should be similar to how the rollback spec works for Deployments.

It would help:

  • users that want to change their selectors but have no good way to relabel old pods. We are adding kubectl set selector in #28949 (Add new command "kubectl set selector"), enabled only for services. Eventually, we could enable it for controllers by using the new endpoint.
  • in dropping the pod update permission from the deployment controller and moving any logic around pods one level down to replica sets (deployments should operate only on replica sets).
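
For illustration only, a request object for such a relabel endpoint might be shaped like the sketch below. None of these names exist in Kubernetes; they are assumptions modeled on how the rollback spec drives Deployment rollbacks.

```go
// Hypothetical type for a ReplicaSet "relabel" subresource, modeled on the
// Deployment rollback spec. Neither the type nor the endpoint exists today;
// this only sketches the proposal.
package v1alpha1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// ReplicaSetRelabel would be POSTed to .../replicasets/{name}/relabel. The
// replica set controller would then update the pod template labels, relabel
// all existing pods, and switch the selector last.
type ReplicaSetRelabel struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	// NewLabels is merged into the pod template labels and, once every
	// existing pod has been updated, into the selector.
	NewLabels map[string]string `json:"newLabels"`
}
```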

@kubernetes/api-review-team @kubernetes/sig-api-machinery @ncdc @liggitt @bgrant0607

@0xmichalis
Contributor Author

cc @mfojtik

@bgrant0607 bgrant0607 added sig/apps Categorizes an issue or PR as relevant to SIG Apps. area/workload-api/replicaset and removed team/ux labels Nov 16, 2016
@bgrant0607
Member

@Kargakis Sorry, quick comments for now.

We may want to think about a subresource, similar to /scale, that could do this uniformly across workload APIs. Maybe we could even extend /scale to be a generic workload-controller interface, but I haven't thought about that much yet.
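
For comparison, the existing /scale subresource is driven by the autoscaling/v1 Scale object, abbreviated below; a relabel (or more generic workload-control) subresource could follow the same spec/status pattern. The Scale fields are taken from the real API; anything beyond that remains speculation.

```go
// Abbreviated shape of the existing autoscaling/v1 Scale object that backs
// the /scale subresource: desired state in spec, observed state in status.
package v1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

type Scale struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   ScaleSpec   `json:"spec,omitempty"`
	Status ScaleStatus `json:"status,omitempty"`
}

type ScaleSpec struct {
	// Desired number of replicas.
	Replicas int32 `json:"replicas,omitempty"`
}

type ScaleStatus struct {
	// Observed number of replicas and the label selector in string form.
	Replicas int32  `json:"replicas"`
	Selector string `json:"selector,omitempty"`
}
```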

@bgrant0607
Member

cc @erictune

@0xmichalis
Contributor Author

@bgrant0607 agreed that this should be similar to /scale, not sure about making it generic.

@0xmichalis
Contributor Author

Use-case: Easier Blue-Green deployments: openshift/origin#11954

@bgrant0607
Member

See also comments here:
#36859 (comment)

Note that one pattern that's safe and doesn't require complex orchestration when one wants to introduce a new attribute is to modify the selectors of existing workload controllers to select pods without the new label key, and then just add the key to new pod templates. This isn't possible with RCs, however, since they don't support the more powerful set-based selectors.
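
A minimal sketch of that pattern, assuming set-based selectors (ReplicaSets/Deployments; not available to RCs) and illustrative "app"/"track" label keys:

```go
// Sketch of the pattern above: the existing controller keeps its pods but
// explicitly excludes the new key, while the new controller carries the key
// in both its selector and its pod template labels.
package example

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// Existing controller: keep selecting current pods, but never adopt pods
// that carry the new key.
var oldSelector = &metav1.LabelSelector{
	MatchLabels: map[string]string{"app": "web"},
	MatchExpressions: []metav1.LabelSelectorRequirement{
		{Key: "track", Operator: metav1.LabelSelectorOpDoesNotExist},
	},
}

// New controller: add the key to its selector and to its pod template labels.
var newSelector = &metav1.LabelSelector{
	MatchLabels: map[string]string{"app": "web", "track": "canary"},
}

var newPodTemplateLabels = map[string]string{"app": "web", "track": "canary"}
```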

@MarkRx

MarkRx commented Nov 28, 2016

Would triggering a relabel spin down old pods and spin up new ones, or would existing pods be reused?

I don't know if this is by design, but at present it isn't really possible to store pod state in labels, as changing the RC causes new pods to be created. Having pod state in labels is useful for services, since it allows them to select pods based on their state.

@0xmichalis
Contributor Author

Changing the RC does not cause new pods to be created; you are probably thinking of Deployments / DeploymentConfigs. Changing an RC will create new pods only if the RC is scaled up or old pods die. Relabeling would merely relabel all old pods to use the new label(s).

@MarkRx

MarkRx commented Nov 28, 2016

If the RC selector is changed, the system would suddenly see 0 pods and try to start scaling up. Since existing pods cannot have their labels changed in an atomic operation when the RC selector is changed, there is possible contention.

@0xmichalis
Contributor Author

If the RC selector is changed, the system would suddenly see 0 pods and try to start scaling up. Since existing pods cannot have their labels changed in an atomic operation when the RC selector is changed, there is possible contention.

The RC selector should change as the last part of the "transaction", i.e., once all old pods and the RC pod template labels have changed. If any of those actions fails, we won't change the selector. I guess once we start using transactions in etcd3, this won't be a problem.
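
A minimal client-go sketch of that ordering (pod template labels first, then existing pods, then the selector last), assuming recent context-aware client-go signatures; it has no retry or conflict handling and ignores pods created while it runs:

```go
package example

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/labels"
	"k8s.io/client-go/kubernetes"
)

// relabelRC adds newKey=newValue to an RC's pod template, to its existing
// pods, and finally to its selector, so the selector never points at pods
// that have not been labeled yet.
func relabelRC(ctx context.Context, c kubernetes.Interface, ns, name, newKey, newValue string) error {
	rc, err := c.CoreV1().ReplicationControllers(ns).Get(ctx, name, metav1.GetOptions{})
	if err != nil {
		return err
	}
	oldSelector := labels.Set(rc.Spec.Selector).String()

	// 1. Pod template labels first (the template still matches the old selector).
	if rc.Spec.Template.Labels == nil {
		rc.Spec.Template.Labels = map[string]string{}
	}
	rc.Spec.Template.Labels[newKey] = newValue
	if rc, err = c.CoreV1().ReplicationControllers(ns).Update(ctx, rc, metav1.UpdateOptions{}); err != nil {
		return err
	}

	// 2. Relabel all existing pods selected by the old selector.
	pods, err := c.CoreV1().Pods(ns).List(ctx, metav1.ListOptions{LabelSelector: oldSelector})
	if err != nil {
		return err
	}
	for i := range pods.Items {
		pod := &pods.Items[i]
		if pod.Labels == nil {
			pod.Labels = map[string]string{}
		}
		pod.Labels[newKey] = newValue
		if _, err := c.CoreV1().Pods(ns).Update(ctx, pod, metav1.UpdateOptions{}); err != nil {
			return err
		}
	}

	// 3. Selector last, once the template and all pods already carry the label.
	rc.Spec.Selector[newKey] = newValue
	_, err = c.CoreV1().ReplicationControllers(ns).Update(ctx, rc, metav1.UpdateOptions{})
	return err
}
```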

@smarterclayton
Contributor

smarterclayton commented Nov 28, 2016 via email

@smarterclayton
Contributor

@bprashanth because we had discussed pod labels changing to track state (like leader). I think the atomicity challenge remains even with etcd3, so we should at least consider whether this is something that we might indicate another way.

@lavalamp
Member

lavalamp commented Nov 30, 2016 via email

@davidopp
Member

Was that officially decided somewhere? I think there are some reasonable use cases for multi-object transactions; for example, the way we do pod binding today is not great because it only updates the pod, not the node, so you can't actually detect conflicts. Even just being able to bump the resource version on the node and the pod atomically, while only actually mutating the NodeName in the pod, would be useful. But I suspect there are other useful scenarios (whether the one here is one of those, I don't know).

We were afraid of the implication for sharding etcd, but we never actually needed to shard etcd (yet)...
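
For context, the binding operation referred to above is the pods/{name}/binding subresource, which takes a core/v1 Binding naming the target node; only the pod is mutated. A minimal sketch, assuming a recent client-go; the namespace, pod, and node names are placeholders:

```go
package example

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// bindPod posts a Binding to the pods/{name}/binding subresource. Only the
// pod is mutated (spec.nodeName gets set); the Node object is not touched,
// which is the asymmetry discussed above.
func bindPod(ctx context.Context, c kubernetes.Interface, ns, podName, nodeName string) error {
	binding := &corev1.Binding{
		ObjectMeta: metav1.ObjectMeta{Name: podName, Namespace: ns},
		Target:     corev1.ObjectReference{Kind: "Node", Name: nodeName},
	}
	return c.CoreV1().Pods(ns).Bind(ctx, binding, metav1.CreateOptions{})
}
```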

@smarterclayton
Contributor

smarterclayton commented Nov 30, 2016 via email

@lavalamp
Member

lavalamp commented Nov 30, 2016 via email

@bgrant0607
Member

It's off topic for this issue, so please don't discuss it further here, but I agree with smarterclayton and lavalamp that we should not support atomic transactions across multiple resources; we have documented that here:
https://github.com/kubernetes/community/blob/master/contributors/design-proposals/architecture.md

OTOH, I could imagine some operation similar to rollback, which performs the relabeling orchestration server-side.
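
For reference, the rollback operation mentioned here was driven by the extensions/v1beta1 DeploymentRollback object POSTed to the deployments/{name}/rollback subresource (later removed in apps/v1); a server-side relabel could accept a similarly shaped, one-shot request. Abbreviated from the API of that era:

```go
// Abbreviated from extensions/v1beta1 as it existed at the time of this
// discussion; the rollback subresource and this type were removed in apps/v1.
package v1beta1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

type DeploymentRollback struct {
	metav1.TypeMeta `json:",inline"`
	// Name of the Deployment to roll back.
	Name string `json:"name"`
	// Annotations to update on the Deployment as part of the rollback.
	UpdatedAnnotations map[string]string `json:"updatedAnnotations,omitempty"`
	// RollbackTo.Revision selects the revision to roll back to (0 means last).
	RollbackTo RollbackConfig `json:"rollbackTo"`
}

type RollbackConfig struct {
	Revision int64 `json:"revision,omitempty"`
}
```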

@bgrant0607 bgrant0607 added the priority/backlog Higher priority than priority/awaiting-more-evidence. label Feb 24, 2017
@bgrant0607
Member

See also #14894

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 21, 2017
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 20, 2018
@bgrant0607
Member

/remove-lifecycle rotten
/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. labels Jan 22, 2018
@janetkuo
Member

The Deployment controller stopped relabeling adopted ReplicaSets and Pods after #61615 was merged, and selectors became immutable in the apps/v1 endpoints. Can we close this issue now that it's no longer needed by workload controllers?

@bgrant0607
Member

I agree we are unlikely to get to this, and permitting it would increase the complexity of object management.
