Revisit use of expectations in controllers #22061

Closed
bgrant0607 opened this issue Feb 26, 2016 · 12 comments
Labels
area/app-lifecycle lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/backlog Higher priority than priority/awaiting-more-evidence. sig/apps Categorizes an issue or PR as relevant to SIG Apps.

Comments

@bgrant0607
Member

Especially the Deployment controller.

While working on #21963 and #19299, we discovered a lot of suspect behavior in the expectations mechanism.

Expectations are supposed to let a controller wait until its previous actions have been observed via watch (a minimal sketch of the idea follows the list below).

However, there appear to be a number of issues:

  • Unexpected events can't be distinguished from the controller's own actions. For RC and RS, the pods are fungible in theory, but not in practice, since they differ in availability (pending vs. running, crashing, readiness, ready time, etc.). For Deployment, they aren't fungible at all, since they can also be different versions.
  • Deletions appear to show up late, and may sometimes be double-counted.
  • The Deployment controller relies on expiration of the expectations in order to function.
  • RC and RS may behave erratically now that they don't use expiration -- we may need to do something similar to what was done in Deployment.
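
For context, the expectations pattern these controllers use looks roughly like the following. This is a deliberately simplified, hypothetical sketch (types and method names are illustrative, not the actual pkg/controller code): the controller records how many creates/deletes it has issued, decrements the counts as the corresponding watch events arrive, and only recomputes its diff once the counts reach zero.

```go
package main

import (
	"fmt"
	"sync"
)

// Expectations tracks how many create/delete events a controller still
// expects to observe via watch before its local cache can be trusted.
// Illustrative sketch only, not the real pkg/controller implementation.
type Expectations struct {
	mu   sync.Mutex
	adds int
	dels int
}

// ExpectCreations records that the controller just issued n create calls.
func (e *Expectations) ExpectCreations(n int) {
	e.mu.Lock()
	defer e.mu.Unlock()
	e.adds += n
}

// ExpectDeletions records that the controller just issued n delete calls.
func (e *Expectations) ExpectDeletions(n int) {
	e.mu.Lock()
	defer e.mu.Unlock()
	e.dels += n
}

// CreationObserved is called from the pod ADD watch handler.
func (e *Expectations) CreationObserved() {
	e.mu.Lock()
	defer e.mu.Unlock()
	if e.adds > 0 {
		e.adds--
	}
}

// DeletionObserved is called from the pod DELETE watch handler.
func (e *Expectations) DeletionObserved() {
	e.mu.Lock()
	defer e.mu.Unlock()
	if e.dels > 0 {
		e.dels--
	}
}

// SatisfiedExpectations reports whether all previously issued actions have
// been seen in the watch stream; only then should the controller compute a
// new diff against its cache.
func (e *Expectations) SatisfiedExpectations() bool {
	e.mu.Lock()
	defer e.mu.Unlock()
	return e.adds <= 0 && e.dels <= 0
}

func main() {
	var exp Expectations
	exp.ExpectCreations(3)                   // controller created 3 pods
	exp.CreationObserved()                   // watch delivered one ADD
	fmt.Println(exp.SatisfiedExpectations()) // false: 2 ADDs still outstanding
}
```

The sketch also makes the first bullet's problem visible: CreationObserved/DeletionObserved are called for every matching watch event, whether or not the controller itself caused it, so unrelated pods can satisfy (or skew) the expectations.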

cc @bprashanth @mqliang @janetkuo @madhusudancs @Kargakis @nikhiljindal

@bgrant0607 bgrant0607 added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. area/app-lifecycle team/ux labels Feb 26, 2016
@bgrant0607 bgrant0607 added this to the next-candidate milestone Feb 26, 2016
@bgrant0607
Member Author

This relates to a discussion @mikedanese, @davidopp, and I had yesterday.

Also related: #14961 (controllerRef, observedGeneration), #22007 (controller annotation of managed resources)

@bgrant0607
Member Author

I effectively added expiration to RS and RC. However, pod creation appears to be fairly slow, so the timeout needs to be long.

#22138 sets the timeout to 5 minutes.

It also eliminates expectations from Deployment.
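
For reference, the expiration being added amounts to treating outstanding expectations as fulfilled once they are older than a timeout, so a controller with lost or miscounted watch events can still resync eventually. A minimal sketch, assuming the 5-minute timeout described above (the constant name and types here are illustrative, not the actual code from #22138):

```go
package main

import (
	"fmt"
	"time"
)

// ExpectationsTimeout mirrors the 5-minute expiration mentioned above
// (illustrative constant; see the referenced PR for the real value).
const ExpectationsTimeout = 5 * time.Minute

// TimestampedExpectations pairs outstanding counts with when they were set.
type TimestampedExpectations struct {
	adds, dels int
	timestamp  time.Time
}

// Fulfilled returns true when all expected events were observed OR the
// expectations have expired, so a wedged controller can eventually resync
// even if some watch events were lost or miscounted.
func (e *TimestampedExpectations) Fulfilled() bool {
	if e.adds <= 0 && e.dels <= 0 {
		return true
	}
	return time.Since(e.timestamp) > ExpectationsTimeout
}

func main() {
	e := &TimestampedExpectations{adds: 2, timestamp: time.Now().Add(-6 * time.Minute)}
	fmt.Println(e.Fulfilled()) // true: expired, resync proceeds despite missing ADDs
}
```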

@bprashanth
Contributor

Is this about the need for refcounting in the RC manager, or something larger? The issues hint at a unified theory of controllers much larger than the ad hoc implementation of expectations that has been cargo-culted around.

The RCs need to be precise, so they need to refcount. Not all controllers have the same requirements. FWIW, in the controllers I've written (ingress/service) I've just invented my own synchronization mechanisms using annotations (sketched below). I believe leaving this up to the controller author should work for the 60% case.
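
As one illustration of the annotation-based synchronization mentioned above (purely a hypothetical sketch; the real ingress/service controllers use their own keys and semantics), a controller can stamp the object with the generation it last reconciled and skip work until the generation moves past it:

```go
package main

import (
	"fmt"
	"strconv"
)

// lastSyncedAnnotation is a hypothetical annotation key chosen for this
// sketch; real controllers pick their own keys and semantics.
const lastSyncedAnnotation = "example.io/last-synced-generation"

// Object is a stand-in for a Kubernetes object with metadata.
type Object struct {
	Generation  int64
	Annotations map[string]string
}

// needsSync reports whether the object changed since the controller last
// finished reconciling it, based on the annotation the controller wrote back.
func needsSync(obj *Object) bool {
	v, ok := obj.Annotations[lastSyncedAnnotation]
	if !ok {
		return true
	}
	last, err := strconv.ParseInt(v, 10, 64)
	return err != nil || last < obj.Generation
}

// markSynced records the generation the controller just reconciled.
func markSynced(obj *Object) {
	if obj.Annotations == nil {
		obj.Annotations = map[string]string{}
	}
	obj.Annotations[lastSyncedAnnotation] = strconv.FormatInt(obj.Generation, 10)
}

func main() {
	obj := &Object{Generation: 3}
	fmt.Println(needsSync(obj)) // true: never synced
	markSynced(obj)
	fmt.Println(needsSync(obj)) // false: annotation matches generation
	obj.Generation = 4          // spec changed
	fmt.Println(needsSync(obj)) // true again
}
```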

@bgrant0607 bgrant0607 removed this from the next-candidate milestone Apr 27, 2016
@bgrant0607 bgrant0607 added priority/backlog Higher priority than priority/awaiting-more-evidence. and removed priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels May 6, 2016
@bgrant0607
Member Author

I'd like to make the RC and RS controllers have deterministic/idempotent behavior.

We already made deletion idempotent by ensuring that deletion victims are chosen deterministically.

I'd like to do the same for creation by making pod names deterministic.

I don't want to use dense indices, however, since that would have misleading connotations. Instead, we could do something like hashing the pod template with a monotonic counter. Deployment takes the max of outstanding revisions and adds one (but doesn't use it in object names), and uses the pod template hash in the RS name. RC/RS could take the max of outstanding incarnations, add one, and hash that together with the pod template. As with Deployment revisions, incarnations could be recorded as annotations (on the pods, in this case).
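
A minimal sketch of that naming idea, with an entirely hypothetical hash and encoding (the real Deployment controller hashes only the pod template; how RC/RS would mix in an incarnation counter is exactly what's being proposed here):

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// deterministicPodName sketches the proposal above: hash the pod template
// together with a monotonically increasing "incarnation" counter, and derive
// pod names from that hash plus an ordinal. Names and encoding are hypothetical.
func deterministicPodName(rcName, podTemplate string, incarnation, ordinal int) string {
	h := fnv.New32a()
	fmt.Fprintf(h, "%s/%d", podTemplate, incarnation)
	return fmt.Sprintf("%s-%x-%d", rcName, h.Sum32(), ordinal)
}

func main() {
	// The same template and incarnation always yield the same names, so a
	// controller that retries a create after a restart produces no duplicates.
	for i := 0; i < 3; i++ {
		fmt.Println(deterministicPodName("frontend", "nginx:1.9", 7, i))
	}
}
```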

@bgrant0607
Member Author

We would need to rewrite the incarnation annotations in the case of adoption.

@bgrant0607
Member Author

See also #7369

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 21, 2017
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jan 20, 2018
@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 20, 2018
@bgrant0607
Member Author

/remove-lifecycle rotten
/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. labels Jan 22, 2018
@sftim
Contributor

sftim commented Jun 27, 2022

Is this issue still one we'd want to track and work on?

@bgrant0607
Member Author

I haven't looked at reported bugs for a long time, but probably we can close this in favor of addressing user-reported issues, if any.

/close

@k8s-ci-robot
Contributor

@bgrant0607: Closing this issue.

In response to this:

I haven't looked at reported bugs for a long time, but probably we can close this in favor of addressing user-reported issues, if any.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
