Revisit use of expectations in controllers #22061

Closed
bgrant0607 opened this issue Feb 26, 2016 · 12 comments
Labels
area/app-lifecycle lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/backlog Higher priority than priority/awaiting-more-evidence. sig/apps Categorizes an issue or PR as relevant to SIG Apps.

Comments

@bgrant0607
Member

Especially the Deployment controller.

While working on #21963 and #19299, we discovered a lot of suspect behavior in the expectations mechanism.

Expectations are supposed to let a controller wait until its previous actions have been observed via watch (a minimal sketch of the idea follows the list below).

However, there appear to be a number of issues:

  • Unexpected events can't be distinguished from the controller's own actions. For RC and RS, the pods are fungible in theory, but not in practice, since they differ in availability (pending vs. running, crashing, readiness, ready time, etc.). For Deployment, they aren't fungible at all, since they can also be different versions.
  • Deletions appear to show up late, and may sometimes be double-counted.
  • The Deployment controller relies on expiration of the expectations in order to function.
  • RC and RS may behave erratically now that they don't use expiration -- we may need to do something similar to what was done in Deployment.
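
For context, the expectations pattern these controllers use looks roughly like the following. This is a deliberately simplified, hypothetical sketch (types and method names are illustrative, not the actual pkg/controller code): the controller records how many creates/deletes it has issued, decrements the counts as the corresponding watch events arrive, and only recomputes its diff once the counts reach zero.

```go
package main

import (
	"fmt"
	"sync"
)

// Expectations tracks how many create/delete events a controller still
// expects to observe via watch before its local cache can be trusted.
// Illustrative sketch only, not the real pkg/controller implementation.
type Expectations struct {
	mu   sync.Mutex
	adds int
	dels int
}

// ExpectCreations records that the controller just issued n create calls.
func (e *Expectations) ExpectCreations(n int) {
	e.mu.Lock()
	defer e.mu.Unlock()
	e.adds += n
}

// ExpectDeletions records that the controller just issued n delete calls.
func (e *Expectations) ExpectDeletions(n int) {
	e.mu.Lock()
	defer e.mu.Unlock()
	e.dels += n
}

// CreationObserved is called from the pod ADD watch handler.
func (e *Expectations) CreationObserved() {
	e.mu.Lock()
	defer e.mu.Unlock()
	if e.adds > 0 {
		e.adds--
	}
}

// DeletionObserved is called from the pod DELETE watch handler.
func (e *Expectations) DeletionObserved() {
	e.mu.Lock()
	defer e.mu.Unlock()
	if e.dels > 0 {
		e.dels--
	}
}

// SatisfiedExpectations reports whether all previously issued actions have
// been seen in the watch stream; only then should the controller compute a
// new diff against its cache.
func (e *Expectations) SatisfiedExpectations() bool {
	e.mu.Lock()
	defer e.mu.Unlock()
	return e.adds <= 0 && e.dels <= 0
}

func main() {
	var exp Expectations
	exp.ExpectCreations(3)                   // controller created 3 pods
	exp.CreationObserved()                   // watch delivered one ADD
	fmt.Println(exp.SatisfiedExpectations()) // false: 2 ADDs still outstanding
}
```

The sketch also makes the first bullet's problem visible: CreationObserved/DeletionObserved are called for every matching watch event, whether or not the controller itself caused it, so unrelated pods can satisfy (or skew) the expectations.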

cc @bprashanth @mqliang @janetkuo @madhusudancs @Kargakis @nikhiljindal

@bgrant0607 bgrant0607 added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. area/app-lifecycle team/ux labels Feb 26, 2016
@bgrant0607 bgrant0607 added this to the next-candidate milestone Feb 26, 2016
@bgrant0607
Member Author

This relates to a discussion @mikedanese, @davidopp, and I had yesterday.

Also related: #14961 (controllerRef, observedGeneration), #22007 (controller annotation of managed resources)

@bgrant0607
Member Author

I effectively added expiration to RS and RC. However, pod creation appears to be fairly slow, so the timeout needs to be long.

#22138 sets the timeout to 5 minutes.

It also eliminates expectations from Deployment.
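
For reference, the expiration being added amounts to treating outstanding expectations as fulfilled once they are older than a timeout, so a controller with lost or miscounted watch events can still resync eventually. A minimal sketch, assuming the 5-minute timeout described above (the constant name and types here are illustrative, not the actual code from #22138):

```go
package main

import (
	"fmt"
	"time"
)

// ExpectationsTimeout mirrors the 5-minute expiration mentioned above
// (illustrative constant; see the referenced PR for the real value).
const ExpectationsTimeout = 5 * time.Minute

// TimestampedExpectations pairs outstanding counts with when they were set.
type TimestampedExpectations struct {
	adds, dels int
	timestamp  time.Time
}

// Fulfilled returns true when all expected events were observed OR the
// expectations have expired, so a wedged controller can eventually resync
// even if some watch events were lost or miscounted.
func (e *TimestampedExpectations) Fulfilled() bool {
	if e.adds <= 0 && e.dels <= 0 {
		return true
	}
	return time.Since(e.timestamp) > ExpectationsTimeout
}

func main() {
	e := &TimestampedExpectations{adds: 2, timestamp: time.Now().Add(-6 * time.Minute)}
	fmt.Println(e.Fulfilled()) // true: expired, resync proceeds despite missing ADDs
}
```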

@bprashanth
Contributor

Is this about the need for refcounting in the RC manager, or something larger? The issues hint at a unified theory of controllers much larger than the ad hoc implementation of expectations that has been cargo-culted around.

The RCs need to be precise, so they need to refcount. Not all controllers have the same requirements. FWIW, in the controllers I've written (ingress/service) I've just invented my own synchronization mechanisms using annotations (sketched below). I believe leaving this up to the controller author should work for the 60% case.
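
As one illustration of the annotation-based synchronization mentioned above (purely a hypothetical sketch; the real ingress/service controllers use their own keys and semantics), a controller can stamp the object with the generation it last reconciled and skip work until the generation moves past it:

```go
package main

import (
	"fmt"
	"strconv"
)

// lastSyncedAnnotation is a hypothetical annotation key chosen for this
// sketch; real controllers pick their own keys and semantics.
const lastSyncedAnnotation = "example.io/last-synced-generation"

// Object is a stand-in for a Kubernetes object with metadata.
type Object struct {
	Generation  int64
	Annotations map[string]string
}

// needsSync reports whether the object changed since the controller last
// finished reconciling it, based on the annotation the controller wrote back.
func needsSync(obj *Object) bool {
	v, ok := obj.Annotations[lastSyncedAnnotation]
	if !ok {
		return true
	}
	last, err := strconv.ParseInt(v, 10, 64)
	return err != nil || last < obj.Generation
}

// markSynced records the generation the controller just reconciled.
func markSynced(obj *Object) {
	if obj.Annotations == nil {
		obj.Annotations = map[string]string{}
	}
	obj.Annotations[lastSyncedAnnotation] = strconv.FormatInt(obj.Generation, 10)
}

func main() {
	obj := &Object{Generation: 3}
	fmt.Println(needsSync(obj)) // true: never synced
	markSynced(obj)
	fmt.Println(needsSync(obj)) // false: annotation matches generation
	obj.Generation = 4          // spec changed
	fmt.Println(needsSync(obj)) // true again
}
```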

@bgrant0607 bgrant0607 removed this from the next-candidate milestone Apr 27, 2016
@bgrant0607 bgrant0607 added priority/backlog Higher priority than priority/awaiting-more-evidence. and removed priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels May 6, 2016
@bgrant0607
Member Author

I'd like to make the RC and RS controllers have deterministic/idempotent behavior.

We already made deletion idempotent by ensuring that deletion victims are chosen deterministically.

I'd like to do the same for creation by making pod names deterministic.

I don't want to use dense indices, however, since that would have misleading connotations. Instead, we could do something like hashing the pod template with a monotonic counter. Deployment takes the max of outstanding revisions and adds one (but doesn't use it in object names), and uses the pod template hash in the RS name. RC/RS could take the max of outstanding incarnations, add one, and hash that together with the pod template. As with Deployment revisions, incarnations could be recorded as annotations (on the pods, in this case).
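
A minimal sketch of that naming idea, with an entirely hypothetical hash and encoding (the real Deployment controller hashes only the pod template; how RC/RS would mix in an incarnation counter is exactly what's being proposed here):

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// deterministicPodName sketches the proposal above: hash the pod template
// together with a monotonically increasing "incarnation" counter, and derive
// pod names from that hash plus an ordinal. Names and encoding are hypothetical.
func deterministicPodName(rcName, podTemplate string, incarnation, ordinal int) string {
	h := fnv.New32a()
	fmt.Fprintf(h, "%s/%d", podTemplate, incarnation)
	return fmt.Sprintf("%s-%x-%d", rcName, h.Sum32(), ordinal)
}

func main() {
	// The same template and incarnation always yield the same names, so a
	// controller that retries a create after a restart produces no duplicates.
	for i := 0; i < 3; i++ {
		fmt.Println(deterministicPodName("frontend", "nginx:1.9", 7, i))
	}
}
```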

@bgrant0607
Member Author

We would need to rewrite the incarnation annotations in the case of adoption.

@bgrant0607
Member Author

See also #7369

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 21, 2017
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jan 20, 2018
@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 20, 2018
@bgrant0607
Member Author

/remove-lifecycle rotten
/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. labels Jan 22, 2018
@sftim
Contributor

sftim commented Jun 27, 2022

Is this issue still one we'd want to track and work on?

@bgrant0607
Member Author

I haven't looked at reported bugs for a long time, but probably we can close this in favor of addressing user-reported issues, if any.

/close

@k8s-ci-robot
Contributor

@bgrant0607: Closing this issue.

In response to this:

I haven't looked at reported bugs for a long time, but probably we can close this in favor of addressing user-reported issues, if any.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
