Controllers can scale expectations ttl #22619

bprashanth · 2016-03-07T01:44:50Z

Currently the ttl is set to: https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/controller_utils.go#L44

In theory it can be infinite, but if a pod gets orphaned by some chance (user error or a bug), the controller will wait 5m before giving up. It can give up much sooner in most situations. One way around this would be to set the timeout to something like: numPods/(watches per second) + padding.

Since we currently don't have an accurate way to track watch latency, we need to calculate this value empirically. The comment indicates that watches per second is 10, but it should be set based on the theoretical limit of pods in the cluster, which depends on the number of nodes and allowed pods per node.

This probably involves opening up a watch on all nodes in the rc manager and only paying attention to adds/deletes (since node updates are really frequent). It also involves determining watches per second for buckets like 100 nodes, 500 nodes, 1000 nodes, etc.
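Here is a minimal sketch of those two pieces: a node counter that reacts only to adds and deletes, and a lookup from cluster size to a measured watch rate. Everything in it is hypothetical (`nodeEventType`, `nodeCounter`, `rateBucket`, `watchRateFor`); it does not use the real watch or informer types. The rate in every bucket is a placeholder set to the 10 from the existing comment, since the real per-bucket numbers still have to be measured.

```go
package main

import (
	"fmt"
	"sync"
)

// nodeEventType is a hypothetical stand-in for the watch event types.
type nodeEventType int

const (
	nodeAdded nodeEventType = iota
	nodeUpdated
	nodeDeleted
)

// nodeCounter tracks the number of nodes using adds and deletes only.
type nodeCounter struct {
	mu    sync.Mutex
	nodes map[string]struct{}
}

func newNodeCounter() *nodeCounter {
	return &nodeCounter{nodes: make(map[string]struct{})}
}

// handle applies one node event. Updates are dropped before taking the lock
// because they are frequent and never change the node count.
func (c *nodeCounter) handle(t nodeEventType, name string) {
	if t == nodeUpdated {
		return
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	switch t {
	case nodeAdded:
		c.nodes[name] = struct{}{}
	case nodeDeleted:
		delete(c.nodes, name)
	}
}

func (c *nodeCounter) count() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return len(c.nodes)
}

// rateBucket pairs a cluster size limit with an empirically measured watch rate.
type rateBucket struct {
	maxNodes         int
	watchesPerSecond float64
}

// watchRateFor returns the rate of the first bucket that fits numNodes.
// Buckets must be sorted by maxNodes; clusters larger than every bucket
// get the last bucket's rate. It returns 0 when there are no buckets.
func watchRateFor(buckets []rateBucket, numNodes int) float64 {
	if len(buckets) == 0 {
		return 0
	}
	for _, b := range buckets {
		if numNodes <= b.maxNodes {
			return b.watchesPerSecond
		}
	}
	return buckets[len(buckets)-1].watchesPerSecond
}

func main() {
	// Placeholder rates: every bucket uses 10 until real measurements exist.
	buckets := []rateBucket{{100, 10}, {500, 10}, {1000, 10}}

	c := newNodeCounter()
	c.handle(nodeAdded, "node-1")
	c.handle(nodeUpdated, "node-1") // ignored
	c.handle(nodeAdded, "node-2")
	c.handle(nodeDeleted, "node-2")

	fmt.Println(c.count(), watchRateFor(buckets, c.count()))
}
```

Keying the counter by node name keeps a repeated add (for example after a relist) from inflating the count.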

If we did this, we'd notice a wedged RC in under 10s in the simple case of 3 nodes and a few orphaned pods.

This would be nice to have for 1.2. I'll try to get to it, but anyone with cycles should jump in.


adohe-zz · 2016-03-07T03:42:32Z

I'll try to do this. If I have any questions, I'll leave them here.

bgrant0607 · 2016-05-20T18:16:04Z

See also #22061

fejta-bot · 2017-12-21T20:23:38Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

fejta-bot · 2018-01-20T21:11:24Z

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle rotten
/remove-lifecycle stale

fejta-bot · 2018-02-19T21:17:59Z

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/close

bprashanth added help-wanted area/controller-manager labels Mar 7, 2016

bprashanth mentioned this issue Mar 7, 2016

RC and RS controllers should cap number of outstanding actions, not wait for all to complete #22599

Open

bgrant0607 added the priority/important-soon label Mar 10, 2016

bgrant0607 removed the help-wanted label Aug 30, 2016

bgrant0607 added sig/apps and removed team/control-plane (deprecated - do not use) labels Mar 8, 2017

k8s-ci-robot added the lifecycle/stale label Dec 21, 2017

k8s-ci-robot added lifecycle/rotten and removed lifecycle/stale labels Jan 20, 2018

k8s-ci-robot closed this as completed Feb 19, 2018
