Make the status.Replica count useful to watchers #5745

Closed
bprashanth opened this issue Mar 20, 2015 · 5 comments
Labels: area/api, priority/backlog

Comments

@bprashanth
Contributor

Currently the status.Replicas count is not very useful to watchers (like stop rc, which actually doesn't watch but should):

  1. It's updated fillCurrentState-style: https://github.com/GoogleCloudPlatform/kubernetes/blob/master/pkg/registry/etcd/etcd.go#L114
  2. If we remove that (we should, in any case), it will only get updated once every 10 seconds.

The quick and dirty solutions are:

  • Decrease the polling interval: this will probably add load on the apiserver.
    • We can probably tolerate a shorter interval if we list pods once in the manager, instead of listing in each rc.
  • Wait on the goroutines that create/delete replicas, assume a 200 response means the request succeeded, and update the count then and there. The catch: if a pod keeps dying (stuck in a create->death loop), status.Replicas would never reflect it, because the create always succeeds (and hence bumps status.Replicas) even though the pod then dies. So the watcher won't be aware of it.
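The "list pods once in the manager" idea could look roughly like the sketch below. It is only an illustration: `Pod`, `Controller`, `matches`, and `countReplicas` are hypothetical, simplified stand-ins rather than the real API types, and skipping pods in the `Failed` phase is an assumption about what "active" should mean. One pod listing is shared by all controllers, and the count comes from observed pod state instead of from successful create calls, so a pod stuck in a create->death loop does not inflate it.

```go
package main

import "fmt"

// Pod and Controller are hypothetical, simplified stand-ins for the real API types.
type Pod struct {
	Name   string
	Labels map[string]string
	Phase  string // e.g. "Running", "Failed"
}

type Controller struct {
	Name     string
	Selector map[string]string
}

// matches reports whether every key/value pair in selector is present in labels.
func matches(selector, labels map[string]string) bool {
	for k, want := range selector {
		if got, ok := labels[k]; !ok || got != want {
			return false
		}
	}
	return true
}

// countReplicas walks a single pod listing and counts, per controller, the pods
// that match its selector. Pods in the Failed phase are skipped (an assumption).
func countReplicas(pods []Pod, rcs []Controller) map[string]int {
	counts := make(map[string]int, len(rcs))
	for _, rc := range rcs {
		counts[rc.Name] = 0
	}
	for _, p := range pods {
		if p.Phase == "Failed" {
			continue
		}
		for _, rc := range rcs {
			if matches(rc.Selector, p.Labels) {
				counts[rc.Name]++
			}
		}
	}
	return counts
}

func main() {
	pods := []Pod{
		{Name: "a", Labels: map[string]string{"app": "web"}, Phase: "Running"},
		{Name: "b", Labels: map[string]string{"app": "web"}, Phase: "Failed"},
		{Name: "c", Labels: map[string]string{"app": "db"}, Phase: "Running"},
	}
	rcs := []Controller{
		{Name: "web", Selector: map[string]string{"app": "web"}},
		{Name: "db", Selector: map[string]string{"app": "db"}},
	}
	fmt.Println(countReplicas(pods, rcs))
}
```

The cost is one list call per sync regardless of how many rcs exist, which is what would make a shorter poll interval more tolerable.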

@lavalamp thoughts on just watching the status field of all pods in the manager through the controller framework? I feel like we either need to do that, or list once in the manager and decrease the poll interval.
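For comparison, the watch-based alternative might be sketched as follows. This does not use the actual controller framework: `PodEvent`, `replicaTracker`, and the `Owner` field are hypothetical names invented for illustration, and treating only `Running` pods as active is an assumption. The point is that the count changes as soon as a pod's status changes, with no wait for the next poll.

```go
package main

import "fmt"

type EventType string

const (
	Added    EventType = "ADDED"
	Modified EventType = "MODIFIED"
	Deleted  EventType = "DELETED"
)

// PodEvent is a hypothetical, simplified watch event.
type PodEvent struct {
	Type  EventType
	Pod   string // pod name
	Owner string // controller whose selector matches the pod
	Phase string // pod phase carried by the event
}

// replicaTracker keeps, per controller, the set of pods currently considered active.
type replicaTracker struct {
	active map[string]map[string]bool // owner -> pod name -> present
}

func newReplicaTracker() *replicaTracker {
	return &replicaTracker{active: make(map[string]map[string]bool)}
}

// handle applies one event and returns the updated active count for its owner.
// Using a set keyed by pod name keeps repeated MODIFIED events from double counting.
func (rt *replicaTracker) handle(ev PodEvent) int {
	pods := rt.active[ev.Owner]
	if pods == nil {
		pods = make(map[string]bool)
		rt.active[ev.Owner] = pods
	}
	if ev.Type == Deleted || ev.Phase != "Running" {
		delete(pods, ev.Pod)
	} else {
		pods[ev.Pod] = true
	}
	return len(pods)
}

func main() {
	// A buffered channel stands in for the watch stream.
	events := make(chan PodEvent, 4)
	events <- PodEvent{Type: Added, Pod: "web-1", Owner: "web", Phase: "Running"}
	events <- PodEvent{Type: Added, Pod: "web-2", Owner: "web", Phase: "Running"}
	events <- PodEvent{Type: Modified, Pod: "web-1", Owner: "web", Phase: "Failed"}
	events <- PodEvent{Type: Deleted, Pod: "web-2", Owner: "web", Phase: "Running"}
	close(events)

	rt := newReplicaTracker()
	for ev := range events {
		fmt.Printf("%s %s replicas=%d\n", ev.Type, ev.Pod, rt.handle(ev))
	}
}
```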

@lavalamp
Member

That code linked in 1. looks like it should have been deleted when we changed controller manager to fill in the replicas field. Can we take that out ASAP?

I think having watchers get updated "only" every 10 seconds is not actually a problem at this point in time.

@bprashanth
Contributor Author

That happened automatically when we started using generic etcd: https://github.com/GoogleCloudPlatform/kubernetes/blob/master/pkg/master/master.go#L380. I forgot to delete the code in registry/etcd, will do.

@lavalamp
Member

Ah.

So is a fair TL;DR: "is a 10 second refresh for rc.status.replicas fast enough?"

@bprashanth
Contributor Author

Yes. I get the feeling it's not fast enough, because right now it could be 20s before the count is accurate, which slows down stop (even though stop will soon use watch, eliminating its own 3s poll interval).

We can probably make it as fast as we want when we stop listing all the things from the apiserver.

Correction: in spite of the 20s lag in the general case, stop will only be slowed down by 10s, because resizing the rc to 0 will trigger a sync loop.

@lavalamp
Member

To answer my own question: Yes, 10 seconds sounds totally reasonable. I think this is working now, and the parts that aren't perfect are covered by other issues.
