Add a QPS limiter to the kubernetes client. #6203
Conversation
Does this have to be inside RESTClient? Would we expect QPS to be stored in the client config on disk? Otherwise, RESTClient is designed to be wrapped, so could this be a wrapper on RESTClient instead? Kube Client should/can take an interface for RESTClient (although I may not have completed that refactor yet).
Assigning to @smarterclayton, feel free to reassign :)
There is no RESTClient interface yet, so I don't think the refactor is complete. RESTClient is really the only place to store this for now, since it is the central point that all of the other client interfaces route through. I'd rather not gate this PR on that refactor. I can plumb this up into a config flag if you want me to. Thanks!
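For illustration, here is a minimal sketch of the idea being settled on: gate every request at the one place all client interfaces route through. The names are hypothetical, not the actual client code.

```go
// Sketch: throttle at the central request path. Hypothetical names only.
package main

import (
	"fmt"
	"time"
)

// restClient stands in for RESTClient; every request waits on the limiter.
type restClient struct {
	limit *time.Ticker // one tick per allowed request
}

func newRESTClient(qps float64) *restClient {
	return &restClient{limit: time.NewTicker(time.Duration(float64(time.Second) / qps))}
}

func (c *restClient) Do(verb, path string) {
	<-c.limit.C             // block until the next slot; this is the QPS gate
	fmt.Println(verb, path) // the real client would issue the HTTP request here
}

func main() {
	c := newRESTClient(5) // e.g. 5 QPS
	start := time.Now()
	for i := 0; i < 10; i++ {
		c.Do("GET", "/api/v1beta1/pods")
	}
	fmt.Printf("10 requests in %v\n", time.Since(start).Round(time.Millisecond))
}
```

The same gate could just as easily live in a wrapper type that holds a RESTClient, which is the alternative raised above.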
There is, but it's in pkg/resource. I'm fine with this as a starting point as long as you add a TODO to hoist it out of client.
pkg/kubectl/resource, I meant.
Fixed the test and added a TODO. PTAL. Thanks!
ok, fixed tests for realz...
(Branch updated: 5834836 to c0e46a0.)
Switched to the new RateLimiter::Accept() method.
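For context, the blocking Accept() call pattern referred to here can be sketched as a token bucket like the one below. This is an illustration of the call pattern only, not the actual pkg/util RateLimiter implementation.

```go
// Sketch of a blocking-Accept token-bucket limiter; illustrative only.
package main

import "time"

// RateLimiter gates callers: Accept blocks until the caller may proceed.
type RateLimiter interface {
	Accept()
	Stop()
}

type tokenBucket struct {
	tokens chan struct{}
	ticker *time.Ticker
	done   chan struct{}
}

// newTokenBucket refills up to `burst` tokens at `qps` tokens per second.
func newTokenBucket(qps float64, burst int) RateLimiter {
	tb := &tokenBucket{
		tokens: make(chan struct{}, burst),
		ticker: time.NewTicker(time.Duration(float64(time.Second) / qps)),
		done:   make(chan struct{}),
	}
	go func() {
		for {
			select {
			case <-tb.ticker.C:
				select {
				case tb.tokens <- struct{}{}:
				default: // bucket already full; drop the token
				}
			case <-tb.done:
				return
			}
		}
	}()
	return tb
}

func (tb *tokenBucket) Accept() { <-tb.tokens }

func (tb *tokenBucket) Stop() {
	tb.ticker.Stop()
	close(tb.done)
}

func main() {
	rl := newTokenBucket(5, 10) // 5 QPS with a burst of 10
	defer rl.Stop()
	for i := 0; i < 3; i++ {
		rl.Accept() // blocks until a token is available
		// a real caller would issue its API request here
	}
}
```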
What's the relation between this and …?
ugh
Usually it's not having enough CPU - you can set maxprocs or severely CPU-constrain the resources for test-cmd to emulate it. I run it on a one-CPU VM.
Ok, I updated test-cmd.sh (it was using …). PTAL.
LGTM, you can self-merge because I'm not at my laptop.
Add a QPS limiter to the kubernetes client.
Can folks please label this as perf for tracking?
@timothysc done.
A side effect of this PR is that the density test is creating pods much more slowly than before (~6 pods per second). For a 50-node cluster and an rc with 1500 replicas (30 pods per node), this gives 300 seconds just to create the rc. The same applies to stopping the rc and to scheduling. Is this intentional?
As discussed offline with Filip, I think we should increase the limits. I'm not sure what to do with the replication controller - I think we should increase the limit there too, but it's more subtle. We will definitely be throttling such cases (i.e. resizing from 0 to 1500), and I'm not sure how long that should take for the replication controller, but we definitely don't want to make it slower than the scheduler can process the pods.
It seems weird to me that we are sprinkling magic numbers in vs. doing backoffs.
Derek and I were talking - all of the controllers in the controller-manager are sharing one client. Maybe we should have multiple clients per controller-manager.
@smarterclayton I don't think it'd help. Even with a separate client we'd still only be able to create 5 pods per second.
I was more concerned with the RC preventing other controllers from running. There's no "fair" sharing if they have the same limiter, so Derek's and my concern was that the replication controller manager would starve other things like the ResourceQuota and ResourceUsage controllers, which means it's potentially possible for those never to get updated (in theory).
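A rough sketch of the "one client per controller" idea discussed above: if each controller gets its own rate-limited client, a burst from the replication controller cannot starve the quota/usage controllers. Names are illustrative, not the real controller-manager wiring, and (as noted) this improves fairness without raising the overall pod-creation rate.

```go
// Sketch: one rate-limited client per controller instead of a single shared one.
package main

import (
	"fmt"
	"time"
)

// limitedClient is a stand-in for a kube client with its own QPS gate.
type limitedClient struct {
	name   string
	ticker *time.Ticker
}

func newLimitedClient(name string, qps float64) *limitedClient {
	return &limitedClient{name: name, ticker: time.NewTicker(time.Duration(float64(time.Second) / qps))}
}

func (c *limitedClient) Do(verb, path string) {
	<-c.ticker.C // each client blocks only on its own limiter
	fmt.Println(c.name, verb, path)
}

func main() {
	rcClient := newLimitedClient("replication-controller", 5)
	quotaClient := newLimitedClient("resource-quota", 5)

	done := make(chan struct{}, 2)
	go func() { // heavy user: creating many pods
		for i := 0; i < 10; i++ {
			rcClient.Do("POST", "/api/v1beta1/pods")
		}
		done <- struct{}{}
	}()
	go func() { // light user: still makes progress despite the burst above
		for i := 0; i < 3; i++ {
			quotaClient.Do("PUT", "/api/v1beta1/resourceQuotas/quota")
		}
		done <- struct{}{}
	}()
	<-done
	<-done
}
```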
This is a starting point. I'm open to other settings for max QPS (or, alternately, setting it via a flag with different values for different binaries).
See #5865
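One way to make the limit configurable per binary, as suggested above, would be a plain command-line flag wired into client construction. The flag name, default, and newRESTClient helper below are hypothetical, not the real ones.

```go
// Sketch: expose the maximum client QPS as a flag so each binary can pick its own value.
package main

import (
	"flag"
	"fmt"
)

var maxQPS = flag.Float64("max-client-qps", 10.0, "maximum queries per second to the API server")

func main() {
	flag.Parse()
	// The parsed value would be passed into the client factory, e.g.:
	//   client := newRESTClient(*maxQPS)
	fmt.Printf("client QPS limit: %.1f\n", *maxQPS)
}
```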