
Add a QPS limiter to the kubernetes client. #6203

Merged — 1 commit merged into kubernetes:master on Apr 3, 2015

Conversation

brendandburns (Contributor)

This is a starting point. I'm open to other settings for max QPS (or, alternatively, to setting it via a flag, differently for different binaries).

See #5865
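To make the idea concrete, here is a minimal, self-contained sketch of the pattern under discussion: a token-bucket limiter that every outgoing client request passes through, capping the sustained request rate at some max QPS while still allowing short bursts. This is illustrative only, not the code from this PR; the 5 QPS / burst-of-10 numbers are placeholders.

```go
package main

import (
	"fmt"
	"time"
)

// tokenBucket refills at qps tokens per second and holds at most burst tokens.
type tokenBucket struct {
	tokens chan struct{}
}

func newTokenBucket(qps float64, burst int) *tokenBucket {
	tb := &tokenBucket{tokens: make(chan struct{}, burst)}
	for i := 0; i < burst; i++ {
		tb.tokens <- struct{}{} // start full so the first requests are not delayed
	}
	go func() {
		for range time.Tick(time.Duration(float64(time.Second) / qps)) {
			select {
			case tb.tokens <- struct{}{}:
			default: // bucket already full; drop the token
			}
		}
	}()
	return tb
}

// Accept blocks until a token is available.
func (tb *tokenBucket) Accept() { <-tb.tokens }

func main() {
	limiter := newTokenBucket(5.0, 10) // placeholder numbers: 5 QPS, burst of 10
	for i := 0; i < 3; i++ {
		limiter.Accept() // every client request would pass through this gate
		fmt.Println("request", i, "sent at", time.Now().Format(time.StampMilli))
	}
}
```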

@smarterclayton (Contributor)

Does this have to be inside the rest client? Would we expect QPS to be stored in the client config on disk? Otherwise, RESTClient is designed to be wrapped, so could this instead be a wrapper on RESTClient? Kube Client should/can take an interface for RESTClient (although I may not have completed that refactor yet).
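For readers following along, the wrapping alternative would look roughly like the sketch below. The names here (doer, throttledClient, unlimited) are hypothetical and stand in for the actual kubernetes client types; the limiter's blocking Accept() style matches what is discussed later in the thread.

```go
package main

import "net/http"

// doer is a hypothetical minimal client interface; the RESTClient interface
// discussed in this thread would stand in for it.
type doer interface {
	Do(req *http.Request) (*http.Response, error)
}

// rateLimiter uses the blocking Accept() style mentioned later in the thread.
type rateLimiter interface {
	Accept()
}

// throttledClient wraps any doer and gates each request on the limiter,
// leaving the underlying client untouched.
type throttledClient struct {
	inner   doer
	limiter rateLimiter
}

func (c *throttledClient) Do(req *http.Request) (*http.Response, error) {
	c.limiter.Accept() // block until the limiter admits this request
	return c.inner.Do(req)
}

// unlimited is a no-op limiter used only to keep this sketch self-contained.
type unlimited struct{}

func (unlimited) Accept() {}

func main() {
	// Wiring only: a real limiter (e.g. a token bucket) would cap the rate.
	client := &throttledClient{inner: http.DefaultClient, limiter: unlimited{}}
	_ = client
}
```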

@vmarmol (Contributor) commented Mar 31, 2015

Assigning to @smarterclayton, feel free to reassign :)

@brendandburns (Contributor, Author)

There is no RESTClient interface yet, so I don't think the refactor is complete. RESTClient is really the only place to store this for now, since it is the central point that all of the other client interfaces route through.

I'd rather not gate this PR on that refactor. I can plumb this up into a config flag if you want me to.

(gofmt'd now too...)

Thanks!
--brendan

@smarterclayton (Contributor)

There is, but it's in pkg/resource.

I'm fine with this as a starting point as long as you add a TODO to hoist it out of client.


@smarterclayton (Contributor)

I meant pkg/kubectl/resource.


@brendandburns (Contributor, Author)

Fixed the test and added a TODO. PTAL.

Thanks!
--brendan

@brendandburns (Contributor, Author)

ok, fixed tests for realz...

@brendandburns force-pushed the qps branch 2 times, most recently from 5834836 to c0e46a0, on April 3, 2015 at 04:35
@brendandburns (Contributor, Author)

Switched to the new RateLimiter::Accept() method.
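For context, the practical difference between a blocking Accept() and a non-blocking check is roughly the following. The names and interfaces here are illustrative, not the actual util package: Accept() delays the request until the bucket has capacity, while a non-blocking check forces the caller to handle a rejection.

```go
package main

import (
	"errors"
	"fmt"
)

// acceptor is the blocking style: the call waits until capacity is available.
type acceptor interface {
	Accept()
}

// checker is a hypothetical non-blocking style: the caller handles rejection.
type checker interface {
	CanAccept() bool
}

func sendBlocking(lim acceptor, send func() error) error {
	lim.Accept() // wait for the limiter, then send
	return send()
}

func sendNonBlocking(lim checker, send func() error) error {
	if !lim.CanAccept() {
		return errors.New("client rate limit exceeded") // caller would retry or back off
	}
	return send()
}

// alwaysOK satisfies both interfaces so the sketch runs on its own.
type alwaysOK struct{}

func (alwaysOK) Accept()         {}
func (alwaysOK) CanAccept() bool { return true }

func main() {
	send := func() error { fmt.Println("request sent"); return nil }
	_ = sendBlocking(alwaysOK{}, send)
	_ = sendNonBlocking(alwaysOK{}, send)
}
```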

@satnam6502 (Contributor)

What's the relation between this and Retry-After?

@brendandburns (Contributor, Author)

Ugh, test-cmd.sh is failing on Shippable/Travis but not on my machine. @smarterclayton any suggestions for debugging?

@smarterclayton (Contributor)

Usually it's not having enough CPU - you can set maxprocs or severely CPU-constrain the resources for test-cmd to emulate it. I run it on a one-CPU VM.


@brendandburns (Contributor, Author)

Ok, I updated test-cmd.sh; it was using delete rc <foo>, not stop rc <foo>.

ptal
Thanks
--brendan

@smarterclayton (Contributor)

LGTM, you can self-merge because I'm not at my laptop.

brendandburns added a commit that referenced this pull request Apr 3, 2015
Add a QPS limiter to the kubernetes client.
@brendandburns merged commit 88dbdc4 into kubernetes:master on Apr 3, 2015
@timothysc (Member)

Can folks please label this as perf for tracking?

@roberthbailey added the sig/scalability label on Apr 6, 2015
@roberthbailey (Contributor)

@timothysc done.

@fgrzadkowski (Contributor)

A side effect of this PR is that the density test creates pods much more slowly than before (~6 pods per second). For a 50-node cluster and an rc with 1500 replicas (30 pods per node), this gives 300 seconds just to create the rc.

The same applies to stopping the rc and to scheduling.

Is this intentional?

@wojtek-t (Member) commented Apr 7, 2015

As discussed offline with Filip, I think we should increase the limits.
Currently (in a healthy situation) the scheduler is able to schedule at least 10 pods per second. We shouldn't throttle the healthy case down to only 2-3 bindings per second (see #3954) [2-3 bindings per second is what experiments show with the current throttling].
So I think increasing the limit by, say, 5x is what we should do here.

I'm not sure what to do about the replication controller - I think we should increase its limit too, but it's more subtle there. We will definitely be throttling cases like resizing from 0 to 1500, and I'm not sure how long that should take for the replication controller, but we definitely don't want to make it slower than the scheduler can process the pods.

@timothysc (Member)

It seems weird to me that we are sprinkling in magic numbers rather than doing backoffs.

@smarterclayton (Contributor)

Derek and I were talking - all of the controllers in the controller-manager share one client. Maybe we should have multiple clients per controller-manager.

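A sketch of the per-controller-client idea being floated here: give each controller its own rate-limited client so a busy replication manager cannot consume the entire shared budget. The wiring and QPS values below are hypothetical, not the actual controller-manager code.

```go
package main

import (
	"fmt"
	"time"
)

// throttledClient is a stand-in for a rate-limited API client: do() blocks
// until this client's own limiter allows another request.
type throttledClient struct {
	ticks <-chan time.Time
}

func newThrottledClient(qps float64) *throttledClient {
	return &throttledClient{ticks: time.Tick(time.Duration(float64(time.Second) / qps))}
}

func (c *throttledClient) do(op string) {
	<-c.ticks // each client consumes only its own budget
	fmt.Println(op)
}

func main() {
	// Instead of one shared client (where the replication manager could starve
	// the quota/usage controllers), give each controller its own limiter.
	rcClient := newThrottledClient(5) // placeholder QPS values
	quotaClient := newThrottledClient(5)

	go func() {
		for {
			rcClient.do("replication controller: create pod")
		}
	}()
	go func() {
		for {
			quotaClient.do("resource quota controller: update quota")
		}
	}()
	time.Sleep(2 * time.Second) // let both loops run briefly, then exit
}
```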

@fgrzadkowski (Contributor)

@smarterclayton I don't think it'd help. Even with a separate client we'd only be able to create 5 pods per second.

@smarterclayton (Contributor)

I was more concerned with the RC preventing other controllers from running. There's no "fair" sharing if they have the same limiter, so Derek's and my concern was that the replication controller manager would starve other things like the ResourceQuota and ResourceUsage controllers, which means it's potentially possible for those never to get updated (in theory).

