
Limit events accepted by API Server #50925

Merged

Conversation

staebler
Contributor

@staebler staebler commented Aug 18, 2017

What this PR does / why we need it:
This PR adds the ability to limit events processed by an API server. Limits can be set globally on a server, per-namespace, per-user, and per-source+object. This is needed to prevent badly configured or misbehaving clients from making a cluster unstable.

Please see kubernetes/community#945.

Release Note:

Adds a new alpha EventRateLimit admission controller that is used to limit the number of event requests accepted by the API server.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Aug 18, 2017
@k8s-ci-robot
Contributor

Hi @staebler. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Aug 18, 2017
@k8s-github-robot k8s-github-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. release-note-label-needed labels Aug 18, 2017
@deads2k deads2k self-assigned this Aug 18, 2017
@staebler staebler force-pushed the server-event-rate-limiter branch from c2d0db3 to 2e94750 on August 18, 2017 20:24

// newEventRateLimitWithClock configures an admission controller that can enforce event rate limits.
// It uses a clock for testing purposes.
func newEventRateLimitWithClock(config *eventratelimitapi.Configuration, clock flowcontrol.Clock) (admission.Interface, error) {
Contributor

Update the method to always use a clock and just call it with a real clock from the register. This mass of "new" constructors is hard to follow.
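
For illustration, a sketch of the shape being asked for (PluginName, LoadConfiguration, and realClock are stand-ins here, not the PR's actual code):

// single constructor; tests inject a fake clock, registration passes a real one
func newEventRateLimit(config *eventratelimitapi.Configuration, clock flowcontrol.Clock) (admission.Interface, error)

// in Register:
plugins.Register(PluginName, func(config io.Reader) (admission.Interface, error) {
	cfg, err := LoadConfiguration(config)
	if err != nil {
		return nil, err
	}
	return newEventRateLimit(cfg, realClock{})
})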

}

// newEventRateLimit configures an admission controller that can enforce event rate limits
func newEventRateLimitUsingRLF(config *eventratelimitapi.Configuration, rlf rateLimiterFactory) (admission.Interface, error) {
Contributor

If you eliminate the clock variant above and always require it, you can eliminate this too.

	if err != nil {
		return nil, err
	}
	eventRateLimitAdmission.limitEnforcers = append(eventRateLimitAdmission.limitEnforcers, enforcer)
Contributor

nit: construct this array above and initialize the eventRateLimitAdmission in one init block instead of mutating it here.
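
For example, a sketch of that shape (field and helper names are illustrative):

limitEnforcers := make([]*limitEnforcer, 0, len(config.Limits))
for _, limitConfig := range config.Limits {
	enforcer, err := newLimitEnforcer(limitConfig, rlf)
	if err != nil {
		return nil, err
	}
	limitEnforcers = append(limitEnforcers, enforcer)
}
eventRateLimitAdmission := &eventRateLimitAdmission{
	Handler:        admission.NewHandler(admission.Create),
	limitEnforcers: limitEnforcers,
}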

	rateLimiter flowcontrol.RateLimiter
}

func newSingleCache(rateLimiter flowcontrol.RateLimiter) *singleCache {
Contributor

The style in the project doesn't use this for simple construction of private (or even public) structs. Imagine all the anti-patterns from every other language and that's the Go style.

return nil
}

// cache is an interface for caching the limits of a particular type
Contributor

move this and both implementations to another file in this package.

type eventRateLimitAdmission struct {
	sync.RWMutex
	*admission.Handler
	limitEnforcers []*limitEnforcer
Contributor

It would be good to indicate how many of these you expect. Fewer than 10 by the looks of it.

Contributor Author

What type of indication are you looking for here? Are you suggesting a comment? Or are you suggesting something like [4]*limitEnforcer?

Contributor

What type of indication are you looking for here? Are you suggesting a comment? Or are you suggesting something like [4]*limitEnforcer?

comment


// eventRateLimitAdmission implements an admission controller that can enforce event rate limits
type eventRateLimitAdmission struct {
	sync.RWMutex
Contributor

Don't anonymously include the mutex
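
That is, give the mutex a name so it isn't part of the type's exported method set, e.g.:

type eventRateLimitAdmission struct {
	*admission.Handler
	// lock guards whatever state actually needs guarding
	lock           sync.RWMutex
	limitEnforcers []*limitEnforcer
}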

type eventRateLimitAdmission struct {
	sync.RWMutex
	*admission.Handler
	limitEnforcers []*limitEnforcer
Contributor

read-only after construction?

Contributor Author

Would you mind elaborating on this? I don't know how to make something read-only.

Contributor

Would you mind elaborating on this? I don't know how to make something read-only.

It was a question: is it read-only after construction? If so, a comment would help.
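
For example (illustrative):

// limitEnforcers is read-only after construction, so no locking is needed to
// iterate over it in Admit.
limitEnforcers []*limitEnforcer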

func newLimitEnforcer(config eventratelimitapi.Limit, rlf rateLimiterFactory) (*limitEnforcer, error) {
	limitType := config.Type
	newRateLimiter := func() flowcontrol.RateLimiter {
		return rlf(config.QPS, config.Burst)
Contributor

This doesn't look valuable. Why not just pass the func() flowcontrol.RateLimiter you need?

Contributor Author

I am not quite sure what you are suggesting here. I may have addressed, or avoided, this with other changes. But you may want to check that this specifically is addressed in the next revision of changes.
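
If the suggestion is to pass the factory directly, it would look roughly like this (signature and call site are illustrative, not the actual change):

// take the limiter factory that is actually needed
func newLimitEnforcer(config eventratelimitapi.Limit, newRateLimiter func() flowcontrol.RateLimiter) (*limitEnforcer, error)

// caller builds the closure once from the config and the clock
enforcer, err := newLimitEnforcer(limitConfig, func() flowcontrol.RateLimiter {
	return flowcontrol.NewTokenBucketRateLimiterWithClock(limitConfig.QPS, limitConfig.Burst, clock)
})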

	filter := rateLimiter.TryAccept()

	// update the cache
	enforcer.cache.add(key, rateLimiter)
Contributor

This looks odd. We have to do a write every accept call? Did this belong in that !found block above?

Contributor Author

It seemed odd to me, too. It was copied over from the event rate limiting code in the client-side event recorder. I had assumed that Add was needed to make the key the most recently used. I see from the lru code, though, that Get does that already, which makes sense.

	// get the rate limiter associated with the specified key
	get(key interface{}) (rateLimiter flowcontrol.RateLimiter, found bool)
	// add the specified rate limiter to the cache, associated with the specified key
	add(key interface{}, rateLimiter flowcontrol.RateLimiter)
Contributor

Why do we have this lever? This is an attempt to avoid plumbing the func() RateLimiter through? Seems like the get call should always return a valid RateLimiter, and we have no need for a separate get.

Contributor

Er, no need for a separate add.
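
In other words, fold the creation into the lookup so callers only ever see a ready limiter, roughly (sketch, names illustrative):

// get returns the rate limiter for key, creating and caching one if needed.
func (c *lruCache) get(key interface{}) flowcontrol.RateLimiter {
	if value, found := c.cache.Get(key); found {
		return value.(flowcontrol.RateLimiter)
	}
	rateLimiter := c.newRateLimiter()
	c.cache.Add(key, rateLimiter)
	return rateLimiter
}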

return nil
}

a.Lock()
Contributor

This is questionable. You're doing this because the rate limiter isn't thread-safe, or for something else? I thought the rate limiters were thread-safe (I remember looking back when we were working on the client).

Contributor Author

The locking is required because the LRU cache is not thread-safe. I will move the lock into lruCache to minimize the duration the lock is held.

return nil
}

// getNamespaceKey returns a key for a parceledGate that is based on the namespace of the event request
Contributor

Don't say parcel. This is a key for the request, right?

return userInfo.GetName()
}

// getSourceObjecttKey returns a key for a parceledGate that is based on the source+object of the event
Contributor

Minus parcel. This is a key for the request/limit tuple.

	// give each limit enforcer a chance to reject the event
	for _, enforcer := range a.limitEnforcers {
		if err := enforcer.accept(attr); err != nil {
			return err
Contributor

you can't short-circuit here. You need to count against all the rate limits.

Contributor Author

Derek and I discussed this a while back, coming to the conclusion that it was adequate to short-circuit. The behavior is similar to having each limit type be a separate admission control. If an earlier admission control in the chain rejects a request, the later admission controls never see that request and do not have the opportunity to count the request against any limits. I don't think that it really matters too much in the grand scheme of things whether we count a rejected request against all limits or only a subset of limits. If you feel strongly that each rejected request must count against all limits, then I will make that change. Logically, it probably makes the most sense to have a rejected request count against none of the limits, as it was rejected. This is harder to implement, though. And, again, I don't think it matters too much in the overall flow of event requests.

Contributor

If you feel strongly that each rejected request must count against all limits, then I will make that change.

I feel pretty strongly. Either all or none, please.
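
For the "all" variant, the loop would keep counting after the first rejection, along these lines (sketch):

// give every limit enforcer a chance to count and, possibly, reject the event
var rejectionErr error
for _, enforcer := range a.limitEnforcers {
	if err := enforcer.accept(attr); err != nil && rejectionErr == nil {
		// remember the first rejection but keep charging the remaining limits
		rejectionErr = err
	}
}
return rejectionErr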

// "namespace": limits are maintained against events from each namespace
// "user": limits are maintained against events from each user
// "source+object": limits are maintained against events from each source+object
Type string `json:"type"`
Contributor

Alias the string type: type LimitType string.
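
Applied to this field, that would look like (sketch):

// LimitType is the type of the limit to which a configuration applies.
type LimitType string

// Type is the type of limit to which this configuration applies
Type LimitType `json:"type"`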

	QPS float32 `json:"qps"`

	// Maximum burst for throttle of events for this limit
	Burst int ` json:"burst"`
Contributor

Use int64. External types must specify the width; to match, the internal type can specify it too.

	//
	// If the type of limit is "server", then CacheSize is ignored and can be
	// omitted.
	CacheSize int `json:"cacheSize"`
Contributor

This should have a default if it isn't specified. I'm ok with the default varying based on type. I'd also make this omitempty. I don't think most people will care, but I could see it mattering for namespaces.
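
A defaulting sketch, assuming the usual defaults.go pattern for the versioned type (the value 4096 and the ServerLimitType guard are illustrative, not what the PR settled on):

// cacheSize becomes omitempty in the JSON tag, and gets a type-dependent default:
func SetDefaults_Limit(obj *Limit) {
	// only the non-server limit types keep a per-key LRU cache
	if obj.Type != ServerLimitType && obj.CacheSize == 0 {
		obj.CacheSize = 4096
	}
}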

@deads2k
Contributor

deads2k commented Aug 23, 2017

left some initial comments. Let's work through those and then re-review. I'm going to start again from the top, so squashing the changes into the main commit is fine.

@staebler staebler force-pushed the server-event-rate-limiter branch 2 times, most recently from e0f1cd8 to 9952de7 on August 29, 2017 02:25
@k8s-github-robot k8s-github-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Aug 31, 2017
@staebler
Contributor Author

/retest

@deads2k
Contributor

deads2k commented Aug 31, 2017

update bazel

@deads2k
Contributor

deads2k commented Aug 31, 2017

gofmt

@deads2k
Contributor

deads2k commented Aug 31, 2017

also, squash

}

func (c *lruCache) get(key interface{}) flowcontrol.RateLimiter {
	c.mutex.Lock()
Contributor

Given a thread-safe LRU cache and the fact that we don't care if we're off by a little bit, we don't actually care if this method is locked anymore, do we?

Contributor Author

Yes. Oversight on my part to not remove the mutex when I switched LRU cache implementations.
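
With a thread-safe LRU (e.g. hashicorp/golang-lru, whose Get and Add are already synchronized), the wrapper does not need its own mutex at all; roughly:

type lruCache struct {
	// no mutex: the underlying LRU is safe for concurrent use, and being off
	// by a request or two under contention is acceptable here
	newRateLimiter func() flowcontrol.RateLimiter
	cache          *lru.Cache
}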

@staebler
Contributor Author

staebler commented Sep 1, 2017

I do not know what to do with this test failure. The failure is the k8s.io/kubernetes/test/integration/garbagecollector TestCascadingDeletion test. I do not see how that test is affected by these changes.

@staebler
Contributor Author

staebler commented Sep 1, 2017

/retest

	UserLimitType LimitType = "user"
	// SourceAndObjectLimitType is a type of limit where there is one bucket used
	// by each combination of source and involved object of the event.
	SourceAndObjectLimitType LimitType = "sourceAndObject"
Contributor

Looks like you missed Clayton's "Start with a capital letter".

Contributor Author

Yep, I registered the latter part of that comment, but not the former.

The EventRateLimit plug-in limits the number of events that the API Server
will accept in a given time period. It allows for server-wide, per-namespace,
per-user, and per-source+object rate limiting.
@staebler staebler force-pushed the server-event-rate-limiter branch from fb02ad2 to a4542ae Compare September 1, 2017 17:38
@deads2k
Contributor

deads2k commented Sep 1, 2017

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 1, 2017
@deads2k deads2k added this to the v1.8 milestone Sep 1, 2017
const (
	// StatusTooManyRequests means the server experienced too many requests within a
	// given window and that the client must wait to perform the action again.
	StatusTooManyRequests = 429
Member

why not http.StatusTooManyRequests?
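
That is, reuse the net/http constant rather than redefining 429 (sketch):

// StatusTooManyRequests means the server experienced too many requests within a
// given window and that the client must wait to perform the action again.
StatusTooManyRequests = http.StatusTooManyRequests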

// NewTooManyRequestsError returns an error indicating that the request was rejected because
// the server has received too many requests. Client should wait and retry. But if the request
// is perishable, then the client should not retry the request.
func NewTooManyRequestsError(message string) *StatusError {
Member

NewTooManyRequests already exists, right?

@liggitt
Member

liggitt commented Sep 1, 2017

fix the name of the admission plugin in the release note to match what they would use with --admission-control and note that it is alpha

@deads2k
Contributor

deads2k commented Sep 1, 2017

NewTooManyRequests already exists, right?

Ah, good old Go, don't specify "error" because it's in an errors package.

OK as a follow-up to remove the duplicate. Tagging since it's minor and today is freeze.

/lgtm

@deads2k
Contributor

deads2k commented Sep 1, 2017

plugin/OWNERS

Net new admission plugin with approved proposal. Manually approving the new package.

@deads2k deads2k added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 1, 2017
@eparis
Contributor

eparis commented Sep 1, 2017

/retest

@smarterclayton
Contributor

/approve

Already approved by other approvers (new bazel dir)

@smarterclayton
Contributor

/approve no-issue

@k8s-github-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k, smarterclayton, staebler

Associated issue requirement bypassed by: smarterclayton

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot

Automatic merge from submit-queue (batch tested with PRs 51805, 51725, 50925, 51474, 51638)

@k8s-github-robot k8s-github-robot merged commit 4d42f80 into kubernetes:master Sep 3, 2017