-
Notifications
You must be signed in to change notification settings - Fork 40.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kube-proxy: ratelimit runs of iptables by sync-period flags #46266
kube-proxy: ratelimit runs of iptables by sync-period flags #46266
Conversation
@thockin - it seems it already needs rebase :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly minor stuff (except from making this work by default in GCI too).
// | ||
// The function will be run no more often than burstRuns per minInterval, For | ||
// example, this can mitigate the impact of expensive operations being called in | ||
// response to user-initiated operations. Burts runs are "accumulated" over |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/Burts/Bursts/
if minInterval == 0 { | ||
bfr.limiter = nullLimiter{} | ||
} else { | ||
// minInterval is a duration, typically in seconds but could be fractional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand this comment. I mean I'm not sure how it relates to the line below (which I agree is correct).
bfr.lastRun = bfr.timer.Now() | ||
bfr.fn() | ||
bfr.timer.Stop() | ||
bfr.timer.Reset(bfr.maxInterval) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that it would be good to comment about the semantic of it - I think it isn't obvious:
- "lastRun" is the time when the last function call started
- "timer" starts ticking when the function actually ended.
So in the situation when noone is calling Run() (or calling it super rarely that it doesn't really matter), but the function call takes significant time comparing to "maxInterval", then we are not really calling it every "maxInterval".
To clarify - if calling a function takes 5s, and maxInterval is also 5s, we will be calling the function every 10s.
I guess, I would do the same (I mean I wouldn't change the logic here), but I think it needs clarification in the comment for NewBoundedFrequencyRunner
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also note, that this semantic is different for user-initiated call, where then "minInterval" is between two consecutive starts of the function call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's an excellent point - this is busted. It really should be consistent. My hunch is that it should take the time after the run, so there's always some rest period after a run. This means the named "bounded frequency" is bad. Should be "ThrottledRunner" or something? That doesn't quite convey that is also has a periodic run ...
thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A possible rename is the only thing left, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now when I think about it, i think that the semantic that you are suggesting is the good one (I initially though that it should be measure between starts, but that has a bunch of drawbacks).
Regarding naming - I don't have opinion. I was always extremely bad at naming :)
# test_args should always go last to overwrite prior configuration | ||
{% set params = log_level + " " + feature_gates + " " + test_args -%} | ||
{% set params = log_level + " " + throttles + " " + feature_gates + " " + test_args -%} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will not work in GCI. You also need to add appropriate if in:
https://github.com/kubernetes/kubernetes/blob/master/cluster/gce/gci/configure-helper.sh#L822
and
https://github.com/kubernetes/kubernetes/blob/master/cluster/gce/container-linux/configure-helper.sh#L615
pkg/proxy/iptables/proxier.go
Outdated
} | ||
|
||
func (proxier *Proxier) OnServiceAdd(service *api.Service) { | ||
namespacedName := types.NamespacedName{Namespace: service.Namespace, Name: service.Name} | ||
if proxier.serviceChanges.update(&namespacedName, nil, service) { | ||
// TODO(wojtek-t): If the initial sync of informer either for endpoints or | ||
// services is not finished, it doesn't make sense to call syncProxyRules | ||
// services is not finished, it doesn't make sense to sync right away |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess those changes are the one that conficts with my PR that has just been merged.
@@ -29,8 +29,10 @@ | |||
{% set feature_gates = "--feature-gates=" + grains.feature_gates -%} | |||
{% endif -%} | |||
|
|||
{% set throttles = "--iptables-sync-period=1m --iptables-min-sync-period=10s" -%} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we start with smaller min-sync-period now? E.g. 5s?
After thinking about it, i think this change may surprise people with small cluster where everything worked really fast.
If we decide to increase it later, it would be super trivial change to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general the bursts should allow changes to happen very quickly most of the time, and throttling kicks in only in larger clusters. At least, that's what I expect...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The scenario I was afraid about is that you are setting bursts=2.
So if we are creating a replication controller with 3 replicas, if endpoint controller will be fast enough that it will be adding pods to Endpoints object one by one, the first two will be added quickly, but for the last we will need to wait 10s more.
I'm not sure how big problem it is though, so maybe it fine to leave now. We can tune later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can make this even more complex and add a short holdoff (e.g. 1sec). I'd rather do that as a followup, if needed. Or simpler, make bursts higher. Here's my thinking - if the iptables update is fast, a short burst isn't a big deal. If the update is slow, there will be a lot of time between runs anyway to accumulate new pods. So maybe setting burst to 4 or is better - we could do it as a follow up to make that a flag...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that bumping bursts sounds like a good option. I'm obviously fine with doing it in a followup - we will see what will be the impact of this change.
What's the current state of IPVS load-balancing, and doesn't that obviate this to some extent? |
@timothysc AFAIK IPVS is not done, much less ready to be the default. |
3a9c269
to
dfd6cab
Compare
65c3806
to
5b3c9bd
Compare
@k8s-bot pull-kubernetes-unit test this |
5b3c9bd
to
3da04e1
Compare
@caesarxuchao I have no idea what staging godeps wants. Help? |
Just two minor comments (not very actionable though). @thockin - feel free to self-lgtm after fixing godeps issue |
Other staging repos also imports juju,
Manually updating them should make the verify script happy. I'll improve the script after the 1.7 rush. |
// completion to next start), except that up to bursts may be allowed. Burst | ||
// runs are "accumulated" over time, one per minInterval up to burstRuns total. | ||
// This can be used, for example, to mitigate the impact of expensive operations | ||
// being called in response to user-initiated operations. Run requests s that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/requests\ss/request
This picked up an unrelated but missing change.
This lib manages runs of a function to have min and max frequencies.
3da04e1
to
2856fde
Compare
manually fixed staging godeps. @caesarxuchao that script is my least favorite thing in the whole project right now. It also decided it needed to delete staging/src/k8s.io/client-go/LICENSE - no idea why. and I had to make a clean GOPATH or it just failed in inscrutable ways. |
@thockin: The following test(s) failed:
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
self-lgtm |
/lgtm @thockin - thanks a lot! |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: thockin, wojtek-t
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
Automatic merge from submit-queue (batch tested with PRs 44774, 46266, 46248, 46403, 46430) |
This bounds how frequently iptables can be synced. It will be no more often than every 10 seconds and no less often than every 1 minute, by default.
@timothysc FYI
@dcbw @freehan FYI