only clean up iptables chains periodically in large clusters #110334
Conversation
@danwinship: This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Just to try to understand this series of PRs: what is the scale problem we are addressing?
Speed.
Force-pushed from a271970 to 9b11986
pkg/proxy/iptables/proxier.go (Outdated)
@@ -404,6 +405,8 @@ var iptablesCleanupOnlyChains = []iptablesJumpChain{
 	{utiliptables.TableFilter, kubeServicesChain, utiliptables.ChainInput, "kubernetes service portals", []string{"-m", "conntrack", "--ctstate", "NEW"}},
 }
+
+var iptablesCleanupPeriod = time.Minute
what if we only do the cleanup during the syncPeriod?

iptables:
  masqueradeAll: false
  masqueradeBit: 14
  minSyncPeriod: 1s
  syncPeriod: 30s

this way it can be parameterized
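For illustration, a minimal sketch of what tying the chain cleanup to the configured syncPeriod could look like; this is not the actual kube-proxy code, and the proxier struct and field names below are assumptions made for the example:

```go
package main

import (
	"fmt"
	"time"
)

// proxier is a stand-in for kube-proxy's iptables Proxier; only the fields
// needed for this sketch are shown, and their names are illustrative.
type proxier struct {
	syncPeriod       time.Duration // would come from the iptables.syncPeriod config value
	lastChainCleanup time.Time
}

// shouldCleanupChains reports whether a full syncPeriod has elapsed since the
// last stale-chain cleanup, making the cleanup interval parameterizable via
// the existing syncPeriod setting instead of a hard-coded constant.
func (p *proxier) shouldCleanupChains(now time.Time) bool {
	if now.Sub(p.lastChainCleanup) < p.syncPeriod {
		return false
	}
	p.lastChainCleanup = now
	return true
}

func main() {
	p := &proxier{syncPeriod: 30 * time.Second}
	fmt.Println(p.shouldCleanupChains(time.Now())) // true: nothing has been cleaned up yet
	fmt.Println(p.shouldCleanupChains(time.Now())) // false: still within the 30s window
}
```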
makes sense
+1

/assign @thockin
Turn this into a generic "large cluster mode" that determines whether we optimize for performance or debuggability.
"iptables-save" takes several seconds to run on machines with lots of iptables rules, and we only use its result to figure out which chains are no longer referenced by any rules. While it makes things less confusing if we delete unused chains immediately, it's not actually _necessary_ since they never get called during packet processing. So in large clusters, make it so we only clean up chains periodically rather than on every sync.
Force-pushed from 9b11986 to 7d3ba83
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danwinship

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
OK, rebased. I pulled in c12da17 from #110268, which serves the purpose here of confirming that chain deletion behavior is unchanged in "small" clusters. Then I renamed …
/lgtm

Just add a mention to the release note that this behavior is controlled by the …
/test pull-kubernetes-unit
What type of PR is this?
/kind cleanup
/kind feature
What this PR does / why we need it:
Followup to #110328 (which this branch includes). This makes it so that in "large" clusters, we only clean up stale service chains once a minute, rather than doing it on every sync. In very large clusters, this cuts several seconds off each syncProxyRules run.

Since the chains being removed by this code are not referenced by any other rules, it doesn't matter if we let them sit around for a while before we get around to deleting them; they're not going to have any effect on packet processing either way. However, since it could be confusing if an admin does iptables-save and finds rules that are no longer in use (e.g., a KUBE-SEP- chain for a pod that exited), I made it so that it only switches from "synchronous deletion" to "periodic deletion" in "large" clusters, using the same definition of "large" as we use for deciding when to start eliminating comments from the iptables rules. (A sketch of this large-cluster gating appears after this description.)

Which issue(s) this PR fixes:
none
Does this PR introduce a user-facing change?
/sig network
/priority important-longterm
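As referenced in the description above, here is a minimal sketch of the "large cluster" switch itself; the threshold value and identifier names are assumptions for illustration, not the actual kube-proxy constants:

```go
package main

import "fmt"

// largeClusterEndpointsThreshold is an assumed cutoff; the real kube-proxy
// constant and value may differ. Once the number of endpoints crosses it,
// the proxier trades debuggability (comments in rules, immediate deletion of
// stale chains) for performance (no comments, periodic deletion).
const largeClusterEndpointsThreshold = 1000

// largeClusterMode reports whether the proxier should use the
// performance-oriented behaviors described in this PR.
func largeClusterMode(numEndpoints int) bool {
	return numEndpoints > largeClusterEndpointsThreshold
}

func main() {
	fmt.Println(largeClusterMode(50))   // false: small cluster, synchronous chain deletion
	fmt.Println(largeClusterMode(5000)) // true: large cluster, periodic chain deletion
}
```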