Scale kube-proxy conntrack limits by cores (new default behavior) #28876

Conversation

@thockin (Member) commented on Jul 13, 2016:

For large machines we want more conntrack entries than smaller machines.

Commence bike-shedding on names and values and semantics. I'd love to get this in as a patch to 1.3.x since we have had some trouble for users on large machines.

Fixes #28867

@thockin thockin added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. cherrypick-candidate labels Jul 13, 2016
@thockin thockin added this to the v1.3 milestone Jul 13, 2016
@k8s-github-robot k8s-github-robot added kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jul 13, 2016
@justinsb (Member) commented:
Looks good. I think the default for conntrack-max is not 0, so users will have to explicitly set it to zero to get per-core scaling?

I think the behaviour when both flags are set is a little surprising. What I would have guessed is that the max was always an upper bound if non-zero, but if per-core is non-zero this becomes the default value. Something like:

n = 0
if connsPerCore > 0 {
  n = connsPerCore * cores
  if connsMax > 0 {
   n = min(n, connsMax)
  } 
} else {
  n = connsMax
}
if n != 0 {
  apply(n)
}

Or is the intention that we will change the default for conntrack-max to zero and deprecate it entirely?

(If more bike-shedding is mandatory, why scale based on cores vs on total RAM?)

@thockin (Member, Author) commented on Jul 13, 2016:

On Tue, Jul 12, 2016 at 10:57 PM, Justin Santa Barbara notifications@github.com wrote:

Looks good. I think the default for conntrack-max is not 0, so users will have to explicitly set it to zero to get per-core scaling?

The default was set in code, which this PR changes.

I think the behaviour when both flags are set is a little surprising. What I would have guessed is that the max was always an upper bound if non-zero, but if per-core is non-zero this becomes the default value. Something like:

I wanted the behavior for anyone who sets the older flag to be unchanged. I could be talked out of that, but it seemed least likely to cause an explosion.

n = 0
if connsPerCore > 0 {
  n = connsPerCore * cores
  if connsMax > 0 {
    n = min(n, connsMax)
  }
} else {
  n = connsMax
}
if n != 0 {
  apply(n)
}

Or is the intention that we will change the default for conntrack-max to zero and deprecate it entirely?

(If more bike-shedding is mandatory, why scale based on cores vs on total RAM?)

I considered a per-core and per-GB scalar, but decided that we scale other things like max-pods by cores, and that USUALLY CPUs and memory scale together pretty linearly (especially in clouds). I could buy an argument for two params if someone could explain a concrete case for it.
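
For illustration, here is a minimal Go sketch of the precedence described above, where the legacy --conntrack-max flag wins unchanged if set, and the per-core value otherwise scales with the number of CPUs. The function and variable names are hypothetical and this is not the actual kube-proxy code.

package main

import (
	"fmt"
	"runtime"
)

// resolveConntrackMax sketches the precedence discussed in this thread:
// the legacy absolute cap wins if non-zero, otherwise scale by cores.
func resolveConntrackMax(connsMax, connsPerCore int32) int32 {
	if connsMax != 0 {
		// Preserve the old behavior for anyone still setting --conntrack-max.
		return connsMax
	}
	if connsPerCore != 0 {
		return connsPerCore * int32(runtime.NumCPU())
	}
	return 0 // 0 means "leave the kernel setting as-is".
}

func main() {
	fmt.Println(resolveConntrackMax(0, 32*1024)) // e.g. 131072 on a 4-core machine
}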

@justinsb (Member) commented:

Ah sorry, I totally missed the nuance of the default change.

Was definitely not suggesting scaling on both CPU & Memory - cores are great, it was really just curiosity :-)

@thockin thockin force-pushed the kube-proxy-scale-conntrack-by-cores branch from 6d0a499 to 6771346 on July 13, 2016 06:27
@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 13, 2016
fs.Int32Var(&s.ConntrackMax, "conntrack-max", s.ConntrackMax,
"Maximum number of NAT connections to track (0 to leave as-is).")
fs.Int32Var(&s.ConntrackMaxPerCore, "conntrack-max-per-core", s.ConntrackMaxPerCore,
"Maximum number of NAT connections to track per CPU core (0 to leave as-is). This is only considered if conntrack-max is 0.")
A reviewer (Member) commented on this diff:
Would it be useful to add the default value here? (32 * 1024)

@thockin (Member, Author) replied:

The default value gets set in code elsewhere. I'm not really happy with the state of it all, but I didn't want to buck the trend for this PR. When you run kube-proxy -? it shows the 32k default.
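
As a rough illustration of the defaults being discussed, a sketch only: the real defaults are applied elsewhere in the kube-proxy options code, and the struct and constructor names below are hypothetical.

package main

import "fmt"

// conntrackOptions is an illustrative stand-in for the kube-proxy options
// struct; the field names match the flags above, everything else is made up.
type conntrackOptions struct {
	ConntrackMax        int32 // --conntrack-max, now defaulting to 0 ("not set")
	ConntrackMaxPerCore int32 // --conntrack-max-per-core
}

// newConntrackOptions sketches the new defaults described in this thread:
// no absolute cap, 32k conntrack entries per core.
func newConntrackOptions() conntrackOptions {
	return conntrackOptions{
		ConntrackMax:        0,
		ConntrackMaxPerCore: 32 * 1024,
	}
}

func main() {
	fmt.Printf("%+v\n", newConntrackOptions())
}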

The reviewer (Member) replied:
Ack. thanks!

@thockin thockin force-pushed the kube-proxy-scale-conntrack-by-cores branch from 6771346 to e5e98aa on July 13, 2016 18:48
@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 13, 2016
fs.Int32Var(&s.ConntrackMax, "conntrack-max", s.ConntrackMax, "Maximum number of NAT connections to track (0 to leave as-is)")
fs.Int32Var(&s.ConntrackMax, "conntrack-max", s.ConntrackMax,
"Maximum number of NAT connections to track (0 to leave as-is).")
fs.Int32Var(&s.ConntrackMaxPerCore, "conntrack-max-per-core", s.ConntrackMaxPerCore,
A reviewer (Contributor) commented on this diff:

@thockin Is there an easy way to let the user know that setting both flags is incorrect?

@thockin (Member, Author) replied:

Fair enough. Done.
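
A hedged sketch of the kind of check being asked for here, rejecting the case where both flags are set to non-zero values; the function name and exact error wording are illustrative, not necessarily what this PR added.

package main

import "fmt"

// validateConntrackFlags rejects conflicting settings: an absolute cap and a
// per-core limit cannot both be requested at once.
func validateConntrackFlags(connsMax, connsPerCore int32) error {
	if connsMax != 0 && connsPerCore != 0 {
		return fmt.Errorf("--conntrack-max and --conntrack-max-per-core are mutually exclusive")
	}
	return nil
}

func main() {
	if err := validateConntrackFlags(256*1024, 32*1024); err != nil {
		fmt.Println("error:", err)
	}
}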

@thockin thockin force-pushed the kube-proxy-scale-conntrack-by-cores branch from e5e98aa to 266d82f on July 15, 2016 20:13
@k8s-github-robot k8s-github-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jul 15, 2016
@matchstick matchstick added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 15, 2016
@thockin thockin force-pushed the kube-proxy-scale-conntrack-by-cores branch from 266d82f to 833f05c on July 15, 2016 22:07
@k8s-github-robot k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 15, 2016
@thockin thockin added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 15, 2016
@thockin thockin force-pushed the kube-proxy-scale-conntrack-by-cores branch from 833f05c to 85a0106 on July 15, 2016 23:34
@thockin thockin added lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Jul 15, 2016
For large machines we want more conntrack entries than smaller machines.
@thockin thockin force-pushed the kube-proxy-scale-conntrack-by-cores branch from 85a0106 to 1f37281 on July 15, 2016 23:36
@k8s-github-robot k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 15, 2016
@thockin thockin added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 15, 2016
@k8s-bot commented on Jul 16, 2016:

GCE e2e build/test passed for commit 85a0106d6086e9d7f333bd103d8b48d0a36a3d8d.

@k8s-github-robot commented:
@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-bot commented on Jul 16, 2016:

GCE e2e build/test passed for commit 1f37281.

@k8s-github-robot commented:
Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit 6f98b69 into kubernetes:master Jul 16, 2016
@roberthbailey roberthbailey added the cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. label Jul 18, 2016
k8s-github-robot pushed a commit that referenced this pull request Jul 19, 2016
…6-upstream-release-1.3

Automatic merge from submit-queue

Automated cherry pick of #28876 upstream release 1.3

Cherry pick of #28876 onto release-1.3.

Scale kube-proxy conntrack limits by cores

For large machines we want more conntrack entries than smaller machines.
@thockin thockin deleted the kube-proxy-scale-conntrack-by-cores branch November 2, 2016 06:35
shyamjvs pushed a commit to shyamjvs/kubernetes that referenced this pull request Dec 1, 2016
…k-of-#28876-upstream-release-1.3

Automatic merge from submit-queue

Automated cherry pick of kubernetes#28876 upstream release 1.3

Cherry pick of kubernetes#28876 onto release-1.3.

Scale kube-proxy conntrack limits by cores

For large machines we want more conntrack entries than smaller machines.