Scale kube-proxy conntrack limits by cores (new default behavior) #28876

Conversation

@thockin (Member) commented on Jul 13, 2016:

For large machines we want more conntrack entries than smaller machines.

Commence bike-shedding on names and values and semantics. I'd love to get this in as a patch to 1.3.x since we have had some trouble for users on large machines.

Fixes #28867

@thockin thockin added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. cherrypick-candidate labels Jul 13, 2016
@thockin thockin added this to the v1.3 milestone Jul 13, 2016
@k8s-github-robot k8s-github-robot added kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jul 13, 2016
@justinsb (Member) commented:
Looks good. I think the default for conntrack-max is not 0, so users will have to explicitly set it to zero to get per-core scaling?

I think the behaviour when both flags are set is a little surprising. What I would have guessed is that the max was always an upper bound if non-zero, but if per-core is non-zero this becomes the default value. Something like:

n = 0
if connsPerCore > 0 {
  n = connsPerCore * cores
  if connsMax > 0 {
   n = min(n, connsMax)
  } 
} else {
  n = connsMax
}
if n != 0 {
  apply(n)
}

Or is the intention that we will change the default for conntrack-max to zero and deprecate it entirely?

(If more bike-shedding is mandatory, why scale based on cores vs on total RAM?)

@thockin (Member, Author) commented on Jul 13, 2016:

On Tue, Jul 12, 2016 at 10:57 PM, Justin Santa Barbara notifications@github.com wrote:

Looks good. I think the default for conntrack-max is not 0, so users will have to explicitly set it to zero to get per-core scaling?

The default was set in code, which this PR changes.

I think the behaviour when both flags are set is a little surprising. What I would have guessed is that the max was always an upper bound if non-zero, but if per-core is non-zero this becomes the default value. Something like:

I wanted the behavior for anyone who sets the older flag to be unchanged. I could be talked out of that, but it seemed least likely to cause an explosion.

n = 0
if connsPerCore > 0 {
  n = connsPerCore * cores
  if connsMax > 0 {
    n = min(n, connsMax)
  }
} else {
  n = connsMax
}
if n != 0 {
  apply(n)
}

Or is the intention that we will change the default for conntrack-max to zero and deprecate it entirely?

(If more bike-shedding is mandatory, why scale based on cores vs on total RAM?)

I considered a per-core and per-GB scalar, but decided that we scale other things like max-pods by cores, and that USUALLY CPUs and memory scale together pretty linearly (especially in clouds). I could buy an argument for two params if someone could explain a concrete case for it.
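
For illustration, here is a minimal Go sketch of the precedence described above, where the legacy --conntrack-max flag wins unchanged if set, and the per-core value otherwise scales with the number of CPUs. The function and variable names are hypothetical and this is not the actual kube-proxy code.

package main

import (
	"fmt"
	"runtime"
)

// resolveConntrackMax sketches the precedence discussed in this thread:
// the legacy absolute cap wins if non-zero, otherwise scale by cores.
func resolveConntrackMax(connsMax, connsPerCore int32) int32 {
	if connsMax != 0 {
		// Preserve the old behavior for anyone still setting --conntrack-max.
		return connsMax
	}
	if connsPerCore != 0 {
		return connsPerCore * int32(runtime.NumCPU())
	}
	return 0 // 0 means "leave the kernel setting as-is".
}

func main() {
	fmt.Println(resolveConntrackMax(0, 32*1024)) // e.g. 131072 on a 4-core machine
}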

@justinsb (Member) commented:

Ah sorry, I totally missed the nuance of the default change.

Was definitely not suggesting scaling on both CPU & Memory - cores are great, it was really just curiosity :-)

@thockin thockin force-pushed the kube-proxy-scale-conntrack-by-cores branch from 6d0a499 to 6771346 on July 13, 2016 06:27
@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 13, 2016
fs.Int32Var(&s.ConntrackMax, "conntrack-max", s.ConntrackMax,
"Maximum number of NAT connections to track (0 to leave as-is).")
fs.Int32Var(&s.ConntrackMaxPerCore, "conntrack-max-per-core", s.ConntrackMaxPerCore,
"Maximum number of NAT connections to track per CPU core (0 to leave as-is). This is only considered if conntrack-max is 0.")
A reviewer (Member) commented on this diff:
Would it be useful to add the default value here? (32 * 1024)

@thockin (Member, Author) replied:

The default value gets set in code elsewhere. I'm not really happy with the state of it all, but I didn't want to buck the trend for this PR. When you run kube-proxy -? it shows the 32k default.
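
As a rough illustration of the defaults being discussed, a sketch only: the real defaults are applied elsewhere in the kube-proxy options code, and the struct and constructor names below are hypothetical.

package main

import "fmt"

// conntrackOptions is an illustrative stand-in for the kube-proxy options
// struct; the field names match the flags above, everything else is made up.
type conntrackOptions struct {
	ConntrackMax        int32 // --conntrack-max, now defaulting to 0 ("not set")
	ConntrackMaxPerCore int32 // --conntrack-max-per-core
}

// newConntrackOptions sketches the new defaults described in this thread:
// no absolute cap, 32k conntrack entries per core.
func newConntrackOptions() conntrackOptions {
	return conntrackOptions{
		ConntrackMax:        0,
		ConntrackMaxPerCore: 32 * 1024,
	}
}

func main() {
	fmt.Printf("%+v\n", newConntrackOptions())
}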

The reviewer (Member) replied:
Ack. thanks!

@thockin thockin force-pushed the kube-proxy-scale-conntrack-by-cores branch from 6771346 to e5e98aa on July 13, 2016 18:48
@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 13, 2016
fs.Int32Var(&s.ConntrackMax, "conntrack-max", s.ConntrackMax, "Maximum number of NAT connections to track (0 to leave as-is)")
fs.Int32Var(&s.ConntrackMax, "conntrack-max", s.ConntrackMax,
"Maximum number of NAT connections to track (0 to leave as-is).")
fs.Int32Var(&s.ConntrackMaxPerCore, "conntrack-max-per-core", s.ConntrackMaxPerCore,
A reviewer (Contributor) commented on this diff:

@thockin Is there an easy way to let the user know that setting both flags is incorrect?

@thockin (Member, Author) replied:

Fair enough. Done.
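
A hedged sketch of the kind of check being asked for here, rejecting the case where both flags are set to non-zero values; the function name and exact error wording are illustrative, not necessarily what this PR added.

package main

import "fmt"

// validateConntrackFlags rejects conflicting settings: an absolute cap and a
// per-core limit cannot both be requested at once.
func validateConntrackFlags(connsMax, connsPerCore int32) error {
	if connsMax != 0 && connsPerCore != 0 {
		return fmt.Errorf("--conntrack-max and --conntrack-max-per-core are mutually exclusive")
	}
	return nil
}

func main() {
	if err := validateConntrackFlags(256*1024, 32*1024); err != nil {
		fmt.Println("error:", err)
	}
}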

@thockin thockin force-pushed the kube-proxy-scale-conntrack-by-cores branch from e5e98aa to 266d82f on July 15, 2016 20:13
@k8s-github-robot k8s-github-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jul 15, 2016
@matchstick matchstick added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 15, 2016
@thockin thockin force-pushed the kube-proxy-scale-conntrack-by-cores branch from 266d82f to 833f05c on July 15, 2016 22:07
@k8s-github-robot k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 15, 2016
@thockin thockin added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 15, 2016
@thockin thockin force-pushed the kube-proxy-scale-conntrack-by-cores branch from 833f05c to 85a0106 on July 15, 2016 23:34
@thockin thockin added lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Jul 15, 2016
For large machines we want more conntrack entries than smaller machines.
@thockin thockin force-pushed the kube-proxy-scale-conntrack-by-cores branch from 85a0106 to 1f37281 on July 15, 2016 23:36
@k8s-github-robot k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 15, 2016
@thockin thockin added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 15, 2016
@k8s-bot commented on Jul 16, 2016:

GCE e2e build/test passed for commit 85a0106d6086e9d7f333bd103d8b48d0a36a3d8d.

@k8s-github-robot commented:
@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-bot commented on Jul 16, 2016:

GCE e2e build/test passed for commit 1f37281.

@k8s-github-robot commented:
Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit 6f98b69 into kubernetes:master Jul 16, 2016
@roberthbailey roberthbailey added the cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. label Jul 18, 2016
k8s-github-robot pushed a commit that referenced this pull request Jul 19, 2016
…6-upstream-release-1.3

Automatic merge from submit-queue

Automated cherry pick of #28876 upstream release 1.3

Cherry pick of #28876 onto release-1.3.

Scale kube-proxy conntrack limits by cores

For large machines we want more conntrack entries than smaller machines.
@thockin thockin deleted the kube-proxy-scale-conntrack-by-cores branch November 2, 2016 06:35
shyamjvs pushed a commit to shyamjvs/kubernetes that referenced this pull request Dec 1, 2016
…k-of-#28876-upstream-release-1.3

Automatic merge from submit-queue

Automated cherry pick of kubernetes#28876 upstream release 1.3

Cherry pick of kubernetes#28876 onto release-1.3.

Scale kube-proxy conntrack limits by cores

For large machines we want more conntrack entries than smaller machines.