Change kube-proxy & fluentd CPU request to 20m/80m. #23646
Conversation
Labelling this PR as size/XS
I think this is a good compromise for right now. Unfortunately kube-proxy will still get starved when all the addons land on the same node, because they all have higher limits, and that leads to a bad UX since services are a core part of our system; but this is CPU, so it should be OK. We will have to bump up these limits at some point, and I'm not sure yet how we'd do that. This might also result in test flakes, but we can handle that as it comes: we can easily bump to a higher timeout until a roll-forward solution presents itself.
GCE e2e build/test passed for commit 915b6c6513317078df500daa37d055646ebc9a97.
I'm going to switch it to 80/20 for fluentd/kube-proxy. In 1.1.8, kube-proxy had no reservation, so this will still be strictly better for kube-proxy, while not constraining fluentd too much.
Branch updated from 9040ce4 to 20a51cb.
@@ -17,7 +17,7 @@
     - makedirs: true
     - dir_mode: 755
     - context:
-        cpurequest: '200m'
+        cpurequest: '20m'
please add a comment with something like:
// 20m might cause cpu starvation for kube-proxy, resulting in delayed service updates. However,
// we can't give it more cpu till we have a way to do so without affecting existing Kubernetes clusters.
// Any change made here must be reflected in a proportional change to the requests of other daemons on
// the node, e.g. fluentd.
I meant # wherever I said //
How's that look?
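For reference, a sketch of how that note might read in the salt context (the keys shown are taken from the diff above; the exact file layout is assumed):

    - context:
        # 20m might cause cpu starvation for kube-proxy, resulting in delayed service updates.
        # However, we can't give it more cpu till we have a way to do so without affecting
        # existing Kubernetes clusters. Any change made here must be reflected in a
        # proportional change to the requests of other daemons on the node, e.g. fluentd.
        cpurequest: '20m'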
And documenting for the record, @a-robinson mentioned that fluentd in 1.2 is more efficient than in 1.1, so we suspect there won't be much, if any, regression in user-visible logging performance from downsizing it from 100m to 80m.
Labelling this PR as size/S
GCE e2e build/test passed for commit 20a51cb0e1eebb6bf4bbba8c63a2e37610f24464.
GCE e2e build/test passed for commit 6386938d3e079994b142aa0c3a06aafadfb2a84f.
@@ -14,7 +14,7 @@ spec:
      limits:
        memory: 200Mi
      requests:
-       cpu: 100m
+       cpu: 80m
please add the same note of caution here.
Done.
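For reference, a sketch of the matching note in the fluentd manifest (resource values from the diff above; the surrounding fields are assumed):

    resources:
      limits:
        memory: 200Mi
      requests:
        # Keep this in proportion with the kube-proxy request: dropping fluentd from 100m
        # to 80m frees the 20m now reserved for kube-proxy, so the per-node total stays at 100m.
        cpu: 80m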
GCE e2e build/test failed for commit 4c18769caca655eb74d8180c9d08d1b0f892c754. Please reference the list of currently known flakes when examining this failure. If you request a re-test, you must reference the issue describing the flake.
Your bug is #23659 in case you really need to push this out and can't re-up.
Although that doesn't sound like a kube-proxy flake...
GCE e2e build/test passed for commit 26a6c66.
Mergebot is blocked on the unrelated kubemark test suite, which has been broken for the past 5 hours. This PR passed tests, is low risk, and fixes a significant regression in the amount of CPU available to users on each node, so manually merging.
@gmarek - FYI
The cherry-pick process is defined here. Hoping to get this PR merged today; the details are here. You need to run the cherry-pick tool yourself to get these onto the branch. @bgrant0607 will then approve them for merge.
Every cherry-pick needs a release note.
This PR has no description and no reference to a P0 issue.
Fixes the "failure to schedule kube-dns" part of #23556.
Internal bugref: 27924350
…46-upstream-release-1.2 Automated cherry pick of #23646
Commit found in the "release-1.2" branch appears to be this PR. Removing the "cherrypick-candidate" label. If this is an error, find help to get your PR picked.
@cjcullen thanks for this PR, it helps a lot to get smaller clusters running smoothly. I've tested with 1 master and 2 node hosts on AWS m3.medium machines and it works great!
@vishh @dchen1107 how far are we from having a way to use docker's --cgroup-parent logic? This might make a cool intern project if we can spec it enough. Then we could have a handful of special cgroups (call one "sys", for things that need to run on each machine) that some pods can ask to be contained in. I doubt that is the right API, but we can work on that as we spec...
@thockin that's the plan. My intern is most likely working on cgroups.

BTW we already use cgroup parent, except that it is not configurable per
I don't think that will solve the problem of needing to increase the sys cgroup's size from time to time, though.
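For context, docker already supports a --cgroup-parent flag that nests a container's cgroup under a chosen parent; a rough sketch of the "sys" idea being discussed, assuming the cgroupfs driver (the cgroup name and share value are illustrative, and the pod-level API discussed above does not exist yet):

    # Carve out a "sys" cgroup with a fixed CPU weight, then start a container under it.
    sudo mkdir -p /sys/fs/cgroup/cpu/sys
    echo 102 | sudo tee /sys/fs/cgroup/cpu/sys/cpu.shares    # ~100m (Kubernetes maps 1 core to 1024 shares)
    docker run -d --cgroup-parent=/sys busybox sleep 3600    # container cgroup nests under .../cpu/sys/<id>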
…ck-of-#23646-upstream-release-1.2 Automated cherry pick of kubernetes#23646
On 1.1.8, our per-node CPU overhead was 100m (fluentd 100m, kube-proxy unaccounted). To stay at that number, we should adjust fluentd to 80m, and give kube-proxy 20m.
Fixes the "failure to schedule kube-dns" part of #23556.