Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow running kube-proxy as a DaemonSet when using kube-up.sh on GCE #50705

Merged
merged 3 commits into from
Aug 29, 2017

Conversation

MrHohn
Copy link
Member

@MrHohn MrHohn commented Aug 15, 2017

What this PR does / why we need it:
From #23225, this PR adds an option for user to run kube-proxy as a DaemonSet instead of static pods using GCE startup scripts. By default, kube-proxy will run as static pods.

This is the first step for moving kube-proxy into a DaemonSet in GCE, remaining tasks will be tracked on #23225.

Special notes for your reviewer:
The last commit are purely for testing out kube-proxy as daemonset via CIs.

cc @kubernetes/sig-network-misc @kubernetes/sig-cluster-lifecycle-misc

Release note (Edited):

[Experiment Only] When using kube-up.sh on GCE, user could set env `KUBE_PROXY_DAEMONSET=true` to run kube-proxy as a DaemonSet. kube-proxy is run as static pods by default.

@k8s-ci-robot k8s-ci-robot added sig/network Categorizes an issue or PR as relevant to SIG Network. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Aug 15, 2017
@k8s-github-robot k8s-github-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. release-note Denotes a PR that will be considered when it comes time to generate release notes. labels Aug 15, 2017
@MrHohn
Copy link
Member Author

MrHohn commented Aug 15, 2017

Seems like I missed the debian bits, will reassign when ready.

/assign
/unassign @gmarek @zmerlynn

@k8s-ci-robot k8s-ci-robot assigned MrHohn and unassigned gmarek and zmerlynn Aug 15, 2017
@MrHohn
Copy link
Member Author

MrHohn commented Aug 16, 2017

CIs passed, reassign...

/assign @gmarek @zmerlynn
/unassign

@@ -25,6 +25,8 @@ set -o errexit
set -o nounset
set -o pipefail

LOCAL_NODE_LABELS="${NODE_LABELS:-}"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting kube-proxy-ds-ready label only on nodes so that kube-proxy won't be scheduled on master. It is due to the fact that daemonset controller does not respect node.spec.unschedulable field and we don't set taint on master by default.

I'm a bit concerned about defining this LOCAL_NODE_LABELS env as it differs from NODE_LABELS set in kube-env and may introduce inconsistency and confusion.

Another possible option is set NoSchedule taint on master (as defined by #39112). From #43537 it seems that requires a comprehensive plan but don't have a clear timeline and owner yet. So may not want to block this on that.

cc @mikedanese @thockin

# Please keep kube-proxy configuration in-sync with:
# cluster/saltbase/salt/kube-proxy/kube-proxy.manifest

apiVersion: extensions/v1beta1
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just realized we have removed support for debian masters in GCE: #41666. So I removed some unnecessary salt configuration from this addon file. cc @bowei

@MrHohn
Copy link
Member Author

MrHohn commented Aug 23, 2017

Forgot to cc sig-node as well. @kubernetes/sig-node-pr-reviews

@k8s-ci-robot k8s-ci-robot added the sig/node Categorizes an issue or PR as relevant to SIG Node. label Aug 23, 2017
@yujuhong
Copy link
Contributor

kube-proxy as a DaemonSet would only be an alpha feature in 1.8. The rollback mechanism suggested by @thockin #50705 (comment) seems sufficient to me.

I agree with @dchen1107 that we should probably wait for 1.9 before making this change. This change works best when pod priority and scheduler preemption work reliably. They are both alpha features in 1.8. So, I would wait for 1.9 when they get some mileage.

Would having this as an alpha feature help test and stabilize pod priority and scheduler preemption?

@dchen1107
Copy link
Member

dchen1107 commented Aug 25, 2017

kube-proxy as a DaemonSet would only be an alpha feature in 1.8. The rollback mechanism suggested by @thockin #50705 (comment) seems sufficient to me.

First, no one told me that is an alpha feature. According to @MrHohn, in his original proposal, this should be alpha feature, but decided not later. Using the new logic introduced by pod priority feature is gated by an alpha. Also even we plan to make this alpha here, it will be very error-prone. Asking @bowei @gmarek how painful & messy to introduce an alpha feature gate through kube-up scripts.

To achieve @thockin's suggest, @MrHohn has to add extra complexity to his logic on already-messy-kube-up. He has to introduce two vars:

  • one for master: want to run kube-proxy as the daemonset, so that addon manager will create those kube-proxy daemonest spec
  • one for node to remove that node label, instead of populating the static pod manifest.

If sig-scalability and sig-scheduling have the confidence on the rescheduler (#50705 (comment)), and we have low chance to run into case 3), yes, let's take the risk to do this. I want to get rid of those static pods from worker nodes long time ago.

But if no such confidence, why in a rush? We don't have vertical scaling yet; we don't have GKE deployment & addon manager can integrate that feature yet. No gain in a short term, instead of risk.

@bsalamat
Copy link
Member

kube-proxy as a DaemonSet would only be an alpha feature in 1.8. The rollback mechanism suggested by @thockin #50705 (comment) seems sufficient to me.
Would having this as an alpha feature help test and stabilize pod priority and scheduler preemption?

Sure, this sounds good to me from the scheduling point of view. You should make sure that sig-network is fine with this.

@yujuhong
Copy link
Contributor

yujuhong commented Aug 25, 2017

First, no one told me that is an alpha feature. According to @MrHohn, in his original proposal, this should be alpha feature, but decided not later. Using the new logic introduced by pod priority feature is gated by an alpha. Also even we plan to make this alpha here, it will be very error-prone. Asking @bowei @gmarek how painful & messy to introduce an alpha feature gate through kube-up scripts.

I believe this was intended to be an Alpha feature, but I could be wrong (of course).

I don't doubt that it's a pain to change the kube-up scripts, but isn't this necessary unless we are so confident that we can ship this as a beta/GA feature directly? If that's the preferred route (i.e., direct to beta), I agree that it's premature to introduce this feature in 1.8.

To achieve @thockin's suggest, @MrHohn has to add extra complexity to his logic on already-messy-kube-up. He has to introduce two vars:

one for master: want to run kube-proxy as the daemonset, so that addon manager will create those kube-proxy daemonest spec
one for node to remove that node label, instead of populating the static pod manifest.

Even if this goes directly to Beta, we still need to have a rollback mechanism. I don't see how we can avoid that....

Regardless whether this makes 1.8 or not, I think it might be good to add the code (as a pre-alpha experimental feature?) and start a test cluster to run kube-proxy as a DaemonSet with pod priority and scheduler preemption.

@mikedanese
Copy link
Member

If this is going directly to beta, we shouldn't include a configuration option to turn it off.

fi
if [[ -n "${node_labels:-}" ]]; then
flags+=" --node-labels=${node_labels}"
fi
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put this in start-kubelet

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, fixed now.

@dchen1107
Copy link
Member

Ok, quickly chat with @MrHohn and @bowei, so that we can move forward. We agreed upon that there is no way to make this an alpha feature in OSS k8s. But we could only enable this feature on GKE's alpha clusters if we want to.

This particular change along with other pending changes including the pod priority and preemption, the vertical scaling, etc. can give us a much stable and easy managing kube-proxy and cluster in a long run, but not in 1.8 for sure. The current state of this change doesn't bring any value to the users directly at this moment, except to introduce more complexity to GKE alpha clusters. At the end, all of us agreed that there is no value to enable this for GKE alpha clusters in 1.8 since there is no opt-in option for the user to choose which alpha features to run.

But enabling the test for the future movement is a good idea. We decided to disable KUBE_PROXY_DAEMONSET for test by default, instead, creating a separate e2e job to enable this for testing.

@MrHohn MrHohn force-pushed the kube-proxy-ds branch 3 times, most recently from 277bdf2 to f4b5cdc Compare August 28, 2017 18:17
@MrHohn
Copy link
Member Author

MrHohn commented Aug 28, 2017

Test passed. PR ready for another round.

@mikedanese
Copy link
Member

LGTM, please squash fixup commits.

@MrHohn
Copy link
Member Author

MrHohn commented Aug 28, 2017

@mikedanese Thanks, squashed commits and made KUBE_PROXY_DAEMONSET default to false.

@mikedanese
Copy link
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 28, 2017
@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: MrHohn, mikedanese

Associated issue: 23225

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 28, 2017
@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to @fejta).

Review the full test history for this PR.

1 similar comment
@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to @fejta).

Review the full test history for this PR.

@k8s-github-robot
Copy link

Automatic merge from submit-queue (batch tested with PRs 50932, 49610, 51312, 51415, 50705)

@k8s-github-robot k8s-github-robot merged commit 04b3ab9 into kubernetes:master Aug 29, 2017
hh pushed a commit to ii/kubernetes that referenced this pull request Aug 30, 2017
…-tests

Automatic merge from submit-queue (batch tested with PRs 51228, 50185, 50940, 51544, 51543)

Add upgrades tests for kube-proxy daemonset migration path

**What this PR does / why we need it**:
From kubernetes#23225, this is a part of setting up CIs to validate the kube-proxy migration path (static pods -> daemonset and reverse).
The other part of the works (adding real CIs that run these tests) will be in a separate PR against [kubernetes/test-infra](https://github.com/kubernetes/test-infra).

Though this is currently blocked by kubernetes#50705.

**Special notes for your reviewer**:
cc @roberthbailey  @pwittrock 

**Release note**:

```release-note
NONE
```
@MrHohn MrHohn deleted the kube-proxy-ds branch August 30, 2017 23:13
k8s-github-robot pushed a commit that referenced this pull request Sep 3, 2017
Automatic merge from submit-queue (batch tested with PRs 51553, 51538, 51663, 51069, 51737)

Allow enable pod priority feature gate for GCE and configure priority for kube-proxy

**What this PR does / why we need it**:
From #23225, this PR adds an option for user to enable pod priority feature gate using GCE startup scripts, and configure pod priority for kube-proxy when enabled.

The setup `priorityClassName: system` derives from: https://github.com/kubernetes/kubernetes/blob/ce1485c626d8cddc9dbffe24f91f7d62480148cd/staging/src/k8s.io/api/core/v1/types.go#L2536-L2542

The plan is to configure pod priority for kube-proxy daemonset (#50705) in the same way.

**Special notes for your reviewer**:
cc @bsalamat @davidopp @thockin 

**Release note**:

```release-note
When using kube-up.sh on GCE, user could set env `ENABLE_POD_PRIORITY=true` to enable pod priority feature gate.
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/network Categorizes an issue or PR as relevant to SIG Network. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.