Added unschedulable taint #61161

Merged on Mar 16, 2018 (1 commit)

Member

@k82cn k82cn commented Mar 14, 2018

Signed-off-by: Da K. Ma klaus1982.cn@gmail.com

What this PR does / why we need it:

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
part of #59194; fixes #61050

Release note:

When `TaintNodesByCondition` is enabled, the `node.kubernetes.io/unschedulable:NoSchedule`
taint is added to the node if `spec.Unschedulable` is true.

When `ScheduleDaemonSetPods` is enabled, the `node.kubernetes.io/unschedulable:NoSchedule`
toleration is added automatically to DaemonSet Pods, so the `unschedulable` field of
a node is not respected by the DaemonSet controller.

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Mar 14, 2018
@k82cn
Member Author

k82cn commented Mar 14, 2018

/cc @janetkuo @bsalamat

@@ -438,7 +438,20 @@ func (nc *Controller) doNoScheduleTaintingPass(node *v1.Node) error {
			}
		}
	}
	if node.Spec.Unschedulable {
		// If unscheduable, append related taint.
Member

what removes this taint when the node becomes schedulable again?

Member

definitely needs a test demonstrating this lifecycle works properly

Member

Good point. This might break kubectl cordon/uncordon too.

Member

@janetkuo +1, we need to make sure the taint is removed on kubectl uncordon

Member Author

The removal is handled here: https://github.com/kubernetes/kubernetes/pull/61161/files#diff-43035107687eb30550696751ac1066e4R458 . We get the target taints from the conditions and the existing taints on the node, then compute taintsToAdd and taintsToDel with TaintSetDiff. SwapNodeControllerTaint then updates the taints accordingly.
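To make the cordon/uncordon lifecycle described above concrete, here is a minimal sketch of the set-difference idea. It is not the controller's code: `Taint`, `desiredTaints`, and `taintSetDiff` are hypothetical stand-ins, only the `spec.Unschedulable` case is modeled, and the `existing` slice is assumed to hold only taints managed by the node controller.

```go
package main

import "fmt"

// Taint is a simplified, hypothetical stand-in for v1.Taint.
type Taint struct {
	Key    string
	Effect string
}

// unschedulableTaint mirrors the taint named in the release note.
var unschedulableTaint = Taint{Key: "node.kubernetes.io/unschedulable", Effect: "NoSchedule"}

// desiredTaints returns the taints that should be on the node.
// Assumption: condition-based taints are omitted for brevity.
func desiredTaints(unschedulable bool) []Taint {
	if unschedulable {
		return []Taint{unschedulableTaint}
	}
	return nil
}

// taintSetDiff returns the taints to add (desired but not existing)
// and the taints to delete (existing but no longer desired).
func taintSetDiff(desired, existing []Taint) (toAdd, toDel []Taint) {
	contains := func(set []Taint, t Taint) bool {
		for _, s := range set {
			if s == t {
				return true
			}
		}
		return false
	}
	for _, t := range desired {
		if !contains(existing, t) {
			toAdd = append(toAdd, t)
		}
	}
	for _, t := range existing {
		if !contains(desired, t) {
			toDel = append(toDel, t)
		}
	}
	return toAdd, toDel
}

func main() {
	// Cordon: no taint on the node yet, spec.Unschedulable is true.
	add, del := taintSetDiff(desiredTaints(true), nil)
	fmt.Println(len(add), len(del)) // prints "1 0"

	// Uncordon: the taint is present, spec.Unschedulable is false.
	add, del = taintSetDiff(desiredTaints(false), []Taint{unschedulableTaint})
	fmt.Println(len(add), len(del)) // prints "0 1"
}
```

Because the desired set is recomputed on every pass, a cleared `Unschedulable` field leaves the taint only in the existing set, so it ends up in the delete list without a dedicated removal path.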

re a test: definitely :)

Member Author

added integration test for unschedulable; tested removal manually.

Member

@bsalamat bsalamat left a comment

When the Unschedulable field is cleared, we need to remove the "NoSchedule" taint.


// AppendUnscheduableTaintIfNotExist appends unscheduable toleration to `.spec` if not exist; otherwise,
// no changes to `.spec.tolerations`.
func AppendUnscheduableTaintIfNotExist(tolerations []v1.Toleration) []v1.Toleration {
Member

@bsalamat bsalamat Mar 14, 2018

Please rename to AppendNoScheduleTolerationIfNotExist.

func AppendUnscheduableTaintIfNotExist(tolerations []v1.Toleration) []v1.Toleration {
	unscheduableTaintExist := false
	for _, t := range tolerations {
		if t.Key == algorithm.TaintNodeUnscheduable {
Member

Just the key being equal to the taint key does not necessarily mean that the toleration is the right one that we need. It could have an operator or a different effect. We should check if the key is the same, operator is "Exists" and effect is "NoSchedule".

Member

@bsalamat - we should create a function under util/toleration that checks if a toleration exists in a slice of tolerations, IIRC we’re doing the same for taints.
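As an illustration of the check requested in this thread, the following is a rough sketch of a helper that matches on key, operator, and effect together, not on the key alone. `Toleration` and `hasToleration` are hypothetical names using a simplified struct; the real `v1.Toleration` has more fields and would likely need a field-aware comparison.

```go
package main

import "fmt"

// Toleration is a simplified, hypothetical stand-in for v1.Toleration.
type Toleration struct {
	Key      string
	Operator string
	Effect   string
}

const taintNodeUnschedulable = "node.kubernetes.io/unschedulable"

// hasToleration reports whether want is present in tolerations.
// All fields must match, so a toleration with the same key but a
// different operator or effect does not count.
func hasToleration(tolerations []Toleration, want Toleration) bool {
	for _, t := range tolerations {
		if t == want {
			return true
		}
	}
	return false
}

func main() {
	want := Toleration{Key: taintNodeUnschedulable, Operator: "Exists", Effect: "NoSchedule"}

	// Same key, different effect: a key-only comparison would wrongly accept this.
	tolerations := []Toleration{{Key: taintNodeUnschedulable, Operator: "Exists", Effect: "NoExecute"}}
	fmt.Println(hasToleration(tolerations, want)) // prints "false"

	tolerations = append(tolerations, want)
	fmt.Println(hasToleration(tolerations, want)) // prints "true"
}
```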


	if !unscheduableTaintExist {
		tolerations = append(tolerations, v1.Toleration{
			Key: algorithm.TaintNodeUnscheduable,
Member

I think we need to set operator to "Exists" as well.
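For reference, one possible shape of the append step with the operator set explicitly is sketched below. The `Exists` operator matches the taint regardless of its value. `Toleration` and `appendUnschedulableToleration` are hypothetical names on a simplified struct, not the function in this PR.

```go
package main

import "fmt"

// Toleration is a simplified, hypothetical stand-in for v1.Toleration.
type Toleration struct {
	Key      string
	Operator string
	Effect   string
}

// appendUnschedulableToleration appends the unschedulable:NoSchedule
// toleration with operator "Exists" unless an identical one is present.
func appendUnschedulableToleration(tolerations []Toleration) []Toleration {
	want := Toleration{
		Key:      "node.kubernetes.io/unschedulable",
		Operator: "Exists",
		Effect:   "NoSchedule",
	}
	for _, t := range tolerations {
		if t == want {
			// Already tolerated; leave the slice unchanged.
			return tolerations
		}
	}
	return append(tolerations, want)
}

func main() {
	var tolerations []Toleration
	tolerations = appendUnschedulableToleration(tolerations)
	tolerations = appendUnschedulableToleration(tolerations) // second call is a no-op
	fmt.Println(len(tolerations))                            // prints "1"
	fmt.Println(tolerations[0].Operator)                     // prints "Exists"
}
```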

// TaintNodeUnscheduable will be added when node becomes unscheduable
// and feature-gate for TaintNodesByCondition flag is enabled,
// and removed when node becomes scheduable.
TaintNodeUnscheduable = "node.kubernetes.io/unscheduable"
Member

s/TaintNodeUnscheduable/TaintNodeUnschedulable

s/unscheduable/unschedulable

Member

good catch!

@krzyzacy
Member

/milestone v1.10

@k8s-ci-robot k8s-ci-robot added this to the v1.10 milestone Mar 14, 2018
@janetkuo janetkuo added priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. kind/bug Categorizes issue or PR as related to a bug. labels Mar 14, 2018
@jdumars
Member

jdumars commented Mar 14, 2018

/sig node

@k8s-ci-robot k8s-ci-robot added the sig/node Categorizes an issue or PR as relevant to SIG Node. label Mar 14, 2018
// AppendUnscheduableTaintIfNotExist appends unscheduable toleration to `.spec` if not exist; otherwise,
// no changes to `.spec.tolerations`.
func AppendUnscheduableTaintIfNotExist(tolerations []v1.Toleration) []v1.Toleration {
	unscheduableTaintExist := false
Member

Typo: please rename to unschedulableTaintExist

@janetkuo janetkuo added sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. and removed sig/node Categorizes an issue or PR as relevant to SIG Node. labels Mar 14, 2018
@dims
Member

dims commented Mar 14, 2018

@k82cn is this related to #60763 ?

@jdumars
Member

jdumars commented Mar 14, 2018

@dims see comment from @janetkuo in #61050

@janetkuo
Member

> is this related to #60763?

@dims it's not. #60763 is not caused by DaemonSet, see #60763 (comment)

@liggitt
Member

liggitt commented Mar 14, 2018

@krzyzacy is this release blocking? It looks like this only changes functionality behind alpha gates, and it is still WIP. Should it move out of the milestone?

edit: I see it is fixing alpha function for the alpha feature CI job

@dims
Member

dims commented Mar 16, 2018

@mikedanese ping for pkg/controller/daemon/ approval
@gmarek @bowei ping for pkg/controller/nodelifecycle approval

@k82cn
Member Author

k82cn commented Mar 16, 2018

/assign @gmarek

@k8s-github-robot

[MILESTONENOTIFIER] Milestone Pull Request: Up-to-date for process

@bsalamat @gmarek @janetkuo @k82cn

Pull Request Labels
  • sig/scheduling: Pull Request will be escalated to these SIGs if needed.
  • priority/critical-urgent: Never automatically move pull request out of a release milestone; continually escalate to contributor and SIG through all available channels.
  • kind/bug: Fixes a bug discovered during the current release.

@bsalamat
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 16, 2018
@gmarek
Contributor

gmarek commented Mar 16, 2018

LGTM for NodeController.

/approve

@jberkus

jberkus commented Mar 16, 2018

@mikedanese @janetkuo can one of you approve this PR please?

@dims
Member

dims commented Mar 16, 2018

@kow3ns does this look good to you too?

@janetkuo
Member

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bsalamat, gmarek, janetkuo, k82cn

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 16, 2018
@k8s-github-robot

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

[test failed] gci-gce-alpha-features