Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add feature gate and kubelet flags for Topology Manager #74411

Merged
merged 1 commit into from
Jul 3, 2019
Merged

Add feature gate and kubelet flags for Topology Manager #74411

merged 1 commit into from
Jul 3, 2019

Conversation

nolancon
Copy link

What this PR does / why we need it:
Create feature gate for Topology Manager and add Kubelet flags.
Topology Manager based merged design proposal here: kubernetes/community#1680
Issue for tracking PRs: #72828

What type of PR is this?

/kind feature

Does this PR introduce a user-facing change?:

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Feb 22, 2019
@k8s-ci-robot
Copy link
Contributor

Hi @nolancon. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Feb 22, 2019
@k8s-ci-robot k8s-ci-robot added area/kubelet kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API sig/node Categorizes an issue or PR as relevant to SIG Node. and removed needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Feb 22, 2019
@nolancon
Copy link
Author

@lmdaly
/cc @ConnorDoyle @balajismaniam

@fejta-bot
Copy link

This PR may require API review.

If so, when the changes are ready, complete the pre-review checklist and request an API review.

Status of requested reviews is tracked in the API Review project.

@@ -167,6 +167,12 @@ const (
//
// Enable pods to consume pre-allocated huge pages of varying page sizes
HugePages utilfeature.Feature = "HugePages"

// owner: @lmdaly
// alpha: v1.13
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are just adding this now, this should be 1.14/1.15, right?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to 1.15


if utilfeature.DefaultFeatureGate.Enabled(features.NodeLease) {
klet.nodeLeaseController = nodelease.NewController(klet.clock, klet.heartbeatClient, string(klet.nodeName), kubeCfg.NodeLeaseDurationSeconds, klet.onRepeatedHeartbeatFailure)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: remove whitespace changes.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Mar 11, 2019
@k8s-ci-robot k8s-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels May 13, 2019
@lmdaly
Copy link
Contributor

lmdaly commented May 28, 2019

@dashpole could you ok-to-test this pr?

@@ -852,7 +852,7 @@ func NewMainKubelet(kubeCfg *kubeletconfiginternal.KubeletConfiguration,
}
klet.AddPodSyncLoopHandler(activeDeadlineHandler)
klet.AddPodSyncHandler(activeDeadlineHandler)

klet.admitHandlers.AddPodAdmitHandler(klet.containerManager.GetTopologyPodAdmitHandler())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would probably add this line as part of #74357 so that this current PR can be merged independent of all other PRs in this chain.

@@ -149,6 +149,9 @@ func SetDefaults_KubeletConfiguration(obj *kubeletconfigv1beta1.KubeletConfigura
// Keep the same as default NodeStatusUpdateFrequency
obj.CPUManagerReconcilePeriod = metav1.Duration{Duration: 10 * time.Second}
}
if obj.TopologyManagerPolicy == "" {
obj.TopologyManagerPolicy = "Preferred"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it OK that "Preferred" is capitalized here, but is lower-case in the TopologyManager itself? Does it get normalized to lower-case somewhere?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't seem to make a difference, but changed anyway for consistency.

@dashpole
Copy link
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added the ok-to-test Indicates a non-member PR verified by an org member that is safe to test. label May 28, 2019
@lmdaly
Copy link
Contributor

lmdaly commented May 30, 2019

@liggitt the policy is passed into the New Topology Manager func where it gets checked.
https://github.com/kubernetes/kubernetes/pull/74357/files#diff-06aa1b61a9bc4a50aa8d8b0a69d95bb2R289 - passed in
and checked here https://github.com/kubernetes/kubernetes/pull/73580/files#diff-a19f204f56e91b0871491b88c0e0b139R71

Is there additional check we should be doing before this?

@liggitt
Copy link
Member

liggitt commented May 31, 2019

Is there additional check we should be doing before this?

func ValidateKubeletConfiguration(kc *kubeletconfig.KubeletConfiguration) validates a config object and should verify the specified option is valid.

does the default preferred value map to the current implicit behavior, or is the entire idea of policies new?

@liggitt
Copy link
Member

liggitt commented May 31, 2019

if the TopologyManager feature is disabled, should that prevent someone from setting a policy or from setting a particular policy like strict? we don't want to let people set things in config that are silently ignored, so I'd expect something along the lines of this:

if kc.RotateCertificates && !localFeatureGate.Enabled(features.RotateKubeletClientCertificate) {
allErrors = append(allErrors, fmt.Errorf("invalid configuration: RotateCertificates %v requires feature gate RotateKubeletClientCertificate", kc.RotateCertificates))
}
if kc.ServerTLSBootstrap && !localFeatureGate.Enabled(features.RotateKubeletServerCertificate) {
allErrors = append(allErrors, fmt.Errorf("invalid configuration: ServerTLSBootstrap %v requires feature gate RotateKubeletServerCertificate", kc.ServerTLSBootstrap))
}

@lmdaly
Copy link
Contributor

lmdaly commented May 31, 2019

@liggitt Will add the check for the policy and feature gate setting. The policies are a new concept and give additional functionality to resource allocation for cpu and devices

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 31, 2019
@@ -149,6 +149,9 @@ func SetDefaults_KubeletConfiguration(obj *kubeletconfigv1beta1.KubeletConfigura
// Keep the same as default NodeStatusUpdateFrequency
obj.CPUManagerReconcilePeriod = metav1.Duration{Duration: 10 * time.Second}
}
if obj.TopologyManagerPolicy == "" {
obj.TopologyManagerPolicy = "preferred"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we have default policy 'none' - meaning, "do not do anything, work as older version of kubelet, even if the feature gate is enabled" ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lmdaly:
The policies are a new concept and give additional functionality to resource allocation for cpu and devices

I agree with @kad that the default value of the config field should preserve current behavior

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 24, 2019
@nolancon
Copy link
Author

nolancon commented Jun 24, 2019

@liggitt @kad this PR has been updated - Topology Manager default policy is now "none". None policy will do nothing and behave as before, even if feature gate is enabled.

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. area/kubeadm sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jun 24, 2019
@@ -149,6 +149,9 @@ func SetDefaults_KubeletConfiguration(obj *kubeletconfigv1beta1.KubeletConfigura
// Keep the same as default NodeStatusUpdateFrequency
obj.CPUManagerReconcilePeriod = metav1.Duration{Duration: 10 * time.Second}
}
if obj.TopologyManagerPolicy == "" {
obj.TopologyManagerPolicy = "none"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, define constants for the allowed values (in the same file as the config type) and use them here, in validation, and in fuzzing

@@ -409,6 +409,13 @@ type KubeletConfiguration struct {
// Default: "10s"
// +optional
CPUManagerReconcilePeriod metav1.Duration `json:"cpuManagerReconcilePeriod,omitempty"`
// TopologyManagerPolicy is the name of the policy to use.
// Requires the TopologyManager feature gate to be enabled.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Policies other than none require the TopologyManager feature gate to be enabled."

@liggitt
Copy link
Member

liggitt commented Jun 28, 2019

thanks, one nit on constants, one on doc, lgtm otherwise. please squash before merge

Nit: remove capitalization of preferred
Remove line from kubelet and add to separate PR for easier merge

nit: dependency added to separate PR

Add check to ensure strict policy cannot be set without feature gate enabled

Topology Manager runs "none" policy by default.

Added constants for policies and updated documentation.
@nolancon
Copy link
Author

nolancon commented Jul 2, 2019

/retest

@liggitt
Copy link
Member

liggitt commented Jul 3, 2019

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 3, 2019
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: derekwaynecarr, liggitt, nolancon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 3, 2019
@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@k8s-ci-robot k8s-ci-robot merged commit bbef01a into kubernetes:master Jul 3, 2019
@liggitt liggitt added this to the v1.16 milestone Aug 6, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubeadm area/kubelet cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note-none Denotes a PR that doesn't merit a release note. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/node Categorizes an issue or PR as relevant to SIG Node. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants