Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory manager support for Windows nodes #128560

Merged
merged 1 commit into from
Nov 7, 2024

Conversation

marosset
Copy link
Contributor

@marosset marosset commented Nov 5, 2024

/kind feature

What type of PR is this?

What this PR does / why we need it:

This PR adds support for memory manager on Windows nodes and add a new BestEffort policy which allows giving preferences to the Windows OS for which CPUs to bind containers to.
Windows cannot guarantee scheduling workloads to specific NUMA nodes so instead we query the OS for which CPUs are part of a given NUMA node and specify that when starting containers.

Which issue(s) this PR fixes:

Part of #125262

To enable memory manager support the kubelet configuration would look something like

cpuManagerPolicy: static
memoryManagerPolicy: BestEffort
systemReserved: 
  cpu: 500m
  memory: 500Mi
featureGates:
  MemoryManager: true
  WindowsCPUAndMemoryAffinity: true

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Added Windows support for the node memory manager.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

KEP: https://github.com/kubernetes/enhancements/tree/master/keps/sig-windows/4885-windows-cpu-and-memory-affinity

/sig windows node
/milestone v1.32
/area kubelet
/assign @jsturtevant

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Nov 5, 2024
@k8s-ci-robot k8s-ci-robot added this to the v1.32 milestone Nov 5, 2024
@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. sig/windows Categorizes an issue or PR as relevant to SIG Windows. sig/node Categorizes an issue or PR as relevant to SIG Node. area/kubelet cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Nov 5, 2024
pkg/kubelet/cm/container_manager_windows.go Outdated Show resolved Hide resolved
pkg/kubelet/cm/container_manager_windows.go Outdated Show resolved Hide resolved
pkg/kubelet/cm/internal_container_lifecycle_windows.go Outdated Show resolved Hide resolved
pkg/kubelet/cm/memorymanager/memory_manager.go Outdated Show resolved Hide resolved
pkg/kubelet/cm/memorymanager/policy_best_effort.go Outdated Show resolved Hide resolved

// bestEffortPolicy is implementation of the policy interfact for the BestEffort policy
type bestEffortPolicy struct {
static *staticPolicy
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is not obvious to me what do we gain by wrapping the static policy. Could you please elaborate a bit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For Windows we want the new BestEffort policy to have the same logic as the existing static policy.
I figured that doing this would reduce a lot of code duplication.

I can update the comments here to reflect that or maybe I can have the kubelet arguments specify BestEffortPolicy and then create a staticPolicy instance here for Windows.

Do you all have a preference @ffromani @jsturtevant ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updating comments and wrapping seems to make sense to me. This will allow to adjust more easily if we learn of reasons to adjust.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll need to refresh my memory over what we discussed in the KEP before I can answer. Right now I can surely say comments is a great starting point.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, first: I've nothing against reusing the code like this. We didn't do often in the codebase, but why not?
The issue I see however is that from my interpretation of the KEP I would have expected different hint generation on windows based on https://github.com/kubernetes/enhancements/tree/master/keps/sig-windows/4885-windows-cpu-and-memory-affinity#windows-memory-considerations

I think we can settle this with a good set of testcases, at very least unit tests. which seems lacking to me?
Granted, the policy is actually a thin wrapper over the static policy, which has its own tests; the tests I recommend to add are to ensure and document the expected behavior on windows.

The tests should cover preferably also the topology manager merge hint process. It will be similar in spirit to what you are adding to internal_container_lifecycle_windows_test.go, but covering the generic topology manager flow.

There's likely not enough time to do this for this cycle. Plus I want to be practical: it's an alpha feature, disabled by default, and the early feedback from having some functionality in is important as well. So I don't think it is blocking, but I DO think this is beta-blocking (to discuss this topic in detail at least) and possibly we can need another alpha iteration to sort this out depending on the aforementioned feedback

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue I see however is that from my interpretation of the KEP I would have expected different hint generation on windows based on https://github.com/kubernetes/enhancements/tree/master/keps/sig-windows/4885-windows-cpu-and-memory-affinity#windows-memory-considerations

I didn't see a difference in what we would provide for a hint. The major difference is in the semantic meaning of the hint, which on Windows we can't guarantee that it the hint will be respected, only make that suggestion. What kind of difference are you thinking?

There's likely not enough time to do this for this cycle. Plus I want to be practical: it's an alpha feature, disabled by default, and the early feedback from having some functionality in is important as well. So I don't think it is blocking, but I DO think this is beta-blocking (to discuss this topic in detail at least) and possibly we can need another alpha iteration to sort this out depending on the aforementioned feedback

I appreciate this sentiment. We do have a user we are working with that is willing to provide feedback but is looking for it to be in an Alpha release. From my understanding of the challenges on Linux, these types of features need some real feedback to get them dialed in, and I anticipate we will get feedback that we will need to incorporate back into the design.

Copy link
Contributor

@ffromani ffromani Nov 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue I see however is that from my interpretation of the KEP I would have expected different hint generation on windows based on https://github.com/kubernetes/enhancements/tree/master/keps/sig-windows/4885-windows-cpu-and-memory-affinity#windows-memory-considerations

I didn't see a difference in what we would provide for a hint. The major difference is in the semantic meaning of the hint, which on Windows we can't guarantee that it the hint will be respected, only make that suggestion. What kind of difference are you thinking?

I'm basing my expectations on what I saw so far on linux for the cpumanager policy. I would like to see if for example returning a different set of NUMA bitmasks or different values for Preferred field.

If we add a thin wrapper over the Static policy, we are saying that this policy should behave 1:1 like the static policy, and we are unnecessarily coupling the behavior. What I would rather see is a specification (and unit tests are a good vehicle for that) of the expected behavior, more precise than (yes, I'm oversimplifying here) "whatever Static policy does, but not enforcing".

Lacking specification is something that biten us already multiple already, so we should IMO go in the direction to have the spec spelled out more precisely, and the testcase well defined. If, after that, we can reuse the implementation, great.

If nothing else (which I can accept!) this should be discussed and elaborated in the KEP design details.
I'm happy to elaborate further in the next cycle. Again, more details in the KEP to elaborate more precisely the behavior can possibly be all we need.

There's likely not enough time to do this for this cycle. Plus I want to be practical: it's an alpha feature, disabled by default, and the early feedback from having some functionality in is important as well. So I don't think it is blocking, but I DO think this is beta-blocking (to discuss this topic in detail at least) and possibly we can need another alpha iteration to sort this out depending on the aforementioned feedback

I appreciate this sentiment. We do have a user we are working with that is willing to provide feedback but is looking for it to be in an Alpha release. From my understanding of the challenges on Linux, these types of features need some real feedback to get them dialed in, and I anticipate we will get feedback that we will need to incorporate back into the design.

Fine, so we will evaluate during the 1.33 cycle if we can move to beta (taking into account my oen feedback above) or if we need another alpha iteration.

The rest of the code overall LGTM, I'll have another pass ASAP but from my recollections I don't have anything major on my notes.

@ffromani
Copy link
Contributor

ffromani commented Nov 5, 2024

/triage accepted
/priority important-soon

initial priority assessment subject to be changed

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Nov 5, 2024
@ffromani
Copy link
Contributor

ffromani commented Nov 5, 2024

/retitle Memory manager support for Windows nodes

@k8s-ci-robot k8s-ci-robot changed the title Memeory manager support for Windows nodes Memory manager support for Windows nodes Nov 5, 2024
@ffromani
Copy link
Contributor

ffromani commented Nov 5, 2024

@marosset adding the policy name ultimates surfaces to the user and to the kubeletconfig API object so this change MAY require api-review, please take this into account.

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Nov 5, 2024
@sftim
Copy link
Contributor

sftim commented Nov 6, 2024

Changelog suggestion

-Windows: Support memory manager on Windows 
+Added Windows support for the node memory manager.

pkg/kubelet/cm/internal_container_lifecycle_windows.go Outdated Show resolved Hide resolved
pkg/kubelet/cm/memorymanager/memory_manager.go Outdated Show resolved Hide resolved

// bestEffortPolicy is implementation of the policy interfact for the BestEffort policy
type bestEffortPolicy struct {
static *staticPolicy
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, first: I've nothing against reusing the code like this. We didn't do often in the codebase, but why not?
The issue I see however is that from my interpretation of the KEP I would have expected different hint generation on windows based on https://github.com/kubernetes/enhancements/tree/master/keps/sig-windows/4885-windows-cpu-and-memory-affinity#windows-memory-considerations

I think we can settle this with a good set of testcases, at very least unit tests. which seems lacking to me?
Granted, the policy is actually a thin wrapper over the static policy, which has its own tests; the tests I recommend to add are to ensure and document the expected behavior on windows.

The tests should cover preferably also the topology manager merge hint process. It will be similar in spirit to what you are adding to internal_container_lifecycle_windows_test.go, but covering the generic topology manager flow.

There's likely not enough time to do this for this cycle. Plus I want to be practical: it's an alpha feature, disabled by default, and the early feedback from having some functionality in is important as well. So I don't think it is blocking, but I DO think this is beta-blocking (to discuss this topic in detail at least) and possibly we can need another alpha iteration to sort this out depending on the aforementioned feedback

@marosset
Copy link
Contributor Author

marosset commented Nov 7, 2024

/lgtm cancel

need to rebase on top of #128657

Ack - I'll rebase on that as soon as it merges.

@marosset marosset force-pushed the win-mem-manager branch 2 times, most recently from ab90eb1 to 284fa5a Compare November 7, 2024 19:54
Copy link
Contributor

@ffromani ffromani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

thanks for the rebase and sorry for the churn. You use the recently added containerMap.Clone(), so looks fine again to me.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 7, 2024
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: 28376828105d5f848a8c1e75791828342d5b5d5c

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: marosset, mrunalp

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 7, 2024
@ffromani
Copy link
Contributor

ffromani commented Nov 7, 2024

beta-blocking (or perhaps as later bugfix?) we need to update cmd/kubelet/app/options/options.go

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 7, 2024
@mrunalp
Copy link
Contributor

mrunalp commented Nov 7, 2024

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 7, 2024
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: 9343fc1fc9ca096c90740b4240d16aae84833eed

@jsturtevant
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot merged commit 3c9380c into kubernetes:master Nov 7, 2024
15 checks passed
@tengqm
Copy link
Contributor

tengqm commented Nov 8, 2024

Do we have documentation updates (including the feature gate) to the website?

@marosset
Copy link
Contributor Author

marosset commented Nov 8, 2024

Do we have documentation updates (including the feature gate) to the website?

We'll use kubernetes/website#48469 to update docs for both cpu manager (#125296) and memory manager (this PR) support for windows.

@marosset marosset deleted the win-mem-manager branch November 14, 2024 23:37
@liggitt liggitt removed the api-review Categorizes an issue or PR as actively needing an API review. label Nov 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubelet cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/windows Categorizes an issue or PR as relevant to SIG Windows. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Archived in project
Status: Done
Development

Successfully merging this pull request may close these issues.

10 participants