Feature: Support memory qos with cgroups v2 #102970
Hi @borgerli. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/cc
/ok-to-test
The first commit of this PR, "add unified on cri to support cgroup v2", is being reviewed in another PR: #102578.
/test pull-kubernetes-node-crio-cgrpv2-e2e
The check of
@borgerli: The following tests failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
/assign @derekwaynecarr
/approve based on @bobbypage and @mrunalp and @ehashman reviews.
Given that pull-kubernetes-e2e-gce-alpha-features is passing there is at least some coverage for the feature flag being enabled, but since none of the logic of this PR is triggered without the feature flag enabled and cgroupsv2, IMO this needs a test job added. Since none of our available jobs are able to check this right now, I'm hesitant to merge without exercising these code paths in CI. This PR should have had some test-infra changes submitted simultaneously so that we can trigger such a job on a PR. See e.g. https://github.com/kubernetes/test-infra/blob/58ae43ff3e79c2958827883f5f5b0164084bfb23/config/jobs/kubernetes/sig-node/sig-node-presubmit.yaml#L377-L415 If we can't get something merged in test-infra by EOD to test this I'll defer to Node leads to make the call since this is alpha, but I really think we need to add a CI job. This hasn't actually run through CI so I don't have confidence that it works.
/hold to resolve discussion
/test pull-kubernetes-node-memoryqos-cgrpv2
Job passed! 🥳 https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/102970/pull-kubernetes-node-memoryqos-cgrpv2/1413181901090328576/ /hold cancel
/skip
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: borgerli, ehashman, mrunalp, thockin, xiaoxubeii The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/milestone v1.22
Here is a second batch for feature gate updates in 1.22.
- CPUManagerPolicyOptions kubernetes/kubernetes#101432
- ControllerManagerLeaderMigration kubernetes/kubernetes#103533
- DelegateFSGroupToCSIDriver kubernetes/kubernetes#103244
- DynamicKubeletConfig kubernetes/kubernetes#102966
- EndpointSliceProxying kubernetes/kubernetes#103451
- EndpointSliceTerminatingCondition kubernetes/kubernetes#103596
- HugePageStorageMediumSize kubernetes/kubernetes#99144
- JobTrackingWithFinalizers kubernetes/kubernetes#98817 (also tracked in kubernetes#28841, can rebase).
- MemoryQoS kubernetes/kubernetes#102970
- NodeSwap kubernetes/kubernetes#102823, kubernetes/kubernetes#103553
- ServiceInternalTrafficPolicy kubernetes/kubernetes#103462
- StatefulSetAutoDeletePVC kubernetes/kubernetes#99378
- WindowsEndpointSliceProxying kubernetes/kubernetes#103451
Some of these need more detailed documentation.
What type of PR is this?
/kind feature
What this PR does / why we need it:
Which issue(s) this PR fixes:
Implements KEP - Support memory qos with cgroups v2
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:
Note
After discussion with KEP owner @xiaoxubeii, this PR leaves the implementation of kube-reserved / system-reserved cgroups for future work. The effective min boundary of a cgroup is also limited by the memory.min values of all its ancestor cgroups, so memory.min must be set accordingly on every ancestor. For the kube-reserved / system-reserved cgroups, however, the ancestor cgroups may not be managed by the kubelet.
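
To illustrate the ancestor constraint described in the note, here is a minimal Go sketch. It is not code from this PR: the `cgroupNode` type and the `effectiveMin` function are hypothetical names made up for illustration, and the model is deliberately simplified. It only captures the rule that a cgroup's protection cannot exceed the memory.min of any ancestor, and it ignores how the kernel splits a parent's protection among sibling cgroups when memory.min is overcommitted.

```go
package main

import "fmt"

// cgroupNode is a hypothetical, simplified model of one cgroup v2 directory.
// It is an illustration only and does not mirror any kubelet type.
type cgroupNode struct {
	name      string
	memoryMin int64       // value written to memory.min, in bytes
	parent    *cgroupNode // nil for the top of the modeled hierarchy
}

// effectiveMin approximates the protection that actually applies to n:
// the smallest memory.min found on n and on each of its ancestors.
// Assumption: sibling overcommit handling in the kernel is ignored here.
func effectiveMin(n *cgroupNode) int64 {
	eff := n.memoryMin
	for p := n.parent; p != nil; p = p.parent {
		if p.memoryMin < eff {
			eff = p.memoryMin
		}
	}
	return eff
}

func main() {
	const request = 256 << 20 // 256 MiB

	// An ancestor whose memory.min was never set.
	kubepods := &cgroupNode{name: "kubepods", memoryMin: 0}
	pod := &cgroupNode{name: "pod", memoryMin: request, parent: kubepods}
	container := &cgroupNode{name: "container", memoryMin: request, parent: pod}

	// The container asks for 256 MiB, but the unset ancestor caps it at 0.
	fmt.Println(effectiveMin(container))

	// Once every ancestor carries the value, the protection takes effect.
	kubepods.memoryMin = request
	fmt.Println(effectiveMin(container))
}
```

In this simplified model, setting memory.min only on the container has no effect until each ancestor is raised as well, which is why cgroups whose ancestors are outside the kubelet's control are harder to handle.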