Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add --system-reserved support for swap(cgroup v2 only): only for burstable pods #105271

Closed
wants to merge 6 commits into from

Conversation

pacoxu
Copy link
Member

@pacoxu pacoxu commented Sep 27, 2021

/kind feature
/sig node
/triage accepted
/priority important-soon
/kind api-change

Pending on kubernetes/test-infra#24919 to test the e2e part of this PR.

Fixes #105019

kubelet: add the ability to set a the quantity of swap by `--system-reserved` flag on a node

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

- [KEP]: https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2400-node-swap

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. labels Sep 27, 2021
@k8s-ci-robot
Copy link
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API labels Sep 27, 2021
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 15, 2021
@pacoxu pacoxu marked this pull request as ready for review October 15, 2021 09:26
@k8s-ci-robot k8s-ci-robot added the sig/apps Categorizes an issue or PR as relevant to SIG Apps. label Oct 15, 2021
@pacoxu pacoxu changed the title wip: add --system-reserved support for swap add --system-reserved support for swap Oct 18, 2021
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 18, 2021
@pacoxu
Copy link
Member Author

pacoxu commented Oct 18, 2021

/cc ehashman kevindelgado @yxxhero
cc all who are working on Swap-related PRs.

@k8s-ci-robot
Copy link
Contributor

@pacoxu: GitHub didn't allow me to request PR reviews from the following users: yxxhero.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc ehashman kevindelgado yxxhero
cc all who are working on Swap-related PRs.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@pacoxu
Copy link
Member Author

pacoxu commented Oct 18, 2021

Some testing on my cluster

[root@daocloud ~]# free -h
              total        used        free      shared  buff/cache   available
Mem:           15Gi       2.1Gi       9.7Gi       2.0Mi       3.6Gi        13Gi
Swap:         1.0Gi       148Mi       875Mi

[root@daocloud ~]# cat /var/lib/kubelet/config.yaml
...
failSwapOn: false
featureGates:
  NodeSwap: true
memorySwap:
  swapBehavior: LimitedSwap
systemReserved:
  cpu: 1000m
  memory: 1000Mi
  ephemeral-storage: 10Gi
  pid: "300"
  swap: "400Mi"
...

[root@daocloud ~]# kubectl get node -o yaml | grep allocatable -A 20
    allocatable:
      cpu: "7"
      ephemeral-storage: "83916597092"
      hugepages-1Gi: "0"
      hugepages-2Mi: "0"
      memory: 15107672Ki
      pods: "20"
      swap: 638972Ki
    capacity:
      cpu: "8"
      ephemeral-storage: 102706180Ki
      hugepages-1Gi: "0"
      hugepages-2Mi: "0"
      memory: 16234072Ki
      pods: "20"
      swap: 1048572Ki

@pacoxu pacoxu marked this pull request as draft October 18, 2021 04:00
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 18, 2021
@pacoxu pacoxu force-pushed the swap-beta-dev branch 2 times, most recently from 99da08f to d865da9 Compare October 18, 2021 07:32
Copy link
Contributor

@iholder101 iholder101 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @pacoxu! Great Job!

Left some minor comments below

@@ -260,6 +267,9 @@ func (m *cgroupManagerImpl) Validate(name CgroupName) error {
// in https://github.com/opencontainers/runc/issues/1440
// once resolved, we can remove this code.
allowlistControllers := sets.NewString("cpu", "cpuacct", "cpuset", "memory", "systemd", "pids")
if utilfeature.DefaultFeatureGate.Enabled(kubefeatures.NodeSwap) && swapControllerV2Available() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, there is no "swap" controller, but there are swap knobs that are part of the memory controller.
AFAIK, all cgroups with the memory controller enabled are also exposed to swap knobs.

Please correct me if I'm wrong.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function swapControllerV2Available is // swap controller is available only if cgroup v2 swap file is accessible..

So we may rename it to something like swapCgroupV2FileAvailable?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. In which situations the swap knobs are missing with cgroups v2 and memory controller enabled?
  2. IIUC the whole purpose here is to add swap to allowlistControllers. Is that necessary?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it seems that no need to run swapCgroupV2FileAvailable here. I drop it.

pkg/kubelet/cm/container_manager_linux.go Outdated Show resolved Hide resolved
pkg/apis/node/types.go Outdated Show resolved Hide resolved
@pacoxu
Copy link
Member Author

pacoxu commented Jul 11, 2023

@iholder101 thanks for the review and updated accordingly.

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: mmiranda96, pacoxu
Once this PR has been reviewed and has the lgtm label, please assign liggitt for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 19, 2023
pacoxu and others added 5 commits July 20, 2023 15:09
Signed-off-by: Paco Xu <paco.xu@daocloud.io>
Signed-off-by: Paco Xu <paco.xu@daocloud.io>
Co-authored-by: Elana Hashman <ehashman@users.noreply.github.com>
Signed-off-by: Paco Xu <paco.xu@daocloud.io>
Co-authored-by: haoyun <yun.hao@daocloud.io>
Signed-off-by: Paco Xu <paco.xu@daocloud.io>
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 20, 2023
@k8s-ci-robot
Copy link
Contributor

k8s-ci-robot commented Jul 20, 2023

@pacoxu: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-kubernetes-node-kubelet-serial fb337bcda5e2198a87b543104294fce8305fdc35 link false /test pull-kubernetes-node-kubelet-serial
pull-kubernetes-node-swap-ubuntu 2a3bcf7e8a1c22c3bf9d424308f8331146603490 link false /test pull-kubernetes-node-swap-ubuntu
pull-kubernetes-node-swap-fedora-serial 556b74b3cd87a98e029c259f01c66fb7a64f31b4 link false /test pull-kubernetes-node-swap-fedora-serial

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 20, 2023
@k8s-ci-robot
Copy link
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@dims dims added the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Oct 24, 2023
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Reopen this PR with /reopen
  • Mark this PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Copy link
Contributor

@k8s-triage-robot: Closed this PR.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Reopen this PR with /reopen
  • Mark this PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-review Categorizes an issue or PR as actively needing an API review. area/code-generation area/kubelet area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/feature Categorizes issue or PR as related to a new feature. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Status: API review completed, 1.28
Status: Done
Archived in project
Archived in project
Development

Successfully merging this pull request may close these issues.

[KEP-2400] Add the ability to set a system-reserved quantity of swap on a node