KEP-1967: promote size backed memory volumes to stable #126981

kannon92 · 2024-08-28T20:47:13Z

What type of PR is this?

/kind feature

What this PR does / why we need it:

Promote KEP-1967 to stable.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Promote SizeMemoryBackedVolumes to stable

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

kannon92 · 2024-08-28T20:47:30Z

/hold

This is dependent on the KEP outcome.

kannon92 · 2024-08-28T20:54:44Z

/triage accepted
/priority important-soon

kannon92 · 2024-08-30T10:30:41Z

/retest

kannon92 · 2024-08-30T13:32:29Z

/retest

thockin · 2024-10-13T22:30:13Z

> What happens if the container crashes and is restarted?

> I don't really understand how this is related to this feature. What exactly are you concerned about? This feature uses cgroup of the container to set a max memory limit. I don't see how that would be impacted by restarts.

IIUC A write to tmpfs creates dirty anonymous pages (can't be flushed to disk for reclaim) attributed to the container's cgroup. If that cgroup were to get torn down for any reason, those pages are still anonymous and dirty, so they get accounted by the next-parent cgroup (I think?). If this happens, you can imagine a situation where the container tries to restart and immediately OOMs because those tmpfs files are accounted to the pod.

What I am asking is whether we have thought through those sorts of failure modes. Do we ever tear down the container cgroup? Are we confident that the accounting stays with the container? If you all say we are satisfied with the testing of that, I'm satisfied, but I wanted to call it out as an area where I know there are traps :)

pkg/volume/emptydir/empty_dir_test.go

SergeyKanzhelev · 2024-10-15T21:25:27Z

pkg/features/kube_features.go

 	// Enables kubelet support to size memory backed volumes
-	SizeMemoryBackedVolumes featuregate.Feature = "SizeMemoryBackedVolumes"
+	SizeMemoryBackedVolumes featuregate.Feature = "SizeMemoryBackedVolumes" // remove in 1.35


Can you please add a comment saying that this feature gate is only used in the kubelet and is not needed for emulated versions? That way, in a couple of releases, it will be easier to see that we can simply remove it.

I will do this tonight or tomorrow.

kannon92 · 2024-10-15T23:03:50Z

@thockin
As far as I can tell, we are not using pod cgroups to track memory. I looked into the pod cgroup of a pod to see if we report memory cgroup settings for the containers. I don't think we use the pod cgroup for tracking (memory.current is set to 0 even though container memory.current is nonzero). memory.max for pods was set to max.

So I don't think we are tracking anything in the pod cgroup.
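The check described above can be repeated on a node with a small program along these lines. This is only a sketch, assuming cgroup v2, where `memory.current` holds a byte count and `memory.max` holds either a byte count or the literal `max`. The helper names `parseCgroupValue` and `readCgroupFile` are hypothetical, and the cgroup directories are passed as arguments because their exact paths depend on the node's cgroup driver and runtime.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// parseCgroupValue parses the contents of a cgroup v2 memory file such as
// memory.current or memory.max. The literal "max" means "no limit".
// It returns the value in bytes, whether the value is unlimited, and an error.
func parseCgroupValue(raw string) (uint64, bool, error) {
	s := strings.TrimSpace(raw)
	if s == "max" {
		return 0, true, nil
	}
	v, err := strconv.ParseUint(s, 10, 64)
	if err != nil {
		return 0, false, fmt.Errorf("unexpected cgroup value %q: %w", s, err)
	}
	return v, false, nil
}

// readCgroupFile reads and parses one file from a cgroup directory.
func readCgroupFile(dir, name string) (uint64, bool, error) {
	data, err := os.ReadFile(filepath.Join(dir, name))
	if err != nil {
		return 0, false, err
	}
	return parseCgroupValue(string(data))
}

func main() {
	// Hypothetical usage: pass the pod and container cgroup directories
	// as command-line arguments and compare the printed values.
	for _, dir := range os.Args[1:] {
		for _, name := range []string{"memory.current", "memory.max"} {
			v, unlimited, err := readCgroupFile(dir, name)
			switch {
			case err != nil:
				fmt.Printf("%s/%s: error: %v\n", dir, name, err)
			case unlimited:
				fmt.Printf("%s/%s: max\n", dir, name)
			default:
				fmt.Printf("%s/%s: %d bytes\n", dir, name, v)
			}
		}
	}
}
```

Running it against both the pod-level and container-level directories shows side by side which level carries the usage and which level carries a limit.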

SergeyKanzhelev

/lgtm
/approve

I think we resolved all the questions here.

k8s-ci-robot · 2024-10-16T16:45:40Z

LGTM label has been added.

Git tree hash: c83d2e73530fa93baacae24f1164c091409c09a3

kannon92 · 2024-10-16T23:07:02Z

/retest

kannon92 · 2024-10-25T14:20:29Z

@thockin I asked @ndixita for her thoughts on the pod tracking. This is not a concern because we haven't been placing tracking information into the pod cgroup.

> What I am asking is whether we have thought through those sorts of failure modes. Do we ever tear down the container cgroup? Are we confident that the accounting stays with the container? If you all say we are satisfied with the testing of that, I'm satisfied, but I wanted to call it out as an area where I know there are traps :)

I took this item because of its long outstanding status as a beta feature (quote is from you).

> This gate has been Beta since 1.22 - is there any reason not to just set it to GA? Seems like something any contributor can do to score some brownie points, and since it has been beta so long, I doubt we can change any behavior, even if we want to?

I am paraphrasing you here, but this feature has been in beta and enabled since 1.22. At this stage, I think any new issue related to emptyDir and tmpfs should be considered separate from this issue. This feature was about setting a size limit to node allocatable or pod limit. It didn't change anything about how the cgroups are tracked across container restarts. I think the feature works as intended and we have e2e tests that verify the limits.
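The sizing rule mentioned above can be illustrated with a short sketch. It is not the kubelet's code: `effectiveTmpfsSize` is a hypothetical helper, sizes are plain byte counts rather than resource quantities, and treating an explicit emptyDir `sizeLimit` as a further upper bound is an assumption based on the KEP's description rather than something stated in this thread.

```go
package main

import "fmt"

// effectiveTmpfsSize returns a size in bytes for a memory-backed emptyDir.
// It starts from node allocatable memory and lowers the value to the pod
// memory limit and to the volume's own size limit when those are set.
// A value of 0 for podMemoryLimit or volumeSizeLimit means "not set".
func effectiveTmpfsSize(nodeAllocatable, podMemoryLimit, volumeSizeLimit int64) int64 {
	size := nodeAllocatable
	if podMemoryLimit > 0 && podMemoryLimit < size {
		size = podMemoryLimit
	}
	if volumeSizeLimit > 0 && volumeSizeLimit < size {
		size = volumeSizeLimit
	}
	return size
}

func main() {
	const gi = int64(1) << 30
	// Assumed example values: 8Gi node allocatable, 1Gi pod memory limit,
	// and a 2Gi size limit on the volume.
	fmt.Println(effectiveTmpfsSize(8*gi, 1*gi, 2*gi))
}
```

With no pod limit and no volume size limit, the sketch falls back to node allocatable memory, which matches the "node allocatable or pod limit" wording above.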

kannon92 · 2024-10-25T15:25:57Z

@thockin

> What happens if the container has requests: {memory: 1Gi}, limits: {memory: 1Gi} and the volume has a size-limit of 2Gi ? If I write to the volume, can I cause the pod to fail to start?

#128339

Yes. It looks like if a container hits an OOM limit with tmpfs, then it will be stuck in an error state. I don't think this feature introduced this though.

SergeyKanzhelev · 2024-10-25T17:33:07Z

Even if we want to say that a memory-backed volume must have a limit set and that it must be lower than the container limit, we will not be able to enforce it on the API side without breaking existing Pods.

So the only option would be for the kubelet to override the memory-backed volume size limit to the container's limit (when defined). This may help. But it also will not guarantee that we stay out of the "OOMLoopBackoff", as code bootstrap can be quite memory-heavy.

I see this as outside the KEP scope, but it can be a good future enhancement. @thockin do you agree?
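For illustration, the override floated above could look roughly like the following. This is a sketch of a possible future enhancement, not existing kubelet behavior; `clampToContainerLimit` is a hypothetical name and sizes are plain byte counts.

```go
package main

import "fmt"

// clampToContainerLimit caps a memory-backed volume's size limit at the
// container memory limit. A containerLimit of 0 means no limit is defined,
// in which case the volume limit is returned unchanged. A volumeSizeLimit
// of 0 means the volume has no explicit limit. The boolean reports whether
// the value was overridden.
func clampToContainerLimit(volumeSizeLimit, containerLimit int64) (int64, bool) {
	if containerLimit > 0 && (volumeSizeLimit == 0 || volumeSizeLimit > containerLimit) {
		return containerLimit, true
	}
	return volumeSizeLimit, false
}

func main() {
	const gi = int64(1) << 30
	// The case raised earlier in the thread: a 1Gi container limit
	// and a 2Gi volume size limit.
	size, overridden := clampToContainerLimit(2*gi, 1*gi)
	fmt.Println(size, overridden)
}
```

As noted above, a clamp like this narrows the gap but cannot rule out an OOM loop, because the process itself also needs memory under the same container limit.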

thockin

Thanks for doing the homework. I know how baroque some of the corner-cases are.

thockin · 2024-10-28T23:41:36Z

Thanks!

/lgtm
/approve

k8s-ci-robot · 2024-10-28T23:41:41Z

LGTM label has been added.

Git tree hash: 475bc53422b98905d0504291c826b62bd334a5a3

k8s-ci-robot · 2024-10-28T23:41:48Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kannon92, SergeyKanzhelev, thockin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~pkg/features/OWNERS~~ [SergeyKanzhelev,thockin]
~~pkg/volume/emptydir/OWNERS~~ [thockin]
~~test/e2e_node/OWNERS~~ [SergeyKanzhelev,thockin]
~~test/featuregates_linter/test_data/OWNERS~~ [SergeyKanzhelev,thockin]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot added the do-not-merge/hold label Aug 28, 2024

k8s-ci-robot requested review from dims and saad-ali August 28, 2024 20:47

kannon92 force-pushed the stable-empty-dir-promotion branch from a36c6bb to de67d25 August 30, 2024 00:48

k8s-ci-robot added size/L and removed size/M labels Aug 30, 2024

k8s-ci-robot added the needs-rebase label Sep 4, 2024

kannon92 force-pushed the stable-empty-dir-promotion branch from de67d25 to ee07a3f September 11, 2024 16:42

k8s-ci-robot added size/M and removed size/L labels Sep 11, 2024

k8s-ci-robot requested review from SergeyKanzhelev and thockin October 13, 2024 17:24

kannon92 force-pushed the stable-empty-dir-promotion branch from bdaebb4 to d667b7c October 14, 2024 12:47

thockin reviewed Oct 14, 2024

pkg/volume/emptydir/empty_dir_test.go

SergeyKanzhelev reviewed Oct 15, 2024

SergeyKanzhelev approved these changes Oct 16, 2024

k8s-ci-robot added the lgtm label Oct 16, 2024

k8s-ci-robot added the needs-rebase label Oct 16, 2024

kannon92 added 2 commits October 16, 2024 17:15

promote size backed memory volumes to stable

1d75220

feature gate comment

b690c4f

kannon92 force-pushed the stable-empty-dir-promotion branch from 4eb7301 to b690c4f October 16, 2024 21:15

k8s-ci-robot removed the lgtm label Oct 16, 2024

k8s-ci-robot requested review from SergeyKanzhelev and thockin October 16, 2024 21:16

k8s-ci-robot removed the needs-rebase label Oct 16, 2024

thockin reviewed Oct 28, 2024

k8s-ci-robot added the lgtm label Oct 28, 2024

k8s-ci-robot added the approved label Oct 28, 2024

k8s-ci-robot merged commit 685b8b3 into kubernetes:master Oct 29, 2024
20 checks passed

k8s-ci-robot added this to the v1.32 milestone Oct 29, 2024
