-
Notifications
You must be signed in to change notification settings - Fork 40.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KEP-127: user namespace support for stateless pods #116377
Conversation
/sig node |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
❓ do we publish the UID range that a Pod is using? eg: add that information into the |
no, that information is not exposed at the moment |
/priority important-soon |
add the definitions for the ID mappings to use at runtime for the volume mount. This is supported only on Linux where idmapped mounts are used to perform the runtime mapping. The new fields are mapped directly to the field in the OCI runtime specs: https://github.com/opencontainers/runtime-spec/blob/main/config.md#posix-platform-mounts The CRI runtime will pass the mappings to the OCI runtime as-is. Related to KEP-127. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Now KEP-127 relies on idmap mounts to do the ID translation and we won't do any chowns in the kubelet. This patch just removes the usage of GetHostIDsForPod() in operationexecutor to do the chown, and also removes the GetHostIDsForPod() method from the kubelet volume interface. Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
Latest changes to KEP-127 removed that phase, so let's stop reserving those IDs for that. While we are there, we replace 0 for 0*65536 as before we had a bug that we were not multiplying the index, to avoid bugs in the future. Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
To that end, we need to add one kubelet getter listPodsFromDisk(). Other than that, it is a pretty trivial move. Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
LGTM label has been added. Git tree hash: f5d61df8aeccd78ecd8e9b66f0d063b71702228f
|
/retest |
cc @dobsonj |
return volumetypes.NewOperationContext(eventErr, detailedErr, migrated) | ||
} | ||
} | ||
|
||
// Execute mount |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the idea here is to, instead of kubelet giving permissions to mapped uid/gid, we are going to ask CRI to idmap these mount points when creating bind mounts? So no change in volumemanager code is necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes exactly, the idea is to use idmapped mounts (through the CRI) to cancel the effect on volumes of running inside a user namespace
I think change is fine for alpha implementation with understanding that current limitation of idmapped mounts will prevent this from working in case of brown field volumes with non-uniform uid/gid permissions. But we are currently not targeting those anyways. Lets merge this, but I also want to hear from @dobsonj who has been tinkering in this area. /approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: giuseppe, gnufied, mrunalp, rata The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
I think this looks fine, I'd just ask that we document which container runtimes support this so it's clear for users of this feature. |
What type of PR is this?
/kind feature
What this PR does / why we need it:
it adds support for user namespace for stateless pods
Which issue(s) this PR fixes:
Special notes for your reviewer:
rebased on top of: #116249
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: