Description
In recent versions of containerd 2 when using user namespaces, the setgroups syscall has started failing with EPERM from inside the constructed user namespace. The cause appears to be that /proc/self/setgroups has been set to deny.
#10611 and #10607 look like the likely suspects to me; they change how the gid map is established in the user namespace provided to the container. Prior to #10611 we were writing to /proc/pid/gid_map ourselves instead of using the Go stdlib, and nothing touched /proc/pid/setgroups at all, so it was left at its default of allow.
Switching to the stdlib had the subtle side effect of a deny getting written to /proc/{cloned-pid}/setgroups by Go's forkAndExecInChild / forkAndExecInChild1 unless SysProcAttr.GidMappingsEnableSetgroups is true.
cc: @AkihiroSuda @fuweid @rata
Steps to reproduce the issue
- Start a container with userns enabled
- Read /proc/self/setgroups from the container and observe that its value is deny. Alternatively, attempt to call setgroups, such as with sudo, from within the container.
Describe the results you received and expected
Expect setgroups to be allowed, as in non-user namespaced containers and as in namespaced containers before the changes from late August.
What version of containerd are you using?
current main
Any other relevant information
#10741 seems to be about what's needed; it resolves the issue for me.
Show configuration if it is related to CRI plugin.
No response