Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can bypass PodSecurityContext.SupplementalGroups by custom container image although PSP(or other policy engines) enforces the field #112879

Closed
2 of 4 tasks
everpeace opened this issue Oct 5, 2022 · 16 comments · Fixed by #113047
Assignees
Labels
area/api Indicates an issue on api area. area/security kind/documentation Categorizes issue or PR as related to documentation. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/security Categorizes an issue or PR as relevant to SIG Security. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@everpeace
Copy link
Contributor

everpeace commented Oct 5, 2022

What happened?

Even PSP or other policy engines enforces the field values of PodSecurityContext.SupplementalGroups, user can bypass the setting by building a custom container image like this.

Assume the below PSP is enforced:

kind: PodSecurityPolicy
...
spec:
  # this only allows uid=1000, gid=1000, supplementalGroups=[60000]
  runAsUser:
    ranges: { min: 1000, max: 1000 }
    rule: MustRunAs
  runAsGroup:
    ranges: { min: 1000, max: 1000 }
    rule: MustRunAs
  supplementalGroups:
    ranges: { min: 60000, max: 60000 }
    rule: MustRunAs

Then, a user can create a pod that PSP allows with a custom image. But it can bypass SupplementalGroups setting by crafting a container image.

# Dockerfile 
# In the image, uid=1000 belongs to 50000
FROM ubuntu:22.04
RUN groupadd -g 50000 bypassed-group \
    && useradd -m -u 1000 alice \
    && gpasswd -a alice bypassed-group
# Pod
kind: Pod
metadata:
  name: bypass-supplementalgroups-pod
spec:
  # the securityContext satisfies the above PSP
  securityContext:
    runAsUser: 1000
    runAsGroup: 1000
    supplementalGroups: [60000]
  containers:
  # However, because uid=1000 belongs to gid=50000 in the image,
  # the process's identity in the container has groups=50000,60000
  # Note that 50000 is bypassed by container image 
  # even though PSP enforces supplementalGroups=60000
  - image: built-above
    # the command outputs: 'uid=1000(alice) gid=1000(alice) groups=1000(alice),50000(bypassed-group),60000'
    command: ["bash", "-c", "id; true"]
...

Why does this behavior matter?

In a multi-tenant Kubernetes cluster using hostPath volumes, this behavior may be very confusing for cluster administrators.

hostPath volumes are access-controlled by uid/gid as in the usual Linux manner. In such clusters, I think it would be a common practice for the cluster administrator to use PSP or policy engine to restrict the values of PodSecurityContext.{runAsUser, runAsGroup, SupplementalGroups} field to secure private directory in hostPath volumes. However, a custom container image can easily bypass the setting and can get access to some private directory in the hostPath volume. This behavior confuses cluster administrators. They might say "What can supplementalGroups policy in PSP protect?"

See here for a more detailed and practical impact.

What did you expect to happen?

bypass-supplementalgroups-pod can have group identities defined only in PodSecurityContext.SupplementalGroups. That is, the pod's log expects to be

uid=1000(alice) gid=1000(alice) groups=1000(alice),60000

How can we reproduce it (as minimally and precisely as possible)?

See https://github.com/pfnet-research/strict-supplementalgroups-container-runtime/tree/reproduce-bypass-supplementalgroups. You can reproduce this behavior in a single command. You can try to reproduce this both with contained and cri-o.

Anything else we need to know?

Is this a security issue??

We reported this behavior to hackerone.com(#1688374) by following Kubernetes Security and Disclosure Information. Kubernetes Security Response Committee responded that this behavior "works as intended," and they recommended discussing how to handle this behavior in a public k/k issue.

However, I understand this behavior is not recognized widely. So, for clusters using NFS(or similar filesystems) as hostPath volumes, I think it's popular for supercomputers or HPC area, this behavior might cause critical consequences because many cluster administrators might understand PSP(or other policy engines) can provide enough protection to private directories in such hostPath volumes.

See here for a more detailed and practical impact.

cc/ @cjcullen (as a responder to our report)

Root Cause: ambiguous definition

In kubernetes API, Pod.SecurityContext.SupplementalGroups is defined as "A list of groups applied to the first process run in each container, in addition to the container's primary GID. If unspecified, no groups will be added to any container. Note that this field cannot be set when spec.os.name is windows."

I think this definition may be a bit ambiguous for readers regarding the gids defined in the container image. However, most popular CRI implementations(contained and cri-o) create OCI runtime spec(config.json) with merged gids defined in the container image and supplementalGroups in the pod spec like this:

{
  "ociVersion": "1.0.2-dev",
  "process": {
    "user": {
      "uid": 1000,
      "gid": 1000,
      # CRI merges group gids for uid=1000
      "additionalGids": [
        50000, # defined in container image (uid=1000 belongs to it)
        60000  # defined in securityContext.supplementalGroups
      ]
    },
...

How to resolve this issue?

added: as described in #112879 (comment), we will improve the API description first, and keep discussing new API in KEP.

First, I would like to discuss how to resolve this in the community. I propose several ways to resolve this.

  • Step 1. Improve the description in the API spec and warn this behavior in a blog or similar ways to users
    • Pros
    • Cons
      • This behavior will become official. It might break the "secure by design" principle.
  • Step 2. Add a new API: PodSecurityContext.SupplementalGroupsPolicy={Merge(default), Strict} or similar (KEP-3619)
    • Pros
      • No breaking change.
      • Users can have an official API to select the behavior.
    • Cons
      • Hard. Learge impact.
      • Needs KEP (I'm also welcome to write a KEP)
      • This affects all the CRI implementations
  • Change an API behavior
    • Pros
      • None.
    • Cons
      • Breaking change.

/area security
/sig security
/area api
/sig architecture

Tasks for Step 1

Tasks

Preview Give feedback
  1. approved area/code-generation area/kubelet area/test cncf-cla: yes kind/api-change kind/documentation lgtm needs-priority needs-triage release-note-none sig/api-machinery sig/apps sig/node sig/testing size/M triage/accepted
    derekwaynecarr liggitt
    mrunalp
  2. approved cncf-cla: yes kind/feature lgtm size/M
    haircommander
  3. kind/feature lead-opted-in sig/node stage/beta tracked/yes
    everpeace

Tasks for Step 2

Tasks

Preview Give feedback
  1. kind/feature lead-opted-in sig/node stage/beta tracked/yes
    everpeace

Kubernetes version

I checked this behavior in the below version. But I think this might affect all the recent versions.
$ kubectl version
Client Version: v1.24.6
Kustomize Version: v4.5.4
Server Version: v1.25.2

Cloud provider

OS version

# On Linux:
$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.1 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.1 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
$ uname -a
Linux reproduce-bypass-supplementalgroups-control-plane 5.10.104-linuxkit #1 SMP PREEMPT Thu Mar 17 17:05:54 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux

Install tools

Container runtime (CRI) and version (if applicable)

I tested this behavior both on containerd and cri-o.

containerd github.com/containerd/containerd v1.6.8 9cd3357b7fd7218e4aec3eae239db1f68a5a6ec6
crio version 1.25.0

Related plugins (CNI, CSI, ...) and versions (if applicable)

@everpeace everpeace added the kind/bug Categorizes issue or PR as related to a bug. label Oct 5, 2022
@everpeace everpeace changed the title Can bypass PodSecurityContext.SupplementalGroups by docker image although PSP(or other policy engines) enforces the field Can bypass PodSecurityContext.SupplementalGroups by custom container image although PSP(or other policy engines) enforces the field Oct 5, 2022
@k8s-ci-robot k8s-ci-robot added area/security sig/security Categorizes an issue or PR as relevant to SIG Security. area/api Indicates an issue on api area. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Oct 5, 2022
@everpeace everpeace changed the title Can bypass PodSecurityContext.SupplementalGroups by custom container image although PSP(or other policy engines) enforces the field Can bypass PodSecurityContext.SupplementalGroups by custom container image although PSP(or other policy engines) enforces the field Oct 5, 2022
@everpeace
Copy link
Contributor Author

I put labels of sig/security and sig/architecture. But I'd appreciate it if someone told me more suitable sig or area labels.

@nitishmalang
Copy link

assign me

@liggitt
Copy link
Member

liggitt commented Oct 10, 2022

/sig node
/remove-sig architecture

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. and removed sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. labels Oct 10, 2022
@liggitt
Copy link
Member

liggitt commented Oct 10, 2022

routing to sig-node since it owns the Pod API and kubelet / CRI runtime implications

@everpeace
Copy link
Contributor Author

everpeace commented Oct 10, 2022

Thanks liggitt for routing this to sig/node!

I look forward to hearing a response from the sig/node about how to resolve this issue in the community.

@everpeace
Copy link
Contributor Author

everpeace commented Oct 11, 2022

/assign

Although we haven't agreed on how to resolve this issue in the community, I would like to contribute to the issue if I can. Of course, I will respect that the community will decide to assign this issue to a more suitable member. I just wanted to express my motivation for this issue.

@SergeyKanzhelev
Copy link
Member

Discussed at CI triage meeting. @everpeace will improve documentation of the field as a first step and improvements can be discussed as a KEP.

/remove-kind bug
/kind documentation

@k8s-ci-robot k8s-ci-robot added kind/documentation Categorizes issue or PR as related to documentation. and removed kind/bug Categorizes issue or PR as related to a bug. labels Oct 12, 2022
@SergeyKanzhelev
Copy link
Member

/triage accepted

@kfox1111
Copy link

kfox1111 commented Nov 1, 2022

Ouch... yeah, this is unexpected behavior.

@everpeace
Copy link
Contributor Author

/reopen

@k8s-ci-robot
Copy link
Contributor

@everpeace: Reopened this issue.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 6, 2023
@everpeace
Copy link
Contributor Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 6, 2023
@k8s-triage-robot
Copy link

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

  • Confirm that this issue is still relevant with /triage accepted (org members only)
  • Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. and removed triage/accepted Indicates an issue or PR is ready to be actively worked on. labels May 5, 2024
@everpeace
Copy link
Contributor Author

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels May 5, 2024
@everpeace
Copy link
Contributor Author

KEP-3619 is alpha-released in v1.31.0. I'm closing this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/api Indicates an issue on api area. area/security kind/documentation Categorizes issue or PR as related to documentation. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/security Categorizes an issue or PR as relevant to SIG Security. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
7 participants