
kubelet: ensure secret pulled image #114847

Closed
wants to merge 3 commits

Conversation

@pacoxu (Member) commented Jan 5, 2023

Rebased on #94899 by @mikebrow

/kind feature
/cc @mikebrow

Implement support in the kubelet to ensure that images pulled with pod imagePullSecrets are re-authenticated when used by other pods that do not have the same credentials.
Design Details: track a hash map of the credentials in the kubelet

kubernetes/enhancements#3532

Kubelet will track, in memory, a hash map of the credentials that were successfully used to pull an image. It has been decided that the hash map will be persisted to disk in alpha.

What to store (see the sketch after this list)

  • image: string
  • auth hash (keys): string
  • ensuredBySecret: bool
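
A minimal Go sketch of what such an in-memory record could look like (type and field names are illustrative, loosely following the state-file demo below, not the PR's actual code):

package main

import "time"

// ensuredInfo records one successful credential check for an image.
type ensuredInfo struct {
	Ensured bool      // the pull succeeded with these credentials
	DueDate time.Time // when this result should be rechecked
}

// imageRecord tracks, per image, which credential hashes have
// successfully pulled it and whether a secret was required at all.
type imageRecord struct {
	Name            string                  // e.g. "registry.example.com/app:v1"
	AuthHashes      map[string]*ensuredInfo // keyed by a hash of the auth config
	EnsuredBySecret bool
}

func main() {
	// the kubelet-side map, keyed by image ref
	ensureSecretPulledImages := map[string]*imageRecord{}
	ensureSecretPulledImages["sha256:example"] = &imageRecord{
		Name: "registry.example.com/app:v1",
		AuthHashes: map[string]*ensuredInfo{
			"115b8808c3e7f073": {Ensured: true, DueDate: time.Now().Add(24 * time.Hour)},
		},
		EnsuredBySecret: true,
	}
	_ = ensureSecretPulledImages
}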

Where to store the data

  • store in node status (needs to be confirmed with sig-node): it requires an API change to add a list field like ensuredPulledImage to the node status spec.
  • store in a local file, like the CPU policy state (should it be secured by an encryption scheme?). [No API change, but a new file on every node.]
  • store in a secure volume? too complex?

Should the stored data expire after several hours for security reasons?
Maybe 1, 6, or 12 hours.
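
Sketched in Go, the expiry question reduces to a TTL comparison against the stored timestamp (a hedged sketch; the function name and windows are illustrative):

package main

import (
	"fmt"
	"time"
)

// expired reports whether a stored credential-check result is older than
// the configured trust window and must be re-verified against the registry.
func expired(lastEnsured time.Time, trustPeriod time.Duration) bool {
	return time.Now().After(lastEnsured.Add(trustPeriod))
}

func main() {
	lastEnsured := time.Now().Add(-7 * time.Hour) // last verified 7 hours ago
	fmt.Println(expired(lastEnsured, 6*time.Hour))  // true: a 6h trust window has lapsed
	fmt.Println(expired(lastEnsured, 12*time.Hour)) // false: still within a 12h window
}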

**Scenarios that should be taken care of**

  1. kubelet restart (that is why we need to store the data)
  2. pinned image: any special logic for a pinned image?
TODO List

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

- [KEP]: https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2535-ensure-secret-pulled-images/README.md

Test demo:

[root@daocloud ~]# bash -c "cat /var/lib/kubelet/image_manager_state | jq ."
{
  "images": {
    "sha256:eb6cbbefef909d52f4b2b29f8972bbb6d86fc9dba6528e65aad4f119ce469f7a": {
      "authHash": {
        "115b8808c3e7f073": {
          "ensured": true,
          "dueDate": "2023-05-26T18:00:09.802153792+08:00"
        }
      },
      "name": "daocloud.io/daocloud/dce-registry-tool:3.0.8"
    }
  }
}

[root@daocloud ~]# cat ensure.yaml
apiVersion: v1
kind: Pod
metadata:
  name: need-secret-but-has-no
spec:
  containers:
    - name: test-container
      image: daocloud.io/daocloud/dce-registry-tool:3.0.8
      command: [ "sleep", "100000" ]

---

apiVersion: v1
kind: Pod
metadata:
  name: need-secret-and-has-secret
spec:
  containers:
    - name: test-container
      image: daocloud.io/daocloud/dce-registry-tool:3.0.8
      command: [ "sleep", "100000" ]
  imagePullSecrets:
  - name: regcred

---

apiVersion: v1
kind: Pod
metadata:
  name: need-secret-and-has-wrong-secret
spec:
  containers:
    - name: test-container
      image: daocloud.io/daocloud/dce-registry-tool:3.0.8
      command: [ "sleep", "100000" ]
  imagePullSecrets:
  - name: regcred.1

[root@daocloud ~]#  kubectl get pod -o wide
NAME                               READY   STATUS             RESTARTS   AGE     IP               NODE       NOMINATED NODE   READINESS GATES
need-secret-and-has-secret         1/1     Running            0          2m5s    172.32.230.235   daocloud   <none>           <none>
need-secret-and-has-secret-2       1/1     Running            0          2m1s    172.32.230.234   daocloud   <none>           <none>
need-secret-and-has-wrong-secret   0/1     ImagePullBackOff   0          105s    172.32.230.236   daocloud   <none>           <none>
need-secret-but-has-no             0/1     ErrImagePull       0          70s     172.32.230.237   daocloud   <none>           <none>

@k8s-ci-robot k8s-ci-robot requested a review from mikebrow January 5, 2023 10:11
@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. area/kubelet sig/node Categorizes an issue or PR as relevant to SIG Node. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jan 5, 2023
@pacoxu pacoxu marked this pull request as draft January 5, 2023 10:18
@pacoxu pacoxu marked this pull request as ready for review January 5, 2023 10:20
@pacoxu pacoxu force-pushed the ensure-secret-pulled-image branch from 9f8b7f9 to b618e2e Compare January 5, 2023 10:20
@pacoxu (Member Author) commented Jan 5, 2023

/priority important-soon
/triage accepted

@k8s-ci-robot k8s-ci-robot added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jan 5, 2023
@pacoxu pacoxu force-pushed the ensure-secret-pulled-image branch from b618e2e to 3dbb9d7 Compare January 5, 2023 10:31
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 1, 2023
@pacoxu pacoxu force-pushed the ensure-secret-pulled-image branch from 3dbb9d7 to 3dcb3d1 Compare May 8, 2023 07:15
@k8s-ci-robot k8s-ci-robot removed the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label May 8, 2023
pkg/kubelet/images/image_manager.go (outdated review threads, resolved)
}

// if the image is in the ensured secret-pulled image list, it was pulled using a secret
pulledBySecret := m.ensureSecretPulledImages[imageRef] != nil
Member:

how is this in-memory map accurate if the kubelet restarts?

Member:

checkpointing was out then in .. now out again for the alpha.. TODO: restore from checkpoint..

Member:

> checkpointing was out then in .. now out again for the alpha.. TODO: restore from checkpoint..

Ok, sorry for asking questions which have already been answered :-)

Seems like checkpointing is the hard part of the problem. I'll defer to node approvers on that decision, but I'm a little surprised we're deferring tackling it.

Contributor:

I was the one that proposed we drop the checkpointing. The motivation was that the recheck period was proposed to be 24h, and I think a kubelet restart is an infrequent operation. Redundantly checking the creds is part of the feature's intention, so the admin is opting into checking more than necessary, and that's the point. Not having checkpointing reduces the code complexity while not materially changing the behavior, IMO.

Member:

If I'm reading correctly, the current implementation means images previously pulled to the node using pull secrets are treated as non-secret-pulled images after kubelet restart and would never be required to revalidate their pull credentials. Am I reading that correctly?

@mikebrow (Member) commented Mar 7, 2024:

> the point of this KEP is to add unnecessary rechecks

original point was to provide a path to get off pull always

The desire to do intermittent rechecks (e.g. after kubelet restart or after a duration), for better security than if-not-present alone and better performance than pull always, is at conflict with the desire to not have any rechecks, because a recheck might fail in cases where a user does not use pull always or pull never and is expecting pull if-not-present to behave like pull never.

With rechecking enabled for a duration, in the scope of the if-not-present policy, the checkpointing has minimal value.

With rechecking disabled for a duration but enabled for multi-pod permission, in the scope of the if-not-present policy, the checkpointing is needed before we go beta; otherwise, how do we know which pre-existing images are intentionally preloaded and which were previously pulled by secret? We would need to take the new presumption that preloading was just for layers, and we should recheck manifest permissions after restart.

So it would seem the resolution for alpha would be to check if either of the recheck signals is enabled and, if not, disable the feature gate and possibly warn users that they should employ the pull-always admission controller.

Member Author:

7a120ec

If pullImageSecretRecheckPeriod > 0 and KubeletEnsureSecretPulledImages is enabled, do the recheck-related logic.

This will keep the original behavior if we don't set the recheck period, even when we enable the FG by default in later releases or manually enable it in v1.30.

Does this make sense?
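
A self-contained Go sketch of that gate (the two field names mirror the feature gate and config field discussed in this PR; the stand-in types and the recheckEnabled helper are hypothetical):

package main

import (
	"fmt"
	"time"
)

// imageManager is a stand-in for the kubelet's image manager; the two
// fields mirror the KubeletEnsureSecretPulledImages feature gate and the
// PullImageSecretRecheckPeriod config field from this PR.
type imageManager struct {
	ensureSecretPulledImagesEnabled bool
	pullImageSecretRecheckPeriod    *time.Duration
}

// recheckEnabled mirrors the gate described above: recheck logic runs only
// when the feature gate is on AND a non-zero recheck period is configured,
// so leaving the period unset preserves the original behavior.
func (m *imageManager) recheckEnabled() bool {
	return m.ensureSecretPulledImagesEnabled &&
		m.pullImageSecretRecheckPeriod != nil &&
		*m.pullImageSecretRecheckPeriod > 0
}

func main() {
	period := 24 * time.Hour
	withPeriod := &imageManager{ensureSecretPulledImagesEnabled: true, pullImageSecretRecheckPeriod: &period}
	gateOnly := &imageManager{ensureSecretPulledImagesEnabled: true} // gate on, no period set
	fmt.Println(withPeriod.recheckEnabled(), gateOnly.recheckEnabled()) // true false
}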

Member:

I'm actually out of the office today, so I can't give this more time until next week. The more I look at this, the less I see how this can work as expected (reliably make new pods/pulls prove they have access AND not destabilize already running/working pods if a registry goes unavailable) without the kubelet keeping track, across restarts, of images it used pull credentials to fetch.

There's two questions the kubelet has to answer:

  1. Did image X, which is already pulled to the node, require image pull credentials?
  2. If so, are the image pull credentials provided by this pod requesting image X good?

This PR assumes all images that exist at kubelet start don't require credentials, only keeps track of the credential-enabled image pulls the kubelet does in memory, and then forgets them on restart.

The "periodically check" approach looks like it will not actually protect images that exist on startup from use by uncredentialed pods. That seems like a fundamental gap.

The "periodically check" approach also looks like it will put existing working credentialed pods at risk of registry outage. Is it actually intended that pods which had image pull credentials and proved their pull credential was valid on their initial start on a node could stop working later because the registry is unavailable? That doesn't seem right.

#94899 (comment) and #94899 (comment) were the feedback on the last attempt of this feature, and looks like we're in essentially the same position now.

Member Author:

The two stories from the KEP: https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2535-ensure-secret-pulled-images#user-stories.

  1. Story 1: Users with multiple tenants will be able to support all image pull policies without concern that one tenant will gain access to an image that they don't have rights to.
  2. Story 2: Users will no longer have to inject the Always image pull policy to ensure all tenants have rights to the images that are already present on a host.

The aim of the KEP is to gain an image pull policy that sits between Always and the current IfNotPresent.

  • Image pull policy Always: trust nothing.
  • Image pull policy IfNotPresent: trust everything on the disk, with no checking of the image pull secret.
  • What we want is something like IfNotPresentWithTrustedAuthPeriod:
    • we add a recheck/trust period to control the trust level: 24h is like a daily trust credit; 1y may be a long trust credit. The node admin can control how long to trust an image pull credential check result.

The PullImageSecretRecheckPeriod naming is somewhat inaccurate: if it is set to 12h, we don't recheck the image secret every 12h. The check happens only when a container starts or restarts, IIUC. It will not kill a pod or container if it finds the credential is out of date.

What should be done?

  1. We should keep the behavior unchanged even after this feature is GAed.
  2. When a node admin opts in by adding a recheck period (in other words, this is for admins who today only allow their users to use the Always image pull policy), I prefer that restarting the kubelet triggers a recheck, which will be more secure for them. The admin can control the security/trust level using the recheck period. This is also why I think having no checkpoint is acceptable.

pkg/kubelet/images/puller.go (outdated review threads, resolved)
Comment on lines 372 to 373
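// trust the cached result only if this credential hash was successfully
// ensured and the configured recheck period has not yet elapsed: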
ensuredInfo := digest.Auths[hash]
if ensuredInfo != nil && ensuredInfo.ensured && ensuredInfo.lastEnsuredDate.Add(m.pullImageSecretRecheckPeriod.Duration).After(time.Now()) {
Member:

reading this line, it seems like there's two aspects here:

  1. How often we recheck a previously-valid credential is still valid – pullImageSecretRecheckPeriod makes sense to control this.
  2. Whether we ever check if a pod's pull secrets were valid – using pullImageSecretRecheckPeriod for that is really weird to me... shouldn't the node admin be able to turn on cross-pod checking without also setting a revalidation period?

Contributor:

If we have a boolean configuration for cross-pod checking that can be set independently, would we then default to some recheck period (cache invalidation) if it is not set, or keep assuming image access for the entry in the cache for as long as the kubelet is running (the current in-memory impl)?

Member:

> If we have a boolean configuration for cross-pod checking that can be set independently, would we then default to some recheck period (cache invalidation) if it is not set, or keep assuming image access for the entry in the cache for as long as the kubelet is running (the current in-memory impl)?

Not sure... I'd probably default both to current behavior and let admins opt into things that could break or disrupt currently working setups, but I hadn't thought about it for long.

Member:

good point... let's add another config for the ensured by cross-pod checking option. I like the name.

What do you think about the default for the cross-pod checking option? Same as for the recheck period (defaults to off?)

Member:

> What do you think about the default for the cross-pod checking option? Same as for the recheck period (defaults to off?)

Yeah, my starting position is always to default to settings that won't make an admin who failed to read 40 pages of release notes come hunt us down because we broke them on upgrade.

Contributor:

sounds good. We will have to document that setting the boolean and not the recheck period could have implications :)
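
A hedged sketch of the two independent knobs being discussed, assuming hypothetical field names (only a recheck period field exists in this PR; the standalone boolean is the proposal above):

package main

import "time"

// kubeletImagePullOptions sketches the two independent knobs discussed
// above; both field names are hypothetical, not taken from the PR.
type kubeletImagePullOptions struct {
	// ensureImagePullCredentialVerification turns on cross-pod credential
	// checking on its own: a pod reusing a cached secret-pulled image must
	// present credentials that previously pulled it. Defaults to off so
	// existing setups keep working after an upgrade.
	ensureImagePullCredentialVerification bool

	// pullImageSecretRecheckPeriod, when non-zero, additionally bounds how
	// long a successful credential check stays trusted. Zero means a
	// verified credential is trusted for as long as the kubelet runs.
	pullImageSecretRecheckPeriod time.Duration
}

func main() {
	// an admin opting into cross-pod checks without periodic revalidation
	_ = kubeletImagePullOptions{ensureImagePullCredentialVerification: true}
}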

Member Author:

If recheck is disabled (recheck duration is 0s), a new pod with the same hash should be checked again.

That sounds like a valid corner case.

pkg/kubelet/images/image_manager.go (outdated review thread, resolved)
pkg/kubelet/kuberuntime/kuberuntime_image.go (review thread, resolved)

// PullImageSecretRecheckPeriod defines the duration to recheck the pull image secret.
// By default, the kubelet will recheck the pull image secret every 24 hours (1d).
PullImageSecretRecheckPeriod *metav1.Duration `json:"pullImageSecretRecheckPeriod"`
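
A sketch of how that documented 24h default might be applied during config defaulting (the stand-in types and the setDefaults helper are assumptions, not the PR's code):

package main

import (
	"fmt"
	"time"
)

// Duration stands in for metav1.Duration so the sketch is self-contained.
type Duration struct{ Duration time.Duration }

// kubeletConfig is a stand-in holding just the field shown above.
type kubeletConfig struct {
	PullImageSecretRecheckPeriod *Duration
}

// setDefaults applies the documented 24h default when the admin leaves
// the recheck period unset.
func setDefaults(c *kubeletConfig) {
	if c.PullImageSecretRecheckPeriod == nil {
		c.PullImageSecretRecheckPeriod = &Duration{Duration: 24 * time.Hour}
	}
}

func main() {
	var c kubeletConfig
	setDefaults(&c)
	fmt.Println(c.PullImageSecretRecheckPeriod.Duration) // 24h0m0s
}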
Member:

default change and comment lgtm. Traced through to where this is used and had a question about whether the revalidation duration is what we want to use to decide whether to ever validate a pull secret.

@mikebrow (Member) left a comment:

see proposed style change to named return vals...

@pacoxu pls ignore this style change.. the diff is too big.. see comment below from Jordan

pkg/kubelet/kuberuntime/kuberuntime_image.go (multiple review threads, resolved)
pkg/kubelet/container/runtime.go (outdated review thread, resolved)
@liggitt (Member) commented Mar 7, 2024

> see proposed style change to named return vals...

I'd rather minimize the diff... returning data (imageRef) alongside errors makes me wonder what is using that data in error cases

@mikebrow (Member) commented Mar 7, 2024

> see proposed style change to named return vals...
>
> I'd rather minimize the diff... returning data (imageRef) alongside errors makes me wonder what is using that data in error cases

agreed.. I think it was a mistake to return imageRef in that error case.

@pacoxu pacoxu force-pushed the ensure-secret-pulled-image branch 2 times, most recently from 3d40dd9 to 0ad580e Compare March 7, 2024 17:30
@pacoxu pacoxu force-pushed the ensure-secret-pulled-image branch from 0ad580e to 9d2b749 Compare March 7, 2024 17:39
@pacoxu pacoxu force-pushed the ensure-secret-pulled-image branch from 9d2b749 to 5b444e4 Compare March 8, 2024 08:19
// successful pull with no auth hash returned; auth was not required, so we should reset the hash map
// for this imageRef, since auth is no longer required for the local image cache, allowing use of the
// imageRef by other pods if it remains cached and the pull policy is PullIfNotPresent
delete(m.ensureSecretPulledImages, imageRef)
Contributor:

I kinda feel like pullCredentialsHash == "" should be handled by a separate mechanism. As it stands, if we remove images with an empty credentials hash from this map, then we treat the first pull after a kubelet restart and the first pull after the recheck period as pullCredentialsHash == "". Instead, can this be skipped, or can we even add a "" entry into the map, and then treat "not present in map" as "always recheck"?

Member Author:

If the image needs no auth at first and then later needs auth, this would be a problem.

If recheck is enabled, the recheck for a no-auth image seems to be needed as well.

@mikebrow (Member) commented Mar 8, 2024:

@pacoxu
What do you think about storing an "anonymous" entry instead of starting over (via that delete operation) when an anonymous pull is successful? If we go that route, we can check for a known anonymous success first and, if it's not in the list, check each hash. With duration checking enabled, we can also recheck anonymous.

@haircommander agree with your point to add a "" entry into the map; "" or "anonymous" to be more explicit

Member:

iow isEnsuredBySecret() needs to be improved to isEnsured() // By Secret || Anonymous .. when anonymous worked for a particular registry, that image is now in the cache, and thus any private pull with auth is moot, at least until another policy says to recheck it now..

I'm taking the presumption that we are not going to check whether an image was pulled from registry a or registry b.. the image in the k8s.io image cache is the same bytes either way..
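
A sketch of the direction proposed in this thread: record an explicit anonymous entry on a credential-free pull instead of deleting the image's record, and fold it into an isEnsured() check (all names and helpers are illustrative, not the PR's code):

package main

import "fmt"

// anonymousKey marks a successful pull that needed no credentials
// ("" would also work; "anonymous" is more explicit, as noted above).
const anonymousKey = "anonymous"

// ensuredHashes maps imageRef -> set of credential hashes (or the
// anonymous marker) that successfully pulled the image.
var ensuredHashes = map[string]map[string]bool{}

// recordPull stores the outcome of a successful pull instead of deleting
// the image's entry when no auth was required.
func recordPull(imageRef, credHash string) {
	if credHash == "" {
		credHash = anonymousKey
	}
	if ensuredHashes[imageRef] == nil {
		ensuredHashes[imageRef] = map[string]bool{}
	}
	ensuredHashes[imageRef][credHash] = true
}

// isEnsured checks for a known anonymous success first, then the pod's
// own hash — the isEnsuredBySecret() -> isEnsured() improvement above.
func isEnsured(imageRef, credHash string) bool {
	hashes := ensuredHashes[imageRef]
	return hashes[anonymousKey] || hashes[credHash]
}

func main() {
	recordPull("sha256:example", "") // anonymous pull succeeded
	fmt.Println(isEnsured("sha256:example", "115b8808c3e7f073")) // true: image is public
}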

… is enabled, do recheck logic

- if recheck is enabled, recheck no-auth images (pullCredentialsHash == "") as well
@pacoxu pacoxu force-pushed the ensure-secret-pulled-image branch from 7a120ec to 51f8822 Compare March 8, 2024 15:16
@pacoxu (Member Author) commented Mar 9, 2024

/hold
for there are still some things that are not clear.

/milestone clear

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jun 7, 2024
@k8s-ci-robot (Contributor)

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle rotten
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jul 7, 2024
@pacoxu (Member Author) commented Jul 11, 2024

/close
ref #125817

@k8s-ci-robot (Contributor)

@pacoxu: Closed this PR.

In response to this:

/close
ref #125817

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
