kubelet: ensure secret pulled image #114847
Conversation
Force-pushed from 9f8b7f9 to b618e2e.
/priority important-soon
Force-pushed from b618e2e to 3dbb9d7.
Force-pushed from 3dbb9d7 to 3dcb3d1.
```go
// if the image is in the ensured secret pulled image list, it is pulled by secret
pulledBySecret := m.ensureSecretPulledImages[imageRef] != nil
```
how is this in-memory map accurate if the kubelet restarts?
checkpointing was out then in .. now out again for the alpha.. TODO: restore from checkpoint..
> checkpointing was out then in .. now out again for the alpha.. TODO: restore from checkpoint..
Ok, sorry for asking questions which have already been answered :-)
Seems like checkpointing is the hard part of the problem. I'll defer to node approvers on that decision, but I'm a little surprised we're deferring tackling it.
I was the one that proposed we drop the checkpointing. The motivation was that the recheck period was proposed to be 24h, and I think kubelet restarts are an infrequent operation. Redundantly checking the creds is part of the feature's intention, so the admin is opting into checking more than necessary, and that's the point. Not having checkpointing reduces the code complexity while not materially changing the behavior, IMO.
If I'm reading correctly, the current implementation means images previously pulled to the node using pull secrets are treated as non-secret-pulled images after kubelet restart and would never be required to revalidate their pull credentials. Am I reading that correctly?
https://github.com/kubernetes/kubernetes/pull/94899/files#r691688013 prior discussion on this context
The point of this KEP is to add "unnecessary" rechecks; the original point was to provide a path to get off pull always.
The desire to do intermittent rechecks (e.g. after kubelet restart or after a duration), for better security than pull-if-not-present and better performance than pull always, is in conflict with the desire to not have any rechecks, because a recheck might fail in cases where a user does not use pull always or pull never and is expecting pull-if-not-present to behave like pull never.
With rechecking enabled for a duration, in the scope of the if-not-present policy, the checkpointing has minimal value.
With rechecking disabled for a duration but enabled for multi-pod permission, in the scope of the if-not-present policy, the checkpointing is needed before we go beta; otherwise, how do we know which pre-existing images are intentionally preloaded and which were previously pulled by secret? We would need to take the new presumption that preloading was just for layers, and we should recheck manifest permissions after restart.
So it would seem the resolution for alpha would be to check whether either of the recheck signals is enabled and, if not, disable the feature gate and possibly warn users that they should employ the pull always admission controller.
If pullImageSecretRecheckPeriod > 0 and KubeletEnsureSecretPulledImages is enabled, do the recheck-related logic.
This will keep the original behavior if we don't set the recheck period, even when we enable the FG by default in later releases or manually enable it in v1.30.
Does this make sense?
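A minimal sketch of that gating, with hypothetical helper and parameter names (not this PR's actual code): rechecks run only when both the feature gate is on and a non-zero period is configured, so clusters that never set the period keep today's behavior.

```go
// Hypothetical sketch of the gating described above; names are illustrative.
package main

import (
	"fmt"
	"time"
)

// recheckLogicEnabled reports whether the kubelet should run any recheck logic.
func recheckLogicEnabled(featureGateEnabled bool, recheckPeriod time.Duration) bool {
	return featureGateEnabled && recheckPeriod > 0
}

func main() {
	fmt.Println(recheckLogicEnabled(true, 0))            // false: gate on, period unset -> original behavior
	fmt.Println(recheckLogicEnabled(true, 24*time.Hour)) // true: admin opted in to rechecks
}
```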
I'm actually out of the office today, so I can't give this more time until next week. The more I look at this, the more I don't see how this can work as expected (reliably make new pods/pulls prove they have access AND not destabilize already running/working pods if a registry goes unavailable) without the kubelet keeping track, across restarts, of the images it used pull credentials to fetch.
There's two questions the kubelet has to answer:
- did image X which is already pulled to the node require image pull credentials?
- if so, are the image pull credentials this pod requesting image X is providing good?
This PR assumes all images that exist at kubelet start don't require credentials, only keeps track of credential-enabled image pulls the kubelet does in-memory, then forgets them on restart.
The "periodically check" approach looks like it will not actually protect images that exist on startup from use by uncredentialed pods. That seems like a fundamental gap.
The "periodically check" approach also looks like it will put existing working credentialed pods at risk of registry outage. Is it actually intended that pods which had image pull credentials and proved their pull credential was valid on their initial start on a node could stop working later because the registry is unavailable? That doesn't seem right.
#94899 (comment) and #94899 (comment) were the feedback on the last attempt of this feature, and it looks like we're in essentially the same position now.
The two stories from the KEP: https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2535-ensure-secret-pulled-images#user-stories
- Story 1: A user with multiple tenants will be able to support all image pull policies without concern that one tenant will gain access to an image that they don't have rights to.
- Story 2: A user will no longer have to inject the Pull Always image pull policy to ensure all tenants have rights to the images that are already present on a host.
The aim of the KEP is to gain an image pull policy that is between `Always` and the current `IfNotPresent`:
- Image pull policy `Always`: trust nothing.
- Image pull policy `IfNotPresent`: trust everything on the disk, with no checking of the image pull secret.
- What we want is something like `IfNotPresentWithTrustedAuthPeriod`:
  - we add a recheck/trust period to control the trust level: 24h is like a daily trust credit; 1y may be a long trust credit. The node admin can control its trust for an image pull credential check result.
The `PullImageSecretRecheckPeriod` naming is somewhat inaccurate, as we don't recheck the image secret every 12h if it is set to 12h. The check happens only when a container starts or restarts, IIUC. It will not kill a pod or container if it finds the credential is out of date.
What should be done?
- We should keep the behavior unchanged even after this feature is GAed.
- When a node admin opts in by adding a recheck period (in other words, this is for admins who today only allow their users to use the `Always` ImagePullPolicy), I prefer that restarting the kubelet requires a recheck, which will be more secure for them. The admin can control the security/trust level using the recheck period. This is also why I think having no checkpoint is acceptable.
pkg/kubelet/images/image_manager.go (outdated)

```go
ensuredInfo := digest.Auths[hash]
if ensuredInfo != nil && ensuredInfo.ensured && ensuredInfo.lastEnsuredDate.Add(m.pullImageSecretRecheckPeriod.Duration).After(time.Now()) {
```
reading this line, it seems like there are two aspects here:
- How often we recheck that a previously-valid credential is still valid – `pullImageSecretRecheckPeriod` makes sense to control this.
- Whether we ever check if a pod's pull secrets were valid – using `pullImageSecretRecheckPeriod` for that is really weird to me... shouldn't the node admin be able to turn on cross-pod checking without also setting a revalidation period?
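A sketch of how the two aspects could be configured independently; the field names here are illustrative, not from this PR:

```go
// Illustrative only: a boolean opts the node in to cross-pod credential checking,
// while the recheck period separately bounds how long a successful check is trusted.
package kubeletimages

import "time"

type ensureImagePullConfig struct {
	CrossPodVerification bool          // verify other pods' pull secrets for already-present images?
	RecheckPeriod        time.Duration // how long a successful check remains valid
}

func mustRevalidate(cfg ensureImagePullConfig, lastEnsured, now time.Time) bool {
	if !cfg.CrossPodVerification {
		return false // admin has not opted in; behave as today
	}
	// once opted in, a prior successful check is trusted only inside the window
	return now.After(lastEnsured.Add(cfg.RecheckPeriod))
}
```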
If we have a boolean configuration for cross-pod checking that can be set independently, would we then default to some recheck period (cache invalidation) if it's not set, or keep assuming image access for the entry in the cache for as long as the kubelet is running (the current in-memory impl)?
> If we have a boolean configuration for cross-pod checking that can be set independently, would we then default to some recheck period (cache invalidation) if it's not set, or keep assuming image access for the entry in the cache for as long as the kubelet is running (the current in-memory impl)?
Not sure... I'd probably default both to current behavior and let admins opt into things that could break or disrupt currently working setups, but I hadn't thought about it for long.
good point... let's add another config for the ensured by cross-pod checking option. I like the name.
What do you think about the default for the cross-pod checking option? Same as for the recheck period (defaults to off?)
> What do you think about the default for the cross-pod checking option? Same as for the recheck period (defaults to off?)
Yeah, my starting position is always to default to settings that won't make an admin that failed to read 40 pages of release notes come hunt us down because we broke them on upgrade
sounds good. We will have to document that setting the boolean and not the recheck period could have implications :)
If recheck is disabled (with a recheck duration of 0s), a new pod with the same hash should be checked again.
It sounds like a valid corner case.
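Under the window-style check sketched earlier (same hypothetical shape, not this PR's code), a 0s period makes any prior successful check immediately stale, which matches this corner case:

```go
// With a 0s recheck period, lastEnsured + 0 is already in the past, so the window
// test always answers "revalidate": a new pod with the same hash is checked again.
package main

import (
	"fmt"
	"time"
)

func main() {
	lastEnsured := time.Now().Add(-time.Minute) // credential verified a minute ago
	var period time.Duration                    // 0s: no trust window
	fmt.Println(time.Now().After(lastEnsured.Add(period))) // true -> recheck
}
```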
```go
// PullImageSecretRecheckPeriod defines the duration to recheck the pull image secret.
// By default, the kubelet will recheck the pull image secret every 24 hours (1d).
PullImageSecretRecheckPeriod *metav1.Duration `json:"pullImageSecretRecheckPeriod"`
```
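A hedged sketch of applying the 24h default stated in the field comment; the struct and defaulting function below are stand-ins, not the PR's actual kubelet config plumbing:

```go
// Stand-in types: only the *metav1.Duration field mirrors the snippet above.
package kubeletconfig

import (
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

type imagePullConfig struct {
	PullImageSecretRecheckPeriod *metav1.Duration
}

func setDefaults(cfg *imagePullConfig) {
	if cfg.PullImageSecretRecheckPeriod == nil {
		// unset -> 24h, matching the documented default in the field comment
		cfg.PullImageSecretRecheckPeriod = &metav1.Duration{Duration: 24 * time.Hour}
	}
}
```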
default change and comment lgtm. traced through to where this is used and had a question about whether the revalidation duration is what we want to use to decide whether to ever validate a pull secret
see proposed style change to named return vals...
@pacoxu pls ignore this style change.. the diff is too big.. see comment below from Jordan
I'd rather minimize the diff... returning data (imageRef) alongside errors makes me wonder what is using that data in error cases
agreed.. I think it was a mistake to return imageRef in that error case.
Force-pushed from 3d40dd9 to 0ad580e.
Force-pushed from 0ad580e to 9d2b749.
…r IfNotPresent Image
Force-pushed from 9d2b749 to 5b444e4.
pkg/kubelet/images/image_manager.go (outdated)

```go
// successful pull no auth hash returned, auth was not required so we should reset the hashmap for this
// imageref since auth is no longer required for the local image cache, allowing use of the ImageRef
// by other pods if it remains cached and pull policy is PullIfNotPresent
delete(m.ensureSecretPulledImages, imageRef)
```
I kinda feel like pullCredentialsHash == "" should be handled in a separate mechanism. As in: if we remove images with an empty credentials hash from this map, then we treat the first pull after kubelet restart and the first pull after the recheck period as pullCredentialsHash == "". Instead, can this be skipped, or can we even add a "" entry into the map, and then treat "not present in the map" as "always recheck"?
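A minimal sketch of that alternative, with an assumed map layout and names: keep a "" entry for credential-free pulls instead of deleting the record, so absence from the map keeps a single meaning ("this kubelet never verified this image, so recheck").

```go
// Hypothetical map layout: imageRef -> credential hash ("" = anonymous) -> last successful check.
package kubeletimages

import "time"

type ensuredImages map[string]map[string]time.Time

func recordAnonymousPull(ensured ensuredImages, imageRef string) {
	if ensured[imageRef] == nil {
		ensured[imageRef] = map[string]time.Time{}
	}
	ensured[imageRef][""] = time.Now() // instead of delete(ensured, imageRef)
}
```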
If the image needs no hash at first and then later needs auth, this would be a problem.
If recheck is enabled, the recheck for no auth image seems to be needed as well.
@pacoxu
What do you think about storing an "anonymous" entry instead of starting over (via that delete operation) when anonymous is successful? If we go that route we can check for a known anonymous success first, then if it's not in the list, check each hash. With duration checking enabled we can also recheck anonymous.
@haircommander agree with your point to add a "" entry into the map; "" or "anonymous" to be more explicit
iow isEnsuredBySecret() needs to be improved to isEnsured() // By Secret || Anonymous .. when anonymous worked for a particular registry, that image is now in the cache and thus any private pull with auth is moot, at least until another policy says to recheck it now..
I'm taking the presumption that we are not going to check whether an image was pulled from registry a or registry b.. the image in the k8s.io image cache is the same bytes either way..
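A rough sketch of the broadened isEnsured() idea (assumed shape, not this PR's code): an image counts as ensured if an anonymous pull already succeeded for it recently enough, or if this pod's credential hash was verified within the window.

```go
// Hypothetical: auths maps credential hash ("" = anonymous) to the last successful check.
package kubeletimages

import "time"

func isEnsured(auths map[string]time.Time, credHash string, recheckPeriod time.Duration, now time.Time) bool {
	fresh := func(hash string) bool {
		t, ok := auths[hash]
		return ok && now.Before(t.Add(recheckPeriod))
	}
	// anonymous success first: the cached bytes are the same regardless of which
	// registry or credential originally produced them
	return fresh("") || fresh(credHash)
}
```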
… is enabled, do recheck logic - if recheck is enabled, recheck no auth image (pullCredentialsHash=="") as well
Force-pushed from 7a120ec to 51f8822.
/hold
/milestone clear
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
/close
@pacoxu: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Rebased on #94899 by @mikebrow
/kind feature
/cc @mikebrow
Design Details: track a hash map for the credentials in kubelet
kubernetes/enhancements#3532
What to store
Where to store the data:
- `ensuredPulledImage` in node status spec
- a cpu-policy-state-like file (should it be secured by an encryption scheme?) [No API change, but a new file on every node.]
Should the stored data be outdated after several hours for security reasons? Maybe 1, 6, or 12 hours.
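A hypothetical sketch of what an on-node state file for this data could look like (names and layout are assumptions; the conversation above notes checkpointing was ultimately dropped for alpha). Entries carry their last-ensured time so the data can be aged out after a few hours, as asked above.

```go
// Sketch of a checkpoint file: imageRef -> credential hash -> last successful check.
package kubeletimages

import (
	"encoding/json"
	"os"
	"time"
)

type ensuredImageCheckpoint struct {
	Images map[string]map[string]time.Time `json:"images"`
}

func writeCheckpoint(path string, cp ensuredImageCheckpoint) error {
	data, err := json.Marshal(cp)
	if err != nil {
		return err
	}
	// 0600: the file effectively records which pull secrets were usable on this node
	return os.WriteFile(path, data, 0o600)
}
```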
**Scenarios that should be taken care of**
TODO List
- code complete
- [x] feature gate added for issue comment
- unit buckets complete.. for FG on and off
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:
Test demo: