Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix limitranger to handle latent caches without live lookups every time #21470

Merged
merged 2 commits into from
Feb 19, 2016

Conversation

deads2k
Copy link
Contributor

@deads2k deads2k commented Feb 18, 2016

Fixes #21234

If a limit range and a pod are all created very close together, then the cache for the limitranger will not be up to date and I'll be able to sneak a pod in.

It is too expensive to do a live lookup every time. Instead, and empty answer from the indexer is used to do a lookup in a lru cache of live lookups. If that results in a cache miss, then an actual live listing is done and the lru cache is primed.

This brought in a new LRU cache library which is thread-safe (the existing one is not).

@k8s-github-robot
Copy link

Labelling this PR as size/L

@k8s-github-robot k8s-github-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Feb 18, 2016
return nil
lruList, ok := l.liveLookupCache.Get(a.GetNamespace())
if !ok {
liveList, err := l.client.Core().LimitRanges(a.GetNamespace()).List(api.ListOptions{})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't this prone to being wrong if the LRU cache is never cleared? for example, if you add a limit range to a namespace, and later remove it, the LRU cache could have the wrong data? also what happens if i delete and readd the same namespace, but the new namespace has no limit range? this looks like it could return wrong data in that scenario?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add a ttl to the cache entry to timebound the wrongness. One minute ought to do for the majority of cases. Once you're beyond that the locality doesn't help.

@derekwaynecarr
Copy link
Member

I would prefer a flag to admission controller that says to always do a live look-up, and downstream we can run with that flag enabled... that seems less error-prone and brings it fewer dependencies.

@k8s-bot
Copy link

k8s-bot commented Feb 18, 2016

GCE e2e build/test failed for commit 91b2e5ad1753d6afb898944e3b269e62ee3b19bb.

@derekwaynecarr
Copy link
Member

After in person chat, I see the point that its best to fail by default if LimitRange is viewed as a more formal aspect of the security apparatus. With a TTL on the LRU cache to handle re-using namespace names, this is fine.

@derekwaynecarr
Copy link
Member

Actually, the TTL is problematic, since it will keep checking for those namespaces that don't use it at all... I guess if that is a major concern, they can choose to not run with the admission controller enabled.

@deads2k
Copy link
Contributor Author

deads2k commented Feb 18, 2016

Actually, the TTL is problematic, since it will keep checking for those namespaces that don't use it at all... I guess if that is a major concern, they can choose to not run with the admission controller enabled.

And the re-lookup boundary would be hit only a couple times per minute, not every request.

@derekwaynecarr
Copy link
Member

sgtm

@deads2k deads2k force-pushed the fix-limit-ranger branch 2 times, most recently from 2826422 to 324ece0 Compare February 18, 2016 17:47
@deads2k
Copy link
Contributor Author

deads2k commented Feb 18, 2016

ttl added.

@derekwaynecarr
Copy link
Member

cool, lgtm.

@derekwaynecarr derekwaynecarr added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 18, 2016
@k8s-bot
Copy link

k8s-bot commented Feb 18, 2016

GCE e2e build/test failed for commit 2826422ccfc25af04d335bde53655e5b293aab72.

@k8s-bot
Copy link

k8s-bot commented Feb 18, 2016

GCE e2e test build/test passed for commit 324ece0b1bcbee018ce1442ab96a0cdacc1a3314.

@deads2k deads2k added lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Feb 18, 2016
@k8s-bot
Copy link

k8s-bot commented Feb 18, 2016

GCE e2e build/test failed for commit a34da7f3649c22a591ce004145b514ce26ede60b.

@deads2k deads2k added lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Feb 19, 2016
@k8s-bot
Copy link

k8s-bot commented Feb 19, 2016

GCE e2e test build/test passed for commit c3ef5730a977b15cdda336b3bca9048c1f210097.

@k8s-github-robot
Copy link

PR changed after LGTM, removing LGTM.

@k8s-github-robot k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 19, 2016
@k8s-bot
Copy link

k8s-bot commented Feb 19, 2016

GCE e2e test build/test passed for commit db8778fcff6c15af5070093d41a47ad555052780.

@nikhiljindal
Copy link
Contributor

This should fix #21234.
@deads2k You need to run hack/update-godep-licenses.sh to get travis to pass.

@deads2k
Copy link
Contributor Author

deads2k commented Feb 19, 2016

This should fix #21234.
@deads2k You need to run hack/update-godep-licenses.sh to get travis to pass.

Thanks, repushed

@deads2k deads2k added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 19, 2016
@k8s-bot
Copy link

k8s-bot commented Feb 19, 2016

GCE e2e test build/test passed for commit 24d5329.

@deads2k deads2k added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Feb 19, 2016
@deads2k
Copy link
Contributor Author

deads2k commented Feb 19, 2016

fix for p1 flake

@k8s-github-robot
Copy link

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-bot
Copy link

k8s-bot commented Feb 19, 2016

GCE e2e test build/test passed for commit 24d5329.

@k8s-github-robot
Copy link

Automatic merge from submit-queue

k8s-github-robot pushed a commit that referenced this pull request Feb 19, 2016
@k8s-github-robot k8s-github-robot merged commit 94ad715 into kubernetes:master Feb 19, 2016
@deads2k deads2k deleted the fix-limit-ranger branch February 26, 2016 18:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

e2e flake: LimitRange [It] should create a LimitRange with defaults and ensure pod has those defaults applied.
7 participants