-
Notifications
You must be signed in to change notification settings - Fork 40.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Resources outside the *kubernetes.io
namespace are integers and cannot be over-committed.
#48922
Resources outside the *kubernetes.io
namespace are integers and cannot be over-committed.
#48922
Conversation
|
I'd split compute resources into three categories:
I intend to introduce a more feature rich Compute Resource representation API in the future, which can allow for representing both integer and fractional resources across all the three categories mentioned above. tl;dr; I don't think we should deprecate OIRs. We should rather have well defined user journeys for each of these categories and let users choose which one they want to use. |
@vishh that makes a lot of sense to me. So currently the first-class resources live in the default namespace, also available to without fully-qualifying the name. In your mind would it be another namespace that distinguishes extended resources from opaque resources? IIUC that was the function of the proposed |
/unassign |
@ConnorDoyle What's the user story behind this? Resource accounting? I keep going back and forth, but it would help me a lot if you had a concrete user story. |
@timothysc here are some (others please chime in if I missed the mark):
The device plugin proposal (linked in description) takes a different approach by extending the node spec to have |
pkg/api/helper/helpers.go
Outdated
func IsOpaqueIntResourceName(name api.ResourceName) bool { | ||
return strings.HasPrefix(string(name), api.ResourceOpaqueIntPrefix) | ||
return IsCustomNamespacedResourceName(name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could alternatively do something like:
func IsIntResourceName(name api.ResourceName) bool {
return IsOpaqueIntResourceName(name) || IsExtendedResourceName(name)
}
And then update current uses of IsOpaqueIntResourceName()
in the scheduler to use IsIntResourceName()
instead. This would preserve the existing OIR prefix behavior and let resources with the new "extensions.kubernetes.io/"
prefix ride along with the existing scheduler logic for OIRs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@davidopp I like this change in general. It allows for exposing arbitrary compute resources in all domains.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vishh do you mean you like the patch as written, or the change suggested in this comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see the point in adding more and more special cased prefixes in the API.
I like the current proposal. Any resources without a domain name or with kubernetes.io
domain name will be treated as first class resources.
Everything else will be treated as integer resources until we have a better API representation for resources that allows to represent non-integer resources.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this also include "alpha.kubernetes.io/xxx" or ResourceStorageScratch or ResourceStorageOverlay? Generally I feel it is probably safer to start with:
func IsIntResourceName(name api.ResourceName) bool {
return IsOpaqueIntResourceName(name) || IsExtendedResourceName(name)
}
to make sure we don't break the current resource behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue of extensibility across quota, limitrange, UI, etc., is beyond just the APIs. I'd recommend deferring it since it requires changes to the rest of the system as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jiayingz pointed out that extensions.kubernetes.io is also part of the domain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me see if I understand. With the suggested change, alpha.kubernetes.io/*
, beta.kubernetes.io/*
, kubernetes.io/*
, and any resource not of the form xxx/*
will be considered a first class resource. All resources of the form xxx/*
that does not match alpha.kubernetes.io/*
, beta.kubernetes.io/*
, or kubernetes.io/*
will be considered opaque integer resources (or extended resources, which I admit I haven't read up on, so I don't really know what the desired semantics are for that). Is that correct?
If so, it seems fine to me, assuming it doesn't break any existing resources.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@davidopp Any resource that's of the form [xxx.]*.kubernetes.io/
or without a domain name is considered to be first class. So alpha.kubernetes.io
or foo.bar.kubernetes.io
or storage.kubernetes.io
are all first class resources.
Everything else with a domain name that's not kubernetes.io
will be considered as extended resource which are opaque to the scheduler and possibly opaque to the node.
To begin with all these extended resources are considered to be integers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that seems fine. It's really hard to predict how/when a decision like this will bite us later on, but I can't see any reason right now why it might be a problem...
Other resource APIs like Quota, Limit Range, etc., aren't extensible yet. So this PR will lead to feature requests for Quota and Limit range API extensions @davidopp |
/lgtm |
pkg/api/helper/helpers.go
Outdated
// Note: partially-qualified (unprefixed) names are assumed by default to be | ||
// in the kubernetes.io/ namespace. | ||
func IsCustomNamespacedResourceName(name api.ResourceName) bool { | ||
return strings.Contains(string(name), "/") && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since api.ResourceDefaultNamespacePrefix ends with a slash, why do you need this line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, never mind
This needs an associated issue and a |
a02bac4
to
69b5ac4
Compare
Updated to fix unit test failure. |
cdd9e59
to
5c85df5
Compare
@davidopp @vishh @thockin updated. The previous OIR behavior is not opt-in yet. I would like to work on that in a follow-up in order to unblock device plugins. It's blocked on fixing #50658. Rationale: if we make the OIR behavior opt-in, then the existing alternative behavior of ignoring the resource request would be a very bad surprise for users. We need to reject unknown first-class resources before making the change so that users get a clear fail-fast signal that they have done something wrong. |
pkg/api/v1/helper/helpers.go
Outdated
// Returns true if the resource name is not in the *kubernetes.io/ namespace. | ||
// Note: partially-qualified (unprefixed) names are assumed by default to be | ||
// in the kubernetes.io/ namespace. | ||
func IsCustomNamespacedResourceName(name v1.ResourceName) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just the inverse of the IsDefaultNamespaceResource() function in pkg/api/helper/helpers.go. Can you at least add a comment to both of them so that someone in the future who changes them will ensure they stay in sync?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I don't think we need both. Will remove one of them. Good catch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
d5e77f1
to
6d4d735
Compare
6d4d735
to
630af54
Compare
*kubernetes.io
namespace are integers and cannot be over-committed.
/lgtm |
Is this right? Given the design for device plugins, do we believe that all such plugin extensions represent integer-countable resources? I'll approve, because Vish is on it, but I'd like that spelled out somewhere. /approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ConnorDoyle, thockin, vishh Associated issue: 50473 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
@thockin it should be included in the docs for the device plugin feature (at least). Some of the discussion happened on Vish's hardware accelerator proposal. To summarize, we didn't want to encode per-resource attributes like fractional-ness and overcommit-ibility in the resource name, and there's no API structure yet for them. Our thought is that this approach has a decent chance of being forward-compatible if one is added. |
/test all [submit-queue is verifying that this PR is safe to merge] |
Automatic merge from submit-queue (batch tested with PRs 46317, 48922, 50651, 50230, 47599) |
@ConnorDoyle: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
The plan is to introduce a better API for representing resources (as discussed offline) which would then allow for opening up sharing of extended resources. Until then, extended resources are expected to not allow sharing. If we go with the option of allowing sharing now, then taking that away when a new API pipeline gets introduced will become complicated. |
/retest |
What this PR does / why we need it:
Fixes #50473
Rationale: since the scheduler handles all resources except CPU as integers, that could just be the default behavior for namespaced resources.
cc @RenaudWasTaken @vishh
Release note: