[PHASE 1] Opaque integer resource accounting. #31652
Conversation
Thanks for your pull request. It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please visit https://cla.developers.google.com/ to sign. Once you've signed, please reply here.
Can one of the admins verify that this patch is reasonable to test? If so, please reply "ok to test". This message will repeat several times in short succession due to jenkinsci/ghprb-plugin#292. Sorry.
6 similar comments
@balajismaniam and I are both covered by the Intel corporate CLA (org emails are public in our GitHub profiles).
## Terms
### Opaque node-level resource
A resource associated with a node that has a consumable capacity, which is handled uniformly by the scheduler and Kubelet. (If there are specializations that hang off of the resource name, then it is not opaque!)
s/(If there are specializations that hang off of the resource name, then it is not opaque!)/
It makes more sense to define what you mean by opaque. In general, the name "opaque" lends itself to more confusion; we used to call these "machine limits" in Condor. Right now, the definition of an opaque resource is opaque.
@timothysc squashed, just awaiting CI results; then everything should be good to go. Thanks all for the reviews.
@timothysc checks are green, could you please add your LGTM?
@ConnorDoyle could you add your release note above too, please?
LGTM
@timothysc added a summary line and extended the release note in the PR description.
Jenkins verification failed for commit 3f5ad42ac98463d2a4dd5dbfe312b38f53557c3e. Full PR test history. The magic incantation to run this job again is
Jenkins GCE Node e2e failed for commit 3f5ad42ac98463d2a4dd5dbfe312b38f53557c3e. Full PR test history. The magic incantation to run this job again is
- Prevents kubelet from overwriting capacity during sync.
- Handles opaque integer resources in the scheduler.
- Adds scheduler predicate tests for opaque resources.
- Validates opaque int resources (see the sketch below):
  - Ensures supplied opaque int quantities in node capacity, node allocatable, pod request, and pod limit are integers.
  - Adds tests for the new validation logic (node update and pod spec).
- Adds e2e tests for opaque integer resources.
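As a hedged illustration of the validation rules above (the resource suffix, pod name, and image are assumptions for the example), a pod requesting a fractional quantity of an opaque integer resource should be rejected at creation time:

```sh
# Sketch: fractional opaque-int quantities should fail API validation.
# "bananas", the pod name, and the image are illustrative only.
cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: fractional-banana
spec:
  containers:
  - name: pause
    image: gcr.io/google_containers/pause:2.0
    resources:
      requests:
        pod.alpha.kubernetes.io/opaque-int-resource-bananas: 500m
EOF
# Expected: the request is rejected because 500m (0.5) is not an integer.
```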
@ConnorDoyle You're failing Jenkins verification. I think you need to run ./hack/update_owners.py to fix that.
@foxish thanks, just pushed a commit for that. The verification script is passing locally for me now.
Automatic merge from submit-queue
Automatic merge from submit-queue (batch tested with PRs 36693, 40154, 40170, 39033)

Minor hygiene in scheduler.

**What this PR does / why we need it**: Minor cleanups in the scheduler, related to PR kubernetes#31652.
- Unified lazy opaque resource caching.
- Deleted a commented-out line of code.

**Release note**:
```release-note
N/A
```
@ConnorDoyle in your example, the superchimp pod gets a "FailedScheduling" status. However, the behavior for CPU and memory seems to be to put the pod in a "Pending" state, which means that when the resource becomes available the pod can be rescheduled. It seems to me that the current behavior for opaque resources means that pods which request more resources than are available fail to be rescheduled once the resources become available again. Is that correct? Is it the expected behavior, and/or is there a way to get the same behavior as CPU/memory 😄?
@RenaudWasTaken the behavior should be the same as for CPU and memory; if you see something else happening, then it's a bug. The example in the PR description may not have been kept up to date since the very first version of the patch set.
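As a quick check of that claim (a sketch only; the node name, API endpoint, and resource suffix are assumptions), a pod left Pending for lack of an opaque resource should be picked up by a later scheduling retry once capacity is raised:

```sh
# The pod sits in Pending while advertised capacity is insufficient...
kubectl get pod superchimp
# NAME         READY     STATUS    RESTARTS   AGE
# superchimp   0/1       Pending   0          5m

# ...then schedules on a retry after the node's capacity is patched up
# (assumes kubectl proxy is serving the API on localhost:8080):
curl --header "Content-Type: application/json-patch+json" \
     --request PATCH \
     --data '[{"op": "replace", "path": "/status/capacity/pod.alpha.kubernetes.io~1opaque-int-resource-bananas", "value": "10000"}]' \
     http://localhost:8080/api/v1/nodes/e2e-test-node/status
kubectl get pod superchimp
# NAME         READY     STATUS    RESTARTS   AGE
# superchimp   1/1       Running   0          6m
```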
[PHASE 1] Opaque integer resource accounting.
This change provides a simple way to advertise some amount of arbitrary countable resource for a node in a Kubernetes cluster. Users can consume these resources by including them in pod specs, and the scheduler takes them into account when placing pods on nodes. See the example at the bottom of the PR description for more info.
Summary of changes:
- Introduces opaque integer resources, advertised on nodes and requested in pod specs under resource names with the prefix `pod.alpha.kubernetes.io/opaque-int-resource-`.
Feature issue: kubernetes/enhancements#76
Design: http://goo.gl/IoKYP1
Issues:
#28312
#19082
Related:
#19080
CC @davidopp @timothysc @balajismaniam
Release note:
Usage example
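First, advertise the resource by PATCHing the node's status. The sketch below is hedged: it assumes `kubectl proxy` is serving the API on localhost:8080, a node named `e2e-test-node`, and an example resource suffix of `bananas` (the `~1` escapes the `/` in the JSON-Pointer path):

```sh
# Advertise 5 "bananas" of capacity on the node (names are illustrative).
curl --header "Content-Type: application/json-patch+json" \
     --request PATCH \
     --data '[{"op": "add", "path": "/status/capacity/pod.alpha.kubernetes.io~1opaque-int-resource-bananas", "value": "5"}]' \
     http://localhost:8080/api/v1/nodes/e2e-test-node/status
```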
After patching, the kubelet's next sync fills in allocatable:
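A sketch of what the node status should look like once the kubelet syncs (output abridged; values illustrative):

```sh
kubectl get node e2e-test-node -o yaml
# status:
#   allocatable:
#     cpu: "2"
#     memory: "7679792Ki"
#     pod.alpha.kubernetes.io/opaque-int-resource-bananas: "5"
#     ...
#   capacity:
#     pod.alpha.kubernetes.io/opaque-int-resource-bananas: "5"
#     ...
```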
Create two pods, one that needs a single banana and another that needs a truck load:
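A hedged sketch of the two pod specs (`superchimp` appears in the discussion above; the other pod name and the image are assumptions):

```sh
# Pod that fits: requests a single banana.
cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: chimp
spec:
  containers:
  - name: pause
    image: gcr.io/google_containers/pause:2.0
    resources:
      requests:
        pod.alpha.kubernetes.io/opaque-int-resource-bananas: "1"
EOF

# Pod that cannot fit: requests far more than the node advertises.
cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: superchimp
spec:
  containers:
  - name: pause
    image: gcr.io/google_containers/pause:2.0
    resources:
      requests:
        pod.alpha.kubernetes.io/opaque-int-resource-bananas: "10000"
EOF
```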
Inspect the scheduler result and pod status:
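A sketch of the expected result: the small pod runs, while `superchimp` stays Pending with a FailedScheduling event and is retried by the scheduler (exact event wording varies by version):

```sh
kubectl get pods
# NAME         READY     STATUS    RESTARTS   AGE
# chimp        1/1       Running   0          1m
# superchimp   0/1       Pending   0          1m

kubectl describe pod superchimp
# Events:
#   Warning  FailedScheduling  ... Insufficient
#   pod.alpha.kubernetes.io/opaque-int-resource-bananas ...
```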