Reconcile extended resource capacity after kubelet restart. #64784

Merged
merged 1 commit into from
Jun 6, 2018

jiayingz
Contributor

@jiayingz jiayingz commented Jun 5, 2018

What this PR does / why we need it:

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #64632

Special notes for your reviewer:

Release note:

The kubelet sets extended resource capacity to zero after it restarts. If the extended resource is exported by a device plugin, its capacity changes back to a valid value after the device plugin reconnects to the kubelet. If the extended resource is exported by an external component that patches node status capacity directly, the component should re-patch the field after the kubelet becomes ready again. In the interim, pods previously assigned such resources may fail kubelet admission, but their controllers should create new pods in response to such failures.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jun 5, 2018
@k8s-ci-robot k8s-ci-robot requested review from ncdc and sjpotter June 5, 2018 18:42
@jiayingz
Contributor Author

jiayingz commented Jun 5, 2018

/sig node

@k8s-ci-robot k8s-ci-robot added the sig/node Categorizes an issue or PR as relevant to SIG Node. label Jun 5, 2018
@jiayingz
Contributor Author

jiayingz commented Jun 5, 2018

/assign @vishh

@jiayingz
Contributor Author

jiayingz commented Jun 5, 2018

/cc @ConnorDoyle please let us know asap if you have any concerns on this change.

Contributor

@ConnorDoyle ConnorDoyle left a comment

This seems to be a safe workaround to the problem described in #64632, especially with additional context provided by @vishh here.

Is this true? With this patch, pods that consume extended resources can no longer survive a Kubelet restart because they will fail admission. If so, could we add that to the release note?

requiresUpdate := false
for k := range node.Status.Capacity {
	if v1helper.IsExtendedResourceName(k) {
		node.Status.Capacity[k] = *resource.NewQuantity(int64(0), resource.DecimalSI)
Contributor

For sanity's sake, should this also set that resource's allocatable value to zero? Otherwise this could temporarily set capacity such that allocatable > capacity. As is, allocatable would be overwritten to zero on the next kubelet sync iteration.

Contributor Author

Done.
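
The suggestion in this thread, zeroing allocatable together with capacity so that allocatable never exceeds capacity, can be sketched in isolation as follows. This is a toy model under stated assumptions, not the code merged in this PR: `nodeStatus` uses plain integers instead of `resource.Quantity`, and `isExtendedResourceName` is a simplified stand-in for `v1helper.IsExtendedResourceName`.

```go
package main

import (
	"fmt"
	"strings"
)

// isExtendedResourceName is a simplified, hypothetical stand-in for
// v1helper.IsExtendedResourceName: it treats any domain-qualified name
// outside the kubernetes.io namespace as an extended resource.
func isExtendedResourceName(name string) bool {
	return strings.Contains(name, "/") && !strings.Contains(name, "kubernetes.io/")
}

// nodeStatus is a toy model of v1.NodeStatus with integer quantities.
type nodeStatus struct {
	Capacity    map[string]int64
	Allocatable map[string]int64
}

// zeroExtendedResources sets both capacity and allocatable to zero for every
// extended resource, and reports whether any value actually changed.
func zeroExtendedResources(s *nodeStatus) bool {
	changed := false
	for name := range s.Capacity {
		if !isExtendedResourceName(name) {
			continue
		}
		if s.Capacity[name] != 0 {
			s.Capacity[name] = 0
			changed = true
		}
		// Zero allocatable as well, so it never exceeds capacity.
		if s.Allocatable[name] != 0 {
			s.Allocatable[name] = 0
			changed = true
		}
	}
	return changed
}

func main() {
	// "example.com/foo" is an assumed extended resource name.
	s := &nodeStatus{
		Capacity:    map[string]int64{"cpu": 4, "example.com/foo": 2},
		Allocatable: map[string]int64{"cpu": 4, "example.com/foo": 2},
	}
	fmt.Println(zeroExtendedResources(s), s.Capacity, s.Allocatable)
}
```

Reporting a change only when a value was non-zero is a choice made for this sketch, so that a second pass over an already-zeroed node is a no-op. Whether the merged code behaves the same way is not shown in the quoted hunk.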

@jiayingz
Contributor Author

jiayingz commented Jun 5, 2018

@ConnorDoyle Thanks a lot for the comment! I updated the release note to clarify the pod failure behavior associated with the change. PTAL.

@jiayingz
Contributor Author

jiayingz commented Jun 5, 2018

/test pull-kubernetes-local-e2e-containerized

{
	name: "no update needed without extended resource",
	existingNode: &v1.Node{
		Status: v1.NodeStatus{
Contributor

Should this test check Allocatable?

Contributor Author

Ah my bad. I did extend the test to check allocatable but forgot to push the change.

Contributor

Ah you just updated it!

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jun 5, 2018
@vishh
Contributor

vishh commented Jun 5, 2018

@jiayingz what will be the behavior after this PR for trivial kubelet restarts? Will the kubelet update capacity/allocatable even though it has a valid checkpoint?

@jiayingz
Contributor Author

jiayingz commented Jun 5, 2018

For device plugin resources, even with a valid checkpoint, we have set the resource capacity/allocatable to zero since #60856, to make sure no new pods get assigned to the node until the device plugin reconnects. Existing pods already assigned the resource can continue running, though, with the valid checkpoint in place. That logic is covered by the device manager's Allocate().

@vishh
Contributor

vishh commented Jun 5, 2018 via email

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 5, 2018
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jiayingz, vishh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 5, 2018
@vishh vishh added this to the v1.11 milestone Jun 6, 2018
@vishh vishh added priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. status/approved-for-milestone labels Jun 6, 2018
@jiayingz
Contributor Author

jiayingz commented Jun 6, 2018

/kind bug

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Jun 6, 2018
@k8s-github-robot

[MILESTONENOTIFIER] Milestone Pull Request: Up-to-date for process

@jiayingz @vishh

Pull Request Labels
  • sig/node: Pull Request will be escalated to these SIGs if needed.
  • priority/critical-urgent: Never automatically move pull request out of a release milestone; continually escalate to contributor and SIG through all available channels.
  • kind/bug: Fixes a bug discovered during the current release.

@k8s-github-robot

Automatic merge from submit-queue (batch tested with PRs 63717, 64646, 64792, 64784, 64800). If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit a32e5b6 into kubernetes:master Jun 6, 2018
k8s-github-robot pushed a commit that referenced this pull request Aug 6, 2018
…84-upstream-release-1.10

Automatic merge from submit-queue.

Automated cherry pick of #64784: Reconcile extended resource capacity after kubelet restart.

Cherry pick of #64784 on release-1.10.

#64784: Reconcile extended resource capacity after kubelet restart.
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/node Categorizes an issue or PR as relevant to SIG Node. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
6 participants