Update Rescheduler's manifest #65454

bsalamat · 2018-06-25T23:39:25Z

What this PR does / why we need it: Updates Rescheduler's manifest to use version 0.4.0

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Special notes for your reviewer:

Release note:

Update Rescheduler's manifest to use version 0.4.0.

vishh · 2018-06-25T23:51:31Z

@bsalamat Can you add more detail on what this change means for k8s on GCE?

jberkus · 2018-06-26T00:05:10Z

adding priority

/priority critical-urgent

bsalamat · 2018-06-26T00:53:17Z

The Rescheduler has been changed to work with the scheduler's preemption. The latest version evicts Pods only when a critical DaemonSet Pod cannot be scheduled; other critical system Pods rely on the scheduler's preemption logic to be scheduled.
Before these changes, the Rescheduler would evict Pods on behalf of any critical system Pod. Running an older Rescheduler on Kubernetes 1.11 could therefore cause double preemption (once by the default scheduler and once by the Rescheduler) when a critical system Pod stays pending for lack of resources in the cluster.
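
The division of labor described in this comment can be sketched as a small decision function. This is only an illustrative model, not the Rescheduler's actual code: the `PendingPod` type, the `legacy` flag, and all function names are hypothetical, and the sketch assumes that scheduler preemption covers exactly the critical Pods that are not DaemonSet Pods.

```go
package main

import "fmt"

// PendingPod is a hypothetical, simplified view of a Pod that is stuck in
// Pending because the cluster lacks resources.
type PendingPod struct {
	Name      string
	Critical  bool // marked as a critical system Pod
	DaemonSet bool // created by a DaemonSet
}

// reschedulerEvicts models whether the Rescheduler would evict other Pods to
// make room for p. legacy=true models versions before 0.4.0, which acted for
// any critical Pod; legacy=false models 0.4.0, which acts only for critical
// DaemonSet Pods.
func reschedulerEvicts(p PendingPod, legacy bool) bool {
	if !p.Critical {
		return false
	}
	if legacy {
		return true
	}
	return p.DaemonSet
}

// schedulerPreempts models the default scheduler's preemption in 1.11.
// Assumption for this sketch: it covers critical Pods that are not
// DaemonSet Pods.
func schedulerPreempts(p PendingPod) bool {
	return p.Critical && !p.DaemonSet
}

// doublePreemption reports whether both components would act for the same Pod.
func doublePreemption(p PendingPod, legacy bool) bool {
	return reschedulerEvicts(p, legacy) && schedulerPreempts(p)
}

// describe formats one row of the comparison printed by main.
func describe(p PendingPod, legacy bool) string {
	return fmt.Sprintf("%-22s legacy=%-5t rescheduler=%-5t scheduler=%-5t double=%t",
		p.Name, legacy, reschedulerEvicts(p, legacy), schedulerPreempts(p),
		doublePreemption(p, legacy))
}

func main() {
	pods := []PendingPod{
		{Name: "critical-daemonset-pod", Critical: true, DaemonSet: true},
		{Name: "critical-system-pod", Critical: true, DaemonSet: false},
	}
	for _, p := range pods {
		for _, legacy := range []bool{true, false} {
			fmt.Println(describe(p, legacy))
		}
	}
}
```

Under these assumptions, only the combination of an older Rescheduler and a critical non-DaemonSet Pod triggers both components, which is the double preemption the manifest update is meant to avoid.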

vishh · 2018-06-26T00:55:00Z

/lgtm
/approve

k8s-ci-robot · 2018-06-26T00:55:06Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bsalamat, vishh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~cluster/gce/OWNERS~~ [vishh]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-github-robot · 2018-06-26T00:55:07Z

[MILESTONENOTIFIER] Milestone Pull Request: Up-to-date for process

@bsalamat @vishh

Pull Request Labels

sig/scheduling: Pull Request will be escalated to these SIGs if needed.
priority/critical-urgent: Never automatically move pull request out of a release milestone; continually escalate to contributor and SIG through all available channels.
kind/feature: New functionality.

ravisantoshgudimetla · 2018-06-26T01:43:00Z

/retest

AishSundar · 2018-06-26T03:11:43Z

/retest

AishSundar · 2018-06-26T04:28:54Z

/retest

fejta-bot · 2018-06-26T07:25:58Z

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel comment for consistent failures.

fejta-bot · 2018-06-26T10:34:27Z

/retest

fejta-bot · 2018-06-26T13:22:27Z

/retest

fejta-bot · 2018-06-26T16:10:58Z

/retest

bsalamat · 2018-06-26T17:21:08Z

/retest

yguo0905 · 2018-06-26T17:37:33Z

/retest

yguo0905 · 2018-06-26T18:15:26Z

/retest

ravisantoshgudimetla · 2018-06-26T18:46:38Z

Is this a network issue? When I skimmed through the logs, I kept seeing the following error:

{default-scheduler } FailedScheduling: 0/5 nodes are available: 1 node(s) were unschedulable, 5 node(s) had unavailable network, 5 node(s) were not ready.

bsalamat · 2018-06-26T18:53:35Z

I think tests should be fine now, if they finish!

ravisantoshgudimetla · 2018-06-26T19:36:07Z

@bsalamat So, what changed to make the tests pass? Was this related to something on the infra side?

yguo0905 · 2018-06-26T19:38:01Z

The tests were failing because the upgraded rescheduler:v0.4.0 image had not been published yet. They pass now that the image is available.
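
One way to catch this kind of failure before bumping a manifest is to ask the registry whether the tag exists. The following is a minimal sketch, assuming a registry that implements the Registry HTTP API v2 manifest endpoint (`/v2/<name>/manifests/<reference>`) and needs no authentication; the helper names `manifestURL`, `imageExists`, and `newStubRegistry` are hypothetical, and the example runs against a local stub server rather than a real registry.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// manifestURL builds the Registry HTTP API v2 manifest endpoint for repo:tag.
func manifestURL(registry, repo, tag string) string {
	return fmt.Sprintf("%s/v2/%s/manifests/%s", registry, repo, tag)
}

// imageExists sends a HEAD request for the tag's manifest and maps
// 200 to true and 404 to false. Any other status is returned as an error.
// Real registries may also require auth tokens and Accept headers, which
// this sketch leaves out.
func imageExists(client *http.Client, registry, repo, tag string) (bool, error) {
	resp, err := client.Head(manifestURL(registry, repo, tag))
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	switch resp.StatusCode {
	case http.StatusOK:
		return true, nil
	case http.StatusNotFound:
		return false, nil
	default:
		return false, fmt.Errorf("unexpected status %s", resp.Status)
	}
}

// newStubRegistry starts a local test server that pretends to host exactly
// one image tag, rescheduler:v0.4.0. It stands in for a real registry.
func newStubRegistry() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path == "/v2/rescheduler/manifests/v0.4.0" {
			w.WriteHeader(http.StatusOK)
			return
		}
		http.NotFound(w, r)
	}))
}

func main() {
	stub := newStubRegistry()
	defer stub.Close()

	// v9.9.9 is a made-up tag used to show the "not published" case.
	for _, tag := range []string{"v0.4.0", "v9.9.9"} {
		ok, err := imageExists(stub.Client(), stub.URL, "rescheduler", tag)
		fmt.Printf("rescheduler:%s published=%t err=%v\n", tag, ok, err)
	}
}
```

A check like this could run before tests are triggered, so that a missing image shows up as an explicit error instead of a Pod that never starts.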

k8s-github-robot · 2018-06-26T19:57:32Z

/test all [submit-queue is verifying that this PR is safe to merge]

ravisantoshgudimetla · 2018-06-26T20:18:59Z

@yguo0905 Thanks, but I think the scheduler wouldn't throw those errors, and the rescheduler pod won't be in Pending state for a long time if there is an issue with container image not available in registry.

bsalamat · 2018-06-26T20:23:54Z

@ravisantoshgudimetla All the tests passed after we uploaded the image. The merge bot is running the tests again for merging. The issue was definitely caused by the absence of the image.

yguo0905 · 2018-06-26T20:29:07Z

rescheduler pod won't be in Pending state for a long time if there is an issue with container image not available in registry

It's a static pod, which I guess makes a difference here.

k8s-github-robot · 2018-06-26T21:21:19Z

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.

…54-upstream-release-1.11: Automatic merge from submit-queue. Automated cherry pick of #65454 (Update Rescheduler's manifest) on release-1.11.

Update Rescheduler's manifest

2cd3664

bsalamat added this to the v1.11 milestone Jun 25, 2018

k8s-ci-robot added the release-note, size/XS, and cncf-cla: yes labels Jun 25, 2018

k8s-ci-robot requested review from jszczepkowski and vishh June 25, 2018 23:39

k8s-github-robot added the milestone/incomplete-labels label Jun 25, 2018

bsalamat requested a review from yguo0905 June 25, 2018 23:42

bsalamat added the sig/scheduling, kind/feature, and status/approved-for-milestone labels Jun 25, 2018

k8s-ci-robot added the priority/critical-urgent label Jun 26, 2018

k8s-github-robot removed the milestone/incomplete-labels label Jun 26, 2018

k8s-ci-robot assigned vishh Jun 26, 2018

k8s-ci-robot added the lgtm label Jun 26, 2018

k8s-ci-robot added the approved label Jun 26, 2018

k8s-github-robot merged commit 35d5daa into kubernetes:master Jun 26, 2018

bsalamat deleted the rescheduler_version branch June 26, 2018 22:08

yguo0905 mentioned this pull request Jun 27, 2018

Automated cherry pick of #65454: Update Rescheduler's manifest #65537

Merged
