[scheduler] absent key in NodeToStatusMap implies UnschedulableAndUnresolvable #125197
Conversation
This reverts commit 9fcd791.
This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the `triage/accepted` label and provide further guidance.
Hi @gabesaba. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with `/ok-to-test`. Once the patch is verified, the new status will be reflected by the `ok-to-test` label.
/assign @alculquicondor

/ok-to-test

/release-note-edit
@alculquicondor: /release-note-edit must be used with a release note block.
/release-note-edit

looks like it needs
This PR may require API review. If so, when the changes are ready, complete the pre-review checklist and request an API review. Status of requested reviews is tracked in the API Review project.
```go
diagnosis := framework.Diagnosis{
	NodeToStatusMap: make(framework.NodeToStatusMap, len(allNodes)),
}
// ...
if err != nil {
	return nil, diagnosis, err
}

// Run "prefilter" plugins.
preRes, s := fwk.RunPreFilterPlugins(ctx, state, pod)
```
I can't comment on an unchanged line.
So based on our implementation, at L460, we should only set a status for all nodes if the status returned by `fwk.RunPreFilterPlugins(ctx, state, pod)` has an `Unschedulable` code, right? That can only happen in a scheduler with specific out-of-tree plugins. In fact, I think we should only list `allNodes` and update `diagnosis.NodeToStatusMap` when `RunPreFilterPlugins` returns an `Unschedulable` status.
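
For illustration, a minimal sketch of that idea (hypothetical code, not from this PR; it assumes `framework` is `k8s.io/kubernetes/pkg/scheduler/framework`):

```go
import (
	framework "k8s.io/kubernetes/pkg/scheduler/framework"
)

// recordPreFilterStatuses is a hypothetical helper: per-node statuses are
// materialized only when PreFilter rejected the pod with an Unschedulable
// code. For UnschedulableAndUnresolvable, the absent-key convention already
// implies the status, so no per-node entries are needed.
func recordPreFilterStatuses(diagnosis *framework.Diagnosis, allNodes []*framework.NodeInfo, s *framework.Status) {
	if s.Code() != framework.Unschedulable {
		return
	}
	for _, n := range allNodes {
		diagnosis.NodeToStatusMap[n.Node().Name] = s
	}
}
```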
Yes, this sounds right to me. However, I omitted it from this change to keep the diff small, since updating the tests produces a 30-line (+7, −23) diff. The intention is to clean it up in a follow-up PR that won't be cherry-picked.
Do you prefer I include it in this change, or do you think the original plan makes sense?
Hmmm, since we're going to cherry-pick this to recent releases, I agree that this should be as simple as possible (provided it can bring back our previous performance). I think we can leave a comment here and address it in a follow-up.
```go
		nodeStatuses[nodeName] = framework.NewStatus(framework.UnschedulableAndUnresolvable, "Preemption is not helpful for scheduling")
		continue
	}
	potentialNodes = append(potentialNodes, node)
```
Forgive me if this is a dumb question: since we only add nodes that appear in `framework.NodeToStatusMap` to `potentialNodes`, why don't we iterate `framework.NodeToStatusMap` directly, instead of iterating `allNodes` and checking whether each node is in `framework.NodeToStatusMap`?
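
A hedged sketch of that direct iteration (hypothetical, not code from this PR; `m` is assumed to be the incoming `framework.NodeToStatusMap` and `nodeInfos` a `framework.NodeInfoLister`):

```go
// Range over the status map directly: with the absent-key convention, any
// node that can become a potential preemption target must already have an
// entry with a non-UnschedulableAndUnresolvable status.
for name, status := range m {
	if status.Code() == framework.UnschedulableAndUnresolvable {
		continue
	}
	if nodeInfo, err := nodeInfos.Get(name); err == nil {
		potentialNodes = append(potentialNodes, nodeInfo)
	}
}
```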
On the contrary, your comment is absolutely right :)
If we decide not to fill in the map (depending on the resolution of #125197 (comment)), I will implement this.
I'm ok filling up the map for this cherry-pickable patch, but we should follow this suggestion for 1.31.
So, in this PR, we are still filling up the map to minimize changes and facilitate cherry-picking. However, in 1.31, we will change the behavior so that we no longer fill up the map with `UnschedulableAndUnresolvable` statuses. Am I understanding correctly?
Correct. In the `NodeToStatusMap` returned by `nodesWherePreemptionMayHelp`, the `UnschedulableAndUnresolvable` status is only used for the error message (the number of nodes with the "Preemption is not helpful for scheduling" reason). I think we can pipe this information in a more efficient way.
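
One hypothetical shape for that more efficient piping (an assumption for illustration, not this PR's design): since every `UnschedulableAndUnresolvable` entry carries the same reason, a count plus one shared status is enough to build the error message.

```go
// preemptionDiagnosis is a hypothetical compact representation: instead of
// one identical map entry per node, keep a single shared status and a count.
// framework is assumed to be k8s.io/kubernetes/pkg/scheduler/framework.
type preemptionDiagnosis struct {
	unresolvableCount int               // nodes where preemption cannot help
	sharedStatus      *framework.Status // "Preemption is not helpful for scheduling"
}
```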
Right, sgtm.
/retest
/lgtm
/approve
Please prepare cherry-picks for all supported versions
LGTM label has been added. Git tree hash: 539b029ec9b38a8a86a8a5e323e17bee4b648401
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: alculquicondor, gabesaba
Cherry pick of #125197: [scheduler] absent key in NodeToStatusMap implies UnschedulableAndUnresolvable
What type of PR is this?
/kind bug
/kind regression
/kind api-change
What this PR does / why we need it:
#119779 fixed a bug, but caused a performance regression (#124709) observed at 5k nodes. Performance fix #124714 was merged, with a modest improvement. We still observe reduced throughput when running a test with 15k nodes and 60k DaemonSet pods:
- baseline (pre #119779): ~470 pods/s
- current (with #124714): ~70 pods/s
- more perf engineering: ~300 pods/s
- this change: ~460 pods/s
This fix attempts to bring us back to baseline performance. We revert #124714 and part of #119779, and implement option 2 proposed here. While there are two unaddressed O(n) operations (1, 2), these haven't revealed themselves as performance problems in the wild. To keep this diff as small as possible for cherry-picking, we will defer fixing those to a future minor version. That future change will require a breaking change to the `NodeToStatusMap` type, to allow a better-than-O(n), or at least very fast O(n), representation of many nodes sharing the same status.
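
For illustration, a minimal sketch of the absent-key convention the PR title describes (a hypothetical helper, not code from this PR; `framework` is `k8s.io/kubernetes/pkg/scheduler/framework`):

```go
import (
	framework "k8s.io/kubernetes/pkg/scheduler/framework"
)

// statusFor is a hypothetical helper: under the absent-key convention, a node
// missing from NodeToStatusMap is treated as UnschedulableAndUnresolvable,
// so the map no longer needs an explicit entry for every node in the cluster.
func statusFor(m framework.NodeToStatusMap, nodeName string) *framework.Status {
	if s, ok := m[nodeName]; ok {
		return s
	}
	return framework.NewStatus(framework.UnschedulableAndUnresolvable)
}
```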
pair @mskrocki
/assign @alculquicondor, @liggitt, @Huang-Wei
/sig scheduling
Which issue(s) this PR fixes:
Fixes #124709
Special notes for your reviewer:
Does this PR introduce a user-facing change?