[scheduler] Improve Handling of Node Status #125345

gabesaba · 2024-06-05T14:47:59Z

What would you like to be added?

This issue is to track follow-ups from #125197, which addressed a performance issue related to the NodeToStatusMap. There are latent issues which we would like to address to prevent future regressions. We can either improve NodeToStatusMap to support some of the below operations more efficiently, or come up with different way to propagate these statuses through the scheduler.

current problems:

we loop through all nodes to set the same status
- w/ current implementation, only need to set Unschedulable status - see comment
we loop through all nodes when determining preemption candidates, so that we can create an error message based on the UnschedulableAndUnresolvable status. this error message can be generated more efficiently. also see comment

other considerations:

when operating over many nodes, hashing node name string is hotspot when doing hashmap gets/sets. consider creating a lightweight NodeKey type, which, for a scheduling cycle, references an existing node. This type should be cheap to hash (+allow lookup in flat data structure), and is safer than string node name, as it is guaranteed to reference a valid node.
consider updating scheduler to not call postfilter plugins when all nodes are UnschedulableAndUnresolvable. at least, we should reconcile the documented behavior here and here and here
consider changing Status to an immutable type, so it can be safely shared.

/sig scheduling

Why is this needed?

The current implementation includes some operations which may cause future performance regression. Additionally, there is some room for simplification.

The text was updated successfully, but these errors were encountered:

k8s-ci-robot · 2024-06-05T14:48:07Z

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

gabesaba · 2024-06-05T14:48:33Z

cc @AxeZhan @sanposhiho @alculquicondor

AxeZhan · 2024-06-07T07:25:22Z

consider updating scheduler to not call postfilter plugins when all nodes are UnschedulableAndUnresolvable.

I personally prefer this approach. This seems to be sufficient and the changes are relatively small.

sanposhiho · 2024-06-09T16:26:35Z

consider updating scheduler to not call postfilter plugins when all nodes are UnschedulableAndUnresolvable.

We cannot take this way, UnschedulableAndUnresolvable has the meaning only to plugin(s) that do preemption-ish stuff.
But, there are some other usecases in PostFilter, such as record something, clean up something, etc when each scheduling fails.

AxeZhan · 2024-06-10T12:25:36Z

We cannot take this way, UnschedulableAndUnresolvable has the meaning only to plugin(s) that do preemption-ish stuff.
But, there are some other usecases in PostFilter, such as record something, clean up something, etc when each scheduling fails.

Ah right. We have already changed current behaviour to not set UnschedulableAndUnresolvable for allNodes if preFilter fails, not sure if this will break some custom postFilter plugins.
Then I think instead of not call postfilter, passing an empty NodeToStatus map may be enough. However, we'll need a new field in NodeToStatus map to indicate that all nodes are UnschedulableAndUnresolvable for plugins.

alculquicondor · 2024-06-10T15:18:47Z

We cannot take this way, UnschedulableAndUnresolvable has the meaning only to plugin(s) that do preemption-ish stuff.

Still, the preemption plugin could exit early if can tell that all nodes where UnschedulableAndUnresolvable

YamasouA · 2024-07-05T10:58:24Z

/assign

alculquicondor · 2024-07-05T13:34:45Z

@YamasouA this is not a task that is friendly to newcomers. Do you have experience writing code for the scheduler? Or perhaps you have someone that can mentor you already?

sanposhiho · 2024-07-06T01:18:32Z

No worries Aldo, I'm with them.

macsko · 2024-07-09T08:49:40Z

I have been experimenting with these improvements over the past two weeks (#125954). Unfortunately, I forgot to assign myself to the issue, my bad. Nevertheless, the PR addresses only the "current problems" part of the improvements. I think the lightweight NodeKey and immutable Status types are still worth exploring, but in my opinion both would require changes in plugins' interface.

gabesaba added the kind/feature Categorizes issue or PR as related to a new feature. label Jun 5, 2024

k8s-ci-robot added the sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. label Jun 5, 2024

k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Jun 5, 2024

gabesaba mentioned this issue Jun 10, 2024

REQUEST: New membership for gabesaba kubernetes/org#5008

Closed

11 tasks

AxeZhan mentioned this issue Jun 21, 2024

Throughput degradation scheduling daemonset pods #124709

Closed

k8s-ci-robot assigned YamasouA Jul 5, 2024

macsko mentioned this issue Jul 8, 2024

WIP: Improve usage of nodeToStatusMap in scheduler #125954

Closed

macsko mentioned this issue Jul 11, 2024

Change structure of NodeToStatus map in scheduler #126022

Merged

YamasouA removed their assignment Jul 25, 2024

k8s-ci-robot closed this as completed in #126022 Aug 14, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[scheduler] Improve Handling of Node Status #125345

[scheduler] Improve Handling of Node Status #125345

gabesaba commented Jun 5, 2024

k8s-ci-robot commented Jun 5, 2024

gabesaba commented Jun 5, 2024

AxeZhan commented Jun 7, 2024

sanposhiho commented Jun 9, 2024

AxeZhan commented Jun 10, 2024

alculquicondor commented Jun 10, 2024

YamasouA commented Jul 5, 2024

alculquicondor commented Jul 5, 2024 •

edited

Loading

sanposhiho commented Jul 6, 2024 •

edited

Loading

macsko commented Jul 9, 2024

[scheduler] Improve Handling of Node Status #125345

[scheduler] Improve Handling of Node Status #125345

Comments

gabesaba commented Jun 5, 2024

What would you like to be added?

Why is this needed?

k8s-ci-robot commented Jun 5, 2024

gabesaba commented Jun 5, 2024

AxeZhan commented Jun 7, 2024

sanposhiho commented Jun 9, 2024

AxeZhan commented Jun 10, 2024

alculquicondor commented Jun 10, 2024

YamasouA commented Jul 5, 2024

alculquicondor commented Jul 5, 2024 • edited Loading

sanposhiho commented Jul 6, 2024 • edited Loading

macsko commented Jul 9, 2024

alculquicondor commented Jul 5, 2024 •

edited

Loading

sanposhiho commented Jul 6, 2024 •

edited

Loading