Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[scheduler] Improve Handling of Node Status #125345

Closed
gabesaba opened this issue Jun 5, 2024 · 10 comments · Fixed by #126022
Closed

[scheduler] Improve Handling of Node Status #125345

gabesaba opened this issue Jun 5, 2024 · 10 comments · Fixed by #126022
Labels
kind/feature Categorizes issue or PR as related to a new feature. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling.

Comments

@gabesaba
Copy link
Contributor

gabesaba commented Jun 5, 2024

What would you like to be added?

This issue is to track follow-ups from #125197, which addressed a performance issue related to the NodeToStatusMap. There are latent issues which we would like to address to prevent future regressions. We can either improve NodeToStatusMap to support some of the below operations more efficiently, or come up with different way to propagate these statuses through the scheduler.

current problems:

  • we loop through all nodes to set the same status
    • w/ current implementation, only need to set Unschedulable status - see comment
  • we loop through all nodes when determining preemption candidates, so that we can create an error message based on the UnschedulableAndUnresolvable status. this error message can be generated more efficiently. also see comment

other considerations:

  • when operating over many nodes, hashing node name string is hotspot when doing hashmap gets/sets. consider creating a lightweight NodeKey type, which, for a scheduling cycle, references an existing node. This type should be cheap to hash (+allow lookup in flat data structure), and is safer than string node name, as it is guaranteed to reference a valid node.
  • consider updating scheduler to not call postfilter plugins when all nodes are UnschedulableAndUnresolvable. at least, we should reconcile the documented behavior here and here and here
  • consider changing Status to an immutable type, so it can be safely shared.

/sig scheduling

Why is this needed?

The current implementation includes some operations which may cause future performance regression. Additionally, there is some room for simplification.

@gabesaba gabesaba added the kind/feature Categorizes issue or PR as related to a new feature. label Jun 5, 2024
@k8s-ci-robot k8s-ci-robot added the sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. label Jun 5, 2024
@k8s-ci-robot
Copy link
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Jun 5, 2024
@gabesaba
Copy link
Contributor Author

gabesaba commented Jun 5, 2024

@AxeZhan
Copy link
Member

AxeZhan commented Jun 7, 2024

consider updating scheduler to not call postfilter plugins when all nodes are UnschedulableAndUnresolvable.

I personally prefer this approach. This seems to be sufficient and the changes are relatively small.

@sanposhiho
Copy link
Member

consider updating scheduler to not call postfilter plugins when all nodes are UnschedulableAndUnresolvable.

We cannot take this way, UnschedulableAndUnresolvable has the meaning only to plugin(s) that do preemption-ish stuff.
But, there are some other usecases in PostFilter, such as record something, clean up something, etc when each scheduling fails.

@AxeZhan
Copy link
Member

AxeZhan commented Jun 10, 2024

We cannot take this way, UnschedulableAndUnresolvable has the meaning only to plugin(s) that do preemption-ish stuff.
But, there are some other usecases in PostFilter, such as record something, clean up something, etc when each scheduling fails.

Ah right. We have already changed current behaviour to not set UnschedulableAndUnresolvable for allNodes if preFilter fails, not sure if this will break some custom postFilter plugins.
Then I think instead of not call postfilter, passing an empty NodeToStatus map may be enough. However, we'll need a new field in NodeToStatus map to indicate that all nodes are UnschedulableAndUnresolvable for plugins.

@alculquicondor
Copy link
Member

We cannot take this way, UnschedulableAndUnresolvable has the meaning only to plugin(s) that do preemption-ish stuff.

Still, the preemption plugin could exit early if can tell that all nodes where UnschedulableAndUnresolvable

@YamasouA
Copy link
Contributor

YamasouA commented Jul 5, 2024

/assign

@alculquicondor
Copy link
Member

alculquicondor commented Jul 5, 2024

@YamasouA this is not a task that is friendly to newcomers. Do you have experience writing code for the scheduler? Or perhaps you have someone that can mentor you already?

@sanposhiho
Copy link
Member

sanposhiho commented Jul 6, 2024

No worries Aldo, I'm with them.

@macsko
Copy link
Member

macsko commented Jul 9, 2024

I have been experimenting with these improvements over the past two weeks (#125954). Unfortunately, I forgot to assign myself to the issue, my bad. Nevertheless, the PR addresses only the "current problems" part of the improvements. I think the lightweight NodeKey and immutable Status types are still worth exploring, but in my opinion both would require changes in plugins' interface.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling.
Projects
None yet
7 participants