
Increase scheduling backoff queue max duration and attach specific error message to unschedulable pods #81214

Closed
@NickrenREN

Description

What would you like to be added:

Increase backoff queue max duration and attach specific error message to unschedulable pods

Why is this needed:

The scheduling backoff queue currently has a max duration of 10 seconds. In our cluster (5K nodes and 100K+ pods), some pods wait a very long time to be scheduled. These pods are in the active queue with lower priority.
If some higher-priority pods cannot be scheduled and are added to the backoff queue because many events trigger MoveAllToActiveQueue, these higher-priority pods are moved back to the active queue within at most 10 seconds, so the lower-priority pods never get a chance to be scheduled. Can we increase the backoff queue max duration to relieve this situation?
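
To make the effect of the cap concrete, here is a minimal sketch of a capped exponential backoff. It assumes the backoff doubles on each failed attempt starting from 1 second; only the 10-second cap comes from the description above. The function name and parameters are hypothetical, not the scheduler's actual code.

```python
def backoff_duration(attempts: int, initial: float = 1.0, max_duration: float = 10.0) -> float:
    """Return the backoff in seconds after `attempts` failed scheduling attempts.

    Assumption: the duration doubles per attempt, starting at `initial`,
    and is capped at `max_duration`.
    """
    duration = initial
    for _ in range(1, attempts):
        duration *= 2
        if duration >= max_duration:
            return max_duration
    return min(duration, max_duration)


# Compare the current 10-second cap with a hypothetical 60-second cap.
for attempts in (1, 3, 5, 8):
    print(attempts, backoff_duration(attempts), backoff_duration(attempts, max_duration=60.0))
```

Under these assumptions, a pod that has failed five or more times always returns to the active queue after 10 seconds, no matter how often it has failed. A larger cap lets repeatedly failing pods stay out of the active queue longer.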

Also, some events, such as PVC/Service ADD/UPDATE events, blindly move all pods in the unschedulable queue to the active queue. Can we attach the specific error message when we add pods to the unschedulable queue, so that an event moves only the relevant subset of unschedulable pods to the active queue?
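
One way to picture the second proposal is an unschedulable queue that records why each pod failed and moves only the pods an event could plausibly help. This is a rough sketch under stated assumptions: the class, the event names, and the failure-reason strings are all hypothetical and are not the scheduler's real types or messages.

```python
class UnschedulableQueue:
    """Toy unschedulable queue that remembers why each pod failed."""

    def __init__(self, event_to_reasons):
        # Hypothetical mapping: event name -> failure reasons it may resolve.
        self._event_to_reasons = event_to_reasons
        self._pods = {}  # pod name -> set of failure reasons

    def add(self, pod, reasons):
        self._pods[pod] = set(reasons)

    def pending(self):
        return sorted(self._pods)

    def move_on_event(self, event):
        """Remove and return the pods that `event` might make schedulable."""
        relevant = self._event_to_reasons.get(event)
        if relevant is None:
            # Unknown event: stay conservative and move everything,
            # which matches the current "move all" behavior.
            moved = sorted(self._pods)
        else:
            moved = sorted(p for p, r in self._pods.items() if r & relevant)
        for pod in moved:
            del self._pods[pod]
        return moved


queue = UnschedulableQueue({"PVCAdd": {"VolumeBinding"}})
queue.add("pod-a", ["VolumeBinding"])
queue.add("pod-b", ["InsufficientCPU"])
print(queue.move_on_event("PVCAdd"))  # only pods blocked on volumes are moved
```

Falling back to moving every pod for an unmapped event keeps the sketch safe: a missing mapping costs extra scheduling attempts rather than leaving a pod stuck.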

/assign

Labels

kind/feature: Categorizes issue or PR as related to a new feature.
lifecycle/stale: Denotes an issue or PR has remained open with no activity and has become stale.
sig/scheduling: Categorizes an issue or PR as relevant to SIG Scheduling.
