Scheduling Inconsistency Caused by kube-scheduler Restart #126499
This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/sig scheduling storage
Some information is unclear. What is your k8s version? Can you provide a YAML file to reproduce this problem? Also, what do you mean by restarting kube-scheduler? Can you share the logs?
/triage needs-information
/remove-kind feature
k8s version: 1.28.1

Problem description: Setting aside why the API server could not be reached, the question is why the ephemeral volume information scheduled before the restart cannot be read back after kube-scheduler restarts.
The procedure is as follows:
kube-scheduler restarts after the PVC node information is updated. The relevant code is in kubernetes/pkg/scheduler/framework/plugins/volumebinding/binder.go, lines 596 to 609 at commit 619b005.
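The referenced snippet was not captured in this thread. As a rough illustration of the failure mode being discussed (not the actual binder.go code; all names below are hypothetical), the sketch shows how a volume decision held only in the scheduler's in-memory assume cache disappears when the process restarts before the decision is persisted:

```go
// Illustrative sketch only, not the actual kubernetes code: it shows
// why a decision held only in the scheduler's in-memory "assume" cache
// disappears across a restart. Type and function names here are
// hypothetical simplifications of the volumebinding plugin's behavior.
package main

import (
	"fmt"
	"sync"
)

// assumeCache stands in for the in-memory layer where the scheduler
// records volume decisions before the bind phase persists them.
type assumeCache struct {
	mu      sync.Mutex
	assumed map[string]string // PVC key -> selected node
}

func newAssumeCache() *assumeCache {
	return &assumeCache{assumed: make(map[string]string)}
}

// Assume records a decision in process memory only.
func (c *assumeCache) Assume(pvcKey, node string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.assumed[pvcKey] = node
}

// SelectedNode reports the remembered decision, if any.
func (c *assumeCache) SelectedNode(pvcKey string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	node, ok := c.assumed[pvcKey]
	return node, ok
}

func main() {
	cache := newAssumeCache()
	cache.Assume("default/replica-1-scratch", "node1") // decision made in memory

	// A restart replaces the process, and with it the cache; nothing
	// usable was persisted, so the in-flight decision is gone.
	cache = newAssumeCache()

	if _, ok := cache.SelectedNode("default/replica-1-scratch"); !ok {
		fmt.Println("after restart: no record that replica 1 was bound to node1")
	}
}
```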
The pod is configured with required (strong) anti-affinity and has two replicas. Before the restart, the PVC of replica 1 had already been scheduled to node2. This seems to be a bug in k8s. Does the community have a plan to fix it?
kube-scheduler scheduled the ephemeral volume.
FYI on a related issue: #125491
I'm looking into #125491, which should potentially help fix this one once we manage to restore the in-flight actions after a scheduler restart.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale
/remove-lifecycle stale
What would you like to be added?
I have a Deployment that needs to bring up two replicas, with required (strong) pod anti-affinity, configured as sketched below.
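The original manifest did not survive in this thread. A minimal sketch matching the description, assuming two replicas, required pod anti-affinity on the hostname topology key, and a generic ephemeral volume (the names, image, and storage class are all illustrative guesses), might look like:

```yaml
# Illustrative only: approximates the configuration described above.
# Names, image, and storage class are assumptions, not from the report.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: example-app
              topologyKey: kubernetes.io/hostname
      containers:
        - name: app
          image: nginx
          volumeMounts:
            - name: scratch
              mountPath: /data
      volumes:
        - name: scratch
          ephemeral:
            volumeClaimTemplate:
              spec:
                accessModes: ["ReadWriteOnce"]
                storageClassName: local-storage  # assumed WaitForFirstConsumer class
                resources:
                  requests:
                    storage: 1Gi
```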
Why is this needed?
kube-scheduler schedules the ephemeral volume of replica 1 to node 1, but has not yet scheduled the pod of replica 1 when kube-scheduler restarts. After the restart, kube-scheduler begins scheduling the ephemeral volume and pod of replica 2 and places replica 2 on node 1. As a result, replica 1 can no longer be scheduled because of the anti-affinity rule.
After kube-scheduler restarts, it cannot load the ephemeral volume information that was scheduled before the restart.
Does the community have any suggestions on this issue?