kubelet: send all received pods in one update #23141
Labelling this PR as size/S
GCE e2e build/test passed for commit 75dfc2249859d51e8eaaaec176538adbfc7f53ba.
// this is an add
addPods = append(addPods, ref)
}
addPods = append(addPods, ref)
@yujuhong nit: maybe add a TODO or file an issue about the original issue of pod rejection?
done.
Based on your old comment, with this change, might some existing pods be rejected by the admission process if there is over-commitment?
Ahh, I saw your comment below. But I am not 100% convinced the change is safe here.
@dchen1107 FYI, this part of the code was introduced in #18546.
@dchen1107 In fact, the original pod rejection bug had been there for a while. Yuju tried to fix it with #18546, but that introduced one of the issues in #23104. This PR just restores the old behavior.
If you think we should still fix the old bug, maybe we could try some other ways like: #18546 (comment)
LGTM with a nit.
The kubelet sync loop relies on getting one update as the signal that the specific source is ready. This change ensures that we don't send multiple updates (ADD, UPDATE) for the first batch of pods. This is required to prevent the cleanup routine from killing pods prematurely.
/cc @dchen1107, we should get this in v1.2.
GCE e2e build/test passed for commit deafa44.
Removing label
Discussed offline with @dchen1107; removed LGTM for a little more discussion.
Discussed with @yujuhong offline. This change resets the kubelet to its old behavior so that it won't prematurely kill existing pods. However, it reintroduces an issue caused by a very small race window: several pods are running, the apiserver sends the kubelet a new request, and the kubelet restarts. In this case, running pods might be killed because they fail the feasibility check. I think this issue should be very rare, and it should eventually be properly addressed once we introduce eviction cost (#22212).
@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]
GCE e2e build/test passed for commit deafa44.
Automatic merge from submit-queue
Auto commit by PR queue bot
Auto commit by PR queue bot (cherry picked from commit 61b9a21)
Commit e2ad6d7 found in the "release-1.2" branch appears to be this PR. Removing the "cherrypick-candidate" label. If this is an error, find help to get your PR picked.
The kubelet sync loop relies on getting one update as the signal that the
specific source is ready. This change ensures that we don't send multiple
updates (ADD, UPDATE) for the first batch of pods. This is required to prevent
the cleanup routine from killing pods prematurely.
This fixes one issue seen in #23104
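
The readiness signal described in the commit message can be illustrated with a small, self-contained Go sketch. This is not the kubelet's code: `PodUpdate`, `syncLoop`, `handle`, `cleanup`, and `simulate` are hypothetical names for a simplified model, which assumes that the first update from a source marks that source as ready and that cleanup kills any running pod not yet in the desired set.

```go
package main

import (
	"fmt"
	"sort"
)

// Op is a hypothetical stand-in for the operation carried by an update.
type Op string

const (
	Add    Op = "ADD"
	Update Op = "UPDATE"
)

// PodUpdate is a simplified, hypothetical model of one update from a source.
type PodUpdate struct {
	Op     Op
	Source string
	Pods   []string
}

// syncLoop is a toy model of the consumer: it treats the first update
// from a source as the signal that the source is ready.
type syncLoop struct {
	ready   map[string]bool // sources that have sent at least one update
	desired map[string]bool // pods seen in any update so far
	running map[string]bool // pods currently running on the node
}

func newSyncLoop(running []string) *syncLoop {
	s := &syncLoop{
		ready:   map[string]bool{},
		desired: map[string]bool{},
		running: map[string]bool{},
	}
	for _, p := range running {
		s.running[p] = true
	}
	return s
}

// handle records the pods in the update and marks the source as ready.
func (s *syncLoop) handle(u PodUpdate) {
	for _, p := range u.Pods {
		s.desired[p] = true
	}
	s.ready[u.Source] = true
}

// cleanup kills running pods that are not desired, but only once the
// source is considered ready. It returns the names of the killed pods.
func (s *syncLoop) cleanup(source string) []string {
	if !s.ready[source] {
		return nil
	}
	var killed []string
	for p := range s.running {
		if !s.desired[p] {
			killed = append(killed, p)
			delete(s.running, p)
		}
	}
	return killed
}

// simulate feeds the updates to a fresh loop, running cleanup after each
// one, and returns every pod that was killed, in sorted order.
func simulate(updates []PodUpdate, running []string) []string {
	s := newSyncLoop(running)
	var killed []string
	for _, u := range updates {
		s.handle(u)
		killed = append(killed, s.cleanup(u.Source)...)
	}
	sort.Strings(killed)
	return killed
}

func main() {
	running := []string{"a", "b"}

	// First batch split into two updates: cleanup runs on a partial view.
	split := []PodUpdate{
		{Op: Add, Source: "api", Pods: []string{"c"}},
		{Op: Update, Source: "api", Pods: []string{"a", "b"}},
	}
	fmt.Println(simulate(split, running)) // prints [a b]

	// First batch sent as one update: nothing is killed prematurely.
	single := []PodUpdate{
		{Op: Add, Source: "api", Pods: []string{"a", "b", "c"}},
	}
	fmt.Println(simulate(single, running)) // prints []
}
```

In this model, splitting the first batch lets cleanup run while the desired set is still incomplete, so the already-running pods `a` and `b` are killed; sending everything in one update avoids that.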