Stop deleting DaemonSet pods since they will be recreated immediately, and also require a flag when there are DaemonSet pods on the node. #20232
Conversation
Labelling this PR as size/M
GCE e2e test build/test passed for commit 1ca5a82de75531a79a99c0a803172c158f3287fc.
Yeah, figuring out the right semantics here is tough. I think what you chose is pretty reasonable (I can think of other semantics, but none are obviously better). I have two suggestions. Does that sound reasonable?
The flag doesn't just turn off a warning. In fact, without it we won't drain a node with a DaemonSet pod. With it, we literally skip over those pods, not deleting them but still saying "drained". There may be a better name, but "nowarn" makes it sound like it only toggles diagnostics and not behavior.
This I can add.
Ah yes, I was just thinking about the fact that we never actually kill DaemonSet pods under any circumstances, hence the flag's behavior with respect to DaemonSet pods is really just a warning. But you're right, it changes the behavior with respect to machines (i.e. whether we drain the other pods on the machine or skip the machine entirely). So the current flag name sounds good.
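To pin down the semantics agreed on here, the sketch below expresses the decision logic in Go. All names (Pod, CreatedKind, podsToDelete) are hypothetical stand-ins for illustration, not the actual kubectl drain code: DaemonSet-managed pods are never deleted, their presence aborts the drain unless --ignore-daemonsets is set, and with the flag they are simply skipped while the node is still reported as drained.

package main

import "fmt"

// Pod is a hypothetical stand-in for the API pod type; CreatedKind is the
// kind of the controller that created it ("" if unmanaged).
type Pod struct {
	Name        string
	CreatedKind string
}

// podsToDelete sketches the drain semantics discussed above: DaemonSet pods
// are never deleted. Without --ignore-daemonsets their presence is an error;
// with it they are skipped entirely.
func podsToDelete(pods []Pod, ignoreDaemonSets bool) ([]Pod, error) {
	deletable := []Pod{}
	for _, pod := range pods {
		if pod.CreatedKind == "DaemonSet" {
			if !ignoreDaemonSets {
				return nil, fmt.Errorf("pod %q is DaemonSet-managed; rerun with --ignore-daemonsets", pod.Name)
			}
			continue // skip, never delete: the DaemonSet controller would recreate it immediately
		}
		deletable = append(deletable, pod)
	}
	return deletable, nil
}

func main() {
	pods := []Pod{
		{Name: "kube-dns-v10-sp51v", CreatedKind: "ReplicationController"},
		{Name: "mmlds-l1ekr", CreatedKind: "DaemonSet"},
	}
	if _, err := podsToDelete(pods, false); err != nil {
		fmt.Println("without flag:", err) // drain refuses to proceed
	}
	del, _ := podsToDelete(pods, true)
	fmt.Printf("with flag, deleting %d pod(s)\n", len(del)) // DaemonSet pod skipped
}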
GCE e2e test build/test passed for commit 8df7861555caffd1814b36678f38931847d1da5e.
GCE e2e build/test failed for commit b9fd103dc718244027adfb102950add449a3edca.
@davidopp here's the new output.
% _output/local/bin/linux/amd64/kubectl drain kubernetes-minion-c9db --ignore-daemonsets --force
node "kubernetes-minion-c9db" already cordoned
WARNING: About to delete these pods managed by neither a ReplicationController nor a DaemonSet: naked and DaemonSet-managed pods: mmlds-l1ekr
pod "naked" deleted
pod "kube-dns-v10-sp51v" deleted
node "kubernetes-minion-c9db" drained |
NM. Just realized I printed exactly the wrong thing, fixing it now.
Force-pushed from 5e89921 to 91d8f7b.
@davidopp PTAL
GCE e2e test build/test passed for commit 91d8f7bc535eda0c671038472c8d202061b3ab06.
pods unless you use \-\-force.
the API server) and DaemonSet\-managed pods. If there are DaemonSet\-managed
pods, drain will not proceed without \-\-ignore\-daemonsets, since those pods will
be immediately replaced by the NodeController. If there are any pods that are
we never actually moved this to the NodeController; the daemons would be replaced by the DaemonSet controller.
Correct. Fixed. Also removed all mention of NodeController in favor of DaemonSet Controller. PTAL
GCE e2e test build/test passed for commit fa0a01173e5a7ba28c448c1b229281c740b3b1ee.
managed by a ReplicationController or DaemonSet, then drain will not delete any
pods unless you use \-\-force.
the API server) and DaemonSet\-managed pods. If there are DaemonSet\-managed
pods, drain will not proceed without \-\-ignore\-daemonsets, since those pods will
The "those pods will be immediately replaced by the DaemonSet Controller" thing is a little confusing because drain actually won't delete the DaemonSet pods, even if you pass --ignore-daemonsets. I would rephrase as something like "If there are DaemonSet-managed pods, drain will not proceed without --ignore-daemonsets, and regardless it will not delete any DaemonSet-managed pods, because those pods would be immediately replaced by the DaemonSet controller, which ignores unschedulable markings."
Of course for all of this stuff we need to make sure your Job PR merges properly, i.e. ReplicationController -> ReplicationController, Job
	}
	fmt.Fprintf(o.out, "WARNING: About to delete these pods managed by neither a ReplicationController nor a DaemonSet: %s\n", joined)
} else if len(unreplicatedPodNames) > 0 {
else not necessary
sorry, ignore that
Have we effectively reverted #17318 -- will daemons land on unschedulable nodes?
PR needs rebase
We do this because they will be recreated immediately by the DaemonSet Controller. In addition, we also require a specific flag (--ignore-daemonsets) when there are DaemonSet pods on the node.
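For background on why deleting daemon pods is futile: the DaemonSet controller places its pods itself and does not consult the unschedulable marking that cordon/drain sets, which only gates the regular scheduler. The following is a rough illustration of that placement check, with hypothetical types and names rather than the controller's actual code:

package main

import "fmt"

// Node is a hypothetical stand-in for the API node type.
type Node struct {
	Unschedulable bool
	Labels        map[string]string
}

// nodeShouldRunDaemonPod illustrates why cordoning a node does not keep
// daemon pods off it: the selector match is all that counts, and
// node.Unschedulable is deliberately never consulted.
func nodeShouldRunDaemonPod(node Node, nodeSelector map[string]string) bool {
	for k, v := range nodeSelector {
		if node.Labels[k] != v {
			return false
		}
	}
	return true
}

func main() {
	cordoned := Node{Unschedulable: true, Labels: map[string]string{"role": "minion"}}
	fmt.Println(nodeShouldRunDaemonPod(cordoned, map[string]string{"role": "minion"})) // true despite the cordon
}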
I didn't quite do 16 cases, but I think it's pretty clear now. LMK
GCE e2e test build/test passed for commit 10164659ef077d956ab92dd9116edcf7a845d692.
GCE e2e test build/test passed for commit f2d5375.
func unmanagedMsg(unreplicatedNames []string, daemonSetNames []string, include_guidance bool) string {
	msgs := []string{}
	if len(unreplicatedNames) > 0 {
		msg := fmt.Sprintf("pods not managed by ReplicationController, Job, or DaemonSet: %s", strings.Join(unreplicatedNames, ","))
This should not say DaemonSet, right?
Wrong. These are pods with no created-by ref, so they aren't managed by RC, Job, or DS.
Oh yeah, the thing I missed was that you only append to unreplicatedPodNames in a case branch that comes after daemonset_pods, thus daemonset is never in unreplicatedPodNames.
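Putting this thread together, here is a compact sketch of the classification being discussed (hypothetical names; the real code decodes the kubernetes.io/created-by annotation into a reference and branches on its Kind). The DaemonSet case is handled first, so a DaemonSet pod can never land in unreplicatedPodNames, and the warning is assembled from the two disjoint lists:

package main

import (
	"fmt"
	"strings"
)

// classify sketches the branch order discussed above: the DaemonSet check
// comes first, so a pod can never appear in both buckets.
func classify(createdByKind map[string]string) (unreplicated, daemonSet []string) {
	for name, kind := range createdByKind {
		switch kind {
		case "DaemonSet":
			daemonSet = append(daemonSet, name)
		case "ReplicationController", "Job":
			// managed; drain may delete these and the controller reschedules them
		default:
			// no created-by reference: not managed by RC, Job, or DaemonSet
			unreplicated = append(unreplicated, name)
		}
	}
	return unreplicated, daemonSet
}

// unmanagedMsg mirrors the helper in the diff above, assembling one warning
// string from the two disjoint lists.
func unmanagedMsg(unreplicatedNames, daemonSetNames []string) string {
	msgs := []string{}
	if len(unreplicatedNames) > 0 {
		msgs = append(msgs, fmt.Sprintf("pods not managed by ReplicationController, Job, or DaemonSet: %s", strings.Join(unreplicatedNames, ",")))
	}
	if len(daemonSetNames) > 0 {
		msgs = append(msgs, fmt.Sprintf("DaemonSet-managed pods: %s", strings.Join(daemonSetNames, ",")))
	}
	return strings.Join(msgs, " and ")
}

func main() {
	u, d := classify(map[string]string{"naked": "", "mmlds-l1ekr": "DaemonSet"})
	fmt.Println(unmanagedMsg(u, d))
}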
LGTM
@k8s-bot test this issue: #IGNORE Tests have been pending for 24 hours
GCE e2e test build/test passed for commit f2d5375.
@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]
Automatic merge from submit-queue
Auto commit by PR queue bot
It took some work to get all the cases right in terms of error and warning messages, but the transcript earlier in this thread shows that it's good. I will refactor that part and squash the commits, but I'm putting the PR out for a little scrutiny of the semantics.