-
Notifications
You must be signed in to change notification settings - Fork 40.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add --force to kubectl delete and explain force deletion #35484
Add --force to kubectl delete and explain force deletion #35484
Conversation
@kubernetes/kubectl see referenced proposal for description of changes, @foxish |
gracePeriod = 1 | ||
} | ||
if gracePeriod == 0 && !cmdutil.GetFlagBool(cmd, "force") { | ||
return fmt.Errorf("Immediate deletion may result in multiple copies of your resources running at the same time. You must pass --force to delete with grace period 0.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this error message may not be clear to someone who isn't familiar with the entire issue. Also, if it's force deleting a simple pod object, the multiple copies comment does not apply. Maybe we could rephrase it to say something along the lines of:
Immediate deletion does not wait for confirmation that the running resource has been terminated and may leave it running in case of a network partition.
down or unable to communicate to the server. To force delete a resource, you must pass a grace | ||
period of 0 and specify the --force flag. | ||
|
||
IMPORTANT: Force deleting pods may result in multiple instances of that pod running at the same |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe clarification that multiple instances may be left running if the pod is part of a collection managed by a higher level controller (RS/RC/PS).
3cd0d74
to
4f2504b
Compare
Updated text (but not generated files) |
@foxish @vishh @bprashanth if you guys have more opinions on the text here please let me know, otherwise going to generate docs and make sure the tests are right. |
Also any opinionated kubectl reviewers on messages on delete. This is an explicit CLI API change for safety purposes and can break scripts that were doing unsafe things before. |
gracePeriod = 1 | ||
} | ||
if gracePeriod == 0 && !cmdutil.GetFlagBool(cmd, "force") { | ||
return fmt.Errorf("Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely. You must pass --force to delete with grace period 0.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I'd place a WARNING:
prefix to this message!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, but we can't do that today without changes to the command stuff. I'll open a follow up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excuse my ignorance, why is it not possible to add a string prefix to the error message here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All errors are prefixed with error:
. So if we added a prefix it would be error: WARNING: blah
which doesn't make much sense.
before they are forcibly terminated (the grace period) but you may override that value with | ||
the --grace-period flag, or pass --now to set a grace-period of 1. Because these resources often | ||
represent entities in the cluster, deletion may not be acknowledged immediately. If a resource | ||
like a pod has not been deleted within the grace period, the node that hosts that pod may be |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Grace period can be as low as 1 second. We should not place expectations on the system to gracefully delete pod objects in 1 second. Instead, let's say that if a pod isn't deleted in a reasonable amount of time, the node might be down and/or unreachable. In such scenarios, it might be necessary to forcibly delete pods to free up cluster resources.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok
period of 0 and specify the --force flag. | ||
|
||
IMPORTANT: Force deleting pods does not wait for confirmation that the pod's processes have been | ||
terminated, which means the pod may still be running on the cluster. This may result in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not just concurrent access, it will also lead to resource leakage on the nodes which over time can lead to unexpected evictions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think most users care about that - that's a kubelet problem. Most users care about "my database is now corrupted" >>>> "nodes may not have as many resources as they said".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, users don't care about resources up and until their pods start getting evicted. Would it hurt conveying this additional information?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm just trying not to make this a novel. Let me see if I can work in a sentence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a sentence at the end.
@smarterclayton Looks fine to me. |
4f2504b
to
9a8d801
Compare
Jenkins GKE smoke e2e failed for commit 9a8d8016f25263645d2df42cc23f38d396a11743. Full PR test history. The magic incantation to run this job again is |
Jenkins GCI GKE smoke e2e failed for commit 9a8d8016f25263645d2df42cc23f38d396a11743. Full PR test history. The magic incantation to run this job again is |
Jenkins GCI GCE e2e failed for commit 9a8d8016f25263645d2df42cc23f38d396a11743. Full PR test history. The magic incantation to run this job again is |
Jenkins GCE etcd3 e2e failed for commit 9a8d8016f25263645d2df42cc23f38d396a11743. Full PR test history. The magic incantation to run this job again is |
Jenkins GCE e2e failed for commit 9a8d8016f25263645d2df42cc23f38d396a11743. Full PR test history. The magic incantation to run this job again is |
9a8d801
to
46972ec
Compare
Jenkins unit/integration failed for commit 46972ec0c54381910fbe8e492d16423ae9aca5e0. Full PR test history. The magic incantation to run this job again is |
Jenkins GCE Node e2e failed for commit 46972ec0c54381910fbe8e492d16423ae9aca5e0. Full PR test history. The magic incantation to run this job again is |
46972ec
to
3c81f5a
Compare
node e2e looks like a flake (or maybe needs a --force and is failing hard somewhere)
|
LGTM |
Any other comments or is this good enough for government work? |
@smarterclayton two more open comments. |
--force is required for --grace-period=0. --now is == --grace-period=1. Improve command help to explain what graceful deletion is and warn about force deletion.
3c81f5a
to
6e25830
Compare
@k8s-bot test this [submit-queue is verifying that this PR is safe to merge] |
Automatic merge from submit-queue |
@smarterclayton cc @saad-ali Expected error:
<*errors.errorString | 0xc820019820>: {
s: "Error running &{/workspace/kubernetes_skew/cluster/kubectl.sh [/workspace/kubernetes_skew/cluster/kubectl.sh --server=https://199.223.235.243 --kubeconfig=/workspace/.kube/config delete --grace-period=0 -f - --namespace=e2e-tests-kubectl-up6nc] [] 0xc8207b9b60 error: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely. You must pass --force to delete with grace period 0.\n [] <nil> 0xc8206fe440 exit status 1 <nil> true [0xc820ea4718 0xc820ea4740 0xc820ea4750] [0xc820ea4718 0xc820ea4740 0xc820ea4750] [0xc820ea4720 0xc820ea4738 0xc820ea4748] [0xa97840 0xa979a0 0xa979a0] 0xc82027b800}:\nCommand stdout:\n\nstderr:\nerror: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely. You must pass --force to delete with grace period 0.\n\nerror:\nexit status 1\n",
}
Error running &{/workspace/kubernetes_skew/cluster/kubectl.sh [/workspace/kubernetes_skew/cluster/kubectl.sh --server=https://199.223.235.243 --kubeconfig=/workspace/.kube/config delete --grace-period=0 -f - --namespace=e2e-tests-kubectl-up6nc] [] 0xc8207b9b60 error: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely. You must pass --force to delete with grace period 0.
[] <nil> 0xc8206fe440 exit status 1 <nil> true [0xc820ea4718 0xc820ea4740 0xc820ea4750] [0xc820ea4718 0xc820ea4740 0xc820ea4750] [0xc820ea4720 0xc820ea4738 0xc820ea4748] [0xa97840 0xa979a0 0xa979a0] 0xc82027b800}:
Command stdout:
stderr:
error: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely. You must pass --force to delete with grace period 0.
error:
exit status 1
not to have occurred |
Thanks for raising this issue @krousey! Can you create an bug with the 1.5 milestone to track this. @smarterclayton any thoughts on how to handle this? Can you have server side check client version before enforcing the new behavior? |
This change is completely at the client-side and is a departure from the old behavior. I imagine one could check the server version and then enforce the use of the |
@foxish I just created an issue. Please move discussion to there. |
This breaks all previous uses of We need to not break scripts using kubectl where there was no well defined way of ensuring forward compatibility in order to avoid such breakage. |
The break was intentional and the reasoning laid in the proposal for why allowing this to continue is unsafe. |
Let's continue discussion in #37117 |
--force is required for --grace-period=0. --now is == --grace-period=1.
Improve command help to explain what graceful deletion is and warn about
force deletion.
Part of #34160 & #29033
This change is