Make background garbage collection cascading #44058
Conversation
Isn't the problem that deployments should make replica sets which set a policy that isn't orphan? Making the GC choose background seems wrong; it means a setting the user may or may not have added to the object in question won't apply, right? |
How about letting the deployment controller create replicasets with the finalizer set to "foregroundDeletion"? But that only fixes the deployment->rs->pods chain. If a user creates a dependency chain that includes an rs (or any other controller type) as a link, the background deletion won't be cascading. I think it's against the user's intuition if the deletion is cascading only when they explicitly set the finalizer of the rs to "foregroundDeletion". |
Making the deployment controller set the policy for replicasets makes sense and could be easily factored out if we ever come up with something better. CronJobs will also need the fix. I don't have a good solution for handling this holistically but let's not make perfect the enemy of the good for now. |
Ok. I can send a PR, or are @Kargakis or @krmayankk interested in doing that? It should be a small change. |
I haven't followed the thread but it seems like an easy fix. I am assuming it only needs to be done for deployments creating replica sets. I will send out a PR.
|
@krmayankk we need to set Replicaset.ObjectMeta.Finalizers to DeletingDependents when the deployment controller creates the replicaset. Also we need to update the GC test you wrote for deployments to verify that the pods are deleted. Thanks for helping. |
@caesarxuchao you mean setting Replicaset.ObjectMeta.Finalizers to foregroundDeletion? |
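To make the idea concrete, here is a minimal sketch (not the actual deployment controller code; the helper name is made up, and it uses the current apps/v1 types rather than the extensions/v1beta1 types of that era) of a ReplicaSet carrying the "foregroundDeletion" finalizer at creation time:

```go
package sketch

import (
	appsv1 "k8s.io/api/apps/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// newCascadingReplicaSet is a hypothetical helper: the ReplicaSet is created
// with the "foregroundDeletion" finalizer (the value of
// metav1.FinalizerDeleteDependents), so the garbage collector deletes its
// pods before the ReplicaSet itself is allowed to disappear.
func newCascadingReplicaSet(name, namespace string, owner metav1.OwnerReference) *appsv1.ReplicaSet {
	return &appsv1.ReplicaSet{
		ObjectMeta: metav1.ObjectMeta{
			Name:            name,
			Namespace:       namespace,
			OwnerReferences: []metav1.OwnerReference{owner},
			Finalizers:      []string{"foregroundDeletion"},
		},
		// Spec (selector, template, replicas) omitted for brevity.
	}
}
```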
@krmayankk that's a resource handler, you probably want to set it in a different place. I think the ideal one is in AdoptReplicaSets, but the update needs to be atomic and I am not sure we can combine two different patches into a single call (the other one updates the owner refs of the RS). We should consider switching away from Patch to an Update with retries on conflicts so we can fold both updates (owner refs, finalizer) into one call. |
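A rough sketch of the "Update with retries on conflicts" approach mentioned above, using client-go's retry helper (the helper name and wiring are illustrative, not the controller's actual code):

```go
package sketch

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/util/retry"
)

// setOwnerRefAndFinalizer re-reads the ReplicaSet, applies both changes
// (owner reference and finalizer) in a single Update, and retries on
// conflict, so the two writes can no longer race with each other.
func setOwnerRefAndFinalizer(client kubernetes.Interface, namespace, name string, ref metav1.OwnerReference) error {
	return retry.RetryOnConflict(retry.DefaultRetry, func() error {
		rs, err := client.AppsV1().ReplicaSets(namespace).Get(context.TODO(), name, metav1.GetOptions{})
		if err != nil {
			return err
		}
		// A real implementation would check that the ref and finalizer are
		// not already present before appending.
		rs.OwnerReferences = append(rs.OwnerReferences, ref)
		rs.Finalizers = append(rs.Finalizers, "foregroundDeletion")
		_, err = client.AppsV1().ReplicaSets(namespace).Update(context.TODO(), rs, metav1.UpdateOptions{})
		return err
	})
}
```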
cc: @enisoc who was on the code recently. Any thoughts on the above? |
@Kargakis I think it should be done both when we "add" and when we "adopt" replicasets. Also, it should be possible to patch objectMeta.ownerReferences and objectMeta.finalizers at the same time. |
The only logic that needs to exist in the resource handlers should be about requeueing the resource and nothing more. Adding the finalizer during adoption, when we create an RS, or when we find one that doesn't have them here should be enough.
If we can combine both then that's great. |
That's what I meant. I should have used the word "create" instead of "add" in my previous comment. @krmayankk sorry, I didn't look closely at the link you pasted when I typed my last comment. As @Kargakis said, it's the resource handler and we shouldn't change the code there.
I disagree with this part. Users should be able to manually remove the finalizer, because they might want to keep the pods alive when the deployment is deleted. The deployment controller shouldn't add the finalizer back. But this also means replicasets created by a deployment before 1.7 will continue to suffer from #44046. |
If users want to keep the pods shouldn't they do non-cascading deletion of a deployment followed by a non-cascading deletion of the replica set? |
Or even better, orphan the replica set by changing its labels so that it won't be selected by the deployment, do a cascading deletion of the deployment, do a non-cascading deletion of the replica set. |
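For concreteness, the two non-cascading deletions suggested above could look roughly like this with client-go (the helper is purely illustrative; the same effect is achievable with kubectl's cascade options):

```go
package sketch

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// orphanDeploymentAndReplicaSet deletes the Deployment with the Orphan
// policy (its ReplicaSets survive), then deletes the named ReplicaSet the
// same way (its Pods survive).
func orphanDeploymentAndReplicaSet(client kubernetes.Interface, namespace, deployment, replicaSet string) error {
	orphan := metav1.DeletePropagationOrphan
	opts := metav1.DeleteOptions{PropagationPolicy: &orphan}
	if err := client.AppsV1().Deployments(namespace).Delete(context.TODO(), deployment, opts); err != nil {
		return err
	}
	return client.AppsV1().ReplicaSets(namespace).Delete(context.TODO(), replicaSet, opts)
}
```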
Both are more complicated than manually changing the finalizer of one replicaset. If the deployment controller is going to overwrite finalizers set by a user, we might as well let the garbage collector overwrite the finalizers, which is what this PR is doing. |
Patch should work (in theory) for updating multiple things at once, as long as the change you're making is not dependent on existing values. For example, if you need to first check the existing list of Finalizers before deciding what/whether you're modifying, you would need to use Update with a retry loop. |
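A sketch of what that single-Patch variant could look like with client-go, assuming the new values don't depend on the existing ones (the helper name and patch shape are illustrative; whether the lists merge or get replaced depends on each field's patch strategy):

```go
package sketch

import (
	"context"
	"encoding/json"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
)

// patchOwnerRefAndFinalizer sends one strategic merge patch that carries both
// the owner reference and the foregroundDeletion finalizer.
func patchOwnerRefAndFinalizer(client kubernetes.Interface, namespace, name string, ref metav1.OwnerReference) error {
	body, err := json.Marshal(map[string]interface{}{
		"metadata": map[string]interface{}{
			"ownerReferences": []metav1.OwnerReference{ref},
			"finalizers":      []string{"foregroundDeletion"},
		},
	})
	if err != nil {
		return err
	}
	_, err = client.AppsV1().ReplicaSets(namespace).Patch(
		context.TODO(), name, types.StrategicMergePatchType, body, metav1.PatchOptions{})
	return err
}
```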
Users shouldn't muck with ReplicaSets owned by Deployments and we should discourage them from doing so as much as possible, otherwise they are racing with the controller and open the door for unexpected behavior. If you want to isolate a specific set of Pods, that is already not a common use case and orphaning the parent ReplicaSet or doing two non-cascading deletions of the parent objects doesn't sound complex. |
Users will also need to delete the replicasets in the deployment history, right? Then it's more than "two" :) The "changing labels" solution requires the deployment controller to react to release the pods, so the deletion behavior is less deterministic. Is there a benefit other than fixing pre-1.7 replicasets? If not, I think we should take the one-time pain. |
You mean the replica set. Removing a label from the replica set is declaring that you want to orphan it, and the deployment controller will remove the owner ref eventually (99.9% of the time within a second).
IMO it's more than just fixing pre-1.7 replicasets. We should discourage users from working directly with ReplicaSets owned by Deployments. If users want more control, they should either not use Deployments or for one-off cases such as the one you are describing, they should orphan. |
You can also script out the steps above but you would want to wait for the controllers to observe the changes during each step. |
I wouldn't say it's invalid but it's not a common thing to do. The only case I can think of where you would want to do that is to move the Pods under a new controller. |
@caesarxuchao @Kargakis one thing I am not clear on: if we set ReplicaSet.Finalizers to foregroundDeletion, how would that result in triggering the deletion of the pods of the replica sets (which is the problem we are trying to solve)? As per https://kubernetes.io/docs/concepts/workloads/controllers/garbage-collection/#foreground-cascading-deletion, when doing foreground cascading deletion the finalizer gets set to foregroundDeletion by the GC controller. Does the GC notice that foregroundDeletion is set for an object and then trigger the deletion of its dependent objects? Also, is blockOwnerDeletion already set for pods of the replica sets? Also, it's not clear to me why setting the propagation policy to background for an RS when deleting it is a bad idea. I have read all the comments above, but I'm not sure I understand the logic. |
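For reference, a minimal client-go sketch of foreground cascading deletion of a ReplicaSet; the comments summarize the documented behavior, and the helper itself is only illustrative:

```go
package sketch

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// deleteReplicaSetForeground issues a foreground delete. The API server marks
// the ReplicaSet as "deletion in progress" and adds the foregroundDeletion
// finalizer; the garbage collector then deletes the dependents, and once the
// blocking ones (ownerReference.blockOwnerDeletion=true) are gone it removes
// the finalizer so the ReplicaSet itself can be deleted.
func deleteReplicaSetForeground(client kubernetes.Interface, namespace, name string) error {
	fg := metav1.DeletePropagationForeground
	return client.AppsV1().ReplicaSets(namespace).Delete(
		context.TODO(), name, metav1.DeleteOptions{PropagationPolicy: &fg})
}
```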
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: caesarxuchao, liggitt Assign the PR to them by writing Associated issue: 44046 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
/release-note-none |
@caesarxuchao: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
/test pull-kubernetes-e2e-gce-etcd3 |
@caesarxuchao updated release note to put the general change first, and the specific issue fixed second |
Thanks @liggitt. |
cc @dchen1107 |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: caesarxuchao, dchen1107, liggitt Associated issue: 44046 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
Automatic merge from submit-queue (batch tested with PRs 44058, 48085, 48077, 48076, 47823) |
Commit found in the "release-1.7" branch appears to be this PR. Removing the "cherrypick-candidate" label. If this is an error find help to get your PR picked. |
Automatic merge from submit-queue Add e2e test for cronjob chained removal This is a test proving #44058 works with cronjobs. It will fail until the aforementioned PR merges. @caesarxuchao ptal
Fix #44046, fix #47843, where users reported that the garbage collector didn't delete pods when a deployment was deleted with PropagationPolicy=Background.
The cause is that when propagating a background garbage collection request, the garbage collector deletes dependents with DeleteOptions.PropagationPolicy=nil, which means the default GC policy of a resource (defined by its REST strategy) and the existing GC-related finalizers decide how the delete request is propagated further. Unfortunately, the default GC policy for RS is orphaning, so the pods are left behind when a deployment is deleted.
This PR changes the garbage collector to delete dependents with DeleteOptions.PropagationPolicy=Background when the owner is deleted in background. This means the dependent's existing GC finalizers will be overridden, making orphaning less flexible (see this made-up case). I think sacrificing the flexibility of orphaning is worthwhile, because making the behavior of background garbage collection match users' expectations is more important.

cc @lavalamp @Kargakis @krmayankk @enisoc
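Roughly, the changed behavior can be pictured as the garbage collector issuing deletes like the sketch below for each dependent of a background-deleted owner (the dynamic-client call and helper name are only a stand-in for the GC's internal delete path):

```go
package sketch

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
)

// deleteDependentInBackground deletes a dependent with an explicit Background
// policy instead of a nil policy, so a dependent's default orphan behavior
// (e.g. a ReplicaSet's) no longer stops the cascade at that link.
func deleteDependentInBackground(client dynamic.Interface, gvr schema.GroupVersionResource, namespace, name string) error {
	bg := metav1.DeletePropagationBackground
	return client.Resource(gvr).Namespace(namespace).Delete(
		context.TODO(), name, metav1.DeleteOptions{PropagationPolicy: &bg})
}
```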