Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix cascading delete #48138

Closed
wants to merge 1 commit into from
Closed

Conversation

FengyunPan
Copy link

@FengyunPan FengyunPan commented Jun 27, 2017

Kubernetes 1.6 adds a PropagationPolicy to the delete options,
obsoleting the previous OrphanDependents boolean. This PR sets
PropagationPolicy=foreground instead of OrphanDependents=false
when federation controller manager delete cluster resource.

@caesarxuchao ptal

Release note:

NONE

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jun 27, 2017
@k8s-ci-robot
Copy link
Contributor

Hi @FengyunPan. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: FengyunPan
We suggest the following additional approver: madhusudancs

Assign the PR to them by writing /assign @madhusudancs in a comment when ready.

No associated issue. Update pull-request body to add a reference to an issue, or get approval with /approve no-issue

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jun 27, 2017
@k8s-github-robot k8s-github-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. release-note-none Denotes a PR that doesn't merit a release note. labels Jun 27, 2017
@madhusudancs
Copy link
Contributor

/unassign
/assign @nikhiljindal

@caesarxuchao
Copy link
Member

/assign @nikhiljindal @caesarxuchao

@caesarxuchao
Copy link
Member

Sorry, i'll take a look after release 1.7 ships.

if err != nil {
c.tl.Fatalf("Error deleting federated %s %q: %v", c.kind, namespacedName, err)
}

deletingInCluster := (orphanDependents != nil && *orphanDependents == false)
deletingInCluster := defaultPropagationPolicy != nil && *defaultPropagationPolicy == "Foreground"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.. && *defaultPropagationPolicy == metav1.DeletePropagationForeground

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thank you.

@spiffxp
Copy link
Member

spiffxp commented Jun 30, 2017

/ok-to-test

@k8s-ci-robot k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jun 30, 2017
@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 30, 2017
@@ -186,8 +186,10 @@ func NewDeploymentController(federationClient fedclientset.Interface) *Deploymen
},
func(client kubeclientset.Interface, obj runtime.Object) error {
rs := obj.(*extensionsv1.Deployment)
orphanDependents := false
err := client.Extensions().Deployments(rs.Namespace).Delete(rs.Name, &metav1.DeleteOptions{OrphanDependents: &orphanDependents})
defaultPropagationPolicy := metav1.DeletePropagationForeground
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe s/defaultPropagationPolicy/policy? I don't see why it's called "default".

Copy link
Member

@caesarxuchao caesarxuchao Jul 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nikhiljindal is the federation control plane going to support both "foreground" and "background" gc? If so, an option should be passed into this function.

Actually if the federation control plane is going to support "background" GC, it could still deletes underlying resources with "ForegroundPropagtionPolicy", it only needs to remove the federation resource immediately. That's a separate issue anyway.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

orphanDependents := false
client.Extensions().Deployments(rs.Namespace).Delete(rs.Name, metav1.DeleteOptions{OrphanDependents: &orphanDependents})
Actually, the code can't cascade the deletion of deployment's pods. I will check it later.

@caesarxuchao
Copy link
Member

Federation controllerManager just removes the federation resource
and orphans the corresponding resources when orphanDependents is
false and PropagationPolicy is nil,

@FengyunPan i don't think this claim is true. The underlying objects are deleted with orphan=false, so they and their dependents will be garbage collected, in the background.

That said, i think using DeletePropagationForeground when deleting the underlying resources is correct, because the federation control plane intends to support foreground GC. @nikhiljindal could you confirm?

@FengyunPan
Copy link
Author

FengyunPan commented Jul 6, 2017

The 'kubectl delete deployment xxx' work fine, it called "m.RESTClient.Delete().
NamespaceIfScoped(namespace,m.NamespaceScoped).Resource(m.Resource).Name(name).Body(options).Do().Error()",
but "c.client.Delete().Namespace(c.ns).Resource("deployments"). VersionedParams(&listOptions, heme.ParameterCodec). Body(options). Do().Error()" can't work.
I don't know the relationship between OrphanDependents and PropagationPolicy, is there any doc about it?

@caesarxuchao
Copy link
Member

There should be comment on the types.

The failed cascading deletion was fixed by #44058.

@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 12, 2017
@FengyunPan
Copy link
Author

FengyunPan commented Jul 12, 2017

@caesarxuchao Thank you, I have update the comment and PR.
The fedaration deployment has been merged into 'sync/controller.go'. When user delete fed-deployment, adapter.ClusterDelete() will be called.
@nikhiljindal PTAL.

orphanedDependents := true
testCases := map[string]*bool{
"Resource should not be deleted from underlying clusters when OrphanDependents is true": &orphanedDependents,
"Resource should not be deleted from underlying clusters when OrphanDependents is nil": nil,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you remove this case?

Copy link
Author

@FengyunPan FengyunPan Jul 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, sorry, after reading the federation code again, I find that the federation apiserver use DeletionOptions.OrphanDependents to delete federation-resource and cluster-resource, so update the case until federation-apiserver don't use OrphanDependents.

@caesarxuchao
Copy link
Member

One nit, otherwise lgtm.

@FengyunPan FengyunPan force-pushed the cascade-delete branch 2 times, most recently from 0a7176f to 3260752 Compare July 13, 2017 11:47
// May need extra time to delete both federation and cluster resources
waitTimeout = c.clusterWaitTimeout
}
waitTimeout := ForeverTimeout
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pass the right clusterWaitTimeout in

return crudtester.NewFederatedTypeCRUDTester(logger, adapter, clusterClients, DefaultWaitInterval, wait.ForeverTestTimeout)
rather than overwriting it here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, I will update it.

}
waitTimeout := ForeverTimeout
//if deletingInCluster {
// // May need extra time to delete both federation and cluster resources
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete unused code

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -155,7 +160,7 @@ func (c *FederatedTypeCRUDTester) CheckDelete(obj pkgruntime.Object, orphanDepen
return false, err
})
if err != nil {
c.tl.Fatalf("Error deleting federated %s %q: %v", c.kind, qualifiedName, err)
c.tl.Fatalf("Error deleting federated %s %q in %s: %v", c.kind, qualifiedName, waitTimeout, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

%s seconds?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The waitTimeout contains unit.

// and deletion helper does a cascading deletion.
// If user deletes the resource with nil DeleteOptions or DeletionOptions.OrphanDependents = true,
// then the federation controllerManager just delete fed-resource.
// if user deletes the resource with DeletionOptions.OrphanDependents = true,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be OrphanDependents = false here instead of true?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, done.

// DeletionOptions.OrphanDependents = true then the apiserver removes the orphan finalizer
// and deletion helper does a cascading deletion.
// If user deletes the resource with nil DeleteOptions or DeletionOptions.OrphanDependents = true,
// then the federation controllerManager just delete fed-resource.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

controllerManager -> controller manager.
delete -> deletes
fed-resource -> federation resource

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for fixing it

// then the federation controllerManager just delete fed-resource.
// if user deletes the resource with DeletionOptions.OrphanDependents = true,
// then the federation apiserver removes the orphan finalizer and deletion helper
// does a cascading deletion(delete fed-resource and delete cluster-resource).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does a cascading deletion (deletes federation resource and cluster resources).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@FengyunPan
Copy link
Author

@nikhiljindal Deleting federation deployment and cluster deployment(contains rs and pod) is my PR's purpose. Actrually, after deleting federation deployment with OrphanDependents = false, the pods of federation deployment was left behind. Is it need to add Foreground PropagationPolicy into DeleteOptions?

@nikhiljindal
Copy link
Contributor

aah thanks for explaining that.
Please update the PR description to say that you are fixing that bug. Please update the PR description as well.

I looked at the code again and we are only changing the DELETE request that is sent to clusters, we are now setting PropagationPolicy to Foreground. This is not changing the DELETE request that is sent to federation apiserver by federation controller, hence my comment above is not valid.

This looks good to me. Will add lgtm label, once the open comments have been fixed.
Thanks!

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 19, 2017
@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 19, 2017
@FengyunPan FengyunPan force-pushed the cascade-delete branch 3 times, most recently from 3913acb to 5a98fbc Compare July 19, 2017 08:24
@FengyunPan
Copy link
Author

/test pull-kubernetes-unit

@FengyunPan
Copy link
Author

/retest

@FengyunPan
Copy link
Author

@nikhiljindal Hi, can you help me? I just use PropagationPolicy for deleting cluster resource and didn't update the deletion of federation resource. Why the test case times out?

@nikhiljindal
Copy link
Contributor

Which test case is timing out?

Before this change, deletion of dependent resources in underlying clusters was happening in background. For ex: while deleting federated deployment, federated controller was deleting deployment from underlying cluster such that it does not wait for replicasets and pods to be deleted (they were being deleted later in the background, without blocking deployment deletion). With this change, deployment deletion in underlying cluster will wait for replicasets and pods deletion. So it is expected to take some more time than before.

@FengyunPan
Copy link
Author

@nikhiljindal This case is timing out: k8s.io/kubernetes/test/integration/federation -run TestFederationCRUD

I change more time for this case later.

@FengyunPan FengyunPan force-pushed the cascade-delete branch 2 times, most recently from 10c4242 to c537d32 Compare August 22, 2017 02:45
After deleting federation deployment with OrphanDependents=false,
the pods of federation deployment was left behind.
Kubernetes 1.6 adds a PropagationPolicy to the delete options,
obsoleting the previous OrphanDependents boolean. This PR sets
PropagationPolicy=foreground instead of OrphanDependents=false
when federation controller manager delete cluster resource.
@FengyunPan
Copy link
Author

/test pull-kubernetes-bazel
/test pull-kubernetes-unit
/test pull-kubernetes-e2e-kops-aws
/test pull-kubernetes-e2e-gce-etcd3

@FengyunPan
Copy link
Author

/retest

@k8s-ci-robot
Copy link
Contributor

k8s-ci-robot commented Aug 24, 2017

@FengyunPan: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
pull-kubernetes-unit 3c3d638 link /test pull-kubernetes-unit

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-github-robot
Copy link

@FengyunPan PR needs rebase

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 28, 2017
@k8s-github-robot
Copy link

This PR hasn't been active in 90 days. Closing this PR. Please reopen if you would like to work towards merging this change, if/when the PR is ready for the next round of review.

cc @FengyunPan @caesarxuchao @colhom @nikhiljindal

You can add 'keep-open' label to prevent this from happening again, or add a comment to keep it open another 90 days

@k8s-ci-robot
Copy link
Contributor

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://github.com/kubernetes/kubernetes/wiki/CLA-FAQ to sign the CLA.

It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. and removed cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Nov 22, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. release-note-none Denotes a PR that doesn't merit a release note. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants