Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(WIP) Delete empty directories when namespaces are deleted #7344

Closed
wants to merge 2 commits into from
Closed

(WIP) Delete empty directories when namespaces are deleted #7344

wants to merge 2 commits into from

Conversation

jawnsy
Copy link

@jawnsy jawnsy commented Feb 16, 2016

This change resolves #7032

@liggitt
Copy link
Contributor

liggitt commented Feb 16, 2016

before this would work, we'd need the new namespace controller from upstream that uses deletecollection, and to update the origin namespace controller to use deletecollection

@jawnsy
Copy link
Author

jawnsy commented Feb 19, 2016

Hey Jordan,

We did some research and found that 300k empty directories takes 254MB in memory and 215MB on disk (etcd). Even though this solution isn't perfect, we'd like to get this change into Origin as soon as possible. This means we'll diverge from upstream Kubernetes, but that should be okay since eventually Kubernetes will switch to etcd 3.x which doesn't have this problem.

Can you review this request? I tested by making the same changes in upstream Kubernetes and saw that the changes work, for the most part (just excluding the kinds like pods that aren't using the etcd registry.)

cc @derekwaynecarr @abhgupta

Jonathan

@jawnsy jawnsy changed the title (WORK-IN-PROGRESS) Delete empty directories Delete empty directories when namespaces are deleted Feb 19, 2016
@liggitt
Copy link
Contributor

liggitt commented Feb 19, 2016

that doesn't seem like a big enough impact to justify carrying this

@derekwaynecarr
Copy link
Member

unit tests?

@@ -264,6 +264,20 @@ func (h *etcdHelper) Delete(ctx context.Context, key string, out runtime.Object)
}

// Implements storage.Interface.
func (h *etcdHelper) DeleteDir(ctx context.Context, directory string) error {
if ctx == nil {
glog.Errorf("Context is nil")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return an error, don't log

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@liggitt Wouldn't this make the behaviour of DeleteDir different from the rest of the methods, e.g. Watch?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're right... the other places are weird, too

if ctx is required, we should return an error
if it's not required, we shouldn't log an error

@derekwaynecarr
Copy link
Member

read about how to name commits when making upstream changes:

UPSTREAM: <carry>: A carried kube change

https://github.com/openshift/origin/blob/master/HACKING.md#cherry-picking-an-upstream-commit-into-origin

@derekwaynecarr
Copy link
Member

@liggitt - so I think its fair to have @jawnsy determine what kinds in openshift would actually benefit from the change looking at the code that is actually in the code base today. As you note, this helps with upstream, but does not yet fix the dangling pod and service folders. Nothing downstream uses delete collection, so maybe he can get a PR to start moving the ball forward on that in parallel since I do not anticipate we will get the dynamic client in time for 3.2 since it has not yet merged in 1.2 kube.

I agree there is no sense in carrying the patch if its ignored by a large number of downstream resources.

@jawnsy jawnsy changed the title Delete empty directories when namespaces are deleted (WIP) Delete empty directories when namespaces are deleted Feb 29, 2016
@jawnsy
Copy link
Author

jawnsy commented Mar 1, 2016

I was struggling to answer @derekwaynecarr's question for a few days, so @liggitt took mercy on me and wrote some quick Go code to determine which resources need to be updated.

Here's the code for posterity (dropped into master.go in the GetRestStorage() func after we build the storage map):

    for resource, handler := range storage {
        _, isLister := handler.(rest.Lister)
        _, isDeleter := handler.(rest.Deleter)
        _, isGracefulDeleter := handler.(rest.GracefulDeleter)
        _, isCollectionDeleter := handler.(rest.CollectionDeleter)
        if isLister && (isDeleter || isGracefulDeleter) && !isCollectionDeleter {
            fmt.Println(resource)
        }
    }
    os.Exit(1)

This outputs:

roles
clusterRoleBindings
projects
imageStreamTags
oAuthAuthorizeTokens
roleBindings
oAuthAccessTokens
clusterRoles

This is the list of things that have to be listed and deleted one-by-one. However, roles, clusterRoleBindings, projects, imageStreamTags, roleBindings, and clusterRoles are all virtual (they're backed by other resources) so we don't have to worry about them.

Thus the remaining real resources that need to be updated are oAuthAuthorizeTokens and oAuthAccessTokens to use the new generic etcd storage mechanism.

@abhgupta how should we proceed given the above?

@liggitt
Copy link
Contributor

liggitt commented Mar 1, 2016

Also note that both of those types are not namespaced, so would not factor into per-namespace empty directory counts

@abhgupta
Copy link
Member

abhgupta commented Mar 2, 2016

So, for all of the project scoped resources, the current PR will be able to delete their leftover directories in etcd. Is my understanding correct?

Also, since this PR is not being submitted/accepted upstream K8s, what is the cost of carrying this patch in origin and do we have consensus on carrying this patch in origin?

@liggitt
Copy link
Contributor

liggitt commented Mar 2, 2016

No, that just means that all the namespaced resources have a deletecollection API method. The controller that cleans up origin resources has not been updated to use deletecollection yet. The controller that cleans up kube resources uses deletecollection for most, but not all kube resources.

@abhgupta
Copy link
Member

abhgupta commented Mar 2, 2016

Alright, so it doesn't really make sense to have this patch at our end till we have the namespaced resources in origin being cleaned up via the deletecollection API call.

Agreed?

@derekwaynecarr
Copy link
Member

This should make kube 1.2 if I ever get a chance to finally finish the last code review updates:
kubernetes/kubernetes#21400

It will move to use delete collection for all things that support it upstream - which is everything but services. After the next rebase, we need to assess using this controller in origin for origin resources or updating the existing controller to call delete collection for origin resources. I think it should support reuse, but we cannot use discovery in OpenShift to find its namespaced resources yet so it would be a hard coded list of group kinds that get passed into controller.

@liggitt
Copy link
Contributor

liggitt commented Mar 2, 2016

everything but services

and pods, for graceful deletion purposes, right?

@derekwaynecarr
Copy link
Member

@liggitt - pods now use delete collection.

per the latest rebase, using the upstream namespace controller to delete downstream OpenShift resources is listed as a p2 #7975

@openshift-bot
Copy link
Contributor

Origin Action Required: Pull request cannot be automatically merged, please rebase your branch from latest HEAD and push again

@openshift-bot openshift-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 16, 2016

// Try to delete the directory, which will only succeed if the directory is empty.
// We can ignore errors since the cost of an empty directory is relatively small.
e.Storage.DeleteDir(ctx, e.KeyRootFunc(ctx));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we not call this when list options has provided a field selector or label selector?

delete collection lets you scope the set of items to delete....

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should only call this when the ctx has no namespace.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you mean when it has a namespace?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is correct. We only want to delete namespaced resources. The comment was probably a typo.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, yes, it was a typo.

@derekwaynecarr
Copy link
Member

https://github.com/openshift/origin/blob/master/pkg/project/controller/controller.go#L100

^^^

I am less concerned about changing the existing controller to call delete collection on a per call basis.

@openshift-bot openshift-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 24, 2016
@@ -534,12 +534,22 @@ func (e *Etcd) DeleteCollection(ctx api.Context, options *api.DeleteOptions, lis
}()
}
wg.Wait()

if listOptions.LabelSelector == nil && listOptions.FieldSelector == nil && api.NamespaceValue(ctx) != "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move this to the success case (just before we return listObj, nil below)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these can be nil or the result of labels.Everything(), etc... change to

anyLabels := listOptions.LabelSelector == nil || listOptions.LabelSelector.Empty()
anyFields := listOptions.FieldSelector == nil || listOptions.FieldSelector.Empty()
if anyLabels && anyFields {
  ...
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the PR, thanks for the review!

Jonathan Yu added 2 commits March 29, 2016 15:52
Signed-off-by: Jonathan Yu <jawnsy@redhat.com>
…tion

Whenever DeleteCollection is invoked, we attempt to delete the parent directory using
DeleteDir, which will only succeed if the directory is, in fact, empty. This prevents
empty directories from collecting and consuming memory.

Signed-off-by: Jonathan Yu <jawnsy@redhat.com>
@jawnsy jawnsy self-assigned this Apr 26, 2016
@jawnsy
Copy link
Author

jawnsy commented May 13, 2016

We are going to withdraw this PR for now, as given the current plan and estimated growth rate, we do not anticipate the extra memory usage to be problematic. We can always reopen this if it becomes necessary later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Namespace/Project leftovers
5 participants