WIP Modify REST framework to support deleting all resources in a namespace #7372

Closed

Conversation

kbeecher
Contributor

A WIP that so far aims to cover points 1 and 2 in @derekwaynecarr's comment: #5933 (comment)

Refs: #5933

@derekwaynecarr derekwaynecarr self-assigned this Apr 27, 2015
@derekwaynecarr derekwaynecarr changed the title Modify REST framework to support deleting all resources in a namespace WIP Modify REST framework to support deleting all resources in a namespace Apr 27, 2015
@@ -107,6 +107,13 @@ func (w GracefulDeleteAdapter) Delete(ctx api.Context, name string, options *api
return w.Deleter.Delete(ctx, name)
}

// An object that can delete a collection of RESTful resources in a namespace
Member

Just say "An object that can delete a collection of RESTful resources"

I could see this being implemented for non-namespaced resources, for example, to delete all nodes.

@derekwaynecarr
Member

There was nothing immediately wrong with this, but when I went to test it on a vagrant setup, the kube-apiserver failed to launch with the following output in the log:

F0427 20:00:22.643514       1 server.go:244] Invalid Authentication Config: error reading /dev/null: could not read any certificates

This does not appear related to your PR, so I need to investigate it separately before I can test why you were not getting the Swagger output you expected.

@kbeecher kbeecher force-pushed the kbeecher/wip_namespace_delete branch from e685d1e to 15e6e21 on April 28, 2015 at 09:28
@kbeecher
Contributor Author

@derekwaynecarr Understood. Thanks for taking the time.

@kbeecher kbeecher force-pushed the kbeecher/wip_namespace_delete branch 3 times, most recently from 2d2e5d6 to 7e49ed4 on April 28, 2015 at 13:52
@kbeecher
Contributor Author

@derekwaynecarr I added a bit more code towards items 3 and 4.

Specifically, I added a "slice" of code that implements the other steps but only for pods. Hopefully this shows the basic outline of the approach (and asks a couple of questions too).

@derekwaynecarr
Member

@kbeecher - I will dedicate time today to trying this out and giving feedback. Thanks for your willingness to push the feature forward.

@derekwaynecarr
Member

Tried this out; things look to be moving along the right path. Swagger looked good. I will review and comment further.

@@ -107,6 +107,13 @@ func (w GracefulDeleteAdapter) Delete(ctx api.Context, name string, options *api
return w.Deleter.Delete(ctx, name)
}

// An object that can delete a collection of RESTful resources.
type CollectionDeleter interface {
// Deletes all resources of a certain type under a given namespace.
Member

"Deletes a collection of resources" (remove the mention of namespaces entirely)

@derekwaynecarr
Member

I like where this is headed; interested in getting some other feedback - most notably with respect to graceful delete.

@smarterclayton
Contributor

Will have comments later on this, please hold.


@kbeecher
Contributor Author

kbeecher commented May 4, 2015

@smarterclayton thanks for taking a look!

@smarterclayton
Contributor

Approach looks fine to me, derek seems to have you covered on comments.

@kbeecher
Contributor Author

kbeecher commented May 5, 2015

@derekwaynecarr I'm going to resume work on this. Do you have any preference between the adapter and delegate approaches? Personally, I would choose the delegate approach, but only because I'm familiar with it and am not exactly sure right now how I would implement an adapter :-)

@smarterclayton
Contributor

A default delegate on generic.Etcd is probably better - provide the default list-based implementation, plus a number of selectable implementations an implementor can choose from.
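
A minimal, self-contained sketch of the delegate idea (illustrative names only, not the PR's actual code): the generic store ships a default DeleteCollection built from the List and Delete it already has, and a concrete registry embeds the store, shadowing the method if it ever needs different behavior.

```go
package main

import "fmt"

// GenericStore stands in for generic.Etcd.
type GenericStore struct {
	items map[string]struct{}
}

func (s *GenericStore) List() []string {
	names := make([]string, 0, len(s.items))
	for n := range s.items {
		names = append(names, n)
	}
	return names
}

func (s *GenericStore) Delete(name string) error {
	delete(s.items, name)
	return nil
}

// DeleteCollection is the default delegate: list, then delete one by one.
func (s *GenericStore) DeleteCollection() error {
	for _, n := range s.List() {
		if err := s.Delete(n); err != nil {
			return err
		}
	}
	return nil
}

// PodStorage picks up the default DeleteCollection via embedding, and
// could shadow it with pod-specific (e.g. graceful) behavior instead.
type PodStorage struct {
	*GenericStore
}

func main() {
	pods := PodStorage{&GenericStore{items: map[string]struct{}{"a": {}, "b": {}}}}
	_ = pods.DeleteCollection()
	fmt.Println(len(pods.List())) // prints 0
}
```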

@kbeecher kbeecher force-pushed the kbeecher/wip_namespace_delete branch from 7e49ed4 to e1d0e52 on May 5, 2015 at 16:09
@kbeecher
Contributor Author

kbeecher commented May 5, 2015

I've added a little more, trying to implement it according to your suggestions. If you guys think this is the right approach, I'll start 'widening' this patch to cover all other resources.

@smarterclayton
Contributor

@csrwng can you describe the changes you are making as well so Karl is aware?


@csrwng
Contributor

csrwng commented May 5, 2015

@kbeecher @smarterclayton
So basically we are saying that we will have a set of methods that act on a collection vs. a set of methods that act on a named object. However, the naming still needs some thought (it doesn't make sense to apply -Collection or -Named to all of them). The example URIs are not necessarily real:

| Method | Collection | Single Named Object | Sample URIs |
| --- | --- | --- | --- |
| GET | List | Get | Collection: /ns/pods, One: /ns/pods/[name], Subresource: /ns/pods/[name]/log |
| GET (w/ watch param) | Watch | WatchNamed | Collection: /ns/pods?watch=true, One: /ns/pods/[name]?watch=true, Subresource: /ns/pods/[name]/special?watch=true |
| POST | Create | CreateNamed | Collection: /ns/pods, One: /ns/pods/[name], Subresource: /ns/pods/[name]/generate |
| PUT | UpdateCollection | Update | Collection: /ns/pods, One: /ns/pods/[name], Subresource: /ns/pods/[name]/status |
| DELETE | DeleteCollection | Delete | Collection: /ns/pods, One: /ns/pods/[name], Subresource: /ns/pods/[name]/status |
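
Purely to illustrate the split in the table (these names mirror the table, not actual interfaces from the rest package), the verbs could look like:

```go
// Object is a placeholder for a runtime object.
type Object interface{}

// Collection-level verbs.
type Lister interface {
	List() (Object, error)
}
type Watcher interface {
	Watch() (Object, error)
}
type Creater interface {
	Create(obj Object) (Object, error)
}
type CollectionUpdater interface {
	UpdateCollection(obj Object) (Object, error)
}
type CollectionDeleter interface {
	DeleteCollection() (Object, error)
}

// Named-object verbs.
type Getter interface {
	Get(name string) (Object, error)
}
type NamedWatcher interface {
	WatchNamed(name string) (Object, error)
}
type NamedCreater interface {
	CreateNamed(name string, obj Object) (Object, error)
}
type Updater interface {
	Update(name string, obj Object) (Object, error)
}
type Deleter interface {
	Delete(name string) (Object, error)
}
```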

@kbeecher
Contributor Author

@csrwng Thanks for the info. Is this only in planning so far, or do you also have some code/PRs that I can look at?

@csrwng
Contributor

csrwng commented May 11, 2015

@kbeecher mostly planning. I submitted a PR to start that change with the Create method here: #7718 with corresponding comments in the rest package: csrwng@fd65427#diff-47bcf1e79133be8a14b98c919e1d1361R31

@derekwaynecarr
Member

I don't see a reason to gate this work on what Cesar is looking at. They seem aligned from what I can gather. I would like to get this in and used before Kube 1.0.


@k8s-github-robot k8s-github-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Aug 27, 2015
@k8s-github-robot

Labelling this PR as size/XL

@k8s-github-robot k8s-github-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Sep 1, 2015
@k8s-github-robot

Labelling this PR as size/L

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 9, 2015
@wojtek-t
Member

wojtek-t commented Oct 8, 2015

@derekwaynecarr - I'm getting back to it since I would really like to make it happen.

I think that we can solve the watch problem using the "watch in apiserver" mechanism that I added some time ago.
Basically, the cacher stores the current state of resources in memory, so it is enough for it to observe just a "directory deleted" action: it can iterate over its elements and remove all those that were in that directory.
That would require switching it on for all resources, but we wanted to do that anyway.
WDYT?

However, I think that there might be one more problem - basically, I'm talking about graceful deletion. What should happen if DeletionGracePeriodSeconds is set for an object and we are removing a directory that contains such objects? I think that shouldn't be ignored... @smarterclayton
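
A rough sketch of the cacher idea above (hypothetical types and event names, not code from the PR; as the follow-up below shows, the real plumbing turned out to be harder):

```go
// Event is a simplified stand-in for a watch event.
type Event struct {
	Type string // "DELETED", "DIR_DELETED", ...
	Key  string
}

// Cacher keeps the current state of resources in memory, keyed by directory.
type Cacher struct {
	byDir map[string]map[string]struct{} // dir -> keys in that dir
	out   chan Event
}

func (c *Cacher) observe(e Event) {
	if e.Type != "DIR_DELETED" {
		c.out <- e
		return
	}
	// Expand one "directory deleted" event into per-object deletions:
	// the cacher still knows which objects were in that directory.
	for key := range c.byDir[e.Key] {
		c.out <- Event{Type: "DELETED", Key: key}
	}
	delete(c.byDir, e.Key)
}
```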

@derekwaynecarr
Member

We had discussed leaving pod deletion to happen on a per-item basis, but moving all other deletions to a collection-level delete.

Do you have an idea of how much complication this adds to our watch cache to understand hierarchy in this manner?


@wojtek-t
Member

wojtek-t commented Oct 8, 2015

> We had discussed leaving pod deletion to happen on a per-item basis, but moving all other deletions to a collection-level delete.

So do we implicitly assume that graceful deletion makes sense only for pods? The graceful deletion period is defined at the ObjectMeta level, so potentially it can be set for any object.
I'm not saying this is a wrong approach, but if so, we need to document it well.

> Do you have an idea of how much complication this adds to our watch cache to understand hierarchy in this manner?

Actually - I just looked into the code and I'm afraid I was wrong :(
In the Cacher we explicitly get the object that was deleted (it's already processed by etcd_watcher.go).
So we would have to modify etcd_watcher to understand it, and also modify the watch interface to enable passing such values (the cacher is in fact a decorator of etcd_helper).

So unfortunately it will NOT be easy...

@smarterclayton
Contributor

Graceful deletion is going to extend to other resources in the future on a case-by-case basis. Pods and namespaces are the only concrete graceful deletes, and namespaces have extra semantics that possibly should be aligned in the future.

I suspect bulk deletion is orthogonal to whether you can delete gracefully. If we have bulk delete with grace, doing a bulk set is reasonable. I don't see an easy way to make that efficient, though. It's worth noting that the Kubelet handles a raw delete with no graceful-delete metadata set as "use the default value from the pod spec", which means if you implement bulk delete as-is you're getting "bulk delete with graceful termination on the Kubelet". We could reject delete calls with explicit grace periods in the short term.


@wojtek-t
Member

@derekwaynecarr - I talked to the CoreOS folks about the watch issue; they already solve it in the v3 API, but are not going to solve it in the v2 API - see etcd-io/etcd#3672 for reference.
So I think that we should wait until we are using the v3 API in k8s and solve it then - WDYT?
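
For context, this is roughly how a prefix watch looks against the v3 API (a sketch using the shape of the later etcd clientv3 package, which postdates this thread): a ranged delete on the server still surfaces as one DELETE event per key, which is what the watch path needs.

```go
package main

import (
	"context"
	"fmt"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{Endpoints: []string{"localhost:2379"}})
	if err != nil {
		panic(err)
	}
	defer cli.Close()

	// Watch everything under one namespace's pod directory.
	ch := cli.Watch(context.Background(), "/registry/pods/demo/", clientv3.WithPrefix())

	for resp := range ch {
		for _, ev := range resp.Events {
			fmt.Println(ev.Type, string(ev.Kv.Key)) // e.g. DELETE /registry/pods/demo/pod-a
		}
	}
}
```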

@wojtek-t
Member

wojtek-t commented Dec 4, 2015

@derekwaynecarr @lavalamp @gmarek
I thought about it one more time, and I think there is a way to proceed with this PR.
What I would like to do is the following:

  • implement DeleteCollection in the apiserver as "List + Delete all returned items one by one"
  • [this is obviously not atomic, but for deleting a namespace it is good enough]
  • once we migrate to the v3 API in etcd, we can simply get rid of this code in the apiserver and use the appropriate etcd API call
  • however, writing this additional code in the apiserver should be extremely simple and is not much overhead

The advantage is that deleting a namespace can be much, much faster, because we won't be blocked on the namespace-controller QPS limit (and this will help us a lot in scalability tests).

Any thoughts on it? If you agree with it, I'm happy to work on it next week.
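
A minimal sketch of that plan (hypothetical names, not #18290 itself): DeleteCollection in the apiserver is just List followed by deleting each returned item, tolerating items that have already disappeared so a timed-out caller can simply retry.

```go
package main

import (
	"errors"
	"fmt"
)

var errNotFound = errors.New("not found")

// registry stands in for the storage layer, which already has List and Delete.
type registry struct {
	items map[string]struct{}
}

func (r *registry) list() []string {
	names := make([]string, 0, len(r.items))
	for n := range r.items {
		names = append(names, n)
	}
	return names
}

func (r *registry) delete(name string) error {
	if _, ok := r.items[name]; !ok {
		return errNotFound
	}
	delete(r.items, name)
	return nil
}

// deleteCollection implements "List + Delete all returned items one by
// one". It is not atomic: if the request times out partway through, a
// retry deletes whatever remains, so per-item "not found" is ignored.
func (r *registry) deleteCollection() error {
	for _, name := range r.list() {
		if err := r.delete(name); err != nil && !errors.Is(err, errNotFound) {
			return err
		}
	}
	return nil
}

func main() {
	r := &registry{items: map[string]struct{}{"pod-a": {}, "pod-b": {}}}
	_ = r.deleteCollection()
	fmt.Println(len(r.list())) // prints 0
}
```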

@derekwaynecarr
Member

@wojtek-t - so if I understand correctly, we will still do N deletes to etcd, but the client will make a single call to the API server? That sounds fine to me and fixes the watch problem for the time being. I didn't think the v3 model fixed our watch problem either. I assume that if the request times out, the caller just needs to know to try again (and the remaining items will be deleted)?

@wojtek-t
Member

wojtek-t commented Dec 4, 2015

> @wojtek-t - so if I understand correctly, we will still do N deletes to etcd, but the client will make a single call to the API server?

That's what I'm suggesting.

> That sounds fine to me and fixes the watch problem for the time being. I didn't think the v3 model fixed our watch problem either.

In the v3 API, this problem is solved - I talked to the CoreOS guys about it.

> I assume that if the request times out, the caller just needs to know to try again (and the remaining items will be deleted)?

Yes - I guess so.

@derekwaynecarr
Member

@wojtek-t - sgtm.

@lavalamp
Member

lavalamp commented Dec 5, 2015

I'm sorry for being so late to this party, but can we make the collection-level deletion take the same parameters as list (and watch)--e.g., label and field selectors--and not call it "delete-all"? (maybe "bulk-delete") I think that'd be much more generally useful!

@lavalamp
Member

lavalamp commented Dec 5, 2015

Yeah, I think @wojtek-t and I are on the same page.

@derekwaynecarr
Member

Please cc me on the resultant PR; I am interested in understanding how large the impact will be.

Please confirm that the delete collection call also deletes the directory node in the registry, and not just its child items as this PR intended?

@wojtek-t
Member

wojtek-t commented Dec 5, 2015

> Please cc me on the resultant PR; I am interested in understanding how large the impact will be.

Sure.

> Please confirm that the delete collection call also deletes the directory node in the registry, and not just its child items as this PR intended?

I will check whether we can easily do it without the v3 etcd API (I talked with the CoreOS guys and it will definitely be possible in v3).

@smarterclayton
Contributor

I'd prefer DELETE against the collection - we probably want to support delete options AND list options together, because otherwise there is no bulk graceful delete (what the namespace controller needs, vs. what teardown needs).


@wojtek-t
Member

wojtek-t commented Dec 5, 2015

@smarterclayton - I agree that passing both ListOptions & DeleteOptions is exactly what we need.
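
What that could look like as a signature (illustrative types only; the selector fields echo the point above that DELETE on a collection should accept the same parameters as LIST, e.g. DELETE /ns/pods?labelSelector=app%3Dweb):

```go
// ListOptions carries the same selectors a LIST request accepts.
type ListOptions struct {
	LabelSelector string
	FieldSelector string
}

// DeleteOptions carries per-item deletion behavior, such as a grace period.
type DeleteOptions struct {
	GracePeriodSeconds *int64
}

// CollectionDeleter deletes every object matching listOptions, applying
// options (e.g. graceful deletion) to each individual delete.
type CollectionDeleter interface {
	DeleteCollection(options *DeleteOptions, listOptions *ListOptions) error
}
```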

@wojtek-t
Member

wojtek-t commented Dec 7, 2015

@derekwaynecarr @smarterclayton - I've just sent out the initial version of my idea in #18290 for review.
Since this PR is by now pretty hard to rebase, I think we should just close it.

@lavalamp
Member

lavalamp commented Dec 7, 2015

Yes, that SGTM


@timothysc
Member

+1
