Support bulk resource deletion in the API #10217

Closed
wojtek-t opened this issue Jun 23, 2015 · 13 comments
Labels
area/kubectl area/usability priority/backlog Higher priority than priority/awaiting-more-evidence. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery.

Comments

@wojtek-t
Member

We would like to support bulk resource deletion (filtered by label selector) in the API.

The motivation is that currently removing the namespace after scalability tests (removing all events from that namespace) can take more than half an hour.

Having a single API call "remove all resources satisfying this selector" can hopefully reduce this time to less than a minute.

As with deleting a single object, we need to support graceful termination, so the call will need to be asynchronous.

cc @fgrzadkowski @bgrant0607

@wojtek-t wojtek-t added priority/backlog Higher priority than priority/awaiting-more-evidence. team/master labels Jun 23, 2015
@wojtek-t wojtek-t added this to the v1.0-post milestone Jun 23, 2015
@wojtek-t
Member Author

In fact, what we need to do here:

  1. Extend the "DeleteOptions" object to additionally contain LabelSelector and FieldSelector (similarly to ListOptions).
  2. Change the pkg/apiserver/resthandler.go DeleteResource method so that it also supports deletion when no name is specified (and then uses the selectors from the options instead).
  3. Change the registry interface so that it also supports deleting all resources satisfying the selector (either by extending the pkg/registry/generic/etcd/etcd.go Delete method or by adding a new "DeleteWithLabels" method).
  4. Implement the proper deletion code in the registry.
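
The four steps above can be illustrated with a minimal in-memory sketch. It is not the Kubernetes implementation: the `DeleteOptions`, `Resource`, and `Registry` classes below are hypothetical stand-ins, the selector is assumed to be equality-based only, and the choice to let an empty selector match every resource is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DeleteOptions:
    """Hypothetical stand-in for the DeleteOptions object in the thread."""
    grace_period_seconds: Optional[int] = None
    # Equality-based selector: every key/value pair must match.
    # An empty selector matches every resource (assumption of this sketch).
    label_selector: Dict[str, str] = field(default_factory=dict)


@dataclass
class Resource:
    name: str
    labels: Dict[str, str] = field(default_factory=dict)


class Registry:
    """In-memory registry keyed by resource name (illustrative only)."""

    def __init__(self) -> None:
        self._items: Dict[str, Resource] = {}

    def create(self, resource: Resource) -> None:
        self._items[resource.name] = resource

    def names(self) -> List[str]:
        return sorted(self._items)

    def delete(self, name: Optional[str], options: DeleteOptions) -> List[str]:
        """Delete one resource by name, or, when name is None, every
        resource whose labels satisfy options.label_selector.

        Returns the sorted names of the resources that were removed.
        """
        if name is not None:
            if name in self._items:
                del self._items[name]
                return [name]
            return []
        selector = options.label_selector
        matched = [
            r.name for r in self._items.values()
            if all(r.labels.get(k) == v for k, v in selector.items())
        ]
        for matched_name in matched:
            del self._items[matched_name]
        return sorted(matched)
```

Calling `delete(None, DeleteOptions(label_selector={"test": "load"}))` removes only the resources labelled `test=load`, while passing a name keeps the existing single-object behaviour. Graceful termination is not modelled here.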

@bgrant0607
Member

Let's not add a field selector to DeleteOptions for now, please. I'm not convinced that's a pattern we want to encourage, and power is hard to take away.

For these bulk mutations, we need to figure out how to make them work asynchronously, even in the presence of TCP disconnects and apiserver failures. For instance, Namespace deletion (#7372) can record the fact that it's deleting in the DeletionTimestamp of the Namespace resource. We probably need to create a Deletion controller with the specified label selector in order to drive the process until completion. We should have an API that would allow users to discover/confirm that such a deletion operation is in progress. Perhaps this suggests we should reintroduce an Operation API, only properly scoped (e.g., by namespace and kind) and persisted.

cc @derekwaynecarr @smarterclayton
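
One way to picture the persisted, scoped operation described above is a record that a controller repeatedly reconciles until nothing matches. This is only a sketch under stated assumptions: `DeletionOperation`, `reconcile`, and the `(namespace, kind, name)` key layout are hypothetical names invented for illustration, and the store is a plain dictionary rather than any real API.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Key = Tuple[str, str, str]  # (namespace, kind, name), hypothetical layout


@dataclass
class DeletionOperation:
    """Hypothetical persisted record of an in-progress bulk deletion."""
    namespace: str
    kind: str
    label_selector: Dict[str, str]
    done: bool = False


def matches(labels: Dict[str, str], selector: Dict[str, str]) -> bool:
    """Equality-based match: every selector pair must be present."""
    return all(labels.get(k) == v for k, v in selector.items())


def reconcile(op: DeletionOperation,
              store: Dict[Key, Dict[str, str]],
              batch_size: int = 2) -> bool:
    """Run one idempotent step: delete up to batch_size matching objects.

    The step re-lists the remaining matches every time instead of keeping
    a cursor, so it can be re-run safely after a crash or a dropped
    connection. Returns True once no matching object is left.
    """
    remaining = sorted(
        key for key, labels in store.items()
        if key[0] == op.namespace and key[1] == op.kind
        and matches(labels, op.label_selector)
    )
    for key in remaining[:batch_size]:
        del store[key]
    op.done = len(remaining) <= batch_size
    return op.done
```

Because the operation record carries the namespace, kind, and selector, a client that lost its connection could look the record up and check `done` rather than reissuing the delete.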

@bgrant0607 bgrant0607 added the sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. label Jun 25, 2015
@derekwaynecarr
Member

@wojtek-t - isn't the solution to bulk deletion of events in a namespace to stop deleting them one by one in etcd and instead issue a single delete call on the parent folder?

/registry/events/namespaceA/event1
/registry/events/namespaceA/event2
...

i.e. we would just do a DELETE /registry/events/namespaceA, which is what is implemented in #7372? To me that seems the ideal solution for the problem identified.
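
The difference between per-key deletes and a single delete on the parent folder can be sketched with a toy key store that counts client calls. `KeyValueStore` and its methods are hypothetical and do not correspond to the etcd client API; only the key layout is taken from the comment above.

```python
class KeyValueStore:
    """Toy flat key store that counts client calls (not the etcd API)."""

    def __init__(self):
        self.data = {}
        self.calls = 0

    def put(self, key, value):
        self.data[key] = value

    def delete(self, key):
        """Delete a single key; one call per key."""
        self.calls += 1
        existed = key in self.data
        self.data.pop(key, None)
        return existed

    def delete_prefix(self, prefix):
        """Remove every key under prefix in a single call.

        Returns the number of keys removed.
        """
        self.calls += 1
        # Force a trailing slash so /registry/events/namespaceA does not
        # also match keys under /registry/events/namespaceAB.
        prefix = prefix.rstrip("/") + "/"
        doomed = [k for k in self.data if k.startswith(prefix)]
        for k in doomed:
            del self.data[k]
        return len(doomed)
```

With N events in the namespace, `delete` costs N calls while `delete_prefix("/registry/events/namespaceA")` costs one, which is the saving the comment is pointing at.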

@derekwaynecarr
Member

@wojtek-t - is the concern the time to wait or the actual time to remove from etcd? I don't see how adding a label selector makes the actual time to purge from etcd better versus the single bulk delete of the collection.

@liggitt
Member

liggitt commented Jun 25, 2015

does a collection delete result in the proper watch events for the individual items deleted?

@wojtek-t
Member Author

@derekwaynecarr - it's definitely enough for the use case that motivated this. I think that what @bgrant0607 suggested is to have something more generic.

@derekwaynecarr
Member

@liggitt - yes

I guess rather than something generic that still results in N calls to etcd to delete N items, I would like to get the basic case right: making 1 call to delete N items from etcd. The only pattern for that now is collection delete, which works well in the namespace case, but longer term it would be good if we could pass multiple keys to delete to etcd. I need to review the v3 API for etcd, but I thought that pattern was there.

@wojtek-t
Member Author

In the v3 etcd API there will be support for transactions. So you will be able to delete multiple objects within a single transaction (which will effectively be a single call).
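
The idea of deleting several objects in one transaction can be sketched with a toy store whose multi-key delete is all-or-nothing. `TransactionalStore` is a hypothetical model, not the etcd v3 client, and the "every key must exist" precondition is an assumption chosen only to make the atomic behaviour visible.

```python
class TransactionalStore:
    """Toy store with all-or-nothing multi-key deletes (not the etcd v3 API)."""

    def __init__(self, data=None):
        self.data = dict(data or {})
        self.round_trips = 0

    def delete_all(self, keys):
        """Delete every key in one round trip, or none if any key is missing.

        Returns True when the whole batch was applied.
        """
        self.round_trips += 1
        keys = set(keys)  # tolerate duplicate keys in the request
        if any(k not in self.data for k in keys):
            return False  # precondition failed: leave the store untouched
        for k in keys:
            del self.data[k]
        return True
```

A selector-based delete built on such a primitive would still need to list the matching keys first; the transaction only collapses the N deletes into one call.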

@bgrant0607
Member

Probably we should just wait for that API to implement the label selector filtering, then.

@wojtek-t
Member Author

> Probably we should just wait for that API to implement the label selector filtering, then.

SGTM

For now (after 1.0) it would be enough to have @derekwaynecarr #7372 merged.

@bgrant0607 bgrant0607 removed this from the v1.0-post milestone Jul 24, 2015
@wojtek-t
Member Author

Just as an update - we already have support for collection deletion at the apiserver level (supporting selectors, delete options etc.).
The only missing thing is that it's currently implemented naively in apiserver, because of etcd v2 limitations. Once we are migrated to etcd v3, we should just improve the implementation and close this issue.

@timothysc
Member

@wojtek-t do we need to keep this issue open then? Right now I'd prefer to open specifics on the most recent changes.

@wojtek-t
Member Author

I'm fine with that and closing the new one.


7 participants