Deleting Antrea should do a complete cleanup #181
Comments
I think we could either have a separate DaemonSet or add a ConfigMap option that tells antrea-agent to do the cleanup. The two approaches are not very different from each other, but with the ConfigMap approach the agent could watch the option at runtime once we support runtime config changes.
I'd like to start working on this. I like the DaemonSet approach more since it works even if there is an issue (e.g. buggy agent) or if Antrea has already been deleted from the cluster. Someone might have done My proposal is as follows:
A few considerations:
@jianjuns @salv-orlando @tnqn any thoughts or concerns?
Questions:
I was thinking about running the same agent and controller binaries, to avoid introducing new ones. But I see there are benefits to using a separate binary too: it could be executed standalone, including manually. In that case, do you think it would be good to make it part of the CLI, so that users could run the CLI manually to clean up?
Just to add, it might be helpful to be able to clean up a Node after K8s is down.
I think it's better to introduce new binaries, as I like the idea of separating the cleanup logic from the core functionality (rather than adding a new mode of operation to the existing binaries). I definitely think we should have the ability to run the binary directly as a process. I would like to organize the code as a library that can be used by the CLI and/or in a DaemonSet. Even though
This is still very much a work-in-progress; I'm opening this PR to gather feedback on the approach.

For someone deleting Antrea, the steps would be as follows:
* `kubectl delete -f <path to antrea.yml>`
* `kubectl apply -f <path to antrea-cleanup.yml>`
* check that the job has completed with `kubectl -n kube-system get jobs`
* `kubectl delete -f <path to antrea-cleanup.yml>`

The cleanup manifest creates a DaemonSet that will perform the necessary deletion tasks on each Node. After the tasks have been completed, the "status" is reported to the cleanup controller through a custom resource. Once the controller has received enough statuses (or after a timeout of 1 minute), the controller job completes and the user can delete the cleanup manifest.

Known remaining items:
* place the cleanup binaries (antrea-cleanup-agent and antrea-cleanup-controller) in a separate docker image to avoid increasing the size of the main Antrea docker image
* generate the manifest with kustomize?
* find a way to test this as part of CI?
* update documentation
* additional cleanup tasks: as of now we only take care of deleting the OVS bridge
* place the cleanup CRD in a non-default namespace
* use kubebuilder instead of the code generator directly (related to antrea-io#16); we probably want to punt this task to a future PR.

See antrea-io#181
Removing this from 0.3.0, as PR #307 still needs a lot of work.
Unfortunately there weren't many reviews in the past few days. @antoninbas do you think we can still make a final push for 0.5.0 inclusion, or do we just move this issue to 0.6.0?
Pushing this out to the next release.
Delete the unrequired method of deleting record from record map without lock. Add a method to get the flow updated time for flow given flow key
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment, or this will be closed in 90 days
Describe the bug
When running `kubectl delete -f antrea.yml`, one would expect everything created by Antrea to be deleted, including the OVS bridge on each Node, along with the gateway interface. Failure to do so may cause networking issues if someone decides to deploy another CNI after deleting Antrea.
Versions:
Antrea version:
Additional context