Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove CiliumNode deletion logic from CiliumNode watcher and guarantee CiliumNode's OwnerReference is always set #17329

Conversation

christarazi
Copy link
Member

@christarazi christarazi commented Sep 7, 2021

First commit:

We don't need to implement this logic for two reasons:

  1. We rely on CiliumNode resources to be deleted / cleaned up by
    attaching the corresponding K8s Node as an ownerReference in the
    CiliumNode.
  2. It is redundant to delete the CiliumNode in response to an
    event...of the CiliumNode deletion itself.

In very rare cases, this logic can actually delete a newly created
CiliumNode by accident (see example below). Instead, keep all deletion
logic besides the actual K8s API calls (DELETE) and perform a Get() to
ensure that it's been deleted. Otherwise, log to the user that the
resource may still exist.

Example:
Say an existing node was deleted and then recreated in
quick succession with the same name. When the node is recreated, the
agent will be scheduled on it. During bootstrap it'll create a
corresponding CiliumNode resource. Given that only one Operator is
operational at any time in a cluster, it is already running on another
node in the cluster. The node-delete event will first delete the K8s
node and then trigger a CN-delete via reason 1 from above. It is
possible for the CN-delete event to be delayed such that it is received
after the node-create event (the recreate). When the CN-delete event is
received by the already-running Operator, the CiliumNode watcher logic
will then trigger (erroneously) another CN-delete, thereby deleting the
CiliumNode resource while the K8s node is still alive.

Second commit:

It is impossible to set the OwnerReference if we fail to fetch the
corresponding Kubernetes Node and the existing CiliumNode resource
doesn't already have it set. We can rely the OwnerReference to be set
because this logic was added in v1.6, which is sufficiently earlier
version of Cilium. 1

The reason for doing this is to ensure that the OwnerReference can
always be set. If we cannot, this should be treated as an error and we
shouldn't proceed. Cilium should not run in an environment where the
Kubernetes Node resource is missing.

@christarazi christarazi added area/operator Impacts the cilium-operator component sig/k8s Impacts the kubernetes API, or kubernetes -> cilium internals translation layers. release-note/bug This PR fixes an issue in a previous release of Cilium. kind/bug This is a bug in the Cilium logic. labels Sep 7, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot added dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. and removed dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Sep 7, 2021
@christarazi christarazi requested a review from aanm September 7, 2021 06:23
@christarazi
Copy link
Member Author

test-me-please

@pchaigno
Copy link
Member

pchaigno commented Sep 7, 2021

The Travis test failure seems legitimate... and not very reassuring. There's actually a test (TestNewNodesPodCIDRManager) that expects two calls to k8sNode's OnDelete when receiving a k8sOpDelete. Makes me wonder if all this is on purpose for some reason :-/

@christarazi christarazi force-pushed the pr/christarazi/remove-ciliumnode-deletion-operator branch 3 times, most recently from 3ab0ac7 to ad01ab6 Compare September 8, 2021 01:30
@christarazi christarazi requested a review from pchaigno September 8, 2021 01:40
@christarazi
Copy link
Member Author

test-me-please

@christarazi christarazi changed the title operator: Remove CiliumNode deletion logic from CiliumNode watcher ipam, operator: Remove CiliumNode deletion logic from CiliumNode watcher Sep 8, 2021
@pchaigno
Copy link
Member

pchaigno commented Sep 8, 2021

Filed #17336 for that flake:
ci-l4lb

@pchaigno
Copy link
Member

pchaigno commented Sep 8, 2021

Filed #17337 for that flake:
ci-eks

@christarazi christarazi marked this pull request as ready for review September 8, 2021 18:09
@christarazi christarazi requested a review from a team September 8, 2021 18:09
@christarazi christarazi requested a review from a team as a code owner September 8, 2021 18:09
@tgraf
Copy link
Member

tgraf commented Sep 9, 2021

So far, we are not guaranteeing the that ownerReference is set:

        // Tie the CiliumNode custom resource lifecycle to the lifecycle of the
        // Kubernetes node
        if k8sNode, err := n.k8sNodeGetter.GetK8sNode(context.TODO(), nodeTypes.GetName()); err != nil {
                log.WithError(err).Warning("Kubernetes node resource representing own node is not available, cannot set OwnerReference")
        } else {

We would have to fail bootstrapping of the node in this case, otherwise we might have a silent failure to set the ownerReference and a leak of the CiliumNode.

I can't think of a reason to node fail on failure to retrieve the k8s node. I would propose the following logic:
In mutateNodeResource():

  • Fail if n.k8sNodeGetter.GetK8sNode() fails and nodeResource.ObjectMeta.OwnerReferences is not already set.

@christarazi christarazi force-pushed the pr/christarazi/remove-ciliumnode-deletion-operator branch from ad01ab6 to 17101e8 Compare September 9, 2021 19:11
@christarazi christarazi requested a review from a team as a code owner September 9, 2021 19:11
@christarazi
Copy link
Member Author

Thanks @tgraf, good suggestion. I added the commit to guarantee we always have an ownerReference.

@christarazi christarazi changed the title ipam, operator: Remove CiliumNode deletion logic from CiliumNode watcher Remove CiliumNode deletion logic from CiliumNode watcher and guarantee CiliumNode's OwnerReference is always set Sep 9, 2021
sayboras added a commit to sayboras/cilium that referenced this pull request May 5, 2022
In the normal scenario, CiliumNode is created by agent with owner
references attached all time in below PR[0]. However, there could
be the case that CiliumNode is created by IPAM module[1], which
didn't have any ownerReferences at all. For this case, if the
corresponding node got terminated and never came back with same
name, the CiliumNode resource is still dangling, and needs to be
garbage collected.

This commit is to add garbage collector for CiliumNode, with below
logic:

- Gargage collector will run every predefined interval (e.g. specify
  by flag --nodes-gc-interval)
- Each run will check if CiliumNode is having a counterpart k8s node
  resource. Also, remove this node from GC candidate if required.
- If yes, CiliumNode is considered as valid, happy day.
- If no, check if ownerReferences are set.
  - If yes, let k8s perform garbage collection.
  - If no, mark the node as GC candidate. If in the next run, this
    node is still in GC candidate, remove it.

References:

[0]: cilium#17329
[1]: https://github.com/cilium/cilium/blob/master/pkg/ipam/allocator/podcidr/podcidr.go#L258

Signed-off-by: Tam Mach <tam.mach@isovalent.com>
kaworu pushed a commit that referenced this pull request May 6, 2022
In the normal scenario, CiliumNode is created by agent with owner
references attached all time in below PR[0]. However, there could
be the case that CiliumNode is created by IPAM module[1], which
didn't have any ownerReferences at all. For this case, if the
corresponding node got terminated and never came back with same
name, the CiliumNode resource is still dangling, and needs to be
garbage collected.

This commit is to add garbage collector for CiliumNode, with below
logic:

- Gargage collector will run every predefined interval (e.g. specify
  by flag --nodes-gc-interval)
- Each run will check if CiliumNode is having a counterpart k8s node
  resource. Also, remove this node from GC candidate if required.
- If yes, CiliumNode is considered as valid, happy day.
- If no, check if ownerReferences are set.
  - If yes, let k8s perform garbage collection.
  - If no, mark the node as GC candidate. If in the next run, this
    node is still in GC candidate, remove it.

References:

[0]: #17329
[1]: https://github.com/cilium/cilium/blob/master/pkg/ipam/allocator/podcidr/podcidr.go#L258

Signed-off-by: Tam Mach <tam.mach@isovalent.com>
kaworu pushed a commit to kaworu/cilium that referenced this pull request May 9, 2022
[ upstream commit edc1a0a ]

In the normal scenario, CiliumNode is created by agent with owner
references attached all time in below PR[0]. However, there could
be the case that CiliumNode is created by IPAM module[1], which
didn't have any ownerReferences at all. For this case, if the
corresponding node got terminated and never came back with same
name, the CiliumNode resource is still dangling, and needs to be
garbage collected.

This commit is to add garbage collector for CiliumNode, with below
logic:

- Gargage collector will run every predefined interval (e.g. specify
  by flag --nodes-gc-interval)
- Each run will check if CiliumNode is having a counterpart k8s node
  resource. Also, remove this node from GC candidate if required.
- If yes, CiliumNode is considered as valid, happy day.
- If no, check if ownerReferences are set.
  - If yes, let k8s perform garbage collection.
  - If no, mark the node as GC candidate. If in the next run, this
    node is still in GC candidate, remove it.

References:

[0]: cilium#17329
[1]: https://github.com/cilium/cilium/blob/master/pkg/ipam/allocator/podcidr/podcidr.go#L258

Signed-off-by: Tam Mach <tam.mach@isovalent.com>
Signed-off-by: Alexandre Perrin <alex@kaworu.ch>
aanm pushed a commit to kaworu/cilium that referenced this pull request May 9, 2022
[ upstream commit edc1a0a ]

In the normal scenario, CiliumNode is created by agent with owner
references attached all time in below PR[0]. However, there could
be the case that CiliumNode is created by IPAM module[1], which
didn't have any ownerReferences at all. For this case, if the
corresponding node got terminated and never came back with same
name, the CiliumNode resource is still dangling, and needs to be
garbage collected.

This commit is to add garbage collector for CiliumNode, with below
logic:

- Gargage collector will run every predefined interval (e.g. specify
  by flag --nodes-gc-interval)
- Each run will check if CiliumNode is having a counterpart k8s node
  resource. Also, remove this node from GC candidate if required.
- If yes, CiliumNode is considered as valid, happy day.
- If no, check if ownerReferences are set.
  - If yes, let k8s perform garbage collection.
  - If no, mark the node as GC candidate. If in the next run, this
    node is still in GC candidate, remove it.

References:

[0]: cilium#17329
[1]: https://github.com/cilium/cilium/blob/master/pkg/ipam/allocator/podcidr/podcidr.go#L258

Signed-off-by: Tam Mach <tam.mach@isovalent.com>
Signed-off-by: Alexandre Perrin <alex@kaworu.ch>
aanm pushed a commit that referenced this pull request May 9, 2022
[ upstream commit edc1a0a ]

In the normal scenario, CiliumNode is created by agent with owner
references attached all time in below PR[0]. However, there could
be the case that CiliumNode is created by IPAM module[1], which
didn't have any ownerReferences at all. For this case, if the
corresponding node got terminated and never came back with same
name, the CiliumNode resource is still dangling, and needs to be
garbage collected.

This commit is to add garbage collector for CiliumNode, with below
logic:

- Gargage collector will run every predefined interval (e.g. specify
  by flag --nodes-gc-interval)
- Each run will check if CiliumNode is having a counterpart k8s node
  resource. Also, remove this node from GC candidate if required.
- If yes, CiliumNode is considered as valid, happy day.
- If no, check if ownerReferences are set.
  - If yes, let k8s perform garbage collection.
  - If no, mark the node as GC candidate. If in the next run, this
    node is still in GC candidate, remove it.

References:

[0]: #17329
[1]: https://github.com/cilium/cilium/blob/master/pkg/ipam/allocator/podcidr/podcidr.go#L258

Signed-off-by: Tam Mach <tam.mach@isovalent.com>
Signed-off-by: Alexandre Perrin <alex@kaworu.ch>
aanm pushed a commit that referenced this pull request May 9, 2022
[ upstream commit edc1a0a ]

In the normal scenario, CiliumNode is created by agent with owner
references attached all time in below PR[0]. However, there could
be the case that CiliumNode is created by IPAM module[1], which
didn't have any ownerReferences at all. For this case, if the
corresponding node got terminated and never came back with same
name, the CiliumNode resource is still dangling, and needs to be
garbage collected.

This commit is to add garbage collector for CiliumNode, with below
logic:

- Gargage collector will run every predefined interval (e.g. specify
  by flag --nodes-gc-interval)
- Each run will check if CiliumNode is having a counterpart k8s node
  resource. Also, remove this node from GC candidate if required.
- If yes, CiliumNode is considered as valid, happy day.
- If no, check if ownerReferences are set.
  - If yes, let k8s perform garbage collection.
  - If no, mark the node as GC candidate. If in the next run, this
    node is still in GC candidate, remove it.

References:

[0]: #17329
[1]: https://github.com/cilium/cilium/blob/master/pkg/ipam/allocator/podcidr/podcidr.go#L258

Signed-off-by: Tam Mach <tam.mach@isovalent.com>
Signed-off-by: Alexandre Perrin <alex@kaworu.ch>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/operator Impacts the cilium-operator component kind/bug This is a bug in the Cilium logic. release-note/bug This PR fixes an issue in a previous release of Cilium. sig/k8s Impacts the kubernetes API, or kubernetes -> cilium internals translation layers.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants