Don't apply waitBeforeExitSeconds to control-plane pods #10276

Merged 1 commit into main from alpeb/terminationgraceperiod on Feb 7, 2023

Conversation

alpeb
Member

@alpeb alpeb commented Feb 6, 2023

Close #10058

The Helm value proxy.waitBeforeExitSeconds introduces a pause after the pod receives the shutdown signal, and it was intended for pods in the data-plane whose main container needs to perform shutdown-time operations that require the network. Linkerd's control-plane pods don't require that*.

Additionally, if such shutdown operations take longer than 30s, then the user needs to set the pod's terminationGracePeriodSeconds (whose default is 30s) to be greater than proxy.waitBeforeExitSeconds to avoid the kubelet killing the pod before the operations complete. We don't expose terminationGracePeriodSeconds as a parameter to Linkerd's pods, so this scenario results in an error such as this:

Exec lifecycle hook ([/bin/sleep 40]) for Container "linkerd-proxy" in Pod "linkerd-destination-9559586c5-g9jns_linkerd(e33e8d02-66ca-42fa-9a7c-0ea45bda814a)" failed - error: command '/bin/sleep 40' exited with 137: , message: ""
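
To make the failure mode concrete, here is a minimal sketch of the timing involved. It is a simplified model, not the kubelet's actual logic: it assumes the preStop hook is a plain sleep and that it is force-killed with SIGKILL once the grace period runs out. The function name `prestop_outcome` is made up for illustration. Exit code 137 in the error above is 128 + 9, the conventional exit status of a process terminated by SIGKILL.

```python
SIGKILL = 9  # signal number used to force-kill a process on Linux


def prestop_outcome(wait_before_exit_seconds, termination_grace_period_seconds=30):
    """Return (completed, exit_code) for a preStop `/bin/sleep N` hook.

    Simplified model (an assumption, not kubelet code): the sleep finishes
    only if the grace period is strictly greater than the sleep duration;
    otherwise the process is killed and exits with 128 + SIGKILL.
    """
    if wait_before_exit_seconds < termination_grace_period_seconds:
        return True, 0
    return False, 128 + SIGKILL


# The case from the error message: sleep 40 against the default 30s grace period.
print(prestop_outcome(40))      # (False, 137)
# The same sleep with a grace period raised to 60s.
print(prestop_outcome(40, 60))  # (True, 0)
```

Since the grace period can't be raised on Linkerd's own pods, the only way to avoid the second return path there is to keep the sleep duration below 30s, or to drop the sleep entirely.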

For these two reasons, this change disables the proxy.waitBeforeExitSeconds setting for Linkerd's own pods, either by overriding it at the template level (for core control-plane pods) or through an annotation (for extension pods).
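
The effect of the override can be sketched roughly as follows. This is an illustration only, not the chart's template code: `proxy_lifecycle` is a hypothetical stand-in for the logic that renders the proxy container's lifecycle section, and it assumes that a value of 0 means no preStop hook is emitted. The `/bin/sleep` command is taken from the error message above.

```python
def proxy_lifecycle(wait_before_exit_seconds):
    """Hypothetical stand-in for rendering the proxy container's lifecycle.

    Assumption: a wait of 0 (or less) means no preStop hook is rendered.
    """
    if wait_before_exit_seconds <= 0:
        return {}
    return {
        "preStop": {
            "exec": {"command": ["/bin/sleep", str(wait_before_exit_seconds)]}
        }
    }


user_value = 40  # what the user set for proxy.waitBeforeExitSeconds

# Data-plane pods keep honoring the user's value.
print(proxy_lifecycle(user_value))

# Control-plane and extension pods are rendered as if the value were 0,
# so the sleep hook (and the error above) goes away.
print(proxy_lifecycle(0))
```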

(*) The Viz and Jaeger extensions don't require the network during shutdown either. The Multicluster extension already exposes a setting for terminationGracePeriod, so this change doesn't affect this particular extension.

@alpeb alpeb requested a review from a team as a code owner February 6, 2023 16:19
@alpeb alpeb force-pushed the alpeb/terminationgraceperiod branch from 14c98de to d3b4fd9 Compare February 6, 2023 16:21
Contributor

@kleimkuhler kleimkuhler left a comment

Looks good 👍

@alpeb alpeb merged commit fc7d553 into main Feb 7, 2023
@alpeb alpeb deleted the alpeb/terminationgraceperiod branch February 7, 2023 13:50
Successfully merging this pull request may close these issues.

Expose terminationGracePeriodSeconds for configuration via Linkerd Helm Charts