
Add De-Init Containers for POD Termination Lifecycle #70496

Closed
rorysavage77 opened this issue Oct 31, 2018 · 24 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. sig/apps Categorizes an issue or PR as relevant to SIG Apps. sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@rorysavage77

rorysavage77 commented Oct 31, 2018

What would you like to be added:
Rationale: Currently, during Pod instantiation Kubernetes offers initContainers and PostStart hooks; for Pod termination, however, the only offering is the PreStop hook. This enhancement request is to allow the same duality that exists for Pod instantiation during Pod termination as well. The ability to run controlled, sequential containerized jobs at Pod termination would support application-management tasks such as clearing out configuration, running a deregistration process (or a series of them), and a whole range of other workflows that a single PreStop hook cannot accommodate.

Why is this needed:
It is needed because it is missing from the Kubernetes ecosystem. Offering initContainers and PostStart hooks for startup, but only a PreStop hook for termination, is extremely limiting. Having a De-Init container (or whatever you would like to call it) is imperative for closing the gap on complete container orchestration.

Example:

I have a vast, multi-Pod, microservices-based ecosystem. When certain worker services come online, they can (1) register with a registration API, (2) share their configuration with other services, (3) dynamically build their own configuration based on the current state of the ecosystem, (4) perform SQL updates, and (5) obtain a software license via an API call; then they can start and run. During termination, however, performing the reverse with just a PreStop hook is impossible.
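
For illustration, here is a minimal sketch of what is possible today (the image and script names are hypothetical): the entire reverse sequence has to be serialized into a single PreStop exec hook on the main container.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: worker
spec:
  containers:
  - name: worker
    image: example.com/worker:1.0          # hypothetical image
    lifecycle:
      preStop:
        exec:
          # Hypothetical scripts: the whole reverse sequence is crammed into
          # one hook, with no per-step isolation, retries, or visibility.
          command:
          - /bin/sh
          - -c
          - /scripts/deregister.sh && /scripts/drop-config.sh && /scripts/release-license.sh
```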

/kind feature

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Oct 31, 2018
@rorysavage77
Author

rorysavage77 commented Oct 31, 2018

@kubernetes/sig-architecture
@kubernetes/sig-cluster-lifecycle
@kubernetes/sig-cluster-ops
@kubernetes/sig-scheduling

/sig architecture
/sig cluster-lifecycle
/sig cluster-ops
/sig scheduling

@k8s-ci-robot k8s-ci-robot added sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/cluster-ops sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Oct 31, 2018
@goblain
Contributor

goblain commented Nov 3, 2018

Isn't it an issue, though, that initContainers have a hard requirement to succeed before the actual workload starts, whereas potential shutdownContainers could only run on a best-effort basis, with no solid guarantee that they will be executed as expected? That introduces volatility into the design of the deployed solution. Is the feature even worth having if it cannot be guaranteed to execute?

@neolit123
Member

I tend to agree with the statement above.

If you want to get more attention from maintainers, try asking in the #sig-node or #sig-scheduling Slack channels.

This is not related to code that these SIGs maintain:
/remove-sig cluster-lifecycle cluster-ops
/sig node

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. and removed sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/cluster-ops labels Nov 4, 2018
@rorysavage77
Author

Isn't it an issue, though, that initContainers have a hard requirement to succeed before the actual workload starts, whereas potential shutdownContainers could only run on a best-effort basis, with no solid guarantee that they will be executed as expected? That introduces volatility into the design of the deployed solution. Is the feature even worth having if it cannot be guaranteed to execute?

@goblain - perhaps I was not detailed enough in my feature request, which led to the assumptions above. I am not necessarily requesting that shutdownContainers must succeed prior to Pod termination. What I am requesting is that, upon Pod termination, shutdownContainer(s) can be triggered.

[Pod-workers] - State: Terminating
[shutdownContainer-1]: starts as soon as the Pod above enters Terminating
[shutdownContainer-2]: waits for shutdownContainer-1 to succeed

In this model, since the termination signal is sent to the Pod (and there is really no way to revert a termination signal), that is all we need in order to trigger a sequence of shutdownContainer jobs. They can be triggered asynchronously alongside the termination signal, or wait for the Pod to be completely destroyed before kicking off; it really doesn't matter, as long as they are executed with or after the termination signal.
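
Purely as a hypothetical sketch (shutdownContainers is not a Kubernetes API field; the name and semantics below simply mirror initContainers in reverse, and the images are made up), this is roughly what such a spec could look like:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: worker
spec:
  containers:
  - name: worker
    image: example.com/worker:1.0            # hypothetical image
  # Hypothetical field -- not part of the Kubernetes API. Each entry would run
  # to completion, in order, once the Pod enters the Terminating state.
  shutdownContainers:
  - name: deregister                         # starts when termination begins
    image: example.com/deregister:1.0
  - name: release-license                    # waits for the previous job to succeed
    image: example.com/release-license:1.0
```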

@keppi2

keppi2 commented Jan 22, 2019

👍

@bibryam

bibryam commented Feb 18, 2019

@rorysavage77
Author

The defer-container proposal would suffice and would provide additional benefits. It seems that proposal was started in 2017. How can we raise its priority?

@bgrant0607
Member

/remove-sig architecture

@k8s-ci-robot k8s-ci-robot removed the sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. label May 7, 2019
@bgrant0607
Member

/remove-sig scheduling
/sig apps

@k8s-ci-robot k8s-ci-robot added sig/apps Categorizes an issue or PR as relevant to SIG Apps. and removed sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. labels May 7, 2019
@bgrant0607
Member

Defer containers proposal: kubernetes/community#483
Note there are a number of cases where defer containers could not be guaranteed to execute (e.g., preemption, OOM, disk full, other resource contention).

Comment on our current hooks generally: kubernetes/community#1171 (comment)

cc @kow3ns

@lukasheinrich

lukasheinrich commented May 28, 2019

While reviewing KubeCon EU talks, I noticed Tekton seems to have a similar issue, which they also solved via a "hack" (injecting a smart binary that handles the ordering).

https://youtu.be/4EyTGYB7GvA?t=913

So it seems there are a bunch of use cases for "run a linear sequence of pods on the same node", and a native solution would be quite nice.

@bobcatfish @mrbobbytables @rochaporto @afortiorama @clelange

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 26, 2019
@kfox1111

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 29, 2019
@kfox1111

One of the things I've suffered through is that preStop hooks all run at once, and each container is killed as soon as its own preStop is done. There may be support containers that need to stay up while the main container's preStop hook runs, so maybe this proposal needs to work with that. Perhaps all de-init containers could run first, then preStop hooks, then the actual pod death?
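
As a minimal sketch of a common mitigation today (image names and scripts are hypothetical): because every preStop hook starts as soon as the Pod is deleted, the support container's preStop simply sleeps so that it outlives the main container's cleanup.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-sidecar
spec:
  terminationGracePeriodSeconds: 60
  containers:
  - name: main
    image: example.com/app:1.0               # hypothetical image
    lifecycle:
      preStop:
        exec:
          command: ["/bin/sh", "-c", "/scripts/drain-and-deregister.sh"]
  - name: log-forwarder
    image: example.com/log-forwarder:1.0     # hypothetical support container
    lifecycle:
      preStop:
        exec:
          # Crude ordering: keep the sidecar alive while the main preStop runs.
          command: ["/bin/sh", "-c", "sleep 30"]
```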

@zrss

zrss commented Nov 12, 2019

/cc

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 10, 2020
@kfox1111

This may be covered by the Sidecar container use case?

@filipedeo

@kfox1111 A sidecar container runs in parallel with the main application, and any sidecars would be stopped in conjunction with the main container. See this old conversation about defer containers (de-init containers) here: kubernetes/community#483

@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 24, 2020
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@Warxcell

Warxcell commented Nov 22, 2021

+1 - this would be useful when, for example, the main container generates some files (a database dump, unit test reports) and a de-init container can upload those files somewhere. That way we can split responsibilities and move out of the main container the binaries responsible only for uploading the generated files.
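
A minimal sketch of how this use case is usually approximated today without de-init containers (image names, paths, and scripts are hypothetical): the main container writes its artifacts to a shared emptyDir, and an uploader sidecar pushes them, best effort, from its preStop hook.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: report-generator
spec:
  volumes:
  - name: artifacts
    emptyDir: {}
  containers:
  - name: main
    image: example.com/tests:1.0             # hypothetical image that writes reports
    volumeMounts:
    - name: artifacts
      mountPath: /artifacts
  - name: uploader
    image: example.com/uploader:1.0          # hypothetical uploader sidecar
    volumeMounts:
    - name: artifacts
      mountPath: /artifacts
    lifecycle:
      preStop:
        exec:
          # Best effort: uploads whatever is in /artifacts when the Pod terminates.
          command: ["/bin/sh", "-c", "/scripts/upload.sh /artifacts"]
```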

@flavienbwk

flavienbwk commented Jun 6, 2022

This should be re-opened or a solution suggested.

While waiting for a proper solution, this could help: https://github.com/target/pod-reaper

/reopen

/remove-lifecycle rotten

@k8s-ci-robot
Contributor

@flavienbwk: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

This should be re-opened or a solution suggested.

/reopen

/remove-lifecycle rotten

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jun 6, 2022