-
Notifications
You must be signed in to change notification settings - Fork 5.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal for Lifecycle Hooks #1171
Conversation
bf56307
to
9635ff1
Compare
``` | ||
|
||
## Algorithm | ||
If there is a LifecycleTemplate referenced from an object: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this an owner reference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is an explicit reference from DeploymentSpec, StatefulSetSpec and DaemonSetSpec - LifecycleTemplate *ObjectReference
**RFC Issue**: https://github.com/kubernetes/kubernetes/issues/14512 | ||
|
||
## Abstract | ||
The intent of this proposal it to support lifecycle hooks in all objects having rolling/recreate update; currently those are Deployments, StatefulSets and DaemonSets. Lifecycle hooks should allow users to better customize updates (rolling/recreate) without having to write their own controllers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please split these into separate lines? Otherwise, it's difficult to review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
``` | ||
|
||
### Affected Objects | ||
As of now the lifecycle hooks are relevant for Deployments, StatefulSets and DaemonSets. They will require new optional field to be able to reference LifecycleTemplate and extending its status by []LifecycleHookRevisionStatus. Also the controller will need to be slightly adjusted to scale in appropriate chunks and trigger the hooks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this proposal is about extending the existing controllers to run the hooks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
correct, (extending controllers + API objects)
@kubernetes/sig-apps-proposals |
9635ff1
to
ef0f993
Compare
## Algorithm | ||
If there is a LifecycleTemplate referenced from an object: | ||
|
||
1. Calculate next partition point to reach the closest lifecycle hook progress point and scale replicas in update appropriately. If there is no hook remaining, GOTO 5. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Does the controller block until the Job completes?
- How do you determine if a Job exists for a particular execution of the rolling update? That is, it seems to me that the execution of a Job is tied to the roll out of a specific revision, how do we track the lifecycle of the an individual Job for each roll out and rollback of a particular revision?
- Do we require the Jobs to be idempotent for safety. I think it will be difficult to ensure at most once execution. We could probably ensure at least once.
- If we are using available replicas as the trigger, what happens if the number of available replicas decreases during execution of the Job?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some concrete examples for 3:
- running a database migration - should be reentrant
- clearing a cache - should be reentrant
- notifying an external system of our current location - should be reentrant
I think at least once is probably ok, but I don't know that all users would expect at-least-once. Similar to daemonset discussion - will users naturally assume these are at-most-once, and if so, will they experience catastrophic failure due to it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of the examples from the previous comment, I think all of them imply the controller MUST block until the job completes. Not running a database migration could be catastrophic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Does the controller block until the Job completes?
Yes. (Although there is always possibility of adding an option to customize it.)
- Do we require the Jobs to be idempotent for safety. I think it will be difficult to ensure at most once execution. We could probably ensure at least once.
User will get whatever guarantees Kubernetes Job gives him. (But it should be idempotent.)
- If we are using available replicas as the trigger, what happens if the number of available replicas decreases during execution of the Job?
Nothing, once we reach certain progress point we trigger the hook and never go back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need to think about details of rollback in case of 2. more but it should be based on revision
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do you determine if a Job exists for a particular execution of the rolling update? That is, it seems to me that the execution of a Job is tied to the roll out of a specific revision, how do we track the lifecycle of the an individual Job for each roll out and rollback of a particular revision?
Would this be the same mechanism we use to determine whether a replicaset exists for a given deployment? That is, a deterministically named job based on the pod template spec hash.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could associate all the hooks (Jobs) by setting controllerRef to particular RS. If we decide to run hooks for rollbacks, those could be distinguished by label.
We would have probably done it either way for purposes of adoption and garbage collection.
4. GOTO 1. | ||
5. Finish | ||
|
||
(In case of rollover cancel any hooks running and don't execute new ones.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by rollover?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As of now the lifecycle hooks are relevant for Deployments, StatefulSets and DaemonSets. | ||
They will require new optional field to be able to reference LifecycleTemplate and extending its status by []LifecycleHookRevisionStatus. | ||
Also the controller will need to be slightly adjusted to scale in appropriate chunks and trigger the hooks. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think that we might start with Deployment (its the most utilized and generally its used for stateless non-critical components), promote a stable implementation there, and then adapt that implementation to StatefulSet and DaemonSet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, my plan is to start with Deployments.
type DeploymentSpec struct { | ||
// ... | ||
// Addition | ||
LifecycleTemplate *ObjectReference |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if I want to control the rollout process? Isn't that a job itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be a separate strategy that forces the controller to ignore the deployment. Wouldn't that process need to implement the hooks api? Sounds like a separate design proposal.
// "Ignore" - Even if the hook fails deployment continues rolling out new version. | ||
FailurePolicy string | ||
|
||
JobSpec v1.JobSpec |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do I parameterize the job spec based on data in my deployment? Or would I have to create one hook per deployment in that case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The plan was to inject some environment variables inside (like image) but I failed to mention it in the proposal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any way for downward API to help here? The platform choosing a static and arbitrary set of fields to expose as (arbitrary) env vars is definitely useful, but allowing the users to select what data to expose and as mounted files seems very interesting to me. Crazy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of extending downward API to cover this use case where the objects are separate. That would allow you to pass any deployment property and we could inject just limited and necessary number of envs like the previous image or similar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of extending downward API to cover this use case where the objects are separate. That would allow you to pass any deployment property
Expanding downward API to be cross-object is a pretty big change... is that desired? If so, we'd need to be really careful:
- puts a burden on the kubelet to watch more objects
- could get the kubelet to fetch and inject info you shouldn't be able to see
- can run into ordering problems when multiple unsynchronized data sources update (current downward API info all comes from the pod object which has a linearizable resourceVersion)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reusing existing downward API could be a good argument for having the PolicyTemplate embedded in the object (Deployment, ...) at the expense of re-usability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reusing existing downward API could be a good argument for having the PolicyTemplate embedded in the object (Deployment, ...) at the expense of re-usability.
Talked some with @tnozicka about this offline and I'm now convinced that the lifecycle hook makes more sense embedded into the deployment spec itself rather than as a reference. Being able to use the downward API for the job template seems more generally useful than reusing hook templates across deployments. Deployments themselves (and any hooks defined within) could be templated in other ways.
1. Calculate next partition point to reach the closest lifecycle hook progress point and scale replicas in update appropriately. If there is no hook remaining, GOTO 5. | ||
2. When partition point is reached, run the hook by creating a Job using LifecycleHook.JobSpec | ||
3. | ||
1. If the hook failed and FailurePolicy is Abort - emit event, fail the update and initiate rollback (GOTO 5.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Define rollback here. I'm not sure whether you mean "give up on the new deployment and mutate the deployment back to the old revision" or "go into very long backoff". I.e. perm-fail vs not-perm fail
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean fail this rollout for good and rollback to previous (working) revision
// before any later one. | ||
ProgressPoint intstr.IntOrString | ||
|
||
// "Abort" - Failure for this hook is fatal, deployment is considered failed and should be rollbacked. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no such thing as automatic rollback today in Deployments so in terms of what we have the deployment is stuck. If we implement automatic rollback here we should also think of the case where we don't run any hooks (only progressDeadlineSeconds).
/cc @ironcladlou |
|
||
type LifecycleHook struct { | ||
// Unique name | ||
Name string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unique in what scope?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the scope of Hooks []LifecycleHook
; likely better to mention the restriction there
RetainPolicy string | ||
|
||
// After reaching this limit the Job history will be pruned. | ||
RevisionHistoryLimit *int32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems a bit inflexible given the 1:1 relationship between a LifecycleHook and Job, especially given the size of Hooks is variable. Does it make any sense for the history limit to be per hook?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both options make sense to me; limit hooks history per rollout and per hook.
Other option is to start with no limiting because with owner references it will be automatically limited by the RevisionHistoryLimit from the deployment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other option is to start with no limiting because with owner references it will be automatically limited by the RevisionHistoryLimit from the deployment
Interesting point. I see there was some prior discussion of job pruning.
My current thought is the proposal would be simplified by removing any sort of bespoke pruning capability. Pruning of jobs generally seems orthogonal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am fine starting small without any explicit pruning.
// "Always" - keep all Jobs | ||
// "OnFailure" - keep only failed Jobs | ||
// "Never" - delete Jobs immediately | ||
RetainPolicy string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there ever a case where this policy might differ between hooks within the same template?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Different hooks can have different priority (say notifying IRC vs. migrating a database); this way you can choose to retain only some of them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those use cases seem to provide an argument for retention policy being associated with each LifecycleHook rather than the LifecycleTemplateSpec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving this to per hook then ;)
// If such situation shall occur that two different ProgressPoints should be reached at | ||
// the same time, all the hooks for an earlier ProgressPoint will be ran (and finished) | ||
// before any later one. | ||
ProgressPoint intstr.IntOrString |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Has anybody yet envisioned another triggering criterion? If we think we might come up with more, a single field coupled to availableReplicas will become messy and we might want to talk about a new type here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a use case not covered by ProgressPoint
but interested to hear ideas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please let's not use this type any more. It was a bad idea. Make individual fields with semantically significant names.
My use case for a deployment lifecycle hook is as follows: I will have a Helm lifecycle hook which will talk to an authentication service to create a new authentication credential for the deployment version. I would want a lifecycle hook to run when that deployment version is completely rolled off of (deleted, rolled back, or replaced with new verison) in order to disable that credential. If the deployment is rolled back to that version, a lifecycle hook would have to run in order to re-enable that credential prior to starting any PODs. I don't think this proposal handles that use case. |
@kow3ns thanks for the feedback!
This predates KEP, but it makes sense to convert it.
Sure, I'll emphasize that in the updated proposal.
I am gonna write down some points and put the discussing in the agenda.
I hope we succeed in 3. because the complexity of not doing this in core is significantly higher as we need to interact with workload scaling which would require autopausing and/or external strategy.
I'll put that into the updated proposal.
Are other Sig-Apps proposals waiting for that? Is there a case where this one differs? Generally I think the alpha gate rule is in effect until a decision is made and it seems it might be just as close as enhancing the old approach. |
// "Never" - delete this job immediately after it is done | ||
RetainPolicy string | ||
|
||
JobSpec v1.JobSpec |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the apps API group should depend on the batch API group in this way. What is needed more than a pod template?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handling Pods directly is too low level. It means you have to handle the retries, completions or situation like when node with the pod goes down and you need to delete it and run another one. We care about a completion of the hook, Jobs provide a higher level primitive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you really need to expose all of Job? Wouldn't we be better to librarify the job control logic and use it from a purpose-built controller?
@@ -0,0 +1,152 @@ | |||
# Lifecycle Hooks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All docs in the design-proposals directly belong in SIG-specific subdirectories
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I'm gonna move it to KEP; this structure is year old based on what was there at the time it started.
Type HookTriggerType | ||
|
||
// ProgressAvailable config params. Present only if type = "ProgressAvailable". | ||
ProgressAvailableParams *ProgressAvailableParams |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@apelisse Do we have conventions for unions yet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing formal yet, but new APIs that have "oneof" structs should try to:
- Have a struct that only contains:
- a discriminator as a string,
- optional (with
// +optional
), omitempty, pointer fields,
- That structure must not be inlined,
- Values for the discriminator must match corresponding field names (modulo capitalized first letter).
} | ||
|
||
type HookTrigger struct { | ||
// Type of hook trigger. Can be ProgressAvailable. Default is ProgressAvailable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to explicitly document that clients should expect the set of valid values to grow.
https://github.com/kubernetes/community/blob/master/contributors/devel/api_changes.md#backward-compatibility-gotchas
Are we planning to validate the value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to explicitly document that clients should expect the set of valid values to grow.
https://github.com/kubernetes/community/blob/master/contributors/devel/api_changes.md#backward-compatibility-gotchas
Will document it's open-ended.
Are we planning to validate the value?
I was. Is it ok to weaken the validation in the future when it's documented the type is open-ended?
// Addition | ||
// List of lifecycle hooks | ||
// Can have multiple hooks at the same ProgressPoint; order independent | ||
LifecycleHooks []LifecycleHook |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are the merging semantics for these lists?
cc @apelisse
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LifecycleHook.Name
is the unique key, when merging 2 list the newer LifecycleHook
replaces the old one where the key matches.
Not sure if there is a syntax to express it or hint apimachinery.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is an example.
PS: We are in the process of changing the name and semantic of these "hints", but for now you should be using patchMergeKey
and patchStrategy
.
External solutions, such as Helm and Operators, provide lifecycle hooks around creation, updates, and deletion. Is the main value here from mid-rollout hooks? Is there a reason why multiple workload controllers can't be used to expose intermediate points? For instance, I'm convinced we should never implement promotable canaries (kubernetes/kubernetes#11505). Just create a canary deployment. Deployments, ReplicaSets, and DaemonSets should have uniform, fungible replicas. Is the main reason to provide hooks for those resources just consistency with StatefulSet, or are there important use cases for hooks for non-stateful applications? I'm pretty unhappy with the state of lifecycle hooks in Kubernetes currently. I'd like to see a more comprehensive, systematic proposal before we add more.
|
@bgrant0607 - Agreed that we need more consistency here. I can add some insight as to what is being done and why. entrypoint.sh (kubernetes/kubernetes#30716): We decided we were not going to do templates for file generation. We can add best practices to the documentation, but imo this has not been a major hurdle for our users to get over. Containers: postStart and preStop hooks; never added preStart and postStop (kubernetes/kubernetes#140): We can take a look at adding these. However, this would have to be done in coordination with SIG Node. SIG Apps could drive it though. Pods: initContainers; have not added deferContainers (#483) : After much discussion, its not clear that this solves actual problems. API symmetry and this are the best rationals I've seen for doing this. After multiple presentations at SIG-Apps it failed to gather the support or interest of the community. Volumes: never implemented container volumes (kubernetes/kubernetes#831): I'm not sure how this relates directly to life-cycle hooks. There are implications to the CRI and CSI here. The CSI doesn't support anything like this today, but, there is work in progress in SIG Storage that might let us do this given an arbitrary CRI implementation in the near future. I don't see how it relates to this issue though. Sidecar containers: (e.g., kubernetes/kubernetes#25908) : See 2148, there is an open proposal and we're talking it through with SIG Node now. ConfigMap: never implemented HUP (kubernetes/kubernetes#24957), nor any rollout mechanism (kubernetes/kubernetes#22368), nor post-update hooks (kubernetes/kubernetes#1553) : There isn't general agreement that we should do this 1163. So, with the exception 140, we have at least thought through and attempted to make progress on all of these, except for 831, and isn't clear to me wrt to how it relates to this space. This proposal, along with 2148, and potentially with 140 are places we feel we have enough consensus to make significant progress. There is nothing stopping us from doing all three in tandem so we can take a look at a proposal for 140. When 2148 is ready to take to SIG architecture for further review we will do so. OpenShift already uses this extension in DeployConfig, and the developers want to upstream it into core. The thought is to take it to SIG Architecture for a decision on whether this should be a core feature or implemented as a add on extension. |
@kow3ns Thanks. What I am looking for is:
|
@bgrant0607 will do |
Thanks. And, as a design alternative, how would this compare to a watching the controllers via something like metacontroller, which could provide hooks, if necessary? |
Lifecycle hooks should allow users to better customize updates (rolling/recreate) without having to write their own controllers. | ||
|
||
## Lifecycle Hooks | ||
Lifecycle hook is a Job that will get triggered by the update reaching certain progress point (like 50%). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these generic "lifecycle" hooks or just "update progress" hooks? The naming implies they are hooks that could apply at other points in the lifecycle (scale up, down, thresholds, delete) but they are only used for updates. Fix the naming of the usages?
// Accepts both the number of new pods available or a percentage value representing the ratio | ||
// of new.availableReplicas to the declared replicas. In case of getting a percentage | ||
// it will try to reach the exact or the closest possible point right after it, | ||
// trigger the job, wait for it to complete and then continue. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Document what happens with edges - 0, "0%", and "100%"
/kind kep |
REMINDER: KEPs are moving to k/enhancements on November 30. Please attempt to merge this KEP before then to signal consensus. Any questions regarding this move should be directed to that thread and not asked on GitHub. |
KEPs have moved to k/enhancements. Any questions regarding this move should be directed to that thread and not asked on GitHub. |
@justaugustus: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Not sure where did the discussion move to, can't find any further references. Any clues? |
@tnozicka / @justaugustus to where this proposal was moved on? I can't find it anywhere like @acasademont doesn't too... |
I have not moved it anywhere yet, I am temporarily out of bandwidth to move this forward, so I'll resurrect the proposal when things are back to normal. |
@tnozicka / @justaugustus Is there any update to this feature? Followed a link from @ironcladlou on issue#14512, following a link from @Kargakis on issue#36477. Looks like it has been sitting about a year now? |
Something like this is exactly what I am looking for. I want to create entries in a Redis database when certain deployments are, well... deployed. I cannot use container lifecycle hooks because the deployment should be able to start with 0 pod replicas (zero-scaled). |
@james-mchugh you can use admission webhooks for that: https://kubernetes.io/blog/2019/03/21/a-guide-to-kubernetes-admission-controllers/ |
cc: @mfojtik @Kargakis @smarterclayton
/xref
RFC Issue: kubernetes/kubernetes#14512
Previous proposal: kubernetes/kubernetes#33545