Skip to content

Commit

Permalink
Additional KEP updates for Alpha
Browse files Browse the repository at this point in the history
  • Loading branch information
mimowo committed Aug 4, 2022
1 parent abe9a2b commit c7b8b97
Showing 1 changed file with 61 additions and 12 deletions.
73 changes: 61 additions & 12 deletions keps/sig-apps/3329-retriable-and-non-retriable-failures/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -107,6 +107,7 @@ tags, and then generate with `hack/update-toc.sh`.
- [Evolving condition types](#evolving-condition-types)
- [Design Details](#design-details)
- [New PodConditions](#new-podconditions)
- [Interim FailureTarget condition](#interim-failuretarget-condition)
- [JobSpec API](#jobspec-api)
- [Evaluation](#evaluation)
- [Test Plan](#test-plan)
Expand Down Expand Up @@ -169,7 +170,7 @@ Items marked with (R) are required *prior to targeting to a milestone / release*
- [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
- [x] (R) Production readiness review completed
- [x] (R) Production readiness review approved
- [ ] "Implementation History" section is up-to-date for milestone
- [x] "Implementation History" section is up-to-date for milestone
- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes

Expand Down Expand Up @@ -674,7 +675,8 @@ The Pod status (which includes the `conditions` field and the container exit
codes) could be lost if the failed pod is garbage collected.

Losing Pod's status before it is interpreted by Job Controller can be prevented
by using the feature of [job tracking with finalizers](https://kubernetes.io/docs/concepts/overview/working-with-objects/finalizers/).
by using the feature of [job tracking with finalizers](https://kubernetes.io/docs/concepts/overview/working-with-objects/finalizers/)
(see more about the design details section: [Interim FailureTarget condition](#interim-failuretarget-condition)).

#### Evolving condition types

Expand Down Expand Up @@ -739,13 +741,33 @@ pod delete requests are issued to modify the code to also append a meaningful
condition with dedicated `Type`, `Reason` and `Message` fields based on the
invocation context.

### Interim FailureTarget condition

There is a risk of losing the Pod status information due to PodGC, which could
prevent Job Controller to react to a pod failure with respect to the configured
pod failure policy rules (see also: [Garbage collected pods](#garbage-collected-pods)).

In order to make sure all pods are checked against the rules we require the
feature of [job tracking with finalizers](https://kubernetes.io/docs/concepts/overview/working-with-objects/finalizers/)
to be enabled.

Additionally, before we actually remove the finalizers from the pods
(allowing them to be deleted by PodGC) we record the determined job failure
message (if any rule with `JobFail` matched) in an interim job condition, called
`FailureTarget`. Once the pod finalizers are removed we update the job status
with the final `Failed` job condition. This strategy eliminates a possible
race condition that we could lose the information about the job failure if
Job Controller crashed between removing the pod finalizers are updating the final
`Failed` condition in the job status.

### JobSpec API

We extend the Job API in order to allow to apply different actions depending
on the conditions associated with the pod failure.

```golang
// PodFailurePolicyAction specifies how a Pod failure is handled.
// +enum
type PodFailurePolicyAction string
const (
Expand All @@ -764,6 +786,7 @@ const (
PodFailurePolicyActionCount PodFailurePolicyAction = "Count"
)
// +enum
type PodFailurePolicyOnExitCodesOperator string
const (
Expand All @@ -780,38 +803,47 @@ const (
type PodFailurePolicyOnExitCodesRequirement struct {
// Restricts the check for exit codes to the container with the
// specified name. When null, the rule applies to all containers.
// When specified, it should match one the container or initContainer
// names in the pod template.
// +optional
ContainerName *string
// Represents the relationship between the container exit code(s) and the
// specified values. Containers completed with success (exit code 0) are
// excluded from the requirement check.Possible values are:
// excluded from the requirement check. Possible values are:
// - In: the requirement is satisfied if at least one container exit code
// (might be multiple if there are multiple containers not restricted
// by the 'containerName' field) is in the set of specified values.
// - NotIn: the requirement is satisfied if at least one container exit code
// (might be multiple if there are multiple containers not restricted
// by the 'containerName' field) is not in the set of specified values.
// Additional values are considered to be added in the future. Clients should
// react to an unknown operator by assuming the requirement is not satisfied.
Operator PodFailurePolicyOnExitCodesOperator
// Specifies the set of values. Each returned container exit code (might be
// multiple in case of multiple containers) is checked against this set of
// values with respect to the operator. Value '0' cannot be used for the In
// operator.
// values with respect to the operator. The list of values must be ordered
// and must not contain duplicates. Value '0' cannot be used for the In operator.
// At least one element is required. At most 255 elements are allowed.
// +listType=set
Values []int32
}
// PodFailurePolicyOnPodConditionsPattern describes a pattern for matching
// an actual pod condition type.
type PodFailurePolicyOnPodConditionsPattern struct {
// Specifies the required Pod condition type. The pattern matches a pod condition
// if the specified type equals the pod condition type.
// Specifies the required Pod condition type. To match a pod condition
// it is required that specified type equals the pod condition type.
Type api.PodConditionType
// Specifies the required Pod condition status. To match a pod condition
// it is required that the specified status equals the pod condition status.
// Defaults to True.
Status api.ConditionStatus
}
// PodFailurePolicyRule describes how a pod failure is handled when the requirements are met.
// Only one of OnExitCodes and onPodConditions can be used in each rule.
// One of OnExitCodes and onPodConditions, but not both, can be used in each rule.
type PodFailurePolicyRule struct {
// Specifies the action taken on a pod failure when the requirements are satisfied.
// Possible values are:
Expand All @@ -821,6 +853,8 @@ type PodFailurePolicyRule struct {
// incremented and a replacement pod is created.
// - Count: indicates that the pod is handled in the default way - the
// counter towards the .backoffLimit is incremented.
// Additional values are considered to be added in the future. Clients should
// react to an unknown action by skipping the rule.
Action PodFailurePolicyAction
// Represents the requirement on the container exit codes.
Expand All @@ -829,7 +863,7 @@ type PodFailurePolicyRule struct {
// Represents the requirement on the pod conditions. The requirement is represented
// as a list of pod condition patterns. The requirement is satisfied if at
// least pattern matches an actual pod condition.
// least one pattern matches an actual pod condition. At most 20 elements are allowed.
// +listType=atomic
OnPodConditions []PodFailurePolicyOnPodConditionsPattern
}
Expand All @@ -840,7 +874,7 @@ type PodFailurePolicy struct {
// Once a rule matches a Pod failure, the remaining of the rules are ignored.
// When no rule matches the Pod failure, the default handling applies - the
// counter of pod failures is incremented and it is checked against
// the backoffLimit.
// the backoffLimit. At most 20 elements are allowed.
// +listType=atomic
Rules []PodFailurePolicyRule
}
Expand All @@ -854,7 +888,7 @@ type JobSpec struct {
// If empty, the default behaviour applies - the counter of failed pods,
// represented by the jobs's .status.failed field, is incremented and it is
// checked against the backoffLimit. This field cannot be used in combination
// with restartPolicy=OnFailure.
// with .spec.podTemplate.spec.restartPolicy=OnFailure.
//
// This field is alpha-level. To use this field, you must enable the
// `JobPodFailurePolicy` feature gate (disabled by default).
Expand Down Expand Up @@ -1083,10 +1117,15 @@ Below are some examples to consider, in addition to the aforementioned [maturity
- Re-evaluate introduction of a generic opinionated condition type
indicating that a pod should be retried (see: [Evolving condition types](#evolving-condition-types))
- Simplify the code in job controller responsible for detection of failed pods
based on the fix for pods stuck in the running phase (see: [Marking pods as Failed](marking-pods-as-failed))
based on the fix for pods stuck in the running phase (see: [Marking pods as Failed](marking-pods-as-failed)).
Also, introduce a worker in the disruption controller to mark pods
stuck in the pending phase, with set deletionTimestamp, as failed.
- Commonize the code for appending pod conditions between components
- Do not update the pod disruption condition (with type=`DisruptionTarget`) if
it is already present with `status=True`
- Re-evaluate the decision about handling failed appending of a pod disruption
condition. Currently, failed patch to append the pod condition interrupts the
pod delete operation.
- Review and implement if feasible adding of pod conditions with the use of
[SSA](https://kubernetes.io/docs/reference/using-api/server-side-apply/) client.
- The feature flag enabled by default
Expand Down Expand Up @@ -1545,6 +1584,16 @@ For each of them, fill in the following information by copying the below templat

## Implementation History

- 2022-06-23: Initial KEP merged
- 2022-07-28: KEP updates merged
- 2022-08-01: Addiitonal KEP updates merged
- 2022-07-12: Preparatory PR "Refactor gc_controller to do not use the deletePod stub" merged
- 2022-07-14: Preparatory PR "efactor taint_manager to do not use getPod and getNode stubs" merged
- 2022-07-20: Preparatory PR "Add integration test for podgc" merged
- 2022-07-20: Preparatory PR "Add integration test for podgc" merged
- 2022-08-02: PR "Append new pod conditions when deleting pods to indicate the reason for pod deletion" merged
- 2022-08-02: PR "Add worker to clean up stale DisruptionTarget condition" merged

<!--
Major milestones in the lifecycle of a KEP should be tracked in this section.
Major milestones might include:
Expand Down

0 comments on commit c7b8b97

Please sign in to comment.