- Release Signoff Checklist
- Summary
- Motivation
- Proposal
- Design Details
- Production Readiness Review Questionnaire
- Implementation History
Items marked with (R) are required prior to targeting to a milestone / release.
- (R) Enhancement issue in release milestone, which links to KEP dir in kubernetes/enhancements (not the initial KEP PR)
- (R) KEP approvers have approved the KEP status as `implementable`
- (R) Design details are appropriately documented
- (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input
- (R) Graduation criteria is in place
- (R) Production readiness review completed
- Production readiness review approved
- "Implementation History" section is up-to-date for milestone
- User-facing documentation has been created in kubernetes/website, for publication to kubernetes.io
- Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
Identify remaining gaps to promote CRI to Beta and GA to reflect its practical use in production for many years.
CRI-based runtimes such as CRI-O and containerd have been in use in production for over a year now with the current CRI API. We want to signal to users that the CRI API is production ready and that they should feel comfortable moving away from dockershim, as it is slated to be deprecated.
- Graduate the CRI API to stable.
- Identify any fields that need to be made more type safe, such as Seccomp.
- Address and clean up the notes/TODOs in the CRI.
- Block on any big new features.
Evolve the CRI API version as we address feedback in each milestone towards stable.
- v1alpha2 (alpha, current state)
- v1beta (beta, proposed 1.20)
- v1 (stable, TBD)
Risk | Detail | Mitigation |
---|---|---|
CRI stats performance | CRI stats performance may be worse compared to cadvisor | Measure performance and share report with community |
- No longer mark the `container-runtime-endpoint` flag as experimental.
- Keep the `image-service-endpoint` flag as experimental and evaluate whether it makes sense to keep it configurable or remove it.
- Introduce a field in the `Image` message to indicate an image should not be garbage collected (see #pinned-images)
- kubenet: There is an open TODO in the specification to remove support for setting PodCidr for kubenet networking. For CRI implementations, however, CNI is the existing standard and is essentially the only solution being tested with the CRI container runtime integrations. SIG Node and SIG Network need to help validate whether and when kubenet is being deprecated, and whether this field should be deprecated before beta or, if not, when.
kubernetes/kubernetes#62288
type NetworkConfig struct {
	// CIDR to use for pod IP addresses. If the CIDR is empty, runtimes
	// should omit it.
	PodCidr              string   `protobuf:"bytes,1,opt,name=pod_cidr,json=podCidr,proto3" json:"pod_cidr,omitempty"`
	XXX_NoUnkeyedLiteral struct{} `json:"-"`
	XXX_sizecache        int32    `json:"-"`
}
- Removal of TODOs that are no longer valid should be done before v1beta. We have scraped the api specification once and have a small list of commits to file.
- Introduce a field in the `Image` message to indicate an image should not be garbage collected:
message Image {
...
ImageSpec spec = 7;
// Recommendation on whether this image should be exempt from garbage collection.
// It must only be treated as a recommendation--the client can still request the image be deleted,
// and the runtime must oblige.
bool pinned = 8;
}
...
- Rely on this behavior instead of having the Kubelet keep track of expected pause image.
- In other words, have the runtime say the required pause image is pinned, indicating the Kubelet should not garbage collect it.
- As a result, support for the `pod-infra-container` option in the Kubelet will be dropped (not part of this KEP, but worth mentioning).
- See kubernetes/kubernetes#101808 for more details.
- Review of the existing test cases in critest and adding more if we find any gaps.
- Make sure we have e2e node (and possibly selected e2e conformance) tests running on more than one CRI implementation.
- Passes all existing CRI tests on at least two container runtimes (sig-node(e2e-node) and cri-tools(critest)).
- Is in production on numerous clouds. (Note: this reflects the urgency of the signal to move off non CRI solutions.)
- Documentation is updated to reflect beta status.
- Update the CI with containerd and CRI-O versions that support the v1 proto.
- Ensure that the required CRI stats changes are included. See #2371.
- Introduce a field in the `Image` message to indicate an image should not be garbage collected (see #pinned-images)
- TBD
The Kubelet and the runtime should use the same CRI version in lock-step. An upgrade involves draining all pods from a node, installing a CRI runtime that supports this version of the API, updating to a matching kubelet, and then making the node schedulable again.
Kubelet and the CRI runtime versions are expected to match so we don't have to worry about it for v1beta1.
- How can this feature be enabled / disabled in a live cluster?
  - Feature gate (also fill in values in `kep.yaml`)
    - Feature gate name:
    - Components depending on the feature gate:
  - Other
    - Describe the mechanism: Install, configure, and run a CRI runtime on a node. Change the kubelet configuration to point to the CRI runtime socket and restart the kubelet.
    - Will enabling / disabling the feature require downtime of the control plane? No. The control plane nodes could be modified one at a time to switch to CRI runtimes.
    - Will enabling / disabling the feature require downtime or reprovisioning of a node? Yes. One could re-provision existing nodes, or provision new nodes with a CRI runtime and a kubelet configured to talk to that runtime, and then migrate existing workloads to the new nodes.
- Does enabling the feature change any default behavior?
  - It changes the default container runtime from dockershim, but the container workloads are expected to work the same way as they do with dockershim.
- Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)? Yes, users could switch back to dockershim on a node by reversing the process of installing a CRI-based runtime.
- What happens if we reenable the feature if it was previously rolled back? No impact, per the existing Kubernetes policy of draining nodes for node lifecycle operations. In other words, the container runtime in use (a CRI runtime or the internal dockershim) is tied to node lifecycle operations.
- Are there any tests for feature enablement/disablement? No impact for v1 vs v2 or v1 point release extensions for the CRI API. Instead, container runtimes will expose gRPC service endpoints on a single socket for the CRI services as separate v1/v2 service types. A container runtime would have to provide two endpoints for each service by type if it wants to support two different versions of kubelet/runtime integrations.
This section must be completed when targeting beta graduation to a release.
- How can a rollout fail? Can it impact already running workloads? Workloads scheduled on nodes with a CRI runtime may fail due to misconfiguration of that node. Yes, it could impact running workloads, since we depend on draining and moving workloads around to switch a node to a CRI runtime.
- What specific metrics should inform a rollback?
  - Nodes that have been switched to a CRI runtime are not ready.
  - Workloads are failing to come up on a CRI-configured node.
- Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested? We don't expect to do any automated upgrade or rollback from and to dockershim, so this doesn't apply.
- Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
  - TODO
This section must be completed when targeting beta graduation to a release.
- How can an operator determine if the feature is in use by workloads? The Node object returns the configured CRI runtime.
- What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
  - Metrics
    - Metric name:
    - [Optional] Aggregation method:
    - Components exposing the metric:
  - Other (treat as last resort)
    - Details:
      - Node Ready
- What are the reasonable SLOs (Service Level Objectives) for the above SLIs? TBD
- Are there any missing metrics that would be useful to have to improve observability of this feature? TBD
This section must be completed when targeting beta graduation to a release.
- Does this feature depend on any specific services running in the cluster? No.
For beta, this section is required: reviewers must answer these questions.
- Will enabling / using this feature result in any new API calls?
  Exec/attach/port forwarding go through the API server.
- Will enabling / using this feature result in introducing new API types?
  No new Kubernetes API types, besides the seccomp changes in the CRI.
- Will enabling / using this feature result in any new calls to the cloud provider?
  No.
- Will enabling / using this feature result in increasing size or count of the existing API objects?
  No.
- Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?
  No.
- Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components?
  We have an open item to dive deeper into a performance comparison of CRI stats vs. cadvisor and will update here once we have more data.
- Troubleshooting CRI integrations requires Kubernetes to provide good documentation covering interactions with pods, containers, the kubelet, security profiles, crictl (the cri-tools client), developer and test (e2e, node, and critest) guides, and each of the container runtimes. As such, while the API is in v1beta, working group(s) should be formed or issues tracked for reaching the GA criteria over the beta-to-GA period.
- How does this feature react if the API server, kubelet and/or etcd is unavailable?
  - Open streams for Exec/Attach/Port forward that are forwarded by the kubelet to the API server will likely time out and close if the API server becomes unavailable.
  - CRI runtimes are resilient to the kubelet losing its connection during gRPC CRI calls.
  - CRI runtime integrations are not known to checkpoint using etcd and thus are not directly affected by etcd at the node.
- What are other known failure modes? For each of them, fill in the following information by copying the below template:
  - [Failure mode brief description]
    - Detection: How can it be detected via metrics? Stated another way: how can an operator troubleshoot without logging into a master or worker node?
    - Mitigations: What can be done to stop the bleeding, especially for already running user workloads?
    - Diagnostics: What are the useful log messages and their required logging levels that could help debug the issue? Not required until feature graduated to beta.
    - Testing: Are there any tests for failure mode? If not, describe why.
- What steps should be taken if SLOs are not being met to determine the problem?
- First version with v1alpha1 was released in k8s 1.5. See https://github.com/kubernetes/community/blob/ee783a18a34ef16da07f8d16d42782a6f78a9253/contributors/devel/sig-node/container-runtime-interface.md
- v1alpha was released with k8s 1.10. See kubernetes/kubernetes#58973
- v1 proto was introduced in k8s 1.20. See kubernetes/kubernetes#96387