Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minimal checkpointing support #104907

Merged
merged 8 commits into from
Jul 14, 2022

Conversation

adrianreber
Copy link
Contributor

@adrianreber adrianreber commented Sep 10, 2021

What type of PR is this?

/kind feature
/kind api-change

What this PR does / why we need it:

This PR introduces the minimal checkpoint support as discussed in kubernetes/enhancements#1990

With the changes in the PR it is only possible to checkpoint a container. One of the main use cases, as discussed in the KEP of this PR is the forensic use case: Checkpoint a container and leave it running, take the checkpoint image and restart the container somewhere else (sandboxed environment) for analysis.

Possible steps could be:

  1. Checkpoint container out of Kubernetes: curl -skv -X POST "https://localhost:10250/checkpoint/default/counters/wildfly"
  2. Take the checkpoint archive from /var/lib/kubelet/checkpoints and transfer it to another system
  3. Use crictl restore --import=<archive> to restore a copy of the container somewhere else.

In the KEP we mentioned that checkpoints will be stored in OCI images and can be pushed to a registry. This is still planned but currently not implemented on the container engine level. It will be implemented in the future.

Which issue(s) this PR fixes:

This does not really completely fix any open issues, but it is a step forward to close issues like:

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Added a new feature gate `CheckpointRestore` to enable support to checkpoint containers. If enabled it is possible to checkpoint a container using the newly kubelet API (/checkpoint/{podNamespace}/{podName}/{containerName}).

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Sep 10, 2021
@k8s-ci-robot
Copy link
Contributor

Hi @adrianreber. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added area/kubelet sig/node Categorizes an issue or PR as relevant to SIG Node. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Sep 10, 2021
@k8s-triage-robot
Copy link

This PR may require API review.

If so, when the changes are ready, complete the pre-review checklist and request an API review.

Status of requested reviews is tracked in the API Review project.

@adrianreber adrianreber changed the title Minimal checkpoint Minimal checkpointing support Sep 10, 2021
@ehashman
Copy link
Member

This needs to be implemented behind a feature flag.

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Sep 14, 2021
@adrianreber adrianreber force-pushed the 2021-09-10-checkpoint branch 2 times, most recently from c4eb187 to 71f1bdb Compare September 17, 2021 10:10
@adrianreber
Copy link
Contributor Author

/test pull-kubernetes-unit

@adrianreber adrianreber force-pushed the 2021-09-10-checkpoint branch 4 times, most recently from 42e0bba to 948556e Compare September 20, 2021 15:55
@adrianreber
Copy link
Contributor Author

Added unit tests and a feature flag (disabled by default).

Copy link
Member

@derekwaynecarr derekwaynecarr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adrianreber this is super close.

can you address the following:

  1. ensure checkpoint directory is only created if the feature gate is enabled
  2. ensure i can checkpoint an init container and/or ephemeral container

Thanks!


ctx, cancel := func() (context.Context, context.CancelFunc) {
defaultTimeout := int64(r.timeout / time.Second)
if options.Timeout > defaultTimeout {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what could cause this timeout to need to vary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checkpoint duration directly depends on the number of memory pages the processes in the container require. For very large containers checkpointing can take longer than the default CRI timeout of 2 seconds.

@@ -1251,6 +1254,9 @@ func (kl *Kubelet) setupDataDirs() error {
if err := os.MkdirAll(kl.getPodResourcesDir(), 0750); err != nil {
return fmt.Errorf("error creating podresources directory: %v", err)
}
if err := os.MkdirAll(kl.getCheckpointsDir(), 0750); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should only create the checkpoint directory if the feature gate is enabled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Changed.

// checkpoint handles the checkpoint API request. It checks if the requested
// podNamespace, pod and container actually exist and only then calls out
// to the runtime to actually checkpoint the container.
func (s *Server) checkpoint(request *restful.Request, response *restful.Response) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

initially we want to avoid exposing checkpoint operations from the k8s control plane itself.

we need to better understand the pros/cons from a forensic use case.

containerName := request.PathParameter("containerName")

found := false
for _, container := range pod.Spec.Containers {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this means we are unable to checkpoint init containers or ephemeral containers.

can you add a check for pod.Spec.InitContainers and a check for pod.Spec.EphemeralContainers if the feature gate for ephemeral containers is enabled?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching this. I changed it and was able to checkpoint init and ephemeral containers.

This adds the minimal CRI API changes to support checkpoint/restore and
to enable implementation of these checkpoint/restore interfaces on
container engine level.

Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Adrian Reber <areber@redhat.com>
This adds the first infrastructure code parts to the kubelet
to support checkpoint/restore.

Signed-off-by: Adrian Reber <areber@redhat.com>
This is the first step to implement checkpointing and restoring of
container and containers starting from the lowest layer in the kubelet.

Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Adrian Reber <areber@redhat.com>
This adds the last pieces to wire through the container checkpoint
support in the kubelet.

Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Adrian Reber <areber@redhat.com>
@adrianreber adrianreber force-pushed the 2021-09-10-checkpoint branch from 343c6ac to 92ea6e3 Compare July 14, 2022 10:28
@adrianreber
Copy link
Contributor Author

@adrianreber this is super close.

🎆

can you address the following:

1. ensure checkpoint directory is only created if the feature gate is enabled

Done. Good point.

2. ensure i can checkpoint an init container and/or ephemeral container

Also done. Thanks for your very specific review. That made it really easy to implement the requested changes.

@derekwaynecarr
Copy link
Member

Thanks for the updates as requested.

/approve
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 14, 2022
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: adrianreber, derekwaynecarr

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 14, 2022
@adrianreber
Copy link
Contributor Author

/test pull-kubernetes-e2e-kind-ipv6
/test pull-kubernetes-e2e-kind

@adrianreber
Copy link
Contributor Author

/test pull-kubernetes-e2e-kind

@adrianreber
Copy link
Contributor Author

/test pull-kubernetes-e2e-gce-ubuntu-containerd

1 similar comment
@adrianreber
Copy link
Contributor Author

/test pull-kubernetes-e2e-gce-ubuntu-containerd

@k8s-ci-robot
Copy link
Contributor

@adrianreber: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-kubernetes-node-e2e-containerd-features dcd59805bf4134f6de7ef08dcf1bd75f1deeb3fd link false /test pull-kubernetes-node-e2e-containerd-features
pull-kubernetes-node-e2e-containerd-alpha-features 0b1cc8ae5d242ecfe9f0b03c843a9bc07d755ef9 link false /test pull-kubernetes-node-e2e-containerd-alpha-features
pull-kubernetes-e2e-gce-ubuntu-containerd 92ea6e3 link true /test pull-kubernetes-e2e-gce-ubuntu-containerd

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@adrianreber
Copy link
Contributor Author

/test pull-kubernetes-e2e-gce-ubuntu-containerd

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubelet area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.