[WIP] Add --checkpoint to drain #97194

adrianreber · 2020-12-10T12:41:42Z

What type of PR is this?

/kind feature

What this PR does / why we need it:

This PR implements --checkpoint on drain. This means that Pods can be checkpointed instead of being killed when draining a node, so that the Pods can be restored later with the same state. This is especially, maybe only, interesting for stateful containers which need a long time to start up (Java), need to load a lot of data from storage (database), or for other stateful containers without a storage backend.

The basic steps to use this would look something like this:

1. kubectl drain 127.0.0.1 --checkpoint
2. reboot
3. Restart kubelet (and restore all checkpointed containers)

This PR does not yet implement the restore part. The corresponding CRI-O implementation already supports checkpointing and restoring of Pods, but the restore side has not been added to this PR yet.

The motivation behind this PR is to implement Pod migration, and in a way it is almost possible with the changes in this PR. The checkpoints created during drain --checkpoint can be copied to another node, and upon kubelet start the Pods should be restored there. The drain use case seems to be one of the simpler starting points for introducing checkpoint/restore/migrate into Kubernetes, as it is a manual, user-triggered operation that so far only concerns a single node.

This is heavily inspired by Jakob Schrettenbrunner's work referenced in #3949

This PR is pretty large, but my initial approach to go in small steps and just add the necessary changes to the CRI API (kubernetes/enhancements#1990) feels kind of stuck because it was not possible to see all the implications of adding checkpoint/restore to the CRI API. So this is kind of the other extreme, a really large PR which shows all necessary changes to implement a subset of the possibilities checkpoint/restore offers.

Which issue(s) this PR fixes:
Maybe #3949

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

This adds the parameter --checkpoint to kubectl drain

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.

k8s-ci-robot · 2020-12-10T12:41:50Z

@adrianreber: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot · 2020-12-10T12:41:51Z

Welcome @adrianreber!

It looks like this is your first PR to kubernetes/kubernetes 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/kubernetes has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

k8s-ci-robot · 2020-12-10T12:41:51Z

Hi @adrianreber. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

sftim · 2020-12-10T13:00:07Z

@adrianreber, it looks like the Checkpoint subresource is in the policy API group. Why that group?

adrianreber · 2020-12-10T13:12:58Z

> @adrianreber, it looks like the Checkpoint subresource is in the policy API group. Why that group?

I was following the eviction code path. So wherever I saw eviction I replicated it for checkpoint to go from kubectl to the api server to the kubelet.

fedebongio · 2020-12-10T21:09:32Z

/remove-sig api-machinery

This implements container restore as described in: https://kubernetes.io/blog/2022/12/05/forensic-container-checkpointing-alpha/#restore-checkpointed-container-standalone

For detailed step-by-step instructions also see contrib/checkpoint/checkpoint-restore-cri-test.sh

The code changes are based on changes I have done in Podman around 2018 and CRI-O around 2020.

The history behind restoring containers via CRI/Kubernetes probably requires some explanation. The initial proposal to bring checkpoint/restore to Kubernetes was looking at pod checkpointing and restoring and the corresponding CRI changes.

kubernetes-sigs/cri-tools#662
kubernetes/kubernetes#97194

After discussing this topic for about two years another approach was implemented as described in KEP-2008:

kubernetes/enhancements#2008

"Forensic Container Checkpointing" allowed us to separate checkpointing from restoring. For "Forensic Container Checkpointing" it is enough to create a checkpoint of the container. Restoring is not necessary, as the analysis of the checkpoint archive can happen without restoring the container.

While thinking about a way to restore a container, it was by coincidence that we started to look into restoring containers in Kubernetes via Create and Start. The way it was done in CRI-O is to figure out during Create whether the container image is a checkpoint image and, if so, to use another code path. The same is now implemented in containerd with this change.

With this change it is possible to restore a container from the checkpoint tar archive that is created during checkpointing via the CRI.

To restore a container via Kubernetes we convert the tar archive to an OCI image as described in the kubernetes.io blog post from above. Using this OCI image it is possible to restore a container in Kubernetes.

At this point I think it should be doable to restore containers in CRI-O and containerd no matter whether they have been created by containerd or CRI-O. The biggest difference is the container metadata, and that can be adapted during restore.

Open items:

* It is not clear to me why restoring a container in containerd goes through task/Create(). But as the restore code already exists, this change extends the existing code path that restores a container in task/Create() so that it also restores a container through the CRI via Create and Start.
* Automatic image pulling. containerd does not pull images automatically if containers are created via the CRI. There is an option in crictl to pull images before starting, but that uses the CRI image pull interface. It is still a separate pull and create operation. Restoring containers from an OCI image is a bit different. The checkpoint OCI image does not include the base image, but just a reference to the image (NAME@DIGEST). Using crictl with pulling enables pulling the checkpoint image, but not the base image the checkpoint is based on. So during preparation of the checkpoint containerd automatically pulls the base image, but I was not able to figure out how to pull an image in a blocking way in containerd. So there is a for loop waiting for the container image to appear in the internal store. I think this can probably be implemented better.

Anyway, this is a first step towards container restore in Kubernetes when using containerd.

Signed-off-by: Adrian Reber <areber@redhat.com>

k8s-ci-robot added the needs-triage label Dec 10, 2020

k8s-ci-robot added the needs-ok-to-test, needs-priority, and cncf-cla: yes labels Dec 10, 2020

k8s-ci-robot requested review from deads2k and jsafrane December 10, 2020 12:43

adrianreber mentioned this pull request Dec 10, 2020

Add Forensic Container Checkpointing KEP kubernetes/enhancements#1990

Merged

k8s-ci-robot added the needs-rebase label and removed the sig/api-machinery label Dec 10, 2020

adrianreber force-pushed the 2020-12-10-drain--checkpoint branch from 58371c4 to 29a7b2f December 17, 2020 19:18
