
[WIP] Implement deferContainers - Pod Termination Semantics #47422

Closed
wants to merge 4 commits

Conversation

dhilipkumars

What this PR does / why we need it:
Introduces a new termination semantic for pods.

Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes #

Implements this feature. The implementation has diverged reasonably from its original proposal.
We will update the proposal to reflect what has been implemented.

Special notes for your reviewer:

NOTE: This is an unpolished implementation with a lot of rough edges. This PR is raised just to get consensus on the approach we have taken.

Here are some of the highlights of the approach:

  1. It is designed to work across kubelet restarts.
  2. The default and only restart policy of a deferContainer is 'OnFailure'.
  3. If deferContainers are configured for a pod, they disable (override) preStop hooks.
  4. If all the deferContainers complete well ahead of the grace period, the pod is killed without waiting any further.
  5. A deferContainer's image, if it differs from the images of the other containers in the pod, is pre-pulled while the pod is running (see the sketch after this list).
  6. The blocking behavior of the killPod and killPodWithSyncResult functions is maintained.
  7. Implemented as an alpha annotation.
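
To illustrate point 5, here is a minimal sketch, in Go, of how a kubelet could decide which deferContainer images need pre-pulling: only images not already used by the pod's regular or init containers are pulled while the pod is running. This is an assumed shape for illustration only; prePullDeferImages, the container struct, and the pullImage callback are hypothetical names, not the PR's actual kubelet code.

package main

import "fmt"

// container is a stripped-down stand-in for a container spec: just a name and an image.
type container struct {
	Name  string
	Image string
}

// prePullDeferImages pulls, ahead of time, any deferContainer image that no other
// container in the pod already uses, so the image is available when termination starts.
func prePullDeferImages(regular, initCtrs, deferCtrs []container, pullImage func(image string) error) error {
	// Images of the regular and init containers are pulled by the kubelet anyway.
	existing := map[string]bool{}
	for _, c := range append(append([]container{}, regular...), initCtrs...) {
		existing[c.Image] = true
	}
	// Pre-pull only the defer images that are not covered by the rest of the pod.
	for _, c := range deferCtrs {
		if existing[c.Image] {
			continue
		}
		if err := pullImage(c.Image); err != nil {
			return fmt.Errorf("pre-pulling %q for deferContainer %q: %v", c.Image, c.Name, err)
		}
	}
	return nil
}

func main() {
	// Mirrors the sample pod below: only "busybox" (used by the "last" deferContainer)
	// differs from the other images, so only it is pre-pulled.
	regular := []container{{Name: "prestop", Image: "bash"}}
	inits := []container{{Name: "first", Image: "bash"}}
	defers := []container{{Name: "last-but-one", Image: "bash"}, {Name: "last", Image: "busybox"}}
	_ = prePullDeferImages(regular, inits, defers, func(image string) error {
		fmt.Println("pre-pulling", image)
		return nil
	})
}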

Here is a sample pod spec configured with deferContainers and a grace period of 60 seconds:

#LifeCycle Demo
apiVersion: v1
kind: Pod
metadata:
  name: defer-demo
  annotations:
    pod.beta.kubernetes.io/init-containers: '[
       {
         "name": "first",
         "image": "bash",
         "command": ["/usr/local/bin/bash", "-c", "echo started"],
         "volumeMounts": [
            {
               "name": "config",
               "mountPath": "/config"
            }
         ]
       }
    ]'
    pod.alpha.kubernetes.io/defer-containers: '[
       {
         "name": "last-but-one",
         "image": "bash",
         "command": ["/usr/local/bin/bash", "-c", "echo entered deferContainers"],
         "volumeMounts": [
            {
               "name": "config",
               "mountPath": "/config"
            }
         ]
       },
       {
         "name": "last",
         "image": "busybox",
         "command": ["/bin/echo", "finished"],
         "volumeMounts": [
            {
               "name": "config",
               "mountPath": "/config"
            }
         ]
       }
    ]'
spec:
  terminationGracePeriodSeconds: 60
  containers:
  - name: prestop
    image: bash
    command: [ "/usr/local/bin/bash", "-c", "--" ]
    args: [ "while true; do sleep 1; echo Prestop1:`date`;  cat /config/test.file; done;" ]
    volumeMounts:
      - name: config
        mountPath: /config
    lifecycle:
      preStop:
        exec:
          command: ["/usr/local/bin/bash","-c", "echo preExitHook_executing > /config/test.file; sleep 2; echo preExithook_finished > /config/test.file"]
  volumes:
      - name: config
        emptyDir: {}

Here are the events for the pod. As you can see, the deferContainer named last has a different image, which is pre-pulled:

  FirstSeen     LastSeen        Count   From                    SubObjectPath                           Type            Reason                  Message
  ---------     --------        -----   ----                    -------------                           --------        ------                  -------
  2m            2m              1       kubelet, kube-node-3                                            Normal          SuccessfulMountVolume   MountVolume.SetUp succeeded for volume "config"
  2m            2m              1       kubelet, kube-node-3                                            Normal          SuccessfulMountVolume   MountVolume.SetUp succeeded for volume "default-token-7xgh1"
  2m            2m              1       default-scheduler                                               Normal          Scheduled               Successfully assigned defer-demo to kube-node-3
  2m            2m              1       kubelet, kube-node-3    spec.initContainers{first}              Normal          Started                 Started container with id a8ed41f882604468d6564fb3fd0696369344d188b1ef510f31ea65988217245d
  2m            2m              1       kubelet, kube-node-3    spec.initContainers{first}              Normal          Created                 Created container with id a8ed41f882604468d6564fb3fd0696369344d188b1ef510f31ea65988217245d
  2m            2m              1       kubelet, kube-node-3    spec.containers{prestop}                Normal          Created                 Created container with id 4a5cd7168ccc1d5d51c2c60c77e34f6a2348bac12797513a3dc11a411be87eb2
  2m            2m              1       kubelet, kube-node-3    spec.containers{prestop}                Normal          Started                 Started container with id 4a5cd7168ccc1d5d51c2c60c77e34f6a2348bac12797513a3dc11a411be87eb2
  53s           53s             1       kubelet, kube-node-3    spec.deferContainers{last}              Normal          Pulling                 Defer-Container image is being pre-pulled
  2m            16s             3       kubelet, kube-node-3    spec.initContainers{first}              Normal          Pulling                 pulling image "bash"
  2m            10s             3       kubelet, kube-node-3    spec.initContainers{first}              Normal          Pulled                  Successfully pulled image "bash"
  53s           10s             2       kubelet, kube-node-3    spec.deferContainers{last}              Normal          Pulling                 pulling image "busybox"
  10s           10s             1       kubelet, kube-node-3    spec.deferContainers{last-but-one}      Normal          Created                 Created container with id 9ed5e37058012476bb42769e488dbd39dc07d47cb5b90c0d108898558909cf0a
  10s           10s             1       kubelet, kube-node-3    spec.deferContainers{last-but-one}      Normal          Started                 Started container with id 9ed5e37058012476bb42769e488dbd39dc07d47cb5b90c0d108898558909cf0a
  45s           5s              2       kubelet, kube-node-3    spec.deferContainers{last}              Normal          Pulled                  Successfully pulled image "busybox"
  5s            5s              1       kubelet, kube-node-3    spec.deferContainers{last}              Normal          Created                 Created container with id e334389c25b40e69e0b2f6e8c7b6897fddd237a1586ead3b8bc3b81fa50b942e
  5s            5s              1       kubelet, kube-node-3    spec.deferContainers{last}              Normal          Started                 Started container with id e334389c25b40e69e0b2f6e8c7b6897fddd237a1586ead3b8bc3b81fa50b942e
  3s            3s              1       kubelet, kube-node-3    spec.containers{prestop}                Normal          Killing                 Killing container with id docker://4a5cd7168ccc1d5d51c2c60c77e34f6a2348bac12797513a3dc11a411be87eb2:Need to kill Pod

Also, the pod does not wait for the full grace period; as soon as all the deferContainers complete, the pod is deleted. Even though the grace period is configured as 60 seconds, all the deferContainers completed successfully within about 7 seconds, so the pod was deleted without waiting further.
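
A minimal sketch of this early-return behavior, again in Go: the deferContainers run sequentially under a context whose deadline is the remaining grace period, and control returns to the pod-kill path as soon as the last one finishes or the deadline expires. runDeferContainers and the runOne callback are illustrative stand-ins, not the PR's actual kubelet code; the 'OnFailure' restart handling is only hinted at in a comment.

package main

import (
	"context"
	"fmt"
	"time"
)

// runDeferContainers runs the pod's deferContainers one after another, but never
// longer than the remaining termination grace period. It returns as soon as the
// last deferContainer finishes, so the pod-kill path does not wait out the full
// grace period.
func runDeferContainers(gracePeriod time.Duration, names []string,
	runOne func(ctx context.Context, name string) error) error {

	ctx, cancel := context.WithTimeout(context.Background(), gracePeriod)
	defer cancel()

	for _, name := range names {
		if ctx.Err() != nil {
			// Grace period exhausted: give up and let the kill proceed immediately.
			return fmt.Errorf("grace period expired before deferContainer %q could run", name)
		}
		if err := runOne(ctx, name); err != nil {
			// With the 'OnFailure' restart policy, a real implementation would retry
			// while the deadline allows; this sketch just reports the error.
			return err
		}
	}
	return nil // all deferContainers finished; no need to wait any further
}

func main() {
	// With a 60s grace period and two short-lived containers, this returns after a
	// few seconds, well before the deadline, mirroring the behavior described above.
	err := runDeferContainers(60*time.Second, []string{"last-but-one", "last"},
		func(ctx context.Context, name string) error {
			fmt.Println("running deferContainer", name)
			time.Sleep(3 * time.Second) // stand-in for the actual container execution
			return nil
		})
	fmt.Println("done, err =", err)
}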

CC: @dchen1107 @smarterclayton @erictune @brendandburns @Kargakis @quinton-hoole @irfanurrehman @kow3ns @sig-node-reviewers

TODO:

  1. Tests
  2. E2E and Integration tests
  3. Better integration with kubectl, e.g., for status and logs
  4. Documentation

Release note: none

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jun 13, 2017
@k8s-github-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: dhilipkumars
We suggest the following additional approvers: Random-Liu, erictune

Assign the PR to them by writing /assign @Random-Liu @erictune in a comment when ready.

Associated issue: 256

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot k8s-github-robot added kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jun 13, 2017
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jun 13, 2017
@gmarek
Contributor

gmarek commented Jun 14, 2017

I believe we need to agree on the proposal first... @smarterclayton

@gmarek gmarek assigned smarterclayton and unassigned gmarek and mikedanese Jun 14, 2017
@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 15, 2017
@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 17, 2017
@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 21, 2017
@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 14, 2017
@dhilipkumars
Author

/assign @erictune

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 18, 2017
@k8s-github-robot k8s-github-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Aug 23, 2017
@k8s-github-robot

Adding do-not-merge/release-note-label-needed because the release note process has not been followed.
One of the following labels is required "release-note", "release-note-action-required", "release-note-experimental" or "release-note-none".
Please see: https://github.com/kubernetes/community/blob/master/contributors/devel/pull-requests.md#write-release-notes-if-needed.

@k8s-github-robot k8s-github-robot added the do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. label Aug 30, 2017
@dhilipkumars
Author

/label keep-open

@erictune
Member

erictune commented Sep 1, 2017

Use cases, and comments about each use case:

  1. @bboreham in Pod or Job lifecycle lacks of a mechanism to define cleanup actions once you delete a pod #35183 (comment) wants to uninstall system-level changes made by a DaemonSet.
    1. He can't do this on every shutdown because he needs to know the difference between terminate-to-upgrade vs terminate-forever.
    2. However, termination-reason could be a way to tell the pod whether to clean up or not. We need to finish that by standardizing the set of reasons and having kubectl drain send a standard reason.
    3. Thinking we should finish/explore the existing mechanism we have before adding something new, at least for the "DaemonSet Upgrade" use case.
  2. There are comments on Deployment Hooks about deferContainers. But, deferContainer is a per-Pod thing, and that issue is about the state of the Deployment. A pod typically does not know about the state of the whole deployment.
  3. Another use case is "I need a sequence of containers to happen, and I don't like packaging all my code into a single container".
    1. I wonder if it is possible to have a pod with only init-containers, treating them as more like "runSerialContainers". Does this address that use case well?

@dhilipkumars
Author

Some more use cases that were presented during SIG Apps.

@dhilipkumars
Author

/label keep-open

@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 22, 2017
@k8s-ci-robot
Contributor

@dhilipkumars: The following tests failed, say /retest to rerun them all:

Test name Commit Details Rerun command
pull-kubernetes-bazel-test 2e7f91c link /test pull-kubernetes-bazel-test
pull-kubernetes-verify 2e7f91c link /test pull-kubernetes-verify
pull-kubernetes-kubemark-e2e-gce 2e7f91c link /test pull-kubernetes-kubemark-e2e-gce
pull-kubernetes-e2e-kops-aws 2e7f91c link /test pull-kubernetes-e2e-kops-aws
pull-kubernetes-node-e2e 2e7f91c link /test pull-kubernetes-node-e2e
pull-kubernetes-e2e-gce-device-plugin-gpu 2e7f91c link /test pull-kubernetes-e2e-gce-device-plugin-gpu
pull-kubernetes-e2e-gce 2e7f91c link /test pull-kubernetes-e2e-gce

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-github-robot

@dhilipkumars PR needs rebase

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 30, 2018
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 30, 2018
@dims
Member

dims commented May 7, 2018

/close

@dhilipkumars please reopen with a fresh rebase when you need this
