Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't create a new sandbox for pod with RestartPolicyOnFailure if all containers succeeded #92614

Merged
merged 1 commit into from
Sep 3, 2020

Conversation

tnqn
Copy link
Member

@tnqn tnqn commented Jun 29, 2020

What type of PR is this?
/kind bug

What this PR does / why we need it:
The kubelet would attempt to create a new sandbox for a pod whose RestartPolicy is OnFailure even after all container succeeded. It caused unnecessary CRI and CNI calls, confusing logs and conflicts between the routine that creates the new sandbox and the routine that kills the Pod.

This patch checks the containers to start and stops creating sandbox if no container is supposed to start.

Which issue(s) this PR fixes:

Fixes #92613

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

Fixed kubelet creating extra sandbox for pods with RestartPolicyOnFailure after all containers succeeded

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/bug Categorizes issue or PR as related to a bug. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jun 29, 2020
@k8s-ci-robot
Copy link
Contributor

Hi @tnqn. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jun 29, 2020
@k8s-ci-robot k8s-ci-robot requested review from dashpole and dims June 29, 2020 18:19
@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jun 29, 2020
@neolit123
Copy link
Member

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jun 29, 2020
@neolit123
Copy link
Member

Does this PR introduce a user-facing change?:

^ please add a release note that explains the change in the kubelet

}
changes.ContainersToStart = append(changes.ContainersToStart, idx)
}
changes.ContainersToStart = containersToStart
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another nit: wouldn't it be more logical to move the loop back here - after the Init containers check.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's important to change ordering back to make sure that any changes related to containers start/stop are only applied after all Init container complete. Just to avoid potential issues in a future if more logic will be added

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

containersToStart is also used to determine whether there's a need to create a sandbox. Wouldn't it cause repeated computation if moving the loop back here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the logic here - if the if len(pod.Spec.InitContainers) != 0 { than we need to start an Init container. So this condition check should go before any container-related ones. I was imagining situation when len(pod.Spec.Containers) is 0, i.e. pod has no containers, but has Init containers. In this case if order of checks will be as it's written, Init containers wouldn't run. The situation when there are no containers in a pod seems to be impossible today, thus it's just a recommendation to swap these checks. In anticipation that one day more conditions may be added above which semantically needs to run after all Init containers logic complete.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation. I understand the concern now, but swapping them back could cause sandbox still being recreated if any initContainer is present. The current workflow of a Pod with initContainer is and would be kept as below:

  1. create sandbox, create init container, create container (expected actions)
  2. all containers succeeded, delete sandbox (expected actions)
  3. create sandbox, create init container (unexpected actions)

If taking potential no container case into consideration, could we add a condition to ensure this is not the first time to create sandbox like

if !shouldRestartOnFailure(pod) && attempt != 0 && len(podStatus.ContainerStatuses) != 0 {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

w.r.t. Sergey's comment, is it possible that a pod has:

some init containers (> 0)
no container
some Ephemeral Containers (> 0)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To ensure all initContainers can run once successfully regardless of there are containers to start, I added one more condition for not creating sandbox. It should address the potential case "init containers (> 0), no container".

For the case of Ephemeral Containers (> 0), I think it's not affected by this PR. It was never created along with (re-)creating sandbox, and was handled when createPodSandbox is false after initContainer or normal container became running. The behavior is kept as it is.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a comment above. Suggestion is to minimize the introduced changes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, maybe introducing a test with the situaiton @tedyu outlined will be beneficial. So future changes will take it into account.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are already some tests to cover a Pod in TestComputePodActionsWithInitAndEphemeralContainers for example:

"Kill pod and do not restart ephemeral container if the pod sandbox is dead": {

And I have added Create a new pod sandbox if the pod sandbox is dead, init container failed and RestartPolicy == OnFailure to it to ensure sandbox can be recreated as long as initialization is not done regardless of regular containers and ephemeral containers' status.
I didn't really delete regular containers from the pod spec of the tests, thinking it sounds strange to have a case with illegal input, but I think they should cover the situation @tedyu outlined implicitly

@tnqn tnqn force-pushed the onfailure-recreate branch from 743e396 to bf4e43c Compare June 30, 2020 02:12
@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Jun 30, 2020
@tnqn
Copy link
Member Author

tnqn commented Jun 30, 2020

Does this PR introduce a user-facing change?:

^ please add a release note that explains the change in the kubelet

Done, thanks.

@tedyu
Copy link
Contributor

tedyu commented Jun 30, 2020

Is it possible to add an e2e node test (under test/e2e_node) that covers the scenario ?

Thanks

@tnqn tnqn force-pushed the onfailure-recreate branch from bf4e43c to 1e66062 Compare June 30, 2020 13:14
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. area/test sig/testing Categorizes an issue or PR as relevant to SIG Testing. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jun 30, 2020
@tnqn tnqn force-pushed the onfailure-recreate branch from fc3ef9e to b2b082f Compare July 7, 2020 14:51
@tnqn
Copy link
Member Author

tnqn commented Jul 8, 2020

/retest

2 similar comments
@tnqn
Copy link
Member Author

tnqn commented Jul 8, 2020

/retest

@tnqn
Copy link
Member Author

tnqn commented Jul 10, 2020

/retest

@SergeyKanzhelev
Copy link
Member

I don't have permissions to lgtm, for what it worth:

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 15, 2020
@SergeyKanzhelev
Copy link
Member

/test pull-kubernetes-e2e-kind-ipv6

@tnqn
Copy link
Member Author

tnqn commented Sep 2, 2020

@SergeyKanzhelev @dashpole @dims could this PR get into 1.20 milestone?

@dims
Copy link
Member

dims commented Sep 3, 2020

/milestone v1.20

@k8s-ci-robot k8s-ci-robot added this to the v1.20 milestone Sep 3, 2020
@alex-vmw
Copy link

alex-vmw commented Sep 3, 2020

@tnqn @dims Would it be possible to also backport this into 1.18/1.19? Thanks.

@SergeyKanzhelev
Copy link
Member

If you want 1.19.1 cutoff is Friday I think. So need to do it today

@SergeyKanzhelev
Copy link
Member

failure doesn't seem to be related

/retest

@tnqn
Copy link
Member Author

tnqn commented Sep 11, 2020

@alex-vmw @SergeyKanzhelev I just created #94725 to backport to 1.19, hope it can catch 1.19.2.

@alex-vmw
Copy link

@tnqn How about for 1.18.x? Any plans to backport there?

@cpanato
Copy link
Member

cpanato commented Oct 6, 2020

@tnqn @SergeyKanzhelev this only need for 1.18 and 1.19, not for 1.17?

@tnqn
Copy link
Member Author

tnqn commented Oct 13, 2020

@tnqn @SergeyKanzhelev this only need for 1.18 and 1.19, not for 1.17?

@cpanato I just checked the issue can be reproduced on 1.17.12 as well, created #95508 for 1.17.

k8s-ci-robot added a commit that referenced this pull request Oct 23, 2020
…pstream-release-1.19

Automated cherry pick of #92614: Don't create a new sandbox for pod with
k8s-ci-robot added a commit that referenced this pull request Nov 27, 2020
…pstream-release-1.18

Automated cherry pick of #92614: Don't create a new sandbox for pod with
k8s-ci-robot added a commit that referenced this pull request Nov 27, 2020
…pstream-release-1.17

Automated cherry pick of #92614: Don't create a new sandbox for pod with
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubelet area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

kubelet always creates sandbox twice for Pods with RestartPolicyOnFailure
9 participants