[AppArmor] Hold bad AppArmor pods in pending rather than rejecting #35342
Conversation
Jenkins GCI GCE e2e failed for commit 515bf5488bbb37a9d5af8c5fc0089ecf6bad5588. Full PR test history. The magic incantation to run this job again is
Jenkins Kubemark GCE e2e failed for commit 515bf5488bbb37a9d5af8c5fc0089ecf6bad5588. Full PR test history. The magic incantation to run this job again is
Jenkins GCE etcd3 e2e failed for commit 515bf5488bbb37a9d5af8c5fc0089ecf6bad5588. Full PR test history. The magic incantation to run this job again is
Jenkins GKE smoke e2e failed for commit 515bf5488bbb37a9d5af8c5fc0089ecf6bad5588. Full PR test history. The magic incantation to run this job again is
Jenkins GCE e2e failed for commit 515bf5488bbb37a9d5af8c5fc0089ecf6bad5588. Full PR test history. The magic incantation to run this job again is
Jenkins GCI GKE smoke e2e failed for commit 515bf5488bbb37a9d5af8c5fc0089ecf6bad5588. Full PR test history. The magic incantation to run this job again is
Jenkins unit/integration failed for commit 515bf5488bbb37a9d5af8c5fc0089ecf6bad5588. Full PR test history. The magic incantation to run this job again is
Fixed build error.
Jenkins GCE Node e2e failed for commit baf07ad. Full PR test history. The magic incantation to run this job again is
This approach seems reasonable. I didn't look at the code, so my questions are probably answered there, but I'm wondering:
Both. Currently the Kubelet creates an event, and sets the PodStatus to Failed with an appropriate reason & message. With my change, the event is still published, and the reason & message on the PodStatus are still set, but the Pod is kept in the Pending state.
It shouldn't change anything. The AppArmor check still needs to pass before the Pod is allowed to run.
Deletion is handled in a separate loop from where we're checking the AppArmor status. The resources in Kubelet will be cleaned up, and there won't be any containers to kill. This should be unaffected by my change.
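To make the described behavior concrete, here is a minimal, self-contained Go sketch of the idea (the types and the recordSoftRejection helper are simplified stand-ins invented for illustration, not the actual Kubelet code): the rejection reason and message are still recorded on the status, but the phase stays Pending instead of flipping to Failed.

```go
package main

import "fmt"

// Simplified stand-ins for the real PodStatus fields.
type PodPhase string

const (
	PodPending PodPhase = "Pending"
	PodFailed  PodPhase = "Failed"
)

type PodStatus struct {
	Phase   PodPhase
	Reason  string
	Message string
}

// recordSoftRejection keeps the pod Pending while still surfacing the
// reason and message, instead of marking it Failed outright.
func recordSoftRejection(status *PodStatus, reason, message string) {
	status.Phase = PodPending // before this change, the phase would have been PodFailed
	status.Reason = reason
	status.Message = message
	// An event with the same reason/message would also be published here.
}

func main() {
	var status PodStatus
	recordSoftRejection(&status, "AppArmor", `profile "foo" is not loaded`)
	fmt.Printf("%+v\n", status)
}
```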
I have some nit comments; otherwise, overall I am ok with this change, except:
Once this change is in, the compute resources (CPU, memory) allocated to such pending pods will be held forever until an upstream layer takes action. I am ok with this for now to prevent uncontrolled churn from the scheduler / control plane.
@@ -1383,6 +1383,10 @@ func (dm *DockerManager) KillPod(pod *api.Pod, runningPod kubecontainer.Pod, gra

// NOTE(random-liu): The pod passed in could be *nil* when kubelet restarted.
func (dm *DockerManager) killPodWithSyncResult(pod *api.Pod, runningPod kubecontainer.Pod, gracePeriodOverride *int64) (result kubecontainer.PodSyncResult) {
	// Short circuit if there's nothing to kill.
	if len(runningPod.Containers) == 0 {
Does runningPod here include the PodInfraContainer too? I think it does, but I want to be sure. Otherwise, returning early would leak the podInfraContainer.
Yes, it does (see line 1424 below). This method is a no-op if len(runningPod.Containers) == 0; the check is just an optimization.
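For readers following along, a small self-contained sketch of the short-circuit pattern under discussion, using simplified stand-ins for kubecontainer.Pod and kubecontainer.PodSyncResult (the real types and the exact body of the early return may differ): because runningPod.Containers includes the pod infra container, an empty list means there is truly nothing to kill, and returning the empty named result early is harmless.

```go
package main

import "fmt"

// Simplified stand-ins for kubecontainer.Pod and kubecontainer.PodSyncResult.
type Container struct{ Name string }

type Pod struct {
	Name       string
	Containers []*Container // includes the pod infra container when the pod is running
}

type PodSyncResult struct{ SyncResults []string }

// killPod mirrors the short-circuit being discussed: if the running pod has
// no containers at all (regular or infra), there is nothing to kill, so the
// empty named result is returned immediately.
func killPod(runningPod Pod) (result PodSyncResult) {
	// Short circuit if there's nothing to kill.
	if len(runningPod.Containers) == 0 {
		return
	}
	for _, c := range runningPod.Containers {
		result.SyncResults = append(result.SyncResults, "killed "+c.Name)
	}
	return
}

func main() {
	fmt.Printf("%+v\n", killPod(Pod{Name: "empty-pod"}))
	fmt.Printf("%+v\n", killPod(Pod{Name: "running-pod", Containers: []*Container{{Name: "POD"}, {Name: "app"}}}))
}
```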
@@ -1516,6 +1548,29 @@ func (kl *Kubelet) canAdmitPod(pods []*api.Pod, pod *api.Pod) (bool, string, str
	return true, "", ""
}

func (kl *Kubelet) canRunPod(pod *api.Pod) lifecycle.PodAdmitResult {
Naming this new method canRunPod, the same as the existing pkg/kubelet/util.go::canRunPod(...) used below, is very confusing to me.
Ack. I left a TODO to get rid of that other method. Do you have a suggestion for a better name?
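As a rough illustration of what a Kubelet-level canRunPod could do: only the signature comes from the diff above; the softAdmitHandlers field, the handler interface, and the simplified types below are assumptions made for this sketch, not the actual Kubelet code.

```go
package main

import "fmt"

// Simplified stand-ins for api.Pod and lifecycle.PodAdmitResult.
type Pod struct{ Name string }

type PodAdmitResult struct {
	Admit   bool
	Reason  string
	Message string
}

// PodAdmitHandler mirrors the lifecycle admit-handler pattern: each handler
// may veto running the pod and report why.
type PodAdmitHandler interface {
	Admit(pod *Pod) PodAdmitResult
}

type kubelet struct {
	softAdmitHandlers []PodAdmitHandler // hypothetical field; an AppArmor validator would live here
}

// canRunPod runs the node-local checks and returns the first rejection; the
// caller keeps the pod Pending instead of failing it outright.
func (kl *kubelet) canRunPod(pod *Pod) PodAdmitResult {
	for _, h := range kl.softAdmitHandlers {
		if result := h.Admit(pod); !result.Admit {
			return result
		}
	}
	return PodAdmitResult{Admit: true}
}

func main() {
	kl := &kubelet{}
	fmt.Printf("%+v\n", kl.canRunPod(&Pod{Name: "demo"}))
}
```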
Thanks Dawn,
Once this change is in, the compute resources (CPU, memory) allocated to such pending pods will be held forever until an upstream layer takes action. I am ok with this for now to prevent uncontrolled churn from the scheduler / control plane.
Yes, you're right, but I think this situation is strictly better than what we have today. Long term, we should put some more thought into how we want to deal with this case.
Squashed & rebased.
Jenkins verification failed for commit 1f79ef787e3448b553943fe1efc7695d34d1b85b. Full PR test history. The magic incantation to run this job again is
Regenerated.
LGTM
@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]
Automatic merge from submit-queue
Fixes #32837
Overview of the fix:
If the Kubelet needs to reject a Pod for a reason that the control plane doesn't understand (e.g. which AppArmor profiles are installed on the node), then the control plane might continuously try to run the Pod on the same rejecting node. This change adds a concept of "soft rejection", in which the Pod is admitted but not allowed to run (and is therefore held in a pending state). This prevents the Pod from being retried on other nodes, but it also prevents the high churn. This is consistent with how other missing local resources (e.g. volumes) are handled.
A side effect of the change is that Pods which are not initially runnable will be retried. This is desired behavior since it avoids a race condition when a new node is brought up but the AppArmor profiles have not yet been loaded on it.
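A minimal sketch of the overall "soft rejection" flow described above, with simplified, hypothetical types rather than the actual Kubelet code: a pod that fails a node-local check such as AppArmor is held in Pending and re-evaluated on later sync iterations, instead of being marked Failed and churned through the control plane.

```go
package main

import "fmt"

// Simplified, hypothetical types illustrating the soft-rejection decision.
type PodAdmitResult struct {
	Admit   bool
	Reason  string
	Message string
}

type syncDecision string

const (
	runPod      syncDecision = "run"
	holdPending syncDecision = "hold in Pending" // soft rejection: the pod is not marked Failed
)

// decide shows the behavior described above: a pod that fails a node-local
// check is held in Pending and re-checked on later sync iterations, which is
// what allows an initially unrunnable pod to start once the profile loads.
func decide(canRun PodAdmitResult) syncDecision {
	if !canRun.Admit {
		return holdPending
	}
	return runPod
}

func main() {
	rejected := PodAdmitResult{Admit: false, Reason: "AppArmor", Message: "profile not loaded"}
	fmt.Println(decide(rejected))                    // hold in Pending
	fmt.Println(decide(PodAdmitResult{Admit: true})) // run
}
```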
@kubernetes/sig-node @timothysc @rrati @davidopp