Ignore pods for quota marked for deletion whose node is unreachable #46542
Conversation
@smarterclayton - I would like your input on this. Another alternative is to not charge the user for a pod whose deletion timestamp plus grace period has passed. After opening this, I am thinking that may be the better option, as it covers all "pod is stuck terminating" scenarios pretty well.
FYI, the 1.5 PR that changed the force-delete behavior: #36017. I like the "don't include pods that are beyond their graceful termination time" approach, as that typically indicates the hold-up in pod deletion is a problem on the system side, not the user side, i.e. there is nothing the user can do to create this situation... unless they can create container processes that resist a SIGKILL (uninterruptible sleep).
@sjenning @smarterclayton @deads2k -- finally got around to updating this; I think it gives a more user-friendly experience. Once the pod status is updated as NodeLost, the quota system will see the pod update and release the pod's charge against quota.
@@ -55,7 +56,7 @@ func TestPodReplenishmentUpdateFunc(t *testing.T) {
 		ObjectMeta: metav1.ObjectMeta{Namespace: "test", Name: "pod"},
 		Status:     v1.PodStatus{Phase: v1.PodFailed},
 	}
-	updateFunc := PodReplenishmentUpdateFunc(&options)
+	updateFunc := PodReplenishmentUpdateFunc(&options, clock.RealClock{})
realclock in a test looks weird.
// - pod has been marked for deletion and grace period has expired.
func QuotaPod(pod *api.Pod, clock clock.Clock) bool {
	// if pod is terminal, ignore it for quota
	if api.PodFailed == pod.Status.Phase || api.PodSucceeded == pod.Status.Phase {
pods have conditions now, right? Since you're rewriting the function anyway, can we update it to use conditions?
the pod Phase is still the state machine we must follow to know that a pod has reached the end of its life. the PodCondition is not sufficient for that need.
> the pod Phase is still the state machine we must follow to know that a pod has reached the end of its life. the PodCondition is not sufficient for that need.
This is really weird. It's been over two years since we decided that conditions were a better choice than phases. When is the kubelet going to update?
Never because of backwards API compatibility.
> Never because of backwards API compatibility.
Eh, it only skews two releases. I'd rather have all our code keep as current as compatibility allows so developers aren't trying to remember: well these conditions match these phases. Cruft like this is super hard to deal with in controllers and utilities.
pkg/quota/evaluator/core/pods.go
	if api.PodFailed == pod.Status.Phase || api.PodSucceeded == pod.Status.Phase {
		return false
	}
	// if pods are stuck terminating (for example, a node is lost), we do not want
This may be obvious to you, but not to me. DeletionTimestamp non-nil means it's terminating? I thought an update had to come back to recognize it as terminating.
If you mean, "deleted pods that should be gone should not be charged", then this code makes sense to me.
It means "deleted pods that should be gone should not be charged"; I will look to clarify the comment.
Any progress on this PR? The existing issue in 1.5+ is really bad in clusters with quotas since replacement replicas can't spin up. We keep getting hit by this in prod. I'd be happy to pick this up and finish it if needed.
I will rebase this and get it in for 1.8
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: deads2k, derekwaynecarr Associated issue: 52436 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing
/sig api-machinery
/test pull-kubernetes-unit
/retest Review the full test history for this PR.
/retest
@derekwaynecarr: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
Automatic merge from submit-queue (batch tested with PRs 52442, 52247, 46542, 52363, 51781) |
Any chance for a 1.7 cherrypick? |
Why didn't we fix the node controller? |
@jdumars What's the best way to contact him? Is it easier if I open a cherrypick PR? |
@jsravn yes, that would be great. The cherrypick tool should get the job done. |
Cherry picked @ #53332 |
Commit found in the "release-1.7" branch appears to be this PR. Removing the "cherrypick-candidate" label. If this is an error find help to get your PR picked. |
@lavalamp -- the node controller was not broken. it intentionally changed the behavior for data safety concerns. |
Was this backported to Kubernetes 1.5 or 1.6?
@songbird159 Nope only 1.7. I think 1.5 and 1.6 are EOL now. |
What this PR does / why we need it:
Traditionally, we charge quota for all pods that are in a non-terminal phase. We have a user report noting the behavior change in kube 1.5: the node controller no longer force-deletes pods whose nodes have been lost. Instead, the pod is marked for deletion, and its reason is updated to state that the node is unreachable. The user expected the quota to be released. If the user was at their quota limit, their application may not be able to create a new replica under the current behavior. As a result, this PR ignores pods marked for deletion that have exceeded their grace period.
Which issue this PR fixes:
xref https://bugzilla.redhat.com/show_bug.cgi?id=1455743
fixes #52436
Release note: