
fixing array out of bound by checking initContainers instead of containers #58574

Merged: 1 commit into kubernetes:master, Jan 26, 2018

Conversation

@yastij (Member) commented Jan 21, 2018

What this PR does / why we need it:

Which issue(s) this PR fixes: Fixes #58541

Special notes for your reviewer:

Release note:

Fixes a bug where kubelet crashes trying to free memory under memory pressure
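
For context, the crash is an index-out-of-range panic: podRequest ranged over pod.Spec.InitContainers while indexing pod.Spec.Containers. A minimal sketch of the failure mode, using a hypothetical pod (assumes k8s.io/api/core/v1 is available; not code from the PR itself):

```go
package main

import v1 "k8s.io/api/core/v1"

func main() {
	// A pod with more init containers than app containers.
	pod := &v1.Pod{
		Spec: v1.PodSpec{
			Containers:     make([]v1.Container, 1),
			InitContainers: make([]v1.Container, 2),
		},
	}
	// Pre-fix loop shape: range over one slice, index the other.
	for i := range pod.Spec.InitContainers {
		_ = pod.Spec.Containers[i] // panics at i == 1: index out of range
	}
}
```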

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 21, 2018
@k8s-ci-robot (Contributor)
Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA.

It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. label Jan 21, 2018
@yastij yastij force-pushed the fix-kubelet-podRequest branch from 07cd678 to 1a14fa4 Compare January 21, 2018 00:55
```diff
@@ -609,9 +609,9 @@ func podRequest(pod *v1.Pod, resourceName v1.ResourceName) resource.Quantity {
 	for i := range pod.Spec.InitContainers {
 		switch resourceName {
 		case v1.ResourceMemory:
-			containerValue.Add(*pod.Spec.Containers[i].Resources.Requests.Memory())
+			containerValue.Add(*pod.Spec.InitContainers[i].Resources.Requests.Memory())
```
Member:
I believe you should use initValue.Add and not containerValue.Add.

Member (Author):
Seems the second commit didn't ship; fixed.

Member:
LGTM

```diff
 		case resourceDisk:
-			containerValue.Add(*pod.Spec.Containers[i].Resources.Requests.StorageEphemeral())
+			containerValue.Add(*pod.Spec.InitContainers[i].Resources.Requests.StorageEphemeral())
```
Member:
Same here

Member (Author):
Same.
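
Taken together, the two comments ask the loop to both index InitContainers and accumulate into initValue. A sketch of the expected shape under those assumptions (the updated diff further down shows the memory case; initRequest is a hypothetical name, and v1.ResourceEphemeralStorage stands in here for the package-local resourceDisk constant so the sketch is self-contained):

```go
package eviction

import (
	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// initRequest sums the init containers' requests for one resource.
func initRequest(pod *v1.Pod, resourceName v1.ResourceName) resource.Quantity {
	initValue := resource.Quantity{Format: resource.BinarySI}
	for i := range pod.Spec.InitContainers {
		switch resourceName {
		case v1.ResourceMemory:
			initValue.Add(*pod.Spec.InitContainers[i].Resources.Requests.Memory())
		case v1.ResourceEphemeralStorage:
			initValue.Add(*pod.Spec.InitContainers[i].Resources.Requests.StorageEphemeral())
		}
	}
	return initValue
}
```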

@yastij yastij force-pushed the fix-kubelet-podRequest branch from 1a14fa4 to 8285b6d Compare January 21, 2018 01:09
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Jan 21, 2018
@yastij yastij force-pushed the fix-kubelet-podRequest branch from 8285b6d to 5110b6d Compare January 21, 2018 01:13
@dims (Member) commented Jan 21, 2018

/ok-to-test

@k8s-ci-robot k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jan 21, 2018
@dims (Member) commented Jan 21, 2018

/assign @dashpole

@yastij (Member, Author) commented Jan 21, 2018

/retest

```diff
@@ -609,9 +609,9 @@ func podRequest(pod *v1.Pod, resourceName v1.ResourceName) resource.Quantity {
 	for i := range pod.Spec.InitContainers {
 		switch resourceName {
 		case v1.ResourceMemory:
-			containerValue.Add(*pod.Spec.Containers[i].Resources.Requests.Memory())
+			initValue.Add(*pod.Spec.InitContainers[i].Resources.Requests.Memory())
```
Member:
Is this logic right? Since init containers run sequentially, I thought it was supposed to be max(max(initContainer), sum(container)).

Member:

That is the approach taken at:

```go
// PodUsageFunc returns the quota usage for a pod.
// A pod is charged for quota if the following are not true.
//  - pod has a terminal phase (failed or succeeded)
//  - pod has been marked for deletion and grace period has expired
func PodUsageFunc(obj runtime.Object, clock clock.Clock) (api.ResourceList, error) {
	pod, err := toInternalPodOrError(obj)
	if err != nil {
		return api.ResourceList{}, err
	}
	// always quota the object count (even if the pod is end of life)
	// object count quotas track all objects that are in storage.
	// where "pods" tracks all pods that have not reached a terminal state,
	// count/pods tracks all pods independent of state.
	result := api.ResourceList{
		podObjectCountName: *(resource.NewQuantity(1, resource.DecimalSI)),
	}
	// by convention, we do not quota compute resources that have reached end-of life
	// note: the "pods" resource is considered a compute resource since it is tied to life-cycle.
	if !QuotaPod(pod, clock) {
		return result, nil
	}
	requests := api.ResourceList{}
	limits := api.ResourceList{}
	// TODO: ideally, we have pod level requests and limits in the future.
	for i := range pod.Spec.Containers {
		requests = quota.Add(requests, pod.Spec.Containers[i].Resources.Requests)
		limits = quota.Add(limits, pod.Spec.Containers[i].Resources.Limits)
	}
	// InitContainers are run sequentially before other containers start, so the highest
	// init container resource is compared against the sum of app containers to determine
	// the effective usage for both requests and limits.
	for i := range pod.Spec.InitContainers {
		requests = quota.Max(requests, pod.Spec.InitContainers[i].Resources.Requests)
		limits = quota.Max(limits, pod.Spec.InitContainers[i].Resources.Limits)
	}
	result = quota.Add(result, podComputeUsageHelper(requests, limits))
	return result, nil
}
```

@yastij (Member, Author) commented Jan 22, 2018

@liggitt - Indeed; since they're run sequentially, I think it might be the right logic.

cc @smarterclayton
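
Applying that convention to podRequest would look roughly like the following sketch (memory only, hypothetical function name, assumed shape rather than the PR's final diff): sum the app containers' requests, take the maximum across init containers, and return the larger of the two.

```go
package eviction

import (
	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// podMemoryRequest (hypothetical name) computes the effective memory request:
// max(sum over app containers, max over init containers).
func podMemoryRequest(pod *v1.Pod) resource.Quantity {
	containerValue := resource.Quantity{Format: resource.BinarySI}
	for i := range pod.Spec.Containers {
		containerValue.Add(*pod.Spec.Containers[i].Resources.Requests.Memory())
	}
	initValue := resource.Quantity{Format: resource.BinarySI}
	for i := range pod.Spec.InitContainers {
		// Init containers run sequentially, so only the largest request matters.
		if q := pod.Spec.InitContainers[i].Resources.Requests.Memory(); initValue.Cmp(*q) < 0 {
			initValue = *q
		}
	}
	if containerValue.Cmp(initValue) > 0 {
		return containerValue
	}
	return initValue
}
```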


@yastij yastij force-pushed the fix-kubelet-podRequest branch from 5110b6d to 3c54c42 Compare January 24, 2018 22:23
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jan 24, 2018
@yastij yastij force-pushed the fix-kubelet-podRequest branch 2 times, most recently from 9be6745 to f1b7a92 Compare January 24, 2018 23:06
@yastij yastij force-pushed the fix-kubelet-podRequest branch from f1b7a92 to ed8e75a Compare January 25, 2018 08:59
@dashpole (Contributor)
/lgtm
thanks for the fix!
/sig node
/kind bug
/priority critical-urgent

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/node Categorizes an issue or PR as relevant to SIG Node. labels Jan 25, 2018
@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. labels Jan 25, 2018
@dashpole (Contributor)
/assign @dchen1107
for approval

@dashpole dashpole added this to the v1.9 milestone Jan 25, 2018
@k8s-github-robot
[MILESTONENOTIFIER] Milestone Pull Request Needs Approval

@dashpole @dchen1107 @yastij @kubernetes/sig-node-misc

Action required: This pull request must have the status/approved-for-milestone label applied by a SIG maintainer.

Pull Request Labels
  • sig/node: Pull Request will be escalated to these SIGs if needed.
  • priority/critical-urgent: Never automatically move pull request out of a release milestone; continually escalate to contributor and SIG through all available channels.
  • kind/bug: Fixes a bug discovered during the current release.

@dashpole (Contributor)
This should be cherry-picked to 1.9.

@dchen1107 (Member)
/lgtm

@k8s-ci-robot (Contributor)
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dashpole, dchen1107, yastij

Associated issue: #58541

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 26, 2018
@dchen1107 (Member)
cc @mehdy, the 1.9 branch manager, for the cherry-picking.

@k8s-github-robot
/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit b6824af into kubernetes:master Jan 26, 2018
@yastij yastij deleted the fix-kubelet-podRequest branch January 26, 2018 19:26
@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Jan 26, 2018
k8s-github-robot pushed a commit that referenced this pull request Feb 1, 2018
…74-upstream-release-1.9

Automatic merge from submit-queue.

Automated cherry pick of #58574: fixing array out of bound by checking initContainers instead

Cherry pick of #58574 on release-1.9.

#58574: fixing array out of bound by checking initContainers instead

Fixes #58541
@k8s-cherrypick-bot
Commit found in the "release-1.9" branch appears to be this PR. Removing the "cherrypick-candidate" label. If this is an error find help to get your PR picked.

Labels
  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • cncf-cla: yes: Indicates the PR's author has signed the CNCF CLA.
  • kind/bug: Categorizes issue or PR as related to a bug.
  • lgtm: "Looks good to me", indicates that a PR is ready to be merged.
  • milestone/needs-approval
  • priority/critical-urgent: Highest priority. Must be actively worked on as someone's top priority right now.
  • release-note: Denotes a PR that will be considered when it comes time to generate release notes.
  • sig/node: Categorizes an issue or PR as relevant to SIG Node.
  • size/S: Denotes a PR that changes 10-29 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

Kubelet crashes trying to free memory under MemoryPressure
8 participants