out of resource killing (memory) #21274
837719b to ccb9ce6
b80c763 to 8bb4fed
		return
	}
	glog.Infof("stability manager: kill pod %s", pod.Name)
	err = podLifecycleManager.KillPod(pod)
Recording a note here for when I get time to get back to this PR: I will look to use kubelet.HandlePodDeletions.
8bb4fed to 7bdf4bd
}

// PodUsageInfo returns usage information for each container in a pod.
func (r *resourceAccountantImpl) PodUsageInfo(pod *api.Pod) (api.ResourceList, error) {
Need to move this to use stats.PodStats
https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/server/stats/summary.go#L185
7bdf4bd to f5d3de7
c03d24e to 9715e29
173ad77 to f482830
// disk is best effort, so we don't measure relative to a request.
// TODO: add disk as a guaranteed resource
p1Disk := p1Usage[api.ResourceStorage]
Resource storage is pretty generic and, weirdly, it only applies to volumes as per the API comments. Should local-disk be a first-class resource? For the purposes of this PR, this can be a kubelet internal type too.
Yeah, I was unsure what resource to really use. Fine with that.
added internal type "disk".
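A minimal sketch of the ordering idea discussed in this thread, not the code from the PR: memory is scored relative to the pod's request, while disk is scored by absolute usage because it is best effort and has no request to compare against. The `podUsage` type, its fields, and the `rank`, `memoryOverRequest`, and `diskUsed` helpers are all hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// podUsage is a hypothetical record; all quantities are plain byte counts.
type podUsage struct {
	name          string
	memoryUsage   int64
	memoryRequest int64
	diskUsage     int64
}

// rank returns pod names ordered by descending score, leaving the input untouched.
func rank(pods []podUsage, score func(podUsage) int64) []string {
	sorted := append([]podUsage(nil), pods...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return score(sorted[i]) > score(sorted[j])
	})
	names := make([]string, 0, len(sorted))
	for _, p := range sorted {
		names = append(names, p.name)
	}
	return names
}

// memoryOverRequest scores memory relative to what the pod asked for.
func memoryOverRequest(p podUsage) int64 { return p.memoryUsage - p.memoryRequest }

// diskUsed scores disk by absolute usage, since there is no request to compare against.
func diskUsed(p podUsage) int64 { return p.diskUsage }

func main() {
	pods := []podUsage{
		{name: "a", memoryUsage: 300, memoryRequest: 250, diskUsage: 10},
		{name: "b", memoryUsage: 200, memoryRequest: 50, diskUsage: 5},
		{name: "c", memoryUsage: 100, memoryRequest: 100, diskUsage: 40},
	}
	fmt.Println(rank(pods, memoryOverRequest)) // [b a c]
	fmt.Println(rank(pods, diskUsed))          // [c a b]
}
```

Passing the scoring function in keeps the two policies side by side, so adding disk as a guaranteed resource later would only mean swapping `diskUsed` for a request-relative score.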
I added a few comments. Overall this PR LGTM. @derekwaynecarr Thanks for coming up with super readable code :)
5d2922b to 97eeeb5
97eeeb5 to edc76f6
@vishh - updated code per your comments. /cc @kubernetes/rh-cluster-infra @kubernetes/sig-node - if anyone wants to take a pass at memory eviction.
@derekwaynecarr I hope at least one of the memory eviction PRs (ideally the PR that adds flags) has a
@derekwaynecarr Next step is to add a node e2e to test all this logic. The test might need some changes to the framework - setting eviction flags.
I have a follow-on PR to enable the max pod eviction period and will have a release note for that PR that covers the feature.
GCE e2e build/test passed for commit edc76f6.
@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]
GCE e2e build/test passed for commit edc76f6.
Automatic merge from submit-queue
type signalObservations map[Signal]resource.Quantity

// thresholdsObservedAt maps a threshold to a time that it was observed
type thresholdsObservedAt map[Threshold]time.Time
I think this is broken, because you're assuming that resource.Quantity contributes to the uniqueness of the threshold. It does not, because it's a pointer to an inf.Dec, which is only compared on pointer equality, not actual equality.
If you are depending on uniqueness of Quantity to signal value, you'll need to change your map keys.
Basically, map[Threshold1{Quantity: NewQuantity("1m")}] is not guaranteed to return the same value if you call it twice.
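A small, self-contained sketch of the pitfall described above, using hypothetical stand-in types rather than the real `resource.Quantity` or `Threshold`: when a struct used as a map key contains a pointer, Go compares that field by address, so two keys that look identical miss each other. The `quantity`, `threshold`, and `thresholdKey` types and the `"memory.available"` signal string are assumptions for illustration; `thresholdKey` shows one possible way to change the map key, not necessarily the fix the PR adopted.

```go
package main

import "fmt"

// quantity is a hypothetical stand-in for a value type that wraps a pointer.
type quantity struct {
	amount *int64
}

// threshold is a hypothetical map key that embeds the pointer-backed value.
type threshold struct {
	signal string
	value  quantity
}

// thresholdKey is one possible alternative key built only from plain values.
type thresholdKey struct {
	signal string
	milli  int64
}

// newQuantity allocates a fresh pointer on every call.
func newQuantity(v int64) quantity {
	return quantity{amount: &v}
}

// lookupByPointerKey stores under one key and looks up with an equal-looking key.
// The two keys hold different pointers, so the lookup misses.
func lookupByPointerKey() bool {
	observed := map[threshold]string{}
	observed[threshold{signal: "memory.available", value: newQuantity(1)}] = "seen"
	_, ok := observed[threshold{signal: "memory.available", value: newQuantity(1)}]
	return ok
}

// lookupByValueKey does the same with a key that compares by value, so it hits.
func lookupByValueKey() bool {
	observed := map[thresholdKey]string{}
	observed[thresholdKey{signal: "memory.available", milli: 1}] = "seen"
	_, ok := observed[thresholdKey{signal: "memory.available", milli: 1}]
	return ok
}

func main() {
	fmt.Println(lookupByPointerKey()) // false
	fmt.Println(lookupByValueKey())   // true
}
```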
…to-scheduler
Automatic merge from submit-queue
Introduce node memory pressure condition to scheduler
Following the work done by @derekwaynecarr at #21274, introducing memory pressure predicate for scheduler.
Missing:
* write down unit-test
* test the implementation
At the moment this is a heads up for further discussion how the new node's memory pressure condition should be handled in the generic scheduler.
**Additional info**
* Based on [1], only best effort pods are subject to filtering.
* Based on [2], best effort pods are those pods "iff requests & limits are not specified for any resource across all containers".
[1] https://github.com/derekwaynecarr/kubernetes/blob/542668cc7998fe0acb315a43731e1f45ecdcc85b/docs/proposals/kubelet-eviction.md#scheduler
[2] #14943
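The two rules quoted in that commit message can be sketched as follows. This is an illustration under stated assumptions, not the scheduler predicate itself: the `container` and `pod` types, `isBestEffort`, and `admitsUnderMemoryPressure` are hypothetical names, and resources are modelled as plain string maps instead of the API's resource lists.

```go
package main

import "fmt"

// container is a hypothetical, simplified container spec.
type container struct {
	requests map[string]string
	limits   map[string]string
}

// pod is a hypothetical, simplified pod spec.
type pod struct {
	containers []container
}

// isBestEffort reports whether no container specifies any request or limit.
func isBestEffort(p pod) bool {
	for _, c := range p.containers {
		if len(c.requests) > 0 || len(c.limits) > 0 {
			return false
		}
	}
	return true
}

// admitsUnderMemoryPressure filters out only best effort pods when the node
// reports memory pressure; all other pods pass.
func admitsUnderMemoryPressure(p pod, memoryPressure bool) bool {
	if !memoryPressure {
		return true
	}
	return !isBestEffort(p)
}

func main() {
	bestEffort := pod{containers: []container{{}}}
	withRequest := pod{containers: []container{
		{requests: map[string]string{"memory": "64Mi"}},
	}}
	fmt.Println(admitsUnderMemoryPressure(bestEffort, true))  // false
	fmt.Println(admitsUnderMemoryPressure(withRequest, true)) // true
	fmt.Println(admitsUnderMemoryPressure(bestEffort, false)) // true
}
```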
UPSTREAM: 69890: Run static pods before bootstrap Origin-commit: db65ebbf04b82a420e88b7f870a3ec667dd69998
Adds the core framework for low-resource killing in the kubelet.
Implements support for out of memory killing.
Related:
#18724