Modify Eviction Strategy to take Priority into account #946

Merged
merged 2 commits into from
Aug 26, 2017


dashpole
Contributor

Previously, in #846, I proposed several possible ways to take priority into account during eviction.
We agreed on the following strategy:

Only evict pods where usage > requests. Then sort by Function(priority, usage - requests)

This solution gives users a clear path to avoiding evictions: keep usage at or below requests. It prevents abuse by high-priority pods, because they cannot disrupt other pods that are below, or near, their requests. Using a function provides a more nuanced way to let pods consume "unused" (not requested) memory on the node. Power users and cluster administrators can determine how unused memory is allocated to pods by choosing priority levels that are closer together, for more equal sharing of extra memory, or further apart, to give better availability to higher-priority pods. For pods that have equal priority, the function is equivalent to usage - requests, so clusters that do not have priority enabled keep behavior that is similar to (though not exactly the same as) today's behavior.
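To make the strategy concrete, here is a minimal Python sketch of the two steps: filter to pods whose usage exceeds their requests, then rank them by a function of priority and usage - requests. The PR text does not define the function itself, so `example_score`, the `Pod` fields, and `rank_for_eviction` are all hypothetical names chosen for illustration. The scoring formula is only one possible choice that reduces to ordering by usage - requests when priorities are equal.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Pod:
    name: str
    priority: int  # assumed non-negative; 0 when priority is not in use
    usage: int     # current usage of the starved resource
    request: int   # amount of that resource the pod requested


def example_score(priority: int, overage: int) -> float:
    # Hypothetical stand-in for Function(priority, usage - requests).
    # A higher score means the pod is evicted sooner. For pods of equal
    # priority, the ordering is the same as ordering by overage alone.
    return overage / (1 + priority)


def rank_for_eviction(
    pods: List[Pod],
    score: Callable[[int, int], float] = example_score,
) -> List[Pod]:
    # Step 1: only pods whose usage exceeds their requests are candidates.
    candidates = [p for p in pods if p.usage > p.request]
    # Step 2: rank candidates by the function of priority and overage.
    return sorted(
        candidates,
        key=lambda p: score(p.priority, p.usage - p.request),
        reverse=True,
    )


pods = [
    Pod("a", priority=0, usage=300, request=100),   # overage 200
    Pod("b", priority=0, usage=150, request=200),   # under its request
    Pod("c", priority=9, usage=1200, request=200),  # overage 1000
    Pod("d", priority=0, usage=250, request=100),   # overage 150
]
ranked_names = [p.name for p in rank_for_eviction(pods)]
print(ranked_names)
```

Under this assumed formula, pod `b` is never a candidate because it stays below its request, and pod `c` is ranked after the lower-priority pods even though its overage is larger. Moving priority levels closer together or further apart changes how strongly priority outweighs overage.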

This PR updates the eviction documentation to describe the new eviction strategy, and removes outdated information on inode eviction.
@kubernetes/sig-node-proposals
@derekwaynecarr @vishh @dchen1107 @sjenning
@davidopp @bsalamat

@k8s-ci-robot added the sig/node, kind/design, and cncf-cla: yes labels Aug 18, 2017
@k8s-github-robot added the size/M label Aug 18, 2017

It will target pods whose usage of the starved resource exceeds its requests.
Of those pods, it will rank by a function of priority, and usage - requests.
If system daemons are exceeding their allocation (see [Strategy Caveat](strategy-caveat) below),
Member


Besides system daemons, there are critical system pods whose eviction may break the functionality of the node or even the whole cluster. In the next couple of months, we will set a priority class for critical system pods. There are two priority classes for system pods: system-cluster-critical and system-node-critical. The former should be set for critical system pods that should be present in every cluster but do not need to run on every node. The latter should be set for critical system pods that should be present on every node, e.g., kube-proxy. Pods with the system-node-critical priority class should be treated like system daemons and should be protected from eviction as much as possible.
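One way to read this suggestion is that pods in the system-node-critical class are only considered after every other candidate. The sketch below assumes that reading. The function name and the `(name, priority_class)` tuple shape are hypothetical, and nothing here reflects an actual kubelet implementation.

```python
from typing import List, Tuple

NODE_CRITICAL = "system-node-critical"


def node_critical_last(ranked: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    # `ranked` holds (pod_name, priority_class_name) pairs that are already
    # in eviction order. Pods in the node-critical class are moved to the
    # end, keeping the relative order inside each group.
    regular = [p for p in ranked if p[1] != NODE_CRITICAL]
    critical = [p for p in ranked if p[1] == NODE_CRITICAL]
    return regular + critical


ranked = [
    ("kube-proxy", "system-node-critical"),
    ("web", ""),
    ("dns", "system-cluster-critical"),
]
reordered = node_critical_last(ranked)
print(reordered)
```

In this sketch, system-cluster-critical pods get no special treatment beyond whatever their priority value already gives them in the ranking. That is an assumption, not something the comment specifies.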

@derekwaynecarr
Member

LGTM

fyi @sjenning

@derekwaynecarr merged commit 398dc7e into kubernetes:master Aug 26, 2017
@dashpole deleted the priority_eviction branch August 26, 2017 01:49
MadhavJivrajani pushed a commit to MadhavJivrajani/community that referenced this pull request Nov 30, 2021
Modify Eviction Strategy to take Priority into account
danehans pushed a commit to danehans/community that referenced this pull request Jul 18, 2023
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/design Categorizes issue or PR as related to design. sig/node Categorizes an issue or PR as relevant to SIG Node. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.