Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce node memory pressure condition to scheduler #25531

Conversation

ingvagabund
Copy link
Contributor

@ingvagabund ingvagabund commented May 12, 2016

Following the work done by @derekwaynecarr at #21274, introducing memory pressure predicate for scheduler.

Missing:

  • write down unit-test
  • test the implementation

At the moment this is a heads up for further discussion how the new node's memory pressure condition should be handled in the generic scheduler.

Additional info

  • Based on [1], only best effort pods are subject to filtering.
  • Based on [2], best effort pods are those pods "iff requests & limits are not specified for any resource across all containers".

[1] https://github.com/derekwaynecarr/kubernetes/blob/542668cc7998fe0acb315a43731e1f45ecdcc85b/docs/proposals/kubelet-eviction.md#scheduler
[2] #14943

@ingvagabund ingvagabund changed the title WIP: Introduce memory pressure to scheduler WIP: Introduce node memory pressure condition to scheduler May 12, 2016
@k8s-github-robot k8s-github-robot added kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/old-docs size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels May 12, 2016
@ingvagabund ingvagabund force-pushed the introduce-memory-pressure-to-scheduler branch from 0a0591e to b8f551d Compare May 12, 2016 12:17
func isPodBestEffort(pod *api.Pod) bool {
// A Pod is BestEffort iff requests & limits are not specified
// for any resource across all containers.
for _, container := range pod.Spec.Containers {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was pretty sure it has to be already implemented ;-) Thanks for the link. Updated.

@derekwaynecarr
Copy link
Member

/cc @vishh - we will add disk pressure as well once that condition is available.

@ingvagabund - I will notify you as I push updates to the referenced PRs, I am hoping after today it just goes down to 1 commit this PR would carry after I do a final round of cleanups (assuming merge queue merges stuff).

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 12, 2016
@erictune erictune assigned davidopp and unassigned erictune May 12, 2016
@derekwaynecarr
Copy link
Member

@ingvagabund - the out of resource killing pr is down to 1 commit, so you should be able to rebase your pr relative to that so its easier for folks to follow. #21274

@@ -111,6 +111,9 @@ func init() {
Weight: 1,
},
)

// Fit is determined by node memory pressure condition.
factory.RegisterFitPredicate("NodeMemoryPressure", predicates.NodeMemoryPressurePredicate)
Copy link
Member

@davidopp davidopp May 16, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should either put this under defaultPredicates() or GeneralPredicates(). Putting it where you did, it is disabled by default.

Put in defaultPredicates() if you only want it to run in the scheduler; put it in GeneralPredicates() if you want it to run in the scheduler and kubelet admission controller. (Only put it in one or the other, not both.)

Copy link
Contributor Author

@ingvagabund ingvagabund May 16, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering what should be the best place to put it in and if it should be run by default. Thinking about it for the second time, it should be as the kubelet will evict all pods by default if its node is under memory pressure. @derekwaynecarr proof me wrong if we need the predicate in the kubelet admission controller. The predicate is mainly for excluding the node from scheduling best effort pods. So scheduler should be enough.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scheduler should be enough.

@davidopp
Copy link
Member

Please add a test for the new scheduler predicate.

@ingvagabund ingvagabund force-pushed the introduce-memory-pressure-to-scheduler branch from b8f551d to d24a3ef Compare May 16, 2016 10:17
@k8s-github-robot k8s-github-robot removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. kind/old-docs labels May 16, 2016
return qosutil.GetPodQos(pod) == qosutil.BestEffort
}

// MemoryPressurePredicate checks if a pod can be scheduled on a node
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/MemoryPressurePredicate/CheckNodeMemoryPressurePredicate

@ingvagabund ingvagabund force-pushed the introduce-memory-pressure-to-scheduler branch from 63c1b1d to 14a81c9 Compare May 18, 2016 11:48
@ingvagabund ingvagabund changed the title WIP: Introduce node memory pressure condition to scheduler Introduce node memory pressure condition to scheduler May 18, 2016
@k8s-github-robot k8s-github-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels May 18, 2016
@ingvagabund ingvagabund force-pushed the introduce-memory-pressure-to-scheduler branch 3 times, most recently from 45bbe0d to 3126caa Compare May 18, 2016 19:29
@ingvagabund ingvagabund force-pushed the introduce-memory-pressure-to-scheduler branch from 3126caa to af1debb Compare May 18, 2016 20:36
@vishh vishh added this to the v1.3 milestone May 18, 2016
@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 18, 2016
@ingvagabund ingvagabund force-pushed the introduce-memory-pressure-to-scheduler branch from af1debb to 2db76a2 Compare May 19, 2016 06:53
@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 19, 2016
@ingvagabund
Copy link
Contributor Author

@davidopp PTAL

@derekwaynecarr derekwaynecarr added ok-to-merge release-note-none Denotes a PR that doesn't merit a release note. and removed needs-ok-to-merge labels May 19, 2016
@derekwaynecarr
Copy link
Member

the code lgtm, will let @davidopp take a pass. this needs to get in before code freeze since eviction is now enabled on node.

@@ -2641,3 +2641,118 @@ func TestPodToleratesTaints(t *testing.T) {
}
}
}

func makeNonBestEffortResources(milliCPU int64, memory int64) api.ResourceList {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: just call this getResourceList ?

see an example: https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/resourcequota/resource_quota_controller_test.go#L33

There is nothing here that actually guaranteed its not best effort since you can pass 0 for both values.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The predicate_test.go defines makeAllocatableResources function which I can use. I don't know why I did not use that before.

@derekwaynecarr
Copy link
Member

@ingvagabund - just had two minor nits if you want to fix them up...

@ingvagabund ingvagabund force-pushed the introduce-memory-pressure-to-scheduler branch from 2db76a2 to 55b92f1 Compare May 21, 2016 15:32
pod: bestEffortPod,
nodeInfo: makeEmptyNodeInfo(memoryPressureNode),
fits: false,
name: "best-effort pod schedulable on node with memory pressure condition on",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/schedulable/not schedulable/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

… pods for nodes that reports memory pressury.

Introduce unit-test for CheckNodeMemoryPressurePredicate

Following work done in kubernetes#14943
@ingvagabund ingvagabund force-pushed the introduce-memory-pressure-to-scheduler branch from 55b92f1 to b95b30b Compare May 21, 2016 22:40
@davidopp
Copy link
Member

LGTM

@davidopp davidopp added lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels May 22, 2016
@k8s-bot
Copy link

k8s-bot commented May 22, 2016

GCE e2e build/test passed for commit b95b30b.

@k8s-github-robot
Copy link

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit c17465b into kubernetes:master May 22, 2016
@ingvagabund ingvagabund deleted the introduce-memory-pressure-to-scheduler branch May 22, 2016 07:06
xingzhou pushed a commit to xingzhou/kubernetes that referenced this pull request Dec 15, 2016
…-pressure-to-scheduler

Automatic merge from submit-queue

Introduce node memory pressure condition to scheduler

Following the work done by @derekwaynecarr at kubernetes#21274, introducing memory pressure predicate for scheduler.

Missing:

* write down unit-test
* test the implementation

At the moment this is a heads up for further discussion how the new node's memory pressure condition should be handled in the generic scheduler.

**Additional info**

* Based on [1], only best effort pods are subject to filtering.
* Based on [2], best effort pods are those pods "iff requests & limits are not specified for any resource across all containers".

[1] https://github.com/derekwaynecarr/kubernetes/blob/542668cc7998fe0acb315a43731e1f45ecdcc85b/docs/proposals/kubelet-eviction.md#scheduler
[2] kubernetes#14943
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants