Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HugePages feature #50859

Merged
Merged

Conversation

derekwaynecarr
Copy link
Member

@derekwaynecarr derekwaynecarr commented Aug 17, 2017

What this PR does / why we need it:
Implements HugePages support per kubernetes/community#837

Feature track issue: kubernetes/enhancements#275

Special notes for your reviewer:
A follow-on PR is opened to add the EmptyDir support.

Release note:

Alpha support for pre-allocated hugepages

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Aug 17, 2017
@k8s-github-robot k8s-github-robot added kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. release-note Denotes a PR that will be considered when it comes time to generate release notes. labels Aug 17, 2017
@derekwaynecarr derekwaynecarr assigned dchen1107 and davidopp and unassigned jsafrane Aug 17, 2017
@derekwaynecarr derekwaynecarr added this to the v1.8 milestone Aug 17, 2017
for i := range pod.Spec.Containers {
resourceSet := toContainerResourcesSet(&pod.Spec.Containers[i])
for resourceStr := range resourceSet {
if v1helper.IsHugePageResourceName(v1.ResourceName(resourceStr)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to validate the HugePage size? i.e. 2Mi and 1Gi are valid and others not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That means, what if users configure hugepages-3Mi ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xiangpengzhao - that is a good question.

there are a variety of large page sizes depending on the architecture, so i was hoping to avoid an explicit enumeration of them or awareness in the api server. this is an alpha feature, but my long term expectation is that operators would configure the ResourceQuota system to make consumption of hugepages denied by default per something like #36765 , and they would then grant explicit quota for the page sizes supported in their fleet of machines. WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM. I like this idea. Thanks for explanation!

Another concern, I have a use case that users want to have the privilege to adjust their hugepages usage, or even the node hugepages capacity dynamically. Yes this will cause complication and seems like not reasonable enough. Any thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They want to release the unused hugepages to be "normal" memory so that these memory can be used by other process on the node.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xiangpengzhao - that is out of scope for alpha. its possible you could run a controller that observed pods pending due to lack of hugepages capacity across cluster, and run a pod to allocate them on a particular node in response. there are pros/cons with that especially with gigantic page sizes. i had a pod that did something like that here: https://github.com/derekwaynecarr/hugepages/tree/master/allocator

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@derekwaynecarr Cool! Thanks, Derek!

@k8s-github-robot k8s-github-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Aug 21, 2017
@derekwaynecarr
Copy link
Member Author

need #50629 to merge first.

@k8s-github-robot k8s-github-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Aug 22, 2017
@derekwaynecarr
Copy link
Member Author

/test pull-kubernetes-e2e-kops-aws

1 similar comment
@derekwaynecarr
Copy link
Member Author

/test pull-kubernetes-e2e-kops-aws

@derekwaynecarr
Copy link
Member Author

prerequisite prs merged, this should be good to go.

@derekwaynecarr
Copy link
Member Author

note to respond to #50773 (comment) in a follow-on pr.

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 26, 2017
@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 28, 2017
@@ -43,6 +47,10 @@ const (
libcontainerSystemd libcontainerCgroupManagerType = "systemd"
)

// hugePageSizeList is useful for converting to the hugetlb canonical unit
// which is what is expected when interacting with libcontainer
var hugePageSizeList = []string{"", "kB", "MB", "GB", "TB", "PB"}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/kB/KB to make it consistent with other units?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the units were intended to align with what libcontainer uses here (except "B" was irrelevant)

https://github.com/opencontainers/runc/blob/18cd7e06f709750d76a97f690a5b0d5023c52108/libcontainer/cgroups/utils.go#L401


// if huge pages are enabled, we report them as a schedulable resource on the node
if utilfeature.DefaultFeatureGate.Enabled(features.HugePages) {
for _, hugepagesInfo := range info.HugePages {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we agreed to for this release, only support single huge page size, but I couldn't find where we ensure that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, good catch. i wrote validation for pod to only consume one size, and forgot to add the validation for node to only report one size larger than 0. will add now.

@@ -262,6 +296,13 @@ func (m *qosContainerManagerImpl) UpdateCgroups() error {
return err
}

// update the qos level cgroup settings for huge pages (ensure they remain unbounded)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this lead to the overcommit for hugepage at the node level?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. We enforce via validation of pod spec that over-commit is not allowed.

@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 1, 2017
@derekwaynecarr
Copy link
Member Author

rebased on kubefeature change.

@derekwaynecarr
Copy link
Member Author

/retest

@smarterclayton
Copy link
Contributor

smarterclayton commented Sep 1, 2017

/lgtm

trivial rebase

@dchen1107
Copy link
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 1, 2017
@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 3, 2017
@derekwaynecarr
Copy link
Member Author

rebased

@k8s-github-robot k8s-github-robot removed lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Sep 5, 2017
@derekwaynecarr
Copy link
Member Author

/test pull-kubernetes-e2e-gce-etcd3
/test pull-kubernetes-e2e-kops-aws

@derekwaynecarr
Copy link
Member Author

kernel panic on previous runs not related to this test:

[ 1036.734489] BUG: unable to handle kernel NULL pointer dereference at 0000000000000078

/test pull-kubernetes-e2e-gce-etcd3

@ConnorDoyle
Copy link
Contributor

/assign

@sjenning
Copy link
Contributor

sjenning commented Sep 5, 2017

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 5, 2017
@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dchen1107, derekwaynecarr, sjenning, smarterclayton

Associated issue: 275

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot
Copy link

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-ci-robot
Copy link
Contributor

k8s-ci-robot commented Sep 5, 2017

@derekwaynecarr: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
pull-kubernetes-e2e-kops-aws 38d5dee link /test pull-kubernetes-e2e-kops-aws

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-github-robot
Copy link

Automatic merge from submit-queue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.