Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use GlobalMemoryStatusEx to get total physical memory on Windows node #57124

Merged
merged 1 commit into from
Feb 6, 2018

Conversation

JiangtianLi
Copy link
Contributor

@JiangtianLi JiangtianLi commented Dec 13, 2017

What this PR does / why we need it:
This PR fixes issue #57110 due to failure in getting total physical memory on some Windows VM such as in VMWare Fusion or Virtualbox. This change uses GlobalMemoryStatusEx instead of GetPhysicallyInstalledSystemMemory to retrieve total physical memory on Windows node. The amount obtained this way is also closer in parity with reading MemTotal from /proc/meminfo on Linux node.
(thanks to @martinivanov and @marono for the help)

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #57110

Special notes for your reviewer:

Release note:

Use a more reliable way to get total physical memory on windows nodes

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Dec 13, 2017
@dims
Copy link
Member

dims commented Dec 13, 2017

/release-note-none
/ok-to-test

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Dec 13, 2017
@JiangtianLi
Copy link
Contributor Author

@marono @trondhindenes Can you help to do sanity check to run kubelet? I don't have VMware fusion/Mac and VirtualBox/Linux handy. I've verified with HyperV/Windows.

@marono
Copy link

marono commented Dec 13, 2017

@JiangtianLi I can do VMware Fusion tomorrow

@marono
Copy link

marono commented Dec 14, 2017

@JiangtianLi delighted to confirm kubelet.exe starts on VMWare Fusion (MAC) running Windows Server 1709, with your changes ...

Thanks and well done!

@JiangtianLi
Copy link
Contributor Author

@marono Thanks so much for the help!

@alinbalutoiu
Copy link
Contributor

This was previously also failing on Windows Server RTM. Confirmed that it works on Windows Server RTM and 1709 now on VMware Workstation. The change looks good to me, good job!

I am still thinking if kubelet should fail with an error if it couldn't read the memory. On the Linux node from what I see, they ignore the error if it couldn't read the memory. https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/dockershim/cm/container_manager_linux.go#L104-L107
It will always return nil as error (https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/dockershim/cm/container_manager_linux.go#L122)

Should the error be ignored if kubelet couldn't fetch the memory?

@JiangtianLi
Copy link
Contributor Author

@alinbalutoiu Thanks for the confirmation and good suggestion! As to handling reading memory capacity failure, what I read from code is that reading memory capacity on Linux calls cadvisor: https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/cadvisor/cadvisor_linux.go#L117

cadvisor gets machine info in https://github.com/google/cadvisor/blob/master/manager/manager.go#L220 and https://github.com/google/cadvisor/blob/master/machine/info.go#L107

And if there is an error in https://github.com/google/cadvisor/blob/master/machine/machine.go#L86-L97, the error is surfaced up and not ignored.

What you pointed out seems to be ignoring reading memory capacity error in setting memory limit when creating cgroup manager.

Please feel free to correct me and let me know your thoughts.

@JiangtianLi
Copy link
Contributor Author

Kindly ping @dashpole @tallclair

@dashpole
Copy link
Contributor

dashpole commented Jan 2, 2018

cc @kubernetes/sig-windows-bugs
We could use someone with some context on windows nodes to review and lgtm

@k8s-ci-robot k8s-ci-robot added sig/windows Categorizes an issue or PR as relevant to SIG Windows. kind/bug Categorizes issue or PR as related to a bug. labels Jan 2, 2018
@michmike
Copy link
Contributor

michmike commented Jan 5, 2018

@bsteciuk can you please review?

@michmike
Copy link
Contributor

michmike commented Jan 5, 2018

@alinbalutoiu as well

Copy link
Contributor

@alinbalutoiu alinbalutoiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @JiangtianLi for the clarifications.

The changes looks good to me.

@dashpole
Copy link
Contributor

dashpole commented Jan 8, 2018

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 8, 2018
@JiangtianLi
Copy link
Contributor Author

/test pull-kubernetes-cross

@JiangtianLi
Copy link
Contributor Author

I restarted the tests and it is green (I think there was flakiness in cross test before and it is skipped this time)
/cc @michmike

@k8s-ci-robot k8s-ci-robot requested a review from michmike January 23, 2018 17:49
@JiangtianLi
Copy link
Contributor Author

/retest

@k8s-github-robot k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 25, 2018
@JiangtianLi
Copy link
Contributor Author

/cc @dashpole @kubernetes/sig-windows-bugs @michmike again. Need approve after rebase.

@k8s-ci-robot
Copy link
Contributor

@JiangtianLi: Reiterating the mentions to trigger a notification:
@kubernetes/sig-windows-bugs

In response to this:

/cc @dashpole @kubernetes/sig-windows-bugs @michmike again. Need approve after rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@JiangtianLi
Copy link
Contributor Author

JiangtianLi commented Jan 26, 2018

/assign @tallclair

@michmike
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 30, 2018
@michmike
Copy link
Contributor

/assign @tallclair

@aserdean
Copy link
Contributor

aserdean commented Feb 6, 2018

Acked-by: Alin Gabriel Serdean aserdean@ovn.org

I tested the changes using VMware Workstation 12.

/lgtm
/ping @tallclair

@k8s-ci-robot
Copy link
Contributor

@aserdean: changing LGTM is restricted to assignees, and only kubernetes org members may be assigned issues.

In response to this:

Acked-by: Alin Gabriel Serdean aserdean@ovn.org

I tested the changes using VMware Workstation 12.

/lgtm
/ping @tallclair

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@michmike
Copy link
Contributor

michmike commented Feb 6, 2018

@tallclair can you please look into this and approve

@tallclair
Copy link
Member

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aserdean, dashpole, JiangtianLi, michmike, tallclair

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 6, 2018
@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel comment for consistent failures.

@k8s-github-robot
Copy link

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot
Copy link

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.

@feiskyer
Copy link
Member

feiskyer commented Feb 7, 2018

1.9 cherry-pick PR created at #59455

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Feb 7, 2018
k8s-github-robot pushed a commit that referenced this pull request Feb 24, 2018
…24-upstream-release-1.9

Automatic merge from submit-queue.

Automated cherry pick of #57124: Use GlobalMemoryStatusEx to get total physical memory on

Cherry pick of #57124 on release-1.9.

#57124: Use GlobalMemoryStatusEx to get total physical memory on
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/windows Categorizes an issue or PR as relevant to SIG Windows. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Windows Kubelet.exe fails to start on VMWare Fusion due to GetPhysicallyInstalledSystemMemory fails