Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix system container detection #26586

Merged

Conversation

derekwaynecarr
Copy link
Member

@derekwaynecarr derekwaynecarr commented May 31, 2016

Fix system container detection in kubelet on systemd.

This fixed environments where CPU and Memory Accounting were not enabled on the unit that launched the kubelet or docker from reporting the root cgroup when monitoring usage stats for those components.

Fixes #25909

/cc @vishh @ncdc @eparis @wojtek-t - this reverts the revert #26478

This should merge when node e2e setups were updated per #26289

@derekwaynecarr derekwaynecarr added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label May 31, 2016
@k8s-github-robot k8s-github-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label May 31, 2016
@derekwaynecarr derekwaynecarr added sig/node Categorizes an issue or PR as relevant to SIG Node. do-not-merge DEPRECATED. Indicates that a PR should not merge. Label can only be manually applied/removed. labels May 31, 2016
@derekwaynecarr
Copy link
Member Author

Marking do-not-merge until node e2e infra has been updated.

@vishh
Copy link
Contributor

vishh commented May 31, 2016

I'm updating the CoreOS image to include systemd cpu & memory configs.

@vishh vishh assigned vishh and unassigned yujuhong May 31, 2016
@derekwaynecarr
Copy link
Member Author

@vishh - thanks

@derekwaynecarr
Copy link
Member Author

@redhat-k8s-bot test this

@vishh
Copy link
Contributor

vishh commented May 31, 2016

#26592 updates Core OS images.

@redhat-k8s-bot
Copy link

RH build/test passed for commit 316960f

Build time: 1:55:43

@derekwaynecarr
Copy link
Member Author

@vishh - this is still failing on coreos. i am not sure if there was a necessary follow-on for #26592 , but the other odd thing is that node e2e reports success even if there is a failure on core os. is that expected @eparis ?

@eparis
Copy link
Contributor

eparis commented Jun 6, 2016

I do not think it is expected. @ixdy https://storage.cloud.google.com/kubernetes-jenkins/pr-logs/pull/26586/node-pull-build-e2e-test/9239/build-log.txt shows a failure, but the github status is success...

@ixdy
Copy link
Member

ixdy commented Jun 6, 2016

PR Jenkins for node e2e probably isn't looking at the junit files. @spxtr @pwittrock

@vishh
Copy link
Contributor

vishh commented Jun 6, 2016

@derekwaynecarr The CoreOS image has the required configs. I suspect that the kubelet cgroup detection logic has bugs, since the summary test is flaky.

@derekwaynecarr derekwaynecarr added this to the v1.3 milestone Jun 7, 2016
@derekwaynecarr derekwaynecarr removed the do-not-merge DEPRECATED. Indicates that a PR should not merge. Label can only be manually applied/removed. label Jun 8, 2016
@derekwaynecarr
Copy link
Member Author

@vishh - objections to merging this now that coreos is out of the testing pool?

@vishh vishh added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 8, 2016
@vishh
Copy link
Contributor

vishh commented Jun 8, 2016

I wish we had a systemd based distro to test against. AFAIK, we don't have any as of now to validate this PR.

@derekwaynecarr
Copy link
Member Author

@vishh #26586 (comment)

@vishh
Copy link
Contributor

vishh commented Jun 9, 2016

@derekwaynecarr Apologies. I did not realize that the RH test stack was running node e2e and not cluster e2e.

@derekwaynecarr
Copy link
Member Author

Hmm. Looked at the run in more detail and it did not actually run node
e2e. Will follow up with Jan in morning. Either way, I have tested this
manually :-)

On Wednesday, June 8, 2016, Vish Kannan notifications@github.com wrote:

@derekwaynecarr https://github.com/derekwaynecarr Apologies. I did not
realize that the RH test stack was running node e2e and not cluster e2e.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#26586 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/AF8dbImdZ7i_gZTNjVkLDzNmNNLwz5z1ks5qJ2fVgaJpZM4Iq516
.

@vishh
Copy link
Contributor

vishh commented Jun 9, 2016

The reason I ask is because I noticed race conditions on CoreOS and there
were no obvious bugs I could find in kubelet. That's why I said that having
a node with systemd in the GCE Node e2e suite will help.

On Wed, Jun 8, 2016 at 6:22 PM, Derek Carr notifications@github.com wrote:

Hmm. Looked at the run in more detail and it did not actually run node
e2e. Will follow up with Jan in morning. Either way, I have tested this
manually :-)

On Wednesday, June 8, 2016, Vish Kannan notifications@github.com wrote:

@derekwaynecarr https://github.com/derekwaynecarr Apologies. I did not
realize that the RH test stack was running node e2e and not cluster e2e.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<
#26586 (comment)
,
or mute the thread
<
https://github.com/notifications/unsubscribe/AF8dbImdZ7i_gZTNjVkLDzNmNNLwz5z1ks5qJ2fVgaJpZM4Iq516

.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#26586 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/AGvIKEoG9EkeKJBoQRcxnDAltP5WPVMAks5qJ2rqgaJpZM4Iq516
.

@derekwaynecarr
Copy link
Member Author

not sure why jenkins is unhappy all of a sudden, will rebase and retag.

@k8s-github-robot k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 10, 2016
@derekwaynecarr derekwaynecarr added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 10, 2016
@k8s-github-robot
Copy link

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-bot
Copy link

k8s-bot commented Jun 11, 2016

GCE e2e build/test passed for commit 08cdc0e.

@k8s-github-robot
Copy link

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit 55dbcee into kubernetes:master Jun 11, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/node Categorizes an issue or PR as relevant to SIG Node. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

kubelet container manager cgroup detection should error if cpu and memory differ
9 participants