Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update libcontainer to include PRs with fixes to systemd cgroup driver #61926

Merged
merged 1 commit into from
Apr 25, 2018

Conversation

filbranden
Copy link
Contributor

@filbranden filbranden commented Mar 30, 2018

What this PR does / why we need it:

PR opencontainers/runc#1754 works around an issue in manager.Apply(-1) that makes Kubelet startup hang when using systemd cgroup driver (by adding a timeout) and further PR opencontainers/runc#1772 fixes that bug by checking the proper error status before waiting on the channel.

PR opencontainers/runc#1776 checks whether Delegate works in slices, which keeps libcontainer systemd cgroup driver working on systemd v237+.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #61474

Special notes for your reviewer:
/assign @derekwaynecarr
cc @vikaschoudhary16 @sjenning @adelton @mrunalp

Release note:

Fixes a hang on kubelet startup when using systemd cgroup driver of libcontainer.

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 30, 2018
@filbranden
Copy link
Contributor Author

/assign @sttts for OWNERS.

@filbranden filbranden changed the title Update libcontainer to include PR opencontainer/runc#1754 Update libcontainer to include PR opencontainers/runc#1754 Mar 30, 2018
@yujuhong
Copy link
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Mar 30, 2018
@filbranden
Copy link
Contributor Author

/assign @sttts
for OWNERS

@filbranden
Copy link
Contributor Author

I'll probably want to include opencontainers/runc#1772 as well which has a better fix for this...

Maybe we can hold this PR until that one is pushed upstream too.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 4, 2018
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 12, 2018
@filbranden filbranden changed the title Update libcontainer to include PR opencontainers/runc#1754 Update libcontainer to include PRs with fixes to systemd cgroup driver Apr 12, 2018
@filbranden
Copy link
Contributor Author

@derekwaynecarr Please take another look, this includes all the fixes we pushed to libcontainer. (You might want to retest #61474 yet one more time, but I imagine everything should be fine.)

We need approval from dep-approvers, so I'll assign the three of them here...

/assign @cblecker
/assign @thockin
/assign @sttts

Cheers!
Filipe

@filbranden filbranden force-pushed the cgroupdriver10 branch 2 times, most recently from 758cff2 to 32098f7 Compare April 12, 2018 22:19
@cblecker
Copy link
Member

I can approve the dependency bump, but it should be reviewed by sig node first (@derekwaynecarr) and tests should pass.

@filbranden
Copy link
Contributor Author

/test pull-kubernetes-e2e-gce

@filbranden
Copy link
Contributor Author

Tests are all good... @derekwaynecarr you're next 😃

@dims
Copy link
Member

dims commented Apr 13, 2018

LGTM 👍

@filbranden
Copy link
Contributor Author

@vikaschoudhary16 @sjenning @mrunalp Can you please review opencontainers/runc#1781 ?

If that looks good, then I'd like to include it here and then finally merge this into Kubernetes code base...

Thanks!
Filipe

PR opencontainers/runc#1754 works around an issue in manager.Apply(-1) that
makes Kubelet startup hang when using systemd cgroup driver (by adding a
timeout) and further PR opencontainers/runc#1772 fixes that bug by
checking the proper error status before waiting on the channel.

PR opencontainers/runc#1776 checks whether Delegate works in slices,
which keeps libcontainer systemd cgroup driver working on systemd v237+.

PR opencontainers/runc#1781 makes the channel buffered, so if we time
out waiting on the channel, the updater will not block trying to it
since there are no longer any consumers.
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 24, 2018
@filbranden
Copy link
Contributor Author

Ok I updated this PR with all the fixes.

Considering the current state is "broken" (as far as I can tell, Kubernetes upstream doesn't work at all on a system using systemd cgroup driver), I think we should really go ahead and submit this.

If you want to take a short while to try to reproduce the issue (particularly the one when the systemd response to dbus times out) please go ahead, but let's try to figure this one quickly and merge it before new conflicts arise please 😄

Cheers,
Filipe

@filbranden
Copy link
Contributor Author

/retest

@derekwaynecarr
Copy link
Member

/approve
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 24, 2018
@filbranden
Copy link
Contributor Author

/assign @cblecker
for approval...

@cblecker
Copy link
Member

Existing dep bump, with no license changes.
/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cblecker, derekwaynecarr, filbranden

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 24, 2018
@cblecker
Copy link
Member

/hold
There's an issue with deps on master.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 25, 2018
@adelton
Copy link
Contributor

adelton commented Apr 25, 2018

Just a quick note (mostly to self), based on #61474 (comment), Fedora 28 and rawhide now have patched systemd from https://bugzilla.redhat.com/show_bug.cgi?id=1568594 which allows the cluster to start without this patch to Kubernetes.

@cblecker
Copy link
Member

/retest
/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 25, 2018
@k8s-github-robot
Copy link

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot
Copy link

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit 667dd71 into kubernetes:master Apr 25, 2018
@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels May 24, 2018
k8s-github-robot pushed a commit that referenced this pull request May 25, 2018
…1926-upstream-release-1.10

Automatic merge from submit-queue.

Automated cherry pick of #61926: Update libcontainer to include PRs with fixes to systemd

Cherry pick of #61926 on release-1.10.

#61926: Update libcontainer to include PRs with fixes to systemd
@filbranden filbranden deleted the cgroupdriver10 branch May 30, 2018 21:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

./hack/local-up-cluster.sh fails on Fedora rawhide