-
Notifications
You must be signed in to change notification settings - Fork 40k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make /var/lib/kubelet as shared during startup #45724
Make /var/lib/kubelet as shared during startup #45724
Conversation
40bcd14
to
19b6d67
Compare
cmd/kubelet/app/server.go
Outdated
@@ -861,6 +861,12 @@ func RunKubelet(kubeFlags *options.KubeletFlags, kubeCfg *componentconfig.Kubele | |||
} | |||
} | |||
|
|||
// set up /var/lib/kubelet sharing | |||
err = kubeDeps.Mounter.MakeShared(kubeCfg.RootDirectory) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope RunKubelet is the right place to do this startup initialization. Feel free to suggest a better one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Answering to myself, setupDataDirs
looks like a better place.
defadd4
to
27389e0
Compare
tested with Surprisingly, something already bind-mounts /var/lib/kubelet as shared on GCI even though it's not necessary, / is already shared. |
27389e0
to
cc20410
Compare
cc20410
to
bf2a9d0
Compare
@k8s-bot pull-kubernetes-federation-e2e-gce test this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few nits.
pkg/util/mount/mount.go
Outdated
@@ -55,6 +55,9 @@ type Interface interface { | |||
// GetDeviceNameFromMount finds the device name by checking the mount path | |||
// to get the global mount path which matches its plugin directory | |||
GetDeviceNameFromMount(mountPath, pluginDir string) (string, error) | |||
// MakeShared checks that given path is on a mount with 'shared' mount | |||
// propagation. If not, it bind-mounts the path as shared. | |||
MakeShared(path string) error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Call it MakeRShared
to reflect the fact that it shares recursively?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed
pkg/util/mount/mount_linux.go
Outdated
if err != nil { | ||
return []mountInfo{}, err | ||
} | ||
for { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wish we could make this a generic utility for reading many kernel APIs that return data more than a page in size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done (with some refactoring of listProcMounts
to use the same function).
} | ||
|
||
glog.V(2).Infof("Bind-mounting %q with shared mount propagation", path) | ||
// mount --bind /var/lib/kubelet /var/lib/kubelet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use the syscall
package? Why exec?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
huh? syscall.Exec
will replace /usr/bin/kulebet
in current process with /bin/mount
and it won't ever return. We need fork()
+exec()
and that's exactly what os.Exec
does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Reworked to use syscall.Mount
.
pkg/util/mount/mount_linux.go
Outdated
// doMakeShared is common implementation of MakeShared on Linux. It checks if | ||
// path is shared and bind-mounts it as shared if needed. mountCmd and mountArgs | ||
// are expected to contain mount-like command, doMakeShared will add '--bind | ||
// <path> <path>' and '--make-shared <path>' to mountArgs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: make-rshared
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
func TestIsSharedSuccess(t *testing.T) { | ||
successMountInfo := | ||
`62 0 253:0 / / rw,relatime shared:1 - ext4 /dev/mapper/ssd-root rw,seclabel,data=ordered | ||
76 62 8:1 / /boot rw,relatime shared:29 - ext4 /dev/sda1 rw,seclabel,data=ordered |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a test case with /var/lib/kubelet
explicitly marked as shared
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added a test with /var/lib/foo
that is a mount point and is shared.
@vishh, thanks for the review, I'll update the PR tomorrow. I played with this PR and with #41683 to make kubelet mount stuff from containers and it looks like /var/lib/kubelet is not enough and https://github.com/kubernetes/kubernetes/pulls/jsafrane is still needed on machines without systemd. See kubernetes/community#648 for details. |
/unassign |
Doesn't seem like this would work for nsenter mounter i.e. containerized kubelet. The container itself will have already bind mounted |
pkg/util/mount/mount_linux.go
Outdated
} | ||
|
||
type mountInfo struct { | ||
root, mountPoint string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not seeing where root
is ever used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
I'm also not seeing the code for "if / is already rshared, do nothing". Really the rule should be:
If you get to the end without a remount, it means the /var/lib/kubelet is under / and is reshared already. You only need to remount if /var/lib/kubelet or one of its parents directories is 1) is a mountpoint and 2) is private. I think 😄 |
/retest |
} | ||
|
||
// mount --make-rshared /var/lib/kubelet | ||
if err := syscall.Mount(path, path, "" /*fstype*/, syscall.MS_SHARED|syscall.MS_REC, "" /*data*/); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is a remount flag (MS_REMOUNT) required here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, remount is not required. mount --make-rshared
uses only the fags above as implied by the man page and checked with strace.
I'd like to see a node e2e test that verifies this feature. This may prevent regressions. e2es can land even after code freeze I think. |
I have e2e test for whole propagation in my queue, waiting for #46444 to merge. And if it runs on wheezy we can be quite sure that this PR works as expected. |
/lgtm |
/assign @smarterclayton @thockin |
/retest |
/approve |
/approve no-issue |
/approve no-issue |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jsafrane, sjenning, smarterclayton, vishh Associated issue requirement bypassed by: smarterclayton The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
/retest Review the full test history for this PR. |
1 similar comment
/retest Review the full test history for this PR. |
/test all [submit-queue is verifying that this PR is safe to merge] |
Automatic merge from submit-queue |
Automatic merge from submit-queue (batch tested with PRs 45724, 48051, 46444, 51056, 51605) Mount propagation in kubelet Together with #45724 it implements mount propagation as proposed in kubernetes/community#659 There is: - New alpha annotation that allows user to explicitly set propagation mode for each `VolumeMount` in pod containers (to be replaced with real `VolumeMount.Propagation` field during beta) + validation + tests. "Private" is the default one (= no change to existing pods). I know about proposal for real API fields for alpha feature in https://docs.google.com/document/d/1wuoSqHkeT51mQQ7dIFhUKrdi3-1wbKrNWeIL4cKb9zU/edit, but it seems it's not implemented yet. It would save me quite lot of code and ugly annotation. - Updated CRI API to transport chosen propagation to Docker. - New `kubelet --experimental-mount-propagation` option to enable the previous bullet without modifying types.go (worked around with changing `KubeletDeps`... not nice, but it's better than adding a parameter to `NewMainKubelet` and removing it in the next release...) ```release-note kubelet has alpha support for mount propagation. It is disabled by default and it is there for testing only. This feature may be redesigned or even removed in a future release. ``` @derekwaynecarr @dchen1107 @kubernetes/sig-node-pr-reviews
@jsafrane: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
…lement mount.Interface again. This was broken in #45724
This is part of
kubernetes/community#589kubernetes/community#659We'd like kubelet to be able to consume mounts from containers in the future, therefore kubelet should make sure that
/var/lib/kubelet
has shared mount propagation to be able to see these mounts.On most distros, root directory is already mounted with shared mount propagation and this code will not do anything. On older distros such as Debian Wheezy, this code detects that
/var/lib/kubelet
is a directory on/
which has private mount propagation and kubelet bind-mounts/var/lib/kubelet
as rshared.Both "regular" linux mounter and
NsenterMounter
are updated here.@kubernetes/sig-storage-pr-reviews @kubernetes/sig-node-pr-reviews
@vishh
Release note: