-
Notifications
You must be signed in to change notification settings - Fork 18.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure daemon root is unmounted on shutdown #36107
Conversation
68da97b
to
2a45e4f
Compare
2a45e4f
to
6e01047
Compare
Looks like this test is flaky (or broken) on Windows;
|
daemon/daemon_linux.go
Outdated
// daemon root is not a mountpoint, nothing to clean up | ||
return nil | ||
} | ||
return mount.Unmount(daemon.root) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bit confused here; the function description mentions "umounts shm/mqueue mounts for old containers", but here we're unmounting /var/lib/docker
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the doc string.
6e01047
to
3701d43
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
af26f64
to
739cd67
Compare
Looks like this can be simplified to something like: - return daemon.cleanupMountsByID("")
+ if err := daemon.cleanupMountsByID(""); err != nil {
+ return err
+ }
+
+ logrus.WithField("mountpoint", daemon.root).Debug("unmounting daemon root")
+ return mount.Unmount(daemon.root) Reason1: keep it simple Or, in case you only want to unmount
Otherwise there are two different checks in two different places, and they might diverge resulting in a breakage. |
@kolyshkin Figuring out if the mount was from docker or not is not so simple as we could have an unclean shutdown at some point. |
@cpuguy83 Right. The current way, i.e. check that daemon.root is a mount point with a Maybe make a function, like |
@kolyshkin This isn't just checking if the mountpoint is shared, it's checking if it is a mountpoint at all and if the root of the mountpoint is itself. I can't really see a way to use the same function for both mount and unmount reliably. The one thing we could do is persist a value somewhere so we know that that we at least mounted it at some point, but then we still need to check everything to make sure it should be unmount. A simple |
ping @kolyshkin what's the status on this one? |
LGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 🐸
daemon/daemon_linux.go
Outdated
|
||
info := getMountInfo(infos, daemon.root) | ||
// `info.Root` here is the root mountpoint of the passed in path (`daemon.root`). | ||
// The ony cases that need to be claned up is when the daemon has performed a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: s/claned/cleaned/g
😛
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
739cd67
to
70b4f4e
Compare
Flaky test? Not sure I've seen this one before; https://jenkins.dockerproject.org/job/Docker-PRs-powerpc/8485/console
|
While testing this |
70b4f4e
to
32f4c20
Compare
@nishanttotla There's a stack trace in the daemon logs for the stuck daemon. |
ping @tonistiigi @kolyshkin this is green now; changes still LGTY? |
still LGTM |
I'm not sure this is a good way to detect if mount was created by 2 issues: Running In dind, |
@tonistiigi The first issue is a simple change, though there are other things that could happen to wind up in that state. When you say add a label, I'm not sure we can just add random labels to the bind mount? |
Ah, yes, I didn't think of that.
What do you mean by tracking a file here? Creating a file on mounting to detect it later? Why shouldn't it be safe to unmount if the detected file was only created by docker on mounting? |
We'd still have to make sure the mount was actually safe to remove since file itself may be out of sync with the system. |
@cpuguy83 How? if someone removes internal files from |
After step 4 if we only relied on the file existing we would unmount the user mount... |
Or not because if it's still our mount from before the crash we'd want to clean it up. |
I wonder if we have to take very scenario in mind here: what would be the "bad" thing of the mount being left behind after a crash? Restarting the daemon will reuse it, restarting the host will unmount it, correct? |
@thaJeztah The case here is that dockerd crashes, user manually unmounts, now for some reason user mounts manually, dockerd is started. In that case on stop dockerd would umount something manually mounted by user that is quite bad. It is a very unlikely scenario though. |
@tonistiigi oh, perhaps I was unclear: the "unmount" scenario can be covered by writing a file (we check that file on shutdown, and unmount if it's there, and remove the file on startup if it exists, and there's a mount). So, the "crash" situation would then only result in us not un-mounting, which would in most situations be no problem (given that we need that mount, and likely start the daemon again after a crash). |
@thaJeztah Ok, yes, if we don't attempt to resync the file after a crash then there is no extra unmount. But it means that we will always leak a mount if there is a crash. If we attempt to resync then we don't leak a mount 99.9% of the cases, even on crash, but there is a tiny possibility of false unmount. |
Yes, so my train of though is: daemon crashes, which is an exceptional condition. Either something is very wrong, and manual intervention is needed, or the daemon is started again, and everything recovers. During start, an existing mount is found:
If the host is rebooted (which could be part of recovering after a crash): situation is reset to "pristine" |
I think a false unmount is a worse scenario than a mount leak. |
So, looks like the “file” approach would give us that? |
ping @tonistiigi