Add export local volume data route #48839
base: master
Conversation
Signed-off-by: MohammadHasan Akbari <jarqvi.jarqvi@gmail.com>
Thanks! I think this would address;
I recall there were still some discussions to be had around those; probably also in relation to #48798. I'll make sure this gets discussed in the next maintainers call.
While it works in the optimistic case, it's potentially very unsafe, especially for backup purposes.
The issue with this approach is that the archive creation is not atomic: the filesystem content can change during the operation.
Consider the following filesystem at the time archive.Tar is called:
/data
/data/a/...
/data/b/...
...
/data/z/...
filepath.WalkDir will walk the directory in lexicographical order, and each directory obviously needs some time to be processed.
This is fine as long as we're sure that the filesystem doesn't change, but what if there's a container running that moves /data/z/important-file into /data/a/important-file?
If the walk has already finished processing /data/a and the important-file was already moved, then by the time the walk starts processing /data/z the file will already be in the /data/a directory, meaning that it will be missing from the final archive.
This is unacceptable for users that would like to use it to backup a volume.
Unfortunately, with the local volume driver I don't think we can provide a solution for this, as we can't effectively snapshot the content of the volume, unless we want to docker pause all containers using this volume during the export.
Also, if such an "unsafe" solution is acceptable for the user, it's already possible with something like: docker run -it -v <volume>:/v alpine tar -c /v | ..., so I don't see a need to implement it on the engine side.
Well, can't we use rsync on the engine side?
Edit: Is pausing containers problematic?
Signed-off-by: MohammadHasan Akbari <jarqvi.jarqvi@gmail.com>
For volumes, this would likely mean "all containers that use the volume", in addition to preventing the volume from being used by new containers while the export is in progress. We may need to look as well at how the paths are traversed / walked; if this is not happening within the container's mount namespace, we must prevent any path outside of the volume from being accessible (we've had some fun situations with that on
Signed-off-by: MohammadHasan Akbari <jarqvi.jarqvi@gmail.com>
Signed-off-by: MohammadHasan Akbari <jarqvi.jarqvi@gmail.com>
I made some changes regarding this note.
I think the tar tool, when creating an archive, does not follow symlinks (symbolic links); instead, it stores the link itself rather than the file or directory it points to.
Implement a new route in the Docker API that enables users to export the contents of local volumes as a .tar archive. This feature offers a convenient way to generate backups of local volume data, making it easier to store, transfer, and restore volume contents efficiently.