Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Relabel BTRFS Content on container Creation #16452

Merged
merged 1 commit into from
Nov 17, 2015
Merged

Conversation

rhatdan
Copy link
Contributor

@rhatdan rhatdan commented Sep 21, 2015

This change will allow us to run SELinux in a container with
BTRFS back end. We continue to work on fixing the kernel/BTRFS
but this change will allow SELinux Security separation on BTRFS.

It basically relabels the content on container creation.

Just relabling -init directory in BTRFS use case. Everything looks like it
works. I don't believe tar/achive stores the SELinux labels, so we are good
as far as docker commit.

Tested Speed on startup with BTRFS on top of loopback directory. BTRFS
not on loopback should get even better perfomance on startup time. The
more inodes inside of the container image will increase the relabel time.

This patch will give people who care more about security the option of
runnin BTRFS with SELinux. Those who don't want to take the slow down
can disable SELinux either in individual containers or for all containers
by continuing to disable SELinux in the daemon.

Without relabel:

time docker run --security-opt label:disable fedora echo test
test

real 0m0.918s
user 0m0.009s
sys 0m0.026s

With Relabel

test

real 0m1.942s
user 0m0.007s
sys 0m0.030s

Signed-off-by: Dan Walsh dwalsh@redhat.com

@rhatdan
Copy link
Contributor Author

rhatdan commented Sep 28, 2015

@LK4D4
Copy link
Contributor

LK4D4 commented Sep 28, 2015

makes sense for me
ping @docker/core-maintainers

@calavera
Copy link
Contributor

Moved to code-review, unfortunately it needs a rebase.

Change LGTM.

@rhatdan
Copy link
Contributor Author

rhatdan commented Sep 29, 2015

@calavera Rebased.

@calavera
Copy link
Contributor

it doesn't look like this compiles any more. Check https://jenkins.dockerproject.org/job/Docker-PRs/16830/console

@rhatdan rhatdan force-pushed the btrfs-selinux branch 6 times, most recently from b8660ea to ad86722 Compare September 29, 2015 20:50
@rhatdan
Copy link
Contributor Author

rhatdan commented Sep 29, 2015

How do I fix the golint problems with Windows?

@calavera
Copy link
Contributor

I've opened #16666 to fix those issues.

@rhatdan
Copy link
Contributor Author

rhatdan commented Sep 30, 2015

Don't see how the Janky failures are related to my patch?

@calavera
Copy link
Contributor

@rhatdan janky runs go-lint in every changed file you commit. For some reason, it looks like graph/windows/windows.go never went through go-lint before and it failed when your changes trigger the lint. It should be fixed if you rebased master into this branch.

@rhatdan rhatdan force-pushed the btrfs-selinux branch 2 times, most recently from b4ac89b to 90a0b76 Compare October 1, 2015 22:04
@rhatdan
Copy link
Contributor Author

rhatdan commented Oct 11, 2015

rebased @calavera Any progress on this?

@rhatdan rhatdan force-pushed the btrfs-selinux branch 2 times, most recently from 92c7a0c to 9a44583 Compare November 11, 2015 19:11
@rhatdan
Copy link
Contributor Author

rhatdan commented Nov 11, 2015

I moved the security check out of setHostConfig so I can call it before createRootFS. Hopefully this fixes the test issues.

This change will allow us to run SELinux in a container with
BTRFS back end.  We continue to work on fixing the kernel/BTRFS
but this change will allow SELinux Security separation on BTRFS.

It basically relabels the content on container creation.

Just relabling -init directory in BTRFS use case. Everything looks like it
works. I don't believe tar/achive stores the SELinux labels, so we are good
as far as docker commit.

Tested Speed on startup with BTRFS on top of loopback directory. BTRFS
not on loopback should get even better perfomance on startup time.  The
more inodes inside of the container image will increase the relabel time.

This patch will give people who care more about security the option of
runnin BTRFS with SELinux.  Those who don't want to take the slow down
can disable SELinux either in individual containers or for all containers
by continuing to disable SELinux in the daemon.

Without relabel:

> time docker run --security-opt label:disable fedora echo test
test

real    0m0.918s
user    0m0.009s
sys    0m0.026s

With Relabel

test

real    0m1.942s
user    0m0.007s
sys    0m0.030s

Signed-off-by: Dan Walsh <dwalsh@redhat.com>

Signed-off-by: Dan Walsh <dwalsh@redhat.com>
@calavera
Copy link
Contributor

LGTM

1 similar comment
@LK4D4
Copy link
Contributor

LK4D4 commented Nov 17, 2015

LGTM

LK4D4 added a commit that referenced this pull request Nov 17, 2015
Relabel BTRFS Content on container Creation
@LK4D4 LK4D4 merged commit 4dda67b into moby:master Nov 17, 2015
@dmcgowan
Copy link
Member

Why were the labels added to the Create method instead of putting it on Get? The interface to Create was updated to accommodate this but the reason is not clear.

@rhatdan
Copy link
Contributor Author

rhatdan commented Nov 17, 2015

You need the label at create time in order to recursively set the labels. Get is too late. We need the same thing to get labels for OverlayFS if the kernel ever gets fixed.

@dmcgowan
Copy link
Member

The label is only added to one Create call, and the next line has a call to Get. https://github.com/docker/docker/pull/16452/files#diff-1a1f3e7ad9b1d7584e2d3e7d0c4c3db9R995

Considering this label is done at the very end of Create, I don't see anything based on the code changes that wouldn't allow it to be done at the beginning of Get.

@rhatdan
Copy link
Contributor Author

rhatdan commented Nov 17, 2015

Then every get call would be relabeling, we only want this happening once. Think docker create versus docker start.

@dmcgowan
Copy link
Member

I understand the difference, I am asking because this effects #17924 and we have to rebase and make interface updates as a result of this change. From my perspective, this is a btrfs specific change (possibly something similar for overlay in the future) which has now updated the interface to add mount labels on create, where previously create was not intended to mount. If relabeling must happen before first mount on btrfs then it is preferable to limit the change if that is possible. Except for on daemon.Mount, graphdriver.Get is normally called without a mount label, meaning this would only apply to start. From your previous comment, you are saying the main reason this must happen on create is that calling relabel for every start is either non-performant or is the incorrect behavior.

Somewhat separate question, why does this only apply to the init layer for the container and not any of the base layers or container writable layer?

dmcgowan added a commit to dmcgowan/docker that referenced this pull request Nov 17, 2015
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
dmcgowan added a commit to aaronlehmann/docker that referenced this pull request Nov 18, 2015
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
@rhatdan
Copy link
Contributor Author

rhatdan commented Nov 18, 2015

Yes relabeling is going to be time consuming. We time it out at about 1-2 seconds on a base image, but it could grow depending on the number of inodes in the image. Since we only need to do this on create, no need to add this penalty to all Starts.

By only applying it to the init layer. we end up being able to continue to share the image. IE Does not cause a COW change for every container. @rhvgoyal correct?

@rhvgoyal
Copy link
Contributor

@rhatdan right. If we apply label on container rootfs (and not init layer), then docker commit will make all the image files part of the COW layer and that's not what we want.

@thaJeztah thaJeztah added this to the 1.10.0 milestone Jan 15, 2016
euank added a commit to euank/coreos-overlay that referenced this pull request May 8, 2017
This script had two main functions:

1. Select the graphdriver

This functionality is now handled in the docker daemon. It defaults to
overlay2 on recent docker versions, and does its own fs detection for
btrfs etc.
We carry a patch for 1.12.6 now to prefer overlay to devicemapper

2. Avoid enabling selinux on btrfs

This no longer matters since as of v1.10, selinux on btrfs is supported.
See moby/moby#16452

It was replaced with a simpler systemd environment variable
euank added a commit to euank/coreos-overlay that referenced this pull request May 9, 2017
This script had two main functions:

1. Select the graphdriver
This functionality is now handled in the docker daemon. It defaults to
overlay2 on recent docker versions, and does its own fs detection for
btrfs etc.
We carry a patch for 1.12.6 now to prefer overlay to devicemapper

2. Avoid enabling selinux on btrfs
This no longer matters since as of v1.10, selinux on btrfs is supported.
See moby/moby#16452

This PR replaces that original functionality with a simpler systemd environment variable, which is also more in-line with what we do for other similar choices.

The environment variable is also more discoverable and easier for users to edit.
Note: for backwards compatibility with
DOCKER_OPTS=--selinux-enabled=false (to make that take precedent), we
intentionally put the environment variable as the first option.

However, for backwards compatibility with older units, we also retain
the script. We are able to remove the graphdriver detection/selection
since that behavior now happens appropriately in docker, but we need to
keep the selinux defaulting so that people who are executing the script
and expecting selinux to work (e.g.  if they copied an old
docker.service) will continue to get selinux as expected.
euank added a commit to euank/coreos-overlay that referenced this pull request May 9, 2017
This script had two main functions:

1. Select the graphdriver
This functionality is now handled in the docker daemon. It defaults to
overlay2 on recent docker versions, and does its own fs detection for
btrfs etc.
We carry a patch for 1.12.6 now to prefer overlay to devicemapper

2. Avoid enabling selinux on btrfs
This no longer matters since as of v1.10, selinux on btrfs is supported.
See moby/moby#16452

This PR replaces that original functionality with a simpler systemd environment variable, which is also more in-line with what we do for other similar choices.

The environment variable is also more discoverable and easier for users to edit.
Note: for backwards compatibility with
DOCKER_OPTS=--selinux-enabled=false (to make that take precedent), we
intentionally put the environment variable as the first option.

However, for backwards compatibility with older units, we also retain
the script. We are able to remove the graphdriver detection/selection
since that behavior now happens appropriately in docker, but we need to
keep the selinux defaulting so that people who are executing the script
and expecting selinux to work (e.g.  if they copied an old
docker.service) will continue to get selinux as expected.
ChrisMcKenzie pushed a commit to ChrisMcKenzie/coreos-overlay that referenced this pull request Dec 9, 2017
This script had two main functions:

1. Select the graphdriver
This functionality is now handled in the docker daemon. It defaults to
overlay2 on recent docker versions, and does its own fs detection for
btrfs etc.
We carry a patch for 1.12.6 now to prefer overlay to devicemapper

2. Avoid enabling selinux on btrfs
This no longer matters since as of v1.10, selinux on btrfs is supported.
See moby/moby#16452

This PR replaces that original functionality with a simpler systemd environment variable, which is also more in-line with what we do for other similar choices.

The environment variable is also more discoverable and easier for users to edit.
Note: for backwards compatibility with
DOCKER_OPTS=--selinux-enabled=false (to make that take precedent), we
intentionally put the environment variable as the first option.

However, for backwards compatibility with older units, we also retain
the script. We are able to remove the graphdriver detection/selection
since that behavior now happens appropriately in docker, but we need to
keep the selinux defaulting so that people who are executing the script
and expecting selinux to work (e.g.  if they copied an old
docker.service) will continue to get selinux as expected.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants