Adding the ability to configure default capabilities #39297
base: master
Conversation
@burnMyDread congrats on your first PR to the Moby project 🎂 It looks like CI does not like your changes to
You should be able to test/debug it locally with the command: Also, please use your real name in the sign-off message. And last but not least, it would be nice if you could describe a real-world use case for this one, so maintainers can more easily judge whether this is the correct way to solve the issue.
@olljanat Thanks for the feedback. I will push a commit with an updated unit test on Monday and figure out what is going on with the signature. As for your request for background on why we want this change: we would like to set a local policy for the default capabilities of newly created Docker containers, as we want a more restrictive set of capabilities than the current default. More details on the request are here.
daemon/daemon_test.go
@@ -188,7 +188,7 @@ func TestContainerInitDNS(t *testing.T) {
hostConfig := `{"Binds":[],"ContainerIDFile":"","Memory":0,"MemorySwap":0,"CpuShares":0,"CpusetCpus":"",
"Privileged":false,"PortBindings":{},"Links":null,"PublishAllPorts":false,"Dns":null,"DnsOptions":null,"DnsSearch":null,"ExtraHosts":null,"VolumesFrom":null,
"Devices":[],"NetworkMode":"bridge","IpcMode":"","PidMode":"","CapAdd":null,"CapDrop":null,"RestartPolicy":{"Name":"no","MaximumRetryCount":0},
- "SecurityOpt":null,"ReadonlyRootfs":false,"Ulimits":null,"LogConfig":{"Type":"","Config":null},"CgroupParent":""}`
+ "SecurityOpt":null,"ReadonlyRootfs":false,"Ulimits":null,"LogConfig":{"Type":"","Config":null},"CgroupParent":"",Capabilities:["CAP_KILL"]}`
The fact that you had to change this test suggests that you changed the default config. You can't make that kind of change without it being backwards compatible, and you will need a full set of tests for this change. Making this sort of change is unfortunately difficult.
Before I respond in full, let me make sure I understand your concern. I think what you are saying is: changing the required default capabilities for everyone is not something we want to do.
If that is your concern, then I do not think it is an issue. If you look at https://github.com/moby/moby/pull/39297/files#diff-72dc1842ed7b5a92b4e98f84456a2c0cR154
you can see that we only set the capabilities if they are configured in the daemon config. If they are not set, we continue with the current default. (I reused the existing test, but I can make a copy that is specific to my case if you would like.)
Cheers.
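The fallback behavior being described can be sketched like this (a standalone sketch with hypothetical function and variable names, not the actual PR code): the daemon-configured list is used only when it is non-empty, otherwise the hard-coded defaults apply, so existing deployments see no behavior change.

```go
package main

import "fmt"

// Abbreviated stand-in for Moby's hard-coded default capability list.
var hardcodedDefaults = []string{"CAP_CHOWN", "CAP_KILL", "CAP_NET_RAW"}

// effectiveCapabilities is a hypothetical helper: it returns the
// daemon-configured defaults when set, and the built-in list otherwise.
func effectiveCapabilities(configured []string) []string {
	if len(configured) > 0 {
		return configured
	}
	return hardcodedDefaults
}

func main() {
	// No daemon config: falls back to the built-in defaults.
	fmt.Println(effectiveCapabilities(nil))
	// Daemon config present: the configured set wins.
	fmt.Println(effectiveCapabilities([]string{"CAP_KILL"}))
}
```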
Definitely do not change an existing test.
@justincormack Looks like the tests are working, other than the Windows RS1 tests. I moved my test case into its own test and left the DNS test as it is.
Looking at the Windows RS1 failure, it looks like all the runs today failed, and I see a PR up here, #39287, to fix what looks like the same issue.
@burnMyDread yes, RS1 is broken and can be ignored for now. You can see the latest status of the investigation in #39290 (comment)
Ok cool, thanks.
Derek add label: rebuild/windowsRS1
Derek add label: rebuild/windowsRS1
@burnMyDread only pre-defined people can send commands to Derek 😉
Codecov Report
@@ Coverage Diff @@
## master #39297 +/- ##
==========================================
+ Coverage 36.18% 37.05% +0.87%
==========================================
Files 610 611 +1
Lines 45363 45522 +159
==========================================
+ Hits 16413 16869 +456
+ Misses 26707 26371 -336
- Partials 2243 2282 +39
@justincormack Are you ok with the test I added to cover configuring the capabilities? (I left the DNS test alone in this version of the code.) @olljanat Are you happy with the commit message and the sign-off? (Also, do I need to do something to get the Codecov report to work correctly?)
@burnMyDread assuming that it is a valid email address, then yes 👍 Codecov has some weird issues which happen from time to time. You can ignore it.
@olljanat Ok, sounds good. (That address is an alias, but it is my real email account.)
@olljanat What can I do to help get this accepted?
LGTM, @thaJeztah @justincormack ping
@thaJeztah @justincormack Any feedback for me? Thanks.
We should have a proper look at this when taking Swarm services into account; see the discussion on #25885 (comment)
@thaJeztah Thanks for the link. I am a little unclear on what you need from me, but I wanted to clarify one thing. With my change, I only update the capabilities if
This needs documentation (at the very least, an update to the dockerd man page, which lives in https://github.com/docker/cli). Also, can you please clean up the commit a bit so it does not remove those empty lines?
@burnMyDread Looks like an error one might get from Docker for Mac, just because sometimes the FS sucks.
This is a Manjaro install, not a Mac.
I'll try an Ubuntu VM for my build. I suspect the Go version.
I have a test working for the new code for the default case. I can't figure out how to pass in the daemon config. Any hints?
Config: &containertypes.Config{},
HostConfig: &containertypes.HostConfig{},
}
d := setupFakeDaemon(t, c)
You should be able to add a 2nd case and set d.configStore.DefaultCapabilities directly.
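A second, table-driven case along those lines might look like this. This is a standalone sketch: the daemon, config, and capsForNewContainer names here are stand-ins for the real Moby daemon types, not the project's actual API.

```go
package main

import (
	"fmt"
	"reflect"
)

// Stand-ins for the real daemon types exercised in daemon_test.go.
type config struct{ DefaultCapabilities []string }
type daemon struct{ configStore *config }

var builtinCaps = []string{"CAP_CHOWN", "CAP_KILL", "CAP_NET_RAW"}

// capsForNewContainer mimics the behavior under test: configured
// defaults win, otherwise the built-in list is used.
func (d *daemon) capsForNewContainer() []string {
	if len(d.configStore.DefaultCapabilities) > 0 {
		return d.configStore.DefaultCapabilities
	}
	return builtinCaps
}

func main() {
	cases := []struct {
		name       string
		configured []string
		want       []string
	}{
		{"default config", nil, builtinCaps},
		{"custom defaults", []string{"CAP_KILL"}, []string{"CAP_KILL"}},
	}
	for _, tc := range cases {
		// Set DefaultCapabilities directly on the config store, as suggested.
		d := &daemon{configStore: &config{DefaultCapabilities: tc.configured}}
		if got := d.capsForNewContainer(); !reflect.DeepEqual(got, tc.want) {
			panic(fmt.Sprintf("%s: got %v, want %v", tc.name, got, tc.want))
		}
	}
	fmt.Println("all cases passed")
}
```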
Gotcha, thanks.
After some goofs, I got it working. For the other issue I was seeing: if I let the tests run to completion and then re-run them, I get the same failure.
@cpuguy83 You good with this?
Signed-off-by: Zachary <zachary.Joyner@linux.com>
How do I replicate the tests in docker_cli_run_unix? (I tried running the compiled daemon and running the same docker commands, and they didn't fail.)
I ran the following test case and it passed: In the cli tests I see that we are calling this code: moby/integration-cli/docker_utils_test.go Line 34 in d96f61c
Does cli.Docker(cli.Args(args...)) do anything other than run a docker command with the same flags?
Looking at the icmd docs, it looks like the answer to my question was yes.
Signed-off-by: Zachary <zachary.Joyner@linux.com>
Signed-off-by: Zachary <zachary.Joyner@linux.com>
retest this please.
I got some details, but it mostly raises more questions: moby/contrib/syscall-test/ns.c Line 54 in 7932d4a
That is the line of source code that is failing. (I have a copy of that docker container from running hack/make.sh.) This is what has me very puzzled:
Note the cap_sys_admin capability. That is running in the same container as ns-test, which is failing with
At this point I can change the tests to use capsh, but that sounds dangerous. Do you have any idea why the tooling is giving me contradictory information like this?
Signed-off-by: Zachary <zachary.Joyner@linux.com>
I may have the issue sorted.
I gave this a quick try, and there are some things to address. I left some comments above, but there are some more;

1. Changing defaults should not affect existing containers

This needs some thinking about what the best approach is. Currently, this pull request will modify capabilities on existing containers, which likely isn't desirable. Although this is already the case for the current "defaults", those defaults are hard-coded in the daemon, and would only change in case docker is updated and adds new capabilities (which would be somewhat more expected, and could be mentioned in changelogs). However, I think that custom defaults should only apply to new containers, be part of the container's configuration, and be immutable after that.

Here's what happens currently. Start the daemon:

dockerd

Start a container in the background:

docker run -dt --init --name test alpine

Install libcap and print the capabilities:

docker exec test sh -c 'apk add --quiet --no-cache libcap'
docker exec test sh -c 'capsh --print'
Current: = cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap+eip
Bounding set =cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap
Ambient set =
Securebits: 00/0x0/1'b0
secure-noroot: no (unlocked)
secure-no-suid-fixup: no (unlocked)
secure-keep-caps: no (unlocked)
secure-no-ambient-raise: no (unlocked)
uid=0(root)
gid=0(root)
groups=0(root),1(bin),2(daemon),3(sys),4(adm),6(disk),10(wheel),11(floppy),20(dialout),26(tape),27(video)

Now, stop the daemon, and start it again with custom default capabilities:

dockerd --default-capabilities CAP_KILL --default-capabilities CAP_CHOWN

Start the container again:

docker start test

Check capabilities in the container again, and notice that those were changed to the new defaults:

docker exec test sh -c 'capsh --print'
Current: = cap_chown,cap_kill+eip
Bounding set =cap_chown,cap_kill
Ambient set =
Securebits: 00/0x0/1'b0
secure-noroot: no (unlocked)
secure-no-suid-fixup: no (unlocked)
secure-keep-caps: no (unlocked)
secure-no-ambient-raise: no (unlocked)
uid=0(root)
gid=0(root)
groups=0(root),1(bin),2(daemon),3(sys),4(adm),6(disk),10(wheel),11(floppy),20(dialout),26(tape),27(video)

I don't think changing defaults should affect existing containers, as that could easily lead to unexpected behavior for containers that were previously running fine, and all of a sudden no longer function properly.
2. Missing validation

We should probably add validation on the configured defaults. The problem with this is that the daemon starts successfully, but once a container is started, it will fail;

dockerd --default-capabilities=NO_SUCH_CAPABILITY
docker run --rm hello-world
docker: Error response from daemon: OCI runtime create failed: container_linux.go:370:
starting container process caused: unknown capability "NO_SUCH_CAPABILITY": unknown.
ERRO[0002] error waiting for container: context canceled

There are existing validation functions for the
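Validation along those lines could reject unknown names at config-load time instead of at container start. A minimal sketch, assuming a simple allow-list (validateDefaultCapabilities and knownCaps are hypothetical names; the daemon has its own validation helpers for cap-add/cap-drop that could presumably be reused):

```go
package main

import (
	"fmt"
	"strings"
)

// Abbreviated allow-list; the real daemon derives this elsewhere.
var knownCaps = map[string]bool{
	"CAP_CHOWN": true, "CAP_KILL": true, "CAP_NET_RAW": true, "CAP_SYS_ADMIN": true,
}

// validateDefaultCapabilities fails fast at daemon startup, so a typo
// like NO_SUCH_CAPABILITY surfaces immediately rather than on the
// first `docker run`.
func validateDefaultCapabilities(caps []string) error {
	for _, c := range caps {
		if !knownCaps[strings.ToUpper(c)] {
			return fmt.Errorf("unknown capability %q in default-capabilities", c)
		}
	}
	return nil
}

func main() {
	// Valid set: no error.
	fmt.Println(validateDefaultCapabilities([]string{"CAP_KILL", "CAP_CHOWN"}))
	// Unknown name: rejected at startup.
	fmt.Println(validateDefaultCapabilities([]string{"NO_SUCH_CAPABILITY"}))
}
```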
3. daemon.json configuration not working correctly

The third thing I tried is to set default capabilities through the configuration file. Create it:

mkdir -p /etc/docker/
echo '{"default-capabilities":["CAP_KILL","CAP_CHOWN"]}' > /etc/docker/daemon.json

Start the daemon;

dockerd

Check capabilities in containers that are started:

docker run --rm alpine sh -c 'apk add --quiet --no-cache libcap && capsh --print'
Current: = cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap+eip
Bounding set =cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap
Ambient set =
Securebits: 00/0x0/1'b0
secure-noroot: no (unlocked)
secure-no-suid-fixup: no (unlocked)
secure-keep-caps: no (unlocked)
secure-no-ambient-raise: no (unlocked)
uid=0(root)
gid=0(root)
groups=0(root),1(bin),2(daemon),3(sys),4(adm),6(disk),10(wheel),11(floppy),20(dialout),26(tape),27(video)

Notice that the custom capabilities are not applied. However, trying to set the capabilities through the command line as well now fails:

dockerd --default-capabilities CAP_KILL --default-capabilities CAP_CHOWN
unable to configure the Docker daemon with file /etc/docker/daemon.json:
the following directives are specified both as a flag and in the configuration file:
default-capabilities: (from flag: [CAP_KILL CAP_CHOWN], from file: [CAP_KILL CAP_CHOWN])

I suspect some logic is missing to load the capabilities; possibly because the

If we agree on 1. ("Changing defaults should not affect existing containers"), we should also allow these capabilities to be "live-reloadable", so that they can be updated without having to restart the daemon.
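The "specified both as a flag and in the configuration file" error shows the file value reaches the conflict check but is apparently not propagated into the effective config. A sketch of the merge rule being described, under hypothetical names (the real logic lives in the daemon's config-loading code):

```go
package main

import "fmt"

// mergeDefaultCapabilities mimics the flag/file merge rule described
// above: setting the same directive in both places is an error;
// otherwise whichever source set it wins. The bug described above is
// a file-only value passing the conflict check but then being lost.
func mergeDefaultCapabilities(fromFlag, fromFile []string) ([]string, error) {
	switch {
	case len(fromFlag) > 0 && len(fromFile) > 0:
		return nil, fmt.Errorf(
			"default-capabilities: (from flag: %v, from file: %v)", fromFlag, fromFile)
	case len(fromFile) > 0:
		return fromFile, nil
	default:
		return fromFlag, nil
	}
}

func main() {
	// File-only value must survive the merge.
	caps, _ := mergeDefaultCapabilities(nil, []string{"CAP_KILL", "CAP_CHOWN"})
	fmt.Println(caps)
	// Both sources set: conflict error, as in the output above.
	_, err := mergeDefaultCapabilities([]string{"CAP_KILL"}, []string{"CAP_KILL"})
	fmt.Println(err != nil)
}
```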
Co-authored-by: Sebastiaan van Stijn <thaJeztah@users.noreply.github.com>
How did you see this getting implemented? Is it something where, after a container is created, we store a list of caps for that container? How would we go about checking the current caps for a container?

Our current workflow is that we tell our users to drop all caps and add what they need. What we would like to be able to do is set a known-safe set of default caps and have our users add the caps they need. Our high-level requirement is that we need to be able to tell what caps a container has without doing a docker exec (or equivalent) on the container(s), as we don't know in advance what is or is not installed in the container. Your alpine example is a good indication of why this is tricky.

For our use case, I am not as worried about containers that stop working due to capability changes, because we are already pretty aggressively changing the default caps as it is. I would also like to note that this is an opt-in change; the defaults stay the same.

A penultimate note is that this is really about cap_net_raw at the end of the day. If we didn't have that in the default set, then this functionality would be less of a priority. That being said, I get the argument that removing that cap would cause some compatibility issues.

In short, I don't have a problem with this approach in theory, but I want to make sure that I understand the behavior you are looking for in more depth so that I can ensure our goals line up. (P.S. I agree with your other two notes, but I want to get this sorted before I start coding.)
@thaJeztah Just to clarify: the behavior you are thinking of is that once a container 'gets' capabilities, it keeps them, via something like this struct: Line 61 in 837ee91
So that if a container is re-launched, it has the same capabilities every time it is launched. Also, do you want the first time we create the container to be when we set the capabilities for the container? If so, if we pull a container, do the caps come from the repo or the local daemon? (This makes a big difference for what we are trying to do.)
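The immutability idea from point 1 could be sketched like this: resolve the defaults once at container creation, persist them with the container's stored host config, and never overwrite them, so a daemon restart with different defaults leaves existing containers untouched. All names here (hostConfig, resolveOnce) are hypothetical stand-ins, not the real Moby types.

```go
package main

import "fmt"

type hostConfig struct {
	// Capabilities resolved at creation time; nil means "not yet resolved".
	Capabilities []string
}

// resolveOnce fills in the capabilities only on first creation;
// later daemon restarts must not overwrite the stored set.
func resolveOnce(hc *hostConfig, daemonDefaults []string) {
	if hc.Capabilities == nil {
		hc.Capabilities = append([]string(nil), daemonDefaults...)
	}
}

func main() {
	hc := &hostConfig{}
	// Container created while the daemon defaults were CAP_KILL,CAP_CHOWN.
	resolveOnce(hc, []string{"CAP_KILL", "CAP_CHOWN"})
	// Daemon restarted with different defaults: stored set is untouched.
	resolveOnce(hc, []string{"CAP_KILL"})
	fmt.Println(hc.Capabilities)
}
```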
Note: the DCO check is throwing an error. One concern I have here is that I don't think we should really support expanding the default capability set, only reducing it, as expanding is a major foot-gun with regard to builds... and of course BuildKit has its own default caps.
I am 100% onboard with only removing caps. I wanted to allow for the most flexibility in the approach.
On the subject of
So to that end, IMO adding extra caps to drop by default seems pretty reasonable, but I wonder if it's a temporary measure? What are the use cases beyond
Sorry for the delay; catching up with my notifications
and
Yes, it would likely have to be stored in the container's hostconfig that we store on disk. I must admit I haven't looked closely yet what the best approach would be. I saw that this PR progressed since the last time I looked, and wanted to verify what the behavior was for existing containers when changing defaults.
The caps should always be an option set at runtime (either by the daemon for defaults, or by the user). An image that's pulled from a registry cannot specify these (which would be a security risk, as then I could pull an image that gains additional privileges).
I know @justincormack has that one on his list; we definitely want to drop (Wondering if it would be acceptable to diverge from upstream, and to patch the official images to make those changes @tianon)?
and
I also agree with the above; only being able to drop capabilities that are currently enabled by default would have my preference, unless there are clear use cases where the opposite would be important. And, yes, if dropping
Let me clarify a bit here. We started this project because one of our internal analyses discovered cap_net_raw and told us we needed an audit and mitigation plan. One part of that was an audit approach that is outside this PR. The other was getting the ability to drop cap_net_raw by default upstream, which is this PR. I opened this PR before I was aware of any talk of removing cap_net_raw. The goal here was to have Docker's policy match our policy with respect to caps. If we don't have cap_net_raw, the only other cap of concern is cap_dac_override, for which we have mitigations in place.
The only other case we have is dropping cap_dac_override to mitigate the damage from a container breakout. That being said, we also require Mandatory Access Controls (MAC) to mitigate container breakout, so we can live with the MAC pulling double duty. I don't really see the value in a temporary fix if you are looking into more permanent fixes. So while, technically, we care about cap_dac_override as well, it is a very distant second to net_raw. So if you didn't see the value in a special flag for one or two caps, I can understand.
I think we are on the same page. I like the drop-only approach and get your point on not wanting temporary features. What is your gut feeling on how long it will take to drop net_raw? Are we talking early next year, or a year from now? If it is far out, how does this sound as a stopgap: I add a flag that sets the default caps to not include the two caps we talked about.
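A drop-only variant of the option, as discussed above, would compute the effective set by subtracting configured caps from the built-in defaults, so it can never expand them. A sketch under hypothetical names (dropFromDefaults, builtinDefaults):

```go
package main

import "fmt"

// Abbreviated built-in default set for the sketch.
var builtinDefaults = []string{"CAP_CHOWN", "CAP_DAC_OVERRIDE", "CAP_KILL", "CAP_NET_RAW"}

// dropFromDefaults removes the given caps from the built-in defaults.
// Because it only ever subtracts, a daemon option built on it cannot
// grant capabilities beyond the stock set — addressing the foot-gun
// concern about expanding defaults.
func dropFromDefaults(drop []string) []string {
	dropped := make(map[string]bool, len(drop))
	for _, c := range drop {
		dropped[c] = true
	}
	var out []string
	for _, c := range builtinDefaults {
		if !dropped[c] {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	// The two caps discussed above, dropped by policy.
	fmt.Println(dropFromDefaults([]string{"CAP_NET_RAW", "CAP_DAC_OVERRIDE"}))
}
```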
Signed-off-by: zach zachary@rocana.com
- What I did
I added the ability to set a default set of Linux capabilities to give to newly created containers.
- How I did it
I extended the work in 80d7bfd to add the ability to configure the Docker daemon with a more limited default set of capabilities.
- How to verify it
I added a new check in the tests, but if you configure the caps via the dockerd CLI or JSON and run:
capsh --print
you will see the caps for your container.
- Description for the changelog
Adding the ability to configure default capabilities.
- A picture of a cute animal (not mandatory but encouraged)