-
Notifications
You must be signed in to change notification settings - Fork 40.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
use init container load kernel modules in kube-proxy #56732
Conversation
@Lion-Wei: GitHub didn't allow me to request PR reviews from the following users: vfreex. Note that only kubernetes members can review this PR, and authors cannot review their own PRs. In response to this: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/ok-to-test |
Shouldn't this be loaded only if the IPVS proxier is enabled? |
Thanks for replaying, @bowei echo "{{params}}" | grep "\-\-proxy-mode=ipvs" && modprobe ...
cat /var/lib/kube-proxy/config.conf | grep "mode: ipvs" && modprobe ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
few comments
@@ -175,6 +175,20 @@ spec: | |||
- mountPath: /run/xtables.lock | |||
name: xtables-lock | |||
readOnly: false | |||
initContainers: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this spacing align with the containers line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No....
Thanks for pointing out.
command: | ||
- /bin/sh | ||
- -c | ||
- cat /var/lib/kube-proxy/config.conf | grep "mode: ipvs" && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of parsing the file on disk, can we pull in the component configmap ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using configmap
to config conf will be better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for reviewing. I tried to use configmap. Kubeadm created kube-proxy configmap with a struct:
apiVersion: v1
data:
config.conf: |-
apiVersion: kubeproxy.config.k8s.io/v1alpha1
...
mode: ipvs
...
In which config.conf
is a key with a bug value. I can use this key get all config.conf
value and see it include ipvs: mode
or not. But can't get mode
field. Is there any other way of using configmap?
- -c | ||
- cat /var/lib/kube-proxy/config.conf | grep "mode: ipvs" && | ||
modprobe -a ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack_ipv4 || | ||
echo "didn't load ipvs related kernel modules" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use an if statement and use an explicit success/failure return code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought this can be simpler. And I worried if the command cat /var/lib/kube-proxy/config.conf | grep "mode: ipvs"
return non zero, the init container will exist with error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's the original logic. If can't use ipvs mode, give a warning and slightly fallback to iptables to make sure kube-proxy still can work.
We can talk about it, to chose a more reasonable logic.
@m1093782566
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If failed to load kernel modules, IPVS proxier can't work. In order to make kube-proxy work, we let kube-proxy fall back to iptables or userspace mode... That's NOT a silent action, kube-proxy will log the reason.
Any problems?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cite a case, if user specify iptables mode but kernel is not compatible or iptables version too low, kube-proxy will fall back to userspace mode.
readOnly: true | ||
initContainers: | ||
- name: init-kube-proxy | ||
image: gcr.io/google_containers/busybox |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use a versioned image. Probably gcr.io/google_containers/busybox:1.27.2
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, fixed.
@@ -57,6 +57,19 @@ spec: | |||
- mountPath: /run/xtables.lock | |||
name: xtables-lock | |||
readOnly: false | |||
readOnly: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duplicate readOnly field?
pkg/proxy/ipvs/proxier.go
Outdated
@@ -693,15 +693,6 @@ func (em proxyEndpointsMap) unmerge(other proxyEndpointsMap) { | |||
// return an error if it fails to get the kernel modules information without error, in which | |||
// case it will also return false. | |||
func CanUseIPVSProxier(ipsetver IPSetVersioner) (bool, error) { | |||
// Try to load IPVS required kernel modules using modprobe |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why delete these lines? kube-proxy is not always running inside container. If kube-proxy running on host, it can modprobe kernel modules. Besides, if kube-proxy is running inside container and failed to load kernel modules, it just log warning messages - please double check it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I figured out if kube-proxy run inside the container, it will certainly fail.
But maybe we can accept an warning message. I will take it back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's why we just log warning message instead of existing with error.
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: Lion-Wei Assign the PR to them by writing Associated issue: #52003 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
@@ -698,7 +698,7 @@ func CanUseIPVSProxier(ipsetver IPSetVersioner) (bool, error) { | |||
err := utilexec.New().Command("modprobe", "--", kmod).Run() | |||
if err != nil { | |||
glog.Warningf("Failed to load kernel module %v with modprobe. "+ | |||
"You can ignore this message when kube-proxy is running inside container without mounting /lib/modules", kmod) | |||
"You can ignore this message when kube-proxy is running inside container", kmod) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does the warning message mean here? Does it mean that I do not need to load those kernel modules? But this PR is using initContainer to load those modules, seems a bit weird to me, can you help elaborate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does the warning message mean here? Does it mean that I do not need to load those kernel modules?
ip_vs_* kernel modules is needed. But, it's unnecessary that we should load it via modprobe
by kube-proxy. For example, kube-proxy run inside container and ip_vs_* kernel modules are already installed in host.
But this PR is using initContainer to load those modules, seems a bit weird to me, can you help elaborate?
If kube-proxy is always installed by the scripts/manifests which this PR has covered, we can change the warning to error, but actually it's NOT.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for replaying, we have this warning when modprobe
command return err, if we use init container then we don't need load kernel modules inside kube-proxy container, so can ignore this message. The follow code will check if kernel loaded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Lion-Wei and @m1093782566 , still have some question: If the modprobe
return error, then it should means that some modules are not loaded, I think that we should return error for such case, as the ipvs may not work well if some modules are not loaded even running in container?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the modprobe return error, then it should means that some modules are not loaded
NO.
Suppose you run kube-proxy in a container. When you execute modprobe ip_vs
command, you will get an error since there is no kernel module in container by default. However, you will find some kernel modules in /proc/modules
since there are some kernel modules in host.
That's why we do 2 steps:
1 modprobe - log warning if error occur
2 "cut", "-f1", "-d", " ", "/proc/modules" - return error
We should consider both container and non-container cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @m1093782566
So if the host has kernel modules in /proc/modules
, then we do not need to load those modules in container any more if the container is using security mode?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, as the initContainer already done the work, how about add more info claim that the initContainer already helped to do the modprobe work
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no? kube-proxy process should not be aware of init container? In other words, process should not care how it's deployed - deployed with init container or anything else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so, why ignore this message when proxy is running in container? How to handle the case if there is no init container and kube proxy do not mount /lib/modules either?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so, why ignore this message when proxy is running in container?
Because users can deploy kube-proxy in anyway they like, they may don't use init container.
How to handle the case if there is no init container and kube proxy do not mount /lib/modules either?
If kube-proxy is running in container and didn't mount /lib/modules from host, it still has a chance to run IPVS mode.
- /bin/sh | ||
- -c | ||
- 'echo $(PROXY_CONFIG) | grep "mode: ipvs" && modprobe -a | ||
ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack_ipv4 || echo "did not load |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More comments, do we really need this init container? As the kube-proxy already modprobe the required modules at here https://github.com/kubernetes/kubernetes/blob/release-1.9/pkg/proxy/ipvs/proxier.go#L697-L703 for 1.9?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we discussed in #52003, Folks believed a better solution is to load kernel modules in init container instead as that has less security concerns, but use this as an alternative solution in first to unblock the IPVS beta work in rebase-1.9.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Lion-Wei
- Can you please clarify more for the security risk? and how does initContainer can help this?
- Why
modprobe
is needed in initContainer? I see that the kube-proxy can load those modules dynamically.
/assign @thockin . : ) |
This adds many lines of configuration and a duplicate shell script, all of this code can diverge too easily and very few users would notice. If we want to improve security, I'd say we could ask IPVS users to ensure these modules are present on the nodes (which is easy with |
@Lion-Wei PR needs rebase |
Wouldn't it make more sense if we split the manifest per mode? I would imagine different kube-proxy implementations will diverge. If every mode uses the same manifest, it will become more and more clunky and harder to maintain. Probably structure it like this ??
Another long term goal would be to split the kube-proxy binary and transform the common components into a library or interface. |
@errordeveloper Sorry for the delay. But I don't think tell user to load those kernel modules is a good idea. We can do this work in script like |
@freehan Definitely make sense. IPVS and iptables may have lots different features and different option in the future. |
If we really are looking for best user experience, I'd be looking forward to any ideas where we would introduce an API change/extensions that allows pods to somehow specify modules it depends on. |
@errordeveloper Yeah, I understand your concern, indeed, use busybox and shell script to load kernel modules is very primaeval and unsightly, I'll try to make it more graceful. Thanks for your recommendation. |
Thanks!
…On Wed, 24 Jan 2018, 12:46 pm Lion-Wei, ***@***.***> wrote:
@errordeveloper <https://github.com/errordeveloper> Yeah, I understand
your concern, indeed, use busybox and shell script to load kernel modules
is very primaeval and unsightly, I'll try to make it more graceful. Thanks
for your recommendation. : )
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#56732 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAPWS9bS-tENBt7btTZpDBQPznVnkHIUks5tNyYMgaJpZM4QzH55>
.
|
@Lion-Wei This needs to be resolved before ipvs goes GA. I think the general sentiment is that putting the modprobe logic in the init container is not ideal. I think the approach that will require the least work from the user's perspective is to bake the logic of loading the relevant modules into the startup scripts. This looks like it would involve plumbing the KUBEPROXY_MODE environment variable around everywhere. I can work with you on this to make sure that all cases are covered. |
@rramkumar1 Cool!! |
I suspect if it's doable because ipvs will fall back to iptables and iptables will fall back to userspace.
Interesting, but I also worry about the proxy mode fall back path. |
Do you mean we don't need to load kernel modules in init container, instead, we should load kernel modules in deployment tools/scripts. @Lion-Wei has already done it for local-up.sh. Should we continue do this for kubeadm, kubernetes-anywhere, gce scripts...? |
Please check context around #52003 for more information, @gyliu513 Following are some thoughts off the top of my head:
|
@m1093782566 Yes, startup scripts. I think it would be great if you guys could open an issue to track this with a checkbox for each piece you need to cover (kubeadm, gce, kubernetes-anywhere). That way, we can easily track which pieces are complete. |
What's your opinion? |
Summarize what we discussed before, to be concise, I think there are three concerns about this pr:
I understand use busybox and shell script to load kernel modules is very primaeval and unsightly, but I do think use init container load kernel modules is a better idea then bake the logic of loading the relevant modules into the startup scripts. But if we decide that's what we should do, I can accept. |
Do not forget we are NOT always running kube-proxy in container. init container may don't work in such case. |
@Lion-Wei can we close this? |
@rramkumar1 . I'd like leave this open until our work finished, for reference or something else. : ) |
/close |
I think this can be closed. |
What this PR does / why we need it:
kube-proxy ipvs mode need load kernel modules
ip_vs
/ip_vs_*
. Now we do this in kube-proxy container, but have some safety concern. Had some discussion in this pr: #52003 . Mostly think use init container load kernel modules is a good idea.This pr to use init container load kernel modules instead of load those modules inside kube-proxy container, so that we can lower the kube-proxy container permission.
Which issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close the issue(s) when PR gets merged):Fixes #54497
Special notes for your reviewer:
Release note: