Add support for kube-up.sh to deploy Calico network policy to GCI masters #29037
```diff
@@ -520,10 +520,21 @@ function start-etcd-servers {
     rm -f /etc/init.d/etcd
   fi
   prepare-log-file /var/log/etcd.log
-  prepare-etcd-manifest "" "4001" "2380" "200m" "etcd.manifest"
+  prepare-etcd-manifest "" "4001" "2380" "150m" "etcd.manifest"
```
What is this change?
Thanks for taking a look. Those two changes reduce the CPU resources reserved for the existing etcd clusters to make room for Calico. There's virtually no spare CPU on the master (~40m IIRC) for running more pods, so I had to reduce some existing allocations. There are a few other options:

I'm open to any of these (or something else).
Anything I can do to help you review this, @zmerlynn?
Sorry, I was out when originally assigned. I don't think we want to unilaterally drop the CPU assignments on the master here - even if there's spare CPU in those containers right now, we need to make sure we leave room for growth in the future. We could possibly do this as a conditional based on the network provider.
Was there some systematic process for determining the CPU allocations in the first place? I'd be perfectly happy to go with one of the alternative approaches as well. Please let me know what you think.
Sorry to ping you all again @thockin, @mikedanese, @fabioy - I'd really like to get this closed down and in :). It seems like the main point of contention is the CPU resource allocations. To get things in for now, how about we go with option 2 above (remove CPU requests for the Calico components altogether)? Would this be acceptable?
```sh
function start-calico-policy-controller {
  local -r temp_file="/tmp/calico-policy-controller.manifest"
  cp "${KUBE_HOME}/kube-manifests/kubernetes/gci-trusty/calico-policy-controller.manifest" "${temp_file}"
  PUBLIC_IP=$(ifconfig eth0 | grep 'inet ' | awk '{print $2}')
```
When I run this locally I get "addr:172.17.128.66", but when I run it on GCI I get a bare IP. This feels fragile. How about `hostname -i`? That is fragile in a different way...
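For comparison, a quick sketch of the options being weighed here, assuming eth0 is the primary interface (none of these are bulletproof; which one to use is exactly the open question):

```sh
# Parsing ifconfig: the output format differs between environments
# ("inet addr:172.17.128.66" on some distros vs a bare "inet 172.17.128.66" on GCI).
PUBLIC_IP=$(ifconfig eth0 | grep 'inet ' | awk '{print $2}' | sed 's/^addr://')

# hostname -i: returns whatever the resolver maps the hostname to,
# which on some hosts is a loopback address such as 127.0.1.1.
PUBLIC_IP=$(hostname -i)

# ip(8): single-line, machine-readable output; strip the /prefix length.
PUBLIC_IP=$(ip -o -4 addr show dev eth0 | awk '{print $4}' | cut -d/ -f1)
```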
Regarding CPU allocations on the master, I don't think it was particularly scientific, but this does raise some fun problems. I can't actually guarantee that all combinations of addons will "fit" on the master. Can this run as a "regular" pod in the cluster? Running with very minimal CPU needs means it might starve. Is that OK? What is the failure mode? @mikedanese too, since this touches on turnup.
Thanks for the feedback @thockin :). I'll make the minor adjustments you suggest (other than the Jinja syntax question, which I think I've addressed in reply: shout if you disagree). Now onto the CPU allocations / node choice. Running calico-policy-controller on a regular node:

Overall it's quite difficult; it might be possible, but it's by no means a trivial move. It's a central cluster service, so I don't think it's too inappropriate to run on the master in that respect. As for failure modes, if calico-policy-controller receives no CPU time, the calico-nodes will keep everything as is until they can get new information from the Calico etcd. That is: existing pods will still be allowed to talk to the things they were already allowed to, and new pods won't be able to reach anything. In practice, since the master components all have requests rather than limits, I'm not even sure omitting requests for the calico-policy-controller would make any real difference.
- Service IP should work. For storage we have persistent volumes and ...
- We're going to have this problem in general with network ...
- When you run as a pod we provide the secrets you need. All you have ...
- I don't think it's INappropriate, but as you're experiencing, the ...
- @davidopp for ideas. What do we do when the master is full? Do we ...
- If you specify a request with no limit, limit == request. That is ...
```yaml
          value: "false"
        resources:
          requests:
            cpu: 40m
```
Is this the right number?
The optimal deployment would be a DaemonSet for the node agent that uses the Kubernetes API to store state. It's really the only way to reduce deployment of Calico across all Kubernetes clusters to a single command, and I think that's what we should strive for. Last time I talked with you guys, you mentioned that this was planned? How far off is that? kube-flannel does this, and it is merging into flannel proper as an alternative subnet manager. It removes the dependency on etcd or the flannel server.
use the apiserver :)
This primarily happens during initial deployment, which doesn't sound like too big of an issue.

You can use a service account. Credentials for the apiserver are injected into all pods by default. If you are already using the Go client (?) then it's a couple of lines of code to switch to InClusterConfig.
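For illustration only (not code from this PR): the injected service account credentials live at well-known paths inside every pod, so even without the Go client a container can reach the apiserver like this:

```sh
# Standard in-cluster credentials mounted into every pod by default.
TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
CACERT=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt

# The apiserver address is exposed to every pod via these env vars.
APISERVER="https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}"

# Example request using the injected service account - roughly what
# InClusterConfig wires up for the Go client.
curl --cacert "${CACERT}" -H "Authorization: Bearer ${TOKEN}" \
  "${APISERVER}/api/v1/namespaces/kube-system/pods"
```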
@thockin @mikedanese OK, so I had a go at moving the policy controller to regular nodes. There's one thing I haven't yet got working, where I wanted to get some input on what the right approach is.
With a delayed kubectl create, this commit does work, but it's fairly apparent that there are still unpleasant things here.
I don't really mind what approach is taken here. I do care about getting this fixed sooner rather than later - the GCI master change broke the NETWORK_POLICY_PROVIDER=calico path. Which of these approaches (first commit / second commit) should I go with? Using the Kubernetes API directly isn't possible any time soon.
Yes! This is something I've been looking into recently. Actually installing the Calico pieces using a DaemonSet is relatively straightforward. I've got a prototype working but it's still a little bit fragile. The problem, of course, is etcd.
We've looked into modifying Calico to use the k8s API as a replacement for etcd.
Yep, I agree. I'm not too worried about transient startup conditions.
Yep, the policy controller can already do this.
To add to my previous comment: We want to keep moving down this "one-click install" path, but I don't think this PR is the place to do it all. The GCI master PR broke the NETWORK_POLICY_PROVIDER=calico deployment. We'll definitely want to do subsequent changes to make this "self-hosted" as we make that option more stable.
+1 to Casey's comments. We should continue with the self-hosting one-click install work as a high priority (our number one priority after this PR), but I don't think we should delay getting this fix merged. If the main immediate concern is the extra load on the master, then we can just make sure the docs are super clear that enabling policy has a scale impact - and an at-scale deployment (e.g. 1000 hosts) should either use a larger instance for the master or follow the manual instructions in the Calico repo. I'm imagining we have the self-hosting one-click install ready to go in time for 1.4, so this is a short-term impact anyway.
(I'm going to make the addon-manager change; I'll post back when that's done.) |
```diff
@@ -0,0 +1,6 @@
+# Maintainers
```
Please use an OWNERS file like elsewhere in the repo.
I based this on the dashboard and node-problem-detector folders adjacent to this. I'm happy to change it: just pointing out that the closest equivalents currently look like this.
Which would you prefer?
Maintainers is fine.
I'm concerned that we are running into feature creep on this PR. It was originally intended as a bug fix: if you pass NETWORK_POLICY_PROVIDER in, kube-up doesn't work in 1.3. We want to get this bug fix into the next 1.3 patch release. But it seems we've started adding increments towards the future vision of self-hosting into the PR, which is delaying getting the fix into 1.3. We are already working on self-hosting and it is our highest-priority task once this PR gets merged, but I don't think we should delay getting this PR merged on any of that work.
```diff
@@ -878,6 +878,9 @@ function start-kube-addons {
   if echo "${ADMISSION_CONTROL:-}" | grep -q "LimitRanger"; then
     setup-addon-manifests "admission-controls" "limit-range"
   fi
+  if [[ "${NETWORK_POLICY_PROVIDER}" == "calico" ]]; then
```
Please add a `:-` to the variable expansion like the other checks.
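A minimal sketch of the requested change (the setup-addon-manifests arguments below are illustrative, not the exact ones in this PR):

```sh
# With the :- default, an unset NETWORK_POLICY_PROVIDER expands to "" instead of
# tripping `set -o nounset`, mirroring the ADMISSION_CONTROL check above.
if [[ "${NETWORK_POLICY_PROVIDER:-}" == "calico" ]]; then
  setup-addon-manifests "addons" "calico-policy-controller"  # illustrative arguments
fi
```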
Force-pushed 0c01d25 to 074aacd.
```yaml
  name: calico-etcd
  namespace: kube-system
spec:
  clusterIP: 10.0.0.17
```
Sorry for not catching this. We parametrize the same field in the kube-dns service because it needs to fall into the service subnet. This may not work if people are not using the default service subnet. I think we can defer the fix to a follow-up, though.
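A rough sketch of the kind of parametrization kube-dns does, under the assumption that the IP would be derived from the cluster's service range; the CALICO_ETCD_IP variable, the __CALICO_ETCD_IP__ placeholder, and the manifest filename are hypothetical names for illustration:

```sh
# Hypothetical follow-up: compute the calico-etcd service IP from the configured
# service range instead of hard-coding 10.0.0.17, then substitute it into the
# manifest template before it is copied into place.
CALICO_ETCD_IP="${CALICO_ETCD_IP:-10.0.0.17}"   # would be derived from SERVICE_CLUSTER_IP_RANGE
sed -i -e "s@__CALICO_ETCD_IP__@${CALICO_ETCD_IP}@g" \
  "${KUBE_HOME}/kube-manifests/kubernetes/gci-trusty/calico-etcd-service.yaml"
```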
Thanks, LGTM here. The last thing I'm unsure of is @caseydavenport's review comment about the resource request. Can you address it?
Force-pushed 074aacd to cdec20e.
Thanks @mikedanese - please hold off for a little longer: I'm tracking down an issue with API connectivity. I spoke with @caseydavenport - we're going to remove the request for the calico-nodes.
Force-pushed cdec20e to f3e3fd4.
OK @mikedanese, this is ready to go from my end. We'll continue to enhance the integration (un-static-podding the calico-nodes is fairly close), but this should be enough to cover the issue. That's an enhancement, so we'll do a separate PR. Let me know if you have any further comments; thanks :).
Looking good to me!
Force-pushed f3e3fd4 to 568fb74.
GCE e2e build/test passed for commit 568fb74.
@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]
GCE e2e build/test passed for commit 568fb74.
Automatic merge from submit-queue
Thanks @mikedanese :).
Also removes the requirement for calicoctl on Debian / Salt-installed nodes and cleans things up a little by deploying calico-node with a manifest rather than via calicoctl. This also makes it more reliable by retrying properly.
How to use:
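A minimal sketch of the intended invocation, assuming the default GCE provider (NETWORK_POLICY_PROVIDER is the variable this PR wires through; other values are out of scope here):

```sh
# Bring up a GCE cluster with the Calico network policy provider enabled.
NETWORK_POLICY_PROVIDER=calico KUBERNETES_PROVIDER=gce ./cluster/kube-up.sh
```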
One place where I was uncertain: