-
Notifications
You must be signed in to change notification settings - Fork 40k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update Calico add-on #38169
Update Calico add-on #38169
Conversation
b24cc48
to
175fe62
Compare
Quoting #32625 (comment)
@vishh would you mind pointing me to the right place to add this? |
@kubernetes/sig-network |
nice! |
dbc960a
to
e2bf14f
Compare
Looks like the PR that removed the legacy "configure-cbr0" mode also removed some handy outgoing NAT behavior that this PR was depending on (or rather it's now kubenet only). Calico can do its own NAT (and usually doesn't rely on the above), but I haven't added support for that feature yet to the "etcdless" mode used in this PR, so that's the next step to move this forward. |
Is this still WIP? |
@roberthbailey Yes, sorry, I forgot about this guy. Just a few tweaks I need to make - will do it this week. |
e2bf14f
to
1919cda
Compare
It's still a bit icky. I can't mount How do we feel about making GCI and debian both look in |
Probably better to put it in |
1919cda
to
7823db6
Compare
@freehan I've left it as Do you think it's OK to change? |
7823db6
to
d7a68f9
Compare
This PR currently wants this: #41943 |
@k8s-bot gce etcd3 e2e test this |
/lgtm |
kubernetes.io/cluster-service: "true" | ||
k8s-app: calico-node | ||
annotations: | ||
scheduler.alpha.kubernetes.io/critical-pod: '' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this runs in a cluster w/o alpha features enabled, do these annotations get ignored?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep
- name: CALICO_NETWORKING_BACKEND | ||
value: "none" | ||
- name: CALICO_IPV4POOL_CIDR | ||
value: "10.244.0.0/16" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this CIDR range do? Should it be configurable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This CIDR range is used by Calico to perform outgoing NAT for Pods to access the internet, etc and is required as of k8s v1.5 since the kubelet stopped doing NAT for CNI plugins.
It probably should be configurable if the cluster CIDR is configurable (they should have the same value). I think that would mean making this file into a template.
@caseydavenport -- are you still working on this PR? The labels indicate that you need to set a release note before it will merge. |
@roberthbailey sorry for the delay in response, have been out of the office for a few weeks. I've already got a release note in the description, but I'd like to rebase and retest this before merging. |
0fca57a
to
b17b394
Compare
@caseydavenport Any chance we can get this in this week? |
@dnardo just spun another cluster up today with this and noticed that something has changed in the last rebase that is causing the master components to fail. I need to investigate the cause of that - will post back with more information. I'm really hoping to get this merged this week provided I can sort out this new issue. |
Interestingly enough, it looks like the problem I'm seeing is that the kubelet is trying to evict the controller manager upon installation of Calico on the node.
Not sure yet what the right fix is - might just be to stop running Calico on the master somehow, or adjust resource requests, or.. something else? |
@dnardo Yeah, can confirm that this works just swell when I set |
ff012f5
to
c100e5a
Compare
@dnardo @roberthbailey I think this is ready to go in now. |
lgtm |
cluster/gce/config-default.sh
Outdated
@@ -122,7 +122,7 @@ ENABLE_CLUSTER_MONITORING="${KUBE_ENABLE_CLUSTER_MONITORING:-influxdb}" | |||
# of fluentd running on a node, kubelet need to mark node on which | |||
# fluentd is not running as a manifest pod with appropriate label. | |||
# TODO(piosz): remove this in 1.8 | |||
NODE_LABELS="${KUBE_NODE_LABELS:-beta.kubernetes.io/fluentd-ds-ready=true}" | |||
NODE_LABELS="${KUBE_NODE_LABELS:-beta.kubernetes.io/fluentd-ds-ready=true,beta.kubernetes.io/calico-ds-ready=true}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we may want to make this conditional on NETWORK_POLICY_PROVIDER=calico ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would this be beta.kubernetes.io/calico-ds-ready=true
? Shouldn't it be something like: beta.calicoproject.org/ds-ready=true
?
I guess my question comes down to 'who defined this as 'beta' and 'why does it belong in the kubernetes.io org?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
c100e5a
to
2b9b7f0
Compare
/release-note |
@dnardo @roberthbailey @eparis I think I've made all the markups - please take a look. Thanks! |
|
||
# Replace the cluster cidr. | ||
local -r calico_file="${dst_dir}/calico-policy-controller/calico-node.yaml" | ||
sed -i -e "s@__CLUSTER_CIDR__@${CLUSTER_IP_RANGE:-'10.244.0.0/16'}@g" "${calico_file}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the default value here. It doesn't match the value in config-default.sh (https://github.com/kubernetes/kubernetes/blob/master/cluster/gce/config-default.sh#L89) and I'm wondering if this script should just fail if the CIDR isn't set rather than falling back to a value that will likely result in a non-functional cluster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's wrong - it should just use the value in config-default.sh.
# container programs network policy and routes on each | ||
# host. | ||
- name: calico-node | ||
image: calico/node:v1.2.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this pulls from dockerhub. Most (all?) of the other cluster addons pull from gcr.io, so it'd be nice to use that here as well for consistency.
2b9b7f0
to
63744a8
Compare
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: caseydavenport, freehan, roberthbailey
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
@caseydavenport: The following test(s) failed:
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
Automatic merge from submit-queue |
What this PR does / why we need it:
Updates Calico to the latest version using self-hosted install as a DaemonSet, removes Calico's dependency on etcd.
Maybe:
Which issue this PR fixes:
#32625
Try it out
Clone the PR, then:
Release note: