Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parametrize the kube-dns --federations command line argument in the manifest #27986

Merged
merged 3 commits into from
Jun 24, 2016

Conversation

madhusudancs
Copy link
Contributor

This parameter is later substituted with the environment variable during
the build process.

cc @kubernetes/sig-cluster-federation

Analytics

…anifest.

This parameter is later substituted with the environment variable during
the build process.
@madhusudancs madhusudancs added area/cluster-federation release-note-none Denotes a PR that doesn't merit a release note. labels Jun 23, 2016
@madhusudancs madhusudancs added this to the v1.3 milestone Jun 23, 2016
@ghost
Copy link

ghost commented Jun 23, 2016

I don't know enough about SALT to review this competently.
@madhusudancs Do you need to provide a default value for FEDERATIONS_DOMAIN_MAP somewhere?

@k8s-github-robot k8s-github-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Jun 23, 2016
@@ -75,6 +75,7 @@ spec:
# command = "/kube-dns"
- --domain=$DNS_DOMAIN.
- --dns-port=10053
$FEDERATIONS_DOMAIN_MAP

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might break the .sed consumers.

Also, should it be `--federations=$FEDERATIONS_DOMAIN_MAP`` ?

We might want to leave this change out till we get some federation testing on systems that use the .sed files.

Copy link
Contributor Author

@madhusudancs madhusudancs Jun 23, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a fair argument. I will blank out it out in transforms2sed.

@girishkalele
Copy link

It is expected that during the build step if the FEDERATION environment variable is set to true, then the FEDERATIONS_DOMAIN_MAP variable should be set ?

Maybe the if statement should check for FEDERATIONS_DOMAIN_MAP to be non-empty in addition to FEDERATION==true.

@@ -949,6 +949,9 @@ function kube::release::package_kube_manifests_tarball() {
objects=$(cd "${KUBE_ROOT}/cluster/saltbase/salt/kube-dns" && find . \( -name \*.yaml -or -name \*.yaml.in -or -name \*.json \) | grep -v demo)
mkdir -p "${dst_dir}/dns"
tar c -C "${KUBE_ROOT}/cluster/saltbase/salt/kube-dns" ${objects} | tar x -C "${dst_dir}/dns"
if [[ "${FEDERATION:-}" == "true" ]]; then
sed -i 's/{{ pillar\['"'"'federations_domain_map'"'"'\] }}/- --federations="'"${FEDERATIONS_DOMAIN_MAP}"'"/g' "${dst_dir}/dns/skydns-rc.yaml.in"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We pass FEDERATION_NAME and DNS_ZONE_NAME to federation-controller-manager. To keep kube-dns and controller manager flags in sync, I guess FEDERATIONS_DOMAIN_MAP should be ${FEDERATION_NAME}=${DNS_ZONE_NAME}?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack.

@madhusudancs
Copy link
Contributor Author

madhusudancs commented Jun 24, 2016

It is expected that during the build step if the FEDERATION environment variable is set to true, then the FEDERATIONS_DOMAIN_MAP variable should be set ?

Maybe the if statement should check for FEDERATIONS_DOMAIN_MAP to be non-empty in addition to FEDERATION==true.

This was intentionally left out. --federation="" is a valid flag value. But based on your comment, @nikhiljindal's comment and having talked to @quinton-hoole, everybody other than me is thinking a little differently about this, I will do this slightly differently.

@girishkalele
Copy link

Discussed with @madhusudancs that baking in the federation parameters into the build artifacts is a short-term measure - we need to move this to the GCE VM startupscripts that will substitute based on instance metadata.

@madhusudancs
Copy link
Contributor Author

@girishkalele @quinton-hoole @nikhiljindal addressed the comments. PTAL.

@girishkalele
Copy link

Salt/yaml changes look good to me - as discussed, need to move baking in of federation names in a future PR.

@k8s-github-robot k8s-github-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jun 24, 2016
# during cluster bootstrap.
if [[ "${FEDERATION:-}" == "true" ]]; then
FEDERATIONS_DOMAIN_MAP="${FEDERATIONS_DOMAIN_MAP:-}"
if [[ -z "${FEDERATIONS_DOMAIN_MAP}" && -n "${FEDERATION_NAME:-}" && -n "${DNS_ZONE_NAME:-}" ]]; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not certain about the order of these shell scripts, but we set FEDERATION_NAME and DNS_ZONE_NAME in

export FEDERATION_NAME="${FEDERATION_NAME:-federation}"
which I think runs after this code runs.
Maybe move that code to set these env vars here?
Or if the order is not certain, we will have to set the default values for FEDERATION_NAME and DNS_ZONE_NAME at both places?

Copy link
Contributor Author

@madhusudancs madhusudancs Jun 24, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nikhiljindal The operation that defaults DNS_ZONE_NAME (and FEDERATION_NAME) in federation/cluster/common.sh is independent of the operation here. This code here is part of the build and the operation that defaults DNS_ZONE_NAME is part of the deploy. We will have to default the same value twice if we want to default those values.

I don't like defaulting DNS_ZONE_NAME and FEDERATION_NAME here (as opposed to my general inclination towards reasonable defaults) for two reasons:

  1. Doing that will entirely eliminate the users' ability to set those values later during cluster deployment, i.e. during the salt parameter substitution time.
  2. It is not possible to get the e2e tests passing with the current defaults anyway and if a user wants to get their tests to pass they will have to supply non-default values, i.e. the domain name that they own. So there is no reasonable default here.

For the automated tests however, we will provide the FEDERATIONS_DOMAIN_MAP value directly to the Jenkins job. See kubernetes/test-infra#211. So that should take care of the automated tests case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe its because I dont know enough about salt: but why would someone want to override it using salt rather than doing it using env var? The advantage of doing it using env var is that it sets the right values in both kube-dns and the controller manager. (salt will only set it in kube-dns).
The problem I see with current setup is that it can happen that I dont set the values anywhere (neither in env var nor in salt), my cluster bring up will work fine. Just that my DNS resolution for federation queries will not work.
If we decide not to set default values for FEDERATION_NAME and DNS_ZONE_NAME, then cluster bring up script should throw an error if FEDERATION=true and FEDERATION_NAME or DNS_ZONE_NAME are not specified.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe its because I dont know enough about salt: but why would someone want to override it using salt rather than doing it using env var?

The code isn't there right now and it could go in a follow up PR, but conceivably this parameter could be replaced by an environment variable on the master node when the master node is configured. That's why I wanted to leave it as is. I am not doing that right now because it caused other problems. But that should be followed up.

If we decide not to set default values for FEDERATION_NAME and DNS_ZONE_NAME, then cluster bring up script should throw an error if FEDERATION=true and FEDERATION_NAME or DNS_ZONE_NAME are not specified.

Sure, the bring up script should throw an error if the values aren't there. We should probably do that. I am not entirely sure if our defaulting is the right thing to do. It makes the tests fail semi-silently.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes as discussed, we should do one: default to a reasonable value or throw an error.
Looks like you dont like defaulting so we can throw an error :)

Dont want to block this PR. Lets get this PR merged to get the tests passing.
You can send a follow up PR.

@mml
Copy link
Contributor

mml commented Jun 24, 2016

Some dns-related tests are failing. I'm guessing the kube-dns yaml file is somehow broken. Let me look at the artifacts.

@mml
Copy link
Contributor

mml commented Jun 24, 2016

@k8s-bot e2e test this issue: #IGNORE

@mml
Copy link
Contributor

mml commented Jun 24, 2016

I brought up a test cluster using this branch, and I got no DNS created at all...

% kc -q --context=mml-e2e_e2e get rc,rs,deployments,pod --namespace=kube-system | grep -Pv 'minion-group|node-problem-detector'
NAME                                          DESIRED   CURRENT   AGE
elasticsearch-logging-v1                      2         2         1m
kibana-logging-v1                             1         1         1m
kubernetes-dashboard-v1.1.0                   1         1         1m
l7-default-backend-v1.0                       1         1         1m
monitoring-influxdb-grafana-v3                1         1         1m
NAME                                          DESIRED   CURRENT   AGE
heapster-v1.1.0-3210777732                    0         0         1m
heapster-v1.1.0-4105671804                    1         1         1m
NAME                                          DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
heapster-v1.1.0                               1         1         1            1           1m
NAME                                          READY     STATUS    RESTARTS     AGE
elasticsearch-logging-v1-3eqmj                1/1       Running   0            1m
elasticsearch-logging-v1-z2ell                1/1       Running   0            1m
etcd-server-e2e-master                        1/1       Running   0            5m
etcd-server-events-e2e-master                 1/1       Running   0            5m
heapster-v1.1.0-4105671804-1x8dh              4/4       Running   0            1m
kibana-logging-v1-bughm                       1/1       Running   2            1m
kube-addon-manager-e2e-master                 1/1       Running   0            5m
kube-apiserver-e2e-master                     1/1       Running   2            5m
kube-controller-manager-e2e-master            1/1       Running   2            6m
kube-scheduler-e2e-master                     1/1       Running   0            5m
kubernetes-dashboard-v1.1.0-aosto             1/1       Running   0            1m
l7-default-backend-v1.0-creow                 1/1       Running   0            1m
l7-lb-controller-v0.7.0-e2e-master            1/1       Running   0            6m
monitoring-influxdb-grafana-v3-n0cac          2/2       Running   0            1m

@girishkalele
Copy link

There is no startupscript.log (or its GCI equivalent) in the kube-master node artifacts and no addon manager logs to indicate whether it started the kube-dns addon correctly.
Makes it hard to post-mortem.

@mml
Copy link
Contributor

mml commented Jun 24, 2016

On the master, this is in skydns-rc.yaml.

        args:
        # command = "/kube-dns"
        - --domain=cluster.local.
        - --dns-port=10053
        {{ pillar['federations_domain_map'] }}

FEDERATIONS_DOMAIN_MAP="${FEDERATION_NAME}=${DNS_ZONE_NAME}"
fi
if [[ -n "${FEDERATIONS_DOMAIN_MAP}" ]]; then
sed -i 's/{{ pillar\['"'"'federations_domain_map'"'"'\] }}/- --federations="'"${FEDERATIONS_DOMAIN_MAP}"'"/g' "${dst_dir}/dns/skydns-rc.yaml.in"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We aren't redirecting stdout anywhere. So this change goes nowhere.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, no. sed -i

@mml
Copy link
Contributor

mml commented Jun 24, 2016

That did it, @madhusudancs

% kcs --context=mml-e2e_e2e describe rc -l k8s-app=kube-dns 
krun: Using _output/dockerized/bin
Name:           kube-dns-v16
Namespace:      kube-system
Image(s):       gcr.io/google_containers/kubedns-amd64:1.4,gcr.io/google_containers/kube-dnsmasq-amd64:1.3,gcr.io/google_containers/exechealthz-amd64:1.0
Selector:       k8s-app=kube-dns,version=v16
Labels:         k8s-app=kube-dns
                kubernetes.io/cluster-service=true
                version=v16
Replicas:       1 current / 1 desired
Pods Status:    1 Running / 0 Waiting / 0 Succeeded / 0 Failed
No volumes.
Events:
  FirstSeen     LastSeen        Count   From                            SubobjectPath   Type            Reason                  Message
  ---------     --------        -----   ----                            -------------   --------        ------                  -------
  6m            6m              1       {replication-controller }                       Normal          SuccessfulCreate        Created pod: kube-dns-v16-gn6gf

@madhusudancs
Copy link
Contributor Author

@mml oh cool! Thanks for verifying!

@k8s-bot
Copy link

k8s-bot commented Jun 24, 2016

GCE e2e build/test passed for commit 3ee03a5.

@mml mml added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 24, 2016
@madhusudancs madhusudancs added the priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. label Jun 24, 2016
@k8s-github-robot
Copy link

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-bot
Copy link

k8s-bot commented Jun 24, 2016

GCE e2e build/test passed for commit 3ee03a5.

@k8s-github-robot
Copy link

Automatic merge from submit-queue

@erictune
Copy link
Member

Setting parameters in salt doesn't seem like something that belongs in build.sh.

@madhusudancs
Copy link
Contributor Author

@erictune that's a fair argument. Planning to move this to node bootstrap script either in the next patch release or 1.4.

@girishkalele
Copy link

@erictune

Yes, discussed this with @madhusudancs earlier -
#27986 (comment)

@luxas
Copy link
Member

luxas commented Jun 26, 2016

Please please please take breaking changes like this seriously and fix it up in all places, @MikeSpreitzer already fixed it up for ubuntu, but the new, unset variable is still there for the most of the other deployments that are using this file :/, including hyperkube.

I'm gonna send a fast patch for hyperkube now, but take the whole repo into account the next time a change is made to make sure it's backwards-compatible or update the files that are referencing the skydns manifest.

@MikeSpreitzer
Copy link
Member

Yeah, I opened #28073 because I noticed there is other breakage.

k8s-github-robot pushed a commit that referenced this pull request Jun 27, 2016
Automatic merge from submit-queue

Hotfix: Fixup the hyperkube dns manifest from a breaking federation PR

This one has to be cherrypicked into v1.3.0, ref: #27986

@girishkalele @madhusudancs @quinton-hoole @eparis @nikhiljindal @mml @roberthbailey @david-mcmahon
@erictune
Copy link
Member

I'm going to hold off on taking this until you've moved the code somewhere appropriate and have dealt with the other repos.

@madhusudancs
Copy link
Contributor Author

@erictune fair enough.

Btw, this PR is already in 1.3 branch.

$ git branch --contains 3ee03a
  master
* release-1.3

You probably have to revert it.

@erictune
Copy link
Member

Okay

eparis pushed a commit to eparis/kubernetes that referenced this pull request Jun 29, 2016
Automatic merge from submit-queue

Hotfix: Fixup the hyperkube dns manifest from a breaking federation PR

This one has to be cherrypicked into v1.3.0, ref: kubernetes#27986

@girishkalele @madhusudancs @quinton-hoole @eparis @nikhiljindal @mml @roberthbailey @david-mcmahon
(cherry picked from commit d9cecdb)
@madhusudancs madhusudancs deleted the fed-kubedns-flags branch July 7, 2016 05:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. release-note-none Denotes a PR that doesn't merit a release note. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.