
Cluster dns stops functioning after 1.6 -> 1.5 downgrade #43668

Closed
pwittrock opened this issue Mar 25, 2017 · 27 comments

@pwittrock
Member

pwittrock commented Mar 25, 2017

I have been able to reproduce this twice.

Steps to reproduce:

  • Download kubernetes 1.5.5 tar, extract and cd into the directory
  • run cluster/get-kube-binaries.sh
  • run cluster/kube-up.sh
  • Run the steps defined here
    • You don't need to wait for the HPA to react; just verify that the script says "OK!" a bunch of times to ensure it can reach the server (a quick stand-alone check is also sketched at the end of this comment)
  • Download the v1.6.0-rc.1 release tar, extract, and cd into the directory
  • run cluster/get-kube-binaries.sh
  • Warning: Set the following env vars or you will not be able to downgrade later:
export TARGET_STORAGE=etcd3
export ETCD_IMAGE=3.0.17
export TARGET_VERSION=3.0.17
export STORAGE_MEDIA_TYPE=application/json
  • run ./cluster/gce/upgrade.sh -M v1.6.0-rc.1
  • Run the steps defined here to verify the Pod can see the service
  • run ./cluster/gce/upgrade.sh -N -o v1.6.0-rc.1 to upgrade the nodes
  • Run the steps defined here to verify the Pod can see the service
  • run ./cluster/gce/upgrade.sh -N -o v1.5.5 to downgrade the nodes
  • Run the steps defined here to verify the Pod can see the service
  • Warning: Set the following env vars to downgrade:
export TARGET_STORAGE=etcd2
export ETCD_IMAGE=3.0.17
export TARGET_VERSION=2.2.1
export STORAGE_MEDIA_TYPE=application/json
  • run ./cluster/gce/upgrade.sh -M v1.5.5 to downgrade the master
  • Run the steps defined here to verify the Pod can see the service
    • This is where things fail for me. I am no longer able to connect to the service through the DNS address.
wget php-apache.default.svc.cluster.local -q -O -
wget: bad address 'php-apache.default.svc.cluster.local'
  • I am able to connect to the service using its IP address
kubectl get services
NAME         CLUSTER-IP    EXTERNAL-IP   PORT(S)   AGE
kubernetes   10.0.0.1      <none>        443/TCP   1h
php-apache   10.0.52.190   <none>        80/TCP    1h
wget -q -O - 10.0.52.190
OK!
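
For a quick stand-alone check of cluster DNS without the full walkthrough, something roughly like the following works (the busybox image and pod name are just illustrative choices, not part of the walkthrough):

kubectl run dns-check --image=busybox --restart=Never -- sleep 3600
kubectl exec dns-check -- nslookup php-apache.default.svc.cluster.local
kubectl exec dns-check -- wget -q -O - http://php-apache.default.svc.cluster.local
kubectl delete pod dns-check
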
@pwittrock
Member Author

Following up, it looks like cluster DNS is generally down, not just for services. I can't wget google.com or run apt-get update either.

@pwittrock
Member Author

@ethernetdan
Copy link
Contributor

@pwittrock working to recreate now, can you post the DNS logs if they exist?

@pwittrock
Member Author

Output from the Pod trying to use cluster DNS:

/# wget -q -O- http://php-apache.default.svc.cluster.local
wget: bad address 'php-apache.default.svc.cluster.local'
/ # cat /etc/resolv.conf 
search default.svc.cluster.local svc.cluster.local cluster.local c.pwittroc-1180.internal. google.internal.
nameserver 10.0.0.10
options ndots:5
kubectl get services --namespace=kube-system
NAME                   CLUSTER-IP     EXTERNAL-IP   PORT(S)             AGE
default-http-backend   10.0.84.48     <nodes>       80:30114/TCP        5h
heapster               10.0.152.163   <none>        80/TCP              5h
kube-dns               10.0.0.10      <none>        53/UDP,53/TCP       5h
kubectl describe services kube-dns --namespace=kube-system
Name:			kube-dns
Namespace:		kube-system
Labels:			k8s-app=kube-dns
			kubernetes.io/cluster-service=true
			kubernetes.io/name=KubeDNS
Selector:		k8s-app=kube-dns
Type:			ClusterIP
IP:			10.0.0.10
Port:			dns	53/UDP
Endpoints:		10.244.7.4:53
Port:			dns-tcp	53/TCP
Endpoints:		10.244.7.4:53
Session Affinity:	None
No events.

Then change the nameserver entry in /etc/resolv.conf from 10.0.0.10 to 10.244.7.4:

wget -q -O- http://php-apache.default.svc.cluster.local
OK!/

So the issue appears to be that the service routing was not actually updated, rather than kube-dns itself being down.
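
An equivalent check that separates the two paths from inside the same pod without editing resolv.conf, using busybox's nslookup and the IPs from the output above (a sketch, not taken from the original session):

nslookup php-apache.default.svc.cluster.local 10.0.0.10    # via the kube-dns service IP - fails
nslookup php-apache.default.svc.cluster.local 10.244.7.4   # directly against the kube-dns pod IP - works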

@justinsb
Member

Hypothesis: when you downgrade the master, the new etcd2 resource versions are lower, and kube-proxy will ignore changes until the resource version "catches up". I wonder if there's anything in the kube-proxy logs, and I wonder if a restart of kube-proxy on the node will fix it.
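
For reference, a rough way to bounce kube-proxy on a node and watch its logs, assuming kube-proxy runs as a static pod under Docker (the GCE kube-up default for this era); the container ID is whatever docker ps shows:

sudo docker ps | grep kube-proxy
sudo docker kill <kube-proxy-container-id>   # kubelet restarts it from the static manifest
tail -f /var/log/kube-proxy.log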

@pwittrock
Member Author

I tried kicking the kube-proxy on each of the nodes - no luck.
I then tried creating a new service called "kube-dns-2" with the same settings but a different IP (see the sketch below) - no luck.
As a sanity check, I updated resolv.conf to point at the Pod IP directly again - things worked again.
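
For the record, the duplicate service can be created roughly like this (my reconstruction, not the exact commands used):

kubectl --namespace=kube-system get service kube-dns -o yaml > kube-dns-2.yaml
# edit kube-dns-2.yaml: set metadata.name to kube-dns-2 and remove metadata.uid,
# metadata.resourceVersion, metadata.creationTimestamp and spec.clusterIP
kubectl create -f kube-dns-2.yaml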

@justinsb
Member

That is interesting, as it possibly points to something to do with kube-proxy rather than with the API. Does the gce node upgrade procedure build new instances, or is it a "hot push"?

Another hypothesis: maybe kube-proxy no longer has permissions on the API?

The kube-proxy logs (/var/log/kube-proxy.log I think) and iptables -t nat --list-rules may be interesting.

(-t nat because the NAT table is where most of the magic happens, and --list-rules because that is the format I am more used to :-) )
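
A quick way to grab both on an affected node (paths and table as mentioned above):

tail -n 50 /var/log/kube-proxy.log
sudo iptables -t nat --list-rules | grep -i kube-dns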

@ethernetdan
Contributor

Have been trying to reproduce this - ran into similar DNS issues with the 1.5.5 cluster even before the upgrade on a few different attempts. Going to start from scratch and try again.

@ethernetdan
Contributor

ethernetdan commented Mar 26, 2017

Recreated the issue on 2 clusters. DNS seemed to work for a few moments on both before stopping. I was also having serious DNS problems on the 1.5 cluster, but they seemed to be fixed by increasing the replicas on the DNS autoscaler deployment.
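
(For reference, one hedged way to bump the kube-dns replica count directly - the comment above doesn't say exactly how the scaling was done, and the autoscaler may scale the deployment back:)

kubectl --namespace=kube-system scale deployment kube-dns --replicas=3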

Cluster 1

kubectl --namespace=kube-system get endpoints
NAME                      ENDPOINTS                         AGE
default-http-backend      10.128.0.2:8080,10.244.8.4:8080   1h
heapster                  10.244.9.7:8082                   1h
kube-controller-manager   <none>                            43m
kube-dns                  10.244.8.17:53,10.244.8.17:53     1h
kube-scheduler            <none>                            43m
kubernetes-dashboard      10.244.9.4:9090                   1h
monitoring-grafana        10.244.7.9:3000                   1h
monitoring-influxdb       10.244.7.9:8086,10.244.7.9:8083   1h
~$ sudo iptables -t nat --list-rules | grep dns
-A KUBE-SEP-AAIBWZFGLDNIOXPJ -s 10.244.7.8/32 -m comment --comment "kube-system/kube-dns:dns-tcp" -j KUBE-MARK-MASQ
-A KUBE-SEP-AAIBWZFGLDNIOXPJ -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp" -m tcp -j DNAT --to-destination 10.244.7.8:53
-A KUBE-SEP-LMCFD7F2JBFHOYE7 -s 10.244.7.8/32 -m comment --comment "kube-system/kube-dns:dns" -j KUBE-MARK-MASQ
-A KUBE-SEP-LMCFD7F2JBFHOYE7 -p udp -m comment --comment "kube-system/kube-dns:dns" -m udp -j DNAT --to-destination 10.244.7.8:53
-A KUBE-SERVICES ! -s 10.244.0.0/14 -d 10.0.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.0.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-SVC-ERIFXISQEP7F7OF4
-A KUBE-SERVICES ! -s 10.244.0.0/14 -d 10.0.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.0.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-SVC-TCOU7JCQXEZGVUNU
-A KUBE-SVC-ERIFXISQEP7F7OF4 -m comment --comment "kube-system/kube-dns:dns-tcp" -j KUBE-SEP-AAIBWZFGLDNIOXPJ
-A KUBE-SVC-TCOU7JCQXEZGVUNU -m comment --comment "kube-system/kube-dns:dns" -j KUBE-SEP-LMCFD7F2JBFHOYE7
kubectl --namespace=kube-system get svc
NAME                   CLUSTER-IP     EXTERNAL-IP   PORT(S)             AGE
default-http-backend   10.0.15.86     <nodes>       80:31804/TCP        1h
heapster               10.0.155.168   <none>        80/TCP              1h
kube-dns               10.0.0.10      <none>        53/UDP,53/TCP       1h
kubernetes-dashboard   10.0.196.87    <none>        80/TCP              1h
monitoring-grafana     10.0.97.156    <none>        80/TCP              1h
monitoring-influxdb    10.0.242.86    <none>        8083/TCP,8086/TCP   1h

Cluster 2

kubectl --namespace=kube-system get endpoints
NAME                      ENDPOINTS                         AGE
default-http-backend      10.138.0.2:8080,10.244.6.7:8080   1h
heapster                  10.244.6.8:8082                   1h
kube-controller-manager   <none>                            4m
kube-dns                  10.244.5.3:53,10.244.5.3:53       1h
kube-scheduler            <none>                            4m
kubernetes-dashboard      10.244.6.6:9090                   1h
monitoring-grafana        10.244.7.4:3000                   1h
monitoring-influxdb       10.244.7.4:8086,10.244.7.4:8083
~$ sudo iptables -t nat --list-rules | grep dns
-A KUBE-SEP-5YVXBBORU6YR62W3 -s 10.244.7.6/32 -m comment --comment "kube-system/kube-dns:dns-tcp" -j KUBE-MARK-MASQ
-A KUBE-SEP-5YVXBBORU6YR62W3 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp" -m tcp -j DNAT --to-destination 10.244.7.6:53
-A KUBE-SEP-ECPWESCXHU7NIICT -s 10.244.7.6/32 -m comment --comment "kube-system/kube-dns:dns" -j KUBE-MARK-MASQ
-A KUBE-SEP-ECPWESCXHU7NIICT -p udp -m comment --comment "kube-system/kube-dns:dns" -m udp -j DNAT --to-destination 10.244.7.6:53
-A KUBE-SERVICES ! -s 10.244.0.0/14 -d 10.0.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.0.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-SVC-TCOU7JCQXEZGVUNU
-A KUBE-SERVICES ! -s 10.244.0.0/14 -d 10.0.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.0.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-SVC-ERIFXISQEP7F7OF4
-A KUBE-SVC-ERIFXISQEP7F7OF4 -m comment --comment "kube-system/kube-dns:dns-tcp" -j KUBE-SEP-5YVXBBORU6YR62W3
-A KUBE-SVC-TCOU7JCQXEZGVUNU -m comment --comment "kube-system/kube-dns:dns" -j KUBE-SEP-ECPWESCXHU7NIICT
kubectl --namespace=kube-system get svc
NAME                   CLUSTER-IP     EXTERNAL-IP   PORT(S)             AGE
default-http-backend   10.0.68.98     <nodes>       80:31471/TCP        1h
heapster               10.0.43.47     <none>        80/TCP              1h
kube-dns               10.0.0.10      <none>        53/UDP,53/TCP       1h
kubernetes-dashboard   10.0.232.190   <none>        80/TCP              1h
monitoring-grafana     10.0.6.91      <none>        80/TCP              1h
monitoring-influxdb    10.0.85.201    <none>        8083/TCP,8086/TCP   1h

@ethernetdan
Contributor

Looks like the permissions theory is the winner:
/var/log/kube-proxy.log

E0326 11:44:16.414907       5 reflector.go:188] pkg/proxy/config/api.go:33: Failed to list *api.Endpoints: the server does not allow access to the requested resource (get endpoints)
E0326 11:44:17.362300       5 reflector.go:188] pkg/proxy/config/api.go:30: Failed to list *api.Service: the server does not allow access to the requested resource (get services)
E0326 11:44:17.416435       5 reflector.go:188] pkg/proxy/config/api.go:33: Failed to list *api.Endpoints: the server does not allow access to the requested resource (get endpoints)
E0326 11:44:18.363607       5 reflector.go:188] pkg/proxy/config/api.go:30: Failed to list *api.Service: the server does not allow access to the requested resource (get services)
E0326 11:44:18.417973       5 reflector.go:188] pkg/proxy/config/api.go:33: Failed to list *api.Endpoints: the server does not allow access to the requested resource (get endpoints)
E0326 11:44:19.365039       5 reflector.go:188] pkg/proxy/config/api.go:30: Failed to list *api.Service: the server does not allow access to the requested resource (get services)
E0326 11:44:19.419445       5 reflector.go:188] pkg/proxy/config/api.go:33: Failed to list *api.Endpoints: the server does not allow access to the requested resource (get endpoints)
E0326 11:44:20.366330       5 reflector.go:188] pkg/proxy/config/api.go:30: Failed to list *api.Service: the server does not allow access to the requested resource (get services)
E0326 11:44:20.420800       5 reflector.go:188] pkg/proxy/config/api.go:33: Failed to list *api.Endpoints: the server does not allow access to the requested resource (get endpoints)
E0326 11:44:21.367868       5 reflector.go:188] pkg/proxy/config/api.go:30: Failed to list *api.Service: the server does not allow access to the requested resource (get services)
E0326 11:44:21.422240       5 reflector.go:188] pkg/proxy/config/api.go:33: Failed to list *api.Endpoints: the server does not allow access to the requested resource (get endpoints)

@ethernetdan
Contributor

/var/log/kube-apiserver.log

I0326 11:54:28.471708       5 panics.go:76] GET /api/v1/endpoints?resourceVersion=0: (61.51µs) 403 [[kube-proxy/v1.5.5 (linux/amd64) kubernetes/894ff23] 10.138.0.3:43226]
I0326 11:54:28.472030       5 panics.go:76] GET /api/v1/services?resourceVersion=0: (46.759µs) 403 [[kube-proxy/v1.5.5 (linux/amd64) kubernetes/894ff23] 10.138.0.4:37322]
I0326 11:54:28.520539       5 panics.go:76] GET /api/v1/services?resourceVersion=0: (64.52µs) 403 [[kube-proxy/v1.5.5 (linux/amd64) kubernetes/894ff23] 10.138.0.3:43226]
I0326 11:54:28.520865       5 panics.go:76] GET /api/v1/endpoints?resourceVersion=0: (21.853µs) 403 [[kube-proxy/v1.5.5 (linux/amd64) kubernetes/894ff23] 10.138.0.4:37322]
E0326 11:54:28.550293       5 cacher.go:260] unexpected ListAndWatch error: pkg/storage/cacher.go:201: Failed to list *rbac.ClusterRoleBinding: no kind "ClusterRoleBinding" is registered for version "rbac.authorization.k8s.io/v1beta1"
I0326 11:54:28.584839       5 panics.go:76] GET /api/v1/namespaces/kube-system/pods/kube-dns-2078123902-xw3p4: (1.365774ms) 200 [[rescheduler/v0.0.0 (linux/amd64) kubernetes/$Format] 127.0.0.1:33022]
E0326 11:54:28.644863       5 cacher.go:260] unexpected ListAndWatch error: pkg/storage/cacher.go:201: Failed to list *rbac.Role: no kind "Role" is registered for version "rbac.authorization.k8s.io/v1beta1"
I0326 11:54:28.700518       5 panics.go:76] GET /api/v1/services?resourceVersion=0: (53.612µs) 403 [[kube-proxy/v1.5.5 (linux/amd64) kubernetes/894ff23] 10.138.0.5:39122]
I0326 11:54:28.718371       5 panics.go:76] GET /api/v1/endpoints?resourceVersion=0: (44.152µs) 403 [[kube-proxy/v1.5.5 (linux/amd64) kubernetes/894ff23] 10.138.0.5:39122]

Looks like the API server is still trying to use beta RBAC for some reason.

@pwittrock
Member Author

@ethernetdan Good catch.

@pwittrock
Member Author

@ethernetdan @calebamiles I am surprised that this was not already caught during a downgrade test of services. How did we perform downgrade testing this release (did we at all?)?

@pwittrock
Member Author

@ethernetdan Was the version or group of RBAC changed in 1.6? I wonder if kube-proxy uses the discovery service to figure out which version of RBAC to use, and is caching the discovery version from the 1.6 server.

After downgrading, try deleting the nodes and let the node pool recreate them. Check if they are healthy afterward.

Also check the FIRST version of rbac listed by discovery service in 1.5 vs 1.6. Is it alpha in 1.5 and beta in 1.6?
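
A quick way to check what discovery advertises (the same check ethernetdan runs below):

kubectl api-versions | grep rbac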

@ethernetdan
Contributor

ethernetdan commented Mar 26, 2017

@ethernetdan @calebamiles I am surprised that this was not already caught during a downgrade test of services. How did we perform downgrade testing this release (did we at all?)?

I'm confused by this as well; this should have been caught in our large-cluster manual testing as well as in automated tests. @wojtek-t @krousey any ideas why we are just seeing this now?

@ethernetdan
Contributor

Didn't mean to close.

After downgrading, try deleting the nodes and let the node pool recreate them. Check if they are healthy afterward.

Interesting thought but it didn't seem to help. Maybe it could be cached somehow on the API server?

Also check the FIRST version of rbac listed by discovery service in 1.5 vs 1.6. Is it alpha in 1.5 and beta in 1.6?

RBAC did move to beta in 1.6; I'm wondering if this is a side effect of making ABAC the default again. @liggitt @bgrant0607 @ericchiang any thoughts?

@liggitt
Member

liggitt commented Mar 26, 2017

I wonder if kube-proxy uses the discovery service to figure out which version of rbac to use

No, API clients don't inspect RBAC versions to determine how they should authorize… that's purely used server side.

Also check the FIRST version of rbac listed by discovery service in 1.5 vs 1.6. Is it alpha in 1.5 and beta in 1.6?

A 1.5 apiserver can only serve alpha RBAC (it doesn't have the beta code), and like all alpha APIs, it is not enabled by default. I'd expect no RBAC versions listed in discovery for a 1.5 apiserver.

1.5 kube-up installations don't use RBAC authz, their only authorizer has always been ABAC.

I'm wondering if this is a side effect of making ABAC default again

The recent change to leave ABAC enabled by default in 1.6 has no effect on a downgrade to 1.5, since it only changed 1.6 install scripts.

@ethernetdan
Contributor

Here are the API server logs.

@ethernetdan
Contributor

kubectl api-versions:

apps/v1beta1
authentication.k8s.io/v1beta1
authorization.k8s.io/v1beta1
autoscaling/v1
batch/v1
certificates.k8s.io/v1alpha1
extensions/v1beta1
policy/v1beta1
rbac.authorization.k8s.io/v1alpha1
storage.k8s.io/v1beta1
v1

@ethernetdan
Contributor

ethernetdan commented Mar 26, 2017

@liggitt identified that the 1.5 upgrade script does not modify /etc/srv/kubernetes/known_tokens.csv to ensure that the kube-proxy token is associated with the user kube_proxy, which is hardcoded in 1.5.

/etc/srv/kubernetes/known_tokens.csv:

o3DXSarJd06DspecUHYyDRDxSOpIIZvb,admin,admin,system:masters
BoAnj2dd7Miefnee1ZNGE3s71qCbxA0m,system:kube-controller-manager,uid:system:kube-controller-manager
3ZP9SvR23S0iqaFRY6sTmzMC15nqPWhW,system:kube-scheduler,uid:system:kube-scheduler
rCqQG3oyRG2gUJt6x8OrEll8MDBMGBz5,kubelet,uid:kubelet,system:nodes
NtdGElxrNovFYiQubYNucw0cXiRkeHTu,system:kube-proxy,uid:kube_proxy
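
For comparison, a fresh 1.5 master writes the kube-proxy entry with the hardcoded kube_proxy username, roughly like this (illustrative line, reusing the token value above):

NtdGElxrNovFYiQubYNucw0cXiRkeHTu,kube_proxy,kube_proxy

After the downgrade the file keeps the 1.6-style system:kube-proxy user, which 1.5's hardcoded authorization rules don't recognize.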

@ethernetdan
Contributor

Also seeing this in downgrade jobs logs: https://storage.googleapis.com/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-1.6-1.5-downgrade-cluster/183/artifacts/bootstrap-e2e-master/kube-apiserver.log

It seems the service test doesn't get run after the master downgrade for some reason, so this wasn't caught.

@pwittrock
Member Author

Is it correct that this is an issue in 1.5 then and would be fixed with a patch release?

We should figure out the scope of what downgrade tests are not being run and get them running.

@ethernetdan
Contributor

Is it correct that this is an issue in 1.5 then and would be fixed with a patch release?

That's definitely an option; others would be adding a migration script in 1.6 or keeping backwards compatibility with the hardcoded username.

We should figure out the scope of what downgrade tests are not being run and get them running.

Agreed, I want to get to the bottom of why we didn't catch this earlier.

@liggitt
Member

liggitt commented Mar 26, 2017

Is it correct that this is an issue in 1.5 then and would be fixed with a patch release?

It is an issue in 1.5, and should be fixed with a patch release.

1.5 hard-codes authorization rules, but if the known_tokens.csv file already exists, it does not ensure it works with those authorization rules.

This issue was fixed in 1.6 so that whether you are upgrading to 1.6 or downgrading to 1.6, the configured tokens identify users that have the correct permissions.

I want to get to the bottom of why we didn't catch this earlier

In 1.5, the scheduler and controller manager used the insecure port, which means they bypassed authentication and authorization entirely and weren't affected by tokens identifying permissionless users when downgrading from 1.6.
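
For illustration only (not the snippet the original comment pointed to), "using the insecure port" in a 1.5 kube-up cluster means flags roughly like these:

kube-controller-manager --master=127.0.0.1:8080 ...
kube-scheduler --master=127.0.0.1:8080 ...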

That means the only affected component was kube-proxy.

If a downgrade to 1.5 ran in an environment that was missing the known_tokens.csv file, it would be recreated with the 1.5-expected users, masking the issue. Are our automated downgrade tests doing something that would result in that file being missing and getting recreated? The manual procedure is preserving it (correctly, as expected), revealing the issue.

k8s-github-robot pushed a commit that referenced this issue Mar 27, 2017
Automatic merge from submit-queue

kube-up: ensure tokens file is correct on upgrades/downgrades

Fixes #43668

1.5 [hard-codes authorization rules](https://github.com/kubernetes/kubernetes/blob/release-1.5/cluster/gce/gci/configure-helper.sh#L915-L920), but if the `known_tokens.csv` file already exists, it [does not ensure it works with those authorization rules](https://github.com/kubernetes/kubernetes/blob/release-1.5/cluster/gce/gci/configure-helper.sh#L264).

```release-note
kube-up (with gce/gci and gce/coreos providers) now ensures the authentication token file contains correct tokens for the control plane components, even if the file already exists (ensures upgrades and downgrades work successfully)
```

This issue was fixed in 1.6 for the gce and coreos providers. This PR picks those fixes for the control plane elements from these commits:
* 968b0b3
* d94bb26
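
The gist of the fix, sketched very roughly below (my paraphrase of the idea, not the actual PR code): when configuring the master, ensure each control plane user's token line matches what the current release expects, even if known_tokens.csv already exists.

# sketch only - assumed file format: token,user,uid[,groups]
ensure_token() {
  local token="$1" user="$2" file="/etc/srv/kubernetes/known_tokens.csv"
  if ! grep -q ",${user}," "${file}" 2>/dev/null; then
    echo "${token},${user},${user}" >> "${file}"
  fi
}
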
@liggitt
Member

liggitt commented Mar 28, 2017

fixed in #43676

Successfully upgraded v1.5.5 -> ci/latest-1.6 -> ci/latest-1.5 and ensured kube-proxy and kube-dns were functioning correctly after the upgrade and downgrade, and that the tokens file contained the correct control plane users after each step.

@ethernetdan
Contributor

This has been merged into release-1.6 and will be part of v1.6.0.

@MrHohn
Member

MrHohn commented Apr 24, 2017

cc @bowei
