
Kubeadm cluster breaks persistently on out-of-order upgrade #65562

Closed
danderson opened this issue Jun 28, 2018 · 8 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle.

Comments

@danderson

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

(arguably WAI, but I have thoughts about that in the "what you expected" section)

What happened:

On my 1.10.5 cluster (kubeadm, deb package for kubelet), in preparation for upgrading to 1.11.0, I accidentally upgraded the kubelets before the control plane (apt update && apt upgrade with the kubelet version not pinned, resulting in kubelet+kubeadm 1.11.0 against a 1.10.5 control plane).
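As an aside, holding the packages would have prevented the accidental upgrade in the first place; this is plain apt usage, nothing kubeadm-specific:

apt-mark hold kubelet kubeadm      # apt upgrade now skips these packages
apt-mark unhold kubelet kubeadm    # release the hold when you actually intend to upgrade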

After the kubelet upgrade completed, all kubelets in my cluster were crashlooping due to a missing /var/lib/kubelet/config.yaml, which meant the control plane was also hard down.

I couldn't execute kubeadm upgrade apply v1.11.0 to fix the version skew, because it needs the control plane to be up.

I tried downgrading kubelet to v1.10.5, but the changes made to the systemd drop-in unit were permanent, so kubelet was still crashlooping after the downgrade. This is what I believe to be the bug: downgrading back to a supported set of component versions did not recover the cluster. In other words, "rollback did not roll back".

The release notes just say that "there needs to be a config.yaml for kubelet now", without specifying what goes in that config or how to construct one (either manually or automatically).

I got lucky, and knew just enough kubeadm to bail myself out. In case someone else with this failure mode finds this bug, here is the fix that worked for me (I can't promise it will work for you, but it seems to have worked here; it's also condensed into commands after the list):

  • On control plane node, apt install kubelet=1.10.5, kubeadm alpha phase kubelet config write-to-disk --config=/var/lib/kubelet/config.yaml, then systemctl restart kubelet. This should get kubelet started again, and thus the control plane should come back.
  • Still on control node, kubeadm upgrade plan v1.11.0 and kubeadm upgrade apply v1.11.0 to upgrade the control plane.
  • On damaged worker nodes, run kubeadm alpha phase kubelet config download to generate the config.yaml from the in-cluster ConfigMap (which is now reachable since you repaired the control plane). You might need a systemctl restart kubelet to defeat the crashloop backoff timer.
  • Clean up on all nodes, and get all kubelets up to 1.11.0.
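Condensed into commands (same caveat as above: this is roughly what I ran, not a verified procedure; the -00 suffixes match the Debian package versions):

# --- on the control plane node ---
apt install kubelet=1.10.5-00        # roll kubelet back to match the control plane
kubeadm alpha phase kubelet config write-to-disk --config=/var/lib/kubelet/config.yaml
systemctl restart kubelet            # control plane should come back up
kubeadm upgrade plan v1.11.0
kubeadm upgrade apply v1.11.0        # now upgrade the control plane properly

# --- on each damaged worker node ---
kubeadm alpha phase kubelet config download    # regenerate config.yaml from the in-cluster ConfigMap
systemctl restart kubelet                      # defeats the crashloop backoff timer

# --- on all nodes, once things are healthy ---
apt install kubelet=1.11.0-00        # bring every kubelet up to 1.11.0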

What you expected to happen:

I expected Kubernetes to be robust against accidental version skew and out-of-order upgrading. Especially given the current state of managing k8s clusters, the situation where you end up upgrading the kubelet before the control plane is unfortunately common.

By "robust", I don't necessarily mean "should work", since this configuration is clearly outside the supported version skew. However, I think it should not persistently break the cluster. In other words, the only necessary corrective action should have been apt install kubelet=1.10.5 to downgrade back to a supported {control plane}x{kubelet} version set.

At minimum, I expected the Debian package to set things up such that a downgrade goes back to a working configuration. Even better would be if kubelet 1.11 had crashlooped with a useful message, e.g. "oh dear, it looks like you did upgrades out of order, because I don't have a config. Downgrade back to 1.10 and go upgrade the control plane first!" That way, people who don't have a rollback reflex as sharp as mine are still guided in the right direction.

How to reproduce it (as minimally and precisely as possible):

On Debian testing (or Ubuntu Xenial; the failure mode should be identical), with a condensed command sketch after the list:

  • Add the Docker and Kubernetes apt sources, per official documentation.
  • apt install kubelet=1.10.5-00 kubeadm
  • kubeadm init on control plane, kubeadm join on worker nodes as appropriate.
  • Verify kubectl get nodes works, lists all nodes at 1.10.5.
  • apt install kubelet to upgrade kubelet to 1.11.0-00.
  • Observe kubectl get nodes can no longer connect to control plane. journalctl -u kubelet shows kubelet crashlooping due to lack of config.yaml.
  • apt install kubelet=1.10.5-00 to downgrade kubelet.
  • Observe kubectl get nodes still broken, journalctl -u kubelet shows same crashloop.
  • Apply the manual fix with the individual kubeadm steps described above.
  • Observe (hopefully!) that kubectl get nodes works again, and kubelets have started back up.
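Or, roughly, as commands (apt repository setup omitted; the kubeadm join arguments come from the kubeadm init output):

apt install kubelet=1.10.5-00 kubeadm
kubeadm init                         # on the control plane node
kubeadm join ...                     # on each worker, as printed by kubeadm init
kubectl get nodes                    # everything listed at v1.10.5
apt install kubelet                  # pulls 1.11.0-00; kubelet starts crashlooping
journalctl -u kubelet                # shows the missing /var/lib/kubelet/config.yaml
apt install kubelet=1.10.5-00        # downgrade -- kubelet is still crashlooping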

Environment:

  • Kubernetes version (use kubectl version): v1.10.5 -> v1.11.0
  • Cloud provider or hardware configuration: kubeadm on debian testing (~= ubuntu xenial for k8s binaries) on bare metal.
  • OS (e.g. from /etc/os-release): Debian testing (buster)
  • Kernel (e.g. uname -a): Linux prod-01 4.16.0-2-amd64 #1 SMP Debian 4.16.12-1 (2018-05-27) x86_64 GNU/Linux
  • Install tools: kubeadm
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. kind/bug Categorizes issue or PR as related to a bug. labels Jun 28, 2018
@danderson
Author

@kubernetes/sig-cluster-lifecycle-bugs

@k8s-ci-robot k8s-ci-robot added sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jun 28, 2018
@k8s-ci-robot
Contributor

@danderson: Reiterating the mentions to trigger a notification:
@kubernetes/sig-cluster-lifecycle-bugs

In response to this:

@kubernetes/sig-cluster-lifecycle-bugs

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@vsl86

vsl86 commented Jun 28, 2018

Definitely a bug (or some dev's foul-up).
I have one node for testing purposes running on Ubuntu 18.04. Today an upgrade from 1.10.5-00 to 1.11.0-00 was made, and it just destroyed /var/lib/kubelet/config.yaml with no place to restore it from, since it is a one-node setup.
The systemd unit for kubelet has Restart=always, so be careful with upgrading. For me it is now a death loop. The nice thing is this is not in production yet.
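If you just need to break out of the restart loop while you work out a fix, stopping the unit is enough:

systemctl stop kubelet    # halts the crashloop; start it again once config.yaml is back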

@timothysc
Member

It's been well documented since nearly the beginning of the project that the order of upgrade is control plane first, then nodes. We could definitely add more warnings, but this ordering has always been required.

https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade-1-11/

I'm closing this issue in the main repo; please reopen it in https://github.com/kubernetes/kubeadm

@timothysc
Member

We're chatting on slack right now to try and discuss options around preventing this problem.

@timothysc
Member

xref - kubernetes/kubeadm#954

@davidkarlsen
Member

Big thanks for the explanation of how to get things working again; I just want to add two notes:

  1. config.yaml needs to exist before the write-to-disk step, so do a touch /var/lib/kubelet/config.yaml first
  2. there will be a reference to kubeadm-flags.env; this file will not exist on the other nodes, so copy it over (e.g. with scp, sketched below) from the node where the control plane was fixed, or else CNI will go south:
cat /var/lib/kubelet/kubeadm-flags.env
KUBELET_KUBEADM_ARGS=--cgroup-driver=cgroupfs --cni-bin-dir=/opt/cni/bin --cni-conf-dir=/etc/cni/net.d --network-plugin=cni
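For example (hostname and user are just illustrative), pulling the file onto a worker from the node where the control plane was fixed:

scp root@control-plane:/var/lib/kubelet/kubeadm-flags.env /var/lib/kubelet/kubeadm-flags.env
systemctl restart kubelet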

@stealthybox
Member

@davidkarlsen /var/lib/kubelet/kubeadm-flags.env is a node-specific overrides file:

Run kubeadm alpha phase kubelet write-env-file on each node to run the applicable logic that writes this file.

@danderson could you add this step (and maybe also point 1 about touching the config) to your write-up of the workaround?
