Kubelet fails to authenticate to apiserver due to expired certificate #65991
/kind bug
/sig auth
What happened:
My team is having an issue with TLS bootstrap, running Kubernetes 1.10.5. We set --experimental-cluster-signing-duration to 24h on the kube-controller-manager. Some nodes are deallocated overnight, and when they come back up, kubelet goes into a failed state. It appears that kubelet recognizes that the certificate has expired and attempts to bootstrap using the token from bootstrap.kubeconfig (so far so good), but then reuses the expired certificate and cannot authenticate to the apiserver. Here are the relevant logs from kubelet:
bootstrap.go:204] Part of the existing bootstrap client certificate is expired: 2018-07-06 12:32:00 +0000 UTC
bootstrap.go:58] Using bootstrap kubeconfig to generate TLS client cert, key and kubeconfig file
certificate_store.go:117] Loading cert/key pair from "/var/lib/kubelet/pki/kubelet-client-current.pem".
server.go:549] Starting client certificate rotation.
certificate_manager.go:216] Certificate rotation is enabled.
certificate_manager.go:287] Rotating certificates
manager.go:154] cAdvisor running in container: "/sys/fs/cgroup/cpu,cpuacct/system.slice/kubelet.service"
certificate_manager.go:299] Failed while requesting a signed certificate from the master: cannot create certificate signing request: certificatesigningrequests.certificates.k8s.io is forbidden: User "system:anonymous" cannot create certificatesigningrequests.certificates.k8s.io at the cluster scope
After removing the generated cert at /var/lib/kubelet/pki/kubelet-client-current.pem, kubelet was able to bootstrap properly, obtain a new cert and join the cluster.
rm /var/lib/kubelet/pki/kubelet-client-current.pem
systemctl restart kubelet
Just removing the kubeconfig, or any of the files in /var/lib/kubelet/pki/ other than kubelet-client-current.pem, and restarting kubelet did not work. Removing the entire /var/lib/kubelet/pki/ directory and restarting kubelet works as well.
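The manual workaround could be automated as a pre-start check. The following is a minimal sketch, not part of kubelet: it assumes the `openssl` CLI is installed and that the first certificate block in the PEM file is the kubelet client certificate. The helper names (`parse_not_after`, `is_expired`, `remove_if_expired`) are hypothetical.

```python
import os
import ssl
import subprocess
from datetime import datetime, timezone

# Path taken from the kubelet logs in this report.
CERT_PATH = "/var/lib/kubelet/pki/kubelet-client-current.pem"


def parse_not_after(openssl_output):
    """Parse the 'notAfter=...' line printed by `openssl x509 -enddate -noout`."""
    for line in openssl_output.splitlines():
        if line.startswith("notAfter="):
            cert_time = line[len("notAfter="):].strip()
            # ssl.cert_time_to_seconds accepts e.g. "Jul  6 12:32:00 2018 GMT".
            seconds = ssl.cert_time_to_seconds(cert_time)
            return datetime.fromtimestamp(seconds, tz=timezone.utc)
    raise ValueError("no notAfter line found in openssl output")


def is_expired(not_after, now=None):
    """True if the certificate's notAfter is at or before `now` (UTC)."""
    now = now or datetime.now(timezone.utc)
    return now >= not_after


def remove_if_expired(path=CERT_PATH, now=None):
    """Delete the client cert file if it has expired; return True if removed."""
    result = subprocess.run(
        ["openssl", "x509", "-enddate", "-noout", "-in", path],
        check=True, capture_output=True, text=True,
    )
    if is_expired(parse_not_after(result.stdout), now):
        # Like `rm`, this removes only the path itself.
        os.remove(path)
        return True
    return False
```

Calling `remove_if_expired()` as root before `systemctl restart kubelet` would mirror the two manual commands, but only when the certificate has actually expired.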
What you expected to happen:
After kubelet recognizes that its certificate has expired, I expect it to discard the expired certificate and bootstrap successfully with the token in bootstrap.kubeconfig. It should then obtain a new, valid, signed certificate from the control plane and authenticate to the apiserver.
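As a rough illustration, the expected behavior can be written as a small decision rule. This is only a sketch of the desired outcome, not kubelet's actual logic; the function name and the returned labels are hypothetical.

```python
from datetime import datetime, timezone


def choose_client_credentials(cert_not_after, now, has_bootstrap_token):
    """Return which credentials the expected startup logic would use.

    cert_not_after: expiry of the on-disk client certificate, or None if
    no certificate is present.
    """
    if cert_not_after is not None and now < cert_not_after:
        return "client-certificate"  # still valid: keep using it
    if has_bootstrap_token:
        return "bootstrap-token"     # expired or missing: re-bootstrap
    return "none"                    # nothing usable to authenticate with
```

The logs above correspond to the second case: the certificate is expired and a bootstrap token is available, yet the request reaches the apiserver as system:anonymous.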
How to reproduce it (as minimally and precisely as possible):
- Set the RotateKubeletClientCertificate flag on kubelet and the feature gate on kube-controller-manager.
- Set the --experimental-cluster-signing-duration flag on the kube-controller-manager to a small duration.
- Start kubelet with a bootstrap.kubeconfig file containing a token (one that is present in the token authentication file passed to the apiserver). Kubelet bootstraps successfully and is Ready.
- Stop kubelet before it has attempted to renew its certificate.
- Wait until kubelet's certificate has expired.
- Restart kubelet.
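To estimate how long to wait in the "wait until expired" step, a small helper like the one below may be enough. It is a sketch under the assumption that the certificate expires exactly one signing duration after it was issued; the certificate's real notAfter field is authoritative, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def seconds_until_expiry(issued_at, signing_duration, now):
    """Seconds left until issued_at + signing_duration; 0.0 if already past."""
    remaining = (issued_at + signing_duration) - now
    return max(0.0, remaining.total_seconds())
```

For example, with a 24h signing duration and a certificate issued one hour ago, the helper reports 23 hours (82800 seconds) left before the restart in the last step would hit the expired-certificate path.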
Anything else we need to know?:
Let me know if there are more logs and information that would be useful. Thanks a lot!
Environment:
- Kubernetes version (use kubectl version): 1.10.5
- Cloud provider or hardware configuration: Azure
- OS (e.g. from /etc/os-release): CentOS 7.4
- Kernel (e.g. uname -a): 3.10.0-693.11.6.el7.x86_64
- Install tools: custom
- Others: