kubelet ignores updated etcd.yaml and monitors only etcd.yaml.backup #129364

Open
rad1k4l opened this issue Dec 22, 2024 · 9 comments
Labels
kind/documentation Categorizes issue or PR as related to documentation. kind/support Categorizes issue or PR as a support question. sig/node Categorizes an issue or PR as relevant to SIG Node. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@rad1k4l

rad1k4l commented Dec 22, 2024

What happened?

During the kubeadm upgrade process, kubeadm creates a backup of the existing etcd manifest (etcd.yaml.backup) and updates the etcd.yaml manifest to a newer version (e.g., from etcd 3.4.13-0 to 3.5.16-0). However, post-upgrade, the kubelet appears to ignore the updated etcd.yaml and continues to monitor and apply changes only to etcd.yaml.backup. This behavior prevents the etcd cluster from upgrading, causing the kubeadm upgrade process to fail.

What did you expect to happen?

After initiating the kubeadm upgrade:

  • kubeadm should update the etcd.yaml manifest to the new etcd version (3.5.16-0).
  • kubelet should detect changes in etcd.yaml, apply the updated configuration, and successfully upgrade the etcd cluster to version 3.5.16-0.
  • The etcd.yaml.backup should remain as a backup and kubelet should continue to monitor only the primary etcd.yaml manifest.

How can we reproduce it (as minimally and precisely as possible)?

There are two ways to reproduce the issue: an automated upgrade via kubeadm, or a manual update of the etcd manifest.

Automated Upgrade via kubeadm:

  • Attempt to upgrade the cluster using kubeadm to upgrade etcd to version 3.5.16-0:

kubeadm upgrade apply v1.32.0

  • kubeadm creates etcd.yaml.backup and updates etcd.yaml to version 3.5.16-0.
  • kubelet ignores etcd.yaml (version 3.5.16-0) and monitors only etcd.yaml.backup (version 3.4.13-0), preventing the etcd upgrade.

Manual:

  • Create a backup of the existing etcd.yaml file:
    cp /etc/kubernetes/manifests/etcd.yaml /etc/kubernetes/manifests/etcd.yaml.backup

  • Open etcd.yaml in a text editor and update the etcd image version from 3.4.13-0 to 3.5.16-0.

  • Restart the kubelet service:
    systemctl restart kubelet
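
The manual steps above leave two files in the static Pod directory. A minimal sketch for spotting such extra files follows. It assumes the kubelet reads every non-hidden file in that directory regardless of extension, which is consistent with the behavior reported here but is not confirmed in this thread. `list_stray_manifests` is a hypothetical helper name, not part of kubeadm or the kubelet.

```shell
# list_stray_manifests DIR
# Print every non-hidden regular file in DIR whose name does not end in
# .yaml, .yml, or .json. Such files are candidates for being picked up
# as unintended static Pod manifests (assumption: the kubelet does not
# filter by file extension).
list_stray_manifests() {
    dir="$1"
    for f in "$dir"/*; do
        # Skip directories and the unexpanded glob when DIR is empty or missing.
        [ -f "$f" ] || continue
        case "$f" in
            *.yaml|*.yml|*.json) ;;
            *) printf '%s\n' "$f" ;;
        esac
    done
}
```

Running `list_stray_manifests /etc/kubernetes/manifests` after the `cp` step above would list `/etc/kubernetes/manifests/etcd.yaml.backup`.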

Anything else we need to know?

No response

Kubernetes version

$ kubectl version
Client Version: v1.32.0
Kustomize Version: v5.5.0
Server Version: v1.31.4

Cloud provider

On-premises bare-metal servers

OS version

# On Linux:
$ cat /etc/os-release
# paste output here
$ uname -a
# paste output here

# On Windows:
C:\> wmic os get Caption, Version, BuildNumber, OSArchitecture
# paste output here

Install tools

Container runtime (CRI) and version (if applicable)

containerd

Related plugins (CNI, CSI, ...) and versions (if applicable)

@rad1k4l rad1k4l added the kind/bug Categorizes issue or PR as related to a bug. label Dec 22, 2024
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Dec 22, 2024
@rad1k4l
Author

rad1k4l commented Dec 22, 2024

/sig node
/sig cluster-lifecycle

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Dec 22, 2024
@rad1k4l
Author

rad1k4l commented Dec 22, 2024

/remove-kind node

@k8s-ci-robot
Contributor

@rad1k4l: Those labels are not set on the issue: kind/node

In response to this:

/remove-kind node

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@pacoxu
Member

pacoxu commented Dec 23, 2024

/remove-sig cluster-lifecycle
As kubernetes/kubeadm#3136 (comment) pointed out, the file is not created by kubeadm.

You should find out who created the file. (That is the root cause of your problem)

/kind feature
(or support)
/remove-kind bug
Duplicate of #105684.
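
If the stray backup file is indeed the cause, one possible workaround is to keep backups outside the manifest directory. The sketch below assumes that moving the file away is enough for the kubelet to stop considering it, which this thread does not verify. `relocate_backup` and the destination directory in the usage note are hypothetical.

```shell
# relocate_backup FILE DEST_DIR
# Move FILE out of the static Pod manifest directory into DEST_DIR,
# creating DEST_DIR if needed. Returns non-zero if FILE does not exist.
relocate_backup() {
    src="$1"
    dest_dir="$2"
    [ -f "$src" ] || { echo "no such file: $src" >&2; return 1; }
    mkdir -p "$dest_dir" || return 1
    mv -- "$src" "$dest_dir"/
}
```

For example: `relocate_backup /etc/kubernetes/manifests/etcd.yaml.backup /root/manifest-backups`, where `/root/manifest-backups` is only a placeholder path.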

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. and removed sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. kind/bug Categorizes issue or PR as related to a bug. labels Dec 23, 2024
@sftim
Contributor

sftim commented Dec 23, 2024

I agree, it's either a support issue, or something we can document as current behavior (there is an existing feature request, albeit declined).

/kind support

Want to get this documented @rad1k4l? You can request that via k/website issues.

@k8s-ci-robot k8s-ci-robot added the kind/support Categorizes issue or PR as a support question. label Dec 23, 2024
@sftim
Contributor

sftim commented Dec 23, 2024

/remove-kind feature

@k8s-ci-robot k8s-ci-robot removed the kind/feature Categorizes issue or PR as related to a new feature. label Dec 23, 2024
@sftim
Contributor

sftim commented Dec 23, 2024

/retitle kubelet ignores updated etcd.yaml and monitors only etcd.yaml.backup

@k8s-ci-robot k8s-ci-robot changed the title [BUG] Kubelet ignores updated etcd.yaml and monitors only etcd.yaml.backup kubelet ignores updated etcd.yaml and monitors only etcd.yaml.backup Dec 23, 2024
@kannon92
Contributor

/triage accepted
/kind documentation

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. kind/documentation Categorizes issue or PR as related to documentation. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Dec 23, 2024
@kannon92
Contributor

Is it ignored because the name of the pod is the same?
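
Whether the two files declare the same Pod name can be checked on the node. This is a rough sketch, assuming each file is a single-document Pod manifest with `metadata:` at the top level and no key called `name` nested inside `metadata` before the Pod's own name. `pod_name` is a hypothetical helper, and the check says nothing about how the kubelet resolves the conflict.

```shell
# pod_name FILE
# Print the first "name:" value found inside the top-level "metadata:"
# block of a YAML manifest. This is a line-based heuristic, not a YAML parser.
pod_name() {
    awk '
        /^metadata:/ { in_meta = 1; next }   # entering the metadata block
        /^[^ \t#]/   { in_meta = 0 }         # any other top-level key ends it
        in_meta && /^[ \t]+name:/ { print $2; exit }
    ' "$1"
}
```

Comparing `pod_name /etc/kubernetes/manifests/etcd.yaml` with `pod_name /etc/kubernetes/manifests/etcd.yaml.backup` shows whether both manifests claim the same name. After the manual `cp` step in the report they would, because only the image tag was edited.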

@kannon92 kannon92 moved this from Triage to Triaged in SIG Node Bugs Dec 23, 2024
Projects
Status: Triaged
Development

No branches or pull requests

5 participants