apiserver healthz should check etcd override endpoints #129417

Open
mengqiy opened this issue Dec 28, 2024 · 3 comments · May be fixed by #129438
Labels
kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery.

Comments

@mengqiy
Member

mengqiy commented Dec 28, 2024

What happened?

It seems the apiserver will fail to bootstrap if the etcd override endpoint is not healthy.
But after bootstrap completes, if the etcd override endpoint becomes unhealthy, the apiserver health check will still report OK, while kubectl get cs will report that the etcd override endpoint is not healthy.

What did you expect to happen?

The apiserver health check should report unhealthy when an etcd override endpoint is unhealthy.

How can we reproduce it (as minimally and precisely as possible)?

Run two etcd clusters: one for events and one for all other resources. Then configure the apiserver to use the events etcd via the --etcd-servers-overrides flag. After the apiserver completes bootstrap, kill the events etcd.
The apiserver health check will still report OK.
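
For reference, the `--etcd-servers-overrides` value is a comma-separated list of `group/resource#servers` entries, where the servers are semicolon-separated URLs. The sketch below shows how such a value breaks down into the per-resource endpoints a health check would need to probe. `parseOverrides` is a hypothetical helper, not the apiserver's own parser, and the addresses are made up for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// parseOverrides splits a --etcd-servers-overrides value into a map from
// "group/resource" to its etcd server URLs. Entries without a '#' or
// without any servers are skipped.
func parseOverrides(flagValue string) map[string][]string {
	overrides := map[string][]string{}
	for _, entry := range strings.Split(flagValue, ",") {
		resource, servers, ok := strings.Cut(entry, "#")
		if !ok || servers == "" {
			continue
		}
		overrides[resource] = strings.Split(servers, ";")
	}
	return overrides
}

func main() {
	// Hypothetical addresses for the dedicated events etcd cluster.
	// Events live in the core group, so the group part is empty.
	value := "/events#http://127.0.0.1:3379;http://127.0.0.1:3479"
	for resource, servers := range parseOverrides(value) {
		fmt.Println(resource, servers)
	}
}
```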

Anything else we need to know?

No response

Kubernetes version

1.32. I think I have seen this issue in older versions as well.

Cloud provider

N/A

OS version

# On Linux:
$ cat /etc/os-release
# paste output here
$ uname -a
# paste output here

# On Windows:
C:\> wmic os get Caption, Version, BuildNumber, OSArchitecture
# paste output here

Install tools

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

@mengqiy mengqiy added the kind/bug Categorizes issue or PR as related to a bug. label Dec 28, 2024
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Dec 28, 2024
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@mengqiy
Member Author

mengqiy commented Dec 28, 2024

/sig api-machinery

@k8s-ci-robot k8s-ci-robot added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Dec 28, 2024
@mengqiy mengqiy changed the title from "apiserver healthz should chek" to "apiserver healthz should check etcd override endpoints" Dec 30, 2024
@pacoxu
Member

pacoxu commented Dec 31, 2024

(base) ➜  ~ kubectl get --raw='/healthz?verbose'

[+]ping ok
[+]log ok
[+]etcd ok
...
healthz check passed

This only checks the default etcd. We could add a check for the override endpoints to /healthz?verbose

(base) ➜  ~ kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS      MESSAGE   ERROR
controller-manager   Healthy     ok
scheduler            Healthy     ok
etcd-0               Healthy     ok
etcd-1               Unhealthy             context deadline exceeded
(base) ➜  ~ kubectl get --raw='/healthz?verbose'

[+]ping ok
[+]log ok
[+]etcd ok
...
healthz check passed
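
The mismatch above can also be spotted programmatically. The following is a rough sketch that parses the verbose output into per-check results, assuming each check line has the form `[+]name ok` or `[-]name failed: ...`. `parseVerbose` is a hypothetical helper, and the sample body is abbreviated from the output shown in this comment.

```go
package main

import (
	"fmt"
	"strings"
)

// parseVerbose maps each check name in a /healthz?verbose body to whether
// it passed. Lines that do not start with "[+]" or "[-]" style markers
// (such as the trailing summary line) are ignored.
func parseVerbose(body string) map[string]bool {
	results := map[string]bool{}
	for _, line := range strings.Split(body, "\n") {
		line = strings.TrimSpace(line)
		if len(line) < 3 || line[0] != '[' || line[2] != ']' {
			continue
		}
		name, _, _ := strings.Cut(line[3:], " ")
		results[name] = line[1] == '+'
	}
	return results
}

func main() {
	sample := "[+]ping ok\n[+]log ok\n[+]etcd ok\nhealthz check passed\n"
	results := parseVerbose(sample)
	// Count the checks whose name mentions etcd. With the output above there
	// is a single one, so the override endpoint is not represented.
	etcdChecks := 0
	for name := range results {
		if strings.Contains(name, "etcd") {
			etcdChecks++
		}
	}
	fmt.Println("etcd-related checks:", etcdChecks)
}
```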
