Division-by-zero in Horizontal Workload Autoscaler #128847

jm-franc · 2024-11-18T18:40:41Z

Which component are you using?:

Horizontal workload autoscaler.

What version of the component are you using?:

Not relevant.

What k8s version are you using (kubectl version)?:

kubectl version Output

$ kubectl version
Client Version: v1.30.6-dispatcher
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.30.5-gke.1443001

What environment is this in?:

Not relevant.

What did you expect to happen?:

When the horizontal autoscaler computes the expected number of replicas, it uses the following formula:

usageRatio := float64(usage) / (float64(targetUsagePerPod) * float64(statusReplicas))

(This formula (or a variation of it) appears a couple of times in this file.)

It is possible for statusReplicas (i.e. the status.replicas field of the /scale subresource) to equal zero (e.g. if a user kills pods), which leads to a division by zero.

The Go spec leaves it implementation-specific whether a floating-point division by zero triggers a run-time panic, although most implementations return +Inf (which leads to the correct behaviour) or NaN (if usage == 0, in which case the function would return a negative number of replicas).

What happened instead?:

This problem could lead to a panic or to incorrect behaviour. It could only happen as a result of a race condition, in the unlikely situation where status.replicas == 0. I'm not sure of its practical significance.

How to reproduce it (as minimally and precisely as possible):

This issue only appears as a result of a race condition. I couldn't reproduce it.


gjtempleton · 2024-11-18T23:22:13Z

Hey @jm-franc, thanks for raising the issue. I'm going to move it over to k/k, as that hosts the PodAutoscaler code.

/transfer-issue kubernetes

k8s-ci-robot · 2024-11-18T23:22:25Z

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

gjtempleton · 2024-11-18T23:23:57Z

/sig autoscaling

jm-franc added the kind/bug label Nov 18, 2024

k8s-ci-robot transferred this issue from kubernetes/autoscaler Nov 18, 2024

k8s-ci-robot added the needs-sig and needs-triage labels Nov 18, 2024

k8s-ci-robot added the sig/autoscaling label and removed the needs-sig label Nov 18, 2024
