HPA: fixed wrong count for target replicas calculations. #34955
Conversation
CC @DirectXMan12
Jenkins unit/integration failed for commit 47451cb. Full PR test history. The magic incantation to run this job again is
One small request on a comment (it should reference a bug number so that future readers can look up the problem), but otherwise LGTM 👍
@@ -329,6 +334,12 @@ func (a *HorizontalController) reconcileAutoscaler(hpa *autoscaling.HorizontalPo
    if desiredReplicas > hpa.Spec.MaxReplicas {
        desiredReplicas = hpa.Spec.MaxReplicas
    }

    // Do not upscale too much to prevent incorrect rapid increase of the number of master replicas caused by
    // bogus CPU usage report from heapster/kubelet.
can you add a bug number here so that future readers recognize what we're talking about (I'm assuming this is to avoid the persistent issues with overflow/astronomically high CPU)?
done
    }

    return currentReplicas, &utilization, timestamp, nil
    desiredReplicas := math.Ceil(usageRatio * float64(numRunningPods))
this could effectively cause a scale down, even if it looks like a scale up (if desiredReplicas * numRunningPods < (numRunningPods + numPendingPods))? Should we deal with that, or at least log or note it somewhere?
I think it's fine: pods that are not running are not consuming CPU. We are logging the number of running pods in the DesiredReplicasComputed event below, so it is clear what we are doing.
    desiredReplicas := math.Ceil(usageRatio * float64(numRunningPods))

    a.eventRecorder.Eventf(hpa, api.EventTypeNormal, "DesiredReplicasComputed",
        "Computed the desired num of replicas: %d, on a base of %d reports (avgCPUutil: %d)", int32(desiredReplicas), numRunningPods, utilization)
Also add the current number of created (possibly not started) replicas.
done
    // Do not upscale too much to prevent incorrect rapid increase of the number of master replicas caused by
    // bogus CPU usage report from heapster/kubelet.
    if desiredReplicas > int32(math.Max(2.0*float64(currentReplicas), 4.0)) {
Make 2.0 and 4.0 a flag.
done (made consts like other HPA parameters)
    // Do not upscale too much to prevent incorrect rapid increase of the number of master replicas caused by
    // bogus CPU usage report from heapster/kubelet.
    if desiredReplicas > int32(math.Max(2.0*float64(currentReplicas), 4.0)) {
Also put the formula into a variable - no need to calculate it twice (that's room for mistakes).
done
Force-pushed from 47451cb to 678c291
Comments applied, PTAL
@@ -48,8 +48,15 @@ const (

    HpaCustomMetricsTargetAnnotationName = "alpha/target.custom-metrics.podautoscaler.kubernetes.io"
    HpaCustomMetricsStatusAnnotationName = "alpha/status.custom-metrics.podautoscaler.kubernetes.io"

    scaleUpLimitFactor  = 2
    scaleUpLimitMinimum = 1
In the previous version we had 4 here. 1 seems too little: with this setup, if 1 replica is present we could only scale it up to 2. We should be able to get to 3-4 immediately.
sorry, a typo :)
HPA: fixed wrong count for target replicas calculations (kubernetes#34821).
Force-pushed from 678c291 to f495e73
LGTM
Automatic merge from submit-queue
Commit found in the "release-1.4" branch appears to be this PR. Removing the "cherrypick-candidate" label. If this is an error, find help to get your PR picked.
…ck-of-#34955-origin-release-1.4
Automatic merge from submit-queue
Automated cherry pick of kubernetes#34955
Cherry pick of kubernetes#34955 on release-1.4.
kubernetes#34955: HPA: fixed wrong count for target replicas calculations.
HPA: fixed wrong count for target replicas calculations (#34821).