
TestAutoscaleSustaining scales to 8 instead of 10 #13679

Merged: 4 commits merged into knative:main on Feb 23, 2023

Conversation

mgencur (Contributor) commented Feb 7, 2023

There's an indication that the client node sending the requests is not able to generate enough load to actually scale the ksvc to 10. The test runs the "vegeta" tool and configures the number of workers that send requests. However, when the client machine doesn't have enough resources (possibly just 2 vCPUs, as with KinD) or other tests are running in parallel, the worker threads cannot generate enough traffic to scale the ksvc as desired.
We ran into this issue downstream as well; there were no errors on the Knative (cluster) side. Increasing the CPU resources for the client machine resolved the issue, and it doesn't happen anymore.
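For reference, a minimal sketch of how a vegeta-based load generator configures parallel workers (illustrative only, assuming the vegeta Go library; this is not the exact test helper used in knative/serving, and the target URL is hypothetical):

```go
package main

import (
	"time"

	vegeta "github.com/tsenart/vegeta/v12/lib"
)

func main() {
	// Each worker is a goroutine issuing requests. On a small client machine
	// (e.g. a 2-vCPU KinD host), there may not be enough CPU to keep all
	// workers busy, so the generated load stays below the intended rate.
	targeter := vegeta.NewStaticTargeter(vegeta.Target{
		Method: "GET",
		URL:    "http://example-ksvc.default.example.com", // hypothetical ksvc URL
	})
	attacker := vegeta.NewAttacker(vegeta.Workers(10))
	rate := vegeta.Rate{Freq: 100, Per: time.Second}

	var metrics vegeta.Metrics
	for res := range attacker.Attack(targeter, rate, 2*time.Minute, "sustain") {
		metrics.Add(res)
	}
	metrics.Close()
}
```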

Fixes #13049

Proposed Changes

  • Decrease the target scale for TestAutoscaleSustaining. Choose a reasonable default that works in KinD, in the Prow cluster, and also when running the test in a container.

I considered keeping the default of 10 for Prow and using 8 only when GOMAXPROCS is lower than 10, but that is complicated and unnecessary. Also, GOMAXPROCS by itself doesn't work well in a container, where it reports the number of CPUs of the whole node rather than the container's CPU limit.
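As a quick illustration of why GOMAXPROCS is an unreliable signal inside a container (a sketch only; without cgroup-aware tuning such as uber-go/automaxprocs, both values below typically reflect the host's CPU count):

```go
package main

import (
	"fmt"
	"runtime"
)

func main() {
	// Inside a container, these usually report the node's CPU count,
	// not the container's cgroup CPU limit, so they cannot reliably
	// detect a resource-constrained client environment.
	fmt.Println("NumCPU:     ", runtime.NumCPU())
	fmt.Println("GOMAXPROCS: ", runtime.GOMAXPROCS(0)) // 0 queries the value without changing it
}
```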

The KinD tests ran 3 times in this PR without issues.

Release Note


There's an indication that the client node that sends the requests is not
able to generate enough load to actually scale the ksvc to 10. Trying to
find a lower bar for KinD tests.
@knative-prow bot added the do-not-merge/work-in-progress and area/test-and-release labels (Feb 7, 2023).
@knative-prow bot added the size/XS label (Feb 7, 2023).
codecov bot commented Feb 7, 2023

Codecov Report

Base: 86.24% // Head: 86.21% // Decreases project coverage by 0.03% ⚠️

Coverage data is based on head (b13a8fe) compared to base (0639c5f).
Patch has no changes to coverable lines.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #13679      +/-   ##
==========================================
- Coverage   86.24%   86.21%   -0.03%     
==========================================
  Files         197      197              
  Lines       14783    14774       -9     
==========================================
- Hits        12749    12737      -12     
- Misses       1733     1735       +2     
- Partials      301      302       +1     
Impacted Files Coverage Δ
pkg/reconciler/configuration/configuration.go 82.93% <0.00%> (-1.43%) ⬇️
pkg/reconciler/route/resources/ingress.go 94.80% <0.00%> (-0.20%) ⬇️


mgencur (Contributor, Author) commented Feb 7, 2023

Tests passed, run 1

@mgencur changed the title from "[WIP] TestAutoscaleSustaining scales to 8 instead of 10" to "TestAutoscaleSustaining scales to 8 instead of 10" (Feb 8, 2023).
@knative-prow bot removed the do-not-merge/work-in-progress label (Feb 8, 2023).
@@ -143,7 +138,7 @@ func TestAutoscaleSustaining(t *testing.T) {
 	}))
 	test.EnsureTearDown(t, ctx.Clients(), ctx.Names())
 
-	AssertAutoscaleUpToNumPods(ctx, 1, 10, time.After(2*time.Minute), false /* quick */)
+	AssertAutoscaleUpToNumPods(ctx, 1, 8, time.After(2*time.Minute), false /* quick */)
skonto (Contributor) commented Feb 8, 2023

If flakiness affects only the exponential case (not linear), we could adjust the target only for that case, to distinguish between the two algorithms under the "same" generated traffic. That assumes the flakiness is not the result of a bug in the statistics aggregation or some other failure. Thinking out loud about the latter: AFAIK the exponential algorithm favors the latest values in the window statistics, so I would expect it to catch up faster (compared to linear) if enough traffic is there? 🤔 cc @dprotaso @psschwei
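For intuition, a rough sketch (not the Knative autoscaler's actual aggregation code) of why a weighted-exponential average over the metric window tracks recent traffic changes faster than a plain linear mean; the window values, decay factor, and function names are made up for illustration:

```go
package main

import "fmt"

// linearAvg is the plain mean over the whole window.
func linearAvg(window []float64) float64 {
	sum := 0.0
	for _, v := range window {
		sum += v
	}
	return sum / float64(len(window))
}

// expWeightedAvg weights newer buckets more heavily: each older bucket's
// weight is multiplied by decay (0 < decay < 1).
func expWeightedAvg(window []float64, decay float64) float64 {
	weight, sum, totalWeight := 1.0, 0.0, 0.0
	for i := len(window) - 1; i >= 0; i-- { // newest bucket is last
		sum += window[i] * weight
		totalWeight += weight
		weight *= decay
	}
	return sum / totalWeight
}

func main() {
	// Traffic ramps up only in the most recent buckets.
	window := []float64{2, 2, 2, 2, 10, 10}
	fmt.Printf("linear:      %.2f\n", linearAvg(window))           // ~4.67, slow to react
	fmt.Printf("exponential: %.2f\n", expWeightedAvg(window, 0.5)) // ~8.10, closer to the new level
}
```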

mgencur (Contributor, Author) commented Feb 8, 2023

Hmm, I'd probably prefer setting the same target for both, because we're adjusting this for an environment that cannot achieve what the test requests (the test asks for a certain number of workers running in parallel).
For environments that can satisfy the test requirements, a different target isn't necessary, so it would complicate the test settings (and code) for the more common case as well. But I'm not sure.

mgencur (Contributor, Author) commented Feb 8, 2023

> afaik exponential favors latest values in the window statistics

In that case, would frequent changes in traffic also cause more frequent scaling up/down with the exponential algorithm than with the linear one? Perhaps insufficient resources on the client side make the traffic less stable, and the exponential algorithm would then react more quickly, scaling the ksvc up and down.

Contributor commented:

I wonder if it would be better to set different targets depending on environment, using the short flag to distinguish the constrained one (i.e. something like `targetPods := 10; if testing.Short() { targetPods = 8 }`)? That said, I'm not sure if there was a specific reason why we chose 10 to begin with or if it was just a nice round number.

(The weird thing is that these tests used to work on Kind too. If I remember correctly, it was right around the time they switched over to the systemd cgroups driver that things stopped working. But that's neither here nor there...)
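A minimal sketch of the pattern suggested above (illustrative only; `targetPods` is a hypothetical variable and the exact signature of AssertAutoscaleUpToNumPods is assumed, so this is not the change that was merged):

```go
	// Default target for full-size environments such as the Prow cluster.
	targetPods := 10.0
	if testing.Short() {
		// -short marks a resource-constrained environment such as KinD.
		targetPods = 8.0
	}
	AssertAutoscaleUpToNumPods(ctx, 1, targetPods, time.After(2*time.Minute), false /* quick */)
```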

mgencur (Contributor, Author) commented:

> I wonder if it would be better to set different targets depending on environment, using the short flag to distinguish the constrained one (i.e. something like `targetPods := 10; if testing.Short() { targetPods = 8 }`)?

That's an option too, but 10 versus 8 doesn't make a meaningful difference to me. As you said, it's not clear where the value 10 came from, so I chose one common value (8) to simplify the test and keep it identical across all environments.

Contributor commented:

I think it's probably fine to just go with 8 (I don't know what scaling up two more pods would really tell us, other than we can handle double digits), but will give @dprotaso a chance to weigh in in case he has more context here.

Can you also drop the -short flag here:

? We added that just for this test, so we may as well get rid of it if we're fixing the issue that caused us to add it.

Contributor commented:

@dprotaso gentle ping.

psschwei (Contributor) left a comment

Sorry, should've gotten back to this one sooner...

/lgtm
/approve

@knative-prow bot added the lgtm label (Feb 23, 2023).
knative-prow bot commented Feb 23, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mgencur, psschwei

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow bot added the approved label (Feb 23, 2023).
@knative-prow bot merged commit 708374e into knative:main on Feb 23, 2023.
Labels
  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • area/test-and-release: Flags unit/e2e/conformance/perf test issues for product features.
  • lgtm: Indicates that a PR is ready to be merged.
  • size/XS: Denotes a PR that changes 0-9 lines, ignoring generated files.

Successfully merging this pull request may close these issues.

[flaky] TestAutoscaleSustaining/aggregation-weighted-exponential is flakey in kind