Perf test: WebSockets (http4s) #3520

Merged: 11 commits from perf-test-websockets into master, Feb 20, 2024

Conversation

kciesielski
Member

This PR adds an endpoint for performance tests of WebSockets: ws://127.0.0.1:8080/ws/ts, which returns the current timestamp.
A dedicated simulation, WebSocketsSimulation, can then be executed; it builds a histogram reporting latency percentiles.
This simulation cannot be configured and run the standard way (via PerfTestSuiteRunner), so the process is described in perfTests/README.md.

Based on https://github.com/kamilkloch/websocket-benchmark.
The WS endpoint will be added to other servers in separate PRs.
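
For context, the endpoint description has roughly this shape in tapir; this is a hand-written sketch, not the PR's exact code:

```scala
import cats.effect.IO
import fs2.Pipe
import sttp.capabilities.WebSockets
import sttp.capabilities.fs2.Fs2Streams
import sttp.tapir._

// Hypothetical shape of the /ws/ts endpoint: the client sends Long frames,
// the server answers with Long timestamps over a WebSocket.
val wsEndpoint: PublicEndpoint[Unit, Unit, Pipe[IO, Long, Long], Fs2Streams[IO] with WebSockets] =
  endpoint.get
    .in("ws" / "ts")
    .out(webSocketBody[Long, CodecFormat.TextPlain, Long, CodecFormat.TextPlain](Fs2Streams[IO]))
```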

```scala
.toWebSocketRoutes(
  wsEndpoint.serverLogicSuccess(_ =>
    IO.pure { (in: Stream[IO, Long]) =>
      Http4sCommon.wsResponseStream.evalMap(_ => IO.realTime.map(_.toMillis)).concurrently(in.as(()))
    }
  )
)
```
Member

Is this the same as the http4s version? Why the `concurrently`?

Second question: isn't `IO.realTime` "slow" by itself, especially when called under stress? To better isolate the pure WS performance, would it make a difference if we just returned `IO.pure(1823)`?

Member Author

This has been entirely copied from https://github.com/kamilkloch/websocket-benchmark/

> Is this the same as the http4s version? Why the `concurrently`?

The behavior of the stream should be the same. The vanilla implementation requires a `send: Stream[F, WebSocketFrame]` and a `receive: Pipe[F, WebSocketFrame, Unit]`.
The tapir logic has to return a `Stream[F, Long]` based on an input stream, so it may need this additional `concurrently` operator to correctly make the result stream "run after the input stream is read".
I wanted to keep the implementation 1:1 with the base benchmark for a start.
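
For illustration, a rough sketch of the vanilla http4s shape referred to above, assuming `WebSocketBuilder2` and fs2's `awakeEvery` (not the benchmark's exact code):

```scala
import cats.effect.IO
import fs2.{Pipe, Stream}
import org.http4s.HttpRoutes
import org.http4s.dsl.io._
import org.http4s.server.websocket.WebSocketBuilder2
import org.http4s.websocket.WebSocketFrame
import scala.concurrent.duration._

def vanillaWsRoute(wsb: WebSocketBuilder2[IO]): HttpRoutes[IO] =
  HttpRoutes.of[IO] { case GET -> Root / "ws" / "ts" =>
    // send and receive are independent arguments here, so no `concurrently`
    // is needed to tie the output stream to the input stream
    val send: Stream[IO, WebSocketFrame] =
      Stream
        .awakeEvery[IO](100.millis)
        .evalMap(_ => IO.realTime.map(t => WebSocketFrame.Text(t.toMillis.toString)))
    val receive: Pipe[IO, WebSocketFrame, Unit] = _.void
    wsb.build(send, receive)
  }
```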

> second question

The returned timestamp is used to calculate latency on the client. I found that suspicious too, but if Kamil's tests result in latencies of a few milliseconds, then it might not be that expensive after all.

Member

Ok, so we ignore the input and produce timestamps as the output. But how do we calculate the latency from that? The scenario reads a single message, compares timestamps, and waits for a second, right? But isn't that skewed by any delay in establishing the web socket? That is, this assumes that the WS sends its timestamp and that the scenario ends its sleep at exactly the same time (to properly measure latency).

Member Author

My understanding is that the server returns its timestamp after the connection is established, so the client receives it immediately and compares it to its own timestamp. The await call is unclear to me; maybe it only means to wait at most that long, but I need to double-check.

Member Author

I found this example in the Gatling docs:

```scala
// expecting 2 messages
// 1st message will be validated against wsCheck1
// 2nd message will be validated against wsCheck2
// whole sequence must complete within 30 seconds
exec(ws("Send").sendText("hello")
  .await(30)(wsCheck1, wsCheck2))
```

It confirms what I suggested: the `await(1.second)` operator only means that the response check has to complete within 1 second.

Member

Ah ok... so we want to measure the total latency of: establishing a WS connection, sending one message, receiving one message? Why not simply do it on the client side: capture the timestamp, do the operations, and then get the local timestamp again?

Member Author

The connection part is not measured. After connecting, the client sends requests and does the checks in a loop. It looks like the intention was to measure the latency of only one direction: not the full request->response cycle, but just the server->client part.
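
A rough sketch of such a scenario in Gatling's Scala DSL (hypothetical names; the actual simulation lives in the PR and records latencies into a histogram):

```scala
import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._

// check that captures the server-side timestamp from the received text frame
val tsCheck = ws.checkTextMessage("ts").check(regex("(\\d+)").saveAs("serverTs"))

val scn = scenario("ws-latency")
  .exec(ws("Connect").connect("/ws/ts")) // connection time itself is not measured
  .repeat(60) {
    // each iteration: send a request, await the timestamp frame within 1 second
    exec(ws("Request ts").sendText("ping").await(1.second)(tsCheck))
  }
  .exec(ws("Close").close)
```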

Member Author

As far as I understand the response stream, it emits a timestamp every 100 ms. It looks like we're simply ignoring requests, emitting a timestamp every 100 ms, and measuring the time between sending it and receiving it on the client side.
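
In other words, the latency would be computed client-side roughly like this (assuming the server and client clocks are directly comparable, e.g. both run on the same host):

```scala
// serverTs: the millisecond timestamp carried in the received WS frame.
// The difference approximates the one-way server -> client latency.
def latencyMillis(serverTs: Long): Long = System.currentTimeMillis() - serverTs
```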

Member Author

I guess this entire scheme may be caused by Gatling's communication model: you can't simply connect and receive messages, you have to send a request and receive a response. The test is interested only in the "response generated -> response received" part, and the responses are generated in an independent stream.

@kciesielski kciesielski requested a review from adamw February 19, 2024 15:47
@kciesielski kciesielski marked this pull request as ready for review February 19, 2024 15:47
```diff
@@ -516,7 +516,7 @@ lazy val perfTests: ProjectMatrix = (projectMatrix in file("perf-tests"))
       "nl.grons" %% "metrics4-scala" % Versions.metrics4Scala % Test,
       "com.lihaoyi" %% "scalatags" % Versions.scalaTags % Test,
       // Needs to match version used by Gatling
-      "com.github.scopt" %% "scopt" % "4.1.0",
+      "com.github.scopt" %% "scopt" % "3.7.1",
```
Member

pin in steward's config?

Member Author

It's pinned, but I pinned it only after Steward had already updated it.
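
For reference, such a pin in .scala-steward.conf might look like this (a sketch of Scala Steward's updates.pin syntax, not necessarily this repo's exact entry):

```
updates.pin = [
  { groupId = "com.github.scopt", artifactId = "scopt", version = "3.7." }
]
```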

```scala
object Http4sCommon {
  // WebSocket response is returned with a lag, so that we can have more concurrent users talking to the server.
  // This lag is not relevant for measurements, because the server returns a timestamp after having a response ready to send back,
  // so the client can measure only the latency of the server stack handling the response.
  // ...
}
```
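
A minimal sketch of what such a lagged response stream could look like, assuming fs2's `awakeEvery` (the actual definition is in the PR):

```scala
import cats.effect.IO
import fs2.Stream
import scala.concurrent.duration._

// a tick every 100 ms; each tick is then mapped to a fresh server timestamp
val wsResponseStream: Stream[IO, Unit] = Stream.awakeEvery[IO](100.millis).void
```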
Member

ah now it's clear, thanks :)

@kciesielski kciesielski merged commit 8ef0b68 into master Feb 20, 2024
28 checks passed
@kciesielski kciesielski deleted the perf-test-websockets branch February 20, 2024 08:02