Perf test: WebSockets (http4s) #3520

Merged: 11 commits from perf-test-websockets into master, Feb 20, 2024

Conversation

kciesielski
Member

This PR adds an endpoint for performance tests of WebSockets: ws://127.0.0.1:8080/ws/ts, which returns the current timestamp.
A dedicated simulation, WebSocketsSimulation, can then be executed; it builds a histogram reporting latency percentiles.
This simulation cannot be configured and run the standard way (via PerfTestSuiteRunner), so the process is described in perfTests/README.md.

Based on https://github.com/kamilkloch/websocket-benchmark.
The WS endpoint will be added to other servers in separate PRs.
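
For context, the endpoint description has roughly this shape in tapir; this is a hand-written sketch, not the PR's exact code:

```scala
import cats.effect.IO
import fs2.Pipe
import sttp.capabilities.WebSockets
import sttp.capabilities.fs2.Fs2Streams
import sttp.tapir._

// Hypothetical shape of the /ws/ts endpoint: the client sends Long frames,
// the server answers with Long timestamps over a WebSocket.
val wsEndpoint: PublicEndpoint[Unit, Unit, Pipe[IO, Long, Long], Fs2Streams[IO] with WebSockets] =
  endpoint.get
    .in("ws" / "ts")
    .out(webSocketBody[Long, CodecFormat.TextPlain, Long, CodecFormat.TextPlain](Fs2Streams[IO]))
```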

```scala
.toWebSocketRoutes(
  wsEndpoint.serverLogicSuccess(_ =>
    IO.pure { (in: Stream[IO, Long]) =>
      Http4sCommon.wsResponseStream.evalMap(_ => IO.realTime.map(_.toMillis)).concurrently(in.as(()))
    }
  )
)
```
Member

Is this the same as the http4s version? Why the `concurrently`?

Second question: isn't `IO.realTime` "slow" by itself, especially when called under stress? To better isolate the pure WS performance, would it make a difference if we just returned `IO.pure(1823)`?

Member Author

This has been entirely copied from https://github.com/kamilkloch/websocket-benchmark/

> Is this the same as the http4s version? Why the `concurrently`?

The behavior of the stream should be the same. The vanilla implementation requires a `send: Stream[F, WebSocketFrame]` and a `receive: Pipe[F, WebSocketFrame, Unit]`.
The tapir logic has to return a `Stream[F, Long]` based on an input stream, so it may need this additional `concurrently` operator to correctly make the result stream "run after the input stream is read".
I wanted to keep the implementation 1:1 with the base benchmark for a start.
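
For illustration, a rough sketch of the vanilla http4s shape referred to above, assuming `WebSocketBuilder2` and fs2's `awakeEvery` (not the benchmark's exact code):

```scala
import cats.effect.IO
import fs2.{Pipe, Stream}
import org.http4s.HttpRoutes
import org.http4s.dsl.io._
import org.http4s.server.websocket.WebSocketBuilder2
import org.http4s.websocket.WebSocketFrame
import scala.concurrent.duration._

def vanillaWsRoute(wsb: WebSocketBuilder2[IO]): HttpRoutes[IO] =
  HttpRoutes.of[IO] { case GET -> Root / "ws" / "ts" =>
    // send and receive are independent arguments here, so no `concurrently`
    // is needed to tie the output stream to the input stream
    val send: Stream[IO, WebSocketFrame] =
      Stream
        .awakeEvery[IO](100.millis)
        .evalMap(_ => IO.realTime.map(t => WebSocketFrame.Text(t.toMillis.toString)))
    val receive: Pipe[IO, WebSocketFrame, Unit] = _.void
    wsb.build(send, receive)
  }
```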

> second question

The returned timestamp is used to calculate latency on the client. I found that suspicious too, but if Kamil's tests result in latencies of a few milliseconds, then it might not be that expensive after all.

Member

Ok, so we ignore the input and produce timestamps as the output. But how do we calculate the latency from that? The scenario reads a single message, compares timestamps, and waits for a second, right? But isn't that skewed by any delay in establishing the web socket? That is, this assumes that the WS sends its timestamp and that the scenario ends its sleep at exactly the same time (to properly measure latency).

Member Author

My understanding is that the server returns its timestamp after the connection is established, so the client receives it immediately and compares it to its own timestamp. The await call is unclear to me; maybe it only means to wait at most that long, but I need to double-check.

Member Author

I found this example in the Gatling docs:

```scala
// expecting 2 messages
// 1st message will be validated against wsCheck1
// 2nd message will be validated against wsCheck2
// whole sequence must complete within 30 seconds
exec(ws("Send").sendText("hello")
  .await(30)(wsCheck1, wsCheck2))
```

It confirms what I suggested: the `await(1.second)` operator only means that the response check has to complete within 1 second.

Member

Ah ok... so we want to measure the total latency of: establishing a WS connection, sending one message, receiving one message? Why not simply do it on the client side: capture the timestamp, do the operations, and then get the local timestamp again?

Member Author

The connection part is not measured. After connecting, the client sends requests and does the checks in a loop. It looks like the intention was to measure the latency of only one direction: not the full request->response cycle, but just the server->client part.
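
A rough sketch of such a scenario in Gatling's Scala DSL (hypothetical names; the actual simulation lives in the PR and records latencies into a histogram):

```scala
import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._

// check that captures the server-side timestamp from the received text frame
val tsCheck = ws.checkTextMessage("ts").check(regex("(\\d+)").saveAs("serverTs"))

val scn = scenario("ws-latency")
  .exec(ws("Connect").connect("/ws/ts")) // connection time itself is not measured
  .repeat(60) {
    // each iteration: send a request, await the timestamp frame within 1 second
    exec(ws("Request ts").sendText("ping").await(1.second)(tsCheck))
  }
  .exec(ws("Close").close)
```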

Member Author

As far as I understand the response stream, it emits a timestamp every 100 ms. It looks like we're simply ignoring requests, emitting a timestamp every 100 ms, and measuring the time between sending it and receiving it on the client side.
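
In other words, the latency would be computed client-side roughly like this (assuming the server and client clocks are directly comparable, e.g. both run on the same host):

```scala
// serverTs: the millisecond timestamp carried in the received WS frame.
// The difference approximates the one-way server -> client latency.
def latencyMillis(serverTs: Long): Long = System.currentTimeMillis() - serverTs
```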

Member Author

I guess this entire scheme may be caused by Gatling's communication model: you can't simply connect and receive messages, you have to send a request and receive a response. The test is interested only in the "response generated -> response received" part, and the responses are generated in an independent stream.

@kciesielski kciesielski requested a review from adamw February 19, 2024 15:47
@kciesielski kciesielski marked this pull request as ready for review February 19, 2024 15:47
```diff
@@ -516,7 +516,7 @@ lazy val perfTests: ProjectMatrix = (projectMatrix in file("perf-tests"))
       "nl.grons" %% "metrics4-scala" % Versions.metrics4Scala % Test,
       "com.lihaoyi" %% "scalatags" % Versions.scalaTags % Test,
       // Needs to match version used by Gatling
-      "com.github.scopt" %% "scopt" % "4.1.0",
+      "com.github.scopt" %% "scopt" % "3.7.1",
```
Member

pin in steward's config?

Member Author

It's pinned, but I pinned it only after Steward had already updated it.
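
For reference, such a pin in .scala-steward.conf might look like this (a sketch of Scala Steward's updates.pin syntax, not necessarily this repo's exact entry):

```
updates.pin = [
  { groupId = "com.github.scopt", artifactId = "scopt", version = "3.7." }
]
```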

```scala
object Http4sCommon {
  // WebSocket response is returned with a lag, so that we can have more concurrent users talking to the server.
  // This lag is not relevant for measurements, because the server returns a timestamp after having a response ready to send back,
  // so the client can measure only the latency of the server stack handling the response.
  // ...
}
```
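
A minimal sketch of what such a lagged response stream could look like, assuming fs2's `awakeEvery` (the actual definition is in the PR):

```scala
import cats.effect.IO
import fs2.Stream
import scala.concurrent.duration._

// a tick every 100 ms; each tick is then mapped to a fresh server timestamp
val wsResponseStream: Stream[IO, Unit] = Stream.awakeEvery[IO](100.millis).void
```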
Member

ah now it's clear, thanks :)

@kciesielski kciesielski merged commit 8ef0b68 into master Feb 20, 2024
28 checks passed
@kciesielski kciesielski deleted the perf-test-websockets branch February 20, 2024 08:02