Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: make docs clearer for groupBy operator #24671 #32273

Merged
merged 1 commit into from
Jan 18, 2024

Conversation

dhirensr
Copy link
Contributor

@dhirensr dhirensr commented Jan 1, 2024

related to #24671

@dhirensr dhirensr force-pushed the issue_24671 branch 2 times, most recently from 9b8cd7b to c62fd60 Compare January 8, 2024 03:06
@dhirensr
Copy link
Contributor Author

dhirensr commented Jan 8, 2024

@johanandren : changed the text, can you please review again?

// #groupByWithAsync
Source.range(1, 10)
.groupBy(2, i -> i % 2 == 0) // create two sub-streams with odd and even numbers
.async()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is trickily enough not the right thing, .async() creates an async boundary between the graph so far and the next steps, to create a separate async island for each substream you would have to do .via(Flow.reduce(...).async())

Copy link
Contributor Author

@dhirensr dhirensr Jan 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah gotcha! I wasn't aware about this, and maybe I've been using it wrongly. I'll update the example.

@johanandren : just to confirm adding async between groupBy and flow is kind of redundant then and has no impact? wondering if it should even compile if it's redundant because it's misleading

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just groupBy().async would put the graph up to that point, so Source.range and groupBy grouping on one stream interpreter actor, and then all the downstreams and the substream merge and sink would run in another single actor. So it has impact but it does not do what we want to show here (one separate stream interpreter per substream).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johanandren : does this diagram depict the right understanding ? before making the change I wanted to understand correctly.

groupby_async

I think the current example is state A and we want to show state B in the example by adding async to the flow before merging sub streams. correct?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just doing .async after groupBy means that you end up with two interpreters/actors running the stream, one up to the .async and one after.
Screenshot 2024-01-16 at 13 22 34

State B is correct, Flow.a.b.b.async adds a boundary around each materialisation of that flow

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johanandren : the image you attached is the current state right ? and State B which is correct will be done by adding async in the example as proposed by you , right?

also the materialisation of the stream happens even between the actors or only at the end of the stream?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly, my drawing is what happens with the current (instead of your suggested State A), State B in the diagram is the fixed Flow().async one.

The materialization, turning a stream graph/bluepring into a running stream, of the sub stream happens each time a new group-by key is seen, however, normally the sub stream is materialized into the existing running stream. The async boundary around it means materialized stream becomes a separately running stream.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the answer @johanandren. I've now changed the example as suggested, can you please review it again?

Copy link
Member

@johanandren johanandren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@johanandren johanandren merged commit ac66bc9 into akka:main Jan 18, 2024
5 checks passed
@johanandren johanandren added this to the 2.9.2 milestone Jan 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants