-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Await comply with reactive streams Subscriber rule 2.7 #3360
Conversation
There is a possibility of race of Subscription.request and Subscription.cancel methods since cancellation handler could be executed in a separate thread. Rule [2.7](https://github.com/reactive-streams/reactive-streams-jvm/blob/v1.0.3/README.md#2-subscriber-code) requires Subscription methods to be executed serially.
Thank you! This is indeed a problem, and I don't think your solution is enough to fix it. Looks like it is possible that |
Thank you for the quick feedback! I'm not sure if parallel cancel executions might be the problem, as specified in subscription rules 3.5 cancel method itself should be thread-safe (i.e. I can try myself to introduce lock here but I feel it might be excessive as well might have additional performance implications which is hard for me to estimate. |
Good point about 3.5, but still, I think we should comply with 2.7 to the letter. For example, see its proposed interpretation:
I don't think there's room for ambiguity here: the calls must have a happen-before link between them.
I would estimate the implications as insignificant. @qwwdfsad has more in-depth knowledge of this and can correct me if I'm wrong, but as I understand it, the performance impact of
1 is insignificant, because using coroutines already causes the state to synchronize to the main memory on dispatches, and 2 is insignificant because, when threads don't fight for the right to enter a |
Makes sense to me. I've added synchronization to the subscription methods. Unfortunately I couldn't figure out how to avoid |
Great, thanks! If, at any point, you feel like you don't want to continue working on this pull request, please say so, and we will polish it ourselves, because we want it to land in the next release, which should happen soon. You will be credited in any case. Regarding the test: yes, I think we need to fix it, but more radically. Currently, the test is structured in such a way that it will pass most of the time even if the bug is not fixed: I removed the What do we want to test here? Judging by the name of the pull request, we want to test that rule 2.7 is upheld even in the presence of cancellation and multithreading, that is, that the operations on a subscriber are not entered in parallel. This feels like a good job for a stress test. We have plenty of those (see So, in this case, a good test, I think, would be something like this: create a subscriber that sets some variable to If you do write this test, please also check that, when the bug is present, the test finds it consistently. |
Thanks for the suggestions! I've added suggested test, it fails for me locally only when I have sleep in |
Now, the test procedure itself could be improved significantly. Let's look at what happens currently.
The fact that publishers are running in parallel also obscures the results: there aren't that many cores in the typical CPU, so, despite thousands of coroutines launching, less than 20 of them will typically execute in parallel. What can be done here to improve the situation: instead of launching everything and only then cancelling everything, you can perform many self-contained iterations that don't influence one another in any way. |
I've rewritten the test addressing your comments, please let me know what you think. When I've added Do you think this is good enough to consider Thanks for the patience and explanations btw. |
This test doesn't actually check the synchronization in So, I think this is good to go. Good work, thank you! |
Really nice fix, we are struggling from it for a long time! Thank you @EgorKulbachka |
@sadensmol we take such issues seriously, so if you experience something else of the sort, please feel free to report it. |
There is a possibility of a race between Subscription.request and Subscription.cancel methods since cancellation handler could be executed in a separate thread. Rule [2.7](https://github.com/reactive-streams/reactive-streams-jvm/blob/v1.0.3/README.md#2-subscriber-code) requires Subscription methods to be executed serially.
…to baxter/upstream-flow-timeout * origin/baxter/upstream-flow-timeout: (328 commits) Commit API dump Cleanup API, update knit Fix typo in runTest method docs (Kotlin#3417) Update coroutines-and-channels.md (Kotlin#3410) chore: update the website's release step (Kotlin#3397) ktl-695 chore: support Dokka HTML customization (Kotlin#3388) update: KT-50122 adding kotlinx.dependencies Improve bump-version.sh (Kotlin#3365) Fix documentation for `DEBUG_PROPERTY_VALUE_OFF` (Kotlin#3389) feat: moving coroutines hands-on to docs (Kotlin#3369) Version 1.6.4 Improve CoroutineDispatcher documentation (Kotlin#3359) Update binary compatibility validator to 0.11.0 (Kotlin#3362) Add a scope for launching background work in tests (Kotlin#3348) Fix debug module publication with shadow plugin (Kotlin#3357) Comply with Subscriber rule 2.7 in the `await*` impl (Kotlin#3360) Update readme (Kotlin#3343) Reduce reachable references of disposed invokeOnTimeout handle (Kotlin#3353) breakleg; knit validation fix Additional comment in CoroutineScheduler ... # Conflicts: # README.md
Rule 2.7 requires Subscription methods to be executed serially.
There is a possibility of race between Subscription.request and Subscription.cancel methods since cancellation handler could be executed in a separate thread.
In particular this bug leads to database connection leaks if used with jooq subscription as cancel can happen while request being initialized and hence it misses to close connection.