-
Notifications
You must be signed in to change notification settings - Fork 219
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Multi-threaded sampling with reversediff backend and rdcache gives bad samples #1412
Comments
Maybe related to the other ReverseDiff/Memoization issues (see, e.g., #1393). According to its README, Memoization is also not thread-safe (see https://github.com/marius311/Memoization.jl), so I'm not surprised that multithreaded sampling is problematic. Does ReverseDiff work without Memoization (you have to restart the Julia process to make sure there's no memoized stuff floating around anymore)? |
@devmotion yes it works fine. I will update the plot and mwe to include it |
Then it seems the issue here is that Memoization is not threadsafe |
Thank you @devmotion. I was wondering if we could have a warning/error message with multi-threaded sampling when rdcache is enabled. It could be something like this |
I think maybe it's possible to fix the problem on our side by using a |
Thanks for the suggestion. Could you point me out to any example which does this? |
I think this is fixable too. I am actually surprised this is not working at all. Because even a race condition shouldn't affect the result because all the threads are writing the same compiled tape. |
It's something in Turing that has to be changed (if it fixes the issue). |
Hmm ok I think I know what's going on. It's not the memoization, it's ReverseDiff. The compiled tape has cache fields re-used every time the tape is differentiated. Different compiled tapes for different threads should solve the problem here. |
Yep, that's what I suggested above 🙂 |
Yes I was agreeing with you :) |
Closing this as #1414 fixes it. |
Turing 0.14.4 is available now which should contain the fix for this problem. |
@devmotion The repository still shows |
Tags for the git repo are created automatically at midnight every day. For the Julia package manager it is only relevant that JuliaRegistries/General#21909 was merged - as soon as the registry is updated users are able to update Turing with Pkg. |
Thanks for the clarification! |
The following are the posterior plots for the coin flip turing example
As we can see the posterior from multi-threaded sampling with rdcache is not at all close to others
Code to reproduce this
My configuration
Also posted in the slack channel
The text was updated successfully, but these errors were encountered: