-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
use SHA512 when seeding MersenneTwister #37766
base: master
Are you sure you want to change the base?
Conversation
This hashes the provided seeds before feeding dSFMT initialization routine. The passed seeds are still stored as-is in the `MersenneTwister.seed` field. Goals: * possibly lead to more independance between two streams corresponding to two user-provided seeds * by feeding `VERSION.minor` and `VERSION.major` to the hashing process, this guarantees that random streams change between minor releases, helping users to not believe in stream stability
@@ -14,6 +14,8 @@ using .DSFMT | |||
using Base.GMP.MPZ | |||
using Base.GMP: Limb | |||
|
|||
import SHA |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As SHA
is accessed, I guess yes? I prefer using qualifed names like SHA.update!
here. Or did I misunderstand the question?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I see: it's used in RNG.jl
but not in this file. Might be clearer to move the import SHA
into the file that uses it, but this is valid as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh ok I understand. it's just that Random.jl
already happens to have all the import/using directives necessary for the other files it includes.
Aside from my question (and CI), I approve very much. |
Yes, I'm not sure what to do here. We don't want to update the seeds in tests for each minor release (finding a working seed is very tedious). I think I will go with a the hack |
Also, many |
If it's painful for us, then perhaps it's too painful to inflict on everyone else too. Finding seeds that work shouldn't really be that hard: if the tests are valid, then most seeds should work. The trouble, I guess, is that while most seeds should work for each test, the chance that a given global seed will allow all tests to pass is slim. It seems like each stochastic test should get its own seed then and you just try new random seeds for any test that fails and most choices should work. |
For doctests, I don't have a strong opinion what should be done. But for tests, I think it's easier than in the past to find fitting seeds in
So for tests, that's kinda the point of the PR, to motivate people to not rely on |
If this is to be done, consider putting the hashing logic within |
This addresses a part of #37165: > It's common that sequential seeds for RNGs are not as independent as one might like. This clears out this problem for `MersenneTwister`, and makes it easy to add the same feature to other RNGs via a new `hash_seed` function, which replaces `make_seed`. This is an alternative to #37766.
This addresses a part of #37165: > It's common that sequential seeds for RNGs are not as independent as one might like. This clears out this problem for `MersenneTwister`, and makes it easy to add the same feature to other RNGs via a new `hash_seed` function, which replaces `make_seed`. This is an alternative to #37766.
This addresses a part of #37165: > It's common that sequential seeds for RNGs are not as independent as one might like. This clears out this problem for `MersenneTwister`, and makes it easy to add the same feature to other RNGs via a new `hash_seed` function, which replaces `make_seed`. This is an alternative to #37766.
This hashes the provided seeds before feeding dSFMT initialization routine.
The passed seeds are still stored as-is in the
MersenneTwister.seed
field.Goals:
user-provided seeds
VERSION.minor
andVERSION.major
to the hashing process, thisguarantees that random streams change between minor releases, helping users
to not believe in stream stability
Closes #37165.
Performance
Random.seed!(rng, 0)
takes, on my machine,As seeding is usually not a time-critical operation, the slowdown here seems acceptable (it will probably be a different story when Xoshiro lands as the default RNG...)
Handling our tests
Many of our tests use
Random.seed!()
, and these would have to be changed for all minor releases. Alternatives include:seed!(seed, hash=false)
, orRandom.seednohash(seed)