Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add rand dispatch for tuple types #50251

Merged
merged 16 commits into from
Sep 29, 2023
Merged

Conversation

adienes
Copy link
Contributor

@adienes adienes commented Jun 21, 2023

as suggested by #50236 @rfourquet

@adienes
Copy link
Contributor Author

adienes commented Jun 21, 2023

I doubt those build failures are related to this change

@rfourquet
Copy link
Member

I'm not a fan of using broadcast for the implementation. Performance-wise, we should try to not be too far-off from RandomExtensions.jl:

# this PR
julia> @btime rand(Tuple{Bool,Int})
  1.310 μs (20 allocations: 744 bytes)

# RandomExtensions
julia> @btime rand(Tuple{Bool,Int})
  7.722 ns (0 allocations: 0 bytes)

@rfourquet rfourquet added the randomness Random number generation and the Random stdlib label Jun 22, 2023
@adienes
Copy link
Contributor Author

adienes commented Jun 22, 2023

ok, should match performance now (at least locally for me)

@adienes
Copy link
Contributor Author

adienes commented Jun 26, 2023

g2g? is this functionality desired

@adienes adienes force-pushed the rand_tuple_type branch from b6b984f to ef0ac5e Compare July 3, 2023 20:54
@adienes adienes changed the title add rand dispatch for tuple types add rand dispatch for tuple and rational types Jul 3, 2023
@adienes
Copy link
Contributor Author

adienes commented Jul 3, 2023

given that this won't be in 1.10 either way and can sit on master for a while, I added in support also for sampling from Rational type and updated the docstring of rand; would appreciate a review 🙂

@rfourquet
Copy link
Member

rfourquet commented Jul 4, 2023

I suggest removing the Rational part from this PR, which would almost surely prevent merging. See #25993 and #46060 for other attempts, if you want to pursue this idea, a separate PR is warranted.

@adienes adienes changed the title add rand dispatch for tuple and rational types add rand dispatch for tuple types Jul 4, 2023
@adienes
Copy link
Contributor Author

adienes commented Jul 4, 2023

I suggest removing the Rational part from this PR, which would almost surely prevent merging. See #25993 and #46060 for other attempts, if you want to pursue this idea, a separate PR is warranted.

fair enough. I should have guessed that would be contentious; I've removed the Rational piece

@rfourquet
Copy link
Member

rfourquet commented Jul 21, 2023

marking for triage as suggested here, both PR could probably be triaged at the same time

@rfourquet rfourquet added the triage This should be discussed on a triage call label Jul 21, 2023
@rfourquet
Copy link
Member

rfourquet commented Jul 21, 2023

I'm in favor of merging this PR as is, although there is room for optimization for some niche cases. Essentially, this implementation doesn't use the 2-staged random generation, i.e. the fact that in a call to rand, first a sampler is created, and then rand is called on the sampler; this means that for cases (rare, in particular when the distribution is a type like here, a tuple type) where the sampler creation can factor out some non-negligible computation when a bunch a random values are about to be generated (like in array generation). For reference, consider the following made-up example:

struct ZZ{N}
    x::Char
end

function Random.Sampler(::Type{RNG}, ::Type{ZZ{N}}, nn::Random.Repetition) where {N, RNG<:AbstractRNG}
    Random.SamplerSimple(ZZ{N}('a'), Random.Sampler(RNG, String(N), nn))
    # note: passing `ZZ{N}('a')` is somewhat artificial here, using SamplerTag would be better but i believe it's undocumented
end

Random.rand(rng::AbstractRNG, sp::Random.SamplerSimple{ZZ{N}}) where {N} = ZZ{N}(rand(rng, sp.data))

Then creating an array of such values wrapped in a tuple has significant overhead compared to raw values, unlike the no-overhead approach of RandomExtensions:

julia> @btime rand(ZZ{:asd}, 1000);
  2.712 μs (3 allocations: 4.16 KiB)

julia> @btime rand(Tuple{ZZ{:asd}}, 1000); # This PR
  17.623 μs (1001 allocations: 27.50 KiB)

julia> @btime rand(Tuple{ZZ{:asd}}, 1000); # Using instead `RandomExtensions`
  2.678 μs (3 allocations: 4.16 KiB)

But I think this optimization can be added later if anyone finds the motivation.

@JeffBezanson
Copy link
Member

Triage agrees with adding this 👍
But triage thinks we should not merge something that conflicts with a package that already has a better implementation. We should move the good implementation here, or else hold off and just tell people to use RandomExtensions.

The docs should also clarify what is supported, i.e. basically only concrete tuple types? Or maybe say fixed-length tuples of sample-able types?

@LilithHafner
Copy link
Member

The possibility of rand(Tuple{1:4, Int}, 10) did not make triage smile, nor did traige have anything authoritative to say about that possibility.

@adienes
Copy link
Contributor Author

adienes commented Aug 4, 2023

thank you for the comments.

I am happy to give it a shot to get the performance of long arrays of custom struct types performant compared to RandomExtensions.jl, but I may end up needing help on that

also, I did not know Tuple{1:4, Int} was even allowed 😅, don't all the parameters have to be types?

@LilithHafner
Copy link
Member

don't all the parameters have to be types?

Nope! Types and isbitstype values are also allowed. This is useful because it allows Array{Float, 2} to be a two-dimensional array with dimensionality known at compile time. A unit range would be a reasonable type parameter for a static offset array.

You can also do silly things like eltype(Vector{:blue}) === :blue

@adienes
Copy link
Contributor Author

adienes commented Sep 3, 2023

please let me know if the latest revision addresses all comments. the benchmarks I see are now

# This PR
@btime rand(Tuple{Bool,Int}); # ~ 3.958 ns (0 allocations: 0 bytes)
@btime rand(Tuple{Bool, Char, Vararg{Tuple{Int, Float64, Tuple{Char, Bool}}, 80}}); # ~ 2.509 μs (81 allocations: 4.50 KiB)
@btime rand(ZZ{:asd}, 1000); # ~ 3.250 μs (3 allocations: 4.16 KiB)
@btime rand(Tuple{ZZ{:asd}}, 1000); # ~ 3.250 μs (3 allocations: 4.16 KiB)


# RandomExtensions.jl
@btime rand(Tuple{Bool,Int}); # ~ 3.958 ns (0 allocations: 0 bytes)
@btime rand(Tuple{Bool, Char, Vararg{Tuple{Int, Float64, Tuple{Char, Bool}}, 80}}); # ~ 1.067 μs (0 allocations: 0 bytes)
@btime rand(ZZ{:asd}, 1000); # ~ 3.260 μs (3 allocations: 4.16 KiB)
@btime rand(Tuple{ZZ{:asd}}, 1000); # ~ 3.276 μs (3 allocations: 4.16 KiB)

so still slightly worse on the very wide tuple case, but I'm not sure how to avoid that without more involved changes. I believe this would be a pretty rare use case either way---the extra allocations occur past length 10 according to the implementation of ntuple, so the performance here is more reflective of wide tuples generally I think rather than anything specific to rand

On reflection, I believe Tuple{1:4, Int} shouldn't be allowed, since rand(T) should return a value of type T and Tuple{1:4, Int} is not instantiable

@adienes adienes force-pushed the rand_tuple_type branch 2 times, most recently from 58c160e to 9e58e6e Compare September 7, 2023 21:52
@adienes
Copy link
Contributor Author

adienes commented Sep 14, 2023

@rfourquet could you possibly review the latest updates?
in particular, let me know if the extra allocations in the case of the super-wide heterogenous tuple are acceptable. I suspect to remove those the implementation would have to grow significantly in complexity

stdlib/Random/src/Random.jl Outdated Show resolved Hide resolved
stdlib/Random/src/generation.jl Outdated Show resolved Hide resolved
@rfourquet
Copy link
Member

let me know if the extra allocations in the case of the super-wide heterogenous tuple are acceptable

I think it would be fair to merge this so that things move forward, but it's up to triage...

If this version is merged, I will have to update RandomExtensions before the 1.11 release, and in the process I will probably be able to extract from it a reasonable (not too complex) implementation which could supersede this PR; then I hope there won't be much opposition to such an update.

An alternative would be for you or me to directly pull RandomExtensions's implementation.

@adienes
Copy link
Contributor Author

adienes commented Sep 14, 2023

do note that since this was last evaluated on triage, the update has now matched performance with RandomExtensions on all cases except for the extremely wide tuple case. while I am not an expert on either implementation, my understanding of the reasoning is that Base.ntuple basically rewrites/unrolls only up to length 10, while RandomExtensions does a lot more codegen in the @make macros

given that this is kind of a more fundamental limitation to the performance of Tuple in Base rather than anything to do with rand, I'm inclined to think that the difference in performance is acceptable. however, I would completely understand if its desired to leave no performance on the table in these wide cases --- that being said, I'm not sure I have the requisite familiarity to pull the code over. I'm happy to attempt with some guidance, and I don't mean to put more work on your plate, but going that route it will probably be best if I let you handle it :)

@LilithHafner
Copy link
Member

Does this support rand(Tuple{Int, 1:4})? Does this support rand(Tuple{Int, Vararg{Int}}) and if so, what are the semantics? Whether or not these are supported, there should be tests for them.

@adienes
Copy link
Contributor Author

adienes commented Sep 14, 2023

intentionally not supported to both. I'll go ahead and add tests and make the docs more clear on that

@adienes adienes force-pushed the rand_tuple_type branch 3 times, most recently from ef15e41 to 9702348 Compare September 14, 2023 21:11
@adienes
Copy link
Contributor Author

adienes commented Sep 15, 2023

green!

@rfourquet rfourquet closed this Sep 23, 2023
@rfourquet rfourquet reopened this Sep 23, 2023
@rfourquet rfourquet merged commit 8436f68 into JuliaLang:master Sep 29, 2023
1 check passed
rfourquet pushed a commit that referenced this pull request Oct 7, 2023
EDIT: this was also implemented in #50251, which is now merged, so
merge this now for the added tests, and the added feature of
`rand(Tuple{})`.

This allows e.g `rand(NTuple{5,Int})` to sample a tuple of 5 `Int`s.

The implementation simply assembles a tuple by calling `rand` on the
corresponding type parameters of the tuple type.
A generated function is used to ensure type stability.
rfourquet added a commit that referenced this pull request Oct 7, 2023
This is a rebase of #35856, where we keep only the tests, as the
functionality was added in #50251. This also adds the possibility to
call `rand` on an empty tuple type: `rand(Tuple{})`.

Co-authored-by: Stephan Hilb <stephan@ecshi.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
randomness Random number generation and the Random stdlib
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants