-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add rand(::String) #22224
add rand(::String) #22224
Changes from 1 commit
459a126
1deb0c5
5e827e9
533dfb3
3b97b61
fa51ee6
a5289a1
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
- Loading branch information
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -284,11 +284,6 @@ julia> rand(MersenneTwister(0), Dict(1=>2, 3=>4)) | |
`Set` and `IntSet`. For more than a few calls, use `rand(rng, | ||
collect(s))` instead, or either `rand(rng, Dict(s))` or `rand(rng, | ||
Set(s))` as appropriate. | ||
|
||
!!! note | ||
the complexity of `rand(rng, s::AbstractString)` is linear | ||
in the length of `s` if `s` is not of type `String`. If called more | ||
than a few times, you should use `rand(rng, collect(s))` instead. | ||
""" | ||
@inline rand() = rand(GLOBAL_RNG, CloseOpen) | ||
@inline rand(T::Type) = rand(GLOBAL_RNG, T) | ||
|
@@ -717,32 +712,27 @@ rand(rng::AbstractRNG, r::AbstractArray, dims::Integer...) = rand(rng, r, conver | |
|
||
# rand from a string | ||
|
||
rand(rng::AbstractRNG, s::AbstractString) = rand_iter(rng, s, Base.iteratorsize(s)) | ||
rand(s::AbstractString) = rand(GLOBAL_RNG, s) | ||
|
||
## optimization for String | ||
|
||
isvalid_unsafe(s::String, i) = !Base.is_valid_continuation(unsafe_load(pointer(s), i)) | ||
|
||
function rand(rng::AbstractRNG, s::String) | ||
rg = RangeGenerator(1:s.len) | ||
function rand(rng::AbstractRNG, s::String)::Char | ||
g = RangeGenerator(1:s.len) | ||
while true | ||
pos = rand(rng, rg) | ||
pos = rand(rng, g) | ||
isvalid_unsafe(s, pos) && return s[pos] | ||
end | ||
end | ||
|
||
## rand from an iterator | ||
|
||
rand_iter(rng, s, ::Base.IteratorSize) = throw(ArgumentError("iterator must have a known length")) | ||
|
||
function rand_iter(rng::AbstractRNG, s, ::Union{Base.HasShape,Base.HasLength})::eltype(s) | ||
pos = rand(rng, 1:length(s)) | ||
for (i, c) in enumerate(s) | ||
i == pos && return c | ||
function rand(rng::AbstractRNG, s::AbstractString)::Char | ||
g = RangeGenerator(1:endof(s)) | ||
while true | ||
try # the generic `isvalid` includes an equivalent try/catch statement | ||
return s[rand(rng, g)] | ||
end | ||
end | ||
end | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @StefanKarpinski would you mind having a look at this updated generic implementation, to confirm it complies with There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think it would be better to do There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok thanks (I believe the idea being that |
||
|
||
rand(s::AbstractString) = rand(GLOBAL_RNG, s) | ||
|
||
## rand from a string for arrays | ||
# we use collect(str), which is most of the time more efficient than specialized methods | ||
# (except maybe for very small arrays) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little skeptical of the
rand_iter
function andAbstractString
support. People usingrand
with other string types will probably not expect this function to be O(n); it seems preferable to throw a MethodError rather than to use this implementation.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I agree, I had the same hesitation and had planned to update the doc today to warn about the complexity. Then I think that with a proper warning those convenience methods (similarly for
AbstractSet
andAssociative
at #22228) are not harmful and can be handy.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could also do rejection sampling: randomly choose and index from
start(s):endof(s)
and retry if it's not a valid start of a character – i.e. the same thing you do forString
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh great! for some reason, I skimmed too fast on the generic code of
isvalid
and thought it was O(n).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I was confused not about
isvalid
, but about the fact that I thought I needed to compute thelength
somehow, which is O(n) in the generic method. Thanks again @StefanKarpinski, I updated the code.