-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add BLAS.get_num_threads #36360
add BLAS.get_num_threads #36360
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be nice to have. I think the obvious use is
old = get_num_threads()
@threads for i in 1:100
# use *
end
set_num_threads(old)
If it's possible to have Julia without one of the libraries handled by set_num_threads
, then I think this should do nothing, and not give an error.
Co-authored-by: Takafumi Arakaki <aka.tkf@gmail.com>
Co-authored-by: Takafumi Arakaki <aka.tkf@gmail.com>
Co-authored-by: Takafumi Arakaki <aka.tkf@gmail.com>
Co-authored-by: Takafumi Arakaki <aka.tkf@gmail.com>
stdlib/LinearAlgebra/src/blas.jl
Outdated
ret = nothing | ||
@warn "Could not get number of BLAS threads. Returning `$ret` instead." maxlog=1 | ||
return ret |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's nice that the current code ties the implementation and the warning message. But I fear other core devs might not like "unnecessary" dynamism (as @OkonSamuel pointed out). I think it might be safe to just do:
ret = nothing | |
@warn "Could not get number of BLAS threads. Returning `$ret` instead." maxlog=1 | |
return ret | |
@warn "Could not get number of BLAS threads. Returning `nothing` instead." maxlog=1 | |
return nothing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we go with returning nothing
, should there be any warnings at all?
I thought these two were complementary approaches, either return something clearly marking the failure to read how many threads, or else return an integer & warn that it may not be correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a warning won't do any hurt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is used as
old = get_num_threads()
# ...
set_num_threads(old)
then the end-user (which may not be who writes the function) cannot notice something fishy is going on unless there is a warning.
Also, I'm guessing this code pass is rarely used. So, I guess it's OK to make things verbose until somebody complains. It's also possible for a user to suppress the warning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But this is a performance optimisation, those tend to give up silently. (Although I wish more of them had @debug
statements.) That said I agree it shouldn't be triggered often.
If there is to be a warning, just one is enough, not two. Edit -- actually the above save/reset might emit 3 warnings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the question is if it is OK to consider changing number of threads as pure optimization. It changes the result slightly:
julia> A = randn(300, 300); B = randn(300, 300);
julia> LinearAlgebra.BLAS.set_num_threads(1)
julia> C1 = @btime $A * $B;
1.112 ms (2 allocations: 703.20 KiB)
julia> LinearAlgebra.BLAS.set_num_threads(2)
julia> C2 = @btime $A * $B;
650.866 μs (2 allocations: 703.20 KiB)
julia> extrema(C1 .- C2)
(-3.552713678800501e-14, 4.263256414560601e-14)
I think it is worth notifying the user that something unexpected is going on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps one warning when set
fails, and nothing on get
, is the right way to go?
(The behaviour of get_num_threads
is unchanged; surely adding a one-time warning isn't considered a breaking change.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I guess you mean to merge the two warnings for macOS? But, since this code path would be rarely exercised, I think it's better to go with simplicity than a nicely formatted warning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also think we should not overengineer this rare case. About silent fails of performance optimization, I think it makes a difference if an API level function fails at its one job or if some "optimization detail" burried deep in a long algorithm fails.
Co-authored-by: Takafumi Arakaki <aka.tkf@gmail.com>
Co-authored-by: Takafumi Arakaki <aka.tkf@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can refactor this a bit more.
Co-authored-by: Takafumi Arakaki <aka.tkf@gmail.com>
Co-authored-by: Takafumi Arakaki <aka.tkf@gmail.com>
Co-authored-by: Takafumi Arakaki <aka.tkf@gmail.com>
""" | ||
set_num_threads(n) | ||
set_num_threads(n::Integer) | ||
set_num_threads(::Nothing) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like an unusual API, I would prefer set_num_threads()
instead of set_num_threads(nothing)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to allow the pattern
default = BLAS.get_num_threads() # returns nothing on exotic platforms
BLAS.set_num_threads(1)
# do stuff
BLAS.set_num_threads(default)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That will still work if you make the signature set_num_threads(n=nothing)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me allowing nothing
is a hack to allow the above pattern on strange platforms. It is not a thing I would encourage or that I think needs more convenient syntax.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I imagine if you have code like set_num_threads()
it is more likely you forgot to pass the number of threads, than that you really want to invoke the nothing
method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should define set_num_threads() = set_num_threads(nothing)
. It would send a wrong message that set_num_threads(nothing)
is somehow a reasonable default. But it's not. It is the last resort that exists only for supporting the rollback use case.
But this is not clear from the current docstring. I think it's better to clarify this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps clearest to show the pattern which motivates this:
Set the number of threads the BLAS library should use.
Also accepts `nothing`, in which case it tries to set set the default number of threads.
On exotic variants of BLAS, `nothing` may be returned by ` get_num_threads()`.
Thus the following pattern may fail to set the number of threads, but will not give an error:
old = get_num_threads()
set_num_threads(1)
@threads for i in 1:10
# single-threaded BLAS calls
end
set_num_threads(old)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fredrikekre Does the argument here makes sense? It'd be nice if you can have a look at the docstring.
Co-authored-by: Takafumi Arakaki <aka.tkf@gmail.com>
Co-authored-by: Takafumi Arakaki <aka.tkf@gmail.com>
Co-authored-by: Okon Samuel <39421418+OkonSamuel@users.noreply.github.com>
We need a news item for |
Has anyone tried this with Apple's BLAS? I tried & failed to install this today, |
I stopped using Apple BLAS around Julia v0.7: I think MKL is faster |
Can somebody rerun the failing CI? |
I tried restarting but it's not doing my bidding even though I'm logged in. Hitting the rebuild button just gives me a bunch of info about the PR and doesn't do anything. @staticfloat, any idea what's up with that? |
If you click That means that the build is queued, but hasn't started yet. You can see at the top instead of |
Can this be merged? |
This is late but notice (1) tests will fail if
|
Arguably thats a desirable property. It means that the BLAS library is not properly supported.
The exact number of threads set this way is an implementation detail and should not be tested. But it makes sense, that we just call the method and test that julia does not e.g. crash. |
It's documented to fail gracefully on weird BLAS libraries, so I would say they ought to pass tests. |
My view is that |
Co-authored-by: Okon Samuel <39421418+OkonSamuel@users.noreply.github.com> Co-authored-by: Takafumi Arakaki <aka.tkf@gmail.com>
!!! compat "Julia 1.6" | ||
`get_num_threads` requires at least Julia 1.6. | ||
""" | ||
get_num_threads(;_blas=guess_vendor())::Union{Int, Nothing} = _get_num_threads() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason why _blas
isn't passed to _get_num_threads
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for reporting, I will fix it!
From slack, also came up on discourse