OneHotMatrix causes a 'scalar getindex disallowed' error on GPU #1006
Comments
Could you try with #764? |
@dhairyagandhi96, #764 makes no difference, it makes some changes to |
julia> ohb = float.(onehotbatch(rand(1:10, 100), 1:10)) |> gpu
julia> dl(ohb) |
Right, I misread that, apologies for that. Minimizing shouldn't be hard, it's probably in the matmul, but would need to cross check. |
As far as I understand, the point of having It also does not currently work properly with
I am not 100% sure of the correct way Adapt.jl functions should be used to achieve the desired behaviour (docs are a bit scarce). One workaround I can think of is that instead of holding a |
In any case, we don't have this problem when computing crossentropies (which is the main reason why OneHotMatrix is there).
using Flux, CuArrays
CuArrays.allowscalar(false)
using Flux: onehotbatch
ohb = onehotbatch(rand(1:10, 100), 1:10) |> gpu;
ŷ = CuArrays.rand(size(ohb)...)
Flux.crossentropy(ŷ, ohb)
I'm not sure whether OneHotMatrix was ever meant to be a fully-fledged AbstractMatrix to use as an input to a model |
One-hot encoding is a standard technique used in language modelling and other applications of deep learning. When training these models on GPU with Flux, one currently has to revert to dense arrays with zeros and ones for that purpose. This is not ideal; either |
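The dense fallback mentioned above can be illustrated in plain Julia, without Flux or a GPU. This is only a sketch under stated assumptions: `dense_onehot` is a hypothetical helper (not a Flux function), `W` is a stand-in for a layer's weight matrix, and the sizes are arbitrary. It also shows why the dense form is wasteful: multiplying by a one-hot matrix merely selects columns of the weights.

```julia
# Hypothetical helper (not part of Flux): build a dense Float32 one-hot
# matrix with one row per class and one column per label.
dense_onehot(labels, classes) = Float32.(collect(classes) .== permutedims(labels))

labels = rand(1:10, 100)            # 100 class indices drawn from 1:10
x = dense_onehot(labels, 1:10)      # 10×100 matrix of zeros and ones

W = rand(Float32, 5, 10)            # stand-in for a Dense layer's weights

# Every column of x contains a single 1, so W * x picks out columns of W.
y_dense = W * x
y_index = W[:, labels]
```

The two results agree, but the dense version stores and multiplies 1000 values where 100 integer indices would carry the same information.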
Fixed by JuliaGPU/CUDA.jl#90. Make sure Julia 1.5 is used, as well as |
I think this issue is causing many reported bugs that complain about "slowness" of OneHotMatrix.
I suspect the underlying issue is that OneHotVector is not properly adapted for GPU storage, and OneHotMatrix stores a vector of OneHotVectors. A workaround would be to change OneHotMatrix to store a Vector{Int} and a size, as proposed by #578. It would then be easy to adapt the vector for GPU storage.
MWE:
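Independently of the reproduction case, the storage change proposed above can be sketched in plain Julia. The type name `IndexOneHot` and its fields are hypothetical and chosen for illustration; this is not Flux's `OneHotMatrix` and not necessarily what #578 implements. The sketch assumes that a one-hot matrix is fully described by the row index of the 1 in each column plus the number of rows.

```julia
# Hypothetical type for illustration; this is not Flux's OneHotMatrix.
# It keeps only the hot row index of every column plus the number of rows.
struct IndexOneHot{V<:AbstractVector{<:Integer}} <: AbstractMatrix{Bool}
    height::Int   # number of classes (rows)
    indices::V    # indices[j] is the row holding the 1 in column j
end

Base.size(m::IndexOneHot) = (m.height, length(m.indices))

# Scalar indexing, required by the AbstractArray interface (printing, collect).
Base.getindex(m::IndexOneHot, i::Integer, j::Integer) = m.indices[j] == i

# W * onehot reduces to a column gather, so no element-by-element access is needed.
function Base.:*(W::AbstractMatrix, m::IndexOneHot)
    size(W, 2) == m.height ||
        throw(DimensionMismatch("W has $(size(W, 2)) columns, expected $(m.height)"))
    return W[:, m.indices]
end

labels = [3, 1, 2, 3]
oh = IndexOneHot(3, labels)
W = Float32[1 2 3; 4 5 6]

y = W * oh   # same values as multiplying by the dense one-hot matrix
```

Because `indices` is the only array field, moving such a structure to the GPU would amount to moving that one vector. How to wire this into Adapt.jl is not shown here.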