I just ran into a case where bad codegen, not incomplete inference, resulted in calls to the runtime that prevent us from using Cassette for GPU codegen. Pasting my notes here:
using Cassette

Cassette.@context Noop

function main()
    a = [0]

    function kernel(T, ptr)
        unsafe_store!(ptr, 1)
        return
    end

    kernel(Int, pointer(a))
    code_llvm(kernel, Tuple{Type{Int}, Ptr{Int}}; debuginfo=:none)
    # good code; T and ptr end up in different slots, T is marked constant
    # https://github.com/JuliaLang/julia/blob/592878623d376c71e5452dc2775fa2f7a4e097ca/src/codegen.cpp#L6116
    # define void @julia_kernel_12936(%jl_value_t*, i64) #0 {
    # top:
    #   %2 = inttoptr i64 %1 to i64*
    #   store i64 1, i64* %2, align 1
    #   ret void
    # }

    Cassette.overdub(Noop(), kernel, Int, pointer(a))
    code_llvm(Cassette.overdub, Tuple{typeof(Noop()), typeof(kernel), Type{Int}, Ptr{Int}}; debuginfo=:none)
    # bad code; T and ptr are in the vaSlot, which is not constant or unused.
    # the varargs slot isn't concretely typed, so we get a call to jl_f_tuple
    # and calls to jl_f_getfield to access values
    # https://github.com/JuliaLang/julia/blob/592878623d376c71e5452dc2775fa2f7a4e097ca/src/codegen.cpp#L6198-L6201
    # define void @"julia_#4_12937"(%jl_value_t* nonnull, i64) #0 {
    # top:
    #   ...
    #   %19 = call nonnull %jl_value_t* @jl_f_tuple(%jl_value_t*l_value_t* null to %jl_value_t*), %jl_value_** %2, i32 2)
    #   ...
    #   %23 = call nonnull %jl_value_t* @jl_f_getfield(%jl_value_t*l_value_t* null to %jl_value_t*), %jl_value_t** %2, i32 2)
    #   %24 = bitcast %jl_value_t* %23 to i64**
    #   %25 = load i64*, i64** %24, align 8
    #   store i64 1, i64* %25, align 1
    #   ...
    #   ret void
    # }

    function kernel(ptr)
        unsafe_store!(ptr, 1)
        return
    end

    Cassette.overdub(Noop(), kernel, pointer(a))
    code_llvm(Cassette.overdub, Tuple{typeof(Noop()), typeof(kernel), Ptr{Int}}; debuginfo=:none)
    # good code; we still have a vaSlot but it's concretely typed.
    # https://github.com/JuliaLang/julia/blob/592878623d376c71e5452dc2775fa2f7a4e097ca/src/codegen.cpp#L6193-L6196
end

isinteractive() || main()
The splat comes from how arguments are passed to/by overdub. One option would be to try to change that; another would be to have the ability to force specializing varargs (e.g. JuliaLang/julia#34365 (comment)).
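For reference, here is a minimal sketch of the two varargs signature shapes at the language level, without Cassette. The names `forward_plain` and `forward_forced` are hypothetical stand-ins for a pass-through wrapper and are not part of Cassette. The `Vararg{Any,N} where {N}` form is the generally documented way to ask Julia to specialize on varargs. Whether it would remove the `jl_f_tuple`/`jl_f_getfield` calls in the `Type{Int}` case above is an assumption I have not verified.

```julia
using InteractiveUtils  # provides code_llvm outside the REPL

# Plain varargs pass-through, similar in shape to a wrapper that splats its
# arguments into the wrapped function. Julia's heuristics may choose not to
# specialize on `args` here.
forward_plain(f, args...) = f(args...)

# Same pass-through, but the number of varargs is a method type parameter,
# which requests specialization on the varargs.
forward_forced(f, args::Vararg{Any,N}) where {N} = f(args...)

function kernel(T, ptr)
    unsafe_store!(ptr, 1)
    return
end

a = [0]
GC.@preserve a begin
    forward_plain(kernel, Int, pointer(a))
    forward_forced(kernel, Int, pointer(a))
end

# Compare the generated code for both wrappers by eye; look for calls to
# jl_f_tuple / jl_f_getfield as in the notes above.
code_llvm(forward_plain, Tuple{typeof(kernel), Type{Int}, Ptr{Int}}; debuginfo=:none)
code_llvm(forward_forced, Tuple{typeof(kernel), Type{Int}, Ptr{Int}}; debuginfo=:none)
```

Both wrappers behave identically at runtime; the only difference is the signature, so any difference in the emitted IR would come from specialization of the varargs slot.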