Indexing changes and more precompiles #254
…y nested reductions to stop confusing LLVM with too many phi nodes, fix some reduction naming code.
…, mangle parameters in `_avx_!` to make sure they don't clash with user's symbols from loops
Fixes #253:
julia> using LoopVectorization
[ Info: Precompiling LoopVectorization [bdcacae8-1622-11e9-2a5c-532679323890]
julia> function foofoo(xs::Vector{Int64})
           s = 0f0
           @inbounds @avx for i in 1:length(xs)
               s += Float32(xs[i])
           end
           return s
       end
foofoo (generic function with 1 method)
julia> foofoo(collect(1:10))
55.0f0
Fixes #252:
julia> using LoopVectorization
julia> function lv_test(A, B, C)
           D = zero(promote_type(eltype(A), eltype(B), eltype(C)))
           @avx for i in axes(C,1), j in axes(C,2), k in axes(C,3)
               D += A[i,j] * B[i,k] * C[i,j,k]
           end
           D
       end
lv_test (generic function with 1 method)
julia> A = rand(2, 3); B = rand(2, 3); C = rand(2, 3, 3);
julia> lv_test(A, B, C)
3.162221675229055
Similarly,
julia> function fwd(r22::AbstractArray{T}, x, r312, ax_b=axes(x,2), ax_β=axes(r312,2), ax_a=axes(r312,2), ax_c=axes(r312,1)) where T
           acc = zero(T)
           # for c in ax_c # works fine without @avx
           @avx for c in ax_c
               for a in ax_a
                   for β in ax_β
                       for b in ax_b
                           acc = acc + r22[b, β] * x[a, b, c] * r312[c, a, β]
                       end
                   end
               end
           end
           acc
       end
fwd (generic function with 5 methods)
julia> r22, x, r312 = rand(2,2), rand(1,2,3), rand(3,1,2);
julia> fwd(r22, x, r312) # UndefVarError: ####op#568_0 not defined
0.38040785607910943
julia> fwd(rand(3,3), rand(3,3,3), rand(3,3,3))
10.002254914427532
Was already fixed on master, but the new release will fix #250:
julia> function act!(R, A, B, ax_i=1:4, ax_j=Base.OneTo(4))
           for j in ax_j
               for i in ax_i
                   R[i, j] = A[2i + (j - 1) ÷ 2] + 0 * B[j]
               end
           end
           R
       end
act! (generic function with 3 methods)
julia> function act_avx!(R, A, B, ax_i=1:4, ax_j=Base.OneTo(4))
           @avx for j in ax_j
               for i in ax_i
                   R[i, j] = A[2i + (j - 1) ÷ 2] + 0 * B[j]
               end
           end
           R
       end
act_avx! (generic function with 3 methods)
julia> A, B = rand(10), rand(4)
([0.07679965771268082, 0.824045571092674, 0.936416792764375, 0.3251156875738792, 0.10978032932429849, 0.05916232379702535, 0.7161730794494576, 0.9125257528195947, 0.6873029017520635, 0.69167345196837], [0.992224771266325, 0.676384458821444, 0.8259595522815406, 0.18011831085812058])
julia> act!(zeros(4,4), A, B)
4×4 Matrix{Float64}:
 0.824046   0.824046   0.936417  0.936417
 0.325116   0.325116   0.10978   0.10978
 0.0591623  0.0591623  0.716173  0.716173
 0.912526   0.912526   0.687303  0.687303
julia> act_avx!(zeros(4,4), A, B)
4×4 Matrix{Float64}:
 0.824046   0.824046   0.936417  0.936417
 0.325116   0.325116   0.10978   0.10978
 0.0591623  0.0591623  0.716173  0.716173
 0.912526   0.912526   0.687303  0.687303
Fixes #249:
julia> SLEEFPirates
ERROR: UndefVarError: SLEEFPirates not defined
julia> using LoopVectorization
julia> let x=rand(10), y=zeros(10)
           @avx for i in 1:10
               y[i] = log10(x[i]) # same for log2, but log is OK
           end
           y
       end
10-element Vector{Float64}:
 -0.26098113144791735
 -0.002190298330183066
 -0.7032198454377777
 -0.1787018821076598
 -0.27671130090165763
 -0.029409238479290098
 -0.3669815600361895
 -0.22742360902320607
 -0.3405875544083138
 -0.06392385494065451
Codecov Report
@@            Coverage Diff             @@
##           master     #254      +/-   ##
==========================================
+ Coverage   88.62%   89.79%   +1.17%
==========================================
  Files          37       36       -1
  Lines        6928     7410     +482
==========================================
+ Hits         6140     6654     +514
+ Misses        788      756      -32
…t reproduce locally.
No description provided.