Indexing changes and more precompiles #254

Merged 15 commits on Apr 30, 2021

chriselrod
Member

No description provided.

@chriselrod
Member Author

fixes #253

julia> using LoopVectorization
[ Info: Precompiling LoopVectorization [bdcacae8-1622-11e9-2a5c-532679323890]

julia> function foofoo(xs::Vector{Int64})
           s = 0f0
           @inbounds @avx for i in 1:length(xs)
               s += Float32(xs[i])
           end
           return s
       end
foofoo (generic function with 1 method)

julia> foofoo(collect(1:10))
55.0f0
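
For comparison, a plain-Julia version of the same reduction (no `@avx`) makes a handy reference when checking this fix. The name `foofoo_ref` is hypothetical and not part of the PR; this is only a minimal sketch:

```julia
# Reference reduction without @avx; `foofoo_ref` is a hypothetical name, not from the PR.
function foofoo_ref(xs::Vector{Int64})
    s = 0f0                      # Float32 accumulator
    for i in eachindex(xs)
        s += Float32(xs[i])      # convert each Int64 element before accumulating
    end
    return s
end

foofoo_ref(collect(1:10))  # 55.0f0
```

The sum 1 + 2 + ... + 10 = 55 is exactly representable in Float32, so the `@avx` version should return the same value.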

@chriselrod
Member Author

fixes #252

julia> using LoopVectorization

julia> function lv_test(A, B, C)
           D = zero(promote_type(eltype(A),eltype(B),eltype(C)))
           @avx for i in axes(C,1), j in axes(C,2), k in axes(C,3)
               D += A[i,j] * B[i,k] * C[i,j,k]
           end
           D
       end
lv_test (generic function with 1 method)

julia> A = rand(2, 3); B = rand(2, 3); C = rand(2, 3, 3);

julia> lv_test(A, B, C)
3.162221675229055
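
Because the inputs are random, the printed value cannot be checked by eye. A plain-loop reference can be used to cross-check it. The sketch below assumes nothing beyond Base Julia; `lv_test_ref` is a hypothetical name, not something from the PR:

```julia
# Plain-loop reference for the same reduction; `lv_test_ref` is a hypothetical name.
function lv_test_ref(A, B, C)
    D = zero(promote_type(eltype(A), eltype(B), eltype(C)))
    # The last iterator listed is the innermost loop, so `i` varies fastest.
    for k in axes(C, 3), j in axes(C, 2), i in axes(C, 1)
        D += A[i, j] * B[i, k] * C[i, j, k]
    end
    return D
end

A = rand(2, 3); B = rand(2, 3); C = rand(2, 3, 3);
ref = lv_test_ref(A, B, C)

# Cross-check against an equivalent generator-based sum.
alt = sum(A[i, j] * B[i, k] * C[i, j, k]
          for i in axes(C, 1), j in axes(C, 2), k in axes(C, 3))
ref ≈ alt
```

With LoopVectorization loaded, comparing `lv_test(A, B, C) ≈ lv_test_ref(A, B, C)` is the safer check; use `≈` rather than `==`, since a vectorized reduction may add the terms in a different order.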

Similarly, fixes #251

julia> function fwd(r22::AbstractArray{T}, x, r312, ax_b=axes(x,2), ax_β=axes(r312,2), ax_a=axes(r312,2), ax_c=axes(r312,1)) where T
               acc = zero(T)
               # for c in ax_c  # works fine without @avx
               @avx for c in ax_c
                   for a in ax_a
                       for β in ax_β
                           for b in ax_b
                               acc = acc + r22[b, β] * x[a, b, c] * r312[c, a, β]
                           end
                       end
                   end
               end
               acc
       end
fwd (generic function with 5 methods)

julia> r22, x, r312 = rand(2,2), rand(1,2,3), rand(3,1,2);

julia> fwd(r22, x, r312) # previously: UndefVarError: ####op#568_0 not defined
0.38040785607910943

julia> fwd(rand(3,3), rand(3,3,3), rand(3,3,3))
10.002254914427532

@chriselrod
Member Author

This was already fixed on master; the new release will include the fix for #250

julia> function act!(R, A, B, ax_i=1:4, ax_j=Base.OneTo(4))
           for j in ax_j
               for i in ax_i
                   R[i, j] = A[2i + (j - 1) ÷ 2] + 0 * B[j]
               end
           end
           R
       end
act! (generic function with 3 methods)

julia> function act_avx!(R, A, B, ax_i=1:4, ax_j=Base.OneTo(4))
           @avx for j in ax_j
               for i in ax_i
                   R[i, j] = A[2i + (j - 1) ÷ 2] + 0 * B[j]
               end
           end
           R
       end
act_avx! (generic function with 3 methods)

julia> A, B = rand(10), rand(4)
([0.07679965771268082, 0.824045571092674, 0.936416792764375, 0.3251156875738792, 0.10978032932429849, 0.05916232379702535, 0.7161730794494576, 0.9125257528195947, 0.6873029017520635, 0.69167345196837], [0.992224771266325, 0.676384458821444, 0.8259595522815406, 0.18011831085812058])

julia> act!(zeros(4,4), A, B)
4×4 Matrix{Float64}:
 0.824046   0.824046   0.936417  0.936417
 0.325116   0.325116   0.10978   0.10978
 0.0591623  0.0591623  0.716173  0.716173
 0.912526   0.912526   0.687303  0.687303

julia> act_avx!(zeros(4,4), A, B)
4×4 Matrix{Float64}:
 0.824046   0.824046   0.936417  0.936417
 0.325116   0.325116   0.10978   0.10978
 0.0591623  0.0591623  0.716173  0.716173
 0.912526   0.912526   0.687303  0.687303
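
The pairwise-equal columns follow from the index expression `2i + (j - 1) ÷ 2`. A small sketch of the index arithmetic alone, using only Base Julia (`idx` is an illustrative name):

```julia
# Which linear index of A does each (i, j) read? Pure index arithmetic, no packages needed.
idx = [2i + (j - 1) ÷ 2 for i in 1:4, j in Base.OneTo(4)]
# 4×4 Matrix{Int64}:
#  2  2  3  3
#  4  4  5  5
#  6  6  7  7
#  8  8  9  9
```

Integer division gives `(j - 1) ÷ 2 == 0` for `j = 1, 2` and `1` for `j = 3, 4`, so columns 1 and 2 read `A[2], A[4], A[6], A[8]` while columns 3 and 4 read `A[3], A[5], A[7], A[9]`. The `0 * B[j]` term adds zero for finite `B`, so both outputs above should equal `A[idx]`.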

@chriselrod
Member Author

fixes #249

julia> SLEEFPirates
ERROR: UndefVarError: SLEEFPirates not defined

julia> using LoopVectorization

julia> let x=rand(10), y=zeros(10)
           @avx for i in 1:10
               y[i] = log10(x[i])  # failed before this PR (same for log2); log was OK
           end
           y
       end
10-element Vector{Float64}:
 -0.26098113144791735
 -0.002190298330183066
 -0.7032198454377777
 -0.1787018821076598
 -0.27671130090165763
 -0.029409238479290098
 -0.3669815600361895
 -0.22742360902320607
 -0.3405875544083138
 -0.06392385494065451

@codecov

codecov bot commented Apr 29, 2021

Codecov Report

Merging #254 (307fc23) into master (96aef96) will increase coverage by 1.17%.
The diff coverage is 92.71%.

@@            Coverage Diff             @@
##           master     #254      +/-   ##
==========================================
+ Coverage   88.62%   89.79%   +1.17%     
==========================================
  Files          37       36       -1     
  Lines        6928     7410     +482     
==========================================
+ Hits         6140     6654     +514     
+ Misses        788      756      -32     

Impacted Files                         Coverage Δ
src/modeling/costs.jl                  69.11% <ø> (+0.22%) ⬆️
src/codegen/lower_memory_common.jl     86.74% <50.00%> (ø)
src/codegen/lower_threads.jl           60.41% <50.00%> (ø)
src/codegen/loopstartstopmanager.jl    85.30% <82.50%> (-2.84%) ⬇️
src/modeling/determinestrategy.jl      97.20% <86.48%> (-1.56%) ⬇️
src/codegen/lowering.jl                90.24% <89.53%> (+0.37%) ⬆️
src/modeling/graphs.jl                 88.52% <90.00%> (+0.53%) ⬆️
src/codegen/lower_store.jl             90.27% <91.66%> (+11.81%) ⬆️
src/condense_loopset.jl                95.09% <95.15%> (-1.16%) ⬇️
src/reconstruct_loopset.jl             93.39% <97.77%> (-2.97%) ⬇️
... and 22 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 96aef96...307fc23.

@chriselrod chriselrod merged commit 2295262 into master Apr 30, 2021
@chriselrod chriselrod deleted the indexingoverhaul branch April 30, 2021 09:42