Now that we have SIMD optimizations for Q8_0 and Q8_1, we can add more optimizations for the other basic quantization formats (not including the K-quants).
We can take a new approach to SIMD optimization:
add a cross-platform SIMD implementation via stdsimd, named vec_dot_xxx_xxx_fallback, which offers reasonable performance across different platforms.
add a NEON-optimized vec_dot implementation, targeting the Apple silicon platform
add an AVX2-optimized vec_dot implementation for the x86_64 platform
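To make the layering concrete, here is a rough sketch of what the fallback plus a dispatch point could look like for Q8_0. Everything in it is an assumption for illustration: the `BlockQ8_0` layout is simplified (an `f32` scale instead of a half-precision one), the function names just follow the vec_dot_xxx_xxx_fallback pattern, and the fallback uses plain loops that the compiler can auto-vectorize rather than stdsimd, since `std::simd` is still nightly-only.

```rust
/// Number of quantized values per block (assumed, matching the usual Q8_0 layout).
const QK8_0: usize = 32;

/// Hypothetical, simplified Q8_0 block: 32 int8 values sharing one scale.
/// The scale is an f32 here only to keep the sketch std-only.
#[derive(Clone, Copy)]
pub struct BlockQ8_0 {
    pub d: f32,
    pub qs: [i8; QK8_0],
}

/// Portable fallback: integer dot product per block, scaled by both block scales.
pub fn vec_dot_q8_0_q8_0_fallback(a: &[BlockQ8_0], b: &[BlockQ8_0]) -> f32 {
    assert_eq!(a.len(), b.len(), "both rows must have the same block count");
    let mut sum = 0.0f32;
    for (x, y) in a.iter().zip(b.iter()) {
        // Accumulate in i32 so the i8 * i8 products cannot overflow.
        let mut acc = 0i32;
        for (&p, &q) in x.qs.iter().zip(y.qs.iter()) {
            acc += p as i32 * q as i32;
        }
        sum += x.d * y.d * acc as f32;
    }
    sum
}

/// Dispatch point. A NEON kernel (behind `cfg(target_arch = "aarch64")`) or an
/// AVX2 kernel (behind `cfg(all(target_arch = "x86_64", target_feature = "avx2"))`)
/// would be selected here; this sketch only has the fallback.
pub fn vec_dot_q8_0_q8_0(a: &[BlockQ8_0], b: &[BlockQ8_0]) -> f32 {
    vec_dot_q8_0_q8_0_fallback(a, b)
}
```

With this shape, each new format only needs a fallback to be usable everywhere, and the NEON and AVX2 kernels can be added later without changing callers.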
We do not need to add platform-specialized code in the quantize and dequantize parts; using stdsimd or even plain Rust loops is fine there, since it does not noticeably affect performance compared with a platform-specialized implementation.
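As an illustration of the "plain Rust loops" option, a minimal quantize/dequantize pair for a Q8_0-style format might look like the following. This is a sketch under assumptions, not the project's actual code: the block struct, the `f32` scale, and the function names `quantize_row_q8_0` / `dequantize_row_q8_0` are hypothetical here.

```rust
/// Number of values per block (assumed, matching the usual Q8_0 layout).
const QK8_0: usize = 32;

/// Hypothetical, simplified Q8_0 block with an f32 scale.
#[derive(Clone, Copy)]
pub struct BlockQ8_0 {
    pub d: f32,
    pub qs: [i8; QK8_0],
}

/// Quantize a row of f32 values into blocks using plain loops.
pub fn quantize_row_q8_0(x: &[f32]) -> Vec<BlockQ8_0> {
    assert_eq!(x.len() % QK8_0, 0, "row length must be a multiple of the block size");
    x.chunks_exact(QK8_0)
        .map(|chunk| {
            // The scale maps the largest magnitude in the block onto 127.
            let amax = chunk.iter().fold(0.0f32, |m, v| m.max(v.abs()));
            let d = amax / 127.0;
            let id = if d != 0.0 { 1.0 / d } else { 0.0 };
            let mut qs = [0i8; QK8_0];
            for (q, v) in qs.iter_mut().zip(chunk.iter()) {
                *q = (v * id).round() as i8;
            }
            BlockQ8_0 { d, qs }
        })
        .collect()
}

/// Dequantize blocks back into f32 values.
pub fn dequantize_row_q8_0(blocks: &[BlockQ8_0]) -> Vec<f32> {
    blocks
        .iter()
        .flat_map(|b| b.qs.iter().map(move |&q| q as f32 * b.d))
        .collect()
}
```

Loops of this form are simple enough for the compiler to vectorize on its own, which fits the point above that these paths do not need hand-written platform code.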