BLAS support for M1 ARM64 via Apple's Accelerate

The default BLAS Julia uses is OpenBLAS. Apple's M1 has proprietary, dedicated matrix hardware that is only accessible via Apple's Accelerate BLAS implementation. That proprietary interface can provide 2x to 4x speedups for some linear algebra use cases (see https://discourse.julialang.org/t/does-mac-m1-in-multithreads-is-slower-that-in-single-thread/61114/12?u=kristoffer.carlsson for some benchmarks and discussion).

Since Julia 1.7 there has been a BLAS multiplexer:
* https://github.com/staticfloat/libblastrampoline

(As far as I understand it, this currently still requires its own code for each BLAS, and for M1 there is so far only a minimal shim via [AppleAccelerateLinAlgWrapper.jl](https://github.com/chriselrod/AppleAccelerateLinAlgWrapper.jl).)
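To see which backend the multiplexer is currently forwarding to, something like the following should work on Julia 1.7 or later. It is only a minimal sketch: it inspects the active configuration and does not switch anything.

```julia
using LinearAlgebra

# Ask libblastrampoline which backend libraries it currently forwards to.
config = BLAS.get_config()

for lib in config.loaded_libs
    # `libname` is the path of the loaded library;
    # `interface` is :lp64 (32-bit integers) or :ilp64 (64-bit integers).
    println(basename(lib.libname), " (", lib.interface, ")")
end
```

On a stock Julia build this is expected to list an OpenBLAS library, since that is the default.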

So in theory, it should be possible to extend this so that, depending on the platform, either OpenBLAS or another BLAS is used transparently by default.
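The platform-dependent choice itself could be sketched roughly as below. `preferred_blas` is a hypothetical helper made up for illustration, not an existing Julia or libblastrampoline function, and the sketch only makes the decision. Actually pointing libblastrampoline at Accelerate is the open part of this issue and is not shown.

```julia
# Hypothetical helper (not part of Julia): choose a BLAS backend by platform.
# Keyword arguments default to the running system, but can be overridden
# so the decision logic can be checked on any machine.
function preferred_blas(; apple::Bool = Sys.isapple(), arch::Symbol = Sys.ARCH)
    # Apple Silicon reports itself as macOS on :aarch64.
    return (apple && arch === :aarch64) ? :accelerate : :openblas
end

println("Preferred BLAS on this machine: ", preferred_blas())
```

Everything that is not macOS on ARM64 falls back to `:openblas`, which matches the current default.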

This issue discusses what needs to be done to make Apple's Accelerate, and with it the M1 hardware acceleration, available by default in Julia.

BLAS support for M1 ARM64 via Apple's Accelerate #869
