This project is part of the @thi.ng/umbrella monorepo.
WebAssembly SIMD vector operations for array/batch processing, written in AssemblyScript. These functions use the CPU's vector instructions to process 128bit words at once, which is the equivalent width of a 4D vector with 4x 32bit components. Several of the provided functions can also be used to process 2D vectors.
See /assembly for sources:
abs4_f32
add4_f32
addn4_f32
clamp4_f32
clampn4_f32
div4_f32
(!)divn4_f32
(!)dot2_f32_aos
(2)dot4_f32_aos
dot4_f32_soa
invsqrt4_f32
(!)madd4_f32
maddn4_f32
mag2_f32_aos
mag4_f32_aos
magsq2_f32_aos
magsq4_f32_aos
max4_f32
min4_f32
mix4_f32
mixn4_f32
msub4_f32
msubn4_f32
mul4_f32
muln4_f32
mul_m22v2_aos
(2)mul_m23v2_aos
(2)mul_m44v4_aos
neg4_f32
normalize2_f32_aos
(2)normalize4_f32_aos
sqrt4_f32
(!)sub4_f32
subn4_f32
sum4_f32
swizzle4_32
(!)
(!) Missing native implementation, waiting on...
(2) 2x vec2 per iteration
Also see src/api.ts for documentation about the exposed TS/JS API...
ALPHA - bleeding edge / work-in-progress
The WebAssembly SIMD spec is still WIP and (at the time of writing) only partially implemented and hidden behind feature flags.
- NodeJS (v12.10+):
node --experimental-wasm-simd
- Chrome: Enable SIMD support via chrome://flags
yarn add @thi.ng/simd
Package sizes (gzipped, pre-treeshake): ESM: 2.28 KB / CJS: 2.34 KB / UMD: 2.45 KB
import { init } from "@thi.ng/simd";
// the WASM module doesn't specify any own memory and it must be provided by user
// the returned object contains all available vector functions & memory views
// (an error will be thrown if WASM isn't available or SIMD unsupported)
const simd = init(new WebAssembly.Memory({ initial: 1 }));
// input data: 3x vec4 buffers
const a = simd.f32.subarray(0, 4);
const b = simd.f32.subarray(4, 16);
const out = simd.f32.subarray(16, 18);
a.set([1, 2, 3, 4])
b.set([10, 20, 30, 40, 40, 30, 20, 10]);
// compute dot products: dot(A[i], B[i])
// by using 0 as stride for A, all dot products are using the same vec
simd.dot4_f32_aos(
out.byteOffset, // output addr / pointer
a.byteOffset, // vector A addr
b.byteOffset, // vector B addr
2, // number of vectors to process
1, // output stride (floats)
0, // A stride (floats)
4 // B stride (floats)
);
// results for [dot(a0, b0), dot(a0, b1)]
out
// [300, 200]
// mat4 * vec4 matrix-vector multiplies
const mat = simd.f32.subarray(0, 16);
const points = simd.f32.subarray(16, 24);
// mat4 (col major)
mat.set([
10, 0, 0, 0,
0, 20, 0, 0,
0, 0, 30, 0,
100, 200, 300, 1
]);
// vec4 array
points.set([
1, 2, 3, 1,
4, 5, 6, 1,
]);
simd.mul_m44v4_aos(
points.byteOffset, // output addr / pointer
mat.byteOffset, // mat4 addr
points.byteOffset, // vec4 addr
2, // number of vectors to process
4, // output stride (float)
4 // vec stride (float)
);
// transformed points
points
// [110, 240, 390, 1, 140, 300, 480, 1]
Karsten Schmidt
© 2019 - 2020 Karsten Schmidt // Apache Software License 2.0