simd

This project is part of the @thi.ng/umbrella monorepo.

About

WebAssembly SIMD vector operations for array/batch processing, written in AssemblyScript. These functions use the CPU's vector instructions to process one 128-bit word at a time, which is the width of a 4D vector with four 32-bit components. Several of the provided functions can also be used to process 2D vectors, two per 128-bit word.
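
To make the width relationship concrete, here is a minimal plain TypeScript sketch (no WASM involved). `addVec4Scalar` is a hypothetical stand-in and not part of this package; it only illustrates which four floats a single 128-bit operation would cover per iteration.

```typescript
// One 128-bit SIMD word spans 16 bytes, i.e. four 32-bit floats.
const BYTES_PER_WORD = 128 / 8;
const FLOATS_PER_WORD = BYTES_PER_WORD / Float32Array.BYTES_PER_ELEMENT;

// Hypothetical scalar stand-in (NOT part of @thi.ng/simd): adds `num` vec4s
// component-wise. A SIMD version would replace the inner loop with a single
// 128-bit add per iteration.
function addVec4Scalar(
    out: Float32Array,
    a: Float32Array,
    b: Float32Array,
    num: number
): Float32Array {
    for (let i = 0; i < num; i++) {
        const base = i * FLOATS_PER_WORD;
        for (let j = 0; j < FLOATS_PER_WORD; j++) {
            out[base + j] = a[base + j] + b[base + j];
        }
    }
    return out;
}

// Two vec4s, or equivalently four vec2s packed into the same two words
const wordsA = new Float32Array([1, 2, 3, 4, 5, 6, 7, 8]);
const wordsB = new Float32Array([10, 20, 30, 40, 50, 60, 70, 80]);
const wordsSum = addVec4Scalar(new Float32Array(8), wordsA, wordsB, 2);
```

Because the addition is component-wise, the same two words can just as well be read as four vec2s, which is why some operations also work for 2D vectors.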

Available functions

See /assembly for sources:

  • abs4_f32
  • add4_f32
  • addn4_f32
  • clamp4_f32
  • clampn4_f32
  • div4_f32 (!)
  • divn4_f32 (!)
  • dot2_f32_aos (2)
  • dot4_f32_aos
  • dot4_f32_soa
  • invsqrt4_f32 (!)
  • madd4_f32
  • maddn4_f32
  • mag2_f32_aos
  • mag4_f32_aos
  • magsq2_f32_aos
  • magsq4_f32_aos
  • max4_f32
  • min4_f32
  • mix4_f32
  • mixn4_f32
  • msub4_f32
  • msubn4_f32
  • mul4_f32
  • muln4_f32
  • mul_m22v2_aos (2)
  • mul_m23v2_aos (2)
  • mul_m44v4_aos
  • neg4_f32
  • normalize2_f32_aos (2)
  • normalize4_f32_aos
  • sqrt4_f32 (!)
  • sub4_f32
  • subn4_f32
  • sum4_f32
  • swizzle4_32 (!)

(!) Missing native implementation, waiting on...

(2) 2x vec2 per iteration
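
The `_aos` and `_soa` suffixes are assumed here to follow the common convention of "array of structures" and "structure of arrays" memory layouts. The sketch below is plain TypeScript with hypothetical helper names (`dot4AOS`, `dot4SOA`); it only shows how the two layouts index the same data and does not mirror the package's actual signatures (see src/api.ts for those).

```typescript
// AOS (array of structures): the components of each vector are adjacent
// [x0, y0, z0, w0, x1, y1, z1, w1]
const aos = new Float32Array([1, 2, 3, 4, 5, 6, 7, 8]);

// SOA (structure of arrays): one contiguous run per component
// [x0, x1, y0, y1, z0, z1, w0, w1]
const soa = new Float32Array([1, 5, 2, 6, 3, 7, 4, 8]);

// Hypothetical helper: dot product of the i-th vec4 in two AOS buffers
function dot4AOS(a: Float32Array, b: Float32Array, i: number): number {
    let sum = 0;
    for (let j = 0; j < 4; j++) {
        sum += a[i * 4 + j] * b[i * 4 + j];
    }
    return sum;
}

// Hypothetical helper: dot product of the i-th vec4 in two SOA buffers,
// each holding `num` vectors
function dot4SOA(a: Float32Array, b: Float32Array, i: number, num: number): number {
    let sum = 0;
    for (let j = 0; j < 4; j++) {
        sum += a[j * num + i] * b[j * num + i];
    }
    return sum;
}

// Both layouts describe the same two vectors, so the results agree
const aosDots = [dot4AOS(aos, aos, 0), dot4AOS(aos, aos, 1)];
const soaDots = [dot4SOA(soa, soa, 0, 2), dot4SOA(soa, soa, 1, 2)];
```

In the SOA layout, four consecutive floats hold the same component of four different vectors, so one 128-bit multiply-add can advance four dot products at once.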

Also see src/api.ts for documentation about the exposed TS/JS API...

Status

ALPHA - bleeding edge / work-in-progress

The WebAssembly SIMD spec is still WIP and (at the time of writing) only partially implemented and hidden behind feature flags.

  • NodeJS (v12.10+): node --experimental-wasm-simd
  • Chrome: Enable SIMD support via chrome://flags
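
Whether SIMD is usable in the current runtime can also be probed before initializing the module. The following is a sketch only: `hasWasmSimd` is a hypothetical helper, not part of this package, and the probe bytes assume the opcode numbering of the finalized SIMD proposal, so engines implementing an older draft behind a flag may reject them.

```typescript
// Minimal WASM module: a single function returning a v128 value,
// built from two SIMD instructions.
const SIMD_PROBE = new Uint8Array([
    0, 97, 115, 109, 1, 0, 0, 0, // "\0asm" magic + version 1
    1, 5, 1, 96, 0, 1, 123,      // type section: () => v128
    3, 2, 1, 0,                  // function section: 1 function of type 0
    10, 10, 1, 8, 0,             // code section: 1 body, 8 bytes, no locals
    65, 0,                       // i32.const 0
    253, 15,                     // i8x16.splat
    253, 98,                     // i8x16.popcnt
    11                           // end
]);

// Hypothetical helper: true if the engine accepts the SIMD probe module
function hasWasmSimd(): boolean {
    return typeof WebAssembly === "object" && WebAssembly.validate(SIMD_PROBE);
}

const simdAvailable = hasWasmSimd();
```

`WebAssembly.validate` only checks that the bytes form a valid module and does not instantiate anything, so the probe is cheap and has no side effects.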

Installation

yarn add @thi.ng/simd

Package sizes (gzipped, pre-treeshake): ESM: 2.28 KB / CJS: 2.34 KB / UMD: 2.45 KB

Dependencies

API

Generated API docs

import { init } from "@thi.ng/simd";

// the WASM module does not define its own memory, so the user must provide it
// the returned object contains all available vector functions & memory views
// (an error is thrown if WASM is unavailable or SIMD is unsupported)
const simd = init(new WebAssembly.Memory({ initial: 1 }));

// input data: one vec4 (A), room for three vec4s (B, only two are used here)
// and two output floats
const a = simd.f32.subarray(0, 4);
const b = simd.f32.subarray(4, 16);
const out = simd.f32.subarray(16, 18);

a.set([1, 2, 3, 4]);
b.set([10, 20, 30, 40,  40, 30, 20, 10]);

// compute dot products: dot(A[i], B[i])
// with a stride of 0 for A, every dot product uses the same A vector
simd.dot4_f32_aos(
    out.byteOffset, // output addr / pointer
    a.byteOffset,   // vector A addr
    b.byteOffset,   // vector B addr
    2,              // number of vectors to process
    1,              // output stride (floats)
    0,              // A stride (floats)
    4               // B stride (floats)
);

// results for [dot(a0, b0), dot(a0, b1)]
out
// [300, 200]

// mat4 * vec4 matrix-vector multiplies
const mat = simd.f32.subarray(0, 16);
const points = simd.f32.subarray(16, 24);

// mat4 (col major)
mat.set([
    10, 0, 0, 0,
    0, 20, 0, 0,
    0, 0, 30, 0,
    100, 200, 300, 1
]);

// vec4 array
points.set([
    1, 2, 3, 1,
    4, 5, 6, 1,
]);

simd.mul_m44v4_aos(
    points.byteOffset, // output addr / pointer
    mat.byteOffset,    // mat4 addr
    points.byteOffset, // vec4 addr
    2,                 // number of vectors to process
    4,                 // output stride (floats)
    4                  // vec stride (floats)
);

// transformed points
points
// [110, 240, 390, 1, 140, 300, 480, 1]
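
For checking results without a SIMD-enabled runtime, the two operations above can be approximated with scalar loops. This is a sketch under stated assumptions, not the package's implementation: `dot4Ref` and `mulM44V4Ref` are hypothetical names, they follow the argument order used in the examples, and they take float indices into a single `Float32Array` rather than byte addresses.

```typescript
// Hypothetical scalar reference for a strided vec4 dot product.
// All positions and strides are float indices, not byte offsets.
function dot4Ref(
    mem: Float32Array,
    out: number, a: number, b: number,
    num: number, so: number, sa: number, sb: number
): void {
    for (let i = 0; i < num; i++) {
        let sum = 0;
        for (let j = 0; j < 4; j++) {
            sum += mem[a + j] * mem[b + j];
        }
        mem[out] = sum;
        out += so;
        a += sa; // a stride of 0 re-uses the same vector each iteration
        b += sb;
    }
}

// Hypothetical scalar reference for column-major mat4 * vec4.
// The input vector is read fully before writing, so in-place use is safe.
function mulM44V4Ref(
    mem: Float32Array,
    out: number, mat: number, vec: number,
    num: number, so: number, sv: number
): void {
    for (let i = 0; i < num; i++) {
        const x = mem[vec], y = mem[vec + 1], z = mem[vec + 2], w = mem[vec + 3];
        for (let r = 0; r < 4; r++) {
            mem[out + r] =
                mem[mat + r] * x + mem[mat + 4 + r] * y +
                mem[mat + 8 + r] * z + mem[mat + 12 + r] * w;
        }
        out += so;
        vec += sv;
    }
}

// Same data as in the examples above
const mem = new Float32Array(32);
mem.set([1, 2, 3, 4], 0);
mem.set([10, 20, 30, 40, 40, 30, 20, 10], 4);
dot4Ref(mem, 16, 0, 4, 2, 1, 0, 4);
const dots = Array.from(mem.subarray(16, 18));

mem.set([10, 0, 0, 0, 0, 20, 0, 0, 0, 0, 30, 0, 100, 200, 300, 1], 0);
mem.set([1, 2, 3, 1, 4, 5, 6, 1], 16);
mulM44V4Ref(mem, 16, 0, 16, 2, 4, 4);
const transformed = Array.from(mem.subarray(16, 24));
```

With the inputs from the examples, `dots` and `transformed` hold the same values as the `out` and `points` results shown above.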

Authors

Karsten Schmidt

License

© 2019 - 2020 Karsten Schmidt // Apache Software License 2.0