Skip to content

Files

Failed to load latest commit information.

Latest commit

 Cannot retrieve latest commit at this time.

History

History

simd

@thi.ng/simd

npm version npm downloads Mastodon Follow

This project is part of the @thi.ng/umbrella monorepo.

About

WebAssembly SIMD vector operations for array/batch processing, written in AssemblyScript. These functions use the CPU's vector instructions to process 128bit words at once, which is the equivalent width of a 4D vector with 4x 32bit components. Several of the provided functions can also be used to process 2D vectors.

Available functions

See /assembly for sources:

  • abs4_f32
  • add4_f32
  • addn4_f32
  • clamp4_f32
  • clampn4_f32
  • div4_f32
  • divn4_f32
  • dot2_f32_aos (2)
  • dot4_f32_aos
  • dot4_f32_soa
  • invsqrt4_f32
  • madd4_f32
  • maddn4_f32
  • mag2_f32_aos
  • mag4_f32_aos
  • magsq2_f32_aos
  • magsq4_f32_aos
  • max4_f32
  • min4_f32
  • mix4_f32
  • mixn4_f32
  • msub4_f32
  • msubn4_f32
  • mul4_f32
  • muln4_f32
  • mul_m22v2_aos (2)
  • mul_m23v2_aos (2)
  • mul_m44v4_aos
  • neg4_f32
  • normalize2_f32_aos (2)
  • normalize4_f32_aos
  • sqrt4_f32
  • sub4_f32
  • subn4_f32
  • sum4_f32
  • swizzle4_32 (f32 and u32)

(2) 2x vec2 per iteration

Also see src/api.ts for documentation about the exposed TS/JS API...

Status

ALPHA - bleeding edge / work-in-progress

Search or submit any issues for this package

The WebAssembly SIMD spec is still WIP and (at the time of writing) only partially implemented and hidden behind feature flags. Currently only fully tested (& testable for me) on Node 14.6+.

BREAKING CHANGES

0.4.0

Due to the opcode renumbering of SIMD operations proposed in April 2020, the WASM module will only work on engines released after 2020-05-21 when that change was committed to the WASM spec. For NodeJS this means only v14.6.0 or newer will be supported. This was an external change and outside our control...

Related packages

  • @thi.ng/malloc - ArrayBuffer based malloc() impl for hybrid JS/WASM use cases, based on thi.ng/tinyalloc
  • @thi.ng/soa - SOA & AOS memory mapped structured views with optional & extensible serialization
  • @thi.ng/vectors - Optimized 2d/3d/4d and arbitrary length vector operations
  • @thi.ng/vector-pools - Data structures for managing & working with strided, memory mapped vectors

Installation

yarn add @thi.ng/simd

ES module import:

<script type="module" src="https://cdn.skypack.dev/@thi.ng/simd"></script>

Skypack documentation

For Node.js REPL:

const simd = await import("@thi.ng/simd");

Package sizes (brotli'd, pre-treeshake): ESM: 2.16 KB

Dependencies

API

Generated API docs

import { init } from "@thi.ng/simd";

// the WASM module doesn't specify any own memory and it must be provided by user
// the returned object contains all available vector functions & memory views
// (an error will be thrown if WASM isn't available or SIMD unsupported)
const simd = init(new WebAssembly.Memory({ initial: 1 }));

// input data: 3x vec4 buffers
const a = simd.f32.subarray(0, 4);
const b = simd.f32.subarray(4, 16);
const out = simd.f32.subarray(16, 18);

a.set([1, 2, 3, 4])
b.set([10, 20, 30, 40,  40, 30, 20, 10]);

// compute dot products: dot(A[i], B[i])
// by using 0 as stride for A, all dot products are using the same vec
simd.dot4_f32_aos(
    out.byteOffset, // output addr / pointer
    a.byteOffset,   // vector A addr
    b.byteOffset,   // vector B addr
    2,              // number of vectors to process
    1,              // output stride (floats)
    0,              // A stride (floats)
    4               // B stride (floats)
);

// results for [dot(a0, b0), dot(a0, b1)]
out
// [300, 200]

// mat4 * vec4 matrix-vector multiplies
const mat = simd.f32.subarray(0, 16);
const points = simd.f32.subarray(16, 24);

// mat4 (col major)
mat.set([
    10, 0, 0, 0,
    0, 20, 0, 0,
    0, 0, 30, 0,
    100, 200, 300, 1
]);

// vec4 array
points.set([
    1, 2, 3, 1,
    4, 5, 6, 1,
]);

simd.mul_m44v4_aos(
    points.byteOffset, // output addr / pointer
    mat.byteOffset,    // mat4 addr
    points.byteOffset, // vec4 addr
    2,                 // number of vectors to process
    4,                 // output stride (float)
    4                  // vec stride (float)
);

// transformed points
points
// [110, 240, 390, 1, 140, 300, 480, 1]

Authors

If this project contributes to an academic publication, please cite it as:

@misc{thing-simd,
  title = "@thi.ng/simd",
  author = "Karsten Schmidt",
  note = "https://thi.ng/simd",
  year = 2019
}

License

© 2019 - 2023 Karsten Schmidt // Apache License 2.0