Skip to content

Commit

Permalink
feat(leb128): add bigint support
Browse files Browse the repository at this point in the history
BREAKING CHANGE: update decode return type

- update all encode to accept bigint or number, cast to correct u64/i64 range
- update decode result to [bigint, number] tuple
- rebuild binary (~400 bytes smaller)
- move zig source to /zig
- update tests
- update readme
  • Loading branch information
postspectacular committed Dec 2, 2022
1 parent 56925a8 commit 0440c34
Show file tree
Hide file tree
Showing 8 changed files with 103 additions and 65 deletions.
25 changes: 17 additions & 8 deletions packages/leb128/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@ This project is part of the
[@thi.ng/umbrella](https://github.com/thi-ng/umbrella/) monorepo.

- [About](#about)
- [Breaking changes](#breaking-changes)
- [Status](#status)
- [Installation](#installation)
- [Dependencies](#dependencies)
Expand All @@ -22,14 +23,13 @@ This project is part of the

WASM based [Little Endian Base
128](https://en.wikipedia.org/wiki/LEB128) varint encoding / decoding,
supporting (u)int64 range (however for JS purposes only up to
`MAX_SAFE_INTEGER`).
supporting the full (u)int64 range.

The WASM binary (~860 bytes) is embedded as base64 string in the
TypeScript source to make it easier to use in both browser & node
environments. The source code of the actual implementation (written in
[Zig](https://ziglang.org)) is included in
[/src/leb128.zig](https://github.com/thi-ng/umbrella/tree/develop/packages/leb128/src/leb128.zig)
[/zig/leb128.zig](https://github.com/thi-ng/umbrella/tree/develop/packages/leb128/zig/leb128.zig)

All public functions throw an error if the WASM module could not be
initialized.
Expand All @@ -39,6 +39,15 @@ References:
- https://en.wikipedia.org/wiki/LEB128
- http://webassembly.github.io/spec/core/binary/values.html#integers

## Breaking changes

v3.0.0 introduces JS `bigint` support and both decode functions return a tuple
of `[bigint, number]` with the `bigint` being the decoded value and the 2nd item
the number of bytes consumed. Simarly, the encode functions now accept a JS
number or bigint arg.

Furthermore, all values to be encoded/decoded are cast to i64/u64 range now.

## Status

**STABLE** - used in production
Expand Down Expand Up @@ -68,7 +77,7 @@ node --experimental-repl-await
> const leb128 = await import("@thi.ng/leb128");
```

Package sizes (brotli'd, pre-treeshake): ESM: 1.04 KB
Package sizes (brotli'd, pre-treeshake): ESM: 876 bytes

## Dependencies

Expand All @@ -87,16 +96,16 @@ import * as leb from "@thi.ng/leb128";
enc = leb.encodeULEB128(Number.MAX_SAFE_INTEGER);
// Uint8Array [ 255, 255, 255, 255, 255, 255, 255, 15 ]

// decoding returns tuple of [value, bytes consumed]
// decoding returns tuple of [value (bigint), bytes consumed]
leb.decodeULEB128(enc);
// [ 9007199254740991, 8 ]
// [ 9007199254740991n, 8 ]

// encode signed int
enc = leb.encodeSLEB128(Number.MIN_SAFE_INTEGER)
// Uint8Array [ 129, 128, 128, 128, 128, 128, 128, 112 ]

leb.decodeSLEB128(enc)
// [ -9007199254740991, 8 ]
// [ -9007199254740991n, 8 ]
```

## Building the binary
Expand All @@ -111,7 +120,7 @@ Requirements:
brew install zig binaryen

# first run native tests
zig test packages/leb128/src/leb128.zig
zig test zig/leb128.zig
# Test 1/2 min safe integer...OK
# Test 2/2 max safe integer...OK
# All tests passed.
Expand Down
1 change: 1 addition & 0 deletions packages/leb128/package.json
Original file line number Diff line number Diff line change
Expand Up @@ -58,6 +58,7 @@
"varint",
"wasm",
"webassembly",
"zig",
"ziglang"
],
"publishConfig": {
Expand Down
5 changes: 2 additions & 3 deletions packages/leb128/src/binary.ts
Original file line number Diff line number Diff line change
@@ -1,8 +1,7 @@
// thing:no-export
/**
* Generated @ 2021-09-27T18:47:14Z
* Generated @ 2022-12-02T12:00:53Z
*
* @internal
*/
export const BINARY =
"AGFzbQEAAAABDQNgAXwBf2AAAXxgAAADBgUCAAEAAQUDAQACBioHfwBBgAgLfwBBgAgLfwBBiggLfwBBgAgLfwBBkIgEC38AQQALfwBBAQsH0QENBm1lbW9yeQIAEV9fd2FzbV9jYWxsX2N0b3JzAAASbGViMTI4X2VuY29kZV91X2pzAAEDYnVmAwASbGViMTI4X2RlY29kZV91X2pzAAISbGViMTI4X2VuY29kZV9zX2pzAAMSbGViMTI4X2RlY29kZV9zX2pzAAQMX19kc29faGFuZGxlAwEKX19kYXRhX2VuZAMCDV9fZ2xvYmFsX2Jhc2UDAwtfX2hlYXBfYmFzZQMEDV9fbWVtb3J5X2Jhc2UDBQxfX3RhYmxlX2Jhc2UDBgqWBAUDAAELegICfwF+AkACfiAARAAAAAAAAPBDYyAARAAAAAAAAAAAZnEEQCAAsQwBC0IACyIDQoABWgRAA0AgAUGACGogA6dB/wBxIANCB4giA0IAUiICQQd0cjoAACABQQFqIQEgAg0ACwwBC0GACCADPAAAQQEhAQsgAUH/AXELWwIDfwJ+QXYhAANAAkAgAEUEQEEKIQEMAQsgAUEBaiEBIABBighqLAAAIgJB/wBxrSADhiAEhCEEIABBAWohACADQgd8IQMgAkEASA0BCwtBgAggAToAACAEugu7AQIBfgR/AkACfiAAmUQAAAAAAADgQ2MEQCAAsAwBC0KAgICAgICAgIB/CyIBQkB9QoABWgRAQQEhAwNAIANFDQIgAaciA0HAAHEhBAJ/QgEgAUIHhyIBIAQbUEUEQCADQYB/ciEFQQEgBEUgAUJ/UnINARoLIANB/wBxIQVBAAshAyACQYAIaiAFOgAAIAJBAWohAgwACwALQYAIIAFCOYinQcAAcSABp0E/cXI6AABBASECCyACQf8BcQt8AgN/A35BfyEAA0ACQCADQgd8IQUgAEGBCGotAAAiAkEYdEEYdSEBIAJB/wBxrSADhiAEhCEEIABBAWoiAEEISw0AIAUhAyABQQBIDQELC0GACCAAQQFqOgAAIARCfyAFhkIAIAFBwABxQQZ2G0IAIABB/wFxQQlJG4S5CwAaCXByb2R1Y2VycwEIbGFuZ3VhZ2UBA0M5OQA=";
export const BINARY = "AGFzbQEAAAABCgJgAX4Bf2AAAX4DBQQAAQABBQMBABEGCQF/AEGAgMAACwdYBgZtZW1vcnkCAA9sZWIxMjhFbmNvZGVVNjQAAANidWYDAA9sZWIxMjhEZWNvZGVVNjQAAQ9sZWIxMjhFbmNvZGVJNjQAAg9sZWIxMjhEZWNvZGVJNjQAAwrBAwRbAQJ/AkAgAEKAAVoEQANAIAFBgIBAayAAp0H/AHEgAEL/AFZBB3RyOgAAIAFBAWohASAAQoABVCAAQgeIIQBFDQALDAELQYCAwAAgADwAAEEBIQELIAFB/wFxC1ECA38CfgNAAkAgAEEBaiECIABBgIBAaywAACIBQf8Aca0gA4YgBIQhBCABQQBODQAgA0IHfCEDIABBCUkgAiEADQELC0GAgMAAIAI6AAAgBAuRAQEDfwJAIABCQH1CgAFaBEBBASECA0AgAkEBcUUNAiABQYCAQGtBAEGAfyAApyICQcAAcSIDRSAAQoABVHEgA0EGdiAAQgeHIgBCf1FxciIDGyACQf8AcXI6AAAgA0UhAiABQQFqIQEMAAsAC0GAgMAAIABCOYinQcAAcSAAp0E/cXI6AABBASEBCyABQf8BcQt+AgN/A35BfyEAA0ACQCAAQQFqIQEgA0IHfCEFIABBgYDAAGotAAAiAEH/AHGtIAOGIASEIQQgAMAiAkEATg0AIAEhACAFIQMgAUEJSQ0BCwtBgIDAACABQQFqOgAAIARCfyAFhkIAIAJBwABxQQZ2G0IAIAFB/wFxQQlJG4QL";
34 changes: 21 additions & 13 deletions packages/leb128/src/index.ts
Original file line number Diff line number Diff line change
Expand Up @@ -6,10 +6,10 @@ import { BINARY } from "./binary.js";
interface LEB128 {
memory: WebAssembly.Memory;
buf: number;
leb128_encode_u_js(x: number): number;
leb128_decode_u_js(buf: number, num: number): number;
leb128_encode_s_js(x: number): number;
leb128_decode_s_js(buf: number, num: number): number;
leb128EncodeU64(x: bigint | number): number;
leb128DecodeU64(buf: bigint | number, num: number): bigint;
leb128EncodeI64(x: bigint | number): number;
leb128DecodeI64(buf: bigint | number, num: number): bigint;
}

let wasm: LEB128;
Expand All @@ -27,17 +27,25 @@ if (hasWASM()) {
const ensureWASM = () => !wasm && unsupported("WASM module unavailable");

const encode =
(op: "leb128_encode_s_js" | "leb128_encode_u_js") => (x: number) => {
(op: "leb128EncodeI64" | "leb128EncodeU64", signed: boolean) =>
(x: bigint | number) => {
ensureWASM();
return U8.slice(0, wasm[op](x));
const value = signed
? BigInt.asIntN(64, BigInt(x))
: BigInt.asUintN(64, BigInt(x));
return U8.slice(0, wasm[op](value));
};

const decode =
(op: "leb128_decode_s_js" | "leb128_decode_u_js") =>
(src: Uint8Array, idx = 0) => {
(op: "leb128DecodeI64" | "leb128DecodeU64", signed: boolean) =>
(src: Uint8Array, idx = 0): [bigint, number] => {
ensureWASM();
U8.set(src.subarray(idx, Math.min(idx + 10, src.length)), 0);
return [wasm[op](0, 0), U8[0]];
const value = wasm[op](0, 0);
return [
signed ? BigInt.asIntN(64, value) : BigInt.asUintN(64, value),
U8[0],
];
};

/**
Expand All @@ -46,7 +54,7 @@ const decode =
*
* @param x -
*/
export const encodeSLEB128 = encode("leb128_encode_s_js");
export const encodeSLEB128 = encode("leb128EncodeI64", true);

/**
* Takes Uint8Array with LEB128 encoded signed varint and an optional
Expand All @@ -56,15 +64,15 @@ export const encodeSLEB128 = encode("leb128_encode_s_js");
* @param src -
* @param idx -
*/
export const decodeSLEB128 = decode("leb128_decode_s_js");
export const decodeSLEB128 = decode("leb128DecodeI64", true);

/**
* Encodes unsigned integer `x` into LEB128 varint format and returns
* encoded bytes. Values < 0 will be encoded as zero.
*
* @param x -
*/
export const encodeULEB128 = encode("leb128_encode_u_js");
export const encodeULEB128 = encode("leb128EncodeU64", false);

/**
* Takes Uint8Array with LEB128 encoded unsigned varint and an optional
Expand All @@ -74,4 +82,4 @@ export const encodeULEB128 = encode("leb128_encode_u_js");
* @param src -
* @param idx -
*/
export const decodeULEB128 = decode("leb128_decode_u_js");
export const decodeULEB128 = decode("leb128DecodeU64", false);
30 changes: 21 additions & 9 deletions packages/leb128/test/index.ts
Original file line number Diff line number Diff line change
Expand Up @@ -17,37 +17,49 @@ if (hasWASM()) {
[255, 255, 255, 255, 255, 255, 255, 15]
);
assert.deepStrictEqual(decodeSLEB128(a), [
Number.MAX_SAFE_INTEGER,
BigInt(Number.MAX_SAFE_INTEGER),
8,
]);
assert.deepStrictEqual(
[...(a = encodeSLEB128(Number.MIN_SAFE_INTEGER))],
[129, 128, 128, 128, 128, 128, 128, 112]
);
assert.deepStrictEqual(decodeSLEB128(a), [
Number.MIN_SAFE_INTEGER,
BigInt(Number.MIN_SAFE_INTEGER),
8,
]);
assert.deepStrictEqual(decodeSLEB128(encodeSLEB128(64)), [64, 2]);
assert.deepStrictEqual(decodeSLEB128(encodeSLEB128(-64)), [-64, 1]);
assert.deepStrictEqual(decodeSLEB128(encodeSLEB128(64)), [
BigInt(64),
2,
]);
assert.deepStrictEqual(decodeSLEB128(encodeSLEB128(-64)), [
BigInt(-64),
1,
]);
},

unsigned: () => {
let a;
let a: Uint8Array;
assert.deepStrictEqual(
[...(a = encodeULEB128(Number.MAX_SAFE_INTEGER))],
[255, 255, 255, 255, 255, 255, 255, 15]
);
assert.deepStrictEqual(decodeULEB128(a), [
Number.MAX_SAFE_INTEGER,
BigInt(Number.MAX_SAFE_INTEGER),
8,
]);
assert.deepStrictEqual(
[...(a = encodeULEB128(Number.MIN_SAFE_INTEGER))],
[0]
[129, 128, 128, 128, 128, 128, 128, 240, 255, 1]
);
assert.deepStrictEqual(decodeULEB128(a), [0, 1]);
assert.deepStrictEqual(decodeULEB128(encodeULEB128(127)), [127, 1]);
assert.deepStrictEqual(decodeULEB128(a), [
BigInt.asUintN(64, BigInt(Number.MIN_SAFE_INTEGER)),
10,
]);
assert.deepStrictEqual(decodeULEB128(encodeULEB128(127)), [
BigInt(127),
1,
]);
},
});
} else {
Expand Down
2 changes: 1 addition & 1 deletion packages/leb128/tools/build-binary.sh
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
#!/bin/sh
zig build-lib -target wasm32-freestanding -dynamic -O ReleaseSmall --strip src/leb128.zig
zig build-lib -target wasm32-freestanding -dynamic -O ReleaseSmall zig/leb128.zig

# apply binaryen optimizer
wasm-opt leb128.wasm -o opt.wasm -Oz
Expand Down
22 changes: 15 additions & 7 deletions packages/leb128/tpl.readme.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,14 +13,13 @@ This project is part of the

WASM based [Little Endian Base
128](https://en.wikipedia.org/wiki/LEB128) varint encoding / decoding,
supporting (u)int64 range (however for JS purposes only up to
`MAX_SAFE_INTEGER`).
supporting the full (u)int64 range.

The WASM binary (~860 bytes) is embedded as base64 string in the
TypeScript source to make it easier to use in both browser & node
environments. The source code of the actual implementation (written in
[Zig](https://ziglang.org)) is included in
[/src/leb128.zig](https://github.com/thi-ng/umbrella/tree/develop/packages/leb128/src/leb128.zig)
[/zig/leb128.zig](https://github.com/thi-ng/umbrella/tree/develop/packages/leb128/zig/leb128.zig)

All public functions throw an error if the WASM module could not be
initialized.
Expand All @@ -30,6 +29,15 @@ References:
- https://en.wikipedia.org/wiki/LEB128
- http://webassembly.github.io/spec/core/binary/values.html#integers

## Breaking changes

v3.0.0 introduces JS `bigint` support and both decode functions return a tuple
of `[bigint, number]` with the `bigint` being the decoded value and the 2nd item
the number of bytes consumed. Simarly, the encode functions now accept a JS
number or bigint arg.

Furthermore, all values to be encoded/decoded are cast to i64/u64 range now.

${status}

${supportPackages}
Expand Down Expand Up @@ -61,16 +69,16 @@ import * as leb from "@thi.ng/leb128";
enc = leb.encodeULEB128(Number.MAX_SAFE_INTEGER);
// Uint8Array [ 255, 255, 255, 255, 255, 255, 255, 15 ]

// decoding returns tuple of [value, bytes consumed]
// decoding returns tuple of [value (bigint), bytes consumed]
leb.decodeULEB128(enc);
// [ 9007199254740991, 8 ]
// [ 9007199254740991n, 8 ]

// encode signed int
enc = leb.encodeSLEB128(Number.MIN_SAFE_INTEGER)
// Uint8Array [ 129, 128, 128, 128, 128, 128, 128, 112 ]

leb.decodeSLEB128(enc)
// [ -9007199254740991, 8 ]
// [ -9007199254740991n, 8 ]
```

## Building the binary
Expand All @@ -85,7 +93,7 @@ Requirements:
brew install zig binaryen

# first run native tests
zig test packages/leb128/src/leb128.zig
zig test zig/leb128.zig
# Test 1/2 min safe integer...OK
# Test 2/2 max safe integer...OK
# All tests passed.
Expand Down
Loading

0 comments on commit 0440c34

Please sign in to comment.