Description
The linear memory associated with a WebAssembly instance is a contiguous, byte-addressable range of memory. In the MVP, each module or instance can have only one memory associated with it; this memory, at index zero, is the default memory.
The need for finer-grained control of memory has been on the cards since the early days of WebAssembly, and some functionality is also described in the future features document.
Motivation
- The primary motivation for this proposal is to reduce the impact of copying into Wasm memory. Compared with native applications, WebCodecs-based and ML-based applications in the browser incur a non-negligible performance overhead due to the number of copies that need to be made across the pipeline (source: WebCodecs, WebGPU, Machine learning on the Web 1, 2).
- Widely used WebAssembly engines currently reserve large chunks of memory upfront to ensure that memory.grow can be efficient, and we can obtain a large linear contiguous chunk of memory that can be grown in place. While this works in the more general cases, it can be a very difficult requirement for systems that are memory constrained. In some cases it is possible that applications don’t use the memory that an engine implicitly reserves, so having some way to explicitly tell the OS that some reserved memory can be released would mean that applications have better control over their memory requirements. This was also previously discussed as an addition to the MVP, and more recently as an option for better memory management.
- Ensuring that some data can also be read-only is useful for applications that want to provide a read-only API to inspect the memory, or for media use cases that want to disallow clients from manipulating buffers that the decoder might still be using. From a security perspective, read-only memory provides stronger guarantees for constant data and for sensitive data; these guarantees are currently absent because all of the memory is writable.
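The reserve-up-front pattern described in the second bullet above can be sketched in C on a POSIX-like system. This is only an illustration of the general technique, not how any particular engine implements it: the `linear_memory` struct and the `lm_*` helpers are hypothetical names, and the sketch assumes the system page size divides the 64 KiB Wasm page size.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

#define WASM_PAGE_BYTES 65536u /* Wasm page size: 64 KiB */

/* Hypothetical engine-side bookkeeping for one linear memory. */
typedef struct {
    unsigned char *base; /* start of the reserved range */
    size_t reserved;     /* bytes of address space reserved up front */
    size_t committed;    /* bytes currently accessible to the module */
} linear_memory;

/* Reserve address space only: PROT_NONE pages cannot be read or written. */
int lm_reserve(linear_memory *m, size_t reserve_bytes) {
    void *p = mmap(NULL, reserve_bytes, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return -1;
    m->base = p;
    m->reserved = reserve_bytes;
    m->committed = 0;
    return 0;
}

/* Grow in place: make the next pages accessible; the base never moves. */
int lm_grow(linear_memory *m, size_t delta_pages) {
    size_t avail_pages = (m->reserved - m->committed) / WASM_PAGE_BYTES;
    if (delta_pages > avail_pages) return -1;
    size_t delta = delta_pages * WASM_PAGE_BYTES;
    if (mprotect(m->base + m->committed, delta,
                 PROT_READ | PROT_WRITE) != 0) return -1;
    m->committed += delta;
    return 0;
}

/* Give the whole reservation back to the OS. */
void lm_release(linear_memory *m) {
    munmap(m->base, m->reserved);
    m->base = NULL;
    m->reserved = 0;
    m->committed = 0;
}
```

The cost the bullet points at is visible in `lm_reserve`: the whole range is claimed even if `lm_grow` is never called for most of it, and nothing in this scheme lets the application hand unused pages back.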
Proposed changes
At a high level, this proposal aims to introduce the functionality of the instructions below:
- memory.map: Provide the functionality of mmap(addr, length, PROT_READ|PROT_WRITE, MAP_FIXED, fd) on POSIX, and MapViewOfFile on Windows with access FILE_MAP_READ/FILE_MAP_WRITE.
- memory.unmap: Provide the functionality of POSIX munmap(addr, length), and UnmapViewOfFile(lpBaseAddress) on Windows.
- memory.protect: Provide the functionality of mprotect with PROT_READ/PROT_WRITE permissions, and VirtualProtect on Windows with memory protection constants PAGE_READONLY and PAGE_READWRITE.
- memory.discard: Provide the functionality of madvise(MADV_DONTNEED) and VirtualFree(MEM_DECOMMIT); VirtualAlloc(MEM_COMMIT) on Windows.
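As a rough illustration of the POSIX calls that memory.map, memory.protect and memory.unmap are modelled on, the sketch below maps a temporary file over a previously reserved range, makes it read-only, and unmaps it. The function name `map_protect_unmap_demo` is hypothetical, the temporary file merely stands in for external data, and `MAP_SHARED` is an assumption added here because `mmap` requires either `MAP_SHARED` or `MAP_PRIVATE` alongside `MAP_FIXED`.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: returns 0 on success, -1 on any failure. */
int map_protect_unmap_demo(void) {
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0) return -1;
    size_t len = (size_t)page;

    /* Stand-in for external data: a one-page temporary file. */
    FILE *tmp = tmpfile();
    if (tmp == NULL) return -1;
    int fd = fileno(tmp);
    if (ftruncate(fd, (off_t)len) != 0) { fclose(tmp); return -1; }

    /* Reserve a range, as an engine might for a non-default memory. */
    unsigned char *base = mmap(NULL, len, PROT_NONE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) { fclose(tmp); return -1; }

    /* memory.map analogue: place the file at a fixed address. */
    unsigned char *view = mmap(base, len, PROT_READ | PROT_WRITE,
                               MAP_SHARED | MAP_FIXED, fd, 0);
    if (view == MAP_FAILED) { munmap(base, len); fclose(tmp); return -1; }
    memcpy(view, "wasm", 4);

    /* memory.protect analogue: drop write permission. */
    if (mprotect(view, len, PROT_READ) != 0) {
        munmap(view, len);
        fclose(tmp);
        return -1;
    }
    int ok = (memcmp(view, "wasm", 4) == 0); /* reads still succeed */

    /* memory.unmap analogue: remove the mapping. */
    if (munmap(view, len) != 0) ok = 0;
    fclose(tmp);
    return ok ? 0 : -1;
}
```

A write to `view` after the `mprotect` call would fault, which is the behaviour a read-only Wasm memory would need to surface as a trap.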
Some options for next steps are outlined below; the instruction semantics will depend on the option chosen. The intent is to pick the option that introduces the least overhead when mapping external memory into the Wasm memory space. Both options below assume that additional memories apart from the default memory will be available. The current proposal will only introduce memory.discard to work on the default memory; the other three instructions will only operate on memories not at index zero.
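The effect that memory.discard is modelled on can be sketched with `madvise(MADV_DONTNEED)`. The sketch assumes Linux, where a private anonymous mapping reads back as zero-filled pages after the call; other systems may treat the advice differently. The function name `discard_demo` is hypothetical.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical demo: returns the first byte read back after the discard,
 * or -1 on failure. */
int discard_demo(void) {
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0) return -1;
    size_t len = (size_t)page * 4;

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return -1;

    /* Touch every page so physical memory is actually committed. */
    memset(p, 0xAB, len);

    /* Tell the OS the contents are no longer needed. The range stays
     * mapped, so later accesses do not fault. */
    if (madvise(p, len, MADV_DONTNEED) != 0) {
        munmap(p, len);
        return -1;
    }

    /* Assumption (Linux, private anonymous mapping): reads as zero. */
    int first = p[0];
    munmap(p, len);
    return first;
}
```

The address range remains valid throughout, which is why this operation can apply to the default memory without changing its size or base.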
Option 1: Statically declared memories, with bind/unbind APIs (preferred)
- Extend the multi-memory proposal with a JS API that enables access to memories other than the default memory.
- The instructions outlined above will take a static argument for memory index.
- Introduce new JS API views to bind/unbind JSArrays that call memory.map/memory.unmap underneath. (Note: it may be possible for some browser engines to operate on the same backing store without an explicit map/unmap instruction. If the only use case for these instructions is from JS, it is possible to make these API-only as needed.)
- Extend memtype to store memory protections in addition to limits for size ranges.
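One way to picture Option 1 is as a validation-time check on a static memory index plus a memtype that carries protection bits. The sketch below is purely illustrative: every name in it (`memtype_ext`, `MEMPROT_*`, `validate_mem_op`, `store_allowed`) is hypothetical, and it assumes a reading of the text above in which memory.discard may target any declared memory while the other three instructions reject index zero.

```c
#include <stdbool.h>
#include <stdint.h>

#define MEMPROT_READ  1u /* hypothetical protection bits */
#define MEMPROT_WRITE 2u

typedef struct {
    uint32_t min_pages;
    uint32_t max_pages;
    bool has_max;
} limits;

/* memtype extended with protections in addition to limits. */
typedef struct {
    limits lim;
    uint8_t prot;
} memtype_ext;

typedef enum { OP_MAP, OP_UNMAP, OP_PROTECT, OP_DISCARD } mem_op;

/* Validate an instruction's static memory index against the number of
 * memories the module declares. */
bool validate_mem_op(uint32_t declared_memories, mem_op op, uint32_t memidx) {
    if (memidx >= declared_memories) return false;
    if (op == OP_DISCARD) return true; /* assumed valid on any memory */
    return memidx != 0;                /* map/unmap/protect skip memory 0 */
}

/* A store would be permitted only if the memtype allows writes. */
bool store_allowed(const memtype_ext *mt) {
    return (mt->prot & MEMPROT_WRITE) != 0;
}
```

Because the index is static, a check like this could run once at validation time rather than on every execution of the instruction.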
Reasons for preferring this approach:
- Having a statically known number of memories ahead of time may be useful for optimizing engine implementations
- Based on a limited survey, applications do not appear to require a large number of additional memories; a single-digit number of extra memories may be sufficient for most cases. (If there are applications that would benefit from fully first-class memories, please respond here, or send them my way.)
- Incremental addition over existing proposals.
Option 2: First class WebAssembly memories
This is the more elegant approach to dynamically add memories, but adding support for first class memories is non-trivial.
- Introduce the notion of a generic memory ref, ref.mem.
- Introduce a new class of instructions to add, remove and manipulate memory references.
- Extend existing instructions that take a memarg to use memory references.
- The instructions outlined above will need an argument for a memory reference.
- JS API extensions for the instructions mentioned above.
Other alternatives
Why not just map/unmap to the single linear memory, or memory(0)?
- I'm not sure that this can be done in any way that can still be compatible with the performance guarantees for the current memory. At minimum, I expect that more memory accesses would need to be bounds checked, and write protections would also add extra overhead.
- Generalizing what this would need to look like, we need to store granular page level details for the memory which complicates the engine implementations, especially because engines currently assume that Wasm owns the default memory, and have tricks in place to make this work in a performant and secure way (the use of guard pages for example).
- Leaving the default Wasm memory space unaffected maintains backwards compatibility.
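The overhead concern in the bullets above can be made concrete with a software model of the two checks. This is a hedged sketch, not a description of any engine: real engines often avoid explicit bounds checks altogether by using guard pages, and the `paged_memory` struct and both function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define WASM_PAGE_SIZE 65536u /* 64 KiB */

#define PERM_READ  1u
#define PERM_WRITE 2u

/* Hypothetical memory with one permission byte per Wasm page. */
typedef struct {
    uint8_t *base;
    size_t size_bytes;
    const uint8_t *page_perms;
} paged_memory;

/* Today's model: one range check covers the whole access. */
bool can_store_flat(size_t size_bytes, size_t addr, size_t n) {
    return n <= size_bytes && addr <= size_bytes - n;
}

/* With mappings inside memory 0: the same range check, plus a
 * permission lookup for every page the access touches. */
bool can_store_paged(const paged_memory *m, size_t addr, size_t n) {
    if (!can_store_flat(m->size_bytes, addr, n)) return false;
    if (n == 0) return true;
    size_t first = addr / WASM_PAGE_SIZE;
    size_t last = (addr + n - 1) / WASM_PAGE_SIZE;
    for (size_t p = first; p <= last; p++) {
        if ((m->page_perms[p] & PERM_WRITE) == 0) return false;
    }
    return true;
}
```

In the paged variant, a store that straddles a page boundary needs two table lookups, and the table itself is extra per-memory state the engine has to keep consistent with the underlying OS mappings.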
Web API extensions
One way to support WebAssembly owning the memory while also achieving zero-copy data transfer is to extend Web APIs to take typed array views as input parameters into which outputs are written. The advantage here is that the set of APIs that need this can be scaled incrementally with time, and it minimizes the changes to the WebAssembly spec.
The disadvantages are that this would require changes to multiple Web APIs across different standards organizations, and it’s not clear that the churn here would result in a better data transfer story, as some APIs will still need to copy out.
This summarizes a discussion from the previous issue, in which this approach was discussed in more detail.
Using GC Arrays
Though the GC proposal is still in phase 1, it is very probable that ArrayBuffers will be passed back and forth between JS and Wasm. Currently this proposal makes no assumptions about functionality that is not already available; when GC arrays become available, the overhead they introduce will be evaluated with benchmarks. If at that time the mapping functionality is provided by the GC proposal without much overhead, and it makes sense to introduce a dependency on the GC proposal, this proposal will be scoped to the remaining functionality outlined above.
JS API
Interaction of this proposal with JS is somewhat tricky because:
- WebAssembly memory can be exported as an ArrayBuffer, or as a SharedArrayBuffer if the memory is shared, but ArrayBuffers have no notion of read-only protection. There are proposals in flight that explore such restricted ArrayBuffers (1, 2); when this is standardized in JS, WebAssembly memory that is read-only, either through a map-for-read mapping or by being protected to read-only, can be exposed to JS.
- Multiple ArrayBuffers cannot alias the same backing store unless a SharedArrayBuffer is being used. One option would be for the BufferObject to return the reference to the existing JS ArrayBuffer. Alternatively, a restriction could be imposed to only use SharedArrayBuffers when mapping memory, but this would also have trickle-down effects on Web APIs.
- Detailed investigation is needed into whether growing memory is feasible for memory-mapped buffers. What restrictions should be in place when interacting with resizable/non-resizable buffers?
Open questions
Consistent implementation across platforms
The functions provided above only include Windows 8+ details. Chrome still supports Windows 7 for critical security issues, but only until Jan 2023, so for now this proposal will only focus on Windows system calls available on Windows 8+. Any consideration of older Windows users will depend on usage stats of the interested engines.
How would this work in the tools?
While dynamically adding/removing memories is a key use case, C/C++/Rust programs operate in a single address space, and library code assumes that it has full access to that address space and can access any memory. With multiple memories, we are introducing separate address spaces, so it’s not clear what overhead we would be introducing.
Similarly, read-only memory is not easy to differentiate in the current model when all the data is in a single read-write memory.
How does this work in the presence of multiple threads?
In applications that use multiple threads, what calls are guaranteed to be atomic? On the JS side, what guarantees can we provide for Typed array views?
Feedback requested
All feedback is welcome, but specific feedback that I would find useful for this issue:
- The use cases detailed in the motivation section are the ones that I will be currently focusing on. If there are other use cases that would benefit from a proposal along these lines I’d be interested to evaluate them as well.
- At the moment I’ve added high-level details for Option 1 and Option 2 and will continue to evaluate them in more depth, but I would be interested in feedback on both options as well as thoughts on the alternatives mentioned above.
Repository link here if filing issues is more convenient.