Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make SegmentedList thread-safe and lock-free by allocating all segments statically #20491

Open
andrewrk opened this issue Jul 3, 2024 · 0 comments
Labels
breaking Implementing this issue could cause existing code to no longer compile or have different behavior. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. standard library This issue involves writing Zig code for the standard library.
Milestone

Comments

@andrewrk
Copy link
Member

andrewrk commented Jul 3, 2024

Motivation

SegmentedList has turned out to be a useful abstraction when trying to mix Data-Oriented Design with multi-threading. It provides contiguous arrays which is useful for serialization, cache coherency, and ability to use indexes as alternative to pointers, while preserving stable element pointers so that multiple threads can independently do work on different elements within this list, even when more elements are added concurrently.

However, it has a big flaw currently: reallocation of the dynamic segments is not thread-safe. In particular, accessing any element by index races with insertions to the data structure, because a resize my cause the dynamic segments to be invalid while being read.

Proposed Changes

Always allocate all the segments statically:

pub fn SegmentedList(comptime Elem: type, comptime Len: type) type {
    return struct {
        const cache_line_size = 64;
        const first_segment_len = @max(1, cache_line_size / @sizeOf(Elem));
        const segments_buffer_len = @typeInfo(Len).Int.bits - @log2(first_segment_len) - 1;

        segments_buffer: [segments_buffer_len][*]T,
        len: Len,

        // ...
    };
}

This idea allows specifying the upper bound number of elements via providing the length type. For example, when Elem is a pointer and Len is u32, it will resolve to this:

segments_buffer: [28][*]T,
len: u32,

Smoke test:

>>> sum([(8 << x) for x in range(28,-1,-1)])
4294967288
>>> 1 << 32
4294967296

On 64-bit systems, the size of this struct is 232 bytes. Compared to ArrayListUnmanaged (24 bytes) and ArrayHashMapUnmanaged (32 bytes), this proposed memory layout requires about 7x more overhead.

In exchange, we get a thread-safe data structure that can be freely accessed and appended to in a lock-free manner. For instance, allocating a new dynamic segment (which is rare) can work like this:

  • cmpxchg to update the next dynamic segment (failure means another thread won the race; continue)
  • cmpxchg to increment the len (failure means another thread won the race; continue)

These changes are safe to perform while accessing any previous elements of the data structure by index because the segments are never resized.

Real-World Use Case

There is a direct need for this in the compiler, which uses a thread pool for AstGen work. Worker threads need to be able to create File instances and operate on File pointers.

Currently File instances are allocated individually with GeneralPurposeAllocator. This means that serializing File instances for incremental compilation purposes requires a lot of pointer chasing. On the other hand, if the data were stored in a SegmentedList, then an array of one iovec per segment could be passed to writev in order to serialize this data.

@andrewrk andrewrk added breaking Implementing this issue could cause existing code to no longer compile or have different behavior. standard library This issue involves writing Zig code for the standard library. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. labels Jul 3, 2024
@andrewrk andrewrk added this to the 0.14.0 milestone Jul 3, 2024
@andrewrk andrewrk changed the title Rework SegmentedList API to be thread-safe and avoid make SegmentedList thread-safe and lock-free by allocating all segments statically Jul 3, 2024
@andrewrk andrewrk modified the milestones: 0.14.0, 0.16.0 Jul 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Implementing this issue could cause existing code to no longer compile or have different behavior. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. standard library This issue involves writing Zig code for the standard library.
Projects
None yet
Development

No branches or pull requests

1 participant