Introducing logical collection pointers #393

Merged (30 commits) on Mar 9, 2024

@Horusiath Horusiath commented Mar 8, 2024

This PR does a few things:

1. Introduce Hook and BranchID

All shared collections are represented by BranchPtr, which is basically an unsafe wrapper around a branch reference. When the underlying block is deleted and garbage collected (for example, when a nested type or the entire document is dropped), this pointer is no longer valid, and any attempt to use it results in a segfault.

Important

While we eventually want to get rid of this behaviour properly, at the current stage that change would be even bigger and more invasive to the library API. For this reason we have an intermediary solution, described below.

We exposed new logical accessors that don't cause segfaults, at the cost of slower access. They are called hooks:

let doc = Doc::new();

// one way to create a root level logical reference
let root = ArrayRef::root("name");
let mut txn = doc.transact_mut();
let array: ArrayRef = root.get_or_create(&mut txn);

let map: MapRef = array.insert(&mut txn, 0, MapPrelim::from([("key", "value")]));

// hooks can also be taken straight from any shared type (root or nested)
let map_hook: Hook<MapRef> = map.hook();
let array_hook: Hook<ArrayRef> = array.hook();

// hooks can be used to obtain references
assert_eq!(map_hook.get(&txn), Some(map));
assert_eq!(array_hook.get(&txn), Some(array));

let doc2 = Doc::new();
let mut txn2 = doc2.transact_mut();
// sync changes
txn2.apply_update(
    Update::decode_v1(&txn.encode_state_as_update_v1(&StateVector::default())).unwrap(),
);

// hooks are logical pointers, so they work across different documents
let array2 = array_hook.get(&txn2).unwrap();
let map2 = map_hook.get(&txn2).unwrap();
assert_eq!(map2.get(&txn2, "key"), Some(Value::from("value")));

// if the shared collection was garbage collected it can no longer be accessed using hooks
array2.remove(&mut txn2, 0);
drop(txn2);

let txn2 = doc2.transact();
let map2 = map_hook.get(&txn2);
assert_eq!(map2, None);

Hooks also expose a so-called BranchID, which is basically an identifier of a reference. Unlike Hook, it doesn't carry shared collection type info.
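
To make the trade-off concrete, here is a minimal, stdlib-only sketch of the general idea behind a logical pointer. It is not how yrs implements Hook or BranchID: the `Store`, `Array`, `BranchId` and `Hook` types below are hypothetical toy stand-ins, invented for illustration. The point is only that resolving an identifier through a lookup costs more than following a raw pointer, but yields `None` instead of dangling once the target is gone, and that the same identifier can be resolved against more than one document.

```rust
use std::collections::HashMap;
use std::marker::PhantomData;

/// Hypothetical logical identifier (NOT the yrs type): root collections are
/// addressed by name, nested ones by the (client, clock) of their block.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum BranchId {
    Root(String),
    Nested { client: u64, clock: u32 },
}

/// Marker for the only collection kind in this toy model.
#[derive(Debug, Clone)]
struct Array;

/// Hypothetical typed wrapper: an identifier plus the collection type it names.
#[derive(Debug, Clone)]
struct Hook<T> {
    id: BranchId,
    _kind: PhantomData<T>,
}

/// Toy "document": live collections keyed by their logical identifier.
#[derive(Default)]
struct Store {
    branches: HashMap<BranchId, Vec<String>>,
}

impl Hook<Array> {
    fn new(id: BranchId) -> Self {
        Hook { id, _kind: PhantomData }
    }

    /// Resolve by lookup: one hash lookup per access, but a missing
    /// (e.g. garbage-collected) collection yields `None`, never a dangling pointer.
    fn get<'a>(&self, store: &'a Store) -> Option<&'a Vec<String>> {
        store.branches.get(&self.id)
    }
}

fn main() {
    let root_id = BranchId::Root("name".to_string());
    let nested_id = BranchId::Nested { client: 1, clock: 0 };
    let root_hook = Hook::<Array>::new(root_id.clone());
    let nested_hook = Hook::<Array>::new(nested_id.clone());

    // Two independent stores holding the same logical collections.
    let mut doc1 = Store::default();
    let mut doc2 = Store::default();
    for doc in [&mut doc1, &mut doc2] {
        doc.branches.insert(root_id.clone(), vec![]);
        doc.branches.insert(nested_id.clone(), vec!["value".to_string()]);
    }

    // The same hook resolves against either store.
    assert!(root_hook.get(&doc1).is_some());
    assert!(nested_hook.get(&doc1).is_some());
    assert!(nested_hook.get(&doc2).is_some());

    // Simulate garbage collection of the nested collection in doc2 only.
    doc2.branches.remove(&nested_id);
    assert!(nested_hook.get(&doc2).is_none());
    assert!(nested_hook.get(&doc1).is_some());
}
```

In this toy model the untyped `BranchId` plays the role the text assigns to BranchID (identity only), while `Hook<T>` adds the collection type on top of it.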

2. Rewrite of ywasm

Ywasm (the yrs-based JS variant using WebAssembly) has been rewritten to accommodate the new Hook API, so we no longer have to worry about segfaults inside the WASM virtual machine. There are also a number of other improvements:

  1. XML types like YXmlElement/YXmlFragment/YXmlText now have a preliminary type representation. Methods like xmlElem.insertXmlText have been removed; now you can just use xmlElem.insert(new YXmlText('')).
  2. All shared collections now expose type and id properties, which give the type ID and BranchID of the current type.
  3. The Doc.getXmlElement and Doc.getXmlText methods have been removed. They never worked right as root-level types in the first place, due to limitations of lib0 encoding. If you need them, use Doc.getXmlFragment and nest your XML nodes inside it instead.

3. Updates to yffi

  1. The yxmlelem and yxmltext methods have been removed. They never worked right as root-level types in the first place, due to limitations of lib0 encoding. If you need them, use yxmlfragment and nest your XML nodes inside it instead.
  2. There's only one function to dispose all types of observers: yunobserve(Subscription*).
  3. ytransaction_alive is replaced by ybranch_alive(Branch*).
  4. Since Hooks and BranchIDs are especially important from the point of view of unmanaged C code, they have been exposed:
  • YBranchID is a C-compatible logical identifier of a Branch*. As described above, it works across different docs. It's a standard struct, so it doesn't need manual memory freeing. However, since root-level type identification is string-based, the YBranchID name pointer is valid only as long as the document from which it was generated is alive. You could work around that by constructing YBranchID manually.
  • YBranchID can be obtained from any alive branch via ybranch_id(Branch*).
  • Any alive Branch* can be resolved via ybranch_get(YBranchId*, YTransaction*). If the branch ID points to a collection which has already been garbage collected, NULL is returned.

@Horusiath Horusiath marked this pull request as ready for review March 9, 2024 08:06
@Horusiath Horusiath merged commit 1810664 into y-crdt:main Mar 9, 2024
7 checks passed
@Horusiath Horusiath mentioned this pull request Mar 12, 2024