Memory leak? #445
This is not a memory leak, but a side effect of how updates work. In essence: when you insert a new item, the old item gets tombstoned. It is never fully removed; instead it leaves a small struct in memory that keeps the metadata required to resolve possible concurrent update conflicts.

Now, Yjs/Yrs have a specific optimisation strategy that enables squashing together blocks which meet specific criteria. Among those criteria are the same type (e.g. tombstone) and sequentiality: blocks concerning the same entity (here: a key-value pair) can be squashed into one if the updates happened one after another. This mostly pays off in text editing, when a person types characters in sequence without moving the cursor.

In your first case you're updating the same key-value pair, so every new update overrides the previous one in a sequential manner. Thanks to that, the resulting tombstones can be squashed together into a single struct describing a range of deletes. In the second case the updates are no longer sequential: odd updates target the entry "leaf_0" while even ones target "leaf_1". For that reason the tombstones can no longer be squashed, so every overridden update is counted individually as a separate structure in memory.
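The squashing rule described above can be illustrated with a small self-contained model. This is a simplification for intuition only (not the actual Yrs block representation): here a tombstone merges into the previous one only when it targets the same entry and its clock immediately follows the previous range.

```rust
// Toy model of tombstone squashing (NOT yrs internals): a tombstone can
// be merged into the preceding one only if it targets the same map key
// and its clock directly continues the previous range.

#[derive(Debug, PartialEq)]
struct Tombstone {
    key: &'static str,
    clock: u64, // clock of the first squashed update
    len: u64,   // how many consecutive updates this struct covers
}

fn squash(updates: &[(&'static str, u64)]) -> Vec<Tombstone> {
    let mut out: Vec<Tombstone> = Vec::new();
    for &(key, clock) in updates {
        let merged = match out.last_mut() {
            // same key and sequential clock: extend the existing range
            Some(t) if t.key == key && t.clock + t.len == clock => {
                t.len += 1;
                true
            }
            _ => false,
        };
        if !merged {
            // otherwise a new struct stays in memory
            out.push(Tombstone { key, clock, len: 1 });
        }
    }
    out
}

fn main() {
    // Case 1: every update overrides the same entry -> one range struct.
    let same_key: Vec<(&'static str, u64)> = (0..1000).map(|c| ("leaf_0", c)).collect();
    assert_eq!(squash(&same_key).len(), 1);

    // Case 2: odd updates hit "leaf_0", even ones "leaf_1" -> no two
    // adjacent tombstones match, so every one survives individually.
    let alternating: Vec<(&'static str, u64)> = (0..1000)
        .map(|c| (if c % 2 == 0 { "leaf_1" } else { "leaf_0" }, c))
        .collect();
    assert_eq!(squash(&alternating).len(), 1000);

    println!("sequential: 1 struct, alternating: 1000 structs");
}
```

In this model, 1000 sequential overrides of one key collapse into a single range struct, while 1000 alternating overrides of two keys keep all 1000 structs alive, which matches the memory behavior reported above.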
@Horusiath Thanks for the explanation. This makes much more sense to me now. One more question, then: is there a way to force the removal of tombstones, something like garbage collection? E.g., if I want to save the data to disk or optimize my memory footprint. If yes, what would the implications be for conflict resolution?
I'm interested in this issue too, since in a context where a server is always up, this would inevitably lead to unbounded memory growth at some point.
There is a garbage collection process, however it only happens when you remove an entire collection (Text, YArray, YMap etc.), since then we can guarantee that the removed elements are no longer reachable. On other occasions we cannot really remove tombstones, as the current peer never knows whether someone else was editing concurrently but happened to be offline, e.g. for the last 3 months.

What I proposed in the past was to provide "islands of concurrency". It depends on your use case. As an example: if you're collaboratively editing release notes, then after the release has been made we no longer care to accept concurrent text edits coming from others, as the release doc has already been approved. If we can identify such events that divide our history into non-concurrent fragments, then when handling such an event we could e.g. copy-paste the content of the data structure (e.g. YText deltas/diffs) into a new collection - which WON'T carry over potential tombstones - and then delete the old version of the collection, causing it to be GCed. This of course means that any missed updates that happened prior to that event will no longer be applicable to the new structure.
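Why the tombstone has to stay around can be sketched with another simplified, self-contained model (this is an illustration of the reasoning, not the actual Yrs conflict-resolution code; the `Id`/`Status`/`Outcome` names are made up here): when a previously offline peer sends an update that references an item by id, the tombstone is what lets us distinguish "this item was deleted, resolve deterministically" from "we never saw this item at all".

```rust
use std::collections::HashMap;

// Every inserted item carries a unique (client, clock) id.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct Id {
    client: u64,
    clock: u64,
}

#[derive(Debug, PartialEq)]
enum Status {
    Live,
    Tombstoned,
}

#[derive(Debug, PartialEq)]
enum Outcome {
    Applied,        // target still live: update applies normally
    DroppedAsStale, // target tombstoned: conflict resolved deterministically
    Unresolvable,   // target unknown: "deleted" vs "never existed" is lost
}

fn apply_remote(store: &HashMap<Id, Status>, target: Id) -> Outcome {
    match store.get(&target) {
        Some(Status::Live) => Outcome::Applied,
        // the tombstone metadata is exactly what makes this arm possible
        Some(Status::Tombstoned) => Outcome::DroppedAsStale,
        None => Outcome::Unresolvable,
    }
}

fn main() {
    let mut store = HashMap::new();
    let old = Id { client: 1, clock: 0 };
    store.insert(old, Status::Tombstoned); // overridden long ago, metadata kept

    // A peer that was offline for months still references the old id:
    assert_eq!(apply_remote(&store, old), Outcome::DroppedAsStale);

    // Had we garbage-collected the tombstone, the same update would be
    // indistinguishable from one referencing data we never received:
    store.remove(&old);
    assert_eq!(apply_remote(&store, old), Outcome::Unresolvable);

    println!("tombstone model ok");
}
```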
Just to be clear, does it apply to any collection within the tree or only to the root collections?

As for the islands, I understand this applies to the case where the doc is effectively locked and no longer available for edits in the future. Is this correct?
When you delete a collection, all of its elements will no longer be accessible from the root, so they will be GCed as well.

This can be applied at the Doc level (freezing the entire doc) for better control, but with the clarification above it can also be used at the individual collection level: copy-pasting the collection contents in place will GC the old version of the collection together with all of the concurrent updates that may potentially still be in flight.
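The "copy-paste and delete" compaction described above can be sketched with a small self-contained model as well (again simplified for intuition; the real thing would copy e.g. YText deltas through the Yrs API): rebuild the collection from its currently visible values only, so the replacement carries no tombstones, at the price of losing the metadata needed to merge late concurrent updates.

```rust
use std::collections::BTreeMap;

// Simplified model of a collection's block list: live values plus the
// tombstones left behind by overridden/deleted entries.
#[derive(Debug, Clone, PartialEq)]
enum Block {
    Live(String, i64),
    Tombstone(String),
}

// The values a reader currently sees (later blocks win).
fn visible(blocks: &[Block]) -> BTreeMap<String, i64> {
    let mut out = BTreeMap::new();
    for b in blocks {
        match b {
            Block::Live(k, v) => {
                out.insert(k.clone(), *v);
            }
            Block::Tombstone(k) => {
                out.remove(k);
            }
        }
    }
    out
}

// "Copy-paste" compaction: a fresh collection holding only live values.
// All tombstones (and the late updates they could have resolved) are gone.
fn compact(blocks: &[Block]) -> Vec<Block> {
    visible(blocks)
        .into_iter()
        .map(|(k, v)| Block::Live(k, v))
        .collect()
}

fn main() {
    // 1000 overrides of two keys leave a long trail of blocks behind...
    let mut blocks = Vec::new();
    for i in 0..1000 {
        let key = if i % 2 == 0 { "leaf_0" } else { "leaf_1" };
        blocks.push(Block::Tombstone(key.to_string()));
        blocks.push(Block::Live(key.to_string(), i));
    }
    assert_eq!(blocks.len(), 2000);

    // ...while the compacted copy keeps just the two visible entries.
    let fresh = compact(&blocks);
    assert_eq!(fresh.len(), 2);
    println!("compacted {} blocks down to {}", blocks.len(), fresh.len());
}
```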
@Horusiath Thanks for the explanations! I appreciate them a lot. I'll continue testing the library and will come back once I have more questions. Meanwhile, could you please also take a look at the PR I submitted a while ago, #444? I know you approved it initially, but after it ran through CI some issues surfaced, and I posted suggestions there on how they can be addressed. Please let me know your preference - it's quite important for our use scenario.
@Horusiath, do you have any examples of things cleaning up once a collection is deleted? I was trying to test it but couldn't get it working. The memory footprint doesn't return to normal even after a collection is deleted. Here is a snippet of one of my tests:

```rust
use std::thread;
use std::time::Duration;
use yrs::{Any, Doc, Map, MapPrelim, MapRef, Transact};

fn memory_cleanup(count: usize) -> anyhow::Result<Duration> {
    let doc1 = Doc::new();
    let element1 = doc1.get_or_insert_map("element");
    {
        let mut txn = doc1.transact_mut();
        let inner: MapRef = element1.insert(&mut txn, "inner", MapPrelim::<Any>::new());
        inner.insert(&mut txn, "position", MapPrelim::from([("x", 1), ("y", 2)]));
        inner.insert(&mut txn, "size", MapPrelim::from([("x", 3), ("y", 4)]));
        inner.insert(&mut txn, "num", Any::BigInt(5i64));
    }
    // `measure` is a timing helper defined elsewhere in my test suite
    measure(move || {
        for i in 0..(count as i64) {
            let mut txn = doc1.transact_mut();
            let inner: MapRef = element1.get(&txn, "inner").unwrap().try_into().unwrap();
            inner.insert(&mut txn, "position", MapPrelim::from([
                ("x", Any::BigInt(i)),
                ("y", Any::BigInt(i + 1)),
            ]));
        }
        {
            // once this transaction is executed, I expect the memory
            // consumption to return to normal
            let mut txn = doc1.transact_mut();
            element1.remove(&mut txn, "inner");
        }
        println!("Trying to cleanup....");
        thread::sleep(Duration::from_secs(1000));
        Ok(())
    })
}
```
Hi guys,

I was testing Yrs and encountered the following behavior, which looks like a memory leak to me. Consider the following function:

It works just fine when I call it with a single insert per transaction, but when I call it with at least two inserts per transaction, it starts consuming all the memory in the world.

Tested on a Mac with M3.