Try not eating into the slot (performance/block times) #5522

eskimor · 2022-05-13T21:23:42Z

Results on a loaded Versi (300 parachain validators, ~40 parachains):

Yellow area (5522-257c9d4d): strict slots - when paras_inherent asks, we deliver.
Blue area (master-d5f16df7): back to master, with eating into slots
Purple area (5522-e13b16c4): strict slots + unbounded send provisioner -> backing
Red area (5522-cf43dc6a): strict slots + unbounded send provisioner -> backing + extended bitfield gossip duration

Full experiment

If you consider how much block times fluctuate within those areas, it becomes clear that for current Versi there is no significant difference between those setups.

It has to be noted that Versi was very loaded in this setup, mostly because of an overwhelmed dispute coordinator. It might make sense to repeat this experiment with a fixed dispute coordinator.

As for why it makes no difference at all (strict slots vs. eating into a slot): I would expect relay chain blocks to be rather small and to propagate fast on Versi, meaning that eating into the slot likely never happens there. We should probably add a metric that triggers whenever we eat into a slot, to confirm or disprove this. Another explanation would be that it does not matter on Versi because we only have toy parachains there: no real load, no runtime upgrades, no message passing, fast networking, and so on.
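A minimal sketch of what such a metric could look like, using only the standard library. Everything here is hypothetical: `SlotEatingMetrics`, its `threshold`, and the idea of feeding it the delay between the inherent data request and its delivery are assumptions for illustration, not existing Polkadot types. A real node would register a Prometheus counter or histogram instead.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::Duration;

/// Hypothetical metric holder (not an existing Polkadot type).
pub struct SlotEatingMetrics {
    /// Delays up to this value are treated as normal latency, not slot eating.
    threshold: Duration,
    /// How many times inherent data was delivered later than `threshold`.
    events: AtomicU64,
    /// Sum of all counted delays, in milliseconds.
    total_delay_ms: AtomicU64,
}

impl SlotEatingMetrics {
    pub fn new(threshold: Duration) -> Self {
        Self {
            threshold,
            events: AtomicU64::new(0),
            total_delay_ms: AtomicU64::new(0),
        }
    }

    /// Record how long the block author waited for inherent data after
    /// asking for it. Returns `true` if the wait counted as slot eating.
    pub fn observe(&self, delay: Duration) -> bool {
        if delay <= self.threshold {
            return false;
        }
        self.events.fetch_add(1, Ordering::Relaxed);
        self.total_delay_ms
            .fetch_add(delay.as_millis() as u64, Ordering::Relaxed);
        true
    }

    /// Number of observations that counted as slot eating.
    pub fn events(&self) -> u64 {
        self.events.load(Ordering::Relaxed)
    }

    /// Total counted delay in milliseconds.
    pub fn total_delay_ms(&self) -> u64 {
        self.total_delay_ms.load(Ordering::Relaxed)
    }
}
```

If `events()` stays at zero over a Versi run, that would support the guess that slot eating never happens there.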

Why I am experimenting with strict slots in the first place is argued here ... and I would expect this to make a difference on real networks like Kusama. The argument is replicated here for completeness:

We might be able to get better properties by being more strict and not cutting into the next slot.
This way, if a parachain does something that is slow (e.g. just big PoVs or issuing a runtime upgrade), the following will happen:

If the previous relay chain block was rather big, the parachain's candidate might not make it.
Other smaller stuff is likely ready and can be put into the block.
Because of the reduced amount of time we had, the block will likely end up rather small (so we have a tendency to produce a small block after a big one).
A small block is likely built, distributed and imported fast.
We have more time available at the next slot and the big stuff of our parachain has good chances of getting in.

This means that after a slot with little time, we will have more time in the next slot. This even scales a bit: the less time we have now, the more we will have in the next slot (because the block will be smaller).
So in my mind, by being stricter we might be able to help avoid long stalls and also, in general, end up with a better distributed load, which should also be beneficial to system performance.
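The feedback loop described above can be sketched as a toy model. All constants are illustrative assumptions, not measurements: a 6 second slot, a handling cost (build, distribute, import) proportional to block size, and one slow candidate that needs a fixed amount of preparation time. The names `budget_ms` and `simulate_strict` are made up for this sketch.

```rust
// Illustrative assumptions only, none of these values are measured.
const SLOT_MS: u64 = 6_000; // slot duration
const HANDLING_MS_PER_UNIT: u64 = 100; // build + distribute + import cost per size unit
const LIGHT_UNITS: u64 = 5; // small stuff that is always ready
const HEAVY_UNITS: u64 = 30; // what the slow parachain adds to a block
const HEAVY_NEEDS_MS: u64 = 4_000; // preparation time the slow candidate needs

/// Time left to prepare the next block under strict slots: whatever the
/// slot leaves over after the previous block has been handled.
pub fn budget_ms(prev_block_units: u64) -> u64 {
    SLOT_MS.saturating_sub(prev_block_units * HANDLING_MS_PER_UNIT)
}

/// Simulate `slots` consecutive slots. Each entry holds the preparation
/// budget and whether the slow candidate was included in that slot.
pub fn simulate_strict(mut prev_block_units: u64, slots: usize) -> Vec<(u64, bool)> {
    let mut trace = Vec::with_capacity(slots);
    for _ in 0..slots {
        let budget = budget_ms(prev_block_units);
        let heavy_included = budget >= HEAVY_NEEDS_MS;
        // The light work always gets in; the heavy candidate only if it had time.
        prev_block_units = LIGHT_UNITS + if heavy_included { HEAVY_UNITS } else { 0 };
        trace.push((budget, heavy_included));
    }
    trace
}
```

Starting after a big block, `simulate_strict(35, 4)` alternates between a 2500 ms budget (candidate misses, small block) and a 5500 ms budget (candidate gets in). In this model, a slot with little time is always followed by one with more.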

With the current scheme the following happens:

A parachain does something slow:

If the previous block was rather big, we eat into the slot. Still, the maximum time budget is 2 seconds in that case, so the parachain might not make it.
Even though the parachain did not make it (and the block ended up tiny), we just ate into the slot, in the worst case by up to 2 seconds. So the time we have for preparing the next block will again be greatly reduced, although we did not get much work done with the previous block!

We will need to have a long sequence of small relay chain blocks for this to recover eventually, but the moment the parachain makes it, we will have a big block again and the game starts anew.
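To make the difference in the time budget concrete, here is a small sketch under stated assumptions. The 2 second cap comes from the description above. The 6 second slot and the idea of expressing the previous block as a handling cost in milliseconds are assumptions of this toy model, and both function names are hypothetical.

```rust
const SLOT_MS: u64 = 6_000; // assumed slot duration
const MAX_EAT_MS: u64 = 2_000; // maximum time we eat into a slot (from the text above)

/// Preparation time for the next block under strict slots: only the
/// handling cost of the previous block is subtracted.
pub fn budget_strict(prev_block_cost_ms: u64) -> u64 {
    SLOT_MS.saturating_sub(prev_block_cost_ms)
}

/// Preparation time for the next block when the previous slot was eaten
/// into: the eaten time (capped at `MAX_EAT_MS`) is lost on top of the
/// handling cost, no matter how small the previous block ended up.
pub fn budget_after_eating(prev_block_cost_ms: u64, eaten_ms: u64) -> u64 {
    let eaten = eaten_ms.min(MAX_EAT_MS);
    SLOT_MS.saturating_sub(prev_block_cost_ms.saturating_add(eaten))
}
```

For a tiny previous block costing 500 ms, `budget_strict(500)` leaves 5500 ms, while `budget_after_eating(500, 2_000)` leaves only 3500 ms, even though the previous block did almost no work.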

Those are some late night thoughts, but to me that sounds like a promising experiment: Don't eat into the slot.


eskimor · 2022-08-16T08:13:40Z

Potential next steps:

Try again on Versi with more realistic load scenarios (verify whether the slot eating is used or not)
Given that there are no negative effects on Versi, we could shortcut and just try on Kusama to get real world data.

sandreim and others added 4 commits May 9, 2022 16:36

Add histogram for inherent data bitfields

f94f539

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

-500ms bitfield sign job delay, +500ms bitfield gossip

7563051

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

Merge remote-tracking branch 'origin' into sandreim/more_provisioner_metrics

fd2035f

Try not eating into the slot.

13e82d8

github-actions bot added the A3-in_progress Pull request is in progress. No review needed at this stage. label May 13, 2022

eskimor added 3 commits May 13, 2022 23:35

Fixes.

257c9d4

Request candidates from backing via unbounded.

e13b16c

Backing can be loaded, we don't want to delay block production because of this. If we miss candidates because we don't wait, we will get the current block out more quickly and will have more time on the next block.

Merge remote-tracking branch 'origin/sandreim/more_provisioner_metrics' into rk-strict-slots

cf43dc6

eskimor changed the title ~~Try not eating into the slot.~~ Try not eating into the slot (performance/block times) Aug 16, 2022

ordian added the T5-parachains_protocol This PR/Issue is related to Parachains features and protocol changes. label Aug 16, 2022
