This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Try not eating into the slot (performance/block times) #5522

Draft
wants to merge 7 commits into
base: master
from

Conversation

eskimor
Member

@eskimor eskimor commented May 13, 2022

Results on a loaded Versi (300 parachain validators, ~40 parachains):
[Figure: para-block-times]

Yellow area (5522-257c9d4d) : strict slots - when paras_inherent asks, we deliver.
Blue area (master-d5f16df7): Back to master with eating into slots
Purple area (5522-e13b16c4): Strict slots + unbounded send provisioner -> backing
Red area (5522-cf43dc6a): Strict slots + unbounded send provisioner -> backing + extended bitfield gossip duration

Full experiment

If you consider how much block times fluctuate within those areas, it becomes clear that for current Versi there is no significant difference between those setups.

It has to be noted that Versi was very loaded in this setup, mostly because of an overwhelmed dispute coordinator - it might make sense to repeat this experiment with a fixed dispute coordinator.

As for why it makes no difference at all (strict slots vs. eating into a slot): I would expect relay chain blocks to be rather small and to propagate fast on Versi, meaning that eating into the slot likely never happens there. We should probably add a metric that triggers whenever we eat into a slot, to confirm or disprove this. Another explanation would be that it does not matter on Versi because it only runs toy parachains: no real load, no runtime upgrades, no message passing, fast networking, ...
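A minimal sketch of what such a metric could look like, using only the standard library. The type `SlotEatingMetric`, its methods, and the idea of comparing an elapsed time against a fixed budget are assumptions made for illustration; none of this is code from this PR, and a real node would register the counter with its metrics backend instead of keeping a bare atomic.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::Duration;

/// Hypothetical counter for "we ate into the slot" events.
pub struct SlotEatingMetric {
    eaten_slots: AtomicU64,
}

impl SlotEatingMetric {
    pub const fn new() -> Self {
        Self { eaten_slots: AtomicU64::new(0) }
    }

    /// Records one block production attempt.
    ///
    /// `budget` is the time we were allowed to spend before the slot is
    /// affected, `elapsed` is the time actually spent (both assumed inputs).
    /// Returns the overrun if the budget was exceeded, `None` otherwise.
    pub fn observe(&self, budget: Duration, elapsed: Duration) -> Option<Duration> {
        // `checked_sub` yields `None` when `elapsed < budget`; an exact hit
        // (zero overrun) is not counted as eating into the slot either.
        let overrun = elapsed.checked_sub(budget).filter(|d| !d.is_zero())?;
        self.eaten_slots.fetch_add(1, Ordering::Relaxed);
        Some(overrun)
    }

    /// Number of attempts that exceeded their budget so far.
    pub fn eaten_slots(&self) -> u64 {
        self.eaten_slots.load(Ordering::Relaxed)
    }
}
```

If this counter stays at zero on Versi under load, that would support the guess that slot eating never actually happens there.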

Why I am experimenting with strict slots in the first place is argued here ... I would expect this to make a difference on real networks like Kusama. The argument is replicated below for completeness:

We might be able to get better properties by being stricter and not cutting into the next slot.
This way, if a parachain does something that is slow (e.g. just big PoVs or issuing a runtime upgrade), the following will happen:

  1. If the previous relay chain block was rather big, the slow parachain's candidate might not make it into the current block.
  2. Other smaller stuff is likely ready and can be put into the block.
  3. Because of the reduced amount of time we had, the block likely ends up rather small (so we have a tendency to produce a small block after a big one).
  4. A small block is likely built, distributed and imported fast.
  5. We have more time available at the next slot and the big stuff of our parachain has good chances of getting in.

This means that after a slot with little time, we will have more time in the next slot. This even scales a bit: the less time we have now, the more we will have in the next slot (because the block will be smaller).
So in my mind, by being stricter we might be able to help avoid long stalls and also end up with a better distributed load in general, which should also benefit system performance.

With the current scheme the following happens:

A parachain does something slow:

  1. If the previous block was rather big, we eat into the slot. Still, the maximum time budget is 2 seconds in that case, so the parachain might not make it.
  2. Even though the parachain did not make it (and the block ended up tiny), we just ate into the slot, in the worst case by up to 2 seconds. So the time we have for preparing the next block will again be greatly reduced, even though we did not get much work done with the previous block!

We will need to have a long sequence of small relay chain blocks for this to recover eventually, but the moment the parachain makes it, we will have a big block again and the game starts anew.
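The difference between the two schemes can be put into a few lines of arithmetic. The following is a toy model only: the 6 second slot length, the linear `import_ms` cost and all function names are assumptions chosen for illustration, not measurements or code from this PR. Only the 2 second cap is taken from the description above.

```rust
/// Assumed relay chain slot length in milliseconds (illustrative).
const SLOT_MS: u64 = 6_000;
/// Upper bound on how far we eat into the slot ("up to 2 seconds" above).
const MAX_EAT_MS: u64 = 2_000;

/// Toy cost model (assumption): the time until the next author has received
/// and imported a block is half of the work the block contains.
fn import_ms(block_work_ms: u64) -> u64 {
    block_work_ms / 2
}

/// Strict slots: the time available for the next block shrinks only with the
/// size of the previous block, so a small block is followed by a large budget.
fn next_budget_strict(prev_block_work_ms: u64) -> u64 {
    SLOT_MS.saturating_sub(import_ms(prev_block_work_ms))
}

/// Eating into the slot: the eaten time is lost on top of the import cost,
/// no matter how little work the previous block ended up containing.
fn next_budget_eating(prev_block_work_ms: u64, eaten_ms: u64) -> u64 {
    let eaten = eaten_ms.min(MAX_EAT_MS);
    next_budget_strict(prev_block_work_ms).saturating_sub(eaten)
}
```

With these assumed numbers, a big block (4000 ms of work) leaves 4000 ms for the next one under strict slots, while a small block (1000 ms of work) leaves 5500 ms. Under the current scheme, the same small block built after eating the full 2 seconds leaves only 3500 ms, which is the "reduced again, despite little work done" effect described above.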

Those are some late night thoughts, but to me that sounds like a promising experiment: Don't eat into the slot.

@github-actions github-actions bot added the A3-in_progress Pull request is in progress. No review needed at this stage. label May 13, 2022
eskimor added 3 commits May 13, 2022 23:35
Backing can be loaded, we don't want to delay block production because
of this. If we miss candidates because we don't wait, we will get the
current block out more quickly and will have more time on the next
block.
@eskimor eskimor changed the title from "Try not eating into the slot." to "Try not eating into the slot (performance/block times)" Aug 16, 2022
@eskimor
Member Author

eskimor commented Aug 16, 2022

Potential next steps:

  1. Try again on Versi with more realistic load scenarios (verify whether the slot eating is used or not)
  2. Given that there are no negative effects on Versi, we could shortcut and just try on Kusama to get real world data.

@ordian ordian added the T5-parachains_protocol This PR/Issue is related to Parachains features and protocol changes. label Aug 16, 2022