Description
The endpoint regeneration mechanism is a bit of a mess.
Regeneration is, naturally, serialized; we should not regenerate a given endpoint in parallel. Okay, fine; there is a queue with a single consumer.
However, this is still awkward, as the queue prevents multiple pending regeneration requests from being coalesced properly. Where requests are coalesced, the behavior is incorrect: only the first caller blocks until completion, and all other callers return immediately. This is particularly bad, since those callers may erroneously assume that regeneration has succeeded.
Even more awkward, if regeneration fails, we trigger a separate controller to re-enqueue yet another request.
The Fix
- We should not use the endpoint's EventQueue for handling regeneration. Rather, we should have a single controller that handles all regeneration requests.
- All regeneration requests should be coalesced. All regeneration requests should be blocking.
- The desiredRegenerationLevel mechanism should be persisted.
- RegenMetadata.ParentContext should be removed. It is extremely error prone.
- If regeneration fails, it should be retried with some backoff.
Additional cleanup:
- Document exactly what the different regeneration levels do.
- Decide when we should and should not force policy recalculation. I suspect we should be more rigorous here.
- Consider making forced policy calculation another level between userspace and datapath.
- Audit metrics and see whether they are lacking.