Optimize the process of syncing rules in Rulers #6462

Open
rapphil opened this issue Dec 27, 2024 · 0 comments
Labels
component/rules Bits & bobs todo with rules and alerts: the ruler, config service etc.

rapphil commented Dec 27, 2024

Is your feature request related to a problem? Please describe.
Not really.

Describe the solution you'd like
SyncRules is the process by which Rulers are configured with the per-tenant rules stored in Cortex, so that each Ruler loads the rule groups it is responsible for. It runs periodically (at a configurable interval) and in response to ring change events (when Rulers join or leave the ring). A deployment is expected to produce several ring change events, and running a periodic SyncRules right after a ring change has already triggered one is arguably a waste of resources.

Therefore, this issue proposes that a ring change event restarts the timer used for the periodic SyncRules, reducing the total number of SyncRules runs.

Describe alternatives you've considered
Could we aggregate ring change events into buckets to minimize the number of SyncRules runs when a deployment restarts several instances at the same time? In other words, we would set an upper bound on the number of SyncRules runs per period of time.

Additional context
Here is the for loop that we want to optimize:

cortex/pkg/ruler/ruler.go

Lines 670 to 687 in 8a46d20

for {
	select {
	case <-ctx.Done():
		return nil
	case <-tick.C:
		r.syncRules(ctx, rulerSyncReasonPeriodic)
	case <-ringTickerChan:
		// We ignore the error because in case of error it will return an empty
		// replication set which we use to compare with the previous state.
		currRingState, _ := r.ring.GetAllHealthy(RingOp)
		if ring.HasReplicationSetChanged(ringLastState, currRingState) {
			ringLastState = currRingState
			r.syncRules(ctx, rulerSyncReasonRingChange)
		}
	case err := <-r.subservicesWatcher.Chan():
		return errors.Wrap(err, "ruler subservice failed")
	}
}

@dosubot dosubot bot added the component/rules Bits & bobs todo with rules and alerts: the ruler, config service etc. label Dec 27, 2024