This repository has been archived by the owner on Mar 17, 2024. It is now read-only.

Metrics are not reset after consumer restart #25

Closed
aikoven opened this issue May 23, 2019 · 4 comments · Fixed by #27
Labels: bug (Something isn't working)
Milestone: 0.4.1

Comments


aikoven commented May 23, 2019

In my app, each consumer gets a unique generated client_id. When I restart all consumers, Kafka assigns partitions to their new client_ids, but the kafka_consumergroup_group_lag metrics for the old client_ids remain in the output until I restart kafka-lag-exporter.

The graph below shows stacked values of kafka_consumergroup_group_lag for each client_id on the first start, after restart of consumers, and then after restart of kafka-lag-exporter:

[image: stacked graph of kafka_consumergroup_group_lag per client_id]

The Prometheus query is:

sum(kafka_consumergroup_group_lag) by (client_id)
Owner

seglo commented May 23, 2019

Thanks for trying out the project, @aikoven. By default, the client_id is generated by the Kafka client and is used to distinguish one client from another. You probably want to aggregate on the group label, which is the same group.id you specify in your Kafka consumer properties configuration.
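
As a rough illustration of the difference between aggregating on client_id and on group, here is a minimal Python sketch of what a `sum(...) by (label)` aggregation does over a set of samples. The `sum_by` helper and the sample label values (`my-group`, `client-a`, `client-b`) are made up for the example and are not part of kafka-lag-exporter or Prometheus.

```python
from collections import defaultdict


def sum_by(samples, label):
    """Sum sample values grouped by one label, similar in spirit to
    PromQL's `sum(metric) by (label)`. Hypothetical helper."""
    totals = defaultdict(float)
    for labels, value in samples:
        totals[labels[label]] += value
    return dict(totals)


# Hypothetical samples of kafka_consumergroup_group_lag: (labels, value).
samples = [
    ({"group": "my-group", "client_id": "client-a"}, 10.0),
    ({"group": "my-group", "client_id": "client-b"}, 5.0),
]

by_client = sum_by(samples, "client_id")  # one series per client
by_group = sum_by(samples, "group")       # one series for the whole group
```

Aggregating on group collapses every client in the group into a single series, so it hides which client_id a sample came from.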

Author

aikoven commented May 24, 2019

In this case, I would get a single series for the whole group. But I'd like to monitor the lag for each consumer in the group separately.

The problem is that even if a consumer is no longer assigned a partition, its metrics still show up.

Owner

seglo commented May 24, 2019

I see. Yes, this is also a problem for old consumer groups that no longer exist. Once a metric is set for a particular set of labels, it remains on the Prometheus endpoint until it is explicitly unset. The fix would be for the exporter either to reset all metrics each reporting interval, or to selectively unset metrics that are no longer valid according to information retrieved from the consumer group coordinator.
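
The second option, selectively unsetting stale metrics, could look roughly like the Python sketch below. This is not the exporter's implementation (the fix itself is in #27): `GaugeStore`, `label_key`, and `report` are hypothetical names, and the store is a toy in-memory stand-in for a labelled gauge. It assumes each reporting interval yields a full snapshot of the label sets that are currently valid.

```python
def label_key(labels):
    """Turn a label dict into a hashable, order-independent key."""
    return tuple(sorted(labels.items()))


class GaugeStore:
    """Toy stand-in for a labelled gauge (hypothetical, for illustration)."""

    def __init__(self):
        self.series = {}

    def set(self, labels, value):
        self.series[label_key(labels)] = value

    def remove_key(self, key):
        self.series.pop(key, None)


def report(store, snapshot):
    """Apply one reporting interval: drop series whose labels are absent
    from the snapshot, then set the current values. Returns removed keys."""
    current = {label_key(labels) for labels, _ in snapshot}
    stale = [key for key in store.series if key not in current]
    for key in stale:
        store.remove_key(key)
    for labels, value in snapshot:
        store.set(labels, value)
    return stale


store = GaugeStore()
# First interval: the original consumer holds the partition.
report(store, [({"group": "g", "client_id": "old-1"}, 7.0)])
# After the consumer restarts, only the new client_id is reported.
removed = report(store, [({"group": "g", "client_id": "new-1"}, 3.0)])
```

After the second call, the series for `old-1` is gone and only `new-1` remains. The first option, resetting everything each interval, would amount to clearing the store before setting the snapshot values.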

seglo added the bug label May 24, 2019
seglo added this to the 0.4.1 milestone May 24, 2019
seglo closed this as completed in #27 Jun 5, 2019
Owner

seglo commented Jun 6, 2019

@aikoven LMK if 0.4.1 addresses this problem for you!
