
Allow customization of pre-defined Prometheus alert rules #2448

Closed
@alexandre-allard

Description

Component: salt, monitoring

Why this is needed:
To let users tune the thresholds of some alert rules to better fit their needs.

What should be done:
Customize pre-defined Prometheus alert rules from the user input.

Implementation proposal (strongly recommended):
To let the user customize alert rules, we will select a subset of them
(for now, only the node-exporter alert rules) and expose only a few parts of their configuration (e.g. the threshold) for customization.

Since the same group name + alert rule name pair can appear more than once in the Prometheus rules, we also need to take the severity into account to identify which
specific alert we are editing.

These customizations will be stored in the metalk8s-prometheus-config
ConfigMap with something like::

apiVersion: v1
kind: ConfigMap
metadata:
  name: metalk8s-prometheus-config
  namespace: metalk8s-monitoring
data:
  config.yaml: |-
    apiVersion: addons.metalk8s.scality.com
    kind: PrometheusConfig
    spec:
      deployment:
        replicas: 1
      rules:
        <alertGroupName>:
          <alertName>:
            warning:
              threshold: 30
            critical:
              threshold: 10
        <anotherAlertGroupName>:
          <anotherAlertName>:
            critical:
              threshold: 20
              anotherThreshold: 10
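
As a rough illustration of why the severity is part of the key, the sketch below looks up one value in the `rules` section shown above. It works on a plain dictionary rather than a parsed ConfigMap, and the group name, alert name and the `get_override` helper are all hypothetical placeholders, not names taken from MetalK8s.

```python
from typing import Any, Dict, Optional

# Hypothetical in-memory form of the `rules` section; the group and alert
# names are placeholders standing in for <alertGroupName> / <alertName>.
rules: Dict[str, Dict[str, Dict[str, Dict[str, Any]]]] = {
    "example-group": {
        "ExampleAlert": {
            "warning": {"threshold": 30},
            "critical": {"threshold": 10},
        },
    },
}


def get_override(
    rules: Dict[str, Any], group: str, alert: str, severity: str, key: str
) -> Optional[Any]:
    """Return the user-supplied value for one alert variant, or None if unset."""
    return rules.get(group, {}).get(alert, {}).get(severity, {}).get(key)


# Same group and alert name, two different severities, two different values.
warning_threshold = get_override(
    rules, "example-group", "ExampleAlert", "warning", "threshold"
)
critical_threshold = get_override(
    rules, "example-group", "ExampleAlert", "critical", "threshold"
)
# A severity the user did not customize yields None.
missing = get_override(rules, "example-group", "ExampleAlert", "info", "threshold")
```

Without the severity level, the warning and critical variants of the same alert could not be told apart.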

The PrometheusRule object manifests in
salt/metalk8s/addons/prometheus-operator/deployed/chart.sls need
to be templatized to consume these customizations through Jinja.
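
The idea of templatizing a rule can be sketched in a few lines. This is not the Jinja code the proposal calls for: it uses Python's standard `string.Template` as a stand-in, and the expression and the `render_expr` helper are illustrative assumptions rather than the actual contents of chart.sls.

```python
from string import Template

# Illustrative alert expression with the threshold left as a placeholder.
# The real expressions live in the chart; this one is only an assumption.
EXPR_TEMPLATE = Template(
    'node_filesystem_avail_bytes{fstype!=""} '
    '/ node_filesystem_size_bytes{fstype!=""} * 100 < $threshold'
)


def render_expr(template: Template, threshold: int) -> str:
    """Fill the threshold placeholder and return the final expression."""
    return template.substitute(threshold=threshold)


warning_expr = render_expr(EXPR_TEMPLATE, 30)
critical_expr = render_expr(EXPR_TEMPLATE, 10)
```

`string.Template` is used here only because its `$name` placeholders do not clash with the curly braces of label matchers.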

Default values for the customizable alert rules, used as a fallback when a value is not defined
in the ConfigMap, will be set in salt/metalk8s/defaults.yaml.

Test plan:
Using pytest-bdd, we need to add a new feature, alerting, that edits a pre-defined Prometheus rule and then checks that the modification has been successfully applied.
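
The final check of such a test could look roughly like the sketch below. It is plain Python, not a pytest-bdd step, and assumes the rule groups have already been fetched in the shape returned by the Prometheus rules API (groups with a `name` and a list of `rules`, each having `name`, `query` and `labels`). The helper names, the sample data, and the assumption that the expression ends with `< <threshold>` are all illustrative.

```python
from typing import Any, Dict, List, Optional


def find_rule(
    groups: List[Dict[str, Any]], group_name: str, alert_name: str, severity: str
) -> Optional[Dict[str, Any]]:
    """Locate one alert by group name, alert name and severity label."""
    for group in groups:
        if group.get("name") != group_name:
            continue
        for rule in group.get("rules", []):
            same_name = rule.get("name") == alert_name
            same_severity = rule.get("labels", {}).get("severity") == severity
            if same_name and same_severity:
                return rule
    return None


def threshold_applied(rule: Optional[Dict[str, Any]], expected: int) -> bool:
    """Check that the rule expression ends with the expected threshold."""
    if rule is None:
        return False
    return rule.get("query", "").rstrip().endswith(f"< {expected}")


# Hypothetical data standing in for what Prometheus would report.
sample_groups = [
    {
        "name": "example-group",
        "rules": [
            {
                "name": "ExampleAlert",
                "query": "metric < 30",
                "labels": {"severity": "warning"},
            },
            {
                "name": "ExampleAlert",
                "query": "metric < 10",
                "labels": {"severity": "critical"},
            },
        ],
    }
]

critical_rule = find_rule(sample_groups, "example-group", "ExampleAlert", "critical")
critical_ok = threshold_applied(critical_rule, 10)
```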

Metadata

Labels

kind:enhancement (New feature or request)
topic:monitoring (Everything related to monitoring of services in a running cluster)
