The full Prometheus Alertmanager configuration reference is available at https://prometheus.io/docs/alerting/latest/configuration/. This document covers only a small set of considerations to keep in mind.
The Alertmanager is configured using a YAML-based configuration file. Essential configuration components and parameters include:
Global Configurations
resolve_timeout
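: This global setting defines the default duration after which Alertmanager considers an alert resolved if it has not been updated. It applies only to alerts that do not carry their own end time. The snippet below also sets the global SMTP defaults used by email receivers; the {{ .Values... }} placeholders are Helm template values.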
Example snippet:
global:
  resolve_timeout: 5m
  smtp_smarthost: {{ .Values.email_config.smtp_host }}:{{ .Values.email_config.smtp_port }}
  smtp_from: {{ .Values.email_config.smtp_from }}
  smtp_auth_username: {{ .Values.email_config.smtp_auth_username }}
  smtp_auth_password: {{ .Values.email_config.smtp_auth_password }}
Route Configurations
receiver
: Specifies the default receiver for alerts that match no child route.
group_by
: Groups alerts by specific labels. In this example, alerts are grouped by alertname and priority.
group_wait
: Specifies how long to wait before sending the first notification for a new group. Alerts that arrive for the same group within this window are included in that notification.
group_interval
: Defines how long to wait before sending a notification about new alerts added to a group that has already been notified.
repeat_interval
: Specifies how long to wait before re-sending a notification for an alert group that is still firing.
routes
: Defines child routing rules. In this example, alerts with a severity label set to "critical" match a child route that sends them to the alert-emailer receiver. All other alerts fall back to the default receiver, which here is also alert-emailer.
Example snippet:
route:
  receiver: alert-emailer
  group_by: ['alertname', 'priority']
  group_wait: 10s
  group_interval: 5m
  repeat_interval: 30m
  routes:
    - receiver: alert-emailer
      matchers:
        - severity="critical"
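A child route becomes more useful when it points at a different receiver than the default. The following is a minimal sketch of that variation, not part of the configuration above: the receiver names urgent-email and normal-email and both addresses are illustrative assumptions, and the email receivers rely on the global SMTP settings already being defined.

```yaml
route:
  # Default receiver: used when no child route matches (assumed name).
  receiver: normal-email
  group_by: ['alertname', 'severity']
  group_wait: 10s
  group_interval: 5m
  repeat_interval: 30m
  routes:
    # Alerts labelled severity="critical" go to a separate receiver (assumed name).
    - receiver: urgent-email
      matchers:
        - severity="critical"

receivers:
  - name: normal-email
    email_configs:
      - to: "team@example.com"      # placeholder address
  - name: urgent-email
    email_configs:
      - to: "oncall@example.com"    # placeholder address
```

Every receiver named in a route must also be defined under receivers, otherwise Alertmanager rejects the configuration at load time.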
Receiver Configuration
receivers
: Specifies the receivers that alerts can be routed to. Each receiver has a name and can carry configurations for one or more notification channels, such as email, Slack, or other integrations.
Example snippet:
receivers:
  - name: alert-emailer
    email_configs:
      - to: "team@example.com"
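Because a receiver can combine several channels, the same alert-emailer receiver could also post to Slack. The sketch below assumes a Slack incoming webhook; the webhook URL and channel name are placeholders, not values from this deployment.

```yaml
receivers:
  - name: alert-emailer
    email_configs:
      - to: "team@example.com"
    slack_configs:
      # Placeholder webhook URL: replace with a real Slack incoming webhook.
      - api_url: "https://hooks.slack.com/services/REPLACE/WITH/WEBHOOK"
        channel: "#alerts"          # assumed channel name
        send_resolved: true         # also notify when the alert resolves
```

With both blocks present, each notification for this receiver is delivered to every configured channel.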
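When running more than one Alertmanager instance, it is recommended to point Prometheus to a list of all Alertmanagers instead of load-balancing between them. Each instance should receive every alert so that the instances can deduplicate notifications among themselves.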
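On the Prometheus side, this could look like the following fragment of prometheus.yml. It is a sketch under assumptions: the three hostnames are hypothetical, and 9093 is the default Alertmanager port.

```yaml
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            # List every Alertmanager instance directly (hypothetical hostnames),
            # rather than a single load-balanced address.
            - alertmanager-0.example.internal:9093
            - alertmanager-1.example.internal:9093
            - alertmanager-2.example.internal:9093
```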