Alerting
Thalassa Prometheus Service includes a built-in Alertmanager for configuring alerts and routing notifications. This page explains how the Alertmanager and the Ruler work and how to configure them for Thalassa Prometheus Service.
Overview
The alerting system consists of:
- Alert rules, which define the conditions that trigger alerts.
- Recording rules, which pre-compute expensive queries to improve query performance.
- Alertmanager configuration, for routing alerts to various notification channels.
- Notification channels, such as email, Slack, Teams, webhooks, and other supported integrations.
Managing Rules and Alertmanager Configuration
Thalassa Prometheus Service supports multiple methods for managing alert rules, recording rules, and Alertmanager configuration:
- Mimirtool, a command-line tool for managing rules and Alertmanager configuration.
- The Console, a web-based interface for managing rules and notification channels.
- Prometheus APIs, compatible with the standard Prometheus Ruler and Alertmanager APIs.
- Other tools that support the Prometheus APIs, such as amtool and promtool.
Creating Alert Rules
Basic Alert Rule
Create an alert rule that triggers when a condition is met:
groups:
  - name: infrastructure
    interval: 30s
    rules:
      - alert: HighCPUUsage
        # Aggregate per instance so that {{ $labels.instance }} is set in the annotations
        expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 5m
        labels:
          severity: warning
          team: platform
        annotations:
          summary: "High CPU usage detected"
          description: "CPU usage is {{ $value }}% on {{ $labels.instance }}"

Alert Rule Components
- expr: PromQL expression; the alert becomes active for each time series the expression returns
- for: Duration the condition must hold continuously before the alert fires
- labels: Labels attached to the alert
- annotations: Human-readable information about the alert
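As a further illustration of how these components fit together, the sketch below defines a rule that fires when a scrape target has been unreachable for two minutes. The group name, alert name, and label values are illustrative assumptions. `up` is the metric Prometheus records for every scrape target, and the example assumes your collectors forward it to the service.

```yaml
groups:
  - name: availability            # hypothetical group name
    rules:
      - alert: InstanceDown       # hypothetical alert name
        # expr: returns one series per target whose last scrape failed
        expr: up == 0
        # for: the target must stay down for 2 minutes before the alert fires
        for: 2m
        # labels: attached to the alert and used by Alertmanager for routing
        labels:
          severity: critical
        # annotations: human-readable text, with label values templated in
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
          description: "Target {{ $labels.instance }} of job {{ $labels.job }} has been unreachable for more than 2 minutes."
```

While the expression returns a series but the `for` duration has not yet elapsed, the alert is pending and no notification is sent.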
Recording Rules
Recording rules pre-compute expensive queries to improve query performance and reduce costs:
groups:
  - name: recording_rules
    interval: 30s
    rules:
      - record: instance:node_cpu:rate5m
        expr: rate(node_cpu_seconds_total[5m])
      - record: instance:node_memory:usage_percent
        expr: (1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100

Managing Rules with Mimirtool
Mimirtool is a command-line tool for managing rules and Alertmanager configuration in Cortex/Mimir-based Prometheus services.
Installation
Download Mimirtool from the Grafana Mimir releases page or install it via a package manager.
Authentication
Configure authentication using OIDC:
# Set environment variables
export MIMIR_ADDRESS=https://prometheus.nl-01.thalassa.cloud
export MIMIR_TENANT_ID=<your-tenant-id>
# Authenticate using OIDC
export THALASSA_BEARER_TOKEN=$(tcloud oidc get-bearer-token)
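Mimirtool authenticates on every command rather than through a separate login step. Pass the token with --auth-token and the tenant with --id, or export the token as MIMIR_AUTH_TOKEN so that it is picked up automatically.

Managing Alert Rules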
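The commands in this section operate on a rule file such as rules.yaml. As a minimal sketch, upstream Mimirtool rule files use the standard Prometheus `groups` layout and accept a top-level `namespace` key naming the namespace the groups are loaded into. The namespace name below is a hypothetical example; the rule is one of the recording rules shown earlier on this page.

```yaml
# rules.yaml (illustrative)
namespace: infrastructure-rules   # hypothetical namespace name
groups:
  - name: recording_rules
    interval: 30s
    rules:
      - record: instance:node_memory:usage_percent
        expr: (1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100
```

The `namespace` value is what the Mimirtool commands and the Ruler API paths refer to as <namespace>.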
Load rules from file:
mimirtool rules load rules.yaml \
  --address "$MIMIR_ADDRESS" \
  --id "$MIMIR_TENANT_ID" \
  --auth-token "$THALASSA_BEARER_TOKEN"

List existing rules:
mimirtool rules list \
  --address "$MIMIR_ADDRESS" \
  --id "$MIMIR_TENANT_ID" \
  --auth-token "$THALASSA_BEARER_TOKEN"

Delete a rule group:
mimirtool rules delete <namespace> <group> \
  --address "$MIMIR_ADDRESS" \
  --id "$MIMIR_TENANT_ID" \
  --auth-token "$THALASSA_BEARER_TOKEN"

Verify rules:
mimirtool rules verify rules.yaml \
  --address "$MIMIR_ADDRESS" \
  --id "$MIMIR_TENANT_ID" \
  --auth-token "$THALASSA_BEARER_TOKEN"

Managing Alertmanager Configuration
You can configure Alertmanager for your Prometheus tenant using mimirtool. This allows you to upload, view, or remove your Alertmanager settings directly from the command line.
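The commands below expect an Alertmanager configuration file. As a minimal sketch, assuming a single webhook receiver, alertmanager.yaml could look like the following. The receiver name and the webhook URL are placeholders, not values provided by Thalassa.

```yaml
# alertmanager.yaml (illustrative)
route:
  receiver: default              # every alert is delivered to this receiver
  group_by: ['alertname']        # one notification per alert name
receivers:
  - name: default
    webhook_configs:
      - url: https://example.com/alert-hook   # placeholder endpoint
```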
Load Alertmanager config:
mimirtool alertmanager load alertmanager.yaml \
  --address "$MIMIR_ADDRESS" \
  --id "$MIMIR_TENANT_ID" \
  --auth-token "$THALASSA_BEARER_TOKEN"

Get current Alertmanager config:
mimirtool alertmanager get \
  --address "$MIMIR_ADDRESS" \
  --id "$MIMIR_TENANT_ID" \
  --auth-token "$THALASSA_BEARER_TOKEN"

Delete Alertmanager config:
mimirtool alertmanager delete \
  --address "$MIMIR_ADDRESS" \
  --id "$MIMIR_TENANT_ID" \
  --auth-token "$THALASSA_BEARER_TOKEN"

Managing Rules via Prometheus APIs
Thalassa Prometheus Service is compatible with standard Prometheus Ruler and Alertmanager APIs.
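In upstream Grafana Mimir and Cortex, a rule upload through the Ruler API carries a single rule group in the request body, without the `groups:` wrapper used in Prometheus rule files; the namespace comes from the URL path. Whether Thalassa Prometheus Service expects the same shape is an assumption here, so treat the following as a sketch of such a request body rather than a confirmed format:

```yaml
# Request body for a rule upload to .../rules/<namespace>
# One rule group per request (upstream Mimir/Cortex behaviour, assumed here)
name: infrastructure
interval: 30s
rules:
  - alert: HighCPUUsage
    expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
    for: 5m
    labels:
      severity: warning
```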
Ruler API
List rule groups:
curl -H "Authorization: Bearer $TOKEN" \
  https://prometheus.nl-01.thalassa.cloud/api/v1/rules

Get rules for namespace:
curl -H "Authorization: Bearer $TOKEN" \
  https://prometheus.nl-01.thalassa.cloud/api/v1/rules/<namespace>

Load rules (POST):
curl -X POST \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/yaml" \
  --data-binary @rules.yaml \
  https://prometheus.nl-01.thalassa.cloud/api/v1/rules/<namespace>

Delete rule group:
curl -X DELETE \
  -H "Authorization: Bearer $TOKEN" \
  https://prometheus.nl-01.thalassa.cloud/api/v1/rules/<namespace>/<group>

Alertmanager API
Get Alertmanager config:
curl -H "Authorization: Bearer $TOKEN" \
  https://prometheus.nl-01.thalassa.cloud/api/v1/alerts

Set Alertmanager config:
curl -X POST \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/yaml" \
  --data-binary @alertmanager.yaml \
  https://prometheus.nl-01.thalassa.cloud/api/v1/alerts

Alert Routing
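Routing sends each alert to a named receiver, and every receiver named in a route must also be defined under `receivers` in the same Alertmanager configuration. The sketch below defines the four receivers used by the routing example in this section. All keys, URLs, channels, and addresses are placeholders chosen for illustration.

```yaml
receivers:
  - name: default
    webhook_configs:
      - url: https://example.com/alert-hook         # placeholder endpoint
  - name: pagerduty
    pagerduty_configs:
      - routing_key: <pagerduty-integration-key>    # placeholder key
  - name: slack-alerts
    slack_configs:
      - api_url: https://hooks.slack.com/services/<your-webhook-path>   # placeholder URL
        channel: '#alerts'                          # placeholder channel
  - name: email-team
    email_configs:
      - to: platform-team@example.com               # placeholder addresses and SMTP host
        from: alertmanager@example.com
        smarthost: smtp.example.com:587
```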
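Route alerts to different channels based on severity or labels. Routes are evaluated in order and the first match wins, so an alert with severity: warning and team: platform goes to slack-alerts unless continue: true is set on that route: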
route:
  group_by: ['alertname', 'severity']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 12h
  receiver: 'default'
  routes:
    - match:
        severity: critical
      receiver: 'pagerduty'
    - match:
        severity: warning
      receiver: 'slack-alerts'
    - match:
        team: platform
      receiver: 'email-team'

References
- Alerting Documentation — Official Alertmanager documentation
- Alert Rules — Alert rule configuration
- Recording Rules — Recording rule configuration
- Mimirtool Documentation — Mimirtool usage guide
- Prometheus Ruler API — Ruler API reference
- Alertmanager API — Alertmanager API reference
- Grafana Integration — Set up Grafana for visualization