OrbitWard/docs/alerting-design.md
2026-05-26 21:24:54 -06:00

# Alerting Design
Alerting is built around alert rules, incidents, notification policies, and notification history.
## Alert Rules
An alert rule turns monitor status or metric data into an incident. Initial rule behavior should support:
- Failure thresholds
- Recovery notifications
- Cooldown
- Severity
- Acknowledge
- Silence
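
As a rough illustration of how failure thresholds, cooldown, and recovery notifications might interact, here is a minimal sketch. The names `AlertRule`, `RuleState`, and `evaluate` are hypothetical and not part of OrbitWard, and the sketch assumes that "cooldown" means a minimum gap after the last notification before a new incident may open; the design above does not pin that down.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class AlertRule:
    # Hypothetical fields; not OrbitWard's actual schema.
    failure_threshold: int = 3                    # consecutive failures before opening
    cooldown: timedelta = timedelta(minutes=10)   # assumed: gap after last notification
    notify_on_recovery: bool = True
    severity: str = "warning"


@dataclass
class RuleState:
    consecutive_failures: int = 0
    incident_open: bool = False
    last_notified: Optional[datetime] = None


def evaluate(rule: AlertRule, state: RuleState, check_ok: bool, now: datetime) -> Optional[str]:
    """Apply one check result to the state; return 'open', 'recovered', or None."""
    if check_ok:
        state.consecutive_failures = 0
        if state.incident_open:
            state.incident_open = False
            if rule.notify_on_recovery:
                state.last_notified = now
                return "recovered"
        return None

    state.consecutive_failures += 1
    if state.incident_open or state.consecutive_failures < rule.failure_threshold:
        return None
    if state.last_notified is not None and now - state.last_notified < rule.cooldown:
        return None  # still cooling down; re-evaluated on the next failing check
    state.incident_open = True
    state.last_notified = now
    return "open"
```

Keeping the state separate from the rule means the same rule definition could be shared across monitors, with one `RuleState` per monitor. Acknowledge and silence are left out here because they act on an open incident rather than on the evaluation itself.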
## Incidents
Incidents represent active or historical alert events. They include opened time, resolved time, current status, severity, related asset, related monitor, related alert rule, notification history, acknowledgement, and silence state.
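
The attributes listed above could be modeled roughly as follows. This is only a sketch: the class name `Incident`, its field names, and the three status values are assumptions chosen for illustration, not a confirmed OrbitWard schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Incident:
    # Hypothetical field names mirroring the attributes listed above.
    asset_id: str
    monitor_id: str
    alert_rule_id: str
    severity: str
    opened_at: datetime
    resolved_at: Optional[datetime] = None
    acknowledged_at: Optional[datetime] = None
    silenced_until: Optional[datetime] = None
    notification_history: List[str] = field(default_factory=list)

    @property
    def status(self) -> str:
        """Derive the current status from the timestamps (assumed status values)."""
        if self.resolved_at is not None:
            return "resolved"
        if self.acknowledged_at is not None:
            return "acknowledged"
        return "open"

    def is_silenced(self, now: datetime) -> bool:
        """True while a silence window is set and has not yet expired."""
        return self.silenced_until is not None and now < self.silenced_until

    def duration(self, now: datetime) -> timedelta:
        """Time from opening until resolution, or until now if still active."""
        return (self.resolved_at or now) - self.opened_at
```

Deriving `status` from timestamps rather than storing it separately is one possible design choice; it avoids the two getting out of sync, at the cost of a fixed set of states.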
## Notifications
Initial channels:
- Email / SMTP
- Mattermost incoming webhook
- Zoom Team Chat incoming webhook
- Generic webhook
Alert messages should be human-readable and include asset, check, status, duration, timestamps, and a link back to OrbitWard.
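
A message containing those fields might be assembled along these lines. The function names, the message layout, and the example link are hypothetical. The only external detail relied on is that a Mattermost incoming webhook accepts a JSON body with a `text` field; payload shapes for the other channels are not covered here.

```python
import json
from datetime import datetime, timedelta, timezone


def format_alert(asset: str, check: str, status: str,
                 opened_at: datetime, now: datetime, link: str) -> str:
    """Build a plain-text alert message (layout is an assumption)."""
    minutes = int((now - opened_at).total_seconds() // 60)
    return (
        f"[{status.upper()}] {asset} / {check}\n"
        f"Opened: {opened_at.isoformat()}\n"
        f"Checked: {now.isoformat()}\n"
        f"Duration: {minutes} min\n"
        f"Details: {link}"
    )


def mattermost_payload(text: str) -> str:
    """Serialize the message as the JSON body for a Mattermost incoming webhook."""
    return json.dumps({"text": text})


opened = datetime(2026, 5, 26, 21, 0, tzinfo=timezone.utc)
message = format_alert(
    asset="db-01",
    check="tcp:5432",
    status="down",
    opened_at=opened,
    now=opened + timedelta(minutes=25),
    link="https://orbitward.example/incidents/42",  # placeholder URL
)
payload = mattermost_payload(message)
```

Building the text once and wrapping it per channel keeps the human-readable content identical across email and the webhook channels.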