Alerts
Alerts evaluate a widget's query every 60 seconds and fire when the resulting value crosses a threshold, swings by a delta, or stops reporting data entirely. They deliver through email, webhook, or both — and every attempt is recorded in the webhook_deliveries observability table.
Every alert wraps the query of an existing widget (a dashboard widget or a standalone widget from the widget library). The widget's metric, numeric prop, and event name define what is being measured; the alert only adds the condition and the delivery.
Condition types
| Type | When it fires | Required fields |
|---|---|---|
| threshold | Current value crosses threshold_value using the chosen operator (gt, lt, gte, lte). | operator, threshold_value |
| delta | Absolute percent change vs. the previous period crosses threshold_value. Useful for spike/dip detection. | operator, threshold_value |
| no_data | The widget query returns zero rows over the lookback window. Good for catching ingestion outages. | — |
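To make delta concrete: with operator gt and threshold_value 25, a reading of 52 against a previous-period reading of 40 is a change of |52 − 40| / 40 = 30%, which crosses 25 and fires; a move from 40 to 45 (12.5%) does not.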
Evaluation model
Every enabled alert re-evaluates on the 60-second tick:
1. Resolve the widget from widget_id. If dashboard_id is set, the evaluator reads the inline widget from that dashboard's published config. Otherwise it reads a standalone widget from the widget library, again preferring the published snapshot so you never get paged on a draft edit.
2. Run the query against ClickHouse with a period derived from lookback_minutes, with no group_by and no granularity; the alert only cares about the scalar value.
3. Evaluate the condition. Threshold compares the current value against threshold_value. Delta compares the previous-period percentage change. No-data short-circuits on empty result sets. Anything else transitions the alert to ok.
4. On the transition to firing, write an alert_events row (state=firing, value, threshold), flip last_state to firing, and trigger delivery. Resolution flips back to ok. Delivery only runs on the firing transition, not every tick.
5. Once fired, the alert stays quiet until cooldown_minutes has elapsed since last_fired_at. This suppresses paging storms without silencing the alert entirely.
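The tick is easier to see in code. This is a minimal sketch of the logic above; every name in it (evaluate_tick, record_event, deliver, the alert fields as dict keys) is invented for illustration and is not Emban's actual internals:

from operator import gt, lt, ge, le

OPS = {"gt": gt, "lt": lt, "gte": ge, "lte": le}

def evaluate_tick(alert, value, previous, row_count, now, record_event, deliver):
    # Step 3: evaluate the condition against the scalar query result.
    if alert["condition_type"] == "no_data":
        breached = row_count == 0                    # empty result set
    elif alert["condition_type"] == "threshold":
        breached = OPS[alert["operator"]](value, alert["threshold_value"])
    else:  # delta: absolute percent change vs. the previous period
        change = abs(value - previous) / previous * 100 if previous else 0.0
        breached = OPS[alert["operator"]](change, alert["threshold_value"])

    state = "firing" if breached else "ok"
    if state == alert["last_state"]:
        return                                       # no transition, nothing to do

    # Step 5: a re-fire inside the cooldown window is suppressed outright.
    if state == "firing" and alert["last_fired_at"] is not None:
        elapsed = (now - alert["last_fired_at"]).total_seconds()
        if elapsed < alert["cooldown_minutes"] * 60:
            return

    # Step 4: record the transition; deliver only on the firing edge.
    record_event(alert, state, value)
    if state == "firing":
        alert["last_fired_at"] = now
        deliver(alert, state, value)
    alert["last_state"] = state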
Creating an alert
Create, update, and delete endpoints require an admin API key or admin-scoped session. List, get, and events endpoints work with any authenticated session.
POST /v1/alerts
Authorization: Bearer YOUR_ADMIN_API_KEY
Content-Type: application/json
{
"dashboard_id": "dash_abc123",
"widget_id": "w_api_calls",
"name": "API error rate too high",
"condition_type": "threshold",
"operator": "gt",
"threshold_value": 50,
"lookback_minutes": 15,
"email_to": ["oncall@example.com"],
"webhook_url": "https://hooks.example.com/emban"
}
For a standalone widget (not attached to a dashboard), omit dashboard_id:
{
"widget_id": "wgt_standalone_latency",
"name": "p95 latency spike",
"condition_type": "delta",
"operator": "gt",
"threshold_value": 25,
"lookback_minutes": 60,
"webhook_url": "https://hooks.example.com/emban"
}
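The same request works from a script. Here is a sketch using Python's requests library; EMBAN_URL and EMBAN_ADMIN_KEY are placeholders for your deployment:

import os
import requests

resp = requests.post(
    f"{os.environ['EMBAN_URL']}/v1/alerts",
    headers={"Authorization": f"Bearer {os.environ['EMBAN_ADMIN_KEY']}"},
    json={
        "widget_id": "wgt_standalone_latency",
        "name": "p95 latency spike",
        "condition_type": "delta",
        "operator": "gt",
        "threshold_value": 25,
        "lookback_minutes": 60,
        "webhook_url": "https://hooks.example.com/emban",
    },
    timeout=10,
)
resp.raise_for_status()
print(resp.json())  # the created alert, including its id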
Webhook payload
Webhooks are POSTed with Content-Type: application/json. The payload identifies the alert, the dashboard/widget, and the exact condition that fired:
{
"alert_id": 42,
"name": "API error rate too high",
"dashboard_id": "dash_abc123",
"widget_id": "w_api_calls",
"condition": {
"type": "threshold",
"operator": "gt",
"value": 50
},
"state": "firing",
"fired_at": "2026-04-24T14:30:00Z"
}
Emban retries webhooks up to 3 times on failure with backoff (0s, 30s, 120s). A response code in the 2xx range stops retries; anything else (4xx, 5xx, connection error, 10-second timeout) triggers the next attempt. All attempts — successes and failures — are logged to webhook_deliveries and visible under Admin → Webhooks in the app.
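On the receiving side, acknowledge quickly and do the real work out of band; the 10-second timeout counts a slow handler as a failed attempt. A minimal receiver sketch (Flask is an arbitrary choice here, and the route path is a placeholder; the payload fields are the documented ones):

import queue
import threading

from flask import Flask, request

app = Flask(__name__)
work = queue.Queue()

@app.post("/emban")
def emban_alert():
    work.put(request.get_json(force=True))  # hand off immediately
    return "", 204                          # any 2xx stops Emban's retries

def worker():
    while True:
        payload = work.get()
        # Route on the documented fields.
        print(f"alert {payload['alert_id']} ({payload['name']}) "
              f"is {payload['state']} as of {payload['fired_at']}")

threading.Thread(target=worker, daemon=True).start()

if __name__ == "__main__":
    app.run(port=8000)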
Webhook URLs that resolve to private or reserved address ranges (127.0.0.0/8, 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16, link-local) are rejected before the request is dispatched. If you need internal delivery, route it through an external relay or a public proxy.
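The check is the standard private-range test. A sketch of the idea, not Emban's exact implementation:

import ipaddress
import socket
from urllib.parse import urlparse

def is_dispatchable(webhook_url: str) -> bool:
    """Reject URLs whose host resolves to a private or reserved address."""
    host = urlparse(webhook_url).hostname
    if host is None:
        return False
    for info in socket.getaddrinfo(host, None):
        addr = ipaddress.ip_address(info[4][0])  # sockaddr's first element
        if addr.is_private or addr.is_loopback or addr.is_link_local or addr.is_reserved:
            return False
    return True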
Email delivery
Emails send through the org's configured SMTP relay. The subject is [Emban Alert] <alert name>; the body includes the condition, the widget reference, and a link back to the dashboard:
Subject: [Emban Alert] API error rate too high
Alert: API error rate too high
Condition: threshold gt 50.00
Dashboard: dash_abc123
Widget: w_api_calls
View dashboard: https://emban.example.com/app/dashboards/dash_abc123
---
Emban Alert System
Managing alerts
# List alerts
GET /v1/alerts
→ [{"id":42,"name":"...","last_state":"ok",...}, ...]
# Get one
GET /v1/alerts/42
# Pause/resume (does not delete alert_events history)
PATCH /v1/alerts/42
{"enabled": false}
# Adjust threshold without recreating
PATCH /v1/alerts/42
{"threshold_value": 75, "lookback_minutes": 30}
# Full history of firings and resolutions
GET /v1/alerts/42/events
→ [{"fired_at":"...","state":"firing","value":54.2,"threshold":50}, ...]
# Delete alert (cascades alert_events)
DELETE /v1/alerts/42
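One practical use of the enabled flag is bracketing a deploy so a known-noisy rollout does not page. A sketch; EMBAN_URL and EMBAN_ADMIN_KEY are placeholders:

import os
import requests

BASE = os.environ["EMBAN_URL"]
AUTH = {"Authorization": f"Bearer {os.environ['EMBAN_ADMIN_KEY']}"}

def set_enabled(alert_id: int, enabled: bool) -> None:
    resp = requests.patch(f"{BASE}/v1/alerts/{alert_id}",
                          headers=AUTH, json={"enabled": enabled}, timeout=10)
    resp.raise_for_status()

set_enabled(42, False)  # pause before the deploy
# ... roll out ...
set_enabled(42, True)   # resume; alert_events history is untouched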
Design patterns
- Pick a sane lookback. A lookback_minutes of 1 means the alert evaluates a single minute of data: small sample, noisy signal. Prefer 5, 15, or 60 so transient spikes don't page you.
- Match cooldown to incident length. If your incidents typically run for N minutes, set cooldown_minutes to at least that. Shorter cooldowns page the same incident twice; longer ones mute real re-escalations.
- Add a heartbeat. A no_data alert on a widget that counts events over 10 minutes tells you the pipeline stopped. It is the cheapest possible ingestion-outage page and costs you nothing except the widget (example below).
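That heartbeat is one POST to the creation endpoint above. The widget id here is a placeholder for any widget that counts events; per the condition table, no_data needs no operator or threshold_value:

POST /v1/alerts
Authorization: Bearer YOUR_ADMIN_API_KEY
Content-Type: application/json
{
  "widget_id": "wgt_event_count",
  "name": "Ingestion heartbeat",
  "condition_type": "no_data",
  "lookback_minutes": 10,
  "email_to": ["oncall@example.com"]
}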
Observability
Every webhook attempt is recorded in the webhook_deliveries Postgres table with org scope, source (alert vs. report), attempt number, HTTP status, response excerpt (first 1 KB), duration, and error. The Admin → Webhooks page aggregates this into a per-alert summary:
- Success 24h — 2xx responses in the last day
- Fail 24h — non-2xx, connection errors, or timeouts
- Last response — status code of the most recent attempt, with a click-through to the alert config
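The rollup itself is simple to reproduce from delivery rows. A sketch; the field names on each record (status, created_at) are assumed for the example, not the table's documented schema:

from datetime import datetime, timedelta, timezone

def summarize(deliveries, now=None):
    now = now or datetime.now(timezone.utc)
    recent = [d for d in deliveries if d["created_at"] >= now - timedelta(hours=24)]
    ok = sum(1 for d in recent if d["status"] and 200 <= d["status"] < 300)
    last = max(deliveries, key=lambda d: d["created_at"], default=None)
    return {
        "success_24h": ok,                  # 2xx responses in the last day
        "fail_24h": len(recent) - ok,       # non-2xx, connection errors, timeouts
        "last_response": last["status"] if last else None,
    }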
The Admin → Webhooks page is the day-to-day window into webhook_deliveries observability. The API reference lists every alert endpoint and parameter.