Query Builder
The Query Builder is the admin-side adhoc SQL tool at /app/querybuilder. It talks directly to the ClickHouse cluster that stores your org's events, with org + environment scoping enforced at the engine level so you can't accidentally (or deliberately) read another tenant's data.
What it's for
- Exploring the events schema before you build a widget
- Answering one-off questions that don't justify a dashboard (“how many tenants used feature X last week?” — see the sketch after this list)
- Validating a widget's expected numbers against raw SQL before publishing
- Debugging why a dashboard widget returns 0 rows (usually a tenant/time filter mismatch)
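That second bullet might look like the following sketch, which assumes string_props is a Map column of per-event string properties; the 'feature' key and 'feature_x' value are hypothetical, so check the schema browser for your actual names:

-- "How many tenants used feature X last week?"
-- string_props['feature'] = 'feature_x' is an illustrative assumption.
SELECT uniqExact(tenant_id) AS tenants
FROM emban.events
WHERE timestamp > now() - INTERVAL 7 DAY
  AND string_props['feature'] = 'feature_x'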
Editing modes
The editor offers a visual builder (dimensions and measures picked from the schema browser) and a raw SQL mode with one-click formatting via sql-formatter, plus a side panel showing the full dataset catalog.
The schema browser
The left rail lists every table and materialized view your org can read, with its columns and types. The list comes from GET /v1/datasets. Clicking a dataset expands its columns; clicking a column inserts it into the visual builder's dimension or measure field depending on the active tab.
Long type annotations are abbreviated in the list — DateTime64(3, 'UTC') shows as DateTime64, LowCardinality(String) shows as Lc(String), Map(String, Float64) shows as Map. Hover over the column row to see the full type in the tooltip. Use Datasets → View DDL (or GET /v1/datasets/{name}/ddl) when you need the exact CREATE TABLE statement.
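For example (a sketch; the response shape here is illustrative, only the endpoint path is documented above):

# Fetch the exact CREATE TABLE for one dataset
GET /v1/datasets/emban.events/ddl
→ {"ddl": "CREATE TABLE emban.events ( ... )"}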
Running a query
Hit the Run button (or the keyboard shortcut shown in the toolbar) to execute. The response tab below shows:
- Columns — name and ClickHouse type per column
- Rows — the resolved values, with type-aware formatting for dates and maps
- Stats — wall_ms (end-to-end time including network), rows_returned, and truncated (set if the result was capped)
The result pane also offers Copy CSV and Copy JSON for pasting into a spreadsheet or notebook. History (per-browser, last 30 runs) lives under the History tab on the left rail and is kept in localStorage.
Limits
| Limit | Value | Why |
|---|---|---|
| SQL length | 8192 characters | Enforced before the query reaches ClickHouse. Rejects with 400 sql too long. |
| Execution time | 5 seconds | ClickHouse max_execution_time=5. Expensive queries get killed at the engine; you see a readable error back. |
| Result rows | 10,000 | ClickHouse max_result_rows=10000 with result_overflow_mode=break. Anything beyond is silently dropped and the truncated stat flips to true. |
| Statement type | SELECT or WITH only | Writes are blocked at two layers: a keyword pre-check and readonly=2 on the ClickHouse session. |
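A practical consequence: raw row dumps hit the 10,000-row cap quickly, so aggregate on the server instead of pulling events out one by one. A sketch:

-- Hourly counts stay far below the row cap and the 5 s budget.
SELECT toStartOfHour(timestamp) AS hour, count() AS events
FROM emban.events
WHERE timestamp > now() - INTERVAL 1 DAY
GROUP BY hour
ORDER BY hour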
Security model
Every adhoc query runs with a ClickHouse additional_table_filters setting that pins emban.events to the caller's org_id and environment. The filter is injected at SETTINGS level, not concatenated into the SQL, so it:
- Applies to every reference of emban.events, including inside subqueries, UNIONs, and CTEs
- Cannot be bypassed by omitting a WHERE or using a JOIN trick — the engine applies the filter before building the query plan
- Is invisible in your SQL — you don't have to type it, and you can't remove it
Writes are blocked by readonly=2 (all mutations forbidden, SETTINGS overrides still allowed for our filter injection). A query that tries to INSERT, ALTER, DROP, TRUNCATE, or OPTIMIZE is rejected with 400 only SELECT queries are allowed before it ever reaches ClickHouse.
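For intuition, the scoping behaves roughly like running your SQL with a setting such as the one below (illustrative only; the platform injects it per session, and the org/environment values here are made up):

-- You never write this yourself; the backend attaches it to the session.
SELECT tenant_id, count()
FROM emban.events
GROUP BY tenant_id
SETTINGS additional_table_filters = {'emban.events': 'org_id = \'org_abc\' AND environment = \'production\''}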
In practice this means you never add WHERE org_id = '...' to your SQL — the engine adds it for you. Cross-checking a widget's numbers against raw SQL gives you apples-to-apples results without manual scoping.
Saved queries
Saved queries live in Postgres (saved_queries table), scoped to the org. Anyone in the org can list and load them; only admins can create, update, delete, or star. Each saved query records name, sql, dataset (for UX context), starred, created_by, and updated_at.
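In Postgres terms the table is roughly shaped like this (a sketch; the exact types, id strategy, and constraints are assumptions, not the real migration):

-- A minimal sketch of saved_queries; types and defaults are assumptions.
CREATE TABLE saved_queries (
    id         serial PRIMARY KEY,
    org_id     text NOT NULL,        -- org scoping
    name       text NOT NULL,
    sql        text NOT NULL,
    dataset    text,                 -- UX context, e.g. 'emban.events'
    starred    boolean NOT NULL DEFAULT false,
    created_by text NOT NULL,
    updated_at timestamptz NOT NULL DEFAULT now()
);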
# List (auth required)
GET /v1/saved-queries
→ [{"id":12,"name":"top tenants","sql":"SELECT...","starred":true,...}]
# Create (admin only)
POST /v1/saved-queries
{
"name": "Top tenants by volume (7d)",
"sql": "SELECT tenant_id, count() FROM emban.events WHERE timestamp > now() - INTERVAL 7 DAY GROUP BY tenant_id ORDER BY 2 DESC LIMIT 10",
"dataset": "emban.events"
}
# Toggle star or rename
PATCH /v1/saved-queries/12
{"starred": true}
# Remove
DELETE /v1/saved-queries/12
Starred queries float to the top of the sidebar list for everyone in the org. Use this for the 3–5 queries your team actually reaches for — health checks, pipeline sanity, common customer questions.
Audit trail
Every successful adhoc run writes a query.adhoc.run row into audit_log:
{
"action": "query.adhoc.run",
"actor": "admin@example.com",
"org_id": "org_abc",
"metadata": {
"sql_excerpt": "SELECT tenant_id, count() FROM emban.events ...",
"rows": 847,
"wall_ms": 218,
"truncated": false
},
"created_at": "2026-04-24T14:30:00Z"
}
The sql_excerpt is the first 512 characters of the submitted SQL — enough to reconstruct “who ran what” without bloating the audit table. Admins can browse this under Admin → Audit and filter by actor or time range.
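If you have direct access to the audit store, the same view is one query away (a sketch assuming audit_log is a Postgres table with a JSONB metadata column; verify against your deployment):

-- Last week of adhoc runs, newest first.
SELECT actor, metadata->>'sql_excerpt' AS sql_excerpt, created_at
FROM audit_log
WHERE action = 'query.adhoc.run'
  AND created_at > now() - interval '7 days'
ORDER BY created_at DESC;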
Design patterns
- Most questions slice on tenant_id, environment, or a string_props key (see the combined sketch after this list).
- Always bound scans with WHERE timestamp > now() - INTERVAL N DAY. The event-time column is also the main sort key, so this is the cheapest possible filter.
- Never add AND org_id = '...' or AND environment = '...' — the engine adds both for you. Adding them yourself is harmless (the filters compound), but your SQL reads cleaner without them.
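Put together, a typical query following all three patterns (string_props and the 'plan' key are hypothetical; the rest mirrors the rules above):

-- Time-bounded, grouped on a low-cardinality value, and no manual
-- org/environment predicates (the engine injects those).
SELECT string_props['plan'] AS plan, count() AS events
FROM emban.events
WHERE timestamp > now() - INTERVAL 30 DAY
GROUP BY plan
ORDER BY events DESC
LIMIT 20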
API reference
# Execute (admin only)
POST /v1/query/adhoc
Authorization: Bearer YOUR_ADMIN_API_KEY
Content-Type: application/json
{"sql": "SELECT tenant_id, count() FROM emban.events WHERE timestamp > now() - INTERVAL 1 DAY GROUP BY tenant_id"}
# Response
{
"columns": [
{"name": "tenant_id", "type": "String"},
{"name": "count()", "type": "UInt64"}
],
"rows": [
["tenant_acme", 12453],
["tenant_foo", 8412]
],
"stats": {"wall_ms": 143, "rows_returned": 2, "truncated": false}
}
# Browse schema (any authenticated user)
GET /v1/datasets
GET /v1/datasets/{name}/ddl
# Saved queries
GET /v1/saved-queries (auth)
POST /v1/saved-queries (admin)
PATCH /v1/saved-queries/{id} (admin)
DELETE /v1/saved-queries/{id} (admin)
See also: Events (emban.events columns, prop shapes), Dashboards for how widgets are built on top of this same data, and the API reference for every endpoint surface.