Don't fall back to demo_workspace silently: a small tale of a feature flag
We had two pieces of demo behavior baked into the backend that nobody had to opt into: a synthetic event generator that wrote rows every 20 seconds, and seven handlers that fell back to a tenant named "demo_workspace" if the caller didn't specify one. Both were load-bearing for our public demo and silently wrong for self-hosted installs.
What was happening
The public site at emban.sidelabs.dev is a working demo: visitors click "Try the live demo" and see a dashboard ticking with fresh data. To make that work, we shipped two things that ran unconditionally on every server start:
// cmd/server/main.go
// Demo live event generator: keeps /live-demo dashboards "live" with fresh events every 20s.
demoLive := handler.NewDemoLiveGenerator(chWriter,
	"1a49032b-3209-49c6-bf00-b690fff220da",
	"prod",
)
go demoLive.Start(demoLiveCtx, 20*time.Second)
// /demo/embed-url, /v1/demo/embed-url
r.Get("/demo/embed-url", demoHandler.GetEmbedURL)
r.Get("/v1/demo/embed-url", demoHandler.GetEmbedURL)
Plus the silent fallback in handlers like widget_preview.go:
tenantID := req.TenantID
if tenantID == "" {
	tenantID = "demo_workspace"
}
Seven handlers had this pattern: onboarding, demo, dashboard_preview, widget_preview, dashboard_export, discover, reports_impl. All of them silently defaulted to "demo_workspace" if the caller didn't pass a tenant_id.
Why it was OK on the demo
On emban.sidelabs.dev all three behaviors made sense:
- The /demo/* routes power the public live-demo page. Disabling them would break the demo.
- The synthetic event generator is what makes the dashboard tick. Without it, the demo looks dead between visitors.
- The "demo_workspace" fallback gives a sensible default for the demo page's queries.
Why it was broken everywhere else
Now picture a self-hosted install. The operator pulls the binary, runs migrations, points it at their ClickHouse. Without changing anything:
- A goroutine starts writing 20 synthetic events every 20 seconds into their production ClickHouse, in an org they don't own.
- /demo/embed-url is mounted on their public domain, even though they have nothing to demo.
- A widget preview that forgets to pass tenant_id hits the fallback. If the operator never created a tenant called "demo_workspace", the query returns no rows. If they did, by coincidence, the preview shows a different customer's data than the operator expected.
There is also a less obvious problem: the synthetic generator costs nothing on the demo box, but on a self-hosted install it might be writing to an org that's billed for events. The operator never asked for it.
The fix
One env var, default false. Read once at startup. Gate the generator and routes behind it.
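The config snippet in this section calls an envBool helper that the post does not show. Here is a minimal sketch of what such a helper could look like, assuming it wraps strconv.ParseBool and treats a missing or unrecognized value as the default. The parseBoolOr name and the exact fallback behavior are assumptions for illustration, not the project's actual code.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseBoolOr interprets raw as a boolean. It returns def when raw is
// empty or is not a literal that strconv.ParseBool recognizes.
// (Hypothetical helper, introduced only for this sketch.)
func parseBoolOr(raw string, def bool) bool {
	v, err := strconv.ParseBool(strings.TrimSpace(raw))
	if err != nil {
		return def
	}
	return v
}

// envBool reads key from the environment and falls back to def when
// the variable is unset or unparseable. This is an assumed shape for
// the helper, not the real implementation.
func envBool(key string, def bool) bool {
	raw, ok := os.LookupEnv(key)
	if !ok {
		return def
	}
	return parseBoolOr(raw, def)
}

func main() {
	// Read once at startup; the default is the safe value.
	demo := envBool("EMBAN_DEMO_ENABLED", false)
	fmt.Println("demo mode enabled:", demo)
}
```

With this shape, a typo such as EMBAN_DEMO_ENABLED=yes leaves demo mode off rather than on, which matches the "default to the safer environment" intent.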
// internal/config/config.go
DemoModeEnabled bool
// ...
DemoModeEnabled: envBool("EMBAN_DEMO_ENABLED", false),
// cmd/server/main.go
if cfg.DemoModeEnabled {
	demoHandler := handler.NewDemoHandler(embedHandler, ...)
	r.Get("/demo/embed-url", demoHandler.GetEmbedURL)
	r.Get("/v1/demo/embed-url", demoHandler.GetEmbedURL)
	demoInsightsHandler := handler.NewDemoInsightsHandler(...)
	r.Get("/demo/insights", demoInsightsHandler.Handle)
	r.Get("/v1/demo/insights", demoInsightsHandler.Handle)
}
// ...
if cfg.DemoModeEnabled {
	demoLive := handler.NewDemoLiveGenerator(chWriter, ...)
	go demoLive.Start(demoLiveCtx, 20*time.Second)
}
On the public demo we set EMBAN_DEMO_ENABLED=true and document it loudly in .env.example:
# Demo mode — set to true ONLY on the public demo instance (emban.sidelabs.dev).
# When true: mounts /demo/* routes, runs the live-demo event generator.
# Self-hosted production installs should leave this as false (default).
EMBAN_DEMO_ENABLED=false
What we deliberately didn't gate
The "demo_workspace" fallback in preview/export handlers stayed. Reasoning: those handlers run inside an authenticated admin session, so the query is already scoped by auth.OrgID. The fallback can only return data the caller would already have access to — usually their own seeded sample data from onboarding. No cross-org leak is possible. We logged a TODO to make the default the user's first real tenant instead, but it's not a security issue.
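The scoping argument above can be illustrated with a toy in-memory version of a preview query. This is only a sketch: the Row type, the previewRows function, and the sample org and tenant names are hypothetical, and the real handlers query ClickHouse rather than a slice. The point it shows is that the org filter from the authenticated session always applies, so the tenant fallback can only select among rows the caller's org already owns.

```go
package main

import "fmt"

// Row is a hypothetical stored event, tagged with its owning org and tenant.
type Row struct {
	OrgID    string
	TenantID string
	Event    string
}

// previewRows mimics a preview query. authOrgID stands in for the org
// taken from the authenticated session; it is always enforced. An empty
// tenantID falls back to "demo_workspace", as in the handlers described.
func previewRows(rows []Row, authOrgID, tenantID string) []Row {
	if tenantID == "" {
		tenantID = "demo_workspace"
	}
	var out []Row
	for _, r := range rows {
		if r.OrgID == authOrgID && r.TenantID == tenantID {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	rows := []Row{
		{OrgID: "org-a", TenantID: "demo_workspace", Event: "a-sample"},
		{OrgID: "org-b", TenantID: "demo_workspace", Event: "b-sample"},
		{OrgID: "org-a", TenantID: "acme", Event: "a-real"},
	}
	// Caller from org-a forgets tenant_id: only org-a's sample row matches.
	for _, r := range previewRows(rows, "org-a", "") {
		fmt.Println(r.OrgID, r.TenantID, r.Event)
	}
}
```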
What this taught us
- Demo behavior should be loud and opt-in, not silent and on by default. A 20-second-tick goroutine that writes synthetic data to ClickHouse is not something a self-hosted operator should discover by accident.
- Silent fallback to a magic string is rarely the right call. Better to require the caller to pass it, and 400 if they don't, than to quietly substitute and let them think their query worked.
- Default values for env flags should match the safer environment. Self-hosted is the strict default; the demo box opts in.
- When the demo is the first deployment, demo behavior creeps into shared paths. Add the gate before the second deployment target ships, not after.
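The second lesson, rejecting a missing tenant instead of substituting one, might look like the following. This is a sketch under assumptions: it reads tenant_id from the query string and uses plain net/http, whereas the real handlers take it from a request struct; previewHandler and statusFor are hypothetical names.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// previewHandler rejects requests that omit tenant_id instead of
// quietly defaulting to a magic tenant name.
func previewHandler(w http.ResponseWriter, r *http.Request) {
	tenantID := r.URL.Query().Get("tenant_id")
	if tenantID == "" {
		http.Error(w, "tenant_id is required", http.StatusBadRequest)
		return
	}
	fmt.Fprintf(w, "preview for tenant %s\n", tenantID)
}

// statusFor runs previewHandler in-process against target and
// returns the HTTP status code it produced.
func statusFor(target string) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, target, nil)
	previewHandler(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor("/preview"))                // 400: tenant_id missing
	fmt.Println(statusFor("/preview?tenant_id=acme")) // 200: tenant supplied
}
```

The caller learns immediately that the request was incomplete, rather than receiving an empty or unexpected result that looks like success.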
The flag landed in one commit. It changed nothing for the live demo and removed a small but real footgun for everyone else.
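One way to keep the gate from regressing is to check that the demo routes are absent when the flag is off. The sketch below assumes the standard library's http.ServeMux in place of the project's router, and the newRouter function, the /healthz route, and the placeholder handler bodies are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newRouter builds the route table. Demo routes are mounted only when
// demoEnabled is true, mirroring the cfg.DemoModeEnabled gate.
func newRouter(demoEnabled bool) http.Handler {
	mux := http.NewServeMux()
	// A hypothetical always-on route, to show the router still serves.
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	})
	if demoEnabled {
		// Placeholder standing in for the real demo handler.
		mux.HandleFunc("/demo/embed-url", func(w http.ResponseWriter, r *http.Request) {
			fmt.Fprintln(w, "demo embed url")
		})
	}
	return mux
}

// statusOf sends an in-process GET to path and returns the status code.
func statusOf(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	fmt.Println("flag off:", statusOf(newRouter(false), "/demo/embed-url")) // 404
	fmt.Println("flag on: ", statusOf(newRouter(true), "/demo/embed-url"))  // 200
}
```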