
Don't fall back to demo_workspace silently: a small tale of a feature flag

We had three pieces of demo behavior baked into the backend that nobody had to opt into: a set of /demo/* routes, a synthetic event generator that wrote rows every 20 seconds, and seven handlers that fell back to a tenant named "demo_workspace" if the caller didn't specify one. All three were load-bearing for our public demo and silently wrong for self-hosted installs.

What was happening

The public site at emban.sidelabs.dev is a working demo: visitors click "Try the live demo" and see a dashboard ticking with fresh data. To make that work, we shipped two things that ran unconditionally on every server start:

go
// cmd/server/main.go

// Demo live event generator — keeps /live-demo dashboards "live" with fresh events every 20s.
demoLive := handler.NewDemoLiveGenerator(chWriter,
    "1a49032b-3209-49c6-bf00-b690fff220da",
    "prod",
)
go demoLive.Start(demoLiveCtx, 20*time.Second)

// Demo routes (/demo/embed-url, /v1/demo/embed-url), mounted on every server start.
r.Get("/demo/embed-url", demoHandler.GetEmbedURL)
r.Get("/v1/demo/embed-url", demoHandler.GetEmbedURL)

Plus the silent fallback in handlers like widget_preview.go:

go
tenantID := req.TenantID
if tenantID == "" {
    tenantID = "demo_workspace"
}

Seven handlers had this pattern: onboarding, demo, dashboard_preview, widget_preview, dashboard_export, discover, reports_impl. All of them silently defaulted to "demo_workspace" if the caller didn't pass a tenant_id.

Why it was OK on the demo

On emban.sidelabs.dev all three behaviors made sense:

  • The /demo/* routes power the public live-demo page. Disabling them would break the demo.
  • The synthetic event generator is what makes the dashboard tick. Without it, the demo looks dead between visitors.
  • The "demo_workspace" fallback gives a sensible default for the demo page's queries.

Why it was broken everywhere else

Now picture a self-hosted install. The operator pulls the binary, runs migrations, points it at their ClickHouse. Without changing anything:

  • A goroutine starts writing 20 synthetic events every 20 seconds into their production ClickHouse, in an org they don't own.
  • /demo/embed-url is mounted on their public domain, even though they have nothing to demo.
  • A widget preview that forgets to pass tenant_id hits the fallback. If the operator never created a tenant called "demo_workspace", the query returns no rows and the preview looks empty rather than failing. If they did, by coincidence, the preview shows that tenant's data instead of the data the operator expected.

There is also a less obvious problem: the synthetic generator costs nothing on the demo box, but on a self-hosted install it might be writing to an org that's billed for events. The operator never asked for it.

The fix

One env var, default false. Read once at startup. Gate the generator and routes behind it.

go
// internal/config/config.go

// Field on the config struct.
DemoModeEnabled bool

// ...

DemoModeEnabled: envBool("EMBAN_DEMO_ENABLED", false),

go
// cmd/server/main.go

if cfg.DemoModeEnabled {
    demoHandler := handler.NewDemoHandler(embedHandler, ...)
    r.Get("/demo/embed-url", demoHandler.GetEmbedURL)
    r.Get("/v1/demo/embed-url", demoHandler.GetEmbedURL)

    demoInsightsHandler := handler.NewDemoInsightsHandler(...)
    r.Get("/demo/insights", demoInsightsHandler.Handle)
    r.Get("/v1/demo/insights", demoInsightsHandler.Handle)
}

// ...

if cfg.DemoModeEnabled {
    demoLive := handler.NewDemoLiveGenerator(chWriter, ...)
    go demoLive.Start(demoLiveCtx, 20*time.Second)
}

On the public demo we set EMBAN_DEMO_ENABLED=true and document it loudly in .env.example:

text
# Demo mode — set to true ONLY on the public demo instance (emban.sidelabs.dev).
# When true: mounts /demo/* routes, runs the live-demo event generator.
# Self-hosted production installs should leave this as false (default).
EMBAN_DEMO_ENABLED=false

What we deliberately didn't gate

The "demo_workspace" fallback in preview/export handlers stayed. Reasoning: those handlers run inside an authenticated admin session, so the query is already scoped by auth.OrgID. The fallback can only return data the caller would already have access to — usually their own seeded sample data from onboarding. No cross-org leak is possible. We logged a TODO to make the default the user's first real tenant instead, but it's not a security issue.

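The deciding question isn't "is this code shippable?" It's "would I be happy if this code shipped to someone else's production?" The generator and the /demo/* routes fail that test; the auth-scoped fallback passes it. Those two different answers explain where the gate sits.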

What this taught us

  • Demo behavior should be loud and opt-in, not silent and on by default. A 20-second-tick goroutine that writes synthetic data to ClickHouse is not something a self-hosted operator should discover by accident.
  • Silent fallback to a magic string is rarely the right call. Better to require the caller to pass it, and 400 if they don't, than to quietly substitute and let them think their query worked.
  • Default values for env flags should match the safer environment. Self-hosted is the strict default; the demo box opts in.
  • When the demo is the first deployment, demo behavior creeps into shared paths. Add the gate before the second deployment target ships, not after.

The flag landed in one commit. It changed nothing for the live demo and removed a small but real footgun for everyone else.

Why this matters if you embed analytics in your SaaS

Self-hosted is a distinct deployment target with its own defaults. If you ever want to ship the same binary that powers your hosted demo to a customer's infrastructure, you have to assume the operator never reads the README. Anything that runs on startup without a flag becomes part of your support surface: silently writing synthetic events into a production ClickHouse is exactly the kind of incident you only learn about from an angry email. Demo behavior should be loud, opt-in, and documented next to the flag that turns it on.