Engineering · April 27, 2026 · 10 min read

We dogfooded AuthzX — here's what broke

Before launching AuthzX, we decided to use it to authorize our own admin dashboard. Every feature in the dashboard — applications, policies, roles, subjects, settings — is gated by AuthzX policies, evaluated by the AuthzX Agent running locally alongside the dashboard.

This is the story of what worked, what broke, and what we learned by eating our own cooking.

The setup

Our admin dashboard has three roles: owner, admin, and developer. Each role maps to a set of dashboard features — applications, policies, resources, subjects, settings, user management — with different actions (view, create, edit, delete, manage).

We defined these as AuthzX resources, created policies in Terraform, and wrapped every dashboard route with a PermissionGate component that calls the AuthzX Agent before rendering. If the agent says deny, you don't see the button, the page, or the menu item.
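
A minimal sketch of the gating pattern, not the actual PermissionGate component or the AuthzX Agent API. The `GateRequest` shape, the `CheckFn` type, and the `gate` helper are hypothetical names, and a synchronous stub stands in for the HTTP call to the local agent. Treating an agent error as a deny is an assumption of this sketch.

```typescript
// Hypothetical shapes for illustration; this is not the AuthzX Agent API.
type GateAction = "view" | "create" | "edit" | "delete" | "manage";

interface GateRequest {
  subject: string;  // who is asking, e.g. a user ID
  resource: string; // what they want, e.g. "dashboard_settings"
  action: GateAction;
}

// Stand-in for the call to the local agent. A real gate would make an HTTP
// request; a synchronous function keeps the sketch self-contained.
type CheckFn = (req: GateRequest) => boolean;

// Render only on allow. On deny, or if the check throws, return null so the
// button, page, or menu item never appears (fail closed).
function gate<T>(check: CheckFn, req: GateRequest, render: () => T): T | null {
  let allowed = false;
  try {
    allowed = check(req);
  } catch (err) {
    allowed = false;
  }
  return allowed ? render() : null;
}

// Stub agent: the owner may do anything, the developer may only view applications.
const stubAgent: CheckFn = (req) =>
  req.subject === "owner-1" ||
  (req.subject === "dev-1" &&
    req.resource === "dashboard_applications" &&
    req.action === "view");
```

With this stub, `gate(stubAgent, { subject: "dev-1", resource: "dashboard_settings", action: "edit" }, () => "Edit settings")` returns `null`, so the control is never rendered for the developer.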

The goal was simple: prove that the product works by running it on itself.

What worked

The agent is fast. Sub-millisecond policy evaluation. The dashboard makes dozens of permission checks per page load, and none of them is perceptible. We measured 10ms end-to-end, including Docker overhead and HTTP transport; the Rego evaluation itself takes hundreds of microseconds.
Terraform-managed policies were the right call. Every policy change went through a terraform apply. We could review, diff, and roll back. When we refactored from UUID-based resource matching to name-based, the diff was clean and auditable.
Observability paid off immediately. The decision log showed every allow and deny. When a developer role couldn't access a page, we could see exactly which policy evaluated and why it denied. Debugging authz without an audit trail is guesswork; with one, it's a lookup.
The dogfood narrative is real. Being able to say “we run our own product on AuthzX” isn't just marketing. It forces every edge case to surface before customers hit it.

What broke

Quite a bit, actually.

1. The Rego fan-out bug

Our Rego rules allowed a policy to match by application scope OR by resource list. This meant a policy scoped to “Application A” with resources defined in it would permit access to any resource in Application A, not just the ones explicitly listed. It was an over-permitting bug, the kind that silently passes every test that only asserts allows, because everything gets allowed.

We caught it during manual testing when a developer could access a settings page they shouldn't have seen. The fix was a logic tightening in the Rego rules, but the fact that it wasn't caught by automated tests was the bigger lesson. We've since added Rego unit tests for every evaluation path.
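
The fan-out is easier to see in code. The sketch below is a TypeScript model of the matching logic, not the Rego rules themselves. The type and function names are hypothetical, and the tightened version is an assumption about the shape of the fix (application match AND explicit resource match), since the exact change is not shown here.

```typescript
// A TypeScript model of the matching logic, not the actual Rego rules.
interface ScopedPolicy {
  application: string;
  resources: string[]; // resources explicitly listed in the policy
}

interface AccessRequest {
  application: string;
  resource: string;
}

// Buggy shape: application scope OR resource list. Any resource in the
// policy's application matches, so the explicit list never narrows anything.
function matchesFanOut(policy: ScopedPolicy, req: AccessRequest): boolean {
  return (
    policy.application === req.application ||
    policy.resources.indexOf(req.resource) !== -1
  );
}

// Tightened shape (an assumption about the fix): the application must match
// AND the resource must be explicitly listed.
function matchesTightened(policy: ScopedPolicy, req: AccessRequest): boolean {
  return (
    policy.application === req.application &&
    policy.resources.indexOf(req.resource) !== -1
  );
}

const devPolicy: ScopedPolicy = {
  application: "Application A",
  resources: ["dashboard_applications"],
};

// A resource in the same application that the policy does not list.
const settingsRequest: AccessRequest = {
  application: "Application A",
  resource: "dashboard_settings",
};
```

Here `matchesFanOut(devPolicy, settingsRequest)` is `true` while `matchesTightened(devPolicy, settingsRequest)` is `false`. A suite that only asserts allows passes against both versions; a single deny assertion on an unlisted resource is what separates them.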

2. Duplicated provisioning code

When a user signs up or gets invited, AuthzX needs to provision their tenant, create their subject entity, and assign their dashboard role. This provisioning code existed in two places: the identity service (for signups) and the user service (for invites).

The two copies drifted. The identity service used raw SQL. The user service used ORM methods. We almost shipped a fix to only one copy when an application rename surfaced a column mismatch in the raw SQL version.

Lesson: any code that exists in two places will drift, and authorization provisioning is the worst place for silent drift. We've since consolidated into a shared package, but we should have done it from day one.
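
As a rough illustration of the consolidated shape, here is a sketch in which signup and invite both call one provisioning function. It uses a hypothetical in-memory store instead of the real database, and the names (`provisionUser`, `onSignup`, `onInvite`) are made up for the example. It also assumes that a signup creates a new tenant and makes the user its owner.

```typescript
// Hypothetical in-memory model; the real services write to a database.
type DashboardRole = "owner" | "admin" | "developer";

interface ProvisioningStore {
  tenants: string[];
  subjects: string[];                   // "tenantId:userId"
  roles: Record<string, DashboardRole>; // "tenantId:userId" -> role
}

// The single provisioning path: tenant, subject entity, dashboard role.
function provisionUser(
  store: ProvisioningStore,
  tenantId: string,
  userId: string,
  role: DashboardRole
): void {
  const key = tenantId + ":" + userId;
  if (store.tenants.indexOf(tenantId) === -1) store.tenants.push(tenantId);
  if (store.subjects.indexOf(key) === -1) store.subjects.push(key);
  store.roles[key] = role;
}

// Signup and invite differ only in their inputs, not in provisioning logic.
// Assumption: a signup creates a new tenant with the user as its owner.
function onSignup(store: ProvisioningStore, userId: string): void {
  provisionUser(store, "tenant-" + userId, userId, "owner");
}

function onInvite(
  store: ProvisioningStore,
  tenantId: string,
  userId: string,
  role: DashboardRole
): void {
  provisionUser(store, tenantId, userId, role);
}
```

With one function behind both entry points, a fix to the provisioning logic cannot land in only one of them.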

3. UUID spaghetti

Our first pass used UUIDs for resource matching. The Terraform config was a wall of opaque identifiers. The dashboard code had 13 hardcoded resource UUIDs. Changing anything meant hunting through UUIDs and hoping you got the right one.

We ripped it out and switched to name-based matching. Resources are identified by human-readable names (dashboard_applications, dashboard_settings) instead of UUIDs. The Terraform config became readable. The dashboard code became maintainable. But the migration cost us a full day that we wouldn't have needed if we'd started with names.
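
One way name-based matching can help on the dashboard side is to keep the names in a single typed list, so a typo fails at compile time or at a runtime guard. This is a sketch, not the dashboard's actual code. Only `dashboard_applications` and `dashboard_settings` come from the text above; the other names are hypothetical.

```typescript
// Only the first two names appear in the post; the rest are hypothetical.
const DASHBOARD_RESOURCES = [
  "dashboard_applications",
  "dashboard_settings",
  "dashboard_policies", // hypothetical
  "dashboard_subjects", // hypothetical
] as const;

// Union of the literal names above.
type DashboardResource = (typeof DASHBOARD_RESOURCES)[number];

// Runtime guard for names that arrive as plain strings (config, URLs, tests).
function isDashboardResource(value: string): value is DashboardResource {
  return (DASHBOARD_RESOURCES as readonly string[]).indexOf(value) !== -1;
}

// Throws on anything that is not a known resource name.
function requireDashboardResource(value: string): DashboardResource {
  if (!isDashboardResource(value)) {
    throw new Error("unknown dashboard resource: " + value);
  }
  return value;
}
```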

4. Resource count churn

We started with 14 dashboard feature resources, realized some were redundant, merged down to 10, then realized we'd missed two and went back to 12. Each change meant updating the Terraform config, the Rego test fixtures, and the dashboard permission gates.

We should have surveyed every PermissionGate usage in the dashboard before defining resources in Terraform. Instead, we defined resources first and then discovered mismatches reactively.
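
A survey like that can be partly mechanical. The sketch below scans source text for PermissionGate usages and diffs the result against the defined resource list. It assumes the gate is written as JSX with a string `resource` prop, which is a guess about the component's interface. The function names and the `dashboard_audit` resource are hypothetical, and a regex will miss dynamic props, so treat the output as a starting point.

```typescript
// Collect distinct resource names from <PermissionGate resource="..."> usages.
// Assumes a string-literal `resource` prop; dynamic values are not detected.
function findGatedResources(sources: string[]): string[] {
  const pattern = /<PermissionGate[^>]*\bresource="([^"]+)"/g;
  const found: string[] = [];
  sources.forEach((src) => {
    pattern.lastIndex = 0; // restart the global regex for each file
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(src)) !== null) {
      const resourceName = match[1];
      if (resourceName !== undefined && found.indexOf(resourceName) === -1) {
        found.push(resourceName);
      }
    }
  });
  return found.sort();
}

// Compare what the UI gates on with what the policy config defines.
function diffResources(used: string[], defined: string[]) {
  return {
    missing: used.filter((r) => defined.indexOf(r) === -1),  // gated, not defined
    unused: defined.filter((r) => used.indexOf(r) === -1),   // defined, never gated
  };
}

// Example input: two component snippets and the currently defined resources.
const exampleSources = [
  '<PermissionGate resource="dashboard_settings" action="edit"><SaveButton /></PermissionGate>',
  '<PermissionGate action="view" resource="dashboard_audit"><AuditPage /></PermissionGate>',
];
const exampleDefined = ["dashboard_settings", "dashboard_applications"];
const exampleDiff = diffResources(findGatedResources(exampleSources), exampleDefined);
```

For this input, `exampleDiff.missing` contains `dashboard_audit` and `exampleDiff.unused` contains `dashboard_applications`: one resource to add and one to question before touching the policy config.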

5. Transaction boundary asymmetry

Signup wraps everything in a single database transaction: user creation, tenant provisioning, entity creation, role assignment. If anything fails, the whole thing rolls back cleanly.

Invite does not. It commits the tenant membership first, then attempts entity creation and role assignment separately. If the second step fails, you end up with a user who belongs to the tenant but can't access the dashboard — a partial state that requires manual backfill.

This hasn't caused a customer-facing issue yet, but it will.
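
The symmetric version of invite would run all three steps inside one transaction, the way signup does. The sketch below models that with a hypothetical in-memory state and a copy-then-commit helper; in the real service this would be a single database transaction. The `failAt` parameter exists only to simulate a failure, and the `developer` role assignment is an arbitrary choice for the example.

```typescript
// Hypothetical in-memory state standing in for the database tables.
interface InviteState {
  memberships: string[];
  entities: string[];
  roles: string[];
}

// Run `work` against a copy and commit only if it finishes without throwing.
function inTransaction(state: InviteState, work: (draft: InviteState) => void): void {
  const draft: InviteState = {
    memberships: state.memberships.slice(),
    entities: state.entities.slice(),
    roles: state.roles.slice(),
  };
  work(draft); // if this throws, `state` is left untouched (rollback)
  state.memberships = draft.memberships;
  state.entities = draft.entities;
  state.roles = draft.roles;
}

// All three invite steps share one transaction boundary.
// `failAt` only simulates a failure at the given step.
function acceptInvite(state: InviteState, userId: string, failAt?: "entity" | "role"): void {
  inTransaction(state, (draft) => {
    draft.memberships.push(userId);
    if (failAt === "entity") throw new Error("entity creation failed");
    draft.entities.push(userId);
    if (failAt === "role") throw new Error("role assignment failed");
    draft.roles.push(userId + ":developer");
  });
}
```

If role assignment fails here, the membership is discarded with it, so there is no member-without-access state left to backfill.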

What we'd do differently

Survey UI permission usage first, define resources second. Map every gate in the frontend before touching Terraform. It would have saved the 14-to-10-to-12 resource churn.
Go name-based from the start. UUIDs are correct for system identifiers but wrong for policy authoring. Human-readable names should be the default.
Extract shared provisioning on day one. Two copies of provisioning code is one copy too many, especially when the logic is authorization-critical.
Write one integration test per critical flow. Signup, invite, and permission check each need at least one end-to-end test. Manual testing catches bugs; automated testing prevents regressions.
Decide meta-instance vs per-tenant early. We flip-flopped three times on whether dashboard RBAC should be per-tenant (each customer defines their own dashboard roles) or meta-instance (one fixed set of roles for everyone). Each flip meant rethinking the provisioning, the policy structure, and the migration path. We settled on meta-instance for v1 — the right call for launch — but we wasted time rediscovering the trade-off.

The bottom line

Dogfooding AuthzX was the best decision we made during the v1 build. Every bug we found — the Rego fan-out, the provisioning drift, the UUID mess — would have been a customer-facing incident if we hadn't hit it first.

The product is architecturally sound. We're not doing anything exotic: it's OPA/Rego, Postgres, a local agent, and a dashboard. The same patterns that Cerbos, Permit, and Styra use. We differentiate on developer experience — Terraform-first policy management, AI agent authorization as a first-class concept, and the observability that makes authorization debuggable instead of terrifying.

We're not going to pretend we built a better engine. We built a better experience around a proven engine. And we proved it works by running it on ourselves.
