13. Authentication and User Management

WorkOS-backed multi-tenant authentication is in production: the five principal kinds (§13.1) are implemented and live. For operational details (org IDs, DNS records, Google OAuth client), see the WorkOS Production Configuration runbook.

1. Mental model

sv0 is a B2B multi-tenant security platform. Authentication is structured around four concepts:

Users — real humans who log in. Each user has one identity and can belong to multiple organizations.
Tenants — customer organizations plus a single SecurityV0-internal organization. Every piece of customer data in Mongo is tenant-scoped.
Memberships — the (user, tenant, role) relationship. A user is a member of zero or more tenants; each membership carries a role.
Super-admins — SecurityV0 staff who can see and act on every tenant. Derived from membership in the special securityv0-internal tenant, not a separate login path.

The system is designed around one rule: identity lives in the auth provider; authorization lives in sv0. The auth provider knows who you are and what organizations you belong to. sv0 knows what each role is allowed to do with findings, entities, and evidence packs. The boundary between these two is deliberate and load-bearing.

Provider abstraction. The auth layer is accessed through an AuthProvider interface (§2.1), not through direct WorkOS SDK calls. WorkOS is the default implementation for our SaaS deployment, but the interface is designed so alternative implementations can be swapped in at deployment time — for example, a direct OIDC implementation for partner-deployed single-tenant installations where WorkOS is not available. See §2.1 for the interface definition and deployment model mapping.

2. Layered auth architecture

Per-origin perimeter. app.securityv0.com is open at the network layer — the WorkOS hosted login is the only gate. dev.securityv0.com and PR previews (pr-N-dev.securityv0.com) sit behind Cloudflare Access. The Layer 1 description below shows the general composition; for the per-origin state see Cloudflare Zero Trust Access. For the full end-to-end flow see Authentication, end-to-end.

There are two authentication layers in front of sv0-platform. They compose; neither replaces the other.

┌─────────────────────────────────────────────────────┐
│  Layer 1: Cloudflare Access (Zero Trust perimeter)  │
│  - Gates dev.securityv0.com and PR previews         │
│  - app.securityv0.com is NOT behind CF Access       │
│  - Verifies identity at the network edge            │
│  - Issues Cf-Access-Jwt-Assertion header to origin  │
│  - Bypassed in CI/CD via service tokens             │
│  - Not aware of tenants or sv0 business logic       │
└─────────────────────────────────────────────────────┘
                        │
                        ▼
┌─────────────────────────────────────────────────────┐
│  Layer 2: sv0 app-layer auth (WorkOS AuthKit)       │
│  - Runs inside Cloudflare perimeter                 │
│  - Handles login, sessions, orgs, SAML, magic link  │
│  - Drives the tenant model + super-admin logic      │
│  - Source of truth for who can see which tenant     │
└─────────────────────────────────────────────────────┘
                        │
                        ▼
         sv0-platform Express middleware pipeline

Layer 1 (Cloudflare Access) is a network gate. For the origins it fronts (dev.securityv0.com and PR previews), it prevents anyone on the open internet from reaching the origin without first passing CF's identity check. This is defense-in-depth: if a bug in Layer 2 allowed unauthenticated access, Layer 1 would still limit the attack surface to "people CF Access lets through" (currently SecurityV0 Google Workspace members and Cloudflare service tokens for CI/CD).

Layer 2 (auth provider) is the app-layer identity system. This is where tenant logic, user records, memberships, and the super-admin model live. Everything in this document is about Layer 2 unless explicitly noted. In our SaaS deployment, Layer 2 is WorkOS AuthKit. In partner-deployed installations, it may be a direct OIDC provider or the client's own identity system.

Local development does not go through Layer 1 at all (localhost doesn't sit behind Cloudflare Access) and uses a dev-bootstrap shortcut for Layer 2 (see §10). dev.securityv0.com and PR previews have both layers active; app.securityv0.com has Layer 2 only (the WorkOS hosted login is the only gate). See §14 for the per-origin CF Access composition.
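
The per-origin composition described above can be summarized as a small lookup. This is a minimal sketch: the function and type names are hypothetical, and only the hostnames and layer assignments come from this section.

```typescript
// Hypothetical helper: which auth layers front a given hostname.
// Host patterns come from this section; all names are illustrative.
type Perimeter = {
  cloudflareAccess: boolean; // Layer 1
  appLayerAuth: "workos" | "dev-bootstrap"; // Layer 2
};

function layersForHost(host: string): Perimeter {
  const h = host.toLowerCase();
  // Local development: no Cloudflare Access, dev-bootstrap shortcut for Layer 2.
  if (h === "localhost" || h.startsWith("localhost:")) {
    return { cloudflareAccess: false, appLayerAuth: "dev-bootstrap" };
  }
  // Production app: Layer 2 only.
  if (h === "app.securityv0.com") {
    return { cloudflareAccess: false, appLayerAuth: "workos" };
  }
  // Dev and PR previews: both layers.
  if (h === "dev.securityv0.com" || /^pr-\d+-dev\.securityv0\.com$/.test(h)) {
    return { cloudflareAccess: true, appLayerAuth: "workos" };
  }
  // Unknown hosts are rejected rather than guessed at.
  throw new Error(`unknown origin: ${host}`);
}
```

Throwing on an unknown host is a choice made for this sketch, so that a new origin has to be classified explicitly.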

2.1 Auth provider abstraction (`AuthProvider` interface)

The auth provider is accessed through an interface, not through direct SDK calls. This is a deployment-time choice, not a runtime toggle — each installation has exactly one provider.

// src/api/auth/auth-provider.ts

interface AuthProvider {
  /** Initialize the provider (validate config, warm caches). */
  init(): Promise<void>;

  /** Build the redirect URL for interactive login. */
  getLoginUrl(params: { returnTo: string; orgHint?: string }): Promise<string>;

  /** Exchange an auth callback code for a resolved identity. */
  handleCallback(code: string): Promise<AuthCallbackResult>;

  /** Revoke / end a session on the provider side. */
  logout(providerSessionId: string): Promise<void>;

  /** Verify and decode a session token (cookie payload or JWT).
   *  Returns null if the token is invalid or expired. */
  verifySession(token: string): Promise<VerifiedSession | null>;

  /** Verify a bearer API key. Returns null if invalid.
   *  Not all providers support this — returns null if unsupported. */
  verifyApiKey?(bearerToken: string): Promise<VerifiedApiKey | null>;

  /** Verify an M2M JWT. Returns null if invalid.
   *  Not all providers support this — returns null if unsupported. */
  verifyM2MToken?(jwt: string): Promise<VerifiedM2MToken | null>;

  /** Generate an Admin Portal link for a tenant's IT admin to self-configure SSO.
   *  Not all providers support this — may throw UnsupportedError. */
  generateAdminPortalLink?(providerOrgId: string, intent: "sso" | "dsync"): Promise<string>;

  /** List active SSO connections for a tenant (for SSO enforcement).
   *  Returns empty array if SSO is not managed by this provider. */
  listActiveConnections(providerOrgId: string): Promise<string[]>;

  /** Webhook signature verification + event parsing.
   *  Not all providers emit webhooks — may be a no-op. */
  verifyWebhook?(rawBody: Buffer, signature: string): Promise<AuthWebhookEvent | null>;
}

interface AuthCallbackResult {
  providerUserId: string;
  email: string;
  displayName: string;
  providerOrgId: string | null;       // The org the user authenticated into
  authMethod: "sso" | "magic_link" | "oauth_google" | "oauth_microsoft"
            | "oauth_github" | "password" | "passkey";
  authConnectionId: string | null;    // SSO connection ID, if applicable
  providerSessionId: string;          // For logout/revocation
}

The interface boundary is clean: everything above the interface is provider-specific (WorkOS SDK calls, OIDC discovery, webhook parsing). Everything below it is sv0 business logic (tenant resolution, membership checks, permission maps, audit logging). The middleware pipeline (§5.2) calls AuthProvider methods and never imports provider-specific packages.
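
Because several capability methods are optional, callers need one uniform way to treat "not implemented" and "invalid" alike. The sketch below shows one way to do that with optional chaining. It uses a trimmed, illustrative subset of the interface; the `VerifiedApiKey` shape and the helper name are assumptions, not the real definitions.

```typescript
// Placeholder shape; the real VerifiedApiKey type is not shown in this document.
interface VerifiedApiKey {
  providerOrgId: string;
}

// Trimmed subset of AuthProvider, enough to show the optional-method pattern.
interface ApiKeyCapable {
  verifyApiKey?(bearerToken: string): Promise<VerifiedApiKey | null>;
}

// Hypothetical helper: returns null both when the provider lacks the
// capability and when the key is invalid.
async function verifyApiKeyOrNull(
  provider: ApiKeyCapable,
  bearerToken: string,
): Promise<VerifiedApiKey | null> {
  // Optional call: evaluates to undefined if the method is not implemented.
  const result = await provider.verifyApiKey?.(bearerToken);
  return result ?? null;
}
```

Collapsing both cases to null keeps the calling middleware free of provider-specific branching, at the cost of not distinguishing "unsupported" from "rejected" in logs.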

Deployment model mapping:

Deployment model	`AUTH_PROVIDER` env	AuthProvider implementation	Notes
SaaS multi-tenant	`workos`	`WorkOSAuthProvider`	Default. Uses WorkOS AuthKit, Organizations, M2M, API Keys.
SaaS single-tenant managed	`workos`	`WorkOSAuthProvider`	Same code, dedicated infra per client.
Partner-deployed (Deloitte etc.)	`oidc`	`OIDCAuthProvider`	Direct OIDC against client's IdP (Okta, Entra ID, PingFederate). No WorkOS dependency.
Air-gapped / on-prem	`oidc`	`OIDCAuthProvider`	Same as above, pointed at an on-prem IdP or Keycloak.
Local dev	`dev`	`DevAuthProvider`	Auto-mints sessions for the seeded dev user. No external dependency.
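
Since the provider is a deployment-time choice, it can be resolved once at boot. The following is a minimal sketch: the function name is hypothetical, the implementation names are taken from the table, and treating an unset value as an error (rather than defaulting to `workos`) is an assumption of this sketch.

```typescript
type ProviderKind = "workos" | "oidc" | "dev";

// Implementation names as listed in the deployment model table.
const IMPLEMENTATIONS: Record<ProviderKind, string> = {
  workos: "WorkOSAuthProvider",
  oidc: "OIDCAuthProvider",
  dev: "DevAuthProvider",
};

// Hypothetical boot-time resolver. The environment is passed in to keep the
// sketch free of Node-specific globals.
function resolveProviderKind(env: Record<string, string | undefined>): ProviderKind {
  const raw = env["AUTH_PROVIDER"];
  if (raw === "workos" || raw === "oidc" || raw === "dev") return raw;
  // Assumption: an unset or unknown value is a boot error, not a silent default.
  throw new Error(`AUTH_PROVIDER must be one of workos | oidc | dev, got: ${String(raw)}`);
}

function implementationFor(env: Record<string, string | undefined>): string {
  return IMPLEMENTATIONS[resolveProviderKind(env)];
}
```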

What the non-WorkOS implementations lose:

No Admin Portal (SSO config is manual or client-managed)
No WorkOS-hosted login UI (we render a minimal login form ourselves)
No API Keys widget (API key CRUD would need a custom UI or be unsupported)
No M2M Applications (service-to-service auth uses static tokens or mTLS)
No webhook-driven mirror sync (reconciliation is polling-based or manual)

These are acceptable trade-offs for partner-deployed installations where the client manages their own identity infrastructure. The OIDCAuthProvider is intentionally minimal — it handles login, logout, session verification, and SSO enforcement. Everything else (Admin Portal, SCIM, API Keys widget) is a WorkOS-specific feature that only exists in the WorkOSAuthProvider.

3. Data model

Four new Mongo collections support the auth system. Their relationship to the auth provider depends on the deployment model:

Deployment	Mongo collections are…	Populated via…
SaaS (WorkOS)	A mirror of WorkOS state. WorkOS Organizations, Users, and Memberships are the source of truth; Mongo caches a read-optimized subset for fast middleware lookups and cross-tenant queries.	Webhooks from WorkOS (§11), reconciliation job (§11.3)
Partner-deployed (OIDC)	The source of truth. There is no external org directory.	Admin API endpoints, first-login auto-provisioning, manual seeding
Local dev	The source of truth, seeded by `DevAuthProvider.init()`.	Dev bootstrap seed routine (§10)

This distinction is invisible to the middleware pipeline. The middleware reads tenants, users, memberships from Mongo regardless of how they got there. Whether a membership row was upserted by a WorkOS webhook or created by an admin API call, the permission resolution (§7) works identically. This is the key property that makes the provider abstraction work: the middleware depends on the local data model, not on the provider.

Design implication for the SaaS deployment: even though WorkOS is the source of truth, we deliberately keep the local mirror fully functional as a standalone data model. If we ever need to migrate away from WorkOS, the mirror becomes the source of truth with no schema change — we just stop the webhook sync, add the admin API endpoints from the OIDC path, and switch AUTH_PROVIDER.

`tenants`

One row per tenant. In the WorkOS deployment, corresponds 1:1 to a WorkOS Organization.

interface TenantDoc {
  _id: ObjectId;
  slug: string;                   // URL-safe identifier — used in /t/:slug/... routes
  display_name: string;           // Human-readable, shown in switcher
  provider_org_id: string;        // WorkOS Organization ID (or IdP tenant ID in OIDC mode)
  status:
        | "active"                // Paying customer
        | "churned"               // Contract ended, tenant archived but readable
        | "internal";             // SecurityV0-internal tenant (exactly one row)
  sso_enforced: boolean;          // If true, non-SSO logins (e.g. magic link) are blocked; see §4.4
  created_at: Date;
  archived_at: Date | null;
}

Indexes:

{ slug: 1 } unique — URL lookup
{ provider_org_id: 1 } unique — webhook reconciliation
{ status: 1 } — super-admin listings

`users`

One row per authenticated user. In WorkOS mode, mirrors a subset of the WorkOS User object. In OIDC mode, populated on first login from the OIDC token claims.

interface UserDoc {
  _id: ObjectId;
  provider_user_id: string;       // WorkOS User ID, or OIDC `sub` claim
  email: string;                  // Normalized lowercase
  display_name: string;
  is_super_admin: boolean;        // Derived: member of securityv0-internal tenant
  created_at: Date;
  updated_at: Date;
  last_seen_at: Date | null;      // Updated on session creation, not per-request
}

Indexes:

{ provider_user_id: 1 } unique
{ email: 1 } unique
{ is_super_admin: 1 } — fast super-admin listing

`memberships`

One row per (user, tenant) relationship. In WorkOS mode, mirrors WorkOS Organization Memberships. In OIDC mode, managed directly (admin API or first-login auto-join).

interface MembershipDoc {
  _id: ObjectId;
  user_id: ObjectId;              // Foreign key → users._id
  tenant_id: ObjectId;            // Foreign key → tenants._id
  role: "owner" | "admin" | "member";  // Mirrored from WorkOS org role
  created_at: Date;
  updated_at: Date;
}

Indexes:

{ user_id: 1, tenant_id: 1 } unique — membership uniqueness
{ tenant_id: 1, role: 1 } — "who are the admins of this tenant"
{ user_id: 1 } — "what tenants does this user belong to"

Super-admins do not have explicit membership rows for every tenant. They have one membership row for the securityv0-internal tenant. The auth middleware treats them as implicit members of every other tenant. This keeps the collection size bounded and avoids having to backfill rows when a new tenant is created.
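
The implicit-membership rule can be illustrated with a short sketch. The row shapes are simplified (string ids stand in for ObjectId) and the function name is hypothetical.

```typescript
// Simplified row shapes; string ids stand in for ObjectId.
interface UserRow {
  id: string;
  is_super_admin: boolean;
}
interface MembershipRow {
  user_id: string;
  tenant_id: string;
  role: "owner" | "admin" | "member";
}

// Hypothetical helper: the tenants a user can reach.
function accessibleTenantIds(
  user: UserRow,
  memberships: MembershipRow[],
  allTenantIds: string[],
): string[] {
  // Super-admins are implicit members of every tenant, so no per-tenant rows are consulted.
  if (user.is_super_admin) return [...allTenantIds];
  // Everyone else is limited to tenants with an explicit membership row.
  return memberships.filter((m) => m.user_id === user.id).map((m) => m.tenant_id);
}
```

A staff user with a single row for the internal tenant therefore resolves to every tenant, including ones created after that row was written.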

`tenant_configs`

One row per tenant. Holds sv0-specific per-tenant configuration that is not mirrored from WorkOS.

interface TenantConfigDoc {
  _id: ObjectId;
  tenant_id: ObjectId;            // Foreign key → tenants._id
  jira_base_url?: string;         // https://acme.atlassian.net
  jira_project_key?: string;      // "SEC"
  branding?: {
    logo_url?: string;
    primary_color?: string;
    display_name_override?: string;
  };
  feature_flags?: Record<string, boolean>;
  connector_credential_refs?: {   // References to secrets in a vault, NOT the secrets themselves
    entra?: string;
    servicenow?: string;
    aws?: string;
  };
  updated_at: Date;
  updated_by_user_id: ObjectId;
}

Indexes:

{ tenant_id: 1 } unique

Important: credentials themselves are never stored here. connector_credential_refs holds vault keys (e.g., AWS Secrets Manager ARNs or references to environment-injected secrets). The actual secrets live elsewhere. See Connectors for the credential storage pattern.

3.5 Tenant ↔ provider org binding: constraint, escape hatch, and refactor triggers

tenants.provider_org_id is a single string with a unique index. The bearer-token middleware resolves an M2M request by calling findTenantByProviderOrgId(jwt.org_id): either it returns one tenant, or the request is rejected with 401. This is deliberate, for the IDOR-class protection it gives (see #346/#347). The constraint has three known refactor triggers, listed below.

What the constraint forces today

One tenant per WorkOS org, full stop. At most one row in tenants can claim a given provider_org_id. Mongo enforces this via the provider_org_id_unique index in src/storage/mongo/schema.ts.
Staff super-admin sees one tenant by default. The staff WorkOS org (securityv0-internal in prod, equivalent in staging — see WorkOS production configuration) binds to the default tenant as its auth-landing. Super-admin access to other tenants (demo-w1, nimbus-cloud, customer tenants) uses the tenant_slug override on the cookie-mint endpoint — see below.
One provider per environment. There is no per-tenant provider field. The provider is implicit in which AuthProvider adapter is wired up at boot.

Escape hatch: `tenant_slug` override on cookie mint

POST /api/v1/automation/browser-sessions accepts an optional tenant_slug body parameter. Super-admins may specify any tenant slug there; non-super-admins must have an explicit memberships row for the slug. The returned sv0_session cookie carries the chosen tenant, and tenant middleware honors the cookie's binding rather than the bearer's org-id-resolved default. This is how visual-screenshot.ts and other headless scripts reach demo-w1 / nimbus-cloud after authenticating against default.

Implications:

A staff bearer that never calls /automation/browser-sessions (e.g., direct Authorization: Bearer against an API route) sees only the auth-landing tenant. There is no header-level "act-as-tenant-X" mechanism for raw bearer requests — that was the IDOR fix.
Automation cookie-mint sessions have a 30-minute TTL (see AUTOMATION_SESSION_TTL_MS in src/api/auth/session.ts); super-admin browser cookies last 8h. Long-running scripts re-mint for each tenant they need to touch.
The cookie-mint route equalizes work across all authorization branches (super-admin / member / non-member / non-existent-tenant) to prevent slug-enumeration timing oracles — see SecurityV0/sv0-platform#771. Do not "optimize" the route by short-circuiting branches.

Not a customer access pattern. The tenant_slug override exists for staff super-admins who legitimately span tenants. Customer access to a tenant goes through SSO + an explicit memberships row, exactly as documented in §4 and §6. Do not introduce customer-facing flows that use the override.
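
The authorization rule for the tenant_slug override can be sketched as follows. This is illustrative only: the `Directory` interface and function name are hypothetical, and the sketch equalizes only the number of lookups per branch. It makes no claim about the timing properties of the real route.

```typescript
interface Caller {
  userId: string;
  isSuperAdmin: boolean;
}

// Hypothetical lookup interface standing in for the storage adapter.
interface Directory {
  tenantIdBySlug(slug: string): string | null;
  hasMembership(userId: string, tenantId: string | null): boolean;
}

// Returns the tenant id to bind into the cookie, or null to reject.
function authorizeTenantSlug(caller: Caller, slug: string, dir: Directory): string | null {
  // Both lookups always run, so super-admin, member, non-member, and
  // non-existent-tenant requests all perform the same lookups.
  const tenantId = dir.tenantIdBySlug(slug);
  const isMember = dir.hasMembership(caller.userId, tenantId);
  const allowed = tenantId !== null && (caller.isSuperAdmin || isMember);
  return allowed ? tenantId : null;
}
```

Note that the decision is computed only after both lookups complete; short-circuiting on `isSuperAdmin` would reintroduce a branch that does less work.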

When this constraint actually breaks (refactor triggers)

Scenario	Likelihood	Why the unique constraint fails
Blue/green migration across auth providers. Running WorkOS and (e.g.) Auth0 simultaneously during a cutover, with the same logical tenant addressable from both.	Roadmap item, not committed.	The same `tenants` row needs two `provider_org_id` values — one per provider — for the duration of the dual-provider window. The schema can't represent this; you'd be forced into either splitting the tenant (data duplication) or a no-rollback flip-day cutover.
First paying customer brings their own WorkOS org or SAML IdP.	Triggered by sales.	Their `provider_org_id` claims the customer-tenant row; staff super-admin loses the org-id → tenant resolution for that customer. The `tenant_slug` override on cookie mint partially absorbs this (super-admin can mint a cookie for the customer tenant by slug), but only for cookie sessions — raw bearer access to customer-tenant routes is lost.
Second super-admin scope (e.g., a partner/consultant WorkOS org with multi-tenant read access).	Plausible if Deloitte/Accenture engagement materializes.	Same single-tenant-per-org constraint applies. Partner org would have to be bound to one tenant; reaching the rest requires the override pattern, and there's no good way to scope "this partner org has access to these N tenants but not all".

Target shape for the refactor (when needed, not now)

A tenant_provider_bindings collection, decoupled from tenants:

interface TenantProviderBindingDoc {
  _id: ObjectId;
  tenant_id: ObjectId;            // FK → tenants._id (M:N — one tenant can have many bindings)
  provider: "workos" | "auth0" | "okta" | "oidc";  // open enum, no schema change to add
  provider_org_id: string;         // the foreign-system org/tenant ID
  is_primary: boolean;             // exactly one per (tenant_id, provider) — the "default" for that provider
  scope: "primary" | "delegated_admin" | "read_only";  // partner/MSP scoping — supports the second-super-admin trigger
  valid_from: Date;
  valid_until: Date | null;        // null = active; non-null = policy-retired, kept for audit
  retired_at: Date | null;         // distinct from valid_until: set when the upstream provider deleted the org
                                   // (webhook-driven). valid_until is planned end; retired_at is observed event.
}

Indexes:

{ provider: 1, provider_org_id: 1 } unique with partialFilterExpression: { valid_until: null, retired_at: null } — preserves the IDOR-class invariant (at most one ACTIVE binding per (provider, org_id)). This is non-negotiable — without the partial-filter unique index, two tenants can simultaneously claim the same active org_id and re-open #346/#347.
{ provider: 1, provider_org_id: 1, valid_until: 1, retired_at: 1 } — bearer lookup (findTenantsByBinding({provider, org_id, valid_at: now}))
{ tenant_id: 1, provider: 1, is_primary: 1 } — tenant → primary binding per provider

Migration plan sketch. Touchpoints + their file paths:

1. Add tenant_provider_bindings collection + backfill. Schema in src/storage/mongo/schema.ts (next to the tenants index block at lines ~777-778 today). Backfill: one row per existing tenant with provider: "workos", is_primary: true, scope: "primary", copying provider_org_id verbatim.
2. Re-route the three current call sites of findTenantByProviderOrgId to findTenantsByBinding, feature-flagged:
- src/api/middleware/bearer-token-middleware.ts (the M2M tenant resolution at the m2m.orgId branch, currently around lines 202 and 280)
- src/api/middleware/auth-middleware.ts (the cookie-session tenant resolution, around line 247)
- src/storage/storage-adapter.ts (the interface, around line 442)
3. Update the cookie session payload. SessionData.tenant_provider_org_id (src/api/auth/session.ts:30) is set by createAutomationSession and read by auth-middleware.ts. Either re-key cookies to carry binding_id / tenant_id (forces all live automation sessions to re-mint at cutover) or keep the field and route it through the bindings collection via findActiveBindingByProviderOrgId. Pick at migration time; the choice depends on whether the cutover can tolerate a re-mint window.
4. Drop tenants.provider_org_id + the provider_org_id_unique index (schema.ts:777-778) once the new path is verified.

Memberships and users are NOT affected. memberships.tenant_id references tenants._id (ObjectId, not provider_org_id) — see §3 schema. Webhook receiver (§11.1) is also unaffected: organization.created is marked informational in this repo, and the tenant-provisioning path is explicit, not webhook-driven.

Safety gate before dropping the unique index (step 4). Integration test asserting: (a) a bearer with org_id = X resolves to exactly the tenant whose active primary binding is (workos, X); (b) a bearer whose org_id X matches only non-primary (or expired) bindings shared by two tenants gets 401 (no silent cross-tenant resolution); (c) the same M2M token gets 401 if its org-id matches a binding with retired_at != null. Without these, the IDOR-regression risk lives in auth-middleware.ts and bearer-token-middleware.ts.
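
The three safety-gate assertions can be expressed against an in-memory model of the bindings collection. This is a sketch under stated assumptions: the function names are hypothetical, string ids stand in for ObjectId, and "active" is taken to mean valid_until = null, retired_at = null, and valid_from not in the future, following the field comments in the target schema.

```typescript
// Subset of the target TenantProviderBindingDoc shape; string ids stand in for ObjectId.
interface Binding {
  tenant_id: string;
  provider: string;
  provider_org_id: string;
  is_primary: boolean;
  valid_from: Date;
  valid_until: Date | null;
  retired_at: Date | null;
}

// Assumed reading of "active": not retired, no end date set, already in effect.
function isActive(b: Binding, now: Date): boolean {
  return (
    b.retired_at === null &&
    b.valid_until === null &&
    b.valid_from.getTime() <= now.getTime()
  );
}

// Hypothetical resolver: returns the tenant id, or null, which the caller maps to a 401.
function resolveTenantForBearer(
  bindings: Binding[],
  provider: string,
  orgId: string,
  now: Date,
): string | null {
  const matches = bindings.filter(
    (b) =>
      b.provider === provider &&
      b.provider_org_id === orgId &&
      b.is_primary &&
      isActive(b, now),
  );
  // Anything other than exactly one active primary binding is rejected.
  const [only, ...rest] = matches;
  return only !== undefined && rest.length === 0 ? only.tenant_id : null;
}
```

Rejecting on more than one match is a second line of defense in this sketch; the partial-filter unique index is still what prevents two active claims from existing in the first place.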

Do not do this pre-emptively. Keep documenting the constraint here. The trigger is the first concrete event from the table above, not "we might want to someday".

Operational hygiene right now (no schema change)

WorkOS org IDs are environment-scoped. Staging WorkOS and prod WorkOS have independent ID namespaces — the same logical "SecurityV0 staff" org has different org_id values in each. Look them up:
- Prod: see the table in WorkOS production configuration.
- Staging: intentionally not committed (they rotate on env rebuilds). Read from the WorkOS dashboard's Staging environment, or from the dev GitHub Environment secret: gh secret list --env dev --repo SecurityV0/sv0-platform | grep WORKOS_SUPER_ADMIN_ORG_ID (returns the name + last-updated date; the value is set via the WorkOS dashboard and copied into the secret).
Avoid hardcoding org IDs in seeds or fixtures — read from SEED_TENANT_PROVIDER_ORG_ID, which the deploy workflow forwards from the env-scoped secret.
Don't bind two tenants to the same WorkOS org. The provider_org_id_unique index rejects the second upsert. The PR-preview seed (pr-preview-admin.sh) deliberately binds only default; demo-w1 and nimbus-cloud stay on their seed-<slug> / discovered-<slug> placeholders. Staff super-admins reach those via the tenant_slug cookie-mint override instead. Context: SecurityV0/sv0-platform#959.
Seed-script silent-fallback gap (not a runtime middleware bypass). When WORKOS_SUPER_ADMIN_ORG_ID is missing from a deploy env, the seed step's ${SEED_TENANT_PROVIDER_ORG_ID:-discovered-default} falls back to the placeholder. The bearer middleware then correctly 401s every staff bearer (no IDOR risk — findTenantByProviderOrgId returns null on the placeholder), but the failure is misleading at the human-debug level (looks like an auth bug when it's a missing-secret bug). The deploy step should fail closed instead — a missing org-id secret on a staff-binding path is a deploy-time misconfig, not a graceful-degradation case. Not yet wired; the fix is local to the deploy workflow's seed step, not the middleware.
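
The fail-closed behavior described in the last item is easy to state as a check. The actual fix belongs in the deploy workflow's seed step; the sketch below only illustrates the intended logic in TypeScript. The function name is hypothetical, and rejecting the seed-/discovered- placeholder prefixes on the staff-binding path is an assumption of this sketch.

```typescript
// Hypothetical guard for the staff-binding seed path.
// The environment is passed in to keep the sketch free of Node-specific globals.
function requireSeedOrgId(env: Record<string, string | undefined>): string {
  const value = (env["SEED_TENANT_PROVIDER_ORG_ID"] ?? "").trim();
  if (value === "") {
    // Fail closed: a missing secret is a deploy-time misconfiguration.
    throw new Error("SEED_TENANT_PROVIDER_ORG_ID is not set; refusing to seed a placeholder");
  }
  if (value.startsWith("discovered-") || value.startsWith("seed-")) {
    // Assumption: placeholders are never valid for the staff-bound tenant.
    throw new Error(`SEED_TENANT_PROVIDER_ORG_ID looks like a placeholder: ${value}`);
  }
  return value;
}
```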

4. Identity lifecycle

4.1 User creation

In WorkOS mode (SaaS): Users are never created manually in sv0. They arrive through one of three paths, all driven by WorkOS:

Invitation — a super-admin (or a tenant admin) invites an email to a specific Organization via POST /user_management/invitations on the WorkOS API. WorkOS sends a branded invitation email. When the invitee clicks through, WorkOS creates the User, adds them to the Organization, fires user.created + organization_membership.created webhooks. sv0 upserts the local mirror rows.
Just-in-time SAML provisioning — a user authenticates via their customer's SAML IdP (Okta, Entra ID, etc.). If the user does not yet exist in WorkOS, WorkOS creates them on the fly based on the SAML assertion and adds them to the Organization the SSO connection is attached to. Same webhook sequence follows.
SCIM provisioning — if Directory Sync is enabled for a customer, the customer's IdP pushes user/group changes to WorkOS. Same webhook sequence follows.

In this mode, sv0 has no user-creation API. This is intentional: it removes an entire class of "stale user in sv0 after they left the company" bugs. The customer's source of truth (their IdP) stays authoritative.

In OIDC mode (partner-deployed): Users are created on first login via JIT provisioning. When someone authenticates through the client's IdP for the first time, the auth callback upserts a users row from the OIDC token claims (sub, email, name). In single-tenant deployments, the user is auto-joined to the sole tenant with member role. In multi-tenant partner deployments, an admin assigns membership via the admin API. Deprovisioning is the client's responsibility (disable the user in their IdP; the next login attempt fails).
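
The OIDC first-login path can be sketched with an in-memory store. All names here are hypothetical, and the row shapes are simplified: memberships are keyed by provider user id and tenant slug rather than ObjectId references.

```typescript
interface OidcClaims {
  sub: string;
  email: string;
  name?: string;
}
interface UserRow {
  provider_user_id: string;
  email: string;
  display_name: string;
}
interface MembershipRow {
  provider_user_id: string;
  tenant_slug: string;
  role: "owner" | "admin" | "member";
}
interface Store {
  users: Map<string, UserRow>;
  memberships: MembershipRow[];
}

// Hypothetical callback step: upsert the user, then auto-join only when
// the deployment has exactly one tenant.
function provisionOnLogin(store: Store, claims: OidcClaims, tenantSlugs: string[]): UserRow {
  const user: UserRow = {
    provider_user_id: claims.sub,
    email: claims.email.toLowerCase(), // emails are stored normalized to lowercase
    display_name: claims.name ?? claims.email,
  };
  store.users.set(claims.sub, user); // upsert keyed on the OIDC sub claim

  const [onlyTenant, ...others] = tenantSlugs;
  if (onlyTenant !== undefined && others.length === 0) {
    const alreadyMember = store.memberships.some(
      (m) => m.provider_user_id === claims.sub && m.tenant_slug === onlyTenant,
    );
    if (!alreadyMember) {
      store.memberships.push({ provider_user_id: claims.sub, tenant_slug: onlyTenant, role: "member" });
    }
  }
  // Multi-tenant deployments: membership is left to an admin.
  return user;
}
```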

4.2 Login flow

Browser                          sv0 backend                    WorkOS
  │                                   │                             │
  │ GET /t/acme/clusters              │                             │
  ├──────────────────────────────────▶│                             │
  │                                   │ No session cookie           │
  │ 302 → /login?return_to=/t/acme/…  │                             │
  │◀──────────────────────────────────┤                             │
  │                                   │                             │
  │ GET /login?return_to=…            │                             │
  ├──────────────────────────────────▶│                             │
  │ 302 → WorkOS AuthKit hosted page  │                             │
  │◀──────────────────────────────────┤                             │
  │                                   │                             │
  │ GET https://auth.workos.com/… ────────────────────────────────▶│
  │                                   │                             │
  │ (user authenticates: magic link,  │                             │
  │  social, SAML, passkey, etc.)     │                             │
  │                                   │                             │
  │ 302 → sv0/auth/callback?code=…◀───────────────────────────────┤
  │                                   │                             │
  │ GET /auth/callback?code=…         │                             │
  ├──────────────────────────────────▶│                             │
  │                                   │ exchange(code) ────────────▶│
  │                                   │◀──────────── { user, org } │
  │                                   │ upsert users, memberships   │
  │                                   │ create iron-session cookie  │
  │                                   │                             │
  │ 302 → /t/acme/clusters            │                             │
  │ Set-Cookie: sv0_session=…         │                             │
  │◀──────────────────────────────────┤                             │
  │                                   │                             │
  │ GET /t/acme/clusters              │                             │
  │ Cookie: sv0_session=…             │                             │
  ├──────────────────────────────────▶│                             │
  │ 200 OK                            │                             │
  │◀──────────────────────────────────┤                             │

Key properties:

Sessions are HttpOnly, Secure, SameSite=Lax cookies encrypted with iron-session or equivalent. The cookie payload is kept minimal: the WorkOS user ID, a session expiry, and the login-time auth method and SSO connection ID used for SSO enforcement (§4.4). All other data is looked up from the local mirror.
The callback is the only write-intensive moment. Regular requests do not touch WorkOS at all; they only look up the local mirror.
return_to is validated against an allowlist of same-origin paths to prevent open-redirect.
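
A return_to check along these lines could look like the sketch below. The allowed prefixes and the function name are assumptions for illustration; the real allowlist is not shown in this document.

```typescript
// Assumed allowlist: tenant-scoped and admin paths, plus the root.
const ALLOWED_PREFIXES = ["/t/", "/admin/"];

// Hypothetical validator: returns the path if it is safe, otherwise the fallback.
function safeReturnTo(raw: string | undefined, fallback: string = "/"): string {
  if (raw === undefined) return fallback;
  // Must be a same-origin path: one leading slash, no scheme-relative "//",
  // and no backslashes (some browsers treat "\" like "/").
  if (!raw.startsWith("/") || raw.startsWith("//") || raw.includes("\\")) return fallback;
  // Reject control characters, which can be used to smuggle header or URL breaks.
  if (/[\u0000-\u001f\u007f]/.test(raw)) return fallback;
  const allowed = raw === "/" || ALLOWED_PREFIXES.some((p) => raw.startsWith(p));
  return allowed ? raw : fallback;
}
```

Falling back to the root instead of returning an error keeps the login flow working when a stale or tampered return_to is supplied.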

4.3 Session lifetime

Default session duration: 7 days (rolling refresh on each request).
Hard maximum: 30 days. After 30 days the user must re-authenticate regardless of activity.
Logout: POST /auth/logout clears the cookie and notifies WorkOS to invalidate the session remotely.

Implementation note. The live values in src/api/auth/session.ts are tighter than the design defaults — 24 hours for regular users and 8 hours for super-admins (SESSION_TTL_MS / SUPER_ADMIN_SESSION_TTL_MS). The tighter TTLs limit blast radius pending audit logging and session-revocation tooling; the design defaults are 7 days / 30-day hard max.
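
The interaction between rolling refresh and the hard maximum can be made concrete with a small calculation. This sketch models the design defaults only; the function names are hypothetical, and it does not claim to reflect how src/api/auth/session.ts applies the tighter live TTLs.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper: the expiry a session would get after a request at nowMs.
function nextExpiry(
  createdAtMs: number,
  nowMs: number,
  rollingMs: number = 7 * DAY_MS,
  hardMaxMs: number = 30 * DAY_MS,
): number {
  // Each request extends the session by the rolling window,
  // but never beyond the hard maximum measured from creation.
  return Math.min(nowMs + rollingMs, createdAtMs + hardMaxMs);
}

function isExpired(expiresAtMs: number, nowMs: number): boolean {
  return nowMs >= expiresAtMs;
}
```

For a session created at day 0, a request on day 1 moves the expiry to day 8, while a request on day 25 is capped at day 30.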

4.4 Per-tenant SSO enforcement (tenant- and connection-specific)

SSO enforcement is tenant-scoped and connection-specific, not a global "session came from SSO" check. A session that satisfied one tenant's SSO policy cannot be replayed against a different SSO-enforced tenant.

The enforcement rule — applied per request, for tenant-scoped routes only:

if authContext.tenant.sso_enforced AND NOT authContext.user.is_super_admin:
    allowed_connection_ids := authProvider.listActiveConnections(authContext.tenant.provider_org_id)
    if authContext.session.auth_method != "sso"
       OR authContext.session.auth_connection_id NOT IN allowed_connection_ids:
        redirect to /login?return_to=<url>&reason=sso_required&org=<tenant.slug>

Two properties follow from this:

Sessions carry the WorkOS connection ID they came from. When a user authenticates through SAML or OIDC via AuthKit, the token response includes the connection_id of the SSO connection that was used. We persist that in the session payload (session.auth_connection_id) at login time. Magic-link, social, password, and passkey sessions have auth_connection_id = null.
Super-admins are exempt from the SSO enforcement check. SecurityV0 staff are members of securityv0-internal, not members of customer tenants. Their authorization to view a customer tenant comes from being a super-admin, which is independent of how they authenticated. A SecurityV0 engineer logged in via Google social to securityv0-internal is therefore allowed to browse any sso_enforced=true customer tenant — they are not impersonating a customer user, they are acting as a super-admin with read-wide (and role-gated write) access. This is deliberate: forcing super-admins to re-authenticate via every customer's IdP is both impossible (we don't have accounts in their IdP) and unnecessary (the audit trail already attributes their actions to a SecurityV0 user). The risk model here is that super-admin access is privileged internally and must be tightly scoped at the securityv0-internal org level — see §12.

A concrete consequence for a user in multiple SSO-enforced tenants:

Suppose a consultant is a customer-side member of both Acme (SSO connection conn_acme_okta) and Beta (SSO connection conn_beta_entra). Both tenants have sso_enforced = true. The user's session history is:

Logs in to Acme via conn_acme_okta → session has auth_connection_id = conn_acme_okta, can access /t/acme/....
Navigates to /t/beta/... → enforcement check fails (conn_acme_okta is not in Beta's allowed connections) → redirected to re-authenticate via conn_beta_entra.
After re-authentication, session has auth_connection_id = conn_beta_entra, can access /t/beta/... but not /t/acme/... until they re-auth against Acme.

This is the correct behavior for a strict enterprise tenancy model: one session can only prove it came from one customer's SSO at a time. If this friction turns out to be impractical for consultants in the future, we can add per-tenant session slots (one cookie per tenant slug), but the simple single-session-per-browser model is sufficient for Phase 1 because the population of cross-tenant customer users is effectively zero.
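
The enforcement rule and the consultant example can be checked with a short sketch. The types and function name are hypothetical; `active_connection_ids` stands in for the result of listActiveConnections for the tenant.

```typescript
interface SessionInfo {
  auth_method: string;
  auth_connection_id: string | null;
}
interface TenantInfo {
  slug: string;
  sso_enforced: boolean;
  active_connection_ids: string[]; // stand-in for listActiveConnections(provider_org_id)
}

// Hypothetical per-request check for tenant-scoped routes.
function ssoCheck(session: SessionInfo, tenant: TenantInfo, isSuperAdmin: boolean): "allow" | "reauth" {
  // Tenants without enforcement, and super-admins, skip the check.
  if (!tenant.sso_enforced || isSuperAdmin) return "allow";
  const viaTenantSso =
    session.auth_method === "sso" &&
    session.auth_connection_id !== null &&
    tenant.active_connection_ids.includes(session.auth_connection_id);
  // "reauth" corresponds to the redirect to /login with reason=sso_required.
  return viaTenantSso ? "allow" : "reauth";
}
```

A session minted through conn_acme_okta passes for Acme and is sent to re-authenticate for Beta, which matches the walkthrough above.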

5. Tenant routing and middleware

5.1 URL structure

Every tenant-scoped page in the UI lives under /t/:tenantSlug/.... Examples:

Old route	New route
`/`	`/t/:slug/` (dashboard)
`/clusters`	`/t/:slug/clusters`
`/findings/:id`	`/t/:slug/findings/:id`
`/reports`	`/t/:slug/reports`
`/graph`	`/t/:slug/graph`

Non-tenant-scoped routes (unchanged):

Route	Purpose
`/login`	Entry point, redirects to WorkOS AuthKit
`/auth/callback`	OAuth callback from WorkOS
`/auth/logout`	Clears session
`/` (root, no slug)	Picks a default tenant from the user's memberships and redirects to `/t/:default/`
`/admin/tenants`	Super-admin only: list of all tenants
`/admin/users`	Super-admin only: user list (thin view; deep user management happens in WorkOS)

5.2 Middleware pipeline

The Express middleware pipeline becomes:

request
  │
  ▼
[1] session-middleware        — decrypt cookie, load user from local mirror
  │
  ▼
[2] tenant-middleware         — extract :tenantSlug from URL, load tenant row
  │
  ▼
[3] membership-middleware     — verify user has membership (or is super-admin)
  │
  ▼
[4] permission-middleware     — for write routes, check role → permission map
  │
  ▼
route handler

Each layer has a single responsibility and fails fast on any inconsistency:

session-middleware: calls authProvider.verifySession() (or verifyApiKey() / verifyM2MToken() depending on the auth source). If no valid session, return 401 (for API calls) or 302 → /login (for HTML).
tenant-middleware: if the URL has no :tenantSlug, skip (for non-tenant routes). If the slug does not resolve to a tenant row, return 404 (not 403 — we do not leak tenant existence).
membership-middleware: if the user is not a super-admin and has no membership row for this tenant, return 404. Super-admins bypass this check.
permission-middleware: route-level. Uses a per-route declaration of required permissions and looks them up in the role→permission table (see §7).
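
The status codes chosen by the first three layers can be illustrated with a small decision function. This is a sketch under stated assumptions: the in-memory sets stand in for the tenants and memberships collections, and `gateTenantRequest` and the type names are hypothetical, not taken from the codebase.

```typescript
// Hypothetical in-memory stand-ins for the tenants and memberships collections.
type GateResult =
  | { status: 200; tenantSlug: string }
  | { status: 401 | 404 };

interface GateUser {
  id: string;
  is_super_admin: boolean;
}

const knownTenants = new Set<string>(["acme", "beta"]);
const membershipRows = new Set<string>(["u1:acme"]); // "<userId>:<tenantSlug>"

function gateTenantRequest(user: GateUser | null, tenantSlug: string): GateResult {
  // [1] session-middleware: no valid session (API-call variant).
  if (user === null) return { status: 401 };
  // [2] tenant-middleware: unknown slug.
  if (!knownTenants.has(tenantSlug)) return { status: 404 };
  // [3] membership-middleware: same code as an unknown slug, so a non-member
  //     cannot tell whether the tenant exists.
  const isMember = membershipRows.has(`${user.id}:${tenantSlug}`);
  if (!user.is_super_admin && !isMember) return { status: 404 };
  return { status: 200, tenantSlug };
}

console.log(gateTenantRequest({ id: "u1", is_super_admin: false }, "beta").status); // 404
```

The point of the sketch is that a non-member of an existing tenant and a request for a nonexistent slug produce the same response.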

The middleware writes req.authContext with a complete, authoritative snapshot:

interface AuthContext {
  user: {
    id: ObjectId;                 // Local users._id
    provider_user_id: string;     // WorkOS user ID
    email: string;
    display_name: string;
    is_super_admin: boolean;
    internal_role: "owner" | "admin" | "member" | null;
                                  // Role in securityv0-internal org, if super-admin. null otherwise.
  };
  tenant: {
    id: ObjectId;                 // Local tenants._id
    slug: string;
    display_name: string;
    provider_org_id: string;
    status: "evaluation" | "active" | "churned" | "internal";
    sso_enforced: boolean;
  };
  membership: {
    // Effective role for THIS request in THIS tenant.
    // - Regular member: their membership row's role
    // - Super-admin in customer tenant: resolved from user.internal_role (see §6.3)
    role: "owner" | "admin" | "member";
    permissions: ReadonlySet<Permission>;
    source: "direct" | "super_admin_derived";
                                  // Tells audit logs whether this was a direct membership
                                  // or a super-admin acting in a customer tenant
  };
  session: {
    expires_at: Date;
    method: "magic_link" | "sso" | "social" | "passkey" | "password" | "api_key" | "m2m";
    auth_connection_id: string | null;
                                  // WorkOS connection_id if method === "sso", otherwise null.
                                  // Used for per-tenant SSO enforcement (§4.4).
  };
}

Route handlers read req.authContext exclusively. They must not read headers or cookies directly, so that all credential parsing lives in the middleware and any auth bug or bypass is confined to one place.

6. Super-admin model

SecurityV0 staff are modeled as members of the securityv0-internal tenant. This takes one tenant row with status: "internal" and slug: "securityv0-internal", one WorkOS Organization (whose ID is configured as WORKOS_SUPER_ADMIN_ORG_ID), and one membership per staff member.

The users.is_super_admin flag is derived from active WorkOS organization membership, not manually set:

user.is_super_admin := WorkOS user is an active member of the
                       org identified by WORKOS_SUPER_ADMIN_ORG_ID

The lookup is performed via WorkOSAuthProvider.getOrganizationMemberships (which wraps listOrganizationMemberships) and runs in two places:

Cookie callback (src/api/routes/auth.ts) — every cookie login re-resolves the flag and writes the result to the local users row.
Bearer JIT-upsert (src/api/middleware/bearer-token-middleware.ts) — runs only on the first bearer request for a provider_user_id against a given DB.

Results are cached in-memory for 5 minutes. The flag is then read from the local users row by both cookie- and bearer-authenticated requests, so the two paths always agree on super-admin status for the same user.
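
A minimal sketch of the derivation with its 5-minute cache follows. It assumes the membership lookup is injected as a function that returns the IDs of the organizations in which the user is an active member (standing in for WorkOSAuthProvider.getOrganizationMemberships, whose real signature is not shown here); `makeSuperAdminResolver` and the clock parameter are hypothetical names.

```typescript
// Assumption: returns the org IDs where the user has an ACTIVE membership.
type MembershipLookup = (providerUserId: string) => Promise<string[]>;

const FIVE_MINUTES_MS = 5 * 60 * 1000;

function makeSuperAdminResolver(
  superAdminOrgId: string,          // value of WORKOS_SUPER_ADMIN_ORG_ID
  lookup: MembershipLookup,
  now: () => number = Date.now,     // injectable clock, useful in tests
) {
  const cache = new Map<string, { value: boolean; expiresAt: number }>();

  return async function isSuperAdmin(providerUserId: string): Promise<boolean> {
    const hit = cache.get(providerUserId);
    if (hit !== undefined && hit.expiresAt > now()) return hit.value;

    // Cache miss or expired entry: ask the provider and re-derive the flag.
    const orgIds = await lookup(providerUserId);
    const value = orgIds.includes(superAdminOrgId);
    cache.set(providerUserId, { value, expiresAt: now() + FIVE_MINUTES_MS });
    return value;
  };
}
```

In this sketch the cache only bounds how often the provider is queried; the value that requests actually read is still the one written to the local users row.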

The membership lookup is the canonical signal — AuthKit's providerOrgId claim cannot be substituted because it returns null for personal-account logins (Google personal Gmail, etc.). There is no parallel allowlist and no email-domain fallback; granting or revoking staff super-admin is a single WorkOS dashboard action. Manual edits to users.is_super_admin in Mongo are reverted on the user's next cookie login by design — WorkOS is the source of truth.

6.1 What super-admin grants

Super-admin is not a role in the permission sense — it is a visibility flag that grants cross-tenant access. The actual write permissions a super-admin has in any given tenant are derived from their role inside the securityv0-internal organization, not from a blanket super-admin permission set.

Capability	Non-super-admin	Super-admin
List own tenant memberships	✓	✓
List all tenants	✗	✓
Switch to any tenant in the UI	Own memberships only	All tenants
Read data for any tenant	Own memberships only	All tenants
Write data in a customer tenant	Only if own membership role allows	Only if internal role resolves to a write-capable tenant role (see §6.3)
Generate WorkOS Admin Portal link	✗	Only with internal role `owner` or `admin`
Edit `tenant_configs`	Only with own tenant `admin` role	Only with internal role `owner` or `admin`
Provision new tenants (eval or paid)	✗	Only with internal role `owner` or `admin`
Manage SecurityV0 staff	✗	Only with internal role `owner`

Super-admin is read-wide but write-gated. A SecurityV0 staff member with internal role member can see every customer tenant but cannot modify anything in any of them — they are effectively global read-only. This is the default safe posture for new staff.

6.2 Internal roles

Within the securityv0-internal organization, staff have WorkOS Organization roles:

Internal role	What it grants globally (across all tenants)
`owner`	Full control: manage SecurityV0 staff, provision tenants, edit any tenant's config, generate Admin Portal links, write to any tenant's data.
`admin`	Operate on customer tenants: provision eval tenants, edit tenant configs, generate Admin Portal links, write to tenant data. Cannot manage SecurityV0 staff.
`member`	Read-only across all tenants. Cannot write to tenant data, cannot edit configs, cannot provision, cannot generate Admin Portal links. This is the default for new SecurityV0 engineers who don't yet need write access to customer data.

These roles are managed in the WorkOS dashboard for the securityv0-internal Organization. A staff member's role is mirrored into their memberships row for the internal tenant via webhook and surfaced on the user as user.internal_role.

6.3 Permission resolution for super-admin requests

When a super-admin issues a request against a customer tenant, the effective role is derived from their internal role, not looked up from a nonexistent membership row:

resolveSuperAdminRole(internal_role) -> effective_role_in_customer_tenant
  "owner"  -> "owner"    // full write
  "admin"  -> "admin"    // tenant config + data writes, cannot remove tenant owners
  "member" -> "member"   // read-only

The resolved role then feeds the normal role→permission map (§7.3). The authContext.membership.source field is set to "super_admin_derived" so audit logs can distinguish super-admin-originated actions from direct-membership actions.

Important: the super-admin path cannot escalate above the internal role. A SecurityV0 member cannot write to a customer tenant even if the customer tenant's owner role would allow it, because their derived role is member, not owner. Internal role is the upper bound on cross-tenant authority.
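
The resolution above can be sketched in TypeScript as follows. This is an illustration, not the shipped code: `resolveEffectiveMembership` and the type names are hypothetical, and the choice to let a direct membership row take precedence over the derived role is an assumption (the document states that staff are not members of customer tenants, so the two should not normally coexist).

```typescript
type TenantRoleName = "owner" | "admin" | "member";

interface EffectiveMembership {
  role: TenantRoleName;
  source: "direct" | "super_admin_derived";
}

// Identity mapping today; kept as a table so the two role spaces can diverge later.
const SUPER_ADMIN_ROLE_MAP: Record<TenantRoleName, TenantRoleName> = {
  owner: "owner",   // full write
  admin: "admin",   // tenant config + data writes
  member: "member", // read-only
};

function resolveEffectiveMembership(
  directRole: TenantRoleName | null,   // role from the memberships row, if any
  internalRole: TenantRoleName | null, // role in securityv0-internal, if super-admin
): EffectiveMembership | null {
  // Assumption: a direct membership row wins over the derived role.
  if (directRole !== null) return { role: directRole, source: "direct" };
  if (internalRole !== null) {
    return { role: SUPER_ADMIN_ROLE_MAP[internalRole], source: "super_admin_derived" };
  }
  return null; // neither a member nor a super-admin
}

console.log(resolveEffectiveMembership(null, "member"));
// { role: "member", source: "super_admin_derived" }
```

The derived role never exceeds the internal role because the only input on the super-admin path is the internal role itself.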

For M2M access from the internal org (see §13.7 and ADR-017), the same resolution applies: the M2M Application has a declared internal role (typically member for read-only automation, occasionally admin for deployment automation), and tokens issued by that application get that role when acting in any customer tenant.

7. Roles and permissions

The authorization split from ADR-016:

Roles are identity data → mirrored from WorkOS
Permissions are sv0 business logic → live in code

7.1 Tenant roles (mirrored from WorkOS)

These are the only real roles. super_admin is not a role — it is a visibility flag that triggers a derived-role computation (§6.3).

Role	Description
`owner`	Tenant owner. Can invite/remove other members (including other admins). Used for customer-side tenant owners.
`admin`	Tenant admin. Can edit tenant config, invite/remove non-owner members, mark findings as accepted risk.
`member`	Standard user. Can view and triage findings within the tenant. Read-only on config and membership.

7.2 Permissions (defined in sv0 code)

// src/api/auth/permissions.ts
enum Permission {
  // Tenant-scoped — these are gated by the tenant role (direct or derived)
  TENANT_READ,
  TENANT_WRITE_CONFIG,
  TENANT_GENERATE_ADMIN_PORTAL_LINK,
  TENANT_INVITE_MEMBER,
  TENANT_REMOVE_MEMBER,
  FINDING_READ,
  FINDING_WRITE_STATUS,            // Mark accepted risk, closed, etc.
  FINDING_DELETE,
  EVIDENCE_PACK_READ,
  EVIDENCE_PACK_GENERATE,
  CONNECTOR_READ_STATUS,
  CONNECTOR_TRIGGER_SYNC,

  // Super-admin-only (require user.is_super_admin === true AND an appropriate internal role)
  INTERNAL_LIST_ALL_TENANTS,       // Any internal role
  INTERNAL_PROVISION_TENANT,       // Internal owner or admin
  INTERNAL_MANAGE_STAFF,           // Internal owner only
}

7.3 Role → permission map

The map contains only real roles, not super-admin. Super-admin callers go through the derived-role path in §6.3 and then land on this same map.

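// Bring the enum members into scope so the sets below can use bare names.
const {
  TENANT_READ, TENANT_WRITE_CONFIG, TENANT_GENERATE_ADMIN_PORTAL_LINK,
  TENANT_INVITE_MEMBER, TENANT_REMOVE_MEMBER,
  FINDING_READ, FINDING_WRITE_STATUS, FINDING_DELETE,
  EVIDENCE_PACK_READ, EVIDENCE_PACK_GENERATE,
  CONNECTOR_READ_STATUS, CONNECTOR_TRIGGER_SYNC,
  INTERNAL_LIST_ALL_TENANTS, INTERNAL_PROVISION_TENANT, INTERNAL_MANAGE_STAFF,
} = Permission;

const ROLE_PERMISSIONS: Record<TenantRole, ReadonlySet<Permission>> = {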
  owner: new Set([
    TENANT_READ, TENANT_WRITE_CONFIG, TENANT_GENERATE_ADMIN_PORTAL_LINK,
    TENANT_INVITE_MEMBER, TENANT_REMOVE_MEMBER,
    FINDING_READ, FINDING_WRITE_STATUS, FINDING_DELETE,
    EVIDENCE_PACK_READ, EVIDENCE_PACK_GENERATE,
    CONNECTOR_READ_STATUS, CONNECTOR_TRIGGER_SYNC,
  ]),
  admin: new Set([
    TENANT_READ, TENANT_WRITE_CONFIG, TENANT_GENERATE_ADMIN_PORTAL_LINK,
    TENANT_INVITE_MEMBER,                  // NOT TENANT_REMOVE_MEMBER for owners
    FINDING_READ, FINDING_WRITE_STATUS,    // NOT FINDING_DELETE
    EVIDENCE_PACK_READ, EVIDENCE_PACK_GENERATE,
    CONNECTOR_READ_STATUS, CONNECTOR_TRIGGER_SYNC,
  ]),
  member: new Set([
    TENANT_READ, FINDING_READ, EVIDENCE_PACK_READ, CONNECTOR_READ_STATUS,
  ]),
};

// Super-admin-only permissions are gated independently of tenant role.
// They require user.is_super_admin === true AND the listed internal role.
const INTERNAL_PERMISSIONS: Record<InternalRole, ReadonlySet<Permission>> = {
  owner:  new Set([INTERNAL_LIST_ALL_TENANTS, INTERNAL_PROVISION_TENANT, INTERNAL_MANAGE_STAFF]),
  admin:  new Set([INTERNAL_LIST_ALL_TENANTS, INTERNAL_PROVISION_TENANT]),
  member: new Set([INTERNAL_LIST_ALL_TENANTS]),
};

// The middleware computes the effective permission set as:
//   tenant_role_perms := ROLE_PERMISSIONS[membership.role]         // direct or derived
//   internal_perms    := user.is_super_admin
//                          ? INTERNAL_PERMISSIONS[user.internal_role]
//                          : {}
//   authContext.membership.permissions := tenant_role_perms ∪ internal_perms

Two properties follow:

A SecurityV0 member (internal role) is read-only on every tenant. Their derived tenant role is member, which only carries read permissions. They can browse everything, modify nothing.
INTERNAL_LIST_ALL_TENANTS is gated on super-admin status. A customer owner cannot list other tenants just because owner sounds powerful — the permission set is additive only when is_super_admin === true.

This map lives in one file. Adding a permission means editing a file, updating the map, and writing a test. No dashboard clicks. No runtime config.
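
The union described in the middleware comment can be sketched as a runnable function. The sketch below uses trimmed-down copies of the two maps and string-literal permissions instead of the Permission enum, purely to stay short; `effectivePermissions`, `Perm`, and `RoleName` are hypothetical names.

```typescript
type Perm =
  | "TENANT_READ" | "TENANT_WRITE_CONFIG" | "FINDING_READ"
  | "INTERNAL_LIST_ALL_TENANTS" | "INTERNAL_PROVISION_TENANT";
type RoleName = "owner" | "admin" | "member";

// Trimmed-down copies of ROLE_PERMISSIONS and INTERNAL_PERMISSIONS.
const tenantRolePerms: Record<RoleName, ReadonlySet<Perm>> = {
  owner: new Set<Perm>(["TENANT_READ", "TENANT_WRITE_CONFIG", "FINDING_READ"]),
  admin: new Set<Perm>(["TENANT_READ", "TENANT_WRITE_CONFIG", "FINDING_READ"]),
  member: new Set<Perm>(["TENANT_READ", "FINDING_READ"]),
};
const internalRolePerms: Record<RoleName, ReadonlySet<Perm>> = {
  owner: new Set<Perm>(["INTERNAL_LIST_ALL_TENANTS", "INTERNAL_PROVISION_TENANT"]),
  admin: new Set<Perm>(["INTERNAL_LIST_ALL_TENANTS", "INTERNAL_PROVISION_TENANT"]),
  member: new Set<Perm>(["INTERNAL_LIST_ALL_TENANTS"]),
};

function effectivePermissions(
  role: RoleName,                // direct or derived tenant role
  isSuperAdmin: boolean,
  internalRole: RoleName | null,
): ReadonlySet<Perm> {
  const result = new Set<Perm>(tenantRolePerms[role]);
  // Internal permissions are added only for super-admins.
  if (isSuperAdmin && internalRole !== null) {
    internalRolePerms[internalRole].forEach((p) => result.add(p));
  }
  return result;
}

const customerOwner = effectivePermissions("owner", false, null);
const staffMember = effectivePermissions("member", true, "member");
console.log(customerOwner.has("INTERNAL_LIST_ALL_TENANTS")); // false
console.log(staffMember.has("TENANT_WRITE_CONFIG"));         // false
```

Both properties listed above fall out of the union: the customer owner never receives internal permissions, and the staff member's tenant-side set stays read-only.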

7.4 Route declarations

Every write route declares its required permission:

router.patch(
  "/api/v1/tenants/:slug/config",
  requirePermission(Permission.TENANT_WRITE_CONFIG),
  tenantConfigHandler
);

requirePermission is a thin middleware that reads req.authContext.membership.permissions and returns 403 if the required permission is missing.
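
A minimal sketch of such a middleware is shown below. To stay runnable without Express installed it declares small structural types (`PermReq`, `PermRes`, `NextFn`, all hypothetical) in place of the Express request and response types, and it uses plain strings for permissions. Treating a missing authContext as 403 is an assumption of this sketch; the document only specifies the missing-permission case.

```typescript
// Minimal structural stand-ins for the Express request/response types.
interface PermReq {
  authContext?: { membership: { permissions: ReadonlySet<string> } };
}
interface PermRes {
  statusCode: number;
  end(): void;
}
type NextFn = () => void;

function requirePermission(required: string) {
  return (req: PermReq, res: PermRes, next: NextFn): void => {
    const granted = req.authContext?.membership.permissions;
    // Fail closed: no authContext or no matching permission means 403.
    if (granted === undefined || !granted.has(required)) {
      res.statusCode = 403;
      res.end();
      return;
    }
    next();
  };
}
```

The middleware only reads the precomputed permission set; it performs no role lookup of its own, which keeps the role-to-permission logic in the single map described in §7.3.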

8. Per-tenant configuration

8.1 What lives in `tenant_configs`

Field	Purpose	Who writes
`jira_base_url`, `jira_project_key`	Used by the remediation service to construct "create Jira ticket" links	Tenant admin or super-admin
`branding.logo_url`, `branding.primary_color`	Rendered in the UI header for tenant-scoped pages	Super-admin only (phase 1); tenant owner (later)
`feature_flags`	Per-tenant feature gates (e.g., graph explorer, AI summaries)	Super-admin only
`connector_credential_refs`	Vault references (ARNs, secret names) — never the secrets themselves	Super-admin only

8.2 Editing flow (phase 1)

A super-admin-only page at /admin/tenants/:slug/config shows a form for the above fields. The form posts to PATCH /api/v1/tenants/:slug/config, which is permission-gated at TENANT_WRITE_CONFIG. Tenant admins do not yet have a UI but can call the API directly if they have the permission (deferred self-service).

8.3 Editing flow (later — self-service)

In a future phase, tenant admins get a Settings UI for the fields appropriate to their role. connector_credential_refs and feature_flags stay super-admin-only. Branding and Jira config become tenant-admin-editable.

9. Operating policies

These are the concrete rules for day-to-day auth operations. They are derived from ADR-016 and ADR-017 and should be treated as authoritative.

9.1 Evaluation vs paid

Situation	Auth method	Who configures it	WorkOS cost
Prospect evaluation / POC / demo	Magic link via AuthKit	sv0 super-admin invites them	$0
Small paying customer without IdP requirement	Magic link (optionally MFA)	sv0 super-admin invites them	$0
Enterprise paying customer with IdP requirement	SAML SSO connection to their IdP	Customer IT admin via Admin Portal	$125/connection/month
Enterprise customer requesting auto-provision/deprovision	+ Directory Sync (SCIM)	Customer IT admin via Admin Portal	+$125/connection/month

The rule: no SSO connection is enabled until a paid contract is signed. If a prospect asks for SAML during a POC because their security team mandates it, this becomes a paid POC conversation. See ADR-017 for the full rationale.

9.2 Provisioning a new evaluation tenant

One script: scripts/provision-eval-tenant.ts. It performs:

Create WorkOS Organization (POST /organizations) with the prospect's company name.
Create tenants row with status: "evaluation", provider_org_id, and a slug.
Create tenant_configs row with empty defaults.
Optionally seed demo data.
Create invitation for the specified email via WorkOS (POST /user_management/invitations), which sends the branded invitation email.
Print the tenant slug and the /t/:slug/ URL for Slack sharing.

9.3 Converting evaluation to paid

One super-admin page: /admin/tenants/:slug. The "Convert to paid" action:

Updates tenants.status to active.
Generates a WorkOS Admin Portal link (POST /portal/generate_link) with the sso intent.
Displays the link and a copy-paste email template for the super-admin to send to the customer's IT admin.

When the customer's IT admin completes SAML setup in the Admin Portal, WorkOS fires a connection.activated webhook. The webhook receiver is designed to update tenants.sso_enforced = true for that tenant, after which magic link is disabled and only SAML logins are accepted. The receiver itself is not yet wired (see §15); today, sso_enforced must be flipped manually after confirming SAML setup in the WorkOS dashboard.

9.4 Churning a customer

One super-admin action on /admin/tenants/:slug:

Disable the WorkOS SSO connection (stops the $125/month charge).
Update tenants.status to churned.
Remove all memberships from the tenant (optional — current approach keeps memberships read-only).
Customer data is retained for the contractual retention period, then hard-deleted via a separate process.

9.5 SecurityV0 staff onboarding

A SecurityV0 owner invites the new staff member to the securityv0-internal Organization in the WorkOS dashboard.
The invitee receives the branded invitation email, clicks through, authenticates with @securityv0.com Google Workspace (which AuthKit handles via the social login), and lands on sv0-platform as a super-admin.
Their role (owner, admin, or member) is set in the WorkOS dashboard. The webhook mirrors it to sv0.

9.6 SecurityV0 staff offboarding

An owner removes the staff member from the securityv0-internal Organization in WorkOS.
The webhook fires; sv0 removes their memberships row for the internal tenant, and users.is_super_admin flips to false.
Their active session is invalidated on next request (because the session middleware re-reads the user row).

10. Local development

Local development must not require a round-trip to any external auth provider to start the server. The convention:

AUTH_PROVIDER=dev (replaces the old REQUIRE_AUTH=false) in .env.local. This selects the DevAuthProvider implementation (§2.1).
On server startup, DevAuthProvider.init() runs a seed routine once:
- Ensures a tenants row with slug: "securityv0-internal" and status: "internal" exists.
- Ensures a users row exists with email: "dev@securityv0.com" and is_super_admin: true.
- Ensures a memberships row connects them.
- Optionally creates one demo customer tenant with seeded data.
DevAuthProvider.verifySession() always returns a valid session for the seeded dev user, minting a synthetic cookie if none is present.

Crucially, the middleware pipeline is unchanged. In local dev, session-middleware still calls authProvider.verifySession(), the tenant middleware still extracts the URL slug, the membership middleware still resolves role and permissions. The only difference is DevAuthProvider returns a synthetic session instead of calling WorkOS. Every other code path runs identically to production.

This design catches the class of bugs where "it worked locally with REQUIRE_AUTH=false but broke in prod because half the code paths were skipped under the bypass."

10.1 Testing auth paths locally

To test the real WorkOS path locally, developers can:

Set AUTH_PROVIDER=workos and populate WorkOS environment variables pointing to the staging WorkOS environment.
Use ngrok or equivalent to expose localhost so WorkOS webhooks can reach it.
Log in via staging AuthKit, which uses real provider sessions.

This is opt-in. Most local development runs with AUTH_PROVIDER=dev.

11. Webhooks and mirror sync

11.1 Webhook events we handle

WorkOS event	sv0 action
`user.created`	Upsert `users` row
`user.updated`	Update `users` row (email, display name)
`user.deleted`	Soft-delete `users` row; keep for audit
`organization_membership.created`	Upsert `memberships` row; recompute `users.is_super_admin` if tenant is internal
`organization_membership.updated`	Update role on `memberships` row
`organization_membership.deleted`	Delete `memberships` row; recompute `users.is_super_admin`
`organization.created`	(Informational — sv0 creates tenants explicitly via provisioning script, not via webhook)
`organization.updated`	Update `tenants.display_name`
`organization.deleted`	Mark `tenants.status = "churned"`
`connection.activated`	Set `tenants.sso_enforced = true` for the linked tenant
`connection.deactivated`	Set `tenants.sso_enforced = false`
`dsync.user.created` (SCIM)	Upsert `users` row
`dsync.user.updated` (SCIM)	Update `users` row
`dsync.user.deleted` (SCIM)	Delete `users` row and associated memberships

11.2 Webhook security and idempotency

Signature verification: every webhook is verified against the WorkOS HMAC signature before any side effect.
Idempotency: every event carries a unique ID; the receiver keeps a small webhook_events collection of recently processed IDs and no-ops on duplicates.
Order: WorkOS makes no ordering guarantees. Handlers are written to be order-insensitive: they upsert rather than insert, and they assert the desired end state (row present or row absent) rather than relying on event sequence.
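
The idempotency and upsert rules can be sketched as follows (signature verification is omitted here and would run before this handler). The in-memory Set and Map stand in for the webhook_events and memberships collections, and the event shape, `MirrorEvent`, and `handleMirrorEvent` are simplified, hypothetical names rather than the real WorkOS payload.

```typescript
interface MirrorEvent {
  id: string; // unique event ID carried by every webhook
  type: "organization_membership.created" | "organization_membership.deleted";
  userId: string;
  orgId: string;
  role: string;
}

const processedEventIds = new Set<string>();        // stands in for webhook_events
const membershipMirror = new Map<string, string>(); // "<userId>:<orgId>" -> role

function handleMirrorEvent(evt: MirrorEvent): "applied" | "duplicate" {
  // Redelivered events are acknowledged without side effects.
  if (processedEventIds.has(evt.id)) return "duplicate";

  const key = `${evt.userId}:${evt.orgId}`;
  if (evt.type === "organization_membership.created") {
    membershipMirror.set(key, evt.role); // upsert: safe if the row already exists
  } else {
    membershipMirror.delete(key);        // ensure-absent: safe if the row never existed
  }
  processedEventIds.add(evt.id);
  return "applied";
}
```

Each branch is safe to run against any prior state of the mirror, so a redelivery or an unexpected arrival order cannot make the handler throw. A delete that overtakes its own create would still leave a stale row, which is the kind of drift left to the reconciliation job.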

11.3 Reconciliation

A daily reconciliation job:

Fetches all Organizations from WorkOS, diffs against the local tenants collection, reports mismatches.
For each Organization, fetches its members, diffs against local memberships, reports mismatches.
Produces a report (logged and surfaced on an internal status page).
Auto-repairs obvious drift (a member in WorkOS but missing locally → upsert); reports suspicious drift (a member in WorkOS but with a different role locally → flag for human review).

Reconciliation catches missed webhooks, ordering issues, and any bugs in the event handlers. It is the safety net for the event-driven mirror.
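
The membership diff at the core of the job can be sketched as a pure function. This is a simplified illustration: `MemberRecord`, `DriftReport`, and `diffMemberships` are hypothetical names, and the sketch covers only the two drift cases named above (missing locally, role mismatch), not rows that exist only locally.

```typescript
interface MemberRecord {
  userId: string;
  role: string;
}

interface DriftReport {
  autoRepair: MemberRecord[]; // in WorkOS, missing locally: upsert
  needsReview: { userId: string; providerRole: string; localRole: string }[];
}

function diffMemberships(provider: MemberRecord[], local: MemberRecord[]): DriftReport {
  const localRoles = new Map<string, string>();
  local.forEach((m) => localRoles.set(m.userId, m.role));

  const report: DriftReport = { autoRepair: [], needsReview: [] };
  for (const m of provider) {
    const localRole = localRoles.get(m.userId);
    if (localRole === undefined) {
      report.autoRepair.push(m); // obvious drift
    } else if (localRole !== m.role) {
      // Suspicious drift: flag for a human rather than overwrite.
      report.needsReview.push({ userId: m.userId, providerRole: m.role, localRole });
    }
  }
  return report;
}

const report = diffMemberships(
  [{ userId: "u1", role: "admin" }, { userId: "u2", role: "member" }, { userId: "u3", role: "owner" }],
  [{ userId: "u1", role: "admin" }, { userId: "u3", role: "member" }],
);
console.log(report.autoRepair.length, report.needsReview.length); // 1 1
```

Keeping the diff free of I/O means the same function can back both the report and the auto-repair step.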

12. Security posture and threat considerations

12.1 What this design protects against

Broken tenant isolation: tenant is derived from the URL and validated against the membership mirror on every request. There is no way to spoof a tenant via header manipulation because headers are no longer a source of tenant context.
Session fixation / replay: sessions are iron-session encrypted cookies with rotating tokens. WorkOS handles the OAuth2 PKCE flow on our behalf.
Token leakage via URL: the magic link token is consumed exactly once in a server-side callback; it does not persist in browser history or referrer headers.
Super-admin privilege creep: super-admin status is derived from one specific membership, not a sticky flag. Removing a staff member from securityv0-internal in WorkOS revokes their super-admin status as soon as the resulting webhook is processed.
Credential phishing for SAML: we rely on the customer's IdP for credential custody. We never see passwords. Our exposure is the SAML assertion signature, which WorkOS validates.
Deprovisioned user still active: SCIM events remove memberships in near-real-time. Reconciliation catches any missed events within 24 hours.

12.2 What this design does NOT protect against

WorkOS account compromise: if an attacker gains access to our WorkOS organization itself, they can add super-admin users, change roles, or disable connections. Mitigation: WorkOS dashboard access must be MFA-enforced, limited to a small number of staff, and audited.
Compromised super-admin laptop: a super-admin session cookie on a compromised laptop grants access to all tenants. Mitigation: short session durations for super-admins (consider 24h instead of 7d), required MFA for super-admin login.
Customer IdP compromise: if a customer's Okta is compromised, the attacker can authenticate as any user in that customer's org. This is the customer's problem, not ours, but we should provide an "emergency disable" for SSO connections in case a customer reports a breach.
Tenant data crossover via bugs: any query that forgets to filter by tenant_id could leak data. This is an existing risk documented in the code patterns; the middleware-layer tenant validation reduces (but does not eliminate) the blast radius.

12.3 Audit logging

Every request logs:

user_id, tenant_id, http_method, path, status_code, session_method.

Webhook events log:

event type, WorkOS event ID, resulting mirror mutation.

These logs are structured JSON and can be streamed to a SIEM. A dedicated audit UI is deferred (see ADR-016 "Deferred to later").

13. Programmatic and machine access

Human users log in through AuthKit and get session cookies. Machines — CI/CD, bots, internal services, customer scripts, MCP clients, AI agents — need different authentication mechanisms. WorkOS supports all of them natively through its "Connect" product family, and sv0 uses each for a specific purpose. We do not build our own PAT system; WorkOS Connect is strictly better than anything we would write.

13.1 The five principal kinds

The following table maps each authentication principal to its JWT/cookie shape and tenant scope. This taxonomy is locked in code at src/api/auth/principal-kind.ts and is the authoritative reference for audit log attribution and middleware dispatch.

#	Principal kind	When used	JWT/cookie shape	Tenant scope
1	`human_session`	Browser users via AuthKit (magic link, social, SAML)	iron-session cookie (`sv0_session`)	URL slug or `x-tenant-id` header
2	`delegated_agent`	Staff CLI (`device_code` grant)	JWT: `sub=user_*`	JWT `org_id` claim (immutable)
3	`service`	CI runners, connector workers — M2M	JWT: `sub=client_*`	JWT `org_id` claim (immutable)
4	`test_session`	Local dev / Vitest — synthetic session	Synthetic, not a real JWT	Configurable
5	`customer_agent`	Customer MCP / AI agents (FUTURE — not yet wired)	OAuth authorization code flow → JWT	Per consenting user

The WorkOS API Keys widget is not used for staff authentication because API Keys are org-scoped, not per-user. The 2026-04-30 staging spike confirmed this empirically: the validation response carries owner.type = "organization" with no user_id claim. Staff use device_code (see §13.7.a). The same org-scoping finding puts the customer-PAT roadmap at risk; see §13.3 for the open product question.

13.2 M2M Applications (delegated_agent and service principals)

WorkOS M2M Applications implement the OAuth 2.0 client_credentials grant:

A machine client is registered in the WorkOS dashboard (or programmatically via the WorkOS API).
The client gets a client_id and client_secret (up to 5 credentials per application, rotatable).
At runtime, the client exchanges its credentials at the WorkOS token endpoint for a short-lived JWT access token.
The client presents the JWT to sv0's API as Authorization: Bearer <jwt>.
sv0's middleware verifies the JWT locally using the WorkOS JWKS (no network round-trip per request) and extracts the org_id claim.
The org_id is mapped to a local tenants row via provider_org_id, and an authContext is built.

Properties:

Tokens are short-lived (typically 1 hour). The client secret is long-lived.
Org-scoped: one M2M Application belongs to exactly one WorkOS Organization. Tokens can only ever access data in that organization's corresponding tenant — except for M2M Applications in the securityv0-internal org, which resolve to super-admin (bypassing the per-tenant membership check). Their permissions are still bounded by the declared internal_role of the specific M2M Application per §6.3 and §7.2 — an internal_role: "member" M2M Application gets read-only access across tenants and is rejected on writes, same as a human with that internal role.
Locally verifiable: we verify tokens against a cached JWKS, so there is no runtime dependency on WorkOS for every request.
Audit attribution: every request logs the M2M Application's client_id, so "Claude Code" or "connector worker" is identifiable in audit logs.
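
The org_id-to-tenant step can be sketched as below. The sketch assumes the JWT signature has already been verified against the JWKS, so it only handles claims; `VerifiedM2MClaims`, `TenantMirrorRow`, `M2MScope`, and `resolveM2MScope` are hypothetical names, and the explicit expiry check is included for illustration even though a JWT library would normally enforce it during verification.

```typescript
interface VerifiedM2MClaims {
  sub: string;    // client_* for M2M Applications
  org_id: string; // WorkOS Organization the application belongs to
  exp: number;    // expiry, seconds since epoch
}

interface TenantMirrorRow {
  slug: string;
  provider_org_id: string;
}

type M2MScope =
  | { kind: "tenant"; slug: string }
  | { kind: "super_admin" }
  | { kind: "reject"; reason: string };

function resolveM2MScope(
  claims: VerifiedM2MClaims,
  tenants: TenantMirrorRow[],
  internalOrgId: string, // value of WORKOS_SUPER_ADMIN_ORG_ID
  nowSeconds: number,
): M2MScope {
  if (claims.exp <= nowSeconds) return { kind: "reject", reason: "expired" };
  // Applications in the internal org resolve to super-admin; their write
  // ability is still bounded by the application's declared internal role.
  if (claims.org_id === internalOrgId) return { kind: "super_admin" };
  const tenant = tenants.find((t) => t.provider_org_id === claims.org_id);
  return tenant === undefined
    ? { kind: "reject", reason: "unknown org_id" }
    : { kind: "tenant", slug: tenant.slug };
}
```

A token from a customer org can only ever resolve to that org's tenant, because the slug comes from the claim rather than from anything the caller sends.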

13.3 API Keys (customer programmatic access)

Status, 2026-05-03: at risk. This section's design assumes WorkOS API Keys can be issued per-user. The 2026-04-30 WorkOS staging spike found they are org-scoped only (owner.type = "organization", no user_id claim). The customer programmatic-access plan is therefore open. Three options under consideration:

Wait for WorkOS to ship per-user API Keys (no timeline; not on their public roadmap).

Build a thin sv0-side PAT layer (mint long-lived tokens, store hashed in our users_api_keys collection, verify in middleware) — replicates what WorkOS already does for the org-scoped case.

Route customer agents through OAuth/MCP only (§13.4) and not ship long-lived PATs at all.

The rest of this section describes the original plan as a reference design; assume any "user-scoped" claim is an assumption pending product resolution.

WorkOS provides an embeddable widget for user-facing API key management. sv0's Settings page hosts the widget; customer users click "Manage API keys" and see a WorkOS-rendered UI with:

List of their active keys (with last-used timestamp)
"Generate new key" button with per-key name, optional expiry, and optional scope selection
Rotate and revoke actions
One-time display of the key on generation (never shown again)

We write zero CRUD UI code for this. The widget posts directly to WorkOS's API. sv0's only responsibility is:

Verify bearer tokens against the WorkOS API Keys endpoint (with a short-lived local cache to avoid per-request latency).
Resolve the key's owning user and organization to a local authContext.
Enforce scopes on each route (same permission system as interactive users).

Properties:

Long-lived (not short-lived JWTs). Customer scripts can use a single key for months without rotation, though best practice is periodic rotation.
User-scoped (ASSUMPTION, not confirmed): this design assumes per-user keys. The 2026-04-30 WorkOS staging spike disproved this for the current WorkOS implementation — see the status callout above. If the design ships via Option 2 (sv0-side PAT layer) or a future WorkOS feature, the per-user assumption stands; otherwise it does not.
Revocable: the user can revoke at any time through the widget; revocation is effective within the cache TTL (default 60s).
Prefix-branded: keys use a recognizable prefix (e.g., sv0_live_..., sv0_test_...) for leak detection via GitHub secret scanning.

13.4 AuthKit OAuth Applications and MCP support (future `customer_agent` principal)

AuthKit is a spec-compliant OAuth 2.0 authorization server. Third-party applications — including MCP clients used by AI agents like Claude Desktop — can request delegated access to a user's sv0 data through the standard authorization code flow:

The MCP client presents itself to sv0 with its OAuth client metadata (or a Client ID Metadata Document URL for dynamic registration).
sv0 redirects the user to AuthKit's consent page: "Claude Desktop is requesting access to your sv0 account. Scopes requested: read findings, read entities, read access paths. Approve?"
The user clicks Approve.
AuthKit issues an authorization code to the MCP client.
The MCP client exchanges the code for an access token at the token endpoint.
The MCP client presents the access token to sv0's API on every subsequent call.

The user can revoke the authorization at any time from their Settings page, which shows a list of authorized applications (managed by WorkOS, rendered in our UI).

Client ID Metadata Document (CIMD), which WorkOS supports, lets MCP clients without a prior relationship identify themselves by hosting a JSON metadata document at a well-known URL. This is the mechanism the MCP specification expects for OAuth discovery. To enable MCP support, we flip one toggle in the WorkOS dashboard and implement a small metadata endpoint on our side; WorkOS handles the rest.

13.5 Middleware changes to support programmatic access

The auth middleware pipeline (§5.2) is extended with one additional check at the top:

request
  │
  ▼
[0] auth-source-detection
  │   - Has Cookie? → cookie path (existing §5.2 flow)
  │   - Has Authorization: Bearer eyJ... (3-segment JWT)? → m2m path
  │   - Otherwise → 401 or redirect to /login
  │
  ▼
[1] session-middleware       — for cookies: decrypt, load user from mirror
    m2m-middleware           — for JWTs: verify via JWKS, then dispatch:
                                  a. verifyM2MToken (JWKS) → JWT payload
                                  b. resolve tenant via findTenantByProviderOrgId(jwt.org_id)
                                  c. dispatch on sub prefix:
                                     - sub=user_*    → delegated_agent path
                                                       (introspection resolves agentClientId;
                                                        agentClientId must exist in the
                                                        agent-clients registry or 401)
                                     - sub=client_*  → service path (attachMachineContext)
                                                       (no registry gate — any JWKS-verified
                                                        token from a known-tenant org_id is
                                                        accepted as a service principal)
  │
  ▼
[2..4] (unchanged)           — tenant, membership, permission checks
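Step [0] of the pipeline above can be sketched as a pure function over the request headers. This is an illustrative sketch, not the shipped middleware: the function name, the `AuthSource` type, and the choice to treat any non-empty Cookie header as the cookie path are assumptions (the real check may look for a specific session cookie).

```typescript
type AuthSource = "cookie" | "m2m" | "none";

// "Bearer " followed by three base64url segments, the first starting with "eyJ"
// (the base64url encoding of a JSON object that begins with {" ).
const BEARER_JWT = /^Bearer (eyJ[A-Za-z0-9_-]*)\.([A-Za-z0-9_-]+)\.([A-Za-z0-9_-]+)$/;

function detectAuthSource(headers: { cookie?: string; authorization?: string }): AuthSource {
  // Cookie is checked first, mirroring the order in the diagram.
  if (headers.cookie !== undefined && headers.cookie.trim() !== "") {
    return "cookie";
  }
  if (headers.authorization !== undefined && BEARER_JWT.test(headers.authorization)) {
    return "m2m";
  }
  // Caller responds with 401 (API) or a redirect to /login (browser).
  return "none";
}
```

The shape check only routes the request; it proves nothing about the token. Signature and claim verification still happen in the m2m middleware via JWKS.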

All authentication paths produce the same AuthContext shape. The session.method field records how the caller authenticated, and session.principal_kind records the resulting principal kind:

interface AuthContext {
  user: {
    id: ObjectId;
    provider_user_id: string;     // user_* for human/delegated_agent; client_* for service
    email: string;                // For service: synthetic "svc:<client_id>@securityv0.internal"
    display_name: string;
    is_super_admin: boolean;
  };
  tenant: { /* unchanged */ };
  membership: { /* unchanged */ };
  session: {
    expires_at: Date;
    // The shipped enum is the AuthMethod union from src/api/auth/auth-provider.ts:
    //   "sso" | "magic_link" | "oauth_google" | "oauth_microsoft" | "oauth_github"
    //   | "password" | "passkey" | "impersonation" | "m2m" | "api_key" | "dev_bypass"
    //   | "delegated_agent"
    //
    // The values that matter for the principal-kind dispatch are:
    //   - "sso" | "magic_link" | "oauth_*" | "password" | "passkey"
    //                                  → human_session (the AuthCallbackResult.authMethod)
    //   - "delegated_agent"            → delegated_agent (device_code, sub=user_*).
    //                                    Audit logs carry agentClientId (registry entry).
    //   - "m2m"                        → service (CI, workers — sub=client_*)
    //   - "api_key"                    → server-issued connector API keys
    //   - "dev_bypass"                 → local dev / test only
    method: AuthMethod;
    auth_connection_id: string | null;
    principal_kind: PrincipalKind; // See src/api/auth/principal-kind.ts
  };
}

Route handlers do not need to know (and should not care) how the caller authenticated. Permissions are checked identically across all paths.
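The method-to-principal-kind mapping listed in the AuthContext comments can be written out as a small function. This is a sketch only: `SketchPrincipalKind` covers just the three kinds named in those comments, and the shipped `PrincipalKind` union in src/api/auth/principal-kind.ts may differ. Methods for which the comments state no principal kind return `undefined` here rather than a guess.

```typescript
type AuthMethod =
  | "sso" | "magic_link" | "oauth_google" | "oauth_microsoft" | "oauth_github"
  | "password" | "passkey" | "impersonation" | "m2m" | "api_key" | "dev_bypass"
  | "delegated_agent";

// Assumption: only the kinds named in the AuthContext comments are modelled.
type SketchPrincipalKind = "human_session" | "delegated_agent" | "service";

function principalKindFor(method: AuthMethod): SketchPrincipalKind | undefined {
  switch (method) {
    case "sso":
    case "magic_link":
    case "oauth_google":
    case "oauth_microsoft":
    case "oauth_github":
    case "password":
    case "passkey":
      return "human_session";
    case "delegated_agent":
      return "delegated_agent"; // device_code, sub=user_*
    case "m2m":
      return "service"; // CI, workers, sub=client_*
    default:
      // "impersonation", "api_key", "dev_bypass": no kind is stated in the source.
      return undefined;
  }
}
```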

13.6 Rate limiting and abuse

Programmatic access is where abuse concerns live. Three controls:

Per-credential rate limits. M2M Applications and API Keys have per-token rate limits configured in sv0's rate-limit middleware. Default: 600 req/min for M2M, 300 req/min for API keys. Can be raised per customer on request.
Anomaly detection on WorkOS's side. WorkOS tracks unusual API Key usage patterns (new IP, new user agent, sudden volume spike) and surfaces them in its dashboard. Subscribing to these signals via webhooks depends on the webhook receiver, which is not yet wired (§15).
Scope enforcement. API keys and M2M tokens are gated by the same Permission enum as interactive users. A leaked token with FINDING_READ cannot delete findings.
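The per-credential limits above can be illustrated with a fixed-window counter. This is a minimal sketch under stated assumptions: the class name, the fixed-window algorithm, and the in-memory Map are illustrative choices, not a description of sv0's rate-limit middleware. Only the default numbers (600 and 300 requests per minute) come from the text.

```typescript
type CredentialKind = "m2m" | "api_key";

// Defaults from the text above; a per-customer raise is modelled as an optional override.
const DEFAULT_LIMITS_PER_MIN: Record<CredentialKind, number> = { m2m: 600, api_key: 300 };

const WINDOW_MS = 60_000;

class FixedWindowLimiter {
  private windows = new Map<string, { start: number; count: number }>();

  // Returns true if the request is allowed, false if the credential is over its limit.
  allow(credentialId: string, kind: CredentialKind, nowMs: number, overridePerMin?: number): boolean {
    const limit = overridePerMin ?? DEFAULT_LIMITS_PER_MIN[kind];
    let current = this.windows.get(credentialId);
    if (current === undefined || nowMs - current.start >= WINDOW_MS) {
      // Start a fresh window for this credential.
      current = { start: nowMs, count: 0 };
      this.windows.set(credentialId, current);
    }
    if (current.count >= limit) {
      return false;
    }
    current.count += 1;
    return true;
  }
}
```

A fixed window is the simplest option to reason about; it allows short bursts at window boundaries, which a sliding window or token bucket would smooth out.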

13.7 SecurityV0 team bot access — interactive vs headless vs services

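Staff and services authenticate to the platform API from three runtime contexts: interactive, headless, and service. Patterns are shipped for the interactive (§13.7.a) and service (§13.7.b) contexts; the headless context has no shipped path yet (see the decision table). The pattern is chosen by the caller's runtime context, not by preference. Using the wrong pattern for the context creates either a broken flow (interactive device_code in a headless environment) or an attribution gap (service M2M for human-owned work).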

13.7.a Staff CLI — `claude-code` device_code OAuth App (interactive bootstrap)

For staff working interactively on a local machine — Claude Code sessions, tsx scripts, visual-review tooling, ad-hoc curl.

One Connect App: claude-code-agent in securityv0-internal org.
Flow: Engineer runs npm run auth:login once per ~3 months → device_code grant opens a browser → JWT stored at ~/.config/sv0/auth.json.
JWT shape: sub=user_* — the JWT carries the engineer's WorkOS user identity directly.
Registry lookup: bearer middleware looks up the OAuth App in src/api/auth/agent-clients.ts by the agentClientId resolved via introspection.
Effective permissions: user_perms(sub) ∩ agent_scopes(claude-code) ∩ env_policy — bounded by the engineer's internal role and the app's declared scopes (api:read, ui:session:create).
Not suitable for: headless agents (Telegram, remote sandboxes, CI sandboxes) where interactive browser auth is impossible.
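The effective-permissions formula above is a three-way set intersection. The sketch below assumes, for simplicity, that user permissions, agent scopes, and environment policy are all expressed in one shared string vocabulary; the real system may need a mapping between scopes and the Permission enum. The function name is hypothetical, and `api:write` in the usage example is an invented value used only to show something being filtered out.

```typescript
// user_perms(sub) ∩ agent_scopes(app) ∩ env_policy
function effectivePermissions(
  userPerms: ReadonlySet<string>,
  agentScopes: ReadonlySet<string>,
  envPolicy: ReadonlySet<string>,
): Set<string> {
  const result = new Set<string>();
  userPerms.forEach((perm) => {
    // A permission survives only if every layer grants it.
    if (agentScopes.has(perm) && envPolicy.has(perm)) {
      result.add(perm);
    }
  });
  return result;
}

// Usage: the app's declared scopes cap what the engineer's token can do.
const example = effectivePermissions(
  new Set(["api:read", "api:write", "ui:session:create"]), // engineer's role (hypothetical)
  new Set(["api:read", "ui:session:create"]),              // claude-code declared scopes
  new Set(["api:read"]),                                    // environment policy (hypothetical)
);
```

Because the result is an intersection, adding a scope to any single layer never widens access on its own; all three layers have to agree.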

13.7.b Per-service M2M Applications (CI/workers, no human owner)

For long-running services that have no single human owner — CI runners, connector workers, reconciliation jobs, scheduled jobs.

One Connect App per service (e.g., ci-staging-m2m, connector-runner-staging).
Tokens dispatch as service principal (sub=client_*). Attribution in audit logs is the App's client_id, not a human user.
The registry is static in code (src/api/auth/agent-clients.ts). The registry gate applies to the delegated_agent path (§13.7.a): a user_* token whose resolved agentClientId is not in the registry is rejected with 401. The service-principal path (client_*) is not registry-gated — any JWKS-verified token from a known-tenant org passes. The registry's purpose for service principals is operational (the canonical inventory of which CI/worker apps exist), not gatekeeping.
Each entry's declared scopes are the ceiling — no short-circuit for blanket super-admin access.
Used for: GitHub Actions CI (visual-review, smoke checks), connector workers, scheduled jobs. Lives in securityv0-internal org.
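The asymmetry between the two bearer paths (registry-gated for `user_*`, not gated for `client_*`) can be sketched as follows. This is an illustration under assumptions: the registry key `client_example_claude_code`, the `GateResult` type, and the function name are hypothetical, and rejecting an unrecognized `sub` prefix with 401 is an assumption the text does not state. JWKS verification and tenant resolution are taken as already done.

```typescript
interface AgentClientEntry {
  name: string;
  scopes: readonly string[];
}

// Hypothetical contents; the real registry is static code in src/api/auth/agent-clients.ts.
const AGENT_CLIENTS: ReadonlyMap<string, AgentClientEntry> = new Map<string, AgentClientEntry>([
  ["client_example_claude_code", { name: "claude-code", scopes: ["api:read", "ui:session:create"] }],
]);

type GateResult =
  | { ok: true; principalKind: "delegated_agent" | "service" }
  | { ok: false; status: 401 };

// sub comes from the verified JWT; agentClientId comes from introspection (null if unresolved).
function gateBearer(sub: string, agentClientId: string | null): GateResult {
  if (sub.startsWith("user_")) {
    // delegated_agent path: the resolved app must be in the registry.
    if (agentClientId !== null && AGENT_CLIENTS.has(agentClientId)) {
      return { ok: true, principalKind: "delegated_agent" };
    }
    return { ok: false, status: 401 };
  }
  if (sub.startsWith("client_")) {
    // service path: no registry gate, as described above.
    return { ok: true, principalKind: "service" };
  }
  // Assumption: unknown sub prefixes are rejected.
  return { ok: false, status: 401 };
}
```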

Current service registry:

App name	Principal kind	Purpose
`claude-code`	`delegated_agent` (device_code, `sub=user_*`)	Staff CLI interactive bootstrap
`ci-staging-m2m`	`service` (no bridge)	GitHub Actions CI visual-review workflow

Rotation: generate a new credential pair in WorkOS for the specific service, update the service's 1Password entry and deployment config, restart the service, delete the old credential. No other service is affected. No staff laptops are touched.

Decision table — which pattern to use

Caller context	Pattern	Principal kind
Engineer running `npm run dev` locally or `tsx` scripts on workstation	§13.7.a — device_code	`delegated_agent`
Engineer running agent on a remote box, Telegram bot, CI sandbox	No shipped clean path. Talk to the auth owner before building one.	—
Service with no human owner (CI runner, connector worker, cron job)	§13.7.b — per-service M2M	`service`

Explicitly forbidden: sharing a credential across engineers, sharing an M2M credential across services, creating a single "bot" credential for all Claude Code sessions. Each defeats attribution and rotation.

13.8 What we don't build ourselves

To be explicit about the boundary: none of the following are sv0 code.

PAT generation UI
PAT list/rotate/revoke UI
API key hashing and storage
OAuth authorization server (authorization endpoint, token endpoint, consent screen, scope registry)
Client registration flow
Consent management UI for end users ("which apps have access to my account")
JWT signing key rotation
JWKS endpoint

All of this is WorkOS. We write only the token-verification middleware: JWKS-based JWT validation for M2M tokens. The WorkOS API Keys endpoint path (for customer PATs) is not yet integrated.

14. Cloudflare Access composition

Cloudflare Access is Layer 1 for the non-prod origins. It composes with WorkOS:

dev.securityv0.com and PR previews (pr-N-dev.securityv0.com) are gated by CF Access policies that allow: (a) SecurityV0 Google Workspace members, (b) Cloudflare service tokens for CI/CD.
app.securityv0.com has no CF Access app — the WorkOS hosted login is the only gate. The prod origin is open at the network layer.
Local development does not traverse CF Access at all.

For dev and PR previews this is a "belt and suspenders" posture: CF Access prevents unauthenticated internet traffic from reaching the origin, and WorkOS determines what authenticated users can actually do. For prod, only the WorkOS layer is in place. See Cloudflare Zero Trust Access for service-token names and policy details.
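For CI callers, the per-origin split above means Cloudflare service-token headers are needed only on the non-prod origins. The sketch below assumes Cloudflare's standard service-token header names (`CF-Access-Client-Id` and `CF-Access-Client-Secret`); the function names and the `ServiceToken` type are hypothetical, and the actual token names live in the Cloudflare Zero Trust Access doc.

```typescript
interface ServiceToken {
  clientId: string;
  clientSecret: string;
}

// True for the origins the text lists as gated: dev and pr-N-dev previews.
function isBehindCfAccess(hostname: string): boolean {
  return hostname === "dev.securityv0.com" || /^pr-\d+-dev\.securityv0\.com$/.test(hostname);
}

// Returns the extra headers a CI job should attach for the given origin.
function cfAccessHeaders(hostname: string, token: ServiceToken): Record<string, string> {
  if (!isBehindCfAccess(hostname)) {
    // Prod and local development do not traverse CF Access.
    return {};
  }
  return {
    "CF-Access-Client-Id": token.clientId,
    "CF-Access-Client-Secret": token.clientSecret,
  };
}
```

These headers only get the request past Layer 1. The request still needs a WorkOS-issued cookie or bearer token to be authorized by sv0.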

15. Not yet implemented

These items are intentionally deferred. Each is documented here so a future contributor knows whether to extend an existing path or add a new one.

Component	Status
Webhook receiver (`POST /webhooks/workos`)	Not wired. The Mongo `users` / `memberships` mirror is maintained via login-time upsert only.
Reconciliation job	Not implemented. Mirror drift between WorkOS and the local Mongo mirror is not auto-detected.
WorkOS API Keys widget (customer PATs)	Deferred: product gap. WorkOS API Keys are org-scoped (empirically confirmed in the 2026-04-30 staging spike), so the widget cannot directly serve as a per-user customer PAT. The customer programmatic access path is still undecided; the options are to wait for WorkOS user-scoped keys, build a thin sv0-side PAT layer, or route customer agents through OAuth/MCP only (§13.4). See §13.3 for the open question.
SCIM Directory Sync	Per-customer, not yet enabled.
Full Admin Portal flow	Not yet surfaced in UI. SSO connections are created manually via the WorkOS dashboard.

For operational details (org IDs, DNS records, Google OAuth client config, common gotchas), see the WorkOS Production Configuration runbook.

16. Design principles and operational notes

16.1 Don't add a secret to fix the legacy path

The current shape is the floor: one provider, one cookie-seal slot, one super-admin signal, one redirect-URI source. If you are about to add a new env var, allowlist, or middleware to fix a problem, stop and look for the existing path first. Each duplicate config slot this surface has carried produced a real production incident; adding a parallel slot is how we got there. The design rationale lives in the Auth Simplification Plan (2026-05-08).

16.2 Failure mode — WorkOS API outage

The provider's getOrganizationMemberships helper returns an empty list on transient WorkOS errors: it logs a warning and does not throw. A thrown error would break the auth flow entirely, whereas an empty list lets the user proceed without super-admin for the duration of the outage. Two request paths hit WorkOS:

Cookie callback (src/api/routes/auth.ts) — every cookie login during the outage resolves to is_super_admin: false.
First-time bearer JIT-upsert (src/api/middleware/bearer-token-middleware.ts) — runs only on the very first bearer request for a provider_user_id against a given DB.

Existing-row bearer requests are unaffected: they read user.is_super_admin from the local mirror without calling WorkOS. The in-memory cache has a 5-minute TTL, so login attempts made after WorkOS recovers pick up the real value within that window, with no manual intervention.
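The degrade-to-empty behavior and the 5-minute cache can be sketched together. This is an assumption-heavy illustration, not the provider's code: the class name, the injected fetch function and clock, and in particular the choice not to cache failed lookups are all hypothetical. The real helper also logs a warning on failure, which is reduced to a comment here.

```typescript
type FetchMemberships = (userId: string) => Promise<string[]>;

class MembershipLookup {
  private cache = new Map<string, { orgIds: string[]; expiresAt: number }>();
  private fetchFromProvider: FetchMemberships;
  private ttlMs: number;
  private now: () => number;

  constructor(
    fetchFromProvider: FetchMemberships,
    ttlMs: number = 5 * 60_000,
    now: () => number = () => Date.now(),
  ) {
    this.fetchFromProvider = fetchFromProvider;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  async getOrganizationMemberships(userId: string): Promise<string[]> {
    const hit = this.cache.get(userId);
    if (hit !== undefined && hit.expiresAt > this.now()) {
      return hit.orgIds;
    }
    try {
      const orgIds = await this.fetchFromProvider(userId);
      this.cache.set(userId, { orgIds, expiresAt: this.now() + this.ttlMs });
      return orgIds;
    } catch {
      // Transient provider error: a warning would be logged here.
      // Degrade to "no memberships" so the login proceeds without super-admin.
      // Assumption: failures are not cached, so the next attempt retries the provider.
      return [];
    }
  }
}
```

Under this sketch, a caller that derives is_super_admin from membership in the internal org would see false during the outage and the real value on the first successful lookup afterwards.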

On-call triage rule. A cookie login bounced to /no-access during an outage is indistinguishable from a real org-membership revocation. If multiple staff are simultaneously bounced while Grafana or WorkOS status pages show API errors, it's the outage; if it's one staff member and WorkOS is healthy, it's a real membership change.

16.3 Failure mode — bearer existing-row super-admin is not auto-refreshed

is_super_admin is set on the local users row at cookie callback (every login) and at bearer JIT-upsert (first bearer request for a user). Subsequent bearer requests do not refresh the flag — they read the cached value. A staff member added to the super-admin org after their first bearer call against a DB still needs one cookie login (or a manual DB refresh) to flip the bit on their existing row. Refreshing on every bearer request would multiply the WorkOS API load on cache miss without a corresponding user-visible benefit.

Related

Authentication, end-to-end (runbook) — operational flow walkthrough
Agent and M2M Authentication (runbook) — how to authenticate as an agent or M2M client
Auth Simplification Plan (2026-05-08) — rationale behind the current single-signal shape
ADR-016: Multi-Tenant Authentication Architecture
ADR-017: WorkOS as Authentication Provider
ADR-012: User Authentication Strategy (superseded)
04 — API Layer
Infrastructure: Cloudflare Zero Trust
Provider comparison research
sv0-platform: scripts/cli/README.md
