Pre-Client Enterprise Readiness Plan (v2.0)

TL;DR

MediaPro pilot targets early May 2026 (~10–14 days from 2026-04-22). Six parallel adversarial reviews (security, ops, connectors, enterprise buyer, Tier-2 analyst, CISO UI) identified 15 P0 ship-blockers at v1.0. Two architectural decisions, taken between v1.0 and v1.4, collapse most of those P0s:

WorkOS for auth eliminates the production auth bypass, attacker-controlled x-tenant-id, and hardcoded dev cookie password (original P0-1/2/3) by replacing them with real production auth.

MongoDB Atlas M10 migration in the pilot window eliminates the unauthenticated-MongoDB and same-disk-backup risks (original P0-4/6) — Atlas provides auth, HA, and native PITR backups out of the box.

What remains is 11 items of real pilot work across four capabilities (observability, connectors, analyst workflow, marketing/trust hygiene) plus a compute-substrate migration that is deferred to post-pilot. Three tracks run in parallel: engineering, client-facing (Isaac + Ivan + Sergey), and trust/legal (de-escalated because MediaPro waived DPO for read-only metadata scope).

The single highest-leverage action is not engineering — it is booking the WorkOS sales call (Sally Park thread, response received 2026-04-16). Nothing in Track A past Day 2 unblocks until that call completes.

1. Client context — MediaPro

Global media company (EMEA HQ). CIO Sergi and head of IT/infra Ezequiel are the buying committee. They stopped engagement with Fortinet and Cisco on AI security (both delivered incomplete proposals). Stack: Azure-heavy (Copilot, Foundry, Copilot Studio) + AWS + ServiceNow + Jira; Google and Oracle are present but not the primary concern; dev team uses Claude.

Compliance posture (confirmed in meeting)

Access scope is read-only, metadata-only. Ezequiel confirmed this scope does not require DPO involvement on their side; Sergi (Corporate Security Officer within the CIO org) signed off verbally.
Data-residency constraints have not been formally stated; MediaPro's stack is EMEA-based, so the Atlas region choice should match or explain.
Net effect on this plan: Trust/legal artifacts (original P0-13, Trust & Controls Summary) are de-escalated from "legal gate" to "credibility hygiene." Still required for the next client, no longer pilot-blocking.

Pilot scope and client commitments

Pilot environment: TBD (prod or staging) — they decide after stack inventory.
They are offering us: Jira Cloud test environment (unblocks sv0-connectors#72), AWS test access (validates our AWS multi-account work on real data), full stack disclosure by end of this week.
Client-side next step (two weeks): Ezequiel's team prepares credentials and env access for early-May pilot.

Use-case coverage — MediaPro's six concerns vs. our model

Tracked in detail in sv0-documentation#194.

#	MediaPro concern	Our coverage	Confidence
1	Lack of AI tool visibility (shadow AI, licenses)	Foundry + Copilot Studio discovery via service principals; workload entity type	High for discovery; gap on license-usage framing
2	Hallucinations / guardrails	Out of scope — frame explicitly, do not overpromise	n/a
3	Accountability over time (ownership decay)	Scope-drift evaluator + temporal diff + orphaned-sensitive cluster	Direct hit — our core value prop
4	Illicit automation via script with egress	Scope drift + cross-system auth + data-domain egress category	Direct hit pending verification we catch IAM-roled scripts, not only registered apps
5	Platform coverage / global reader access	Read-only metadata connector model matches exactly	High; Google + Oracle on roadmap
6	Admin enforcement for out-of-report-line teams	Out of scope — reframe as visibility	n/a

Ezequiel's emphasis on concern #4 — "automation via script with egress traffic, not DSPM" — is exactly our execution-authority story. This is the single most important positioning point for our discovery call.

2. Architectural decisions

Four decisions, taken between v1.0 and v1.4, shape everything downstream. Each one pre-resolves or collapses multiple items from the original P0 inventory.

2.1 Auth — WorkOS (decision: v1.0, unchanged)

Decision: WorkOS as the identity provider for production auth, per ADR-016 and ADR-017. WorkOS thread active with Sally Park (response 2026-04-16 offering startup credits, non-prod test environment, shared Slack channel).

What it resolves:

Original P0-1: prod runs NODE_ENV=development + AUTH_PROVIDER=dev — resolved by Phase 0/1 rollout flipping to NODE_ENV=production + AUTH_PROVIDER=workos.
Original P0-2: attacker-controlled x-tenant-id — auto-resolved once P0-1 lands; the new middleware derives tenant from the verified JWT's provider_org_id.
Original P0-3: hardcoded DEV_COOKIE_PASSWORD — obsolete once AUTH_PROVIDER=dev is no longer in prod.
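
The P0-2 fix comes down to one rule: the tenant comes from verified token claims, never from a client-supplied header. The sketch below illustrates that rule in Python for brevity, although the platform middleware itself is TypeScript. The function name `resolve_tenant`, the exception type, and the shape of the `claims` mapping are assumptions for illustration; only the `provider_org_id` claim and the `x-tenant-id` header come from this plan.

```python
from typing import Mapping, Optional


class TenantResolutionError(Exception):
    """Raised when no tenant can be derived from verified claims."""


def resolve_tenant(claims: Optional[Mapping[str, object]],
                   headers: Mapping[str, str]) -> str:
    """Derive the tenant from already-verified JWT claims only.

    `claims` is assumed to be the payload of a JWT whose signature was
    verified upstream. `headers` is accepted only to make the point that
    a client-supplied x-tenant-id is deliberately ignored.
    """
    if not claims:
        raise TenantResolutionError("no verified claims on request")
    org_id = claims.get("provider_org_id")
    if not isinstance(org_id, str) or not org_id:
        raise TenantResolutionError("provider_org_id missing from claims")
    # headers.get("x-tenant-id") is intentionally never consulted.
    return org_id
```

A request that carries a spoofed `x-tenant-id` but a valid token still resolves to the token's organization; a request with no verified claims fails closed.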

What it enables:

Magic-link test accounts for pilot-window QA (free, 2–3 hours setup, no Google Workspace required).
Session-stacking "act as user" impersonation feature, post-pilot (3–4 days of engineering on already-scaffolded code: Permission.INTERNAL_IMPERSONATE, AuthMethod.impersonation, is_super_admin session fields are already present in src/api/auth/).

Day-0 blocker: Book WorkOS sales call. Nothing past Day 2 of Track A unblocks until Phase 0/1 is live.

Tracking: sv0-platform#373, sv0-platform#492 (epic).

2.2 Database — MongoDB Atlas M10, direct billing, in-pilot (decision: v1.4)

Decision: Migrate prod MongoDB from Hetzner self-managed to Atlas M10 inside the pilot window. Direct MongoDB billing (not Azure Marketplace). Region: pending MediaPro confirmation; default to EU (Frankfurt) unless they explicitly approve US residency. MediaPro is an EMEA-based media company; defaulting to US assumes a residency posture they haven't signed off on. Frankfurt also aligns with the Grafana Cloud region picked in the observability research.

Why direct billing, not Marketplace: Azure Sponsorship and Founders Hub credits explicitly exclude Marketplace third-party products. Atlas via Azure Marketplace only helps with billing/procurement under MACC (a paid commitment we don't have). Atlas-direct preserves the ability to burn MongoDB for Startups credits (application in-flight).

What it resolves (once cutover is complete and verified):

Original P0-4: unauthenticated MongoDB — resolved natively; Atlas requires auth.
Original P0-6: no offsite backups, and existing backups sit on the same disk as the data; resolved natively, since Atlas ships with PITR and offsite backups.
Single-node risk — resolved; M10 is a three-node replica set by default.

Resolution evidence required (must all pass before these P0s are marked closed):

Atlas cluster has DB auth enabled; admin user is rotated to a managed secret.
PITR + backup policy visible in Atlas console; retention matches committed SLA.
App pointed at Atlas MONGODB_URI; Hetzner mongo compose service is dev-only or removed.
Restore + smoke test passed (verified mongorestore from a backup, followed by a read/write smoke).

Until all four are green, mark P0-4 / P0-6 as "in-flight, expected-resolved-by-Atlas" rather than "resolved."
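
The gating rule above can be expressed as a small status function. This is a minimal sketch; the class and field names are hypothetical labels for the four evidence items listed in this section, not identifiers from the codebase.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class AtlasEvidence:
    """One boolean per evidence item in the checklist (names are illustrative)."""
    db_auth_enabled_and_admin_rotated: bool
    pitr_policy_matches_sla: bool
    app_points_at_atlas_uri: bool
    restore_and_smoke_test_passed: bool


def p0_status(evidence: AtlasEvidence) -> str:
    """Return 'resolved' only when every evidence item is green."""
    if all(astuple(evidence)):
        return "resolved"
    return "in-flight, expected-resolved-by-Atlas"
```

With three of four items green, P0-4 and P0-6 stay "in-flight, expected-resolved-by-Atlas"; there is no partial credit.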

Migration plan (from Codex, validated):

Externalize prod MONGODB_URI in docker-compose.deploy.yml; make the mongo service dev-local-only. Zero production impact.
Stand up Atlas M10 with DB auth, IP allowlist from Hetzner static egress, backups on. No traffic yet.
30-minute maintenance window: mongodump Hetzner → mongorestore Atlas. Flip MONGODB_URI. Smoke test. Prod Mongo has near-zero data today; the data migration is trivial.
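
The dump-and-restore step in the maintenance window could be scripted along the following lines. This is a dry-run sketch that only builds the command lines; the function names, the archive filename, and the placeholder URIs are assumptions. The `--uri`, `--archive`, and `--gzip` flags are standard `mongodump` / `mongorestore` options.

```python
from typing import List


def dump_command(source_uri: str, archive_path: str) -> List[str]:
    """Build the mongodump argv for a single compressed archive file."""
    return ["mongodump", f"--uri={source_uri}",
            f"--archive={archive_path}", "--gzip"]


def restore_command(target_uri: str, archive_path: str) -> List[str]:
    """Build the mongorestore argv that replays the same archive."""
    return ["mongorestore", f"--uri={target_uri}",
            f"--archive={archive_path}", "--gzip"]


def cutover_commands(source_uri: str, target_uri: str,
                     archive_path: str = "prod-cutover.archive.gz") -> List[List[str]]:
    """Return the commands in execution order: dump Hetzner, restore Atlas."""
    return [dump_command(source_uri, archive_path),
            restore_command(target_uri, archive_path)]


if __name__ == "__main__":
    # Placeholder URIs; real values would come from the secret store.
    for cmd in cutover_commands("mongodb://hetzner.example:27017",
                                "mongodb+srv://atlas.example"):
        print(" ".join(cmd))
```

Each list can be handed to `subprocess.run(cmd, check=True)` during the window. Note that a URI passed on the command line is visible in the process list, so credentials are better supplied another way if that matters on the host.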

Anti-patterns locked (per Codex review):

No self-hosted MongoDB on cloud VMs. Replaces one operational burden with another.
No Cosmos DB for MongoDB (vCore or RU). Wire-compatible but ops-divergent; weakens portability and MongoDB behavioral confidence.
No Atlas-via-Marketplace billing. Adds indirection without unlocking credit burn.
No app-in-cloud-A + Atlas-in-cloud-B split pre-revenue.

Funding:

MongoDB for Startups application — submitted this week. Eligibility confirmed (pre-Series-A, <7 yrs, single software product). Credit tier opaque publicly; typical $500–$5K with VC-partner referral unlocking higher.
Cash overflow: Atlas M10 ~$57/mo if credits don't stretch. Tolerable.

Tracking: sv0-platform#493.

2.3 Compute — Hetzner through pilot, cloud-agnostic VM post-pilot (decision: v1.3)

Decision: Keep the API container on Hetzner for the pilot window. Migrate compute post-pilot to a cloud-agnostic Linux VM + Docker Compose. Cloud choice (Azure VM vs AWS VM) defers until post-pilot, contingent on credit approval status.

Why staged: Opus research concluded that migrating compute inside the 10-day pilot window introduces change-risk for cosmetic gain. ADR-018 explicitly allows managed-platform deferral ("we can defer through pilot as long as MongoDB is hardened"). With Atlas covering the DB layer, the Hetzner compute substrate is acceptable for a single-pilot duration.

Why cloud-agnostic VM, not Container Apps / ECS: Keep the first cloud migration boring and portable so a future cloud flip (Azure↔AWS) is a MONGODB_URI + DNS change, not a rewrite. Container Apps, ECS, and Fargate all bake in cloud-specific deployment semantics.

Post-pilot credit picture:

Program	Status	What it pays for
AWS Mercury Activate	$5K activated	AWS compute if we go AWS post-pilot
Azure Founders Hub	NOT applied	Azure VM + networking + Key Vault if we go Azure post-pilot. Entry tier realistically $1–5K for bootstrapped/pre-revenue; $150K ceiling requires investor-backing. Decision: apply after post-pilot compute choice is locked, or at raise via VC-partner path.
MongoDB for Startups	Applying 2026-04	Atlas (independent of cloud choice)
WorkOS startup credits	Awaiting Sally	SSO + SCIM + Impersonation

Tracking: sv0-platform#493 (Phase 2 = post-pilot compute migration).

2.4 Trust & legal — de-escalated per DPO waiver (decision: v1.2)

Decision: MediaPro confirmed read-only metadata access does not require DPO involvement. Trust & Controls Summary is still produced (credibility hygiene, required for client #2+), but no longer gated by external counsel sign-off for the MediaPro pilot.

What it resolves: Original P0-13 framing of "no DPA will block signature" is materially weaker in the MediaPro-specific path. Track C runs in parallel with engineering but is not on the critical path for pilot go-live.

Tracking: sv0-documentation#192 (Trust & Controls Summary v1), sv0-website#41 (website EU claim + status string hygiene).

3. Active work — what's left to ship for the pilot

The 11 items below remain after architectural decisions. Grouped by capability, each with owner and concrete action.

3.1 Observability stack + security hygiene

Decision (2026-04-22): Grafana Cloud free tier + BetterStack free tier + grafana/mcp-grafana MCP server. Two engineer-days total. $0 at pilot scale. Portable across Hetzner → Azure VM / AWS VM post-pilot migration. Full analysis in research/2026-04-22-observability-stack.md. Tracked in sv0-platform#494.

Decision drivers: (1) we already emit prom-client metrics + JSON logs, so no instrumentation rewrite; (2) grafana/mcp-grafana is best-in-class for Claude Code agent access (logs, metrics, alerts, dashboards, traces via MCP); (3) survives the post-pilot compute migration unchanged. Datadog explicitly rejected (free tier is a trap); self-host LGTM deferred until post-pilot if cost becomes an issue.

Day-1 security fixes (fold into observability rollout):

Close P0-5 — metrics tenant-ID leak. sv0_job_duration_seconds carries a tenant_id label in src/shared/metrics/metrics.ts:25; call sites at src/workers/runtime.ts:125,133. Strip the label; move per-tenant duration into log context. Required before any external scraper is pointed at /metrics. Note: the HTTP route label on sv0_http_request_duration_seconds is already parameterized via req.route.path (src/api/middleware/metrics.ts:11) — no tenant cardinality leak from URL scoping. ~15 min.
Gate /metrics and /diagnostics at the route level. Removing from PUBLIC_PATHS alone is not sufficient — createSystemRoutes() is mounted at src/api/app.ts:80, before the bearer and session middleware at lines 108 and 114. The route never reaches PUBLIC_PATHS evaluation. Fix: add a route-local bearer-check middleware wrapping /metrics and /diagnostics inside createSystemRoutes (src/api/routes/system.ts:52,62), leaving /health and /ready public for Docker / k8s / Cloudflare probes. Alloy scrapes with the token. ~30 min.
Close P0-7 — external uptime monitor. BetterStack free, 60-second checks on /api/v1/health, SMS + email, CF Access service-token headers to bypass login. ~30 min.
Session cookie secure=true unconditionally. Decouple from NODE_ENV. ~5 min.
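
The route-level gating decision for the system routes is sketched below. It is written in Python to show the logic only; the real change is an Express middleware inside createSystemRoutes. The function name, the constant names, and the 404 fallthrough are assumptions.

```python
import hmac
from typing import Optional

# Probe endpoints stay public for Docker / k8s / Cloudflare health checks.
PUBLIC_PROBE_PATHS = frozenset({"/health", "/ready"})
# These require the scraper's bearer token.
PROTECTED_PATHS = frozenset({"/metrics", "/diagnostics"})


def system_route_status(path: str, authorization: Optional[str],
                        expected_token: str) -> int:
    """Return the HTTP status the system router should answer with."""
    if path in PUBLIC_PROBE_PATHS:
        return 200
    if path in PROTECTED_PATHS:
        prefix = "Bearer "
        if not authorization or not authorization.startswith(prefix):
            return 401
        presented = authorization[len(prefix):]
        # Constant-time comparison avoids leaking token contents via timing.
        if hmac.compare_digest(presented.encode(), expected_token.encode()):
            return 200
        return 401
    return 404
```

Because the check lives next to the routes themselves, it does not depend on where the global auth middleware is mounted relative to the system router.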

Day 2-5: Grafana Cloud signup + Alloy + dashboards + alerts + MCP server wiring. Full runbook in sv0-platform#494. Owner: Ivan + Ops.

3.2 Connector resilience at MediaPro scale — 3 items

AWS multi-account enumeration (original P0-8, sv0-connectors#32). AWS is a core part of MediaPro's stack alongside Azure; a single-account scan is pilot-blocking. Validate against the MediaPro-provided test env. Owner: Connectors.
Streaming pagination for Entra + AWS (original P0-9). azure_client.py:59-79 and aws_client.py:306-343 accumulate all pages in memory; 500K Entra SPs = OOM on 4GB host. Convert to generators. Owner: Connectors.
Azure shared retry_session() honors Retry-After (original P0-10). shared/sv0_azure/sv0_azure/auth.py:40-59. Pin respect_retry_after_header=True, retry_after_max=300, retries=6. ServiceNow client is the reference impl. Owner: Connectors.
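
The two connector fixes for P0-9 and P0-10 share a shape: yield items page by page instead of accumulating them, and honor a capped `Retry-After` when throttled. The sketch below is dependency-free and is not the code in azure_client.py or auth.py. The `fetch_page` callable, its return tuple, and the exponential fallback when the header is absent are assumptions; the 300-second cap and 6 retries mirror the values in the list above.

```python
import time
from typing import Callable, Dict, Iterator, List, Optional, Tuple

# (status_code, response_headers, items_on_this_page, next_page_token)
Page = Tuple[int, Dict[str, str], List[dict], Optional[str]]

RETRY_AFTER_MAX = 300  # seconds
MAX_RETRIES = 6


def retry_delay(headers: Dict[str, str], attempt: int) -> float:
    """Seconds to wait before retrying, capped at RETRY_AFTER_MAX.

    Only the delta-seconds form of Retry-After is parsed here; the
    HTTP-date form and case-insensitive header lookup are left out.
    """
    try:
        seconds = int(headers.get("Retry-After", ""))
    except ValueError:
        seconds = 2 ** attempt  # assumed fallback when the header is unusable
    return float(max(0, min(seconds, RETRY_AFTER_MAX)))


def iter_items(fetch_page: Callable[[Optional[str]], Page],
               sleep: Callable[[float], None] = time.sleep) -> Iterator[dict]:
    """Yield items one page at a time so memory stays bounded by page size."""
    token: Optional[str] = None
    while True:
        for attempt in range(MAX_RETRIES + 1):
            status, headers, items, next_token = fetch_page(token)
            if status not in (429, 503):
                break
            if attempt == MAX_RETRIES:
                raise RuntimeError("throttled: retries exhausted")
            sleep(retry_delay(headers, attempt))
        if status != 200:
            raise RuntimeError(f"unexpected status {status}")
        yield from items
        if next_token is None:
            return
        token = next_token
```

A caller consumes it as `for sp in iter_items(fetch): process(sp)`, so only one page is held in memory at any point, regardless of how many service principals the tenant has.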

3.3 Analyst workflow — 2 items

Connector-finding lifecycle parity (original P0-11). src/api/routes/findings.ts:67 hardcodes status: "active" for connector findings; they have no acknowledge/FP/close/reopen path. Give them the same lifecycle as evaluator findings. Owner: Platform.
Finding assignment + audit trail (original P0-12). Wire acknowledged_by end-to-end, add comments collection + UI, allow remediated → active reopen with audit. Owner: Platform + UI.
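
One way to picture lifecycle parity is a single transition table shared by connector and evaluator findings, with every state change recorded. In the sketch below, the status names other than "active" and "remediated", the transition table itself, and all class and function names are hypothetical; the platform code is TypeScript and may model this differently.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, FrozenSet, List, Optional

# Hypothetical transition table; "remediated" -> "active" is the reopen path.
ALLOWED: Dict[str, FrozenSet[str]] = {
    "active": frozenset({"acknowledged", "false_positive", "remediated"}),
    "acknowledged": frozenset({"false_positive", "remediated"}),
    "false_positive": frozenset({"active"}),
    "remediated": frozenset({"active"}),
}


@dataclass
class AuditEvent:
    actor: str
    from_status: str
    to_status: str
    comment: str
    at: datetime


@dataclass
class Finding:
    id: str
    status: str = "active"
    acknowledged_by: Optional[str] = None
    audit: List[AuditEvent] = field(default_factory=list)


def transition(finding: Finding, to_status: str, actor: str,
               comment: str = "") -> None:
    """Apply a status change if allowed, and always record who did it."""
    if to_status not in ALLOWED.get(finding.status, frozenset()):
        raise ValueError(f"{finding.status} -> {to_status} is not allowed")
    finding.audit.append(AuditEvent(actor, finding.status, to_status,
                                    comment, datetime.now(timezone.utc)))
    if to_status == "acknowledged":
        finding.acknowledged_by = actor
    finding.status = to_status
```

Reopening a remediated finding then becomes an ordinary transition that leaves an audit entry, rather than a special case.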

3.4 Marketing / trust hygiene — 3 items, ~15 minutes total (plus Trust Summary)

Remove four EU-infrastructure claims from website (original P0-14, sv0-website#41):
- src/pages/index.astro:33 feature list
- src/pages/evaluate.astro:58 detail text
- src/pages/platform.astro:62-65 FAQ
- src/pages/platform.astro:345 metric card
Remove or wire "All Systems Operational" (original P0-15) — hardcoded footer string not connected to any monitor.
Trust & Controls Summary v1 (original P0-13 reframed, sv0-documentation#192). De-escalated but still needed. CEO + external counsel.

3.5 Quality-of-life (P1 — ship in pilot-week-1 if not sooner)

PDF / printable report export (currently Markdown-only; audit committees cannot consume .md).
Evidence pack SHA256 hash visible on UI (computed but hidden today).
Access-path nodes show display names, not truncated hex IDs (FindingDetail.tsx:79,87,95 — display names already on payload).
Connector-report findings show meaningful evidence, not empty JSON viewer.
Jira Cloud connector Phase 1 ready against MediaPro test env (sv0-connectors#72).
Tenant onboarding runbook + credential rotation runbook per connector.

3.6 Post-pilot follow-ups (weeks 3–6)

Compute migration to Azure VM or AWS VM per credit status (sv0-platform#493 Phase 2).
Session-stacking acting_as impersonation feature (3–4 days, gated on Permission.INTERNAL_IMPERSONATE).
Add Atlas Private Link, Terraform the stack, secrets to Key Vault / Secrets Manager.
Log aggregation / tenant-tagged search.
express-rate-limit IPv6 bypass patch.
HMAC on evidence pack integrity (SHA256-only today).
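
The last item deserves a short illustration. A bare SHA256 digest detects accidental corruption, but anyone able to alter an evidence pack can also recompute its digest. An HMAC binds the digest to a secret key, so tampering without the key is detectable. The sketch uses only the Python standard library; the function names are illustrative, and key storage and rotation are out of scope.

```python
import hashlib
import hmac


def sha256_digest(pack: bytes) -> str:
    """Integrity check as it works today: a plain, unkeyed digest."""
    return hashlib.sha256(pack).hexdigest()


def hmac_tag(pack: bytes, key: bytes) -> str:
    """Keyed tag: cannot be recomputed without the secret key."""
    return hmac.new(key, pack, hashlib.sha256).hexdigest()


def verify_pack(pack: bytes, key: bytes, tag: str) -> bool:
    """Constant-time comparison of the expected and presented tags."""
    return hmac.compare_digest(hmac_tag(pack, key), tag)
```

A pack modified after signing fails `verify_pack`, even if the attacker also replaces the published SHA256 value.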

4. Remediation calendar — three parallel tracks

Day 0 — single blocking action

Book WorkOS sales call. Ivan-owned. Nothing downstream unblocks until this call completes.

Track A — Platform / Ops / Connectors

Day	Action	Owner	Ref
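1	MongoDB for Startups application submitted. Externalize prod `MONGODB_URI`. External uptime monitor wired. Hetzner snapshots on. Strip the `tenant_id` label from `sv0_job_duration_seconds`. Gate `/metrics` and `/diagnostics` with a route-local bearer check (removing them from `PUBLIC_PATHS` alone is not sufficient). Session cookie `secure=true`.	Ivan + Ops	§3.1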
2	Atlas M10 cluster stood up. Default region: Frankfurt unless MediaPro confirms US-residency is acceptable. DB auth + IP allowlist + backups on. No traffic yet.	Ops	§2.2
2-3	WorkOS Phase 0/1 (`securityv0-internal` org, Google Workspace OAuth). `NODE_ENV=production` + `AUTH_PROVIDER=workos` in prod env. Kill dev-auth-bypass.	Platform	§2.1
3	Atlas cutover, 30-min maintenance window. `mongodump` → `mongorestore`. Flip `MONGODB_URI`. Smoke test.	Ops	§2.2
4	Seed 3–5 magic-link test accounts in demo tenant. Manual walkthrough as a client-role user.	Platform	§5
4-7	AWS multi-account enumeration validated against MediaPro test env.	Connectors	§3.2
5-8	Connector streaming pagination; Azure `Retry-After`.	Connectors	§3.2
6-9	Connector-finding lifecycle parity; finding assignment + audit.	Platform + UI	§3.3
8-11	Jira Cloud connector Phase 1 against MediaPro test env.	Connectors	§3.5
10-12	Tenant onboarding + credential rotation runbooks. Pilot provisioning dry-run.	Docs + Ops	§3.5

Track B — Client-facing

Day	Action	Owner
1-2	Isaac probes MediaPro on: prod vs staging pilot target, data-residency expectations, SSO requirements, compliance attestations.	Isaac
1-3	MediaPro requirements packet v1 drafted (Azure / AWS / Foundry / Copilot Studio / ServiceNow / Jira / Entra permissions and IAM role templates).	Ivan + Isaac
2-4	Use-case coverage doc: MediaPro's 6 concerns → our rules → gap list → Sergey's fast-track recommendations.	Ivan
5-7	Requirements packet sent to Ezequiel. Two-week credential-prep window starts.	Isaac
ongoing	Weekly sync with Ezequiel on credential prep progress.	Isaac

Track C — Trust / legal (de-escalated, parallel)

Day	Action	Owner
1	Remove four EU claims from website; remove / wire "All Systems Operational".	Website
2-7	Trust & Controls Summary v1 drafted, reconciled with Atlas + Hetzner + WorkOS infrastructure. Supersedes fragmented Inetum/Deloitte partner-prep content.	CEO
8-12	Sign-off, version-controlled commit, share with MediaPro alongside requirements packet.	CEO

If a track slips

Track A slips: pilot goes live with documented exceptions in the client agreement annex. Exceptions: unfinished connector resilience = "one-account pilot" scope, deferred assignment UI = "contact us" support model, etc.
Track B slips: MediaPro cannot prepare credentials in two weeks; pilot date pushes. Worse commercial outcome than a technical slip.
Track C slips: acceptable for MediaPro (DPO waived). Not acceptable for client #2.

5. Multi-user testing and impersonation strategy

Current state in code (audit 2026-04-22)

The impersonation scaffolding is more built-out than ADR-016's "proposed" status suggests. Present in src/api/auth/:

Permission.INTERNAL_IMPERSONATE defined and assigned to InternalRole.owner
AuthMethod union includes "impersonation"
workos-provider.ts maps WorkOS's "Impersonation" response
session.ts stores is_super_admin with 8-hour TTL
resolvePermissions() distinguishes super_admin_derived vs. membership sources

What's missing is the easy part: a POST /api/v1/admin/impersonate route, acting_as_user_id session field, UI banner, audit log emit. ~3–4 days of engineering.
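
The missing pieces can be sketched as a session-stacking model. It is shown in Python for brevity, while the real implementation would live in the TypeScript API. `acting_as_user_id` and the INTERNAL_IMPERSONATE permission come from this plan; the `Session` shape, the in-memory audit list, and the function names are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional, Set

IMPERSONATE = "INTERNAL_IMPERSONATE"
AUDIT_LOG: List[dict] = []  # stand-in for a real audit sink


@dataclass
class Session:
    user_id: str
    permissions: Set[str]
    acting_as_user_id: Optional[str] = None


def start_impersonation(session: Session, target_user_id: str) -> None:
    """Stack a target identity on top of the operator's own session."""
    if IMPERSONATE not in session.permissions:
        raise PermissionError("INTERNAL_IMPERSONATE required")
    if session.acting_as_user_id is not None:
        raise ValueError("already impersonating; stop first")
    if target_user_id == session.user_id:
        raise ValueError("cannot impersonate self")
    session.acting_as_user_id = target_user_id
    AUDIT_LOG.append({"event": "impersonation.start",
                      "actor": session.user_id, "target": target_user_id})


def stop_impersonation(session: Session) -> None:
    """Drop back to the operator's own identity and record it."""
    if session.acting_as_user_id is None:
        return
    AUDIT_LOG.append({"event": "impersonation.stop",
                      "actor": session.user_id,
                      "target": session.acting_as_user_id})
    session.acting_as_user_id = None


def effective_user_id(session: Session) -> str:
    """Identity used for data access; the banner shows when it differs."""
    return session.acting_as_user_id or session.user_id
```

Keeping the operator's own `user_id` on the session while impersonating means every audit entry can name both the actor and the target.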

Gating dependency

All multi-user testing paths require WorkOS Phase 0/1 live. Until then, REQUIRE_AUTH=false is in effect and the users / memberships / tenants collections don't exist.

Path comparison

Path	Cost	Fit	Verdict
Second Google Workspace	$35–70/mo + 1–2 days	Operationally clunky	Overkill
Users in `securityv0.com` Workspace	$0–70/mo	Domain-collision risk — users may inherit `is_super_admin` from `securityv0-internal` org	Dangerous; skip
Magic-link test accounts	$0 + 2–3 hours	Any email works; no IdP required	Recommended interim
WorkOS Dashboard Impersonation	Included (confirm with Sally)	Must trigger from admin.workos.com	Free fallback
sv0 session-stacking	$0 + 3–4 days engineering	Uses existing permission scaffolding	Recommended long-term (post-pilot)

Recommendation

Pilot-window: magic-link test accounts, seeded post-WorkOS-Phase-0 (Track A day 4). Week 1 post-pilot: build the acting_as feature — gated by existing Permission.INTERNAL_IMPERSONATE, with mandatory audit log and server-rendered banner.

6. Trust & Controls Summary v1 — contents

De-escalated from "legal blocker" to "credibility artifact." Still needed.

Named sub-processors: Hetzner (transitional pilot hosting), MongoDB Atlas (hosted on Azure or AWS per region choice), Cloudflare (Zero Trust + CDN), WorkOS (auth), GitHub (CI/CD + GHCR).
Data controller identity + DPA commitment timeline. Unsigned draft DPA attached as annex.
Breach notification SLA — 72 hours (GDPR-grade).
Deletion-on-termination guarantee with timeframe and verification.
Current security controls with honest caveats + remediation dates.
SOC 2 readiness timeline. Interim: completed CAIQ.
Named security contact + escalation path.

Tracked in sv0-documentation#192.

7. Issue tracking — standing readiness board

Umbrella tracker: sv0-documentation#195 ("EPIC: MediaPro pilot readiness") is the single pane of glass, with per-track checklists. Use that issue for status skimming; use this document for rationale and evidence.

Repo	Tracking issue	Covers
`sv0-platform`	#492 — epic: pre-client readiness P0s (platform)	Remaining platform P0s (after auth + DB auto-resolutions)
`sv0-platform`	#493 — Atlas cutover + compute migration	Atlas this week, compute post-pilot
`sv0-platform`	#373 — WorkOS auth rollout	Auth auto-resolutions
`sv0-platform`	#366 — capabilities endpoint	Settings hardcoded drift
`sv0-platform`	#494 — observability stack rollout	§3.1 — Grafana Cloud + BetterStack + MCP
`sv0-connectors`	#89 — epic: pre-client readiness (connectors)	§3.2 connector resilience
`sv0-connectors`	#32 — AWS multi-account	AWS scale
`sv0-connectors`	#72 — Jira Cloud Phase 1	MediaPro-offered test env
`sv0-website`	#41 — pre-client readiness (website)	EU claims + status string
`sv0-documentation`	#192 — Trust & Controls Summary v1	§6
`sv0-documentation`	#193 — MediaPro requirements packet	Track B
`sv0-documentation`	#194 — MediaPro use-case coverage	Sergey fast-track input

Appendix A: Original P0 inventory — what was found, how resolved

The v1.0 six-perspective review identified 15 P0 ship-blockers. Annotations reflect v1.4 architectural decisions.

#	P0	v2.0 status
P0-1	Prod runs `NODE_ENV=development` + `AUTH_PROVIDER=dev`	Active — fixed by WorkOS Phase 0/1 (§2.1)
P0-2	Attacker-controlled `x-tenant-id`	Auto-resolved once P0-1 lands; new middleware derives from JWT
P0-3	Hardcoded `DEV_COOKIE_PASSWORD`	Auto-resolved once `AUTH_PROVIDER=dev` is no longer in prod
P0-4	MongoDB runs with no authentication	In-flight, resolved once Atlas cutover verified against evidence checklist in §2.2
P0-5	`/metrics` publicly exposed + leaks tenant_id labels	Active; §3.1, ~45 min total (~15 min label strip, ~30 min route-local gating)
P0-6	No offsite backups; backups on same disk	In-flight, resolved once Atlas cutover verified (native PITR; evidence checklist §2.2)
P0-7	No external uptime monitor	Active — §3.1, 30 min fix
P0-8	AWS connector single-account only	Active — §3.2, pilot-blocking at MediaPro scale
P0-9	Connector pagination buffers in memory	Active — §3.2, OOM risk at scale
P0-10	Azure `retry_session()` doesn't honor `Retry-After`	Active — §3.2
P0-11	Connector findings stuck at `status: "active"`	Active — §3.3
P0-12	No assignment / comments / audit trail	Active — §3.3
P0-13	No canonical Trust & Controls package	Active but de-escalated — §2.4, MediaPro waived DPO
P0-14	Website EU-infrastructure claims (4 locations)	Active — §3.4, 5 min fix
P0-15	Hardcoded "All Systems Operational" footer	Active — §3.4, 5 min fix

Net: 2 in-flight via Atlas cutover (P0-4, P0-6, resolution gated on verification checklist), 2 in-flight via WorkOS Phase 0/1 (P0-2, P0-3), 11 active pilot work items.

Appendix B: Infrastructure options not taken

From Opus research agent (2026-04-22). Documented to audit the path not taken.

Option	1-mo cash	Why not
A. AWS t3.small + Atlas M10	$72/mo	Deferred to post-pilot per Opus change-risk argument
B. AWS t3.medium + Atlas M10	$87/mo	Same as A with more headroom
C. Azure Container Apps + Cosmos DB vCore	$0 cash, ~$210/mo credits	Cosmos weakens portability; explicit anti-pattern
D. Azure Container Apps + Atlas-on-Azure	~$57 cash + ~$60 credits	Container Apps = cloud-specific deployment semantics; breaks portability promise
E. Hetzner + Atlas M10 (chosen hybrid)	~$81/mo ($24 Hetzner + $57 Atlas)	Lowest-risk for pilot window, preserves cloud optionality post-pilot

Revision history

Date	Change
2026-04-21	v1.0 — initial six-perspective adversarial review, 15 P0s, single-track 14-day calendar
2026-04-22	v1.1 — Codex review applied: issue tracking table, P0-13 reframed, P0-14 expanded, Track A/B split
2026-04-22	v1.2 — MediaPro client context folded in; initial infra direction (migrate to AWS+Atlas during pilot)
2026-04-22	v1.3 — Opus + Sonnet research: deferred AWS migration to post-pilot; multi-user testing section added; Day-0 WorkOS sales call
2026-04-22	v1.4 — Codex review: split DB/compute decisions; Atlas cutover in-pilot; compute stays Hetzner; anti-patterns locked
2026-04-22	v2.0 — structural rewrite. Architectural decisions (WorkOS, Atlas, compute, trust) moved to the front. Active work regrouped by capability (observability, connectors, analyst workflow, marketing). Original 15-P0 inventory moved to Appendix A with per-item resolution status. Opus options table moved to Appendix B. Purpose: eliminate the incongruity between "MongoDB has no auth" style P0s and "we're migrating to Atlas" decisions they are pre-resolved by.
2026-04-22	v2.1 — Observability stack research completed (Opus subagent). Decision: Grafana Cloud free + BetterStack free + `grafana/mcp-grafana` MCP server. Rationale: best agentic-ops surface, zero ops burden, survives post-pilot compute migration, $0 at pilot scale. §3.1 rewritten to incorporate 5-day rollout; new issue sv0-platform#494. Full research in `docs/architecture/research/2026-04-22-observability-stack.md`.
2026-04-22	v2.2 — URL tenant-scoping research (Opus subagent) + Codex review applied: (1) Decided to keep `/t/:slug/` per ADR-016; Path 1 (add `SINGLE_TENANT_SLUG` env flag when first dedicated-deployment client signs) rather than subdomain migration. ADR-016 amended with cardinality verification (`route` label is parameterized, no URL-cardinality leak), enterprise-tool survey, revisit thresholds. (2) §3.1 corrected: removing `/metrics` from `PUBLIC_PATHS` alone is insufficient because `createSystemRoutes()` mounts at `app.ts:80` before auth middleware — requires route-local bearer-check on `/metrics` and `/diagnostics`. (3) "Resolved by Atlas" language softened to "in-flight, resolved once cutover verified" with explicit evidence checklist. (4) Atlas region language tightened: pending MediaPro confirmation, default EU/Frankfurt (aligns with Grafana Cloud region) unless US residency is explicitly approved.
2026-04-22	v2.3 — Umbrella epic sv0-documentation#195 opened as single-pane-of-glass tracker with per-track checklists. This review doc remains the source of rationale; the epic is for status skimming.

Owner and review cadence

Owner: Ivan Fofanov (CTO).

Review cadence: Twice-weekly status update against Track A/B/C items until MediaPro go-live. Close items with commit/PR reference, not silently.

Definition of done: Every §3 active item is closed with evidence, or documented as a client-agreement exception with Ezequiel's sign-off.
