Multi-Account End-to-End Implementation Plan
Parent epic: sv0-documentation#195 — MediaPro pilot readiness umbrella. This plan supplies the architectural depth for items #195 lists as "validate against MediaPro test env" without specifying how.
Tracking issue: sv0-documentation#198.
Revision-2 changelog (2026-04-23, post-team-decisions): all five open questions resolved by the team:
- Q1 (Lambda-secret-stored-Entra-SP correlation rule): defer post-pilot → locked as D18. Keeps v0 focused on V5/V7 proof; avoids a brittle cross-cloud rule before core stitching stabilises.
- Q2 (Stream 4 vs sv0-connectors#27): build alongside, share Terraform modules → locked as D19.
- Q3 (M1–M4 migration ordering): ship as their own PRs against existing bug issues → locked as D20. Stream 3 Phase 1 scope reduces to M5–M12. Round 0 dispatches four small parallel PRs against sv0-platform#488/#485/#487/#383.
- Q5 (Terraform state backend): single backend in mp-security → locked as D21.
- Team also explicitly agreed with declining the CEO reviewer's M11/M12 deferral recommendation: the schema-level correlation/lineage fields are load-bearing; only the UI card (Phase 6) belongs in "full."
- Coord-7 (sv0-connectors#32 AC amendment for "never chains STS") filed on the issue on 2026-04-23.
Plan is now at pre-kickoff state. All open questions closed, all blockers/majors addressed, reviewer sign-off applied.
Revision-1 changelog (2026-04-22, post-review): Adversarial reviewer (PR #199 comment) and ceo-reviewer (PR #199 comment) returned with two blockers, three majors, and a convergent v0 scope reframe. Applied fixes:
- B1 (scope shape conflict): locked ScanScopeDoc.scope_keys = { account_ids: string[], regions: string[] } (always plural) with service_categories[] outside scope_keys. Updated Stream 1 schema example, Stream 1 task example, Stream 1 interface section, Stream 2 scope object example, Stream 2 OQ #1 closure.
- B2 (Q4 ↔ Stream 1 OQ #6 contradiction): promoted Q4 to D15 (minimal ConnectorReport wrap in Stream 1 Phase 4; Task 4.6 added; closes Stream 1 OQ #6).
- M1 (V6 weak link in v0): V6 moves to "full" (post-pilot). v0 now ships V1–V5 + V7 + V8. D10 reframed; V6 row in validation matrix updated; demo headline becomes V5 (cross-account, no fragile rule).
- M2 (Stream 2 spec deviation): explicit spec-deviation banner added to Stream 2 doc; Coord-7 added to file the sv0-connectors#32 AC amendment.
- M3 (Round 4 intra-round dep): Stream 4 Phase 1 split into 1a (no IAM, Round 4) and 1b (uses Stream 2 Phase 5 module, Round 5). Round 4 close gate: Stream 2 Phase 5 ships.
- M5 (D14 colon-suffix hand-wave): D14 rewritten as a concrete Phase 1 task (verify + fix-if-needed in src/ingestion/diff-engine/evidence-completeness.ts).
- CEO-cuts (D16, D17): Stream 4 Phase 8 (Foundry Terraform in MediaPro Lab 2) and Stream 4 Scenario B2 and Stream 3 Phase 5 (debuggability + opt-out APIs) defer from v0 to full. v0 reuses the live dev/prod Foundry tenant for V7.
- CEO-flagged "trivial migration" overclaim: corrected to "naming reuse, not type reuse" (Stream 1 introduces a new ScanScopeDoc collection that follows the existing in-memory ScanScope shape conventions but is structurally new).
- CEO-flagged partner-facing missing pieces: new "Partner-facing follow-on items" section added; three items filed under #195, not under this architectural plan.
TL;DR
The MediaPro pilot needs SecurityV0 to render a single authority graph spanning multiple AWS accounts × Entra × ServiceNow × Azure Foundry. Today the platform can't, because three architectural pieces are missing: per-tenant connector control, multi-account AWS scanning, and cross-connector graph stitching. This plan sequences the four work-streams — three architectural, one demo-as-validation-gate — that together close the gap, retire the placeholder Nimbus Enterprise brand in favor of MediaPro, and produce the demo that closes the May pilot.
Why one umbrella plan, not four PRs
The four streams are entangled. Lab 2 cannot be built credibly until Streams 1, 2, and 3 land — the 2026-04-08 demo lab plan already named this gate. Tracking the streams under one umbrella keeps the dependency chain visible, lets reviewers reason about cross-stream contracts in one place, and produces one coherent artifact partners can read.
The work itself ships across sv0-platform, sv0-connectors, and sv0-demo-labs — but the plans stay together in this repo so the architectural shape stays trackable. Implementation PRs in those repos reference back to the relevant phase in the relevant substream doc.
The four streams
Stream 1 — Connector control & execution architecture
2026-04-22-connector-control-execution-architecture.md · 642 lines · 22 tasks · 4 phases (revision-1: +1 minimal-wrap task per D15)
Per-tenant ConnectorInstance documents (1..N per tenant per connector kind), first-class persisted ScanScope (each scope a unit of scan work), ScanRun history with per-category structured errors, in-process Mongo-claim scheduler. Replaces today's manual laptop invocations with a multi-tenant, observable, per-scope control plane. Naming reuse, not type reuse: Stream 1 introduces a new ScanScopeDoc MongoDB collection that follows the shape conventions of the existing in-memory ScanScope type at src/ingestion/types.ts:92 (mode + sourceSystems + scannedEntityTypes + errors), but is structurally a new persisted document with tenant_id, instance_id, scope_keys, service_categories, schedule, and budget fields. The two coexist — the in-memory type continues to serve the diff engine and circuit breaker; the new doc serves the control plane. Migration is straightforward but not literally a no-op (adversarial reviewer's correction to revision-0's "trivial migration" claim).
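To make the persisted scope document concrete, here is a minimal TypeScript sketch. Only the field names (tenant_id, instance_id, scope_keys, service_categories, schedule, budget) come from this plan; the field types, the untyped schedule and budget placeholders, and the unknownServiceCategories helper are assumptions for illustration, not the Stream 1 schema.

```typescript
// Hypothetical shapes: field names are from the plan, types are assumed.
interface ScanScopeDoc {
  tenant_id: string;
  instance_id: string;
  // Opaque to the platform; AWS keys are plural arrays even at length 1.
  scope_keys: { account_ids: string[]; regions: string[] };
  // Top-level field, deliberately outside scope_keys.
  service_categories: string[];
  schedule?: unknown; // shape not specified in this plan
  budget?: unknown; // shape not specified in this plan
}

// Returns the requested categories that the connector instance did not advertise.
function unknownServiceCategories(
  scope: ScanScopeDoc,
  available: readonly string[],
): string[] {
  const known = new Set(available);
  return scope.service_categories.filter((c) => !known.has(c));
}

const exampleScope: ScanScopeDoc = {
  tenant_id: "example-tenant",
  instance_id: "example-instance",
  scope_keys: { account_ids: ["111111111111"], regions: ["us-east-1"] },
  service_categories: ["iam", "lambda", "not_a_category"],
};

const rejectedCategories = unknownServiceCategories(exampleScope, ["iam", "lambda", "s3"]);
console.log(rejectedCategories);
```

A scope whose rejected list is non-empty would presumably be refused at creation time, which is one way to read the "open-string + validated" service-category approach.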
Defines for the others: ConnectorInstanceDoc, ScanScopeDoc, ScanRunDoc schemas; ConnectorSyncDoc.connector_instance_id link; HTTP API (POST /api/v1/connector-instances, POST /api/v1/scan-runs, DELETE /api/v1/connector-instances/:id) for IaC up/scan/teardown.
Stream 2 — Multi-account AWS connector architecture
2026-04-22-multi-account-aws-connector-architecture.md · 540 lines · 19 tasks · 6 phases
Two-mode auth (Organizations auto-discovery via AWS_ORGANIZATION_ROLE_ARN, or explicit --accounts). Never chains STS — every per-account assume happens directly from bootstrap creds. Twelve service categories (iam, lambda, bedrock, ecs_ecr, step_functions, eventbridge, s3, secrets, dynamodb_sns, cloudtrail, access_analyzer, config). (account × category) is the unit of work; ThreadPoolExecutor capped at 2 concurrent cells per account. CFN StackSet template + parallel Terraform module deploy SecurityV0ReadOnly to N accounts via SERVICE_MANAGED OrganizationalUnitIds. Estimated cost: ~10K AWS API calls / Lab-2 scan, $0.05 CloudTrail charge.
Consumes from Stream 1: ConnectorInstance.scanScope shape (scope_keys = {account_ids[], regions[]}, with service_categories[] as a top-level field outside scope_keys, per the revision-1 B1 fix), ScanRun partial-success semantics.
Defines for Stream 3: sourceFingerprint = sha256(source_system_id : source_record_id : source_field_path) on every node/edge — deterministic stable join key. New node types: aws_ou, external_aws_account, external_oidc_provider (provider URL contains <tenant_id> for Entra-SP correlation). BELONGS_TO (workload→account, account→OU) and TRUSTS (with boundary: cross_account and trustPolicyHash) edges.
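The fingerprint formula above can be sketched in a few lines of TypeScript using Node's built-in crypto module. The function name and the choice to join the three parts with a literal ":" before hashing are assumptions; the plan only states the three inputs and the use of SHA-256.

```typescript
import { createHash } from "node:crypto";

// Assumption: the three parts are joined with ":" and hashed as UTF-8.
function sourceFingerprint(
  sourceSystemId: string,
  sourceRecordId: string,
  sourceFieldPath: string,
): string {
  const input = [sourceSystemId, sourceRecordId, sourceFieldPath].join(":");
  return createHash("sha256").update(input, "utf8").digest("hex");
}

// Placeholder ARNs; the same inputs always yield the same key.
const fpA = sourceFingerprint("aws", "arn:aws:iam::111111111111:role/example", "trust_policy");
const fpB = sourceFingerprint("aws", "arn:aws:iam::111111111111:role/example", "trust_policy");
const fpC = sourceFingerprint("aws", "arn:aws:iam::222222222222:role/example", "trust_policy");
console.log(fpA === fpB, fpA === fpC);
```

Because record IDs such as ARNs contain colons themselves, a plain colon join does not guarantee that distinct triples produce distinct hash inputs; if that matters in practice, length-prefixing each part would be one way to remove the ambiguity.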
Defines for Stream 4: CFN StackSet cfn/securityv0-readonly-role-stackset.yaml + Terraform module sv0-demo-labs/shared/securityv0-spoke-role/. Permission boundary SecurityV0ReadOnlyBoundary explicit-denies write verbs.
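As an illustration of the trust relationship on the SecurityV0ReadOnly role, the sketch below builds an IAM trust policy document that requires both sts:ExternalId and aws:PrincipalOrgID, as decision D8 mandates. The helper name and every value passed to it (principal ARN, external ID, organization ID) are placeholders; this is not the content of the StackSet template or the Terraform module.

```typescript
// Hypothetical helper; all argument values used below are placeholders.
function buildSpokeTrustPolicy(
  bootstrapPrincipalArn: string,
  externalId: string,
  orgId: string,
) {
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Effect: "Allow",
        Principal: { AWS: bootstrapPrincipalArn },
        Action: "sts:AssumeRole",
        // Both keys sit under one StringEquals, so both must match.
        Condition: {
          StringEquals: {
            "sts:ExternalId": externalId,
            "aws:PrincipalOrgID": orgId,
          },
        },
      },
    ],
  };
}

const trustPolicyJson = JSON.stringify(
  buildSpokeTrustPolicy(
    "arn:aws:iam::111111111111:role/example-bootstrap",
    "example-external-id",
    "o-exampleorgid",
  ),
  null,
  2,
);
console.log(trustPolicyJson);
```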
Stream 3 — Cross-connector graph stitching architecture
2026-04-22-cross-connector-graph-stitching-architecture.md · 674 lines · 30 tasks · 6 phases
Status name canonicalized to succeeded per code (sv0-platform/src/domain/scan-runs/types.ts), Codex Tier 2 #706.
Extends, not supersedes the 2026-02-26 correlation research — generalizes its endpoint-URL bridging to a rule registry + first-class correlations collection + canonical_identity_id equivalence-class traversal that covers AWS multi-account federation. New stitch_ingestion worker job runs post per-connector sync_ingestion, pre evaluate_findings, async, debounced 60s per tenant, at-most-one-in-flight. Triggered by ScanRun.status=succeeded. Twelve schema migrations (M1–M12). Seven initial correlation rules.
Consumes from Stream 1: ScanRun.status=succeeded trigger semantics.
Consumes from Stream 2: sourceFingerprint shape, external_oidc_provider.providerUrl join key, trustPolicyHash on cross-account TRUSTS edges.
Schema migrations are the critical path. M1 (connector_id → connector_owners[], closes #488), M2 (property_provenance map, closes #485), M3 (atomic upsert, #487), M4 (first-class human_identity entity_type, #383) are independently shippable as Option A from the existing sv0-platform#486 epic. M5–M10 add the stitching collections. M11–M12 add correlationKeys[] and lineage_records[] to NormalizedNode.
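The stitch_ingestion trigger semantics described in this section (debounced 60s per tenant, at-most-one-in-flight) can be sketched as a small state machine. This is an in-memory illustration under assumptions: the class and method names are hypothetical, the debounce is read as "wait until 60s have passed since the latest succeeded ScanRun", and a real worker would persist this state rather than hold it in a Map.

```typescript
const DEBOUNCE_MS = 60_000;

interface TenantStitchState {
  lastTriggerAt: number | null; // latest succeeded ScanRun not yet stitched
  inFlight: boolean;
}

class StitchDebouncer {
  private readonly states = new Map<string, TenantStitchState>();

  private stateFor(tenantId: string): TenantStitchState {
    let s = this.states.get(tenantId);
    if (!s) {
      s = { lastTriggerAt: null, inFlight: false };
      this.states.set(tenantId, s);
    }
    return s;
  }

  // Called when a ScanRun reaches status=succeeded; restarts the quiet period.
  onScanRunSucceeded(tenantId: string, nowMs: number): void {
    this.stateFor(tenantId).lastTriggerAt = nowMs;
  }

  // Polled by the worker; true means "start a stitch run for this tenant now".
  tryStart(tenantId: string, nowMs: number): boolean {
    const s = this.stateFor(tenantId);
    if (s.inFlight || s.lastTriggerAt === null) return false;
    if (nowMs - s.lastTriggerAt < DEBOUNCE_MS) return false;
    s.inFlight = true;
    s.lastTriggerAt = null;
    return true;
  }

  onStitchFinished(tenantId: string): void {
    this.stateFor(tenantId).inFlight = false;
  }
}

const debouncer = new StitchDebouncer();
debouncer.onScanRunSucceeded("t1", 0);
const tooEarly = debouncer.tryStart("t1", 30_000);
debouncer.onScanRunSucceeded("t1", 40_000);
const stillQuiet = debouncer.tryStart("t1", 70_000);
const started = debouncer.tryStart("t1", 100_000);
debouncer.onScanRunSucceeded("t1", 110_000);
const blockedByInFlight = debouncer.tryStart("t1", 200_000);
debouncer.onStitchFinished("t1");
const startedAgain = debouncer.tryStart("t1", 200_000);
console.log(tooEarly, stillQuiet, started, blockedByInFlight, startedAgain);
```

A scan that succeeds while a stitch run is in flight is remembered rather than dropped, so the next run picks it up once the current one finishes.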
Stream 4 — MediaPro Lab 2 demo plan
2026-04-22-mediapro-lab2-demo-plan.md · 627 lines · 32 tasks · 10 phases
The killer demo and the validation gate for Streams 1+2+3. MediaPro is "Nimbus Cloud 18 months later" — post-Series-C streaming-media SaaS, ~600 employees, recently-hired CISO Priya Reyes, three AWS accounts (mp-security, mp-workloads, mp-data) under one MediaPro OU, Entra/ServiceNow adopted, Azure Foundry agents in production. Six scenarios from 12-deployment-approval.md (B1, B2, B3, M2, X3, X4) covered, including Sergey's canonical Auto-route identity tickets → sn-ticket-router → gpt-nano-for-summary Foundry path. Full IaC: up.sh → connectors scan → down.sh. Cost envelope $0.40–$1.50 per cycle, well under the $5 target.
Consumes from Streams 1+2+3: the entire stack. Defines the validation matrix that closes the umbrella plan (V1–V14, see Cumulative validation matrix).
Retires the legacy sv0-demo-labs/labs/jira-mediapro/ lab (Jira de-prioritized — limited integration capabilities). Replaces the Nimbus Enterprise brand in 2026-04-08-demo-lab-plan.md Lab 2 section.
Cross-stream contracts (alignment proof)
Tenant slug naming (post-pilot): The canonical multi-AWS demo tenant slug is enterprise-nimbus. Earlier revisions of this plan and code referenced demo-mediapro; that slug was retired after the early-May 2026 MediaPro pilot in favor of a brand-neutral name. The lab brand itself ("MediaPro Lab 2", filesystem path labs/mediapro/, scripts, scenario names) is unchanged.
| Producer | Artifact | Consumer | Status |
|---|---|---|---|
| Stream 1 | ScanScopeDoc.scope_keys = { account_ids: string[], regions: string[] } (opaque to platform; AWS keys always plural arrays even at length 1) plus top-level ScanScopeDoc.service_categories: string[] (validated against ConnectorInstance.discovered_capabilities.service_categories_available) | Stream 2 | ✓ Locked in revision-1. Originally Stream 1 had account_id singular and Stream 2 had account_ids plural; revision-1 settles on plural. service_categories lives outside scope_keys, not inside. (Adversarial reviewer B1 fixed.) |
| Stream 1 | ScanRunDoc partial-success ("partial unless all categories failed") | Stream 2 | ✓ Stream 2 confirms the assumption |
| Stream 1 | ScanRun.status=succeeded event | Stream 3 | ✓ Stream 3 stitcher debounces on this event |
| Stream 1 | HTTP API (POST /connector-instances, POST /scan-runs, DELETE …) | Stream 4 | ✓ Stream 4 IaC up.sh/scan.sh/down.sh use this surface |
| Stream 2 | sourceFingerprint on every node/edge | Stream 3 | ✓ Stream 3 correlation rules use it as stable join key |
| Stream 2 | external_oidc_provider.providerUrl containing <tenant_id> | Stream 3 | ✓ matches Stream 3's aws-oidc-federation-to-entra-sp rule |
| Stream 2 | external_aws_account placeholder for unscanned trust targets | Stream 3 | ✓ Stream 3 can back-fill when later scan brings the account in-scope |
| Stream 2 | CFN StackSet + Terraform securityv0-spoke-role module | Stream 4 | ✓ Stream 4 deploys via for_each over account IDs |
| Stream 3 | M1–M4 schema migrations | Streams 2, 4 | ✓ Streams 2, 4 build on the post-M4 EntityDoc shape |
| Stream 3 | M11 NormalizedNode.correlationKeys[] | Stream 2 | ✓ Stream 2 emits the keys; Stream 3 consumes them |
| Stream 3 | M12 NormalizedNode.lineage_records[] | Stream 2 | ✓ Stream 2 emits source-record provenance |
| Stream 3 | Stitched-path validation contract | Stream 4 | ✓ Stream 4 V5–V8 use these paths as acceptance tests |
No contract conflicts surfaced.
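The ScanRunDoc partial-success contract in the table above ("partial unless all categories failed") reduces to a three-way status rule. The sketch below is one reading of it; the CategoryResult shape and function name are hypothetical, and treating a run with zero categories as succeeded is an assumption the plan does not address.

```typescript
type ScanRunStatus = "succeeded" | "partial" | "failed";

// Hypothetical per-category outcome record.
interface CategoryResult {
  category: string;
  ok: boolean;
}

function deriveScanRunStatus(results: readonly CategoryResult[]): ScanRunStatus {
  const failed = results.filter((r) => !r.ok).length;
  if (failed === 0) return "succeeded"; // also covers the empty case (assumption)
  if (failed === results.length) return "failed";
  return "partial";
}

console.log(
  deriveScanRunStatus([
    { category: "iam", ok: true },
    { category: "bedrock", ok: false },
  ]),
);
```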
Cross-stream architectural invariants
These hold across all four streams and are non-negotiable:
- Read-only. No connector ever writes back to a source system. Stream 2's IAM permission boundary explicit-denies write verbs.
- Deterministic. No ML, no probabilistic correlation, no fuzzy matching above defined confidence thresholds. Stream 3's correlation rules are deterministic; rule confidence is a compile-time property.
- MongoDB-only (per ADR-003). No graph DB introduction. All new collections (connector_instances, scan_scopes, scan_runs, correlations, stitch_runs, stitch_audit, tenant_correlation_settings, stitched_paths) are MongoDB.
- Per-tenant isolation. Every new collection is tenant-scoped via tenant_id. Stream 3's correlation settings are per-tenant. Stream 1's ConnectorInstance is per-tenant.
- Source lineage preserved. Every property on every entity carries (connector_id, source_record_id, observed_at). Stream 3's M2 property_provenance is the durable form. An analyst can always answer "where did this fact come from."
- Per-(account × category) failure isolation. A cell failure does not abort the parent ScanRun. Failed categories tag NormalizedGraph.scanScope.scannedEntityTypes so the existing circuit breaker treats their domain as out-of-scope (not deleted).
- IaC up/scan/teardown supported throughout. Every operational primitive needed for Stream 4's lifecycle scripts is exposed via the platform HTTP API or connector CLI. No manual click steps.
Cross-stream decisions (resolved in this plan)
These are the calls I'm making. They override the corresponding "open question" entries in the substream docs unless the team explicitly disagrees during review.
| # | Decision | Rationale |
|---|---|---|
| D1 | Subprocess driver for Phase 1. Side-car container deferred post-pilot. | Stream 1 default. MediaPro pilot doesn't need it; faster to ship. Side-car becomes its own research stream once the credential-broker work starts. |
| D2 | Service-category enum is open-string + validated against discovered_capabilities. | Stream 1 preference. Lets connectors declare their categories instead of forcing a platform-side enum migration every time a category is added. |
| D3 | In-process scheduler with atomic Mongo claim, single-process for MediaPro v0. Multi-process leader-election deferred. | Sufficient for one tenant in pilot. Multi-process work tied to sv0-platform#309 post-pilot. |
| D4 | mcp-server-to-entra-sp correlation rule ships as opt-in, enabled for MediaPro tenant explicitly. | Stream 3 flagged false-merge risk for the rule. Pilot tenant's MCP wiring is known; we accept the risk for one tenant. Default-off for all other tenants until we see post-pilot data. |
| D5 | BRIDGES_TO (not SAME_ENTITY) for MCP-server-to-Entra-SP correlation. | Conservative: the Lambda uses the Entra SP credential, but they're not the same identity in the strict sense. BRIDGES_TO preserves authority-path connectivity without merging the entities. |
| D6 | Equivalence classes spanning different entity_type resolved by survivorship rule: most-specific type wins (e.g., entra_service_principal beats generic workload). | Stream 3 open question. Survivorship is deterministic and explainable to analysts. |
| D7 | Interval-disagreement across connectors is a finding-layer concern, not a stitcher concern. | Stream 4 surfaced this. The stitcher merges identities; whether observation windows overlap is an evaluator-rule problem (file as a follow-up post-pilot). |
| D8 | aws:PrincipalOrgID condition on the SecurityV0ReadOnly trust policy is mandatory alongside sts:ExternalId. | Stream 2 open question. Defense-in-depth; required for MediaPro since they have an org. |
| D9 | TRUSTS edge direction: trusted → trusting (i.e., the role being assumed is the source). | Stream 2 open question. Matches the AWS mental model ("X trusts Y to assume me") and aligns with how the path materializer already handles RUNS_AS edges. |
| D10 | MediaPro Lab 2 v0 ships V1–V5 + V7 + V8. V6 (cross-Lambda MCP stitched path) moves to "full" because Stream 3's mcp-server-to-entra-sp rule is opt-in / known-fragile (D4) and V6 was the weakest demo link. Full (V1–V14) is the post-pilot follow-on (end of May target). | Stream 4 timeline + adversarial reviewer M1 + CEO reviewer convergent finding: V6 in v0 was committing to a path the plan itself documents as a "weak link." V5 (pure cross-account, no fragile rule) becomes the v0 demo climax; V7 (Foundry path, already live on dev/prod) carries the headline; V8 (stitched identity via the OIDC federation rule, which is solid) proves stitching works. |
| D16 | MediaPro Lab 2 v0 reuses the existing dev/prod Foundry tenant for V7 (the live servicenow-openai-client → gpt-nano-for-summary path) instead of standing up Foundry resources in the MediaPro lab. Stream 4 Phase 8 (Foundry Terraform) defers to "full". The early-May demo switches tabs from MediaPro Lab 2 → existing dev/prod Foundry env to show V7. | CEO reviewer cut. The Foundry path already ships on dev/prod (per 2026-04-20 session note) and standing up duplicate Foundry resources in MediaPro Lab 2 adds Terraform complexity without visibly improving the buyer story. |
| D17 | Stream 4 Scenario B2 (Multi-Agent Supervisor) and Stream 3 Phase 5 (debuggability + opt-out APIs) defer from v0 to full. | CEO reviewer cuts. B2 is not on the v0 demo walkthrough (V9 covers it in full). Phase 5 is operational, not buyer-facing — V10 (per-tenant rule opt-out via DB toggle) is in full already; the API surface around it can wait. |
| D18 | Lambda-secret-stored-Entra-SP correlation rule defers post-pilot (resolves Q1). Reason: keep v0 focused on V5 / V7 proof; don't add a brittle cross-cloud correlation rule before the core stitching (OIDC federation + client_id anchor) is stable. Files as a follow-on under the Stream 3 Phase 6 post-pilot work. | Locks in the weak-link avoidance D10 already committed to. |
| D19 | Stream 4 builds alongside Victor's sv0-connectors#27, sharing Terraform modules; does not block on it (resolves Q2). Extract MCP server and cross-system trust modules into sv0-demo-labs/modules/ so both labs import them. Coordinate interfaces, not schedules. | Keeps both labs independently demoable while avoiding code duplication. |
| D20 | Stream 3 migrations M1–M4 ship as their own PRs against the existing bug issues (sv0-platform#488, #485, #487, #383), not as Stream 3 Phase 1 sub-tasks (resolves Q3). Each M1–M4 closure comment references the umbrella plan + its Stream 3 phase. Stream 3 Phase 1 scope reduces to M5–M12. | Reduces review blast radius, de-risks Phase 1, closes existing bug backlog in the same pass, and lets Round 0 dispatch four small PRs in parallel. |
| D21 | MediaPro Lab 2 Terraform uses a single state backend in mp-security (S3 bucket + DynamoDB lock table), one backend for all three accounts (resolves Q5). Per-account backends revisited only if cross-account isolation becomes a real operational constraint. | Simpler operationally; matches the MediaPro Lab 2 mental model of "mp-security is the audit/ops account." |
| D11 | Lab AWS accounts are persistent; only resources within them are torn down between demos. | Stream 4 surfaced AWS 90-day account-closure cool-down. Cost envelope is on the resources, not the accounts. |
| D12 | Per-lab ServiceNow PDI to avoid data bleed between MediaPro and the Sergey-canonical-flow scenarios. | Stream 4 open question. PDIs are free; isolation is cheap. |
| D13 | Buy Entra ID P1 for the demo tenant before MediaPro pilot. $30–60/mo against the $5K Mercury credit, per sv0-connectors#86. | Stream 4 surfaced; necessary for sign-in log fidelity. Tracked separately, no blocker. |
| D14 | evidenceCompleteness keys use the format <service_category>:<account_id> with a single colon delimiter. Concrete plan: Stream 1 Phase 1 Task 1.4 adds a unit test on the existing diff engine asserting that a key like aws_iam:111111111111 is parsed correctly and treated as a distinct evidence key (not collapsed with aws_iam:222222222222). If the test fails, Stream 1 Phase 1 also adds the parser fix in src/ingestion/diff-engine/evidence-completeness.ts (estimated ≤ 30 LoC) before Phase 1 closes. No deferral, no hand-wave — verification + fix-if-needed is in Phase 1 scope. | Stream 2 open question + adversarial reviewer M5. The original D14 wording deferred verification to Stream 3 Phase 1, which is wrong — Stream 2 Phase 4 emits these keys, so verification has to land at-or-before Stream 2 Phase 4, which means inside Stream 1 Phase 1. |
| D15 | ConnectorReport (legacy entra-servicenow + foundry path) gets a minimal wrap, not a refactor, in Stream 1 Phase 4. The connector keeps emitting ConnectorReport exactly as today; Stream 1 Phase 4 adds a single platform-side adapter that, on receipt of a ConnectorReport, looks up the matching ConnectorInstance (by tenant_id + connector_kind), creates a ScanRun document with status=succeeded (or partial/failed based on ConnectorReport.errors), and back-links the existing ConnectorSync record. No connector code changes; no breaking pipeline changes. | Resolves the umbrella Q4 ↔ Stream 1 OQ #6 contradiction the adversarial reviewer surfaced. Minimal wrap makes V2 ("all four ScanRuns succeed") verifiable without forcing entra-servicenow / foundry connectors to learn the full instance/scope CLI in this stream. Full CLI migration of those connectors stays deferred (post-pilot). |
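The D14 key format, <service_category>:<account_id> with a single colon delimiter, can be illustrated with a small parser. This is a sketch only, not the code in src/ingestion/diff-engine/evidence-completeness.ts; the EvidenceKey shape, the function name, and the handling of keys without an account suffix are assumptions.

```typescript
interface EvidenceKey {
  serviceCategory: string;
  accountId: string | null; // null when the key has no account suffix (assumption)
}

// Splits on the first colon only, so the category never absorbs the account ID.
function parseEvidenceKey(key: string): EvidenceKey {
  const idx = key.indexOf(":");
  if (idx === -1) return { serviceCategory: key, accountId: null };
  return { serviceCategory: key.slice(0, idx), accountId: key.slice(idx + 1) };
}

const keyA = parseEvidenceKey("aws_iam:111111111111");
const keyB = parseEvidenceKey("aws_iam:222222222222");
console.log(keyA, keyB);
```

The property the Task 1.4 unit test is meant to assert is that the two keys share a category but remain distinct evidence keys, which here shows up as differing accountId values.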
Cross-stream open questions (all resolved — pre-kickoff state)
All open questions have been resolved by the team. The table is preserved for traceability; see D15 and D18–D21 in the decisions table for the binding resolutions.
| # | Question | Resolution |
|---|---|---|
| Q1 | Lambda-secret-stored-Entra-SP correlation rule | Resolved as D18 (revision-2): defer post-pilot. |
| Q2 | Coordination with Victor's sv0-connectors#27 | Resolved as D19 (revision-2): build alongside, share Terraform modules. |
| Q3 | Schema migration ordering (M1–M4) | Resolved as D20 (revision-2): ship M1–M4 as their own PRs against existing bug issues. |
| Q4 | ConnectorReport wrapping | Resolved as D15 (revision-1): minimal platform-side wrap in Stream 1 Phase 4 Task 4.6. |
| Q5 | Terraform state backend for Lab 2 | Resolved as D21 (revision-2): single backend in mp-security. |
Critical path
The four streams sequence into eight rounds (Round 0 through Round 7). Items in the same round can run in parallel. Dotted lines indicate cross-stream contracts that must be settled before the next round opens.
Round 0 — Bug-fix unblock (independent PRs)
├─ Stream 3 M1: connector_id → connector_owners[] (closes sv0-platform#488)
├─ Stream 3 M2: property_provenance map (closes sv0-platform#485)
├─ Stream 3 M3: atomic aggregation-pipeline upsert (closes sv0-platform#487)
└─ Stream 3 M4: first-class human_identity entity_type (closes sv0-platform#383)
⋮
Round 1 — Foundations (parallel)
├─ Stream 1 Phase 1: data model + storage (4 tasks)
└─ Stream 3 Phase 1: remaining schema migrations M5–M12 (7 tasks)
⋮ Stream 1 publishes ConnectorInstance/ScanScope/ScanRun schemas
⋮ Stream 3 publishes correlationKeys[]/lineage_records[] NormalizedNode shape
Round 2 — Connector + scope execution (parallel)
├─ Stream 1 Phase 2: scoped scan execution (7 tasks)
└─ Stream 2 Phases 1–3: org discovery + service-category scoping + parallel exec (11 tasks)
⋮ Stream 1 publishes scan-execution semantics
⋮ Stream 2 publishes sourceFingerprint, node/edge shapes
Round 3 — Scheduling + stitching engine + cross-account emission (parallel)
├─ Stream 1 Phase 3: scheduling (5 tasks)
├─ Stream 2 Phase 4: cross-account node/edge emission (5 tasks)
└─ Stream 3 Phase 2: rule engine + 7 initial rules (9 tasks)
⋮
Round 4 — Pipeline integration + StackSet + Lab 2 bootstrap (parallel; Stream 4 splits)
├─ Stream 2 Phase 5: StackSet template + Terraform module (4 tasks)
├─ Stream 3 Phase 3: stitch_ingestion worker pipeline integration (5 tasks)
└─ Stream 4 Phase 1a: Lab 2 account bootstrap (Organizations create-account + OU only — no IAM bootstrap; 3 tasks)
⋮ Round 4 close gate: Stream 2 Phase 5 SHIPS before Round 5 opens.
⋮ Stream 4 Phase 1b (per-account skeleton, IAM via SecurityV0ReadOnly module) starts in Round 5 — depends on Stream 2 Phase 5's Terraform module being importable.
Round 5 — Re-materialization + Lab 2 construction (parallel)
├─ Stream 1 Phase 4: CLI provisioning + minimal ConnectorReport wrap + migration (6 tasks; per D15)
├─ Stream 2 Phase 6: hardening + permission boundary (3 tasks)
├─ Stream 3 Phase 4: re-materialization (4 tasks; addresses sv0-platform#491)
└─ Stream 4 Phase 1b + Phases 3–7: per-account skeleton + mp-workloads + mp-data + mp-security + Entra + ServiceNow (14 tasks; Phase 8 Foundry deferred to "full" per D16; B2 scenario deferred per D17)
⋮
Round 6 — Demo lifecycle (v0 close)
└─ Stream 4 Phases 9–10: up/scan/down lifecycle scripts + demo + validation (9 tasks)
⋮ v0 GATE: V1–V5, V7, V8 pass against MediaPro Lab 2.
⋮ Stream 3 Phase 5 (debuggability + opt-out APIs) deferred to Round 7 per D17.
Round 7 — Full Lab 2 (post-pilot)
├─ Stream 3 Phase 5: debuggability + opt-out APIs (4 tasks; deferred from Round 6 per D17)
├─ Stream 3 Phase 6: stitched-identity card UI (1 task)
├─ Stream 4 Phase 8: Foundry Terraform in MediaPro Lab 2 (deferred from Round 5 per D16)
├─ Stream 4 Scenario B2: Multi-Agent Supervisor (deferred per D17)
├─ Stream 3 + 4 follow-on: Q1's `lambda-secret-stored-entra-sp` correlation rule (closes V6)
└─ Run MediaPro Lab 2 end-to-end; verify all V1–V14 cumulative validation criteria pass
Critical-path stream: Stream 3 (graph stitching). Schema migrations M1–M12 are the longest single chain and gate Stream 4's V5–V8.
Parallelism budget: at peak (Round 5), ~25 tasks across 4 streams in parallel. Each stream is single-owned for clarity; cross-stream coordination happens via the contracts above.
MediaPro pilot timeline tie-in
Per sv0-documentation#195, MediaPro pilot target is early May 2026. Stream 4's analysis: a realistic v0 covering V1–V5, V7, and V8 (with V7, Sergey's Foundry path, already live on dev/prod) is achievable for early May given the dependencies. Full V1–V14 (cross-Lambda stitching + interval-finding work + UI stitched-identity card) requires an additional 3–4 person-weeks beyond pilot.
Recommendation:
- Early May (pilot): Rounds 0–6 complete (Round 6 is the v0 close in the critical path). Lab 2 v0 ships covering V1–V5, V7 (Foundry path), and V8. Lab 2 v0 demo can be run live for MediaPro.
- End of May (post-pilot): Round 7 complete. Full V1–V14. Lab 2 becomes the canonical sales-team demo.
Items from sv0-documentation#195's checklist this plan closes:
- Track A → Connector resilience → "AWS multi-account enumeration validated against MediaPro test env" (sv0-connectors#32) — closed by Stream 2 Round 2–4.
- Track A → Connector resilience → "Streaming pagination" (Entra + AWS) — addressed by Stream 2 Phase 6 hardening, Round 5.
Items from #195 this plan does not close (they remain in #195's scope):
- WorkOS rollout (sv0-platform#373)
- Atlas cutover (sv0-platform#493)
- Observability stack (sv0-platform#494)
- Jira Cloud Phase 1 (sv0-connectors#72)
- Trust & legal (Track C)
- Client-facing requirements packet (Track B)
Cumulative validation matrix
The umbrella is "done" when MediaPro Lab 2 runs end-to-end and produces all of these. V1–V5, V7, and V8 are the v0 gate; V6 and V9–V14 complete the full-Lab-2 gate.
| ID | Validation | Producing stream(s) | v0 / Full |
|---|---|---|---|
| V1 | up.sh provisions 3 AWS accounts + Entra resources + ServiceNow PDI seed + Foundry resources via Terraform; no manual click steps | 4 | v0 |
| V2 | scan.sh invokes platform HTTP API to trigger AWS / Entra / ServiceNow / Foundry scans against the lab; all ScanRun documents land with status=succeeded | 1, 2, 4 | v0 |
| V3 | MediaPro tenant has 1+ ConnectorInstance per source system; per-instance scopes scheduled at independent cadences | 1 | v0 |
| V4 | Per-(account × category) cell failure isolation: simulating an aws_bedrock failure in mp-workloads does not abort the aws_iam scan in the same account, nor any scan in the other two accounts | 1, 2 | v0 |
| V5 | Cross-account path mp-content-tagger → mp-cross-account-data-reader → mp-pii-customers materializes as one authority path with crosses_account_boundary=true | 2, 3 | v0 |
| V6 | Stitched path mp-support-router (Bedrock) → mp-mcp-bridge (Lambda) → sp-mp-mcp-bridge (Entra SP) → ServiceNow OAuth → HR table spans AWS+Entra+ServiceNow connectors as a single authority path | 3 | full (deferred from v0 per D10/D18; depends on Q1's Lambda-secret-stored-Entra-SP correlation rule) |
| V7 | servicenow-openai-client → gpt-nano-for-summary Foundry path renders with exec30 ≥ 1 — matches the live dev/prod baseline from 2026-04-20 session | 3 | v0 (already live) |
| V8 | sp-mp-foundry-router stitched identity has source_records=2 (entra + aws), accessible via the correlations collection with rule attribution | 3 | v0 |
| V9 | Re-scan produces zero entity churn (regression for sv0-platform#460/#461; closes sv0-platform#491) | 3 | full |
| V10 | Tenant opt-out: disabling mcp-server-to-entra-sp rule for MediaPro tenant un-merges the V8 identity within one stitch run | 3 | full |
| V11 | UI stitched-identity card shows "this identity has 2 source records" with rule attribution and per-property provenance | 3 | full |
| V12 | down.sh tears all Lab 2 resources down; AWS billing for the lab returns to baseline within 24h; account closure not invoked | 4 | full |
| V13 | Total cost per up/scan/down cycle ≤ $1.50 (validated across 5 cycles) | 4 | full |
| V14 | All seven correlation rules from Stream 3 produce ≥ 1 stitched edge in MediaPro Lab 2 | 3 | full |
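The v0/full split in the matrix above can be expressed as a small gate check, in the spirit of the machine-readable validation.yaml that Coord-5 calls for. The sketch is hypothetical: the gate assignments are copied from the "v0 / Full" column, while the data shape and function name are assumptions rather than the harness design.

```typescript
type Gate = "v0" | "full";

// Gate per criterion, copied from the "v0 / Full" column of the matrix.
const CRITERION_GATES: Record<string, Gate> = {
  V1: "v0", V2: "v0", V3: "v0", V4: "v0", V5: "v0",
  V6: "full", V7: "v0", V8: "v0",
  V9: "full", V10: "full", V11: "full", V12: "full", V13: "full", V14: "full",
};

// The full gate requires every criterion; the v0 gate only those tagged v0.
function gatePasses(gate: Gate, passed: ReadonlySet<string>): boolean {
  return Object.entries(CRITERION_GATES)
    .filter(([, g]) => gate === "full" || g === "v0")
    .every(([id]) => passed.has(id));
}

const passedSoFar = new Set(["V1", "V2", "V3", "V4", "V5", "V7", "V8"]);
console.log(gatePasses("v0", passedSoFar), gatePasses("full", passedSoFar));
```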
Implementation orchestration
Task count (revision-1):
- v0 (early-May, MediaPro pilot): ~88 tasks — Stream 1: 22 / Stream 2: 19 / Stream 3: 22 (Phases 1–4; M1–M10 + 7 rules + integration + re-mat) / Stream 4: 25 (Phase 8 Foundry + Phase 3.3 B2 + Phase 3.4 MCP deferred).
- Full (end-May, post-pilot): ~104 tasks — adds Stream 3 Phase 5 (debuggability + opt-out APIs) + Stream 3 Phase 6 (UI card) + Stream 4 Phases 3.3/3.4/8 + Q1's Lambda-secret correlation rule.
Repo distribution:
- sv0-platform: ~50 tasks (Streams 1, 3, plus minor Stream 2 platform-shape work)
- sv0-connectors: ~25 tasks (Stream 2 + connector CLI work from Stream 1)
- sv0-demo-labs: ~25 tasks (Stream 4 lab Terraform + the shared spoke-role module from Stream 2)
- sv0-documentation: 0 implementation tasks (the plans are the documentation work)
Issue-creation pattern: one GitHub issue per stream-phase, labeled claude-code + implementation (or infrastructure for Stream 4 Terraform). Issues reference back to the relevant section anchor in the substream doc. The umbrella issue (sv0-documentation#198) tracks the whole plan; substream issues close as their phases complete.
Coordination tasks (called out so they don't get lost between phases):
- Coord-1: After Stream 1 Phase 1 lands, Stream 2 Phase 1 reviews Stream 1's `ScanScopeDoc.scope_keys` final field names and adjusts `--accounts` / `--service-categories` CLI to write into them.
- Coord-2: After Stream 3 M1–M4 land, Streams 2 and 4 verify their NormalizedGraph emission against the post-M4 EntityDoc shape (especially `connector_owners[]` semantics).
- Coord-3: After Stream 3 M11–M12 land, Stream 2 Phase 4 emits `correlationKeys[]` and `lineage_records[]` per the contract before Stream 3 Phase 2 rules attempt to consume them.
- Coord-4: Before Stream 4 Phase 9 (lifecycle scripts), Stream 1 Phase 4 must have shipped HTTP `POST /api/v1/connector-instances` + `POST /api/v1/scan-runs` + `DELETE /api/v1/connector-instances/:id` (the IaC up/scan/down surface).
- Coord-5: Before Stream 4 Phase 10 (demo + validation harness), all 14 V1–V14 acceptance criteria are recorded in a machine-readable `validation.yaml` so the harness can assert pass/fail per criterion.
- Coord-6: Coordination with `sv0-connectors#27` (Victor) before Stream 4 Phase 3 kicks off; see Q2 above.
- Coord-7 (revision-1): Before Stream 2 Phase 1 implementation starts, post a comment on `sv0-connectors#32` documenting the "never chains STS" deviation from its acceptance criterion (see the Stream 2 doc's spec-deviation banner). Get explicit ack from #32's owner; amend the AC if needed.
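
Coord-5's per-criterion pass/fail check could look roughly like the sketch below. It assumes `validation.yaml` has already been parsed into plain objects by a YAML library, and the `id` / `milestone` / `passed` field names are hypothetical, not the plan's schema. The only facts taken from this plan are the V1–V14 range and the revision-1 split in which v0 gates on V1–V5 + V7 + V8.

```typescript
// Hypothetical record shape for one parsed validation.yaml entry.
// Field names are illustrative assumptions, not the plan's schema.
type Milestone = "v0" | "full";

interface CriterionResult {
  id: string;           // "V1" .. "V14"
  milestone: Milestone; // earliest milestone this criterion gates
  passed: boolean;      // outcome recorded by the harness run
}

interface Verdict {
  ok: boolean;
  missing: string[]; // criteria with no recorded result
  failed: string[];  // gating criteria that ran and failed
}

// Build the expected id list "V1".."V14".
function expectedIds(count: number = 14): string[] {
  const ids: string[] = [];
  for (let i = 1; i <= count; i++) ids.push(`V${i}`);
  return ids;
}

// Evaluate one milestone: for "v0" only v0-scoped criteria gate the verdict;
// for "full" every criterion does. A missing record always fails the run.
function evaluate(results: CriterionResult[], target: Milestone): Verdict {
  const byId: Record<string, CriterionResult | undefined> = {};
  for (const r of results) byId[r.id] = r;

  const missing: string[] = [];
  const failed: string[] = [];
  for (const id of expectedIds()) {
    const r = byId[id];
    if (r === undefined) {
      missing.push(id);
      continue;
    }
    const gating = target === "full" || r.milestone === "v0";
    if (gating && !r.passed) failed.push(id);
  }
  return { ok: missing.length === 0 && failed.length === 0, missing, failed };
}
```

Treating a missing record as a failure, rather than skipping it, keeps the harness from reporting green when a criterion was never recorded.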
Risk register
| # | Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|---|
| R1 | Stream 3 schema migrations break existing tenants' findings during rollout | Medium | High | Migrations include backfill + dry-run mode. Run against demo-w1 and demo-nimbus before flipping for prod tenants. Each migration is its own PR with explicit rollback steps. |
| R2 | MediaPro pilot date moves up; v0 scope insufficient | Low | High | D10 commits to v0 (V1–V5 + V7 + V8) by early May. If the pilot moves up, V7 (Foundry path live on dev/prod) carries the demo as fallback. |
| R3 | AWS Organizations + StackSet permissions rejected by MediaPro security team | Medium | Medium | Provide explicit --accounts mode (Stream 2 second auth mode); MediaPro can list accounts manually. CFN template + boundary documented for MediaPro's own review. |
| R4 | False merges from mcp-server-to-entra-sp correlation rule embarrass demo | Low | Medium | D4 keeps rule opt-in. MediaPro pilot tenant has the rule enabled with explicit hint setup. UI lineage card (V11) provides an "unmerge" affordance for any false merge spotted live. |
| R5 | In-process scheduler insufficient for >1 tenant under pilot load | Low | Medium | D3 documents the upgrade path (multi-process leader-election, MongoDB transactions). MediaPro pilot is one tenant; risk does not surface until pilot scales. |
| R6 | Cross-stream contract drift during implementation | Medium | High | Coord-1 through Coord-7 above. Round-2 reviewers explicitly check contract alignment in PRs. |
| R7 | Stream 4 Lab 2 cost overruns the $1.50/cycle target | Low | Low | Stream 4 already flagged AWS Config + KMS as watch items; periodic Config recorder + $25 OU billing alarm bound the worst case. |
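
R3's mitigation relies on the explicit `--accounts` mode, and Coord-1 has the CLI write into `ScanScopeDoc.scope_keys`. The sketch below shows one way that mapping might work. The `scope_keys` shape (plural `account_ids` and `regions`, with `service_categories` outside) follows the revision-1 B1 decision. The `--regions` flag, the comma-separated value format, the flag parser, and the function names are assumptions made for illustration.

```typescript
// Shape locked in revision-1 (B1): scope_keys fields are always plural and
// service_categories sits outside scope_keys.
interface ScanScopeDoc {
  scope_keys: { account_ids: string[]; regions: string[] };
  service_categories: string[];
}

// Split a comma-separated flag value, trimming blanks and dropping duplicates.
function splitList(value: string | undefined): string[] {
  if (!value) return [];
  const items = value
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
  return items.filter((s, i) => items.indexOf(s) === i);
}

// Hypothetical parser: expects alternating "--name value" pairs.
function parseFlags(argv: string[]): Record<string, string> {
  if (argv.length % 2 !== 0) throw new Error("every flag needs a value");
  const flags: Record<string, string> = {};
  for (let i = 0; i < argv.length; i += 2) {
    const name = argv[i];
    if (name.slice(0, 2) !== "--") throw new Error(`expected a flag, got "${name}"`);
    flags[name.slice(2)] = argv[i + 1];
  }
  return flags;
}

// Map CLI flags onto the scope document. AWS account IDs are 12 digits,
// so anything else is rejected before a scan is scheduled.
function buildScanScope(argv: string[]): ScanScopeDoc {
  const flags = parseFlags(argv);
  const accountIds = splitList(flags["accounts"]);
  if (accountIds.length === 0) {
    throw new Error("--accounts requires at least one account id");
  }
  const bad = accountIds.filter((id) => !/^\d{12}$/.test(id));
  if (bad.length > 0) {
    throw new Error(`invalid AWS account id(s): ${bad.join(", ")}`);
  }
  return {
    scope_keys: {
      account_ids: accountIds,
      regions: splitList(flags["regions"]), // "--regions" is an assumed flag
    },
    service_categories: splitList(flags["service-categories"]),
  };
}
```

Validating the account list at the CLI boundary means a mistyped ID in a manually supplied list fails fast instead of surfacing later as an assume-role error.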
Partner-facing follow-on items (not in this plan, file under #195)
The CEO reviewer flagged that the validation matrix is internal QA, not a sellable artifact. These three items close the partner-readiness gap and should be filed as sub-issues of sv0-documentation#195 (the MediaPro pilot readiness umbrella). They do not belong in this architectural plan because they're product-marketing surface, not architectural change — but the architecture must support them, and the umbrella's validation matrix can call out hooks for them:
- "Fix these three things this week" template: a one-page slide / printable summary that picks the top-3 highest-impact remediations from the MediaPro Lab 2 evidence pack and frames them as a one-week action list a CISO can hand to operations. The platform's existing `risk-cluster-service` already groups findings by remediation pattern; the partner-facing template wraps that grouping in CISO-readable language.
- Responsible-role tags on findings: every finding rendered in the demo carries an explicit "responsible role" tag (e.g., "AWS account owner," "Entra tenant admin," "ServiceNow connection owner") so a partner walking a CISO through the report can answer "who is on the hook" without reading code. Architecturally this is a small data addition to `FindingDoc` (one tag); fitting it into Stream 3 Phase 4 re-materialization is straightforward.
- Exportable, partner-rebrandable demo deliverable: a single PDF or HTML artifact derived from the evidence pack that a Deloitte/Accenture analyst can rebrand with their own logo and hand to a Fortune 500 CISO. Architecturally this is `evidence-pack` HTML rendering with a swappable header; it extends the existing evidence-pack module and doesn't change the data model.
These items are not required to ship the architecture in this plan. They are required to ship the MediaPro pilot — which is why they live under #195, not under #198.
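
As a rough illustration of how small the responsible-role addition is, the sketch below adds one optional tag to a cut-down `FindingDoc` and groups findings by it. Only the idea of a single tag and the example role strings come from the text above. The `responsible_role` field name, the other fields, and the `Unassigned` fallback are assumptions.

```typescript
// Illustrative subset of FindingDoc. Only the single responsible-role tag is
// the addition discussed above; every field name here is an assumption.
interface FindingDoc {
  id: string;
  title: string;
  responsible_role?: string; // e.g. "AWS account owner", "Entra tenant admin"
}

// Fallback bucket for findings materialized before the tag existed.
const UNASSIGNED = "Unassigned";

// Group finding titles under their responsible role so a report can answer
// "who is on the hook" per finding. Empty or absent tags fall into UNASSIGNED.
function groupByResponsibleRole(findings: FindingDoc[]): Record<string, string[]> {
  const groups: Record<string, string[]> = {};
  for (const f of findings) {
    const role = f.responsible_role || UNASSIGNED;
    if (!groups[role]) groups[role] = [];
    groups[role].push(f.title);
  }
  return groups;
}
```

Keeping the tag optional lets existing findings render unchanged until re-materialization fills it in.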
Out of scope (deferred post-pilot)
Each item is listed because someone will ask "why isn't this in the plan":
- Side-car container connector driver (D1)
- Multi-process scheduler leader-election (D3)
- Tenant self-service connector configuration UI (deferred per Stream 1; data model supports it)
- Wiz/Snyk/Datadog UI parity for connector management (post-pilot)
- Interval-disagreement findings (D7) — a finding-layer concern
- Lambda-secret-stored-Entra-SP correlation rule (Q1 default) — Stream 3 follow-up
- Cross-tenant correlation (deliberately not supported — would break tenant isolation invariant)
- Graph DB introduction (per ADR-003, ruled out)
- ML-based fuzzy correlation (per Stream 3 invariant, ruled out)
- Jira Cloud connector enhancements (separately tracked in `sv0-connectors#72`; Jira de-prioritized for Lab 2 per user)
- Bot identity correlation across GitHub / Bitbucket / Slack (separate research)
References
Substream docs (this PR):
Note (Codex Tier 5 path correction): Architecture substream docs live under `docs/architecture/research/`, NOT under `docs/plans/`. The links below are correct; if a review tool reports them as "NOT FOUND under docs/plans/", that is a tool limitation, and the canonical paths are as linked.
- Connector control & execution architecture
- Multi-account AWS connector architecture
- Cross-connector graph stitching architecture
- MediaPro Lab 2 demo plan
Parent + adjacent epics:
- sv0-documentation#195 — MediaPro pilot readiness umbrella (parent)
- sv0-documentation#198 — this plan's tracking issue
- sv0-platform#486 — multi-connector reconciliation phase (Option C) — Stream 3 wraps
- sv0-connectors#78 — scheduled connector scans (Stream 1 wraps)
- sv0-connectors#89 — pre-client readiness P0s (connectors)
Spec issues this plan implements:
- sv0-platform#300 — cross-connector graph stitching
- sv0-platform#488 — connector_id singular
- sv0-platform#487 — atomic upsert
- sv0-platform#485 — diffProperties not scoped by connector
- sv0-platform#383 — human_identity silently retyped
- sv0-platform#491 — cross-sync re-materialization gaps
- sv0-platform#309 — connector throttling research
- sv0-platform#247 — Phase 1 systemd timers (superseded by Stream 1)
- sv0-connectors#32 — multi-account scanning
- sv0-connectors#57 — CloudTrail org multi-account discovery
- sv0-connectors#79 — cross-connector entity correlation Phase A–E
- sv0-connectors#27 — Demo Lab X1 cross-system scenario (Victor — coordinate per Q2)
- sv0-connectors#86 — Entra ID P1 for demo tenant (D13)
Prior research:
- `docs/architecture/research/2026-02-26-cross-connector-entity-correlation-research.md` (Stream 3 extends)
- `docs/architecture/research/2026-03-30-aws-integration-strategy.md`
- `docs/architecture/research/2026-03-11-aws-connector-research.md`
- `docs/plans/2026-04-08-demo-lab-plan.md` (this plan retires the Nimbus Enterprise Lab 2 section in favor of MediaPro Lab 2)
- `docs/plans/2026-04-04-aws-integration-implementation-cycle.md`
Architecture invariants:
- `docs/architecture/01-data-model.md`
- `docs/architecture/02-processing-pipeline.md`
- `docs/architecture/03-database.md`
- `docs/architecture/05-connectors.md`
- `docs/architecture/12-deployment-approval.md`
- `docs/architecture/decisions/adr-003-reject-apache-age.md`
Session note (live demo state):
sv0-platform/docs/session-notes/2026-04-20-foundry-demo-resolution-session-handoff.md