Core Platform Implementation Plan (New Repository)

Date: 2026-02-07
Status: Draft (planning only; no implementation work started)

1. Objective

Create a new core platform repository (a rewrite from scratch) that implements the current SecurityV0 architecture and data model, including the evidence-grade P0 decisions made on 2026-02-07.

This document is a planning artifact only. It defines scope, sequencing, milestones, and acceptance criteria.


2. Scope and Non-Goals

In Scope

  • New repository creation and bootstrap in a new folder.
  • Core platform backend architecture and schemas.
  • MongoDB storage layer with updated collections and indexes.
  • Diff engine, trigger evaluator, evidence-pack generation.
  • API surface for findings, graph queries, and temporal queries.
  • Web UI and reporting layer aligned with 07-ui-reporting.md.
  • OAA export and read-only SCIM endpoints.
  • Integration contract with sv0-connectors.
  • Test strategy, CI baseline, and deployment readiness.

Out of Scope (for this plan iteration)

  • Reusing graph-mongo code directly.
  • Building all connectors; only contract alignment and first integration target are planned.
  • Full enterprise multi-region deployment hardening.
  • Advanced graph-index secondary store (Neo4j) implementation.
  • Native PDF evidence-pack rendering in MVP (post-MVP optional conversion phase only).
  • Full mTLS certificate lifecycle hardening in MVP (deferred to hardening phase/post-MVP).

3. Repository Naming Decision

Confirmed Decision

Canonical repository name: sv0-platform.

Rationale:

  • Aligns with prior approved repository-organization note.
  • Clearly distinguishes from sv0-connectors and sv0-documentation.
  • Keeps URLs, import paths, and automation names concise.

4. Architectural Baseline (Must Be Preserved)

The implementation must satisfy these non-negotiable constraints:

  • Deterministic findings only (no ML, no probabilistic scoring).
  • Read-only posture toward source systems.
  • Fully explainable evidence chains.
  • Temporal model for authority drift over time.
  • Evidence-grade artifact model (immutable, hash-sealed outputs).

Additionally, the P0 architecture decisions from 2026-02-07 are mandatory:

  • source_metadata allowlist + source_metadata_hash (no raw payload persistence).
  • Baseline redesign (baseline_metadata + baseline_entities).
  • execution_evidence as first-class persisted artifact.
  • Deterministic AUTHENTICATES_TO.evidence_references.
  • evidence_completeness required on findings.
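
As a rough illustration of the first decision above, the sketch below filters a raw source record down to an allowlist and computes source_metadata_hash over the retained fields. The allowlisted field names, the helper names, and the choice to hash the filtered subset (rather than the raw payload) are assumptions for illustration, not part of the documented contract.

```typescript
import { createHash } from "node:crypto";

type SourceMetadata = Record<string, string | number | boolean | null>;

// Hypothetical allowlist; the real field list would come from the connector contract.
const ALLOWED_FIELDS: readonly string[] = ["sys_id", "sys_updated_on", "active"];

// Keep only allowlisted scalar fields, inserted in sorted key order so the
// serialized form (and therefore the hash) is stable across runs.
function filterSourceMetadata(
  raw: Record<string, unknown>,
  allowlist: readonly string[] = ALLOWED_FIELDS,
): SourceMetadata {
  const out: SourceMetadata = {};
  for (const key of [...allowlist].sort()) {
    const value = raw[key];
    if (
      value === null ||
      typeof value === "string" ||
      typeof value === "number" ||
      typeof value === "boolean"
    ) {
      out[key] = value;
    }
  }
  return out;
}

// Assumption: the hash covers the filtered metadata, serialized as JSON.
function hashSourceMetadata(metadata: SourceMetadata): string {
  return createHash("sha256").update(JSON.stringify(metadata)).digest("hex");
}

// The raw record is used only transiently; only the two derived values are kept.
const rawRecord = { sys_id: "abc123", active: true, password_hint: "do-not-store" };
const source_metadata = filterSourceMetadata(rawRecord);
const source_metadata_hash = hashSourceMetadata(source_metadata);
console.log(source_metadata, source_metadata_hash);
```

Applying the filter at the write boundary means a non-allowlisted field such as password_hint never reaches the storage layer, whichever connector produced it.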

5. Lessons from graph-mongo Draft

Keep as Design Patterns

  • StorageAdapter abstraction for backend-agnostic persistence.
  • Strong tenant-scoped indexing strategy.
  • Materialized path concept for fast blast-radius reads.
  • Mock scenario-driven testing approach.

Do Not Carry Forward as-Is

  • Fixed tenant assumptions in API scaffolding.
  • Any raw response persistence pattern.
  • Baseline strategy that risks 16MB MongoDB document limits.
  • Unbounded embedded arrays without guardrails/overflow plan.

Reuse as Reference Inputs Only

  • Query patterns and demo scenarios.
  • Existing type vocabulary where consistent with updated docs.
  • Early API route concepts as scaffolding ideas, not contract truth.

6. Target Repository Layout (Planned)

sv0-platform/
├── src/
│   ├── api/                # REST API + SCIM/OAA endpoints
│   ├── domain/             # Core domain models and validation
│   ├── storage/            # StorageAdapter, Mongo implementation, indexes
│   ├── ingestion/          # Normalizer, diff engine, sync orchestration
│   ├── evaluator/          # Deterministic finding rules
│   ├── evidence/           # Evidence pack assembler + sealing
│   ├── query/              # Graph and temporal query services
│   ├── workers/            # Async jobs (sync, evaluation, evidence)
│   └── shared/             # Common utilities and contracts
├── test/
│   ├── unit/
│   ├── integration/
│   └── fixtures/
├── infra/                  # IaC and environment provisioning
├── docs/                   # Repo-local technical docs/runbooks
├── scripts/                # Bootstrap, migrations, seed, verification
├── .github/workflows/      # CI/CD pipelines
├── CLAUDE.md
└── package.json

7. Phased Implementation Plan

Phase 0: Pre-Implementation Alignment

Deliverables

  • Architecture freeze for MVP scope.
  • Explicit acceptance of P0 decisions.
  • Resolution log for open architecture questions in 00-overview.md.
  • Initial backlog split into platform epics/stories.

Key Tasks

  • Confirm canonical source docs (00, 01, 02, 03, 04, 05, 06, 07).
  • Record sign-off on the confirmed repository name (sv0-platform).
  • Define versioned contracts between platform and connectors (language-agnostic JSON schema; connectors remain Python-based).
  • Resolve connector-to-platform transport for MVP:
    • Decision: connector push over authenticated HTTPS (POST to platform ingestion endpoint), idempotent by sync_id/payload hash
    • MVP auth decision: per-tenant connector API keys over HTTPS for ingestion
    • Deferred: strict mTLS trust-boundary enforcement (cert issuance/rotation/revocation, proxy origin controls)
    • Not allowed in MVP: direct shared MongoDB writes from connectors
    • Future option: queue-backed ingestion (SQS/Kafka) if throughput/reliability needs increase
  • Resolve open architecture questions and record answers:
    • scheduling model: pull-based sync for MVP
    • path recompute strategy: recompute paths only for affected identities (no full-tenant recompute and no incremental graph-index work)
    • evidence pack storage: MongoDB for MVP, object storage optional later for large artifacts
    • PDF policy: post-MVP optional conversion phase (already resolved)
    • connector SDK language: Python connectors + JSON contract boundary
  • Define initial non-functional targets (latency, sync duration, retention policy).

Exit Criteria

  • Written sign-off on scope, repo name, and architecture baseline.
  • Signed decision appendix covering transport, scheduling, recompute strategy, storage, and SDK-language contract.
  • Prioritized backlog with owners and sequencing.

Phase 1: Repository Bootstrap and Engineering Baseline

Deliverables

  • New repository initialized in new folder.
  • Build, lint, test, and CI pipeline running.
  • Local runtime environment (docker compose) operational.
  • Auth/security skeleton in place for all integration surfaces.
  • Worker runtime skeleton and async execution model defined.

Key Tasks

  • Initialize Node.js/TypeScript project with strict settings.
  • Add code quality gates (lint + typecheck + tests in CI).
  • Add local MongoDB + observability-ready logging.
  • Add baseline app skeleton with health and readiness endpoints.
  • Create initial CLAUDE.md with cross-repo references.
  • Implement auth/authz scaffolding from 00-overview.md:
    • OAuth 2.0 middleware path for web UI
    • API key auth path for programmatic clients and connector ingestion (MVP)
    • mTLS integration points stubbed/documented, but strict enforcement deferred from MVP
    • tenant context extraction and propagation in request lifecycle
  • Define synchronous vs asynchronous boundaries:
    • the sync-trigger API returns an accepted job record immediately (non-blocking)
    • diff/evaluator/evidence processing runs in worker jobs
    • evidence-pack retrieval API serves already-built artifacts (no blocking generation)
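
The non-blocking boundary described above could be sketched as follows. The InMemoryJobQueue class, the JobRecord shape, and the status names are hypothetical stand-ins; a real deployment would back this with a durable queue.

```typescript
import { randomUUID } from "node:crypto";

type JobStatus = "accepted" | "running" | "succeeded" | "failed";

interface JobRecord {
  job_id: string;
  tenant_id: string;
  kind: "sync" | "evaluation" | "evidence";
  status: JobStatus;
}

// In-memory stand-in for a real queue; hypothetical, for illustration only.
class InMemoryJobQueue {
  private readonly jobs = new Map<string, JobRecord>();
  private readonly pending: string[] = [];

  // Called by the API handler: records the job and returns immediately,
  // without doing any diff/evaluator/evidence work.
  enqueue(tenant_id: string, kind: JobRecord["kind"]): JobRecord {
    const job: JobRecord = { job_id: randomUUID(), tenant_id, kind, status: "accepted" };
    this.jobs.set(job.job_id, job);
    this.pending.push(job.job_id);
    return { ...job };
  }

  // Called by the worker loop: runs the next pending job, if any.
  // Returns false when there is nothing to do.
  runNext(handler: (job: JobRecord) => void): boolean {
    const id = this.pending.shift();
    if (id === undefined) return false;
    const job = this.jobs.get(id)!;
    job.status = "running";
    try {
      handler(job);
      job.status = "succeeded";
    } catch {
      job.status = "failed";
    }
    return true;
  }

  // Status lookup used by a job-status endpoint; returns a copy.
  get(job_id: string): JobRecord | undefined {
    const job = this.jobs.get(job_id);
    return job ? { ...job } : undefined;
  }
}

const queue = new InMemoryJobQueue();
const accepted = queue.enqueue("tenant-a", "sync"); // what the trigger API would return
queue.runNext(() => { /* worker-side processing would go here */ });
console.log(accepted.status, queue.get(accepted.job_id)?.status);
```

The API handler only ever calls enqueue, so request latency stays independent of how long the worker-side processing takes.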

Exit Criteria

  • main branch protected with passing CI.
  • New contributor can run bootstrap and test commands end-to-end.
  • Auth middleware paths are test-covered for success/failure and tenant isolation.
  • Worker entrypoint and queue/job abstraction compile and run in local environment.

Phase 1.5: Preliminary Thin UI Scaffolding (Validation Spike)

Deliverables

  • Lightweight UI app scaffold in sv0-platform/ui (React + TypeScript).
  • Minimal visualization surface to validate value with real platform data.
  • Stable frontend contracts for currently available API endpoints (without blocking core backend phases).

Key Tasks

  • Initialize frontend app with routing, layout shell, and shared design tokens.
  • Implement three thin pages for early validation:
    • Dashboard (headline counts from findings summary)
    • Findings List
    • Finding Detail
  • Wire typed API client for existing endpoints (/api/v1/findings and related ingest validation flow where needed).
  • Add placeholder routes for later-phase pages (Graph Explorer, Entity Detail, Temporal Comparison, Sync Management) clearly labeled as planned.
  • Add local dev ergonomics:
    • API base URL environment config
    • dev proxy or CORS-safe local setup
    • simple loading/error/empty states
  • Add UI smoke tests and basic responsiveness checks (desktop + mobile layouts).
  • Document startup workflow so backend + UI can run together for demo/validation.

Exit Criteria

  • UI can be started locally and connected to running API.
  • Imported connector findings are visible in Dashboard/List/Detail flows.
  • No blocking assumptions are introduced for Phase 2/3 backend work (the UI remains thin and contract-driven).

Phase 2: Domain and Storage Foundation

Deliverables

  • Versioned domain schemas aligned to architecture docs.
  • Mongo collections + indexes for MVP data model.
  • StorageAdapter + Mongo adapter implemented.
  • Retention policies enforced in schema/index layer.
  • Mock data generator for deterministic validation scenarios.

Key Tasks

  • Implement entities: identity, owner, role, permission, resource, credential, execution_evidence.
  • Implement Identity subtypes required by current data model (including automation-related values): flow_designer_flow, business_rule, scheduled_job, system_execution.
  • Implement relationship types required by updated model: RUNS_AS, TRIGGERS_ON, CREATED_BY (in addition to core relationship set).
  • Implement all five finding types in domain contracts: orphaned_ownership, ownership_degraded, scope_drift, dormant_authority, privilege_justification_gap.
  • Implement collections:
    • entities
    • entity_versions
    • events
    • baseline_metadata
    • baseline_entities
    • execution_evidence
    • sync_cursors
    • connector_syncs
    • findings
    • evidence_packs
  • Implement index definitions for tenant isolation and query patterns.
  • Enforce source_metadata allowlist + hash at write boundary.
  • Add document-size guardrails and overflow thresholds for path-heavy entities.
  • Implement retention configuration and cleanup behavior:
    • events TTL index (2 years default)
    • retention indexes/policies for entity_versions, baseline_metadata, baseline_entities, execution_evidence
    • retention verification script and policy tests
  • Build mock data generator with scenarios from 00-overview.md (orphaned ownership, scope drift, dormant authority, transitive chain).
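
One way to keep the index and retention tasks above reviewable is to declare them as data and verify them in a script. The sketch below is driver-free; the IndexSpec shape mirrors the key document and options that MongoDB index creation accepts, while the index names and field names (tenant_id, entity_id, occurred_at, and so on) are assumptions.

```typescript
// Declarative index plan. Field and index names below are hypothetical.
interface IndexSpec {
  collection: string;
  keys: Record<string, 1 | -1>;
  options: { name: string; unique?: boolean; expireAfterSeconds?: number };
}

const SECONDS_PER_DAY = 24 * 60 * 60;
const EVENTS_RETENTION_DAYS = 2 * 365; // "2 years default" from the plan

const indexPlan: IndexSpec[] = [
  {
    collection: "entities",
    keys: { tenant_id: 1, entity_id: 1 },
    options: { name: "tenant_entity_unique", unique: true },
  },
  {
    collection: "findings",
    keys: { tenant_id: 1, finding_type: 1, status: 1 },
    options: { name: "tenant_finding_lookup" },
  },
  {
    // MongoDB TTL indexes are single-field indexes on a date field,
    // so this one cannot lead with tenant_id.
    collection: "events",
    keys: { occurred_at: 1 },
    options: {
      name: "events_ttl",
      expireAfterSeconds: EVENTS_RETENTION_DAYS * SECONDS_PER_DAY,
    },
  },
];

// Verification helper: every non-TTL index must lead with tenant_id.
function findTenantScopeViolations(plan: IndexSpec[]): string[] {
  return plan
    .filter((spec) => spec.options.expireAfterSeconds === undefined)
    .filter((spec) => Object.keys(spec.keys)[0] !== "tenant_id")
    .map((spec) => `${spec.collection}.${spec.options.name}`);
}

console.log(findTenantScopeViolations(indexPlan)); // prints []
```

A check like findTenantScopeViolations could run in the schema verification script so that a missing tenant prefix fails CI rather than surfacing as a slow or cross-tenant query later.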

Exit Criteria

  • Schema verification script passes on clean environment.
  • Storage integration tests cover CRUD, versioning, and index-backed query paths.
  • Retention tests prove TTL/index behavior and policy enforcement.
  • Mock dataset exercises all five finding types and cross-system path cases.

Phase 3: Normalizer, Ingestion, Diff, and Temporal Pipeline

Deliverables

  • Ingestion transport path consuming connector NormalizedGraph output.
  • Normalizer layer as a first-class platform component.
  • Deterministic diff engine with append-only event generation.
  • Baseline and sync-cursor lifecycle operational.
  • Cross-system path materialization operational (including AUTHENTICATES_TO traversal).

Key Tasks

  • Implement ingestion API/adapter for connector push transport with:
    • authenticated service-to-service access for MVP (per-tenant API keys over HTTPS)
    • mTLS transport/auth hardening scheduled after MVP baseline is stable
    • idempotency keys (sync_id), replay protection, and dedupe behavior
    • retry-safe request semantics and validation errors
  • Define normalized graph intake contracts (with evidenceCompleteness payload).
  • Implement normalizer responsibilities from 05-connectors.md:
    • map connector human_identity nodes to platform Owner semantics based on OWNED_BY/BELONGS_TO
    • enforce owner typing (human, team, business_unit, organization) where evidence exists
    • normalize edge/property conventions before persistence
  • Implement node/edge diff logic (create/update/delete + relationship changes).
  • Emit canonical event types with provenance.
  • Implement baseline scheduler and retention-aware sync cursor management.
  • Implement path materialization in two passes:
    • single-system execution path materialization
    • cross-system path expansion via AUTHENTICATES_TO chains (Entra -> ServiceNow pattern)
  • Implement path recompute strategy for changed identities/resources and measure write amplification.
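
A minimal sketch of the node-diff step, assuming snapshots are compared by node id and flat scalar properties. The GraphNode and DiffEvent shapes and the event names are illustrative, not the canonical event types from the architecture docs; relationship diffs are omitted.

```typescript
interface GraphNode {
  id: string;
  properties: Record<string, string | number | boolean | null>;
}

type DiffEvent =
  | { kind: "node_created"; id: string }
  | { kind: "node_deleted"; id: string }
  | { kind: "node_updated"; id: string; changed: string[] };

// Compare two snapshots keyed by node id. Ids and property names are sorted
// so the same inputs always yield the same event sequence.
function diffNodes(previous: GraphNode[], current: GraphNode[]): DiffEvent[] {
  const before = new Map<string, GraphNode>();
  for (const n of previous) before.set(n.id, n);
  const after = new Map<string, GraphNode>();
  for (const n of current) after.set(n.id, n);

  const allIds = previous.map((n) => n.id).concat(current.map((n) => n.id));
  const ids = Array.from(new Set(allIds)).sort();

  const events: DiffEvent[] = [];
  for (const id of ids) {
    const oldNode = before.get(id);
    const newNode = after.get(id);
    if (!oldNode && newNode) {
      events.push({ kind: "node_created", id });
    } else if (oldNode && !newNode) {
      events.push({ kind: "node_deleted", id });
    } else if (oldNode && newNode) {
      const keys = Array.from(
        new Set(Object.keys(oldNode.properties).concat(Object.keys(newNode.properties))),
      ).sort();
      const changed = keys.filter((k) => oldNode.properties[k] !== newNode.properties[k]);
      if (changed.length > 0) events.push({ kind: "node_updated", id, changed });
    }
  }
  return events;
}

const snapshotA: GraphNode[] = [{ id: "sp-1", properties: { owner: "alice" } }];
const snapshotB: GraphNode[] = [{ id: "sp-1", properties: { owner: null } }];
console.log(diffNodes(snapshotA, snapshotB)); // one node_updated event for "owner"
console.log(diffNodes(snapshotB, snapshotB)); // prints []
```

Diffing a snapshot against itself produces no events, which is the property the idempotent re-run check relies on.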

Exit Criteria

  • Idempotent re-run behavior verified (same input = no duplicate events).
  • Temporal replay checks match expected historical states.
  • Normalizer tests verify deterministic owner mapping behavior from connector outputs.
  • Cross-system materialized paths match expected blast radius for reference scenarios.
  • Ingestion transport tests cover retries, duplicates, and invalid payload rejection.
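
The retry and duplicate cases in the last criterion could be exercised against something like the sketch below, which keys deliveries by tenant and sync_id and compares payload hashes. The IngestionDeduper class, the outcome names, and the conflict behavior are assumptions; a real implementation would persist this state and hash a canonical serialization.

```typescript
import { createHash } from "node:crypto";

interface IngestRequest {
  tenant_id: string;
  sync_id: string;
  payload: Record<string, unknown>;
}

type IngestResult =
  | { outcome: "accepted" }
  | { outcome: "duplicate" }
  | { outcome: "conflict"; reason: string };

// Hypothetical in-memory deduper, for illustration only.
class IngestionDeduper {
  // `${tenant_id}:${sync_id}` -> payload hash of the first accepted delivery
  private readonly seen = new Map<string, string>();

  submit(req: IngestRequest): IngestResult {
    const key = `${req.tenant_id}:${req.sync_id}`;
    // Note: JSON.stringify is key-order sensitive; this sketch assumes the
    // connector serializes a retried payload identically.
    const hash = createHash("sha256").update(JSON.stringify(req.payload)).digest("hex");
    const previous = this.seen.get(key);
    if (previous === undefined) {
      this.seen.set(key, hash);
      return { outcome: "accepted" };
    }
    if (previous === hash) return { outcome: "duplicate" }; // safe retry, no new events
    return { outcome: "conflict", reason: "sync_id reused with a different payload" };
  }
}

const deduper = new IngestionDeduper();
const delivery = { tenant_id: "tenant-a", sync_id: "sync-001", payload: { nodes: [], edges: [] } };
console.log(deduper.submit(delivery).outcome); // prints "accepted"
console.log(deduper.submit(delivery).outcome); // prints "duplicate"
```

Scoping the key by tenant keeps one tenant's sync_id values from colliding with another's.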

Phase 4: Query and API Layer

Deliverables

  • Production API surface for graph, findings, and temporal queries.
  • Tenant-scoped authn/authz enforcement with concrete mechanisms.
  • OAA export endpoints and read-only SCIM endpoints.

Key Tasks

  • Implement core API groups:
    • graph queries (blast radius, reverse access, path traversal)
    • temporal queries (entity history, drift timeline)
    • findings and evidence retrieval
    • sync management (list, trigger, status)
    • structured query endpoint (POST /api/v1/query)
  • Implement SCIM read-only endpoints:
    • /ServiceProviderConfig
    • /Schemas
    • /ResourceTypes
    • /Users
    • /Groups
  • Implement OAA endpoints:
    • applications list
    • application payload export
    • bulk export trigger
  • Apply auth model from 00-overview.md on all endpoints:
    • OAuth 2.0 for web UI audience
    • API key access for programmatic clients
    • service-to-service auth enforcement for connector-facing ingestion APIs (API key in MVP; mTLS in hardening phase)
    • per-tenant authorization checks on every query path
  • Enforce API contract details from 04-api-layer.md:
    • cursor-based pagination envelope on all list endpoints
    • standard error response schema and status/error code mapping
    • rate-limit response headers
    • finding status transition rules (active -> acknowledged -> remediated|false_positive)
    • URL versioning strategy (/api/v1, forward-compatible for /api/v2)
  • Implement structured query DSL fully (not partial):
    • operators: eq, ne, in, gt, lt, between, exists
    • logical combinators: and, or
    • join-like filters: with_findings, with_paths_to (including materialized-path intersection logic)
  • Validate API coverage against UI route dependencies in 07-ui-reporting.md (dashboard, graph, findings, entity detail, compare, syncs).
  • Add API contract tests and tenant isolation tests.
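
The finding status transition rule listed above (active -> acknowledged -> remediated|false_positive) can be captured as a small lookup table. This sketch assumes the chain is strict, meaning an active finding cannot jump straight to a terminal state; that reading should be checked against 04-api-layer.md.

```typescript
type FindingStatus = "active" | "acknowledged" | "remediated" | "false_positive";

// Assumption: strict chain, with remediated and false_positive as terminal states.
const ALLOWED_TRANSITIONS: Record<FindingStatus, readonly FindingStatus[]> = {
  active: ["acknowledged"],
  acknowledged: ["remediated", "false_positive"],
  remediated: [],
  false_positive: [],
};

function canTransition(from: FindingStatus, to: FindingStatus): boolean {
  return ALLOWED_TRANSITIONS[from].includes(to);
}

// Returns the new status, or throws so the API layer can map the failure
// to its standard error response.
function applyTransition(from: FindingStatus, to: FindingStatus): FindingStatus {
  if (!canTransition(from, to)) {
    throw new Error(`Invalid finding status transition: ${from} -> ${to}`);
  }
  return to;
}

console.log(canTransition("active", "acknowledged")); // prints true
console.log(canTransition("active", "remediated")); // prints false
```

Keeping the rule in one table lets a contract test enumerate every from/to pair instead of relying on hand-picked cases.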

Exit Criteria

  • API contract tests green with deterministic fixtures.
  • SCIM/OAA payloads validated against expected schemas.
  • Endpoint coverage matrix confirms required UI consumer paths from 07-ui-reporting.md are satisfied by implemented API contracts.
  • Auth tests verify OAuth/API-key/service-auth enforcement and tenant boundary correctness.
  • Structured query tests cover all operators, combinators, and join-like filters.
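
To make the operator and combinator coverage concrete, here is a small in-memory evaluator for the listed operators (eq, ne, in, gt, lt, between, exists) and combinators (and, or). The query JSON shape is an assumption, between is treated as inclusive, exists treats null as absent, and the join-like filters (with_findings, with_paths_to) are deliberately left out.

```typescript
type Scalar = string | number | boolean | null;
type Doc = Record<string, Scalar | undefined>;

type Condition =
  | { field: string; op: "eq" | "ne" | "gt" | "lt"; value: Scalar }
  | { field: string; op: "in"; value: Scalar[] }
  | { field: string; op: "between"; value: [number, number] }
  | { field: string; op: "exists"; value: boolean };

type Query = Condition | { and: Query[] } | { or: Query[] };

// Orders two values of the same primitive type; undefined means "not comparable".
function compare(a: Scalar | undefined, b: Scalar): number | undefined {
  if (typeof a === "number" && typeof b === "number") return a - b;
  if (typeof a === "string" && typeof b === "string") return a < b ? -1 : a > b ? 1 : 0;
  return undefined;
}

function matches(doc: Doc, query: Query): boolean {
  if ("and" in query) return query.and.every((q) => matches(doc, q));
  if ("or" in query) return query.or.some((q) => matches(doc, q));
  const actual = doc[query.field];
  switch (query.op) {
    case "eq":
      return actual === query.value;
    case "ne":
      return actual !== query.value;
    case "in":
      return actual !== undefined && query.value.includes(actual);
    case "gt":
      return (compare(actual, query.value) ?? 0) > 0;
    case "lt":
      return (compare(actual, query.value) ?? 0) < 0;
    case "between": // inclusive on both ends (assumption)
      return typeof actual === "number" && actual >= query.value[0] && actual <= query.value[1];
    case "exists": // null counts as absent (assumption)
      return (actual !== undefined && actual !== null) === query.value;
  }
}

const entity: Doc = { type: "service_principal", days_inactive: 120, owner: null };
const dormantAndUnowned: Query = {
  and: [
    { field: "days_inactive", op: "gt", value: 90 },
    { field: "owner", op: "exists", value: false },
  ],
};
console.log(matches(entity, dormantAndUnowned)); // prints true
```

An evaluator like this is mainly useful as a test oracle: the production endpoint would translate the same query into storage-level filters, and the two results can be compared on fixture data.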

Phase 5: Trigger Evaluator and Evidence Packs

Deliverables

  • Deterministic evaluator for MVP finding set.
  • Evidence pack assembly and sealing pipeline.

Key Tasks

  • Implement deterministic rules for all model-defined finding types:
    • orphaned ownership
    • ownership degraded
    • scope drift (with evidence availability semantics)
    • dormant authority
    • privilege_justification_gap (granted privilege vs observed execution behavior)
  • Require evidence_completeness in finding record.
  • Implement evidence pack builder:
    • structured section assembly aligned with UI/reporting expectations:
      • Identity Summary
      • Authority Snapshot
      • Ownership Timeline
      • Blast Radius
      • Temporal Context
      • Deterministic Explanation
      • Remediation
      • Evidence Completeness
    • provenance links
    • hash seal generation
    • export formats (JSON + Markdown as MVP priority)
    • define post-MVP optional PDF phase as Markdown/HTML-to-PDF conversion add-on
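
The hash seal generation step might look like the sketch below: serialize the pack with sorted keys, then take a SHA-256 digest. The SealedPack shape and function names are hypothetical, and the sorted-key serialization here is a simple stand-in rather than a standards-based canonical JSON scheme.

```typescript
import { createHash } from "node:crypto";

type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Serialize with object keys sorted at every level so that logically equal
// packs produce byte-identical output.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map((item) => canonicalize(item)).join(",")}]`;
  const obj = value;
  const entries = Object.keys(obj)
    .sort()
    .map((key) => `${JSON.stringify(key)}:${canonicalize(obj[key])}`);
  return `{${entries.join(",")}}`;
}

interface SealedPack {
  pack: Json;
  seal: { algorithm: "sha256"; digest: string };
}

function digestOf(pack: Json): string {
  return createHash("sha256").update(canonicalize(pack)).digest("hex");
}

function sealEvidencePack(pack: Json): SealedPack {
  return { pack, seal: { algorithm: "sha256", digest: digestOf(pack) } };
}

// Recompute the digest and compare it with the stored seal.
function verifySeal(sealed: SealedPack): boolean {
  return digestOf(sealed.pack) === sealed.seal.digest;
}

const sealed = sealEvidencePack({
  finding_id: "f-001",
  sections: { identity_summary: "example", evidence_completeness: "partial" },
});
console.log(verifySeal(sealed)); // prints true
```

Because the digest depends only on the canonical form, the JSON and Markdown exports can both carry the same seal as long as it is computed over the structured pack rather than the rendered output.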

Exit Criteria

  • Golden tests prove deterministic outputs for known scenarios.
  • Evidence packs contain walkable source references and integrity hash.
  • privilege_justification_gap outputs verified against execution-evidence fixtures.
  • Evidence-pack schema validates required structured sections.

Phase 6: UI and Reporting Implementation

Deliverables

  • Web UI implementation aligned with 07-ui-reporting.md.
  • Page-level integration with API contracts from 04-api-layer.md.
  • Reporting/export UX for JSON and Markdown evidence packs.

Key Tasks

  • Implement route set and layout:
    • Dashboard
    • Graph Explorer
    • Findings List
    • Finding Detail
    • Entity Detail
    • Temporal Comparison
    • Sync Management
  • Implement TanStack Query hooks and cache/invalidation patterns per UI architecture doc.
  • Implement graph visualization components (ReactFlow + Dagre) with path highlighting and finding overlays.
  • Implement evidence-pack rendering UI including evidence-completeness visualization.
  • Implement MVP exports:
    • JSON evidence-pack download
    • Markdown evidence-pack download
    • CSV list export and graph snapshot export
  • Keep PDF export explicitly behind post-MVP feature flag/roadmap item.

Exit Criteria

  • All seven pages functional against live API in staging.
  • UI route-to-endpoint dependency matrix is green.
  • Evidence-pack UI accurately renders structured sections and completeness states.

Phase 7: Connector Integration and End-to-End Validation

Deliverables

  • Working integration path with sv0-connectors contract.
  • End-to-end scenario from connector output to finding + evidence pack.

Key Tasks

  • Finalize ingestion contract version with connector team.
  • Run at least one primary scenario end-to-end (Entra + ServiceNow).
  • Implement connector/platform alignment to 08-reference-impl-entra-servicenow.md:
    • deterministic correlation: Entra appId <-> ServiceNow oauth_entity.client_id
    • full AUTHENTICATES_TO.evidence_references tuple persisted (issuer tenant + target instance context)
    • health-check probes mapped to evidence completeness categories
    • permission normalization mapping coverage (explicit + fallback rules)
    • execution evidence ingestion from Entra signIns and ServiceNow syslog_transaction when available
    • source metadata allowlists enforced per mapped fields
  • Validate required connector permission prerequisites and failure behavior when optional evidence sources are unavailable.
  • Validate AUTHENTICATES_TO proof tuple completeness.
  • Validate evidence-completeness behavior when source tables are unavailable.
  • Add failure-path tests (partial evidence, retention gaps, permission denied).
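
The deterministic correlation listed above (Entra appId <-> ServiceNow oauth_entity.client_id) can be illustrated with a small join. The record shapes, the node id formats, the evidence_references field names, and the case-insensitive comparison are all assumptions made for this sketch; the authoritative mapping lives in 08-reference-impl-entra-servicenow.md.

```typescript
// Hypothetical input shapes for the two connector outputs.
interface EntraServicePrincipal {
  tenant_id: string;
  app_id: string;
  object_id: string;
}

interface ServiceNowOAuthEntity {
  instance: string;
  sys_id: string;
  client_id: string;
}

interface AuthenticatesToEdge {
  type: "AUTHENTICATES_TO";
  from: string;
  to: string;
  evidence_references: {
    issuer_tenant: string;
    app_id: string;
    target_instance: string;
    oauth_entity_sys_id: string;
  };
}

// Assumption: ids are compared after trimming and lowercasing.
const normalizeId = (id: string): string => id.trim().toLowerCase();

function correlate(
  principals: EntraServicePrincipal[],
  entities: ServiceNowOAuthEntity[],
): AuthenticatesToEdge[] {
  const byAppId = new Map<string, EntraServicePrincipal[]>();
  for (const sp of principals) {
    const key = normalizeId(sp.app_id);
    byAppId.set(key, (byAppId.get(key) ?? []).concat(sp));
  }

  const edges: AuthenticatesToEdge[] = [];
  for (const entity of entities) {
    for (const sp of byAppId.get(normalizeId(entity.client_id)) ?? []) {
      edges.push({
        type: "AUTHENTICATES_TO",
        from: `entra:${sp.tenant_id}:${sp.object_id}`,
        to: `servicenow:${entity.instance}:${entity.sys_id}`,
        evidence_references: {
          issuer_tenant: sp.tenant_id,
          app_id: sp.app_id,
          target_instance: entity.instance,
          oauth_entity_sys_id: entity.sys_id,
        },
      });
    }
  }

  // Sort with plain string comparison so output order is locale-independent.
  const sortKey = (e: AuthenticatesToEdge): string => `${e.from}|${e.to}`;
  return edges.sort((a, b) => (sortKey(a) < sortKey(b) ? -1 : sortKey(a) > sortKey(b) ? 1 : 0));
}

const edges = correlate(
  [{ tenant_id: "contoso", app_id: "1111-AAAA", object_id: "sp-1" }],
  [{ instance: "dev12345", sys_id: "oe-9", client_id: "1111-aaaa" }],
);
console.log(edges.length); // prints 1
```

Every emitted edge carries both the issuer-side and target-side identifiers, which is what allows the proof tuple completeness check to run on the edge alone.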

Exit Criteria

  • End-to-end demo scenario reproducible from clean environment.
  • Findings remain deterministic under incomplete-evidence conditions.
  • Reference scenario outputs (ownership, role changes, execution evidence, cross-system links) match expected transforms from 08-reference-impl-entra-servicenow.md.
  • UI, API, and connector outputs are coherent in a single integrated flow.

Phase 8: Operational Hardening and Release Readiness

Deliverables

  • Production deployment baseline.
  • Observability, runbooks, and release controls.

Key Tasks

  • Add metrics and alerts: sync health, evaluator throughput, path materialization time, document-size growth, write amplification.
  • Add structured logs with correlation IDs (sync_id, tenant_id, finding_id).
  • Add runbooks for ingestion failures, evidence gaps, and retention warnings.
  • Add backup/restore checks and migration playbooks.
  • Define release checklist and rollback procedure.
  • Implement and alert on explicit externalization triggers from 03-database.md:
    • path recompute p95 > 30s
    • sync duration > 2x baseline attributable to path materialization
    • write amplification ratio > 10
    • entity document size > 8MB
  • Validate async boundaries operationally:
    • sync API remains non-blocking under load
    • evaluator/evidence jobs do not block ingestion
    • retry and dead-letter behavior documented and tested
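
The four externalization triggers above translate directly into threshold checks. In this sketch the metric names and trigger identifiers are hypothetical, 8MB is read as 8 * 1024 * 1024 bytes, and attributing sync slowdown to path materialization is assumed to happen upstream of the check.

```typescript
// Hypothetical metrics snapshot; field names are assumptions.
interface PathMetrics {
  path_recompute_p95_seconds: number;
  sync_duration_seconds: number; // portion attributable to path materialization
  baseline_sync_duration_seconds: number;
  write_amplification_ratio: number;
  max_entity_document_bytes: number;
}

type Trigger =
  | "path_recompute_p95"
  | "sync_duration"
  | "write_amplification"
  | "entity_document_size";

const MAX_ENTITY_DOCUMENT_BYTES = 8 * 1024 * 1024; // "8MB" from the plan

// Returns the triggers that fired, in a fixed order.
function evaluateExternalizationTriggers(m: PathMetrics): Trigger[] {
  const fired: Trigger[] = [];
  if (m.path_recompute_p95_seconds > 30) fired.push("path_recompute_p95");
  if (m.sync_duration_seconds > 2 * m.baseline_sync_duration_seconds) fired.push("sync_duration");
  if (m.write_amplification_ratio > 10) fired.push("write_amplification");
  if (m.max_entity_document_bytes > MAX_ENTITY_DOCUMENT_BYTES) fired.push("entity_document_size");
  return fired;
}

const snapshot: PathMetrics = {
  path_recompute_p95_seconds: 42,
  sync_duration_seconds: 300,
  baseline_sync_duration_seconds: 200,
  write_amplification_ratio: 4,
  max_entity_document_bytes: 1024 * 1024,
};
console.log(evaluateExternalizationTriggers(snapshot)); // prints [ 'path_recompute_p95' ]
```

All comparisons are strict, matching the ">" wording of the triggers, so a value sitting exactly on a threshold does not fire.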

Exit Criteria

  • Staging deployment validated with realistic volume.
  • Operational SLOs and on-call runbooks approved.
  • Threshold-based alerts are live and tested in staging.
  • Externalization-readiness dashboard demonstrates trigger visibility.

8. Workstream Parallelization Plan

| Workstream | Can Start | Depends On | Notes |
|---|---|---|---|
| Repo bootstrap + CI | Immediately | None | Foundation for all tracks |
| Auth middleware + tenant enforcement | Immediately | Bootstrap | OAuth + API key enforcement in MVP; mTLS hardening staged later |
| Preliminary UI scaffolding spike | After Phase 1 bootstrap | Basic findings API + auth posture | Thin validation UI (dashboard/findings/detail) without waiting for full Phase 4 API breadth |
| Domain model + storage schema | After Phase 0 | Bootstrap | Critical path |
| Mock data generator | After domain draft | Domain contracts | Enables early evaluator/API/UI validation |
| Normalizer + ingestion transport | After Phase 0 | Contract decisions | Requires transport and owner-mapping decisions |
| API scaffolding | After bootstrap | Domain contracts | Can parallelize with ingestion internals |
| Evaluator design | After domain draft | Storage + event types | Start rule specs early |
| UI implementation | After API contracts stabilize | API scaffolding + auth model | Parallel with evaluator/evidence in mid-plan |
| Connector contract validation | After domain draft | Connector team sync | Avoid late contract churn |
| Ops hardening | After API/evaluator stable | Functional baseline | Stage-gated |

9. Milestones and Definition of Done

| Milestone | Minimum Evidence of Completion |
|---|---|
| M1: Foundation Ready | CI green, local stack boots, OAuth/API-key auth baseline + worker runtime established |
| M1.5: UI Spike Ready | Thin UI scaffold runs with API and renders Dashboard/Findings List/Finding Detail from live findings data |
| M2: Domain/Storage Ready | Collections/indexes/retention live, adapter tests pass, mock generator available |
| M3: Ingestion/Normalizer Ready | Transport + normalizer + diff idempotent, events + baselines validated |
| M4: API Ready | Graph/temporal/findings/sync/query endpoints contract-tested with auth enforcement |
| M5: Findings/Evidence Ready | All 5 deterministic rules + structured evidence packs validated |
| M6: UI Ready | All 7 pages functional against API, exports align with MVP policy |
| M7: Connector E2E Ready | Connector payload to UI-visible finding flow demonstrated |
| M8: Release Ready | Staging validation + runbooks + rollback + threshold alerts live |

10. Risks and Mitigations

| Risk | Impact | Mitigation |
|---|---|---|
| Contract drift between docs and code | Rework, defects | Treat docs as versioned contracts; add schema tests in CI |
| Evidence source availability differs by tenant | False confidence | Enforce evidence_completeness; never imply unavailable checks |
| Materialized path growth impacts performance | Slow syncs | Add size + operational triggers; predefine overflow path |
| Structured query DSL complexity underestimated | Phase 4 schedule slip | Implement operator-by-operator test matrix and stage with_paths_to early |
| Auth model implemented late | Tenant boundary/security regressions | Build auth scaffolding in Phase 1 and enforce in Phase 4 contract tests |
| mTLS hardening deferred from MVP | Weaker service-auth assurance during early rollout | Restrict network paths, rotate connector API keys, and gate production hardening in Phase 8 |
| Connector/platform interface changes late | Schedule slips | Lock JSON schema version early and gate changes through ADR |
| Sensitive metadata leakage | Compliance risk | Enforce allowlist at ingestion boundary, block raw writes |

11. Migration and Cutover Strategy

Strategy

  • Build new repo independently (sv0-platform).
  • Keep graph-mongo as legacy draft reference during build.
  • Validate parity and improvements via deterministic scenario tests.
  • Cut over only after Milestones M1-M8 are satisfied (including M1.5 thin-UI validation gate).

Cutover Gates

  • Critical scenarios pass in new platform.
  • Evidence artifacts validated by manual review.
  • Operational runbooks exercised in staging.
  • UI workflows validated end-to-end against live staging APIs.

12. Immediate Next Actions (Planning-Level)

  1. Run Phase 0 decision closure workshop (transport, scheduling, recompute, storage, SDK contract) and record signed outcomes.
  2. Create the new folder and initialize repository scaffold under sv0-platform.
  3. Convert this plan into tracked epics/issues with milestone mapping (including UI, normalizer, and auth tracks).
  4. Execute Phase 1 baseline implementation with auth middleware and worker-runtime skeleton.
  5. Run Phase 1.5 thin UI scaffolding spike (Dashboard + Findings List + Finding Detail) to validate early visualization flow.