
Vision vs Delivered - Automation-First Realignment Plan (Codex Version)

Date: 2026-02-11
Status: Draft - separate comparison version


1. Executive Assessment

The platform has strong foundations (ingestion, evaluator, evidence packs, graph, temporal endpoints), but delivery is still mostly identity/findings-centric rather than automation execution-centric.

The biggest mismatch is not "missing infrastructure"; it is missing product semantics:

  1. Automation is not treated as the primary unit of investigation in UI and evaluator outputs.
  2. Event ordering is weakly aligned to source chronology.
  3. Ingestion is still connector-normalized batch processing, not a full platform-side correlation model.
  4. Business-flow continuity is not modeled separately from technical identity instances.

2. Vision-to-Delivery Comparison

2.1 What aligns well

  1. Deterministic, evidence-oriented architecture exists.
  2. Cross-system relationships and execution paths are materialized.
  3. Findings and evidence packs are generated and versioned.
  4. Graph focus mode and subgraph APIs exist.

2.2 Where current delivery diverges

  1. MVP1 progress tracker still shows open requirements for core per-automation outputs (inventory, execution detection, egress, origin, risk grouping, evidence pack assembly).
  2. The connector outputs rich automation fields, but the platform evaluator and UI do not systematically consume them.
  3. Diff events are timestamped using ingestion time, causing potential chronology distortion.
  4. temporalMarkers are accepted at ingest but not used for ordering or visualization.
  5. Execution-flow traversal does not strictly follow canonical flow semantics and remains partially generic.
  6. Graph UI has stale relationship filters/styles that have drifted from the current domain relationship set.
  7. Timeline UI contract appears misaligned with backend event shape (details/sync_version vs change_details/sync_id).

3. Must-Change Decisions

3.1 Product model: automation-first

Make automation flow the primary operator object:

  1. Add a dedicated automation inventory API and page.
  2. Provide per-automation state cards for: execution proof, identity binding, origin, egress, ownership, risk group.
  3. Keep findings as consequences, not primary navigation.

3.2 Temporal correctness

Adopt explicit timestamp semantics:

  1. effective_at (source event time)
  2. observed_at (connector fetch time)
  3. ingested_at (platform persist time)

Timeline, ordering, and replay should default to effective_at.
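
The three timestamps above can be sketched as a small event record with an ordering helper. This is an illustrative sketch, not the platform's actual schema: the Event class, the event_id tie-breaker, and the fallback from effective_at to observed_at to ingested_at are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Event:
    """Hypothetical event record carrying all three timestamps."""
    event_id: str
    ingested_at: datetime                    # platform persist time (always known)
    observed_at: Optional[datetime] = None   # connector fetch time
    effective_at: Optional[datetime] = None  # source event time

    def ordering_time(self) -> datetime:
        # Assumed fallback: prefer source time, then fetch time, then persist time.
        return self.effective_at or self.observed_at or self.ingested_at


def order_for_timeline(events):
    # event_id breaks ties so that replays of equal timestamps are reproducible.
    return sorted(events, key=lambda e: (e.ordering_time(), e.event_id))


def at(hour):
    return datetime(2026, 2, 11, hour, tzinfo=timezone.utc)


events = [
    Event("b", ingested_at=at(9), effective_at=at(5)),  # ingested last, happened first
    Event("a", ingested_at=at(8), effective_at=at(7)),
    Event("c", ingested_at=at(6)),                      # no source time available
]
print([e.event_id for e in order_for_timeline(events)])  # ['b', 'c', 'a']
```

Sorting by ingestion time alone would give c, a, b here, which is the chronology distortion described in 2.2.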

3.3 Canonical execution graph semantics

Enforce flow templates in graph APIs:

  1. Forward mode: origin -> trigger -> automation -> auth chain -> role/permission -> resource -> egress
  2. Reverse mode: egress/resource -> ... -> automation -> origin

Avoid mixing causal flow edges and generic neighborhood expansion in execution mode.
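
One way to enforce the templates above is to validate every candidate path against a fixed stage order before returning it in execution mode. The sketch below is a minimal illustration: the stage names, the function name, and the choice to allow repeated or skipped stages (for example, a multi-hop auth chain) are assumptions, not existing platform behavior.

```python
# Stage order taken from the forward template; identifiers are hypothetical.
FORWARD_STAGES = [
    "origin",
    "trigger",
    "automation",
    "auth_chain",
    "role_permission",
    "resource",
    "egress",
]


def is_canonical_path(stages, mode="forward"):
    """Return True if the stage sequence only moves along the flow template.

    Assumption: stages may repeat or be skipped, but a path may never step
    backwards or include a node type outside the template.
    """
    order = FORWARD_STAGES if mode == "forward" else FORWARD_STAGES[::-1]
    rank = {stage: index for index, stage in enumerate(order)}
    if not stages or any(stage not in rank for stage in stages):
        return False  # empty path or generic neighborhood node
    ranks = [rank[stage] for stage in stages]
    return all(a <= b for a, b in zip(ranks, ranks[1:]))


forward = ["origin", "trigger", "automation", "auth_chain",
           "auth_chain", "role_permission", "resource", "egress"]
print(is_canonical_path(forward))                                      # True
print(is_canonical_path(["resource", "automation", "origin"], "reverse"))  # True
print(is_canonical_path(["automation", "origin"]))                     # False
```

A path containing a node type outside the template is rejected outright, which keeps generic neighborhood expansion out of execution-mode results.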

3.4 Ingestion and correlation redesign

Move from "connector submits final normalized chain" to a staged platform pipeline:

  1. Fact import (all entities/edges/events)
  2. Correlation/identity resolution against existing graph
  3. Connection updates + low-risk deletion handling
  4. Flow projection/materialization
  5. Evaluation and evidence generation

4. UI Realignment Plan

4.1 Core UX objective

Users should be able to answer this question in under one minute:

"What automations are running, what data do they touch, where does data leave, who is accountable, and what changed recently?"

4.2 Display options for automation flow

Option A: Fixed columns

  1. Origin
  2. Trigger
  3. Automation
  4. Run-As/Auth
  5. Authority
  6. Resource
  7. Egress

Pros: highest clarity and strongest PRD alignment.

Option B: Split graph + event sequence

  1. Left: canonical chain
  2. Right: ordered event sequence by effective_at

Pros: best for investigations with change context.

Option C: Query-based path explorer

Path search from source/egress constraints, returns ranked deterministic chains.

Pros: powerful for advanced operators; steeper UX.

Proposed page structure:

  1. Automation Overview (new default landing)
  2. Automation Detail (chain + evidence + state)
  3. Graph Explorer (deep dive mode only)
  4. Findings (operational queue after automation triage)

5. Business vs Technical Identity Continuity

Problem: technical identifier changes (for example, client ID rotation) can look like a new execution chain, while business intent is unchanged.

Solution: two-level model.

  1. automation_instance for technical runtime identities and credentials.
  2. business_flow for stable business process identity.

Linking strategy:

  1. INSTANCE_OF from automation instance to business flow.
  2. SUPERSEDES between technical instances across rotations.
  3. Deterministic business-flow fingerprint from trigger, target domain/system, operation type, and accountable function.

Outcome: technical churn is preserved without losing business continuity.
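
The linking strategy can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the normalization (trim and lowercase), the use of SHA-256, and the in-memory history map are all hypothetical; only the four fingerprint inputs and the two edge types come from the plan above.

```python
import hashlib
import json


def business_flow_fingerprint(trigger, target_system, operation_type,
                              accountable_function):
    """Deterministic fingerprint built only from business-level attributes.

    Technical identifiers such as client IDs are deliberately excluded.
    """
    parts = [trigger, target_system, operation_type, accountable_function]
    canonical = json.dumps([p.strip().lower() for p in parts],
                           separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def link_instance(history_by_flow, instance_id, fingerprint):
    """Return the edges to create for a newly seen automation instance.

    history_by_flow maps fingerprint -> instance ids, oldest first.
    """
    edges = [(instance_id, "INSTANCE_OF", fingerprint)]
    history = history_by_flow.setdefault(fingerprint, [])
    if history and history[-1] != instance_id:
        edges.append((instance_id, "SUPERSEDES", history[-1]))
    if instance_id not in history:
        history.append(instance_id)
    return edges


history = {}
fp = business_flow_fingerprint("nightly-schedule", "crm", "export", "finance-ops")
first = link_instance(history, "client-id-001", fp)
rotated = link_instance(history, "client-id-002", fp)
print(len(first), len(rotated))  # 1 2
```

After the rotation, client-id-002 carries both an INSTANCE_OF edge to the same business flow and a SUPERSEDES edge to client-id-001, so the technical change is recorded without starting a new business flow.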


6. Ingestion Architecture Upgrade (Target State)

6.1 Pipeline stages

  1. Import: ingest source facts without over-collapsing.
  2. Resolve: correlate with existing entities via deterministic identity keys + alias map.
  3. Reconcile: create or update edges and apply controlled deletions.
  4. Project: recompute execution and automation flow views.
  5. Evaluate: run rule engine at flow and instance level.
  6. Publish: update UI projections and evidence references.
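
The Resolve and Reconcile stages could look roughly like the sketch below. Everything in it is an assumption for illustration: the identity key shape (system, kind, external_id), the alias map format, and the decision to report deletions as candidates rather than apply them immediately.

```python
def resolve(fact, key_index, alias_map):
    """Return the canonical entity id for an imported fact, or None if new.

    key_index maps identity key -> canonical entity id.
    alias_map maps a known alias key -> the identity key it stands for.
    """
    key = (fact["system"], fact["kind"], fact["external_id"])
    canonical_key = alias_map.get(key, key)
    return key_index.get(canonical_key)


def reconcile(existing_edges, imported_edges):
    """Compare stored edges with a full snapshot of imported edges.

    Missing edges are returned as deletion candidates so that a later,
    controlled step can decide whether removing them is low-risk.
    """
    existing, imported = set(existing_edges), set(imported_edges)
    return {
        "add": sorted(imported - existing),
        "delete_candidates": sorted(existing - imported),
    }


key_index = {("idp", "app", "old-id"): "entity-1"}
alias_map = {("idp", "app", "new-id"): ("idp", "app", "old-id")}
fact = {"system": "idp", "kind": "app", "external_id": "new-id"}
print(resolve(fact, key_index, alias_map))  # entity-1

plan = reconcile(
    existing_edges=[("entity-1", "USES", "role-a"), ("entity-1", "USES", "role-b")],
    imported_edges=[("entity-1", "USES", "role-b"), ("entity-1", "USES", "role-c")],
)
print(plan["add"], plan["delete_candidates"])
```

Note that the set difference is only meaningful when the import is a complete snapshot of the affected scope; an incremental delta would need explicit delete markers instead.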

6.2 Continuous update model

  1. Use persistent sync cursors and retention-safe refresh.
  2. Support low-rate incremental sync plus webhook-driven updates.
  3. Recompute only impacted flow neighborhoods, not full tenant graph.
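
Limiting recomputation to impacted neighborhoods can be approximated with a bounded breadth-first search from the changed nodes. The sketch below assumes an undirected adjacency map and a fixed hop limit; both the function name and the max_hops default are hypothetical, and a real implementation would more likely bound the search by flow membership than by hop count.

```python
from collections import deque


def impacted_neighborhood(adjacency, changed, max_hops=2):
    """Return all nodes within max_hops of any changed node."""
    seen = set(changed)
    queue = deque((node, 0) for node in changed)
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return seen


# A simple chain a - b - c - d - e, stored as an undirected adjacency map.
adjacency = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
print(sorted(impacted_neighborhood(adjacency, {"a"})))  # ['a', 'b', 'c']
```

Only flows that intersect the returned node set would then be re-projected and re-evaluated.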

7. Delivery Plan

Phase 0 - Contract lock (1 week)

  1. Freeze canonical execution flow semantics.
  2. Align PRD/walkthrough/progress language with actual behavior.
  3. Define API contracts for automation-first pages.

Phase 1 - Data model and ingestion core (2 weeks)

  1. Add business_flow and mapping tables.
  2. Add durable ingest idempotency ledger.
  3. Activate cursor-based incremental processing in platform pipeline.
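
As an illustration of item 2, an idempotency ledger can be as simple as a table with a unique key per ingested batch. The sketch below uses Python's built-in sqlite3 module purely for demonstration; the table name, column names, and batch_key format are assumptions, and the platform's actual store may differ.

```python
import sqlite3


def open_ledger(path=":memory:"):
    # Pass a file path instead of ":memory:" for a ledger that survives restarts.
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ingest_ledger ("
        " tenant_id TEXT NOT NULL,"
        " batch_key TEXT NOT NULL,"
        " PRIMARY KEY (tenant_id, batch_key))"
    )
    return conn


def claim_batch(conn, tenant_id, batch_key):
    """Return True the first time a batch is seen, False on any replay."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "INSERT OR IGNORE INTO ingest_ledger (tenant_id, batch_key)"
            " VALUES (?, ?)",
            (tenant_id, batch_key),
        )
    return cursor.rowcount == 1


ledger = open_ledger()
print(claim_batch(ledger, "tenant-1", "sync-42"))  # True
print(claim_batch(ledger, "tenant-1", "sync-42"))  # False
```

The primary key makes the claim atomic, so a retried or duplicated submission is detected before any downstream stage runs.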

Phase 2 - Temporal and event correctness (1 week)

  1. Add multi-timestamp event model.
  2. Update timeline APIs and UI adapters.
  3. Consume temporalMarkers in ordering logic.

Phase 3 - Evaluator and evidence alignment (2 weeks)

  1. Make MVP1 automation fields first-class evaluator inputs.
  2. Add automation/flow-level findings where needed.
  3. Extend evidence packs with continuity and flow-chain sections.

Phase 4 - UI automation-first rollout (2 weeks)

  1. Build Automation Overview and Automation Detail.
  2. Implement flow-lane visualization + reverse traversal view.
  3. Keep the graph explorer as a deep-dive investigation tool.

Phase 5 - Validation and rollout hardening (1 week)

  1. Backfill/migration for existing entities.
  2. Regression suite for ordering, continuity, and traversal semantics.
  3. Pilot validation with deterministic replay checks.

8. Success Criteria

  1. Analysts can trace any automation from origin to egress (and reverse) in three clicks or fewer.
  2. Timeline ordering is source-accurate and reproducible.
  3. Client ID/credential rotations preserve business-flow continuity.
  4. MVP1 per-automation outputs are visible and actionable in UI without custom filtering.
  5. Continuous low-rate update mode maintains graph correctness without full-sync dependence.

9. Implementation Notes

  1. This document is intentionally separate from plans/2026-02-10-vision-vs-delivered-analysis.md for side-by-side comparison.
  2. It prioritizes architecture and UX semantics over connector feature inventory.
  3. It is intended to drive engineering tickets across sv0-platform, sv0-connectors, and docs.