Reconciled Roadmap — Automation-First MVP + Post-MVP Architecture
Date: 2026-02-11 Status: Active
1. Three-Way Comparison
1.1 Our Analysis (2026-02-10)
Strengths:
- Concrete, immediately actionable — 7 implementation steps, all shipped
- Correct diagnosis: all 7 PRD requirements exist in storage, gap is entirely UI presentation
- Practical Phase A/B/C/D split: UI first, design docs for architecture, defer heavy backend
- Strong PRD requirement coverage mapping (Appendix table)
- oauth_entity.sys_id tracking as near-term fix for credential rotation
Weaknesses:
- Does not address temporal correctness (event ordering by ingestion time, not source time)
- Does not address timeline UI contract misalignment
- Execution flow visualization is simplified (step diagram only), not enforced at API level
- Business automation concept proposed but left entirely to Track 2
Verdict: Best for shipping quickly. Correctly prioritized.
1.2 Codex Version (2026-02-11)
Strengths:
- Sharper framing: "missing product semantics, not missing infrastructure"
- Temporal model is a real gap (effective_at/observed_at/ingested_at)
- Catches specific bugs: timeline UI contract mismatch, temporalMarkers not consumed, stale graph filters
- Canonical execution flow semantics (forward/reverse templates) is the right target
- business_flow + automation_instance two-level model is architecturally sound
- 6-stage ingestion pipeline is a clean target state
Weaknesses:
- Puts architecture BEFORE UI (Phases 0-3 are all backend, UI is Phase 4) — wrong order for pilot
- 9-week delivery plan is unrealistic for pre-pilot
- Does not reference what already exists (our implementation, Track 1 completeness)
- business_flow model requires new entity types, collections, relationships — heavy for Phase 1
- No concrete file-level implementation detail
Verdict: Best for architectural direction. Wrong execution order for pilot.
1.3 Combined Version (2026-02-11)
Strengths:
- Correctly identifies each plan's strengths
- Wave 0-4 structure is reasonable
- Prioritization matrix is balanced
Weaknesses:
- Written before UI shipped — Wave 0 is already done
- Still puts temporal model fix as P0 (medium effort) — debatable for pilot
- Doesn't distinguish "must fix for pilot" from "nice to have before pilot"
- Wave 3 (business_flow + correlation + incremental ingestion) is still pre-MVP in timing
Verdict: Good framework but needs updating with reality. Not directly usable as-is.
2. Assessment: Is the Codex Combined Version Valid?
Partially. The combined version is a reasonable synthesis but I reject it as the execution plan because:
- Wave 0 is done. We shipped Steps 1-7 (see 2026-02-11-ui-automation-first-implementation.md).
- Temporal model is not P0 for pilot. It's a correctness improvement but doesn't block the pilot demo. Ingestion timestamps are close enough for Phase 1 data.
- business_flow is overengineered for MVP. The user explicitly said: "focus on low hanging fruit, leave rearchitecture to post-MVP."
- Canonical flow enforcement needs design first. The API change (strict traversal templates) is non-trivial and risks breaking existing graph queries.
What I accept from Codex:
- Temporal correctness is a real gap — schedule it as an early post-pilot improvement
- Timeline UI contract fix — small, concrete, should be done
- Canonical flow semantics — right direction, but as a design doc now, implementation post-MVP
- business_flow fingerprint concept — valuable for Track 2 design, not for implementation now
3. Reconciled Roadmap
Phase 1: SHIPPED (Complete)
Automation-centric UI layer. See 2026-02-11-ui-automation-first-implementation.md.
| Item | Status |
|---|---|
| Backend: identity_subtype filter | Done |
| Automations list page (8 PRD columns, 4 filters, search) | Done |
| Automation detail page (flow diagram + 6 info cards + tabs) | Done |
| Dashboard redesign (automation summary + risk overview) | Done |
| Navigation (routes + sidebar) | Done |
| Graph: authority vs provenance edge styling | Done |
| Graph: path highlighting on node selection | Done |
Phase 2: LOW-HANGING FRUIT (Next — pre-pilot)
Small, targeted fixes that improve pilot quality without architecture changes.
| # | Item | Source | Effort | Impact | Status |
|---|---|---|---|---|---|
| 2A | Flow Discovery: expand trigger filter to include service_catalog | Scan validation 2026-02-11 | Small | Blocking: AI Triage Flow not discovered. Connector queries trigger_typeINrecord,schedule, excluding Service Catalog triggers. One-line filter fix + test. Flow will appear as identity_binding_status: "unlinked" (correct — API Key auth, no SP). | Done |
| 2B | Timeline UI contract fix | Codex §2.2.7 | N/A | False positive. Both backend (EventDoc) and UI types already use change_details/sync_id. No mismatch exists. | N/A |
| 2C | Graph filter dropdown: add missing relationship types | Codex §2.2.6 | Small | Aligned UI filter with backend vocabulary (14 types). Removed ghost HAS_CREDENTIAL. Added BELONGS_TO, DELEGATES_TO, APPROVED_BY, MEMBER_OF with styles. | Done |
| 2D | Execution flow mode: order by causal sequence | Codex §3.3 | Medium | Added execution-layer ranking + edge reversal for causal left-to-right ordering in Dagre. Reverse-traversed edges (OWNED_BY, AUTHENTICATES_TO, etc.) no longer confuse layout. | Done |
| 2E | oauth_entity.sys_id tracking in connector | Our §GAP4 | Small | Credential nodes now use oauth_entity.sys_id as stable identifier instead of client_id. Prevents false chain-break findings on credential rotation. targetRecordSysId added to execution chain AUTHENTICATES_TO edges. | Done |
| 2F | Flow diagram: handle missing relationships gracefully | Implementation | Small | Already implemented — AutomationFlowDiagram.tsx dims missing steps with opacity-60 and shows "Unknown". | Done |
| 2G | AutomationsPage: server-side filters for egress/ownership/risk | Implementation | Medium | Existing filters (egress/ownership/risk) were already server-side. Added execution_mode + security_relevance as new filter dimensions through full stack (EntityQuery → MongoDB → API → hooks → UI dropdowns). | Done |
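The edge reversal described for item 2D can be illustrated with a minimal sketch. It is written in Python for consistency with the connector examples in this document, although the shipped change lives in the UI layout code. The tuple shape and the function name `orient_for_layout` are assumptions for illustration; only OWNED_BY and AUTHENTICATES_TO are named in the roadmap as reverse-traversed edge types.

```python
# Edge types that are stored pointing "backwards" relative to the causal flow.
# OWNED_BY and AUTHENTICATES_TO come from the roadmap; the full list is not
# given there, so treat this set as incomplete.
REVERSE_TRAVERSED = {"OWNED_BY", "AUTHENTICATES_TO"}


def orient_for_layout(edges):
    """Return (source, target, type) tuples oriented for a left-to-right layout.

    Reverse-traversed edge types have source and target swapped so that a
    layered layout engine ranks the causal origin to the left.
    """
    oriented = []
    for source, target, edge_type in edges:
        if edge_type in REVERSE_TRAVERSED:
            source, target = target, source
        oriented.append((source, target, edge_type))
    return oriented


if __name__ == "__main__":
    print(orient_for_layout([("flow", "admin", "OWNED_BY"), ("flow", "sp", "RUNS_AS")]))
```

Only the copy handed to the layout engine is reoriented; the stored graph keeps its original edge direction.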
2A Details: Flow Discovery Fix
Root cause found 2026-02-11: Ran connector scan against live ServiceNow instance. The AI Triage Flow (Service Catalog trigger → REST action → Azure OpenAI) was NOT discovered. Log shows sys_hub_trigger_instance query returned 0 rows because the filter excludes service_catalog trigger type.
Scenario: sv0-documentation/docs/product/scenario-setup/flow-service-now.md
- Flow: "AI Triage via Azure OpenAI (Catalog Trigger)"
- Trigger: Service Catalog (sc_req_item creation)
- Auth: API Key (NOT OAuth), so no Azure SP binding is expected
- Egress: https://gpt-nano-for-summary.cognitiveservices.azure.com (external/LLM)
Fix scope (Option A — minimal):
- servicenow_client.py:1182: expand filter to trigger_typeINrecord,schedule,service_catalog
- Verify action endpoint extraction works for this flow's REST-sm action
- Run scan, confirm flow appears in graph with correct egress/ownership
- Import to local + deployed platform, verify in UI
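A minimal sketch of the filter expansion, assuming the connector builds the encoded query from a list of trigger types. The constant and function names are hypothetical and are not taken from servicenow_client.py; only the resulting query string appears in this document.

```python
# Trigger types the connector should discover. "record" and "schedule" were
# already queried; "service_catalog" is the addition described in the fix scope.
DISCOVERABLE_TRIGGER_TYPES = ("record", "schedule", "service_catalog")


def build_trigger_filter(trigger_types=DISCOVERABLE_TRIGGER_TYPES):
    """Return an encoded-query clause of the form trigger_typeINa,b,c."""
    if not trigger_types:
        raise ValueError("at least one trigger type is required")
    return "trigger_typeIN" + ",".join(trigger_types)


if __name__ == "__main__":
    print(build_trigger_filter())
```

Keeping the trigger types in one tuple means the accompanying test can assert on the exact query string.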
What the flow will look like after fix:
- identitySubtype: "flow_designer_flow", identity_binding_status: "unlinked"
- egress_category: "external" or "llm" (cognitiveservices.azure.com)
- ownership_status: "valid" (if creator is active admin)
- No RUNS_AS → SP edge (API Key auth, no OAuth entity match)
- Findings: orphaned_ownership may fire if creator is departed
Future Option B (post-pilot): Resolve Connection Alias → HTTP Connection tables for richer auth evidence. Not blocking for pilot.
Estimated effort: 3-5 days total.
2A Delivery Note (2026-02-12)
Scope drift from Option A plan. Code review (combined analysis doc) identified 4 bugs beyond the original "one-line filter fix" scope. All were fixed:
| Fix | Planned? | What changed |
|---|---|---|
| _parse_trigger_from_label_cache() two-pass parsing | No | First match returned empty reference; second entry with sc_req_item was skipped. Two-pass: prefer entries with non-empty table reference. |
| identity_binding_status RUNS_AS semantics | No | Was set to "bound" whenever execution data existed (even API-key flows). Now based on presence of RUNS_AS edge: OAuth/SP binding → "bound", no SP → "unlinked". |
| Connection alias resolution expansion | Partial (Option B) | flows_needing_endpoints excluded flows with actions but no extracted endpoints. Expanded filter condition. |
| Strict test assertions | No (test quality) | in ("bound", "unlinked") → == "unlinked" for API-key flows; == "" → == "sc_req_item" for label_cache. |
Result: 235 tests pass (59 unit + 22 integration + 24 CLI converter + 126 transformer + 4 e2e). All assertion changes align with the reconciled plan's expected behavior (identity_binding_status: "unlinked" for API-key flows).
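The two unplanned behavior fixes in the table above can be sketched as follows. This is an illustration under assumptions, not the connector's code: the entry shape (`{"table": ...}`), the edge shape, the `target_type` field, and the value `"service_principal"` are all hypothetical. The roadmap states only that two-pass parsing prefers entries with a non-empty table reference, and that binding status depends on a RUNS_AS edge to an OAuth/SP binding.

```python
def pick_trigger_table(entries):
    """Two-pass selection over label-cache entries.

    Pass 1 returns the first non-empty table reference. Pass 2 falls back to
    the first entry's value (possibly empty) so callers always get a string.
    """
    for entry in entries:
        if entry.get("table"):
            return entry["table"]
    return (entries[0].get("table") or "") if entries else ""


def identity_binding_status(edges, automation_id):
    """Return "bound" only if the automation has a RUNS_AS edge to an SP.

    Execution data alone no longer implies a binding, so API-key flows
    come out as "unlinked".
    """
    for edge in edges:
        if (
            edge["type"] == "RUNS_AS"
            and edge["source"] == automation_id
            and edge.get("target_type") == "service_principal"  # assumed field
        ):
            return "bound"
    return "unlinked"


if __name__ == "__main__":
    print(pick_trigger_table([{"table": ""}, {"table": "sc_req_item"}]))
    print(identity_binding_status([], "ai_triage_flow"))
```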
2A+ Delivery Note: Automation Filtering + Classification (2026-02-12)
Context: Live scan (2026-02-12) revealed 83% false positive rate — 77 of 92 identity nodes are RG4 + 0 exec + unlinked flows with no external egress. Graph Explorer became an unusable vertical line with 126 entities. Peer review confirmed root cause: OWNED_BY fan-in (87 edges → single admin) + RUNS_AS fan-in (85 edges → 3 human_identity nodes).
New properties added to automation nodes:
| Property | Values | Purpose |
|---|---|---|
| execution_mode | autonomous, operator_assisted, human_triggered, unknown | Classifies HOW the automation runs. Based on trigger types. |
| security_relevance | active_external, dormant_authority, internal_inventory | Classifies WHY it matters. Computed from egress + execution + binding signals. |
Connector-side pre-filter (opt-in): internal_inventory automations can be excluded from NormalizedGraph output via filter_internal_inventory=True. Default is OFF to preserve Phase 1 inventory completeness gate (PRD §1). When enabled, removes ~77 noise flows and their orphaned edges. Orphaned human_identity, resource, permission, and execution_evidence nodes are also cleaned up.
Key design insight (from peer review): Filtering is based on the combination of multiple signals, NOT on execution_mode alone. Example: AI Triage Flow has execution_mode: "operator_assisted" (service_catalog trigger) but security_relevance: "active_external" (LLM egress + 7 executions) — it is NOT filtered.
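A minimal sketch of the two classifiers, to show why they are independent. The rules here are assumptions: this document confirms only that a service_catalog trigger yields operator_assisted, and that external egress plus executions yields active_external. The mapping of record and schedule triggers to autonomous, the dormant_authority rule, and the `Signals` field names are illustrative guesses. human_triggered is left unmapped because the roadmap does not say which trigger values produce it.

```python
from dataclasses import dataclass

# Assumed trigger-to-mode mapping (only service_catalog is confirmed above).
_AUTONOMOUS = {"record", "record_create", "schedule", "daily", "weekly"}
_OPERATOR_ASSISTED = {"service_catalog"}


def execution_mode(trigger_type):
    """Classify HOW the automation runs, from its trigger type alone."""
    if trigger_type in _AUTONOMOUS:
        return "autonomous"
    if trigger_type in _OPERATOR_ASSISTED:
        return "operator_assisted"
    return "unknown"


@dataclass
class Signals:
    has_external_egress: bool
    execution_count: int
    is_bound: bool


def security_relevance(signals):
    """Classify WHY the automation matters, from combined signals."""
    if signals.has_external_egress and signals.execution_count > 0:
        return "active_external"
    if signals.has_external_egress or signals.is_bound:
        return "dormant_authority"  # assumed rule: authority exists but is idle
    return "internal_inventory"


if __name__ == "__main__":
    # AI Triage Flow: catalog trigger, LLM egress, 7 executions, API-key auth.
    print(execution_mode("service_catalog"))
    print(security_relevance(Signals(True, 7, False)))
```

Because `security_relevance` never reads the trigger type, an operator-assisted flow with live external egress is still classified as active_external and is not filtered.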
Correctness fixes (2026-02-12 review):
- P0: execution_mode mapping expanded to recognize granular ServiceNow trigger values (record_create, daily, weekly, etc.); generic values (record, schedule) never matched real data
- P0: Filtering changed from hard-applied to opt-in (filter_internal_inventory=False default) to preserve Phase 1 inventory gate
- P1: Orphan cleanup expanded from human_identity-only to all dependent node types (resource, permission, execution_evidence)
- P1: Glossary risk-group definitions corrected to match PRD §7 and risk_grouper.py implementation
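The opt-in filter and the expanded orphan cleanup can be sketched as below. The graph representation (a dict of node properties plus a list of `(source, target)` tuples) and the function name `filter_graph` are assumptions, and "orphaned" is taken here to mean a dependent node that was attached to a removed automation and has no remaining edges. The real connector may define it differently.

```python
# Node types cleaned up when they are left without edges (from the P1 fix).
DEPENDENT_TYPES = {"human_identity", "resource", "permission", "execution_evidence"}


def filter_graph(nodes, edges, filter_internal_inventory=False):
    """Optionally drop internal_inventory automations and resulting orphans.

    nodes: dict of node_id -> properties (with "type", "security_relevance").
    edges: list of (source_id, target_id) tuples.
    """
    if not filter_internal_inventory:
        return nodes, edges  # default OFF keeps the full inventory

    dropped = {
        node_id
        for node_id, props in nodes.items()
        if props.get("security_relevance") == "internal_inventory"
    }
    kept_edges = [e for e in edges if e[0] not in dropped and e[1] not in dropped]
    still_connected = {node_id for e in kept_edges for node_id in e}

    # Neighbours of dropped automations are the only orphan candidates.
    touched = {
        node_id
        for e in edges
        if e[0] in dropped or e[1] in dropped
        for node_id in e
    } - dropped
    orphans = {
        node_id
        for node_id in touched
        if node_id not in still_connected
        and nodes.get(node_id, {}).get("type") in DEPENDENT_TYPES
    }
    kept_nodes = {
        node_id: props
        for node_id, props in nodes.items()
        if node_id not in dropped and node_id not in orphans
    }
    return kept_nodes, kept_edges
```

A shared owner that still has an edge to a kept automation stays in the graph; only dependents left with no edges are removed.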
Architecture docs updated:
- glossary.md: Added Automation, Execution Mode, Security Relevance sections; corrected Risk Groups to match PRD
- integrations/servicenow/automation-types.md: Formalized execution_mode framework with trigger mapping
- architecture/01-data-model.md: Replaced execution_type with execution_mode + security_relevance
Result: 260 tests pass (84 PRD + 67 transformer + others). 20 new tests covering execution_mode classification (including granular trigger parametrized tests), security_relevance classification, opt-in filtering, and orphan resource cleanup.
Future refactoring note: The security_relevance property is always set on all automation nodes regardless of filter setting. When the platform moves to import-by-type-first ingestion (Phase 4F), the UI can use security_relevance to show/hide automations directly, making connector-side filtering unnecessary. See Phase 4 item 4I.
Phase 3: PILOT HARDENING (Before pilot delivery)
Items that make the pilot demo polished and robust.
| # | Item | Effort | Notes |
|---|---|---|---|
| 3A | Seed demo script with realistic automation data | Medium | Update scripts/seed-demo.ts with BR/Flow/Job/SI examples |
| 3B | Evidence pack: include automation flow chain section | Medium | Extend evidence pack content with origin→egress summary |
| 3C | Automation detail: link to findings with pre-filtered view | Small | Navigate to /findings?entity_id=X |
| 3D | Graph: execution_flow mode defaults when navigating from automation | Small | Auto-set mode=execution_flow, seed=automation_id |
| 3E | End-to-end test: ingest → evaluate → view automation → view finding → view evidence | Medium | Validation script |
Estimated effort: 3-5 days total.
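The deep links in items 3C and 3D amount to building query strings. A minimal sketch, written in Python for consistency with the other examples here even though the navigation itself lives in the UI: the `/findings` path and the `entity_id`, `mode`, and `seed` parameters come from the table above, while the `/graph` base path and the function names are assumptions.

```python
from urllib.parse import urlencode


def findings_url(entity_id):
    """Link from an automation detail page to its pre-filtered findings."""
    return "/findings?" + urlencode({"entity_id": entity_id})


def graph_url(automation_id, base_path="/graph"):
    """Link to the graph with execution_flow mode seeded on the automation.

    base_path is a placeholder; the real graph route is not named here.
    """
    query = urlencode({"mode": "execution_flow", "seed": automation_id})
    return base_path + "?" + query


if __name__ == "__main__":
    print(findings_url("auto-42"))
    print(graph_url("auto-42"))
```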
Phase 4: POST-MVP ARCHITECTURE (After pilot)
These are the right improvements but wrong timing for pilot. Design docs can be written now.
| # | Item | Source | Type | Notes |
|---|---|---|---|---|
| 4A | Temporal model (effective_at/observed_at/ingested_at) | Codex §3.2 | Design + Implement | Multi-timestamp event model, timeline ordering by source time |
| 4B | Canonical execution flow API | Codex §3.3 | Design + Implement | Strict forward/reverse traversal templates, not generic BFS |
| 4C | business_flow entity type | Codex §5 | Design + Implement | Two-level model, INSTANCE_OF/SUPERSEDES relationships, fingerprinting |
| 4D | Platform-side correlation engine | Our §Proposal 1 | Design + Implement | Deterministic correlation rules, cross-connector matching |
| 4E | Incremental entity API | Our §Proposal 2 | Design + Implement | Single-entity ingest, webhook-driven updates |
| 4F | Ingestion pipeline stages | Codex §6.1 | Design + Implement | Import→Resolve→Reconcile→Project→Evaluate→Publish |
| 4G | Swimlane graph layout | Our §Option A | Implement | Origin→Processing→Destination visual model |
| 4H | Evaluator: automation-level findings | Codex §Phase 3 | Implement | MVP1 fields as first-class evaluator inputs |
| 4I | Move security_relevance computation platform-side | 2A+ delivery (2026-02-12) | Design + Implement | When 4F (import-by-type ingestion) lands, remove connector-side internal_inventory pre-filter. Import all automations. Compute security_relevance in platform evaluator. Add UI filter toggle (show/hide internal_inventory). Enables analysts to inspect full inventory when needed. |
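For 4A, a minimal sketch of what "timeline ordering by source time" could look like with the three timestamps. The field semantics and the fallback order (effective_at, then observed_at, then ingested_at) are assumptions pending the ADR, and the `Event` class is illustrative rather than the platform's EventDoc.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Event:
    name: str
    ingested_at: datetime                    # when the platform stored it
    observed_at: Optional[datetime] = None   # when the connector saw it
    effective_at: Optional[datetime] = None  # when it happened at the source

    def source_time(self):
        """Best available source-side timestamp, falling back to ingestion."""
        return self.effective_at or self.observed_at or self.ingested_at


def timeline(events):
    """Order by source time; ingestion time only breaks ties."""
    return sorted(events, key=lambda e: (e.source_time(), e.ingested_at))


if __name__ == "__main__":
    utc = timezone.utc
    late_ingest = Event(
        "credential rotated",
        ingested_at=datetime(2026, 2, 11, 12, tzinfo=utc),
        effective_at=datetime(2026, 2, 10, 9, tzinfo=utc),
    )
    early_ingest = Event("flow executed", ingested_at=datetime(2026, 2, 11, 10, tzinfo=utc))
    print([e.name for e in timeline([early_ingest, late_ingest])])
```

An event ingested later but effective earlier sorts first, which ordering by ingestion time alone cannot provide.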
Recommended design docs to write now (unblocks Track 2):
- ADR: Temporal event model
- ADR: Business automation continuity
- ADR: Platform-side correlation engine
- ADR: Incremental ingestion API
- ADR: Security relevance classification (when to move from connector to platform)
4. Decision Log
| Decision | Rationale |
|---|---|
| Flow discovery (2A) is P0 — priority #1 | AI Triage Flow not discovered due to trigger filter gap. Blocking for pilot demo. |
| Option A (filter fix) before Option B (Connection Alias resolution) | One-line fix discovers the flow. Richer auth evidence is post-pilot. |
| API Key flows show as identity_binding_status: "unlinked" | Correct per PRD: no OAuth/SP binding means no deterministic join. |
| Ship UI before fixing temporal model | Pilot demo needs visual story, not perfect timestamps |
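| Opt-in internal_inventory pre-filter. | Default OFF (filter_internal_inventory=False) preserves the Phase 1 inventory completeness gate (PRD §1). |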
| Filter by signal combination, not trigger type | Peer review P0: human-triggered flows CAN be security-relevant if they have external egress + active execution (e.g., AI Triage). execution_mode and security_relevance are orthogonal. |
| Connector-side pre-filter is temporary | Will move to platform-side computation when import-by-type ingestion lands (4F→4I). |
| Defer business_flow to post-MVP | Requires new entity types + collections + relationships — too heavy |
| Defer ingestion redesign to post-MVP | Current batch model works for pilot, continuous sync is post-pilot |
| Write architecture ADRs now | Costs nothing, unblocks Track 2 when we get there |
| Accept Codex temporal model as target | Three-timestamp model is correct, just wrong timing |
| Accept Codex flow semantics as target | Forward/reverse templates are right, need design doc first |
| Reject Codex delivery ordering | Architecture-first ordering would delay pilot by weeks |
5. Success Criteria
For Pilot (Phases 1-3)
- Analyst can see all automations with PRD requirement outputs in dedicated page
- Clicking an automation shows the execution flow (trigger → automation → egress) visually
- Dashboard shows automation risk distribution, not just finding counts
- Graph highlights connected paths when selecting a node
- All 7 PRD requirements are visible per-automation without digging into raw properties
For Post-MVP (Phase 4)
- Timeline ordering reflects source chronology
- Credential rotation preserves business process continuity
- Platform correlates entities from multiple connectors
- Incremental updates work via webhook
- Origin-to-egress can be traced in both directions in 3 clicks
6. Source Documents
| Document | Date | Role |
|---|---|---|
| 2026-02-10-vision-vs-delivered-analysis.md | Feb 10 | Original gap analysis: practical, UI-focused |
| 2026-02-11-vision-vs-delivered-analysis-codex.md | Feb 11 | Codex version: architecture-focused, temporal rigor |
| 2026-02-11-vision-vs-delivered-analysis-combined.md | Feb 11 | Combined version: framework, not updated with reality |
| 2026-02-11-ui-automation-first-implementation.md | Feb 11 | What actually shipped (Steps 1-7) |
| product/scenario-setup/flow-service-now.md | Feb 11 | AI Triage Flow scenario: exposed trigger filter gap |
| This document | Feb 11 | Reconciled roadmap — the active execution plan |