Vision Alignment — Architect Perspective Analysis
Date: 2026-02-17
Author: Architect perspective (automated analysis)
Scope: Reconcile docs 06 and 07 against Sergey's W1 product vision, W1 scope/logic/UX specs, and the existing platform architecture (9-entity model, ADR-006/007/008).
1. Architectural Alignment Assessment
1.1 How doc 06's 4-concept model maps to the existing 9-entity data model
Doc 06 proposes four bounded concepts. Here is how each maps to existing architecture:
| Doc 06 Concept | Canonical Term | Current Architecture Artifact | Mapping Quality |
|---|---|---|---|
| AutomationDefinition | automation entity | entity_type: "automation" in 9-entity model (ADR-006) | Direct 1:1 match. Already a first-class entity type with subtypes (business_rule, script_include, flow_designer_flow, scheduled_job, event_script, transform_map). No new artifact required. |
| AutomationTopology | execution_chains collection | execution_chains collection (ADR-008) with BFS assembly, composition fingerprint, entity_refs with chain roles | Direct 1:1 match with naming mismatch. ADR-008's execution_chains collection is exactly what doc 06 calls "AutomationTopology" -- a sync-derived structural projection, not a runtime instance. The naming mismatch is the central problem doc 06 identifies. |
| AutomationRun | (not implemented) | No current artifact. Runtime proof is only available indirectly via execution_evidence records linked to entities. | No current mapping. Doc 06 correctly identifies this as a future concept. |
| ExecutionEvidenceEvent | execution_evidence entity | entity_type: "execution_evidence" in 9-entity model, with evidence_type (api_call, flow_execution, scheduled_job, sign_in) | Direct 1:1 match. Already a first-class entity type. |
Assessment: Three of four concepts already exist in the current architecture. The 4-concept model is primarily a naming and semantic clarity exercise, not an architectural restructuring. The only new architectural artifact (AutomationRun) is explicitly deferred to W2, which is correct.
1.2 How W1's scope contract intersects with the current architecture
W1 Scope (scope.md) declares these graph dependencies:
| W1 Requirement | Current Architecture Status | Gap |
|---|---|---|
| automation entities | Implemented (ADR-006) | None |
| identity entities | Implemented (ADR-006) | None |
| role / permission / resource | Implemented (01-data-model.md) | None |
| owner entities | Implemented (ADR-006, normalizer maps human_identity to owner) | None |
| execution_evidence entities | Implemented as first-class entity type | None |
| connection / credential (optional) | Implemented (ADR-006, ADR-007) | None |
| RUNS_AS traversal | Implemented in path materializer and graph | None |
| HAS_ROLE -> GRANTS -> APPLIES_TO traversal | Implemented in execution path computation | None |
| INVOKES -> USES -> AUTHENTICATES_AS traversal | Implemented (ADR-007) | None |
| OWNED_BY evaluation | Implemented with primary/secondary/inherited levels | None |
| Deterministic findings with evidence_completeness | Implemented (6-category model in finding schema) | None |
| Risk Cluster aggregation | Not implemented | New artifact needed |
| W1-specific finding types | Not implemented | New evaluator rules needed |
| proven/unproven execution status | Evidence exists; explicit binary status not standardized in entity or finding output | Small gap |
The current platform covers approximately 85% of W1's graph dependency contract. The meaningful gaps are: (1) Risk Cluster as a product primitive, (2) W1-specific finding types, and (3) explicit proven/unproven execution status on automations.
1.3 Impact of W1's explicit exclusions on doc 06/07 priorities
W1 Scope explicitly states it does NOT require:
- execution_chains persistence
- Chain fingerprinting
- Drift detection
- Temporal versioning
- Full blast radius traversal
This materially changes the priority of several doc 06/07 proposals:
| Doc 06/07 Proposal | W1 Impact |
|---|---|
| Rename execution_chains in UI | Lower priority. W1 UX does not surface execution chains as a navigation concept. The W1 investigation flow is Homepage -> Risk Clusters -> Findings -> Detail. There is no "Chains" page in the W1 flow. |
| Introduce proven/unproven execution status | High priority. W1 logic explicitly requires binary `execution_status = proven \| unproven`. |
| Add W1 Exposure view/projection | High priority. This is the primary user-facing artifact in W1. |
| AutomationRun persistence | Deferred to W2. Both doc 06 and W1 scope agree. |
| Topology/run correlation | Deferred to W2. W1 explicitly excludes this. |
| UI label changes (Q1-Q5 in doc 07) | Low priority for W1. The chain/topology pages are not part of the W1 investigation flow. Changes are valid but should not block W1 delivery. |
Key insight: W1's UX specification describes a new primary navigation paradigm (risk clusters -> findings -> detail) that does not use the current "Execution Chains" page at all. Most of doc 07's naming questions (Q1, Q3, Q5) become less urgent for W1 delivery because the pages they target are not part of the W1 investigation flow.
2. Concept Model Reconciliation
2.1 Doc 06's 4 concepts vs W1's scoped needs
Doc 06 proposes:
- AutomationDefinition
- AutomationTopology
- AutomationRun
- ExecutionEvidenceEvent
W1 Scope uses:
- automation (= AutomationDefinition)
- identity, role, permission, resource, owner (authorization + governance graph)
- execution_evidence (= ExecutionEvidenceEvent)
- connection, credential (optional, for egress)
W1 does NOT reference AutomationTopology or AutomationRun by any name.
Reconciliation:
- AutomationDefinition is required by W1 and already exists. No action needed.
- AutomationTopology is NOT required by W1. W1 Scope explicitly says: "Derived relationships (e.g., identity -> resource reachability) may be computed ephemerally during evaluation but are not required to be stored." This means W1 can compute the `Automation -> Identity -> Destination -> Data Domain` path at evaluation time without materializing execution chains.
- AutomationRun is NOT required by W1. W1's execution validation logic (logic.md section 1) only needs to determine `proven` vs `unproven` status by checking whether deterministic execution_evidence records exist and can be linked to the automation or its identity. This is a boolean check, not a run-instance correlation.
- ExecutionEvidenceEvent is required by W1 and already exists. No action needed.
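The boolean check described for AutomationRun above can be sketched as follows. This is a minimal illustration, not the platform's evaluator: the `EvidenceRecord` shape, its `linkedEntityIds` field, and the `deriveExecutionStatus` name are assumptions introduced here.

```typescript
// Hypothetical, simplified view of an execution_evidence record.
interface EvidenceRecord {
  id: string;
  // IDs of entities this evidence is deterministically linked to (assumed field).
  linkedEntityIds: string[];
}

type ExecutionStatus = "proven" | "unproven";

// "proven" when at least one evidence record links to the automation or to its
// execution identity; otherwise "unproven". No run instances are correlated.
function deriveExecutionStatus(
  automationId: string,
  identityId: string | null,
  evidence: EvidenceRecord[]
): ExecutionStatus {
  const linked = evidence.some(
    (e) =>
      e.linkedEntityIds.indexOf(automationId) !== -1 ||
      (identityId !== null && e.linkedEntityIds.indexOf(identityId) !== -1)
  );
  return linked ? "proven" : "unproven";
}

const sampleEvidence: EvidenceRecord[] = [
  { id: "ev-1", linkedEntityIds: ["identity-7"] },
];

// Evidence is linked to the automation's identity.
const statusA = deriveExecutionStatus("auto-1", "identity-7", sampleEvidence); // "proven"
// No identity binding and no evidence linked to the automation itself.
const statusB = deriveExecutionStatus("auto-2", null, sampleEvidence); // "unproven"
```

The function only answers whether linkage exists; it carries no notion of how many runs occurred or when, which is what keeps it separate from a future AutomationRun concept.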
2.2 What W1 actually needs that doc 06 does not name
Doc 06 identifies an important concept -- the ExposureUnit (section 5) -- as a derived projection:
ExposureUnit shape:
- automation_id
- execution_identity_id (or unknown)
- execution_status (proven | unproven)
- data_domains[] with sensitivity (or unknown)
- egress_category (llm | external | internal | unknown | none_observed)
- ownership_status (valid | invalid | ambiguous | unknown)
- deterministic explanation + evidence declaration
This aligns closely with what W1 UX calls a "Finding Row" (ux.md section 2A):
Each row represents one deterministic path:
Automation -> Identity -> Destination -> Data Domain
With display fields: Execution Mode, Last Execution, Executions (30d), Ownership Status, Data Domains, Egress Category.
The ExposureUnit and the W1 Finding Row are conceptually the same thing. The architect recommendation:
- The ExposureUnit is NOT a new persistence artifact for W1. It is the projection logic that the W1 evaluator computes from existing graph state and evidence.
- It maps naturally to the platform's existing Finding model. W1 findings (unproven_execution, unknown_identity_binding, reachable_sensitive_domain, etc.) are each a facet of the ExposureUnit.
- The ExposureUnit concept is valuable as an internal design abstraction but should not be a new collection or schema. W1 findings persisted in the Finding model, combined with entity graph data, are sufficient.
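As an internal design abstraction, the ExposureUnit shape listed earlier in this section could be written down as a type, with W1 finding types derived as facets of it. The sketch below is illustrative only: the `explanation` and `evidence_declaration` field names, the `DataDomainRef` type, and the `findingFacets` helper are assumptions, and "sensitive" follows the confidential-or-restricted reading used elsewhere in this analysis.

```typescript
type ExecutionStatus = "proven" | "unproven";
type EgressCategory = "llm" | "external" | "internal" | "unknown" | "none_observed";
type OwnershipStatus = "valid" | "invalid" | "ambiguous" | "unknown";

interface DataDomainRef {
  domain: string;
  sensitivity: string; // a sensitivity level, or "unknown"
}

// In-memory projection only; not a persisted schema.
interface ExposureUnit {
  automation_id: string;
  execution_identity_id: string; // an identity ID, or "unknown"
  execution_status: ExecutionStatus;
  data_domains: DataDomainRef[];
  egress_category: EgressCategory;
  ownership_status: OwnershipStatus;
  explanation: string;          // assumed field name
  evidence_declaration: string; // assumed field name
}

// Lists the W1 finding types that a single unit would surface.
function findingFacets(unit: ExposureUnit): string[] {
  const facets: string[] = [];
  if (unit.execution_status === "unproven") facets.push("unproven_execution");
  if (unit.execution_identity_id === "unknown") facets.push("unknown_identity_binding");
  const sensitive = unit.data_domains.some(
    (d) => d.sensitivity === "confidential" || d.sensitivity === "restricted"
  );
  if (sensitive) facets.push("reachable_sensitive_domain");
  if (unit.egress_category === "llm") facets.push("llm_egress");
  if (unit.egress_category === "external") facets.push("external_egress");
  if (unit.ownership_status !== "valid") facets.push("ownership_" + unit.ownership_status);
  return facets;
}

const exampleUnit: ExposureUnit = {
  automation_id: "auto-1",
  execution_identity_id: "unknown",
  execution_status: "unproven",
  data_domains: [{ domain: "finance", sensitivity: "restricted" }],
  egress_category: "llm",
  ownership_status: "invalid",
  explanation: "example only",
  evidence_declaration: "example only",
};

const exampleFacets = findingFacets(exampleUnit);
```

Keeping this as a type plus a pure function, rather than a collection, matches the recommendation that the unit stays a projection over findings and graph state.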
3. W1-Specific Architecture Implications
3.1 "Autonomous Authority Risk Clusters" -- what data model artifact supports this?
W1 UX (ux.md section 1B) describes:
Top 5 Autonomous Authority Risk Clusters
Clusters are prioritized by compound governance condition:
- Sensitive + LLM + Active + Invalid Owner
- Sensitive + External + Active
- Sensitive + External + Dormant Authority ...
A Risk Cluster is a deterministic aggregation of active findings grouped by compound condition. The W1 Logic doc confirms (logic.md section 7):
Risk grouping is a deterministic roll-up of active findings. Inputs: Execution status, Data domain sensitivity, Egress classification, Ownership status. Grouping does not replace canonical findings and does not introduce new risk semantics.
Architecture implication: Risk Clusters are NOT a new persisted entity. They are a computed view over the findings collection.
Required inputs for cluster computation (all currently available in the data model):
| Signal | Source |
|---|---|
| execution_status (active/dormant) | execution_evidence presence + recency (30d threshold) |
| data_domain_sensitivity (sensitive = confidential or restricted) | resource.sensitivity reachable via authorization path |
| egress_category (LLM/external/internal/unknown) | connection properties or execution_chains.summary.egress_category |
| ownership_status (valid/invalid/ambiguous/unknown) | OWNED_BY relationship evaluation |
Implementation options:
- Compute on demand -- API endpoint aggregates findings + entity signals at query time. Simple, stateless, no new persistence.
- Materialized view -- compute and store cluster membership during post-evaluation step. Faster reads, additional write complexity.
Architect recommendation: Start with compute-on-demand (option 1). W1 operates on periodic refresh, not real-time. The finding set for a single tenant in W1 is expected to be small (tens to low hundreds). An aggregation query over findings with entity lookups is feasible at this scale. If performance becomes an issue during pilot, materialize as a second step.
No new MongoDB collection is needed. A new API endpoint with in-memory aggregation over the findings collection is sufficient.
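A compute-on-demand aggregation of this kind might look like the sketch below. It assumes findings have already been joined with their entity signals into a flat `FindingSignals` row (a hypothetical shape), treats "Dormant Authority" as "not active", and encodes only the three compound conditions quoted from ux.md section 1B; the rest of that list is elided in the source and is not guessed here.

```typescript
// One row of signals per active finding (hypothetical, flattened shape).
interface FindingSignals {
  finding_id: string;
  sensitive: boolean; // confidential or restricted domain reachable
  egress: "llm" | "external" | "internal" | "unknown";
  active: boolean;    // execution evidence within the 30d threshold
  owner_status: "valid" | "invalid" | "ambiguous" | "unknown";
}

interface ClusterCondition {
  label: string;
  matches: (f: FindingSignals) => boolean;
}

// Array order is the priority order.
const CLUSTER_CONDITIONS: ClusterCondition[] = [
  {
    label: "Sensitive + LLM + Active + Invalid Owner",
    matches: (f) => f.sensitive && f.egress === "llm" && f.active && f.owner_status === "invalid",
  },
  {
    label: "Sensitive + External + Active",
    matches: (f) => f.sensitive && f.egress === "external" && f.active,
  },
  {
    label: "Sensitive + External + Dormant Authority",
    matches: (f) => f.sensitive && f.egress === "external" && !f.active,
  },
];

interface RiskCluster {
  label: string;
  finding_ids: string[];
}

// Stateless roll-up: groups findings by condition, drops empty clusters,
// and returns at most `limit` clusters in priority order.
function computeRiskClusters(findings: FindingSignals[], limit: number = 5): RiskCluster[] {
  const clusters: RiskCluster[] = [];
  for (const condition of CLUSTER_CONDITIONS) {
    const ids = findings.filter(condition.matches).map((f) => f.finding_id);
    if (ids.length > 0) clusters.push({ label: condition.label, finding_ids: ids });
  }
  return clusters.slice(0, limit);
}

const sampleFindings: FindingSignals[] = [
  { finding_id: "f1", sensitive: true, egress: "external", active: true, owner_status: "valid" },
  { finding_id: "f2", sensitive: true, egress: "llm", active: true, owner_status: "invalid" },
  { finding_id: "f3", sensitive: true, egress: "external", active: false, owner_status: "unknown" },
  { finding_id: "f4", sensitive: false, egress: "internal", active: true, owner_status: "valid" },
];

const topClusters = computeRiskClusters(sampleFindings);
```

Because the result is recomputed from findings on every call, clusters carry no identity of their own, which is consistent with treating them as a view rather than an entity.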
3.2 "Deterministic Path: Automation -> Identity -> Destination -> Data Domain" -- what is this?
W1 UX describes each finding row as representing "one deterministic path":
Automation -> Identity -> Destination -> Data Domain
This is neither an execution chain nor an execution path in the current data model's terminology:
| W1 Path Segment | Source in Current Model |
|---|---|
| Automation | entity_type: "automation" |
| Identity | Follow RUNS_AS from automation to identity |
| Destination | Follow INVOKES -> USES -> AUTHENTICATES_AS from automation through connection/credential to destination identity; OR follow AUTHENTICATES_TO from the RUNS_AS identity to a target system identity |
| Data Domain | Follow HAS_ROLE -> GRANTS -> APPLIES_TO -> resource from the terminal identity; extract business_domain and sensitivity from the resource |
It is a W1 Exposure Path -- a simplified, deterministic projection that starts at automation, resolves identity via RUNS_AS, and terminates at the first reachable data domain via the authorization path.
This path is an ephemeral traversal computed during W1 evaluation. The traversal logic is bounded (first provable boundary only, per W1 scope). It does not require execution_chains persistence.
The W1 evaluator needs a path computation function that can:
- Start from an automation entity
- Resolve the RUNS_AS identity (or mark unknown)
- Follow the authorization path to resources (bounded, first hop only)
- Classify egress via the connection/credential path
- Return a structured result suitable for finding creation
Recommendation: Add a structured exposure_path field to W1 finding documents containing the collapsed path: { automation_id, automation_name, identity_id, identity_name, destination_resource_id, destination_name, data_domain, sensitivity }. This is a finding-level property, not a new collection. Implement a dedicated computeW1ExposurePath(automationId) function in the evaluator that does bounded graph traversal for W1's specific needs, rather than reusing the full chain-builder BFS.
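A bounded traversal along these lines is sketched below against an in-memory graph. Everything except the relationship names and the `exposure_path` field list is an assumption: the `EntityGraph` shape, the extra `graph` parameter, the use of `null` for unresolved segments, and the choice to follow only the first matching edge at each hop. Egress classification is left out to keep the sketch short.

```typescript
interface GraphEntity {
  id: string;
  name: string;
  entity_type: string;
  business_domain?: string;
  sensitivity?: string;
}

interface GraphEdge {
  type: string;
  from: string;
  to: string;
}

interface EntityGraph {
  entities: Record<string, GraphEntity | undefined>;
  edges: GraphEdge[];
}

interface ExposurePath {
  automation_id: string;
  automation_name: string;
  identity_id: string | null;
  identity_name: string | null;
  destination_resource_id: string | null;
  destination_name: string | null;
  data_domain: string | null;
  sensitivity: string | null;
}

// Follows the first edge of the given type out of `from`, or returns null.
function firstTarget(graph: EntityGraph, from: string, type: string): string | null {
  for (const e of graph.edges) {
    if (e.from === from && e.type === type) return e.to;
  }
  return null;
}

// Bounded walk: RUNS_AS, then HAS_ROLE -> GRANTS -> APPLIES_TO, one edge per hop.
function computeW1ExposurePath(graph: EntityGraph, automationId: string): ExposurePath | null {
  const automation = graph.entities[automationId];
  if (!automation) return null;
  const path: ExposurePath = {
    automation_id: automation.id,
    automation_name: automation.name,
    identity_id: null,
    identity_name: null,
    destination_resource_id: null,
    destination_name: null,
    data_domain: null,
    sensitivity: null,
  };
  const identityId = firstTarget(graph, automationId, "RUNS_AS");
  if (identityId === null) return path; // identity stays unknown
  const identity = graph.entities[identityId];
  path.identity_id = identityId;
  path.identity_name = identity ? identity.name : null;

  const roleId = firstTarget(graph, identityId, "HAS_ROLE");
  const permissionId = roleId === null ? null : firstTarget(graph, roleId, "GRANTS");
  const resourceId = permissionId === null ? null : firstTarget(graph, permissionId, "APPLIES_TO");
  const resource = resourceId === null ? undefined : graph.entities[resourceId];
  if (resource) {
    path.destination_resource_id = resource.id;
    path.destination_name = resource.name;
    path.data_domain = resource.business_domain ?? null;
    path.sensitivity = resource.sensitivity ?? null;
  }
  return path;
}

const sampleGraph: EntityGraph = {
  entities: {
    "auto-1": { id: "auto-1", name: "Nightly Export", entity_type: "automation" },
    "auto-2": { id: "auto-2", name: "Unbound Script", entity_type: "automation" },
    "id-1": { id: "id-1", name: "svc_export", entity_type: "identity" },
    "role-1": { id: "role-1", name: "hr_admin", entity_type: "role" },
    "perm-1": { id: "perm-1", name: "hr_case.read", entity_type: "permission" },
    "res-1": { id: "res-1", name: "hr_case", entity_type: "resource", business_domain: "HR", sensitivity: "restricted" },
  },
  edges: [
    { type: "RUNS_AS", from: "auto-1", to: "id-1" },
    { type: "HAS_ROLE", from: "id-1", to: "role-1" },
    { type: "GRANTS", from: "role-1", to: "perm-1" },
    { type: "APPLIES_TO", from: "perm-1", to: "res-1" },
  ],
};

const resolvedPath = computeW1ExposurePath(sampleGraph, "auto-1");
const unboundPath = computeW1ExposurePath(sampleGraph, "auto-2");
```

A real evaluator would likely emit one path per reachable resource rather than stopping at the first edge, since each finding row represents one path; the single-edge walk here only demonstrates the bounding.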
3.3 W1 finding types vs current evaluator rules
Current evaluator rules (5):
| Current Rule (finding type) | Trigger Condition |
|---|---|
| orphaned_ownership | Entity with no active owner at any level |
| ownership_degraded | Primary owner decayed, secondary active |
| dormant_authority | Elevated permissions, no recent activity |
| scope_drift | Roles expanded without re-approval |
| privilege_justification_gap | Elevated permissions with no evidence of need |
W1 finding types (from scope.md section 2.7):
| W1 Finding Type | Current Overlap | Assessment |
|---|---|---|
| unproven_execution | Partially overlaps dormant_authority but semantically different: dormant_authority = has permissions + no activity; unproven_execution = has automation + no deterministic evidence linkage | New rule needed -- different trigger condition |
| unknown_identity_binding | No overlap | New rule needed -- automation has no deterministic RUNS_AS identity |
| reachable_sensitive_domain | Partially overlaps privilege_justification_gap but different framing: PJG = has elevated access with no evidence of need; reachable_sensitive = automation path reaches confidential/restricted domain | New rule needed -- different entity focus (automation-centric, not identity-centric) |
| llm_egress | No overlap | New rule needed -- automation path includes LLM-classified egress |
| external_egress | No overlap | New rule needed -- automation path includes external egress |
| ownership_invalid | Overlaps orphaned_ownership | May reuse existing rule with W1-specific finding_type label |
| ownership_ambiguous | Overlaps ownership_degraded concept and data model definition | May reuse existing rule with W1-specific label |
| ownership_unknown | No direct overlap | New rule needed -- no OWNED_BY relationship at all (distinct from all-owners-decayed) |
Assessment: W1 requires at least 5 new evaluator rules beyond the current set. Some ownership findings can map to existing rules with W1-specific labels, but the execution-centric findings (unproven_execution, unknown_identity_binding, reachable_sensitive_domain, llm_egress, external_egress) are fundamentally new.
Architectural implication: The new rules require graph traversal beyond what current rules perform. Current rules operate on single entities and their immediate relationships. W1 rules need to traverse the full Automation -> Identity -> Destination -> Data Domain path. The evaluator rule interface should be extended to provide a GraphContext that allows bounded traversal from the subject entity. Rules should not perform raw database queries; instead, the evaluator framework should pre-compute the relevant path context and pass it to each rule.
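One way to picture that extension is a rule signature that receives a pre-computed context and returns a finding or nothing. The sketch below is an assumption-laden illustration: the fields on `GraphContext`, the `RuleFinding` shape, and the `runRules` helper are all hypothetical, and only three of the W1 rules are shown.

```typescript
// Pre-computed path context handed to every rule (hypothetical fields).
interface GraphContext {
  automation_id: string;
  identity_id: string | null;        // null when no deterministic RUNS_AS identity exists
  reachable_sensitivities: string[]; // sensitivity of each resource on the bounded path
  egress_category: "llm" | "external" | "internal" | "unknown";
}

interface RuleFinding {
  finding_type: string;
  subject_id: string;
}

// A rule is a pure function of the context; it never queries the database.
type EvaluatorRule = (ctx: GraphContext) => RuleFinding | null;

const unknownIdentityBinding: EvaluatorRule = (ctx) =>
  ctx.identity_id === null
    ? { finding_type: "unknown_identity_binding", subject_id: ctx.automation_id }
    : null;

const reachableSensitiveDomain: EvaluatorRule = (ctx) => {
  const hit = ctx.reachable_sensitivities.some(
    (s) => s === "confidential" || s === "restricted"
  );
  return hit ? { finding_type: "reachable_sensitive_domain", subject_id: ctx.automation_id } : null;
};

const llmEgress: EvaluatorRule = (ctx) =>
  ctx.egress_category === "llm"
    ? { finding_type: "llm_egress", subject_id: ctx.automation_id }
    : null;

// The framework builds the context once and passes it to each rule in turn.
function runRules(ctx: GraphContext, rules: EvaluatorRule[]): RuleFinding[] {
  const findings: RuleFinding[] = [];
  for (const rule of rules) {
    const finding = rule(ctx);
    if (finding !== null) findings.push(finding);
  }
  return findings;
}

const sampleContext: GraphContext = {
  automation_id: "auto-1",
  identity_id: null,
  reachable_sensitivities: ["restricted"],
  egress_category: "llm",
};

const sampleRuleFindings = runRules(sampleContext, [
  unknownIdentityBinding,
  reachableSensitiveDomain,
  llmEgress,
]);
```

Keeping rules free of I/O means each one can be unit-tested with a hand-built context, and the traversal bound is enforced in one place instead of per rule.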
4. Naming Recommendations (Q1-Q5)
For each question, evaluated against three criteria:
- Survives beyond W1 -- name works for future wedges
- Aligns with 9-entity model and existing ADRs -- no semantic conflicts
- Avoids conflicts with W1's explicit scope constraints -- does not imply capabilities W1 excludes
Q1: What should we call Concept 2 (currently "Execution Chains")?
Recommendation: "Authority Chains"
Rationale:
- Survives beyond W1. "Authority" is the core concept across all wedges. The vision doc's central question is "on what authority?" (vision.md). Authority is durable vocabulary that applies to W2 drift detection, W3 remediation, and beyond.
- Aligns with 9-entity model. The chain traverses the authority graph (automation -> connection -> credential -> identity -> role -> permission -> resource). "Authority" describes what the chain represents: the path through which authority flows from an automation to its target resources. ADR-008's chain roles already use authority-aligned names: `authorized_role`, `authorized_permission`, `target_resource`.
- Avoids W1 conflicts. "Authority Chains" does not imply runtime execution, drift detection, or temporal versioning -- all of which W1 excludes. "Execution Chains" falsely implies runtime execution, which causes the confusion documented in doc 06.
Why not other options:
- "Configured Chains" -- vague; "configured" implies user configuration, but chains are platform-derived
- "Automation Chains" -- could still imply runtime; also redundant (chains are already anchored to automations)
- "Topologies" -- technically precise but jargon; sounds like network diagrams, not security constructs
- Keep "Chains" + subtitle -- does not fix root semantic issue; chains eventually appear in export/API contexts where subtitles are unavailable
Q2: Observed execution count column header
Recommendation: "Observed Runs (30d)"
Rationale:
- Survives into W2. When AutomationRun becomes a first-class concept, "Runs" is the natural terminology (per the Camunda/Airflow/Temporal patterns cited in doc 06 section 8).
- Aligns with W1 UX. W1 UX (ux.md section 2A) uses "Executions (30d)" as a row field. "Observed Runs (30d)" is compatible and more precise.
- Explicitly separates from configured capability. The count reflects runtime evidence, not structural configuration.
Q3: Timestamp label on chain detail pages
Recommendation: "Last Computed"
Rationale:
- Prevents the "this chain ran at T" misreading. The chain is recomputed at sync ingestion time, not at runtime. "Last Computed" makes this unambiguous.
- Aligns with W1's "as of last refresh" model. W1 UX displays "Last Refreshed: [timestamp]" globally. "Last Computed" for chains is consistent with this pattern.
- Against "Last Synced": While technically accurate, "synced" implies data transfer from a source system. The chain is assembled platform-side from already-synced entities -- the sync and the assembly are different operations.
Q4: Should the glossary upgrade from 3-concept to 4-concept model?
Recommendation: Option A -- stay at 3 concepts until W2 infrastructure lands.
Rationale:
- W1 does not require AutomationRun as even a conceptual reference. Adding it as "planned" creates expectation that W1 should deliver something run-related.
- W1 UX explicitly avoids referencing chain/topology/run internals. The glossary should reflect what the product actually exposes.
- When W2 scoping begins, upgrade the glossary to 4 concepts with concrete definitions tied to actual infrastructure.
However, an internal architecture note (not user-facing glossary) should document the 4-concept model for developer alignment. This already exists as doc 06 itself.
Q5: Nav item for the chains/topology page
Recommendation: "Authority Chains" with Shield icon (per Q1 resolution)
However, the more important architectural question is whether the chains page should be part of W1 navigation at all. W1 UX (ux.md section 4) describes:
- Start on Homepage.
- Review Top 5 Autonomous Authority Risk Clusters.
- Click a cluster.
- Review execution magnitude and ownership status in row.
- Expand row for summary.
- Open detail to inspect standing authority and linkage proof.
No alternate navigation. No graph browsing mode.
The W1 UX explicitly excludes a chains page from the primary investigation flow. Recommendation:
- For W1: Remove "Chains" from the primary navigation.
- For the platform's power-user/diagnostic mode (if retained alongside W1): Use "Authority Chains" with Shield icon.
5. Architectural Risks and Open Questions
5.1 Decisions in docs 06/07 that should be deferred because W1 explicitly excludes them
| Decision | Doc 06/07 Reference | W1 Status | Recommendation |
|---|---|---|---|
| AutomationRun persistence | Doc 06 section 4.1 concept 3 | Explicitly deferred to W2 | Defer. Do not design the runs schema until W1 pilots reveal whether run-level granularity is needed. |
| Topology/run correlation | Doc 06 section 7, Phase W2 | Out of scope | Defer. Requires AutomationRun and chain versioning, neither of which are W1 concerns. |
| Chain fingerprint versioning | ADR-008 Phase 2 (execution_chain_versions, execution_chain_events) | Explicitly excluded | Defer. No version tracking or drift detection in W1. |
| Chain detail page redesign | Doc 07 Phase 1, Q1/Q3/Q5 | Secondary -- W1 does not use chains page | Low priority. Fix only if the chains page is retained for diagnostic purposes. |
| Full blast radius traversal | Implied by chain assembly | Explicitly excluded (first boundary only) | Do not implement recursive traversal for W1. Use bounded, first-hop-only traversal. |
| Chain role naming mismatch fix | Doc 07 section 7.2 | Low priority | Defer. Valid fix but should not compete with W1-critical work. |
5.2 New architectural artifacts W1 requires that don't exist in the current architecture or in docs 06/07
| New Artifact | Description | Effort |
|---|---|---|
| W1 Evaluator Rules | 5+ new rules: unknown_identity_binding, unproven_execution, reachable_sensitive_domain, llm_egress, external_egress, plus ownership refinements. Requires bounded graph traversal context that the current evaluator does not provide. | Medium |
| W1 Exposure Path Computation | Bounded traversal function: automation -> RUNS_AS -> identity -> HAS_ROLE -> GRANTS -> APPLIES_TO -> resource + egress path. Simpler than full chain-builder BFS. | Small-medium |
| Risk Cluster API | Server-side aggregation endpoint that groups active W1 findings into compound-condition clusters and returns Top 5. No new persistence. | Small |
| W1 Finding Type Registry | Restricted allowlist of W1 finding_type values. Config/constant that restricts which finding types the W1 evaluator can emit. | Small |
| Proven/Unproven Execution Status | Binary classification per automation based on deterministic evidence linkage. Entity property or evaluation-time derivation. | Small |
| Egress Classification Logic | Deterministic classifier for outbound endpoints as LLM / External / Internal / Unknown. Static mapping + rule-based classification. | Small-medium |
| W1 Homepage | New page with posture summary, delta-since-refresh, and Top 5 Risk Clusters. | Medium (UI) |
| Since Last Refresh Delta | Discrete delta indicators from comparing current sync to previous sync state. May be partially available via events collection. | Small |
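The W1 Finding Type Registry row above could be as small as a constant plus a guard. The sketch below takes its values from the W1 finding types listed in section 3.3; the names `W1_FINDING_TYPES`, `isW1FindingType`, and `assertEmittable` are hypothetical.

```typescript
// Allowlist of finding_type values the W1 evaluator may emit.
const W1_FINDING_TYPES = [
  "unproven_execution",
  "unknown_identity_binding",
  "reachable_sensitive_domain",
  "llm_egress",
  "external_egress",
  "ownership_invalid",
  "ownership_ambiguous",
  "ownership_unknown",
] as const;

type W1FindingType = (typeof W1_FINDING_TYPES)[number];

// Type guard: true only for values on the allowlist.
function isW1FindingType(value: string): value is W1FindingType {
  return (W1_FINDING_TYPES as readonly string[]).indexOf(value) !== -1;
}

// Rejects anything outside the allowlist before a finding is written.
function assertEmittable(value: string): W1FindingType {
  if (!isW1FindingType(value)) {
    throw new Error("finding_type not allowed in W1: " + value);
  }
  return value;
}

const allowedExample = isW1FindingType("llm_egress");   // true
const rejectedExample = isW1FindingType("scope_drift"); // false
```

Existing platform finding types such as `scope_drift` are deliberately absent, so the W1 evaluator cannot emit them by accident even though both families share one collection.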
5.3 Should there be a new ADR for W1's Risk Cluster concept?
Yes, but scoped narrowly. Justification:
- Risk Clusters are a new product primitive that does not exist in any current ADR.
- They aggregate over findings using compound conditions -- this is a new evaluation pattern distinct from individual finding rules.
- The design decision (compute-on-demand vs materialize) has consequences for API design, caching strategy, and future wedge extensibility.
- The cluster condition schema (which signals combine, in what priority order) is a product-architecture contract that should be recorded.
Suggested scope for new ADR (adr-010-w1-risk-clusters.md):
- Define cluster as deterministic aggregation, not entity type
- Define cluster condition schema (the compound signal combinations)
- Decide compute-on-demand for W1, with materialization option for future
- Decide API contract (endpoint, response shape, filtering)
- Explicitly exclude scoring, ML-based ranking, probabilistic weighting
5.4 Additional open questions
- Finding-level vs entity-level proven/unproven status: Should `execution_status: proven | unproven` be a computed property on the `automation` entity document (updated per sync) or a per-finding attribute (computed during evaluation)? Entity-level is simpler for queries and W1 homepage metrics. Finding-level is more granular. Recommendation: entity-level property, computed during evaluation and written back to the entity, since W1 treats proven/unproven as a property of the automation, not of individual findings.
- W1 finding_type namespace: Should W1 finding types coexist with current evaluator finding types in the same `findings` collection, or should they be a separate partition? Recommendation: same collection, same schema. Use a `finding_source` or `wedge` field to filter. This preserves the single findings API endpoint while allowing W1-scoped queries.
- Cluster stability across refreshes: If a cluster condition changes rank between refreshes (e.g., cluster #1 drops to #3 because owners were reassigned), should the UI maintain visual stability or re-sort? W1 UX says "Top 5" without implying stable ordering, suggesting ephemeral aggregation is sufficient -- no cluster ID needed.
- W1 evaluator trigger timing: Current evaluator rules run as a post-ingestion worker. W1 rules need graph traversal context. Should the W1 evaluator run after the path materializer, or should it perform its own bounded traversal? Recommendation: own bounded traversal -- it keeps W1 decoupled from execution_chains infrastructure that W1 does not require.
- Relationship between existing findings and W1 findings: An automation may trigger both a current `dormant_authority` finding (via its RUNS_AS identity) and a W1 `unproven_execution` finding. These are related but distinct. Should the platform link them, deduplicate them, or treat them independently? Recommendation: treat independently. They have different finding_types, different evidence bases, and serve different user audiences.
- Egress classification source of truth: W1 needs to classify endpoints as LLM / External / Internal. For W1's deterministic requirement, should there be a canonical endpoint classification registry? Recommendation: yes, a simple static mapping (e.g., `*.openai.com -> LLM`, `*.anthropic.com -> LLM`, all others -> rule-based external/internal classification). This should be documented and versionable.
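For the egress registry in the last question, a static suffix table with a deterministic fallback might look like this. The two LLM entries come from the examples above; the `.corp.example` internal suffix, the version string, the decision to match the bare domain as well as subdomains, and the default of `external` for unmatched hosts are all assumptions made for illustration.

```typescript
type EgressCategory = "llm" | "external" | "internal" | "unknown";

interface EgressRule {
  suffix: string; // leading dot, e.g. ".openai.com" for "*.openai.com"
  category: EgressCategory;
}

// Versionable static registry; bump the version whenever entries change.
const EGRESS_REGISTRY_VERSION = "example-1";
const EGRESS_RULES: EgressRule[] = [
  { suffix: ".openai.com", category: "llm" },
  { suffix: ".anthropic.com", category: "llm" },
  { suffix: ".corp.example", category: "internal" }, // hypothetical internal domain
];

// True for any subdomain of the suffix and for the bare domain itself.
function hostMatches(host: string, suffix: string): boolean {
  if (host === suffix.slice(1)) return true;
  return (
    host.length > suffix.length &&
    host.slice(host.length - suffix.length) === suffix
  );
}

// Deterministic classification: first matching registry entry wins.
function classifyEgress(host: string | null): EgressCategory {
  if (host === null || host === "") return "unknown";
  const normalized = host.toLowerCase();
  for (const rule of EGRESS_RULES) {
    if (hostMatches(normalized, rule.suffix)) return rule.category;
  }
  return "external"; // assumption: unmatched hosts are treated as external
}

const llmHost = classifyEgress("api.openai.com");        // "llm"
const internalHost = classifyEgress("git.corp.example"); // "internal"
const otherHost = classifyEgress("example.org");         // "external"
const missingHost = classifyEgress(null);                // "unknown"
```

Matching on a dotted suffix rather than a substring avoids classifying look-alike hosts such as `notopenai.com` as LLM endpoints.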
6. Summary
W1 Delivery Path (immediate priorities)
- Implement W1-specific evaluator rules with bounded graph traversal
- Implement Risk Cluster computation as a server-side API endpoint
- Build W1 exposure path computation function (bounded, first-hop)
- Add proven/unproven execution status as computed automation property
- Add egress classification logic with static endpoint registry
- Create ADR for Risk Cluster concept
Keep from docs 06/07 (relevant, parallel or post-W1)
- Adopt "Authority Chains" naming for execution_chains (Q1)
- Use "Observed Runs (30d)" for execution count column (Q2)
- Use "Last Computed" for chain timestamps (Q3)
- Stay at 3 concepts in glossary until W2 (Q4)
- Fix stale API doc and chain-role naming mismatch (doc 07 section 7)
Defer (W2 or later)
- AutomationRun persistence
- Topology/run correlation and version tracking
- Chain fingerprint versioning (ADR-008 Phase 2)
- Full blast radius traversal beyond first boundary
- Chains page redesign (not part of W1 navigation)
Key architectural insight
W1 represents a product-level narrowing of the platform's capabilities into a CISO-facing exposure view. The current architecture already contains all the building blocks W1 needs (9-entity model, typed relationships, execution evidence, findings with evidence completeness). What W1 adds is:
- New evaluator rules that traverse the graph with W1's bounded constraints
- A new aggregation primitive (Risk Clusters) over findings
- A new UX paradigm (cluster-first investigation flow, not entity-table browsing)
None of these require changes to the core data model, storage schema, or ingestion pipeline. The architecture is ready for W1. The work is in the evaluator, the API layer, and the UI.