ADR-008: Execution Chains Collection

Status

Accepted (2026-02-13). Amended 2026-05-19 — see the update block immediately below.

Update 2026-05-19 — last_seen_at is not a GC signal yet.

The Consequences section below ("Stale chains — chains for deleted entry points will linger until garbage collection removes them, tracked via last_seen_at") describes intent, not implementation. As of 2026-05-19, code audit of sv0-platform confirms:

last_seen_at is written at src/ingestion/chain-builder.ts:147 (every successful assembly) and surfaced through src/storage/mongo/adapters/execution-chain-adapter.ts:25.
last_seen_at is read for display and as a default sort field. It is not read by any GC pass — no GC code against execution_chains exists.

Net effect: stale chain anchors are immortal until manual cleanup. GC of orphaned chains is tracked as tech debt at sv0-platform#1176. A future ADR will address the GC trigger and the deletion semantics; this ADR is amended to remove the implicit claim that last_seen_at already does the job.

This amendment was made alongside ADR-026 (chain re-materialization triggers), which deliberately preserves the last_seen_at semantics so a future GC pass has a usable signal. ADR-026 rejected an evaluator-side re-materialization plan in part because that plan would have overwritten last_seen_at on every evaluation and made the signal unusable.

The historical text in the Consequences section below is preserved as-is; this block governs.

Context

The platform needs to track automation chains as durable, listable entities with stable identity across scans. The current model can reconstruct chains on the fly via BFS traversal of the entity graph, but this approach has significant limitations:

No listability — CISOs cannot ask "show me all automations that can reach HR data" without triggering a full graph traversal for every automation entry point
No bookmarking — a specific chain cannot be referenced by ID in findings, reviews, or exports
No diffing — comparing chain composition across sync versions requires reconstructing both versions from scratch
No versioning — there is no record of when a chain first appeared, when it changed, or what changed

CISO requirement

The core query pattern is: "List all automation chains, filter by blast radius domain, sort by sensitivity, show ownership status." This is a collection query, not a graph traversal. Without a dedicated collection, every such request requires one BFS traversal per automation entry point, so the cost grows linearly with the number of entry points.

Decision

New `execution_chains` collection

Create a new MongoDB collection that persists assembled execution chains as first-class entities.

Schema

{
  _id: "chain-uuid",
  tenant_id: String,
  name: String,                    // Human-readable (e.g., "BR: Auto-close incidents → Entra ID")
  anchor_entity_id: String,        // Entry point entity_id (stable root)
  entity_refs: [
    {
      entity_id: String,
      entity_type: String,         // resource | automation | connection | credential | identity | role | permission
      role: String                 // entry_point | code_component | outbound_target | auth_credential | destination_identity | trigger_resource | authorized_role | authorized_permission | target_resource
    }
  ],
  summary: {
    trigger: String,               // What initiates the chain (e.g., "incident.update")
    destination: String,           // Where the chain terminates (e.g., "graph.microsoft.com")
    egress_category: String,       // identity_provider | itsm | monitoring | unknown
    blast_radius_domains: [String],// Domains reachable via this chain
    ownership_status: String,      // owned | orphaned | degraded
    total_roles: Number,           // Count of roles in the chain's identity
    max_sensitivity: String,       // highest sensitivity level across chain entities
    canonical_permissions: { reads: [String], writes: [String] } // OAA-aligned permission labels
  },
  composition_hash: String,        // SHA256 of sorted entity_id:role pairs
  first_detected_at: Date,
  last_seen_at: Date,
  sync_version: Number
}
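To make the link between this schema and the CISO query pattern concrete, here is a minimal TypeScript sketch. It mirrors only the fields the query touches and runs against an in-memory array rather than MongoDB. The function names and the sensitivity ordering in `SENSITIVITY_RANK` are assumptions for illustration: this ADR does not enumerate sensitivity levels.

```typescript
// Types mirror a subset of the schema above; fields not needed here are omitted.
type OwnershipStatus = "owned" | "orphaned" | "degraded";

interface ChainSummary {
  blast_radius_domains: string[];
  ownership_status: OwnershipStatus;
  max_sensitivity: string;
}

interface ExecutionChain {
  _id: string;
  tenant_id: string;
  name: string;
  anchor_entity_id: string;
  summary: ChainSummary;
}

// ASSUMPTION: hypothetical sensitivity levels and ordering, not defined by the ADR.
const SENSITIVITY_RANK: Record<string, number> = {
  low: 0,
  medium: 1,
  high: 2,
  critical: 3,
};

// MongoDB-style filter document for "chains in this tenant that can reach this domain".
// Matching a scalar against an array field selects documents whose array contains it.
function chainsReachingDomainFilter(tenantId: string, domain: string) {
  return { tenant_id: tenantId, "summary.blast_radius_domains": domain };
}

// In-memory equivalent of that filter, followed by a sort on descending sensitivity.
// Unknown sensitivity labels sort last.
function listChainsByDomain(
  chains: ExecutionChain[],
  tenantId: string,
  domain: string
): ExecutionChain[] {
  const rank = (c: ExecutionChain) => SENSITIVITY_RANK[c.summary.max_sensitivity] ?? -1;
  return chains
    .filter(
      (c) => c.tenant_id === tenantId && c.summary.blast_radius_domains.includes(domain)
    )
    .sort((a, b) => rank(b) - rank(a));
}
```

Each returned document already carries `summary.ownership_status`, so the "show ownership status" part of the query needs no further lookup. Because `max_sensitivity` is stored as a string, a database-side sort would need either a numeric rank field or an aggregation step; that choice is outside what this ADR specifies.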

Chain identity

Chain identity is anchored to the entry point entity (e.g., a Business Rule's sys_id). This provides stable identity even when downstream components change:

If a Script Include is replaced, the chain updates but keeps the same _id (anchored to the Business Rule)
If the Business Rule itself is deleted, the chain is no longer observed during sync, so its last_seen_at stops updating

Chain roles

Each entity in the chain has a role describing its position:

| Role | Entity Type | Description | Example |
| --- | --- | --- | --- |
| `entry_point` | automation | Trigger: the automation that initiates the chain | Business Rule |
| `code_component` | automation | Logic: automations called within the chain | Script Include |
| `outbound_target` | connection | Where the chain makes external calls | REST Message |
| `auth_credential` | credential | Authentication material used for external calls | OAuth Profile |
| `destination_identity` | identity | The authenticating entity at the chain's terminus | Service Principal |
| `trigger_resource` | resource | Resource or event that triggers the entry-point automation | incident table |
| `authorized_role` | role | Role held by the destination identity | hr_admin |
| `authorized_permission` | permission | Permission granted by an authorized role | hr_case.write |
| `target_resource` | resource | Resource at the end of the authorization path | hr_case table |

Assembly process

Chains are assembled platform-side during sync processing:

BFS from each automation entity with role: entry_point
Follow typed edges: TRIGGERS_ON (reverse, to find trigger resource) -> CALLS -> INVOKES -> USES -> AUTHENTICATES_AS -> HAS_ROLE -> GRANTS -> APPLIES_TO
Collect entity refs with roles (see Chain Roles table above for entity_type → role mapping)
Compute composition_hash (SHA256 of sorted entity_id:role pairs)
Upsert into execution_chains: if composition_hash matches the existing chain for the same anchor_entity_id, update last_seen_at; otherwise create a new version of the chain
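The traversal and role-collection steps above can be sketched as follows. This is an illustration over an in-memory edge list, not the code in src/ingestion/chain-builder.ts. Several things are assumptions: the `Entity` and `Edge` shapes, the function name `assembleChain`, and the storage direction of TRIGGERS_ON (taken here to be resource to automation, which is why the entry point is matched on the edge's target). The sketch follows any of the listed forward edge types from each visited node and does not enforce the exact edge ordering.

```typescript
type EntityType =
  | "resource" | "automation" | "connection" | "credential"
  | "identity" | "role" | "permission";
type EdgeType =
  | "TRIGGERS_ON" | "CALLS" | "INVOKES" | "USES"
  | "AUTHENTICATES_AS" | "HAS_ROLE" | "GRANTS" | "APPLIES_TO";

interface Entity { entity_id: string; entity_type: EntityType }
interface Edge { from: string; to: string; type: EdgeType }
interface EntityRef { entity_id: string; entity_type: EntityType; role: string }

// Role assigned to an entity reached by forward traversal, keyed by entity type
// (taken from the Chain roles table).
const FORWARD_ROLE: Record<EntityType, string> = {
  automation: "code_component",
  connection: "outbound_target",
  credential: "auth_credential",
  identity: "destination_identity",
  role: "authorized_role",
  permission: "authorized_permission",
  resource: "target_resource",
};

const FORWARD_EDGES = new Set<EdgeType>([
  "CALLS", "INVOKES", "USES", "AUTHENTICATES_AS", "HAS_ROLE", "GRANTS", "APPLIES_TO",
]);

function assembleChain(
  entryId: string,
  entities: Map<string, Entity>,
  edges: Edge[]
): EntityRef[] {
  const entry = entities.get(entryId);
  if (!entry) throw new Error(`unknown entry point: ${entryId}`);

  const refs: EntityRef[] = [
    { entity_id: entry.entity_id, entity_type: entry.entity_type, role: "entry_point" },
  ];
  const visited = new Set<string>([entryId]);

  // Reverse step. ASSUMPTION: TRIGGERS_ON is stored as resource -> automation.
  for (const e of edges) {
    if (e.type !== "TRIGGERS_ON" || e.to !== entryId || visited.has(e.from)) continue;
    const trigger = entities.get(e.from);
    if (!trigger) continue;
    visited.add(trigger.entity_id);
    refs.push({ entity_id: trigger.entity_id, entity_type: trigger.entity_type, role: "trigger_resource" });
  }

  // Forward BFS over the remaining typed edges.
  const queue: string[] = [entryId];
  while (queue.length > 0) {
    const current = queue.shift()!;
    for (const e of edges) {
      if (e.from !== current || !FORWARD_EDGES.has(e.type) || visited.has(e.to)) continue;
      const next = entities.get(e.to);
      if (!next) continue;
      visited.add(next.entity_id);
      refs.push({ entity_id: next.entity_id, entity_type: next.entity_type, role: FORWARD_ROLE[next.entity_type] });
      queue.push(next.entity_id);
    }
  }
  return refs;
}
```

Scanning the full edge list on every step keeps the sketch short; a real assembler would more likely index edges by source entity. A resource that is both the trigger and the final target is recorded only once here, under the role it was first reached with.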

Composition fingerprint

The composition_hash enables efficient change detection:

SHA256(sort([
  "entity-001:entry_point",
  "entity-002:code_component",
  "entity-003:outbound_target",
  "entity-004:auth_credential",
  "entity-005:destination_identity"
]))

If any entity is added, removed, or changes role, the hash changes, triggering a new chain version.
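A minimal sketch of the fingerprint and the resulting upsert decision, using Node's built-in `crypto` module. The ADR specifies SHA256 over sorted entity_id:role pairs but not how the sorted list is serialized, so the newline delimiter below is an assumption, as are the names `compositionHash` and `upsertAction`.

```typescript
import { createHash } from "crypto";

interface ChainMember { entity_id: string; role: string }

// Build "entity_id:role" pairs, sort them so input order does not matter,
// and hash the result.
function compositionHash(members: ChainMember[]): string {
  const pairs = members.map((m) => `${m.entity_id}:${m.role}`).sort();
  // ASSUMPTION: pairs are joined with "\n" before hashing; the ADR does not say.
  return createHash("sha256").update(pairs.join("\n")).digest("hex");
}

type UpsertAction = "update_last_seen" | "new_version";

// Compare the stored hash for an anchor_entity_id with the freshly computed one.
function upsertAction(storedHash: string, computedHash: string): UpsertAction {
  return storedHash === computedHash ? "update_last_seen" : "new_version";
}
```

Sorting before hashing is what makes the fingerprint independent of BFS visit order: two syncs that discover the same members in a different order still produce the same hash.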

Phase 2: Temporal tracking (future)

Two additional collections for temporal comparison:

execution_chain_versions — snapshots of chain composition at each sync version, enabling "what changed" queries
execution_chain_events — lifecycle events (created, modified, entity_added, entity_removed, ownership_changed) for was-to-is delta views

Phase 2 enables queries like: "This chain gained access to HR data since last review" or "Show me all chains that changed ownership in the last 30 days."
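As a rough illustration of how the was-to-is delta might be derived from two version snapshots, the sketch below compares the entity_id:role pairs of a previous and a current composition and emits entity_added and entity_removed events. Phase 2 is not designed yet, so the `ChainEvent` shape and the function name `diffChainVersions` are hypothetical.

```typescript
interface VersionedRef { entity_id: string; role: string }

interface ChainEvent {
  type: "entity_added" | "entity_removed";
  entity_id: string;
  role: string;
}

// Compare two snapshots of a chain's composition.
// Additions are listed first (in current order), then removals (in previous order).
function diffChainVersions(
  previous: VersionedRef[],
  current: VersionedRef[]
): ChainEvent[] {
  const key = (r: VersionedRef) => `${r.entity_id}:${r.role}`;
  const before = new Set(previous.map(key));
  const after = new Set(current.map(key));

  const events: ChainEvent[] = [];
  for (const r of current) {
    if (!before.has(key(r))) {
      events.push({ type: "entity_added", entity_id: r.entity_id, role: r.role });
    }
  }
  for (const r of previous) {
    if (!after.has(key(r))) {
      events.push({ type: "entity_removed", entity_id: r.entity_id, role: r.role });
    }
  }
  return events;
}
```

Because the comparison key includes the role, an entity that keeps its ID but changes role shows up as one removal plus one addition, which matches how the composition hash treats a role change.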

Consequences

Positive

CISOs can list, filter, and search automation chains — standard MongoDB queries against a flat collection, no graph traversal required
Chain identity survives entity rotation — OAuth client_id change or SP credential rotation updates the chain but preserves its identity
Chain-level findings become possible — aggregate ownership gaps, blast radius expansion, and sensitivity drift can be evaluated at the chain level
Composition fingerprint enables efficient sync — only chains that actually changed are re-evaluated
Phase 2 temporal tracking enables "this chain gained access to HR data since last review"

Negative

One more collection to manage — acceptable trade-off per 6/6 team consensus
Assembly adds processing time during sync — BFS for each entry point; mitigated by running after entity upsert, not during
Stale chains — chains for deleted entry points will linger until garbage collection removes them (tracked via last_seen_at)

Neutral

Read-only projection — the execution_chains collection is derived from the entity graph; the graph remains the source of truth
No impact on existing APIs — chain endpoints are additive (new routes, not modifications to existing ones)

When to Reconsider

If chain assembly becomes a performance bottleneck during sync (would require incremental/streaming assembly)
If chain identity anchored to entry point proves insufficient (e.g., chains that share entry points but diverge downstream)

Analysis

Round 3 Synthesis — 5 agents, 5,760 lines, 6/6 unanimous
