
ADR-008: Execution Chains Collection

Status

Accepted (2026-02-13). Amended 2026-05-19 — see the update block immediately below.

Update 2026-05-19 — last_seen_at is not a GC signal yet.

The "Stale chains" item in the Consequences section below ("chains for deleted entry points will linger until garbage collection removes them (tracked via last_seen_at)") describes intent, not implementation. As of 2026-05-19, a code audit of sv0-platform confirms:

  • last_seen_at is written at src/ingestion/chain-builder.ts:147 (every successful assembly) and surfaced through src/storage/mongo/adapters/execution-chain-adapter.ts:25.
  • last_seen_at is read for display and as a default sort field. It is not read by any GC pass — no GC code against execution_chains exists.

Net effect: stale chains persist indefinitely unless they are cleaned up manually. GC of orphaned chains is tracked as tech debt at sv0-platform#1176. A future ADR will address the GC trigger and the deletion semantics; this ADR is amended to remove the implicit claim that last_seen_at already does the job.

This amendment was made alongside ADR-026 (chain re-materialization triggers), which deliberately preserves the last_seen_at semantics so that a future GC pass has a usable signal. ADR-026 rejected an evaluator-side re-materialization plan in part because that plan would have overwritten last_seen_at on every evaluation, making the signal unusable.

The historical text in the Consequences section below is preserved as-is; this block governs.


Context

The platform needs to track automation chains as durable, listable entities with stable identity across scans. The current model can reconstruct chains on the fly via BFS traversal of the entity graph, but this approach has significant limitations:

  • No listability — CISOs cannot ask "show me all automations that can reach HR data" without triggering a full graph traversal for every automation entry point
  • No bookmarking — a specific chain cannot be referenced by ID in findings, reviews, or exports
  • No diffing — comparing chain composition across sync versions requires reconstructing both versions from scratch
  • No versioning — there is no record of when a chain first appeared, when it changed, or what changed

CISO requirement

The core query pattern is: "List all automation chains, filter by blast radius domain, sort by sensitivity, show ownership status." This is a collection query, not a graph traversal. Without a dedicated collection, every such request requires O(N) BFS traversals where N is the number of automation entry points.
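To make the "collection query" shape concrete, here is a minimal in-memory sketch of the list, filter, and sort pattern. The `ChainRow` type, the `listChains` helper, the sample data, and the sensitivity ordering (`low` < `medium` < `high` < `critical`) are assumptions for illustration; this ADR does not enumerate the sensitivity levels.

```typescript
// Trimmed-down, hypothetical view of a chain document; field names follow the
// summary fields in the Schema section of this ADR.
interface ChainRow {
  name: string;
  blast_radius_domains: string[];
  max_sensitivity: string;
  ownership_status: string;
}

// Assumed ordering of sensitivity levels; the ADR does not enumerate them.
const SENSITIVITY_RANK: Record<string, number> = {
  low: 0,
  medium: 1,
  high: 2,
  critical: 3,
};

// Return the chains that can reach `domain`, most sensitive first.
// Unknown sensitivity labels sort last.
function listChains(rows: ChainRow[], domain: string): ChainRow[] {
  const rank = (r: ChainRow): number => SENSITIVITY_RANK[r.max_sensitivity] ?? -1;
  return rows
    .filter((r) => r.blast_radius_domains.includes(domain))
    .sort((a, b) => rank(b) - rank(a));
}

// Hypothetical sample data.
const sampleRows: ChainRow[] = [
  { name: "chain-a", blast_radius_domains: ["hr"], max_sensitivity: "medium", ownership_status: "owned" },
  { name: "chain-b", blast_radius_domains: ["finance"], max_sensitivity: "critical", ownership_status: "orphaned" },
  { name: "chain-c", blast_radius_domains: ["hr", "finance"], max_sensitivity: "high", ownership_status: "degraded" },
];

const hrChains = listChains(sampleRows, "hr");
console.log(hrChains.map((c) => `${c.name} (${c.ownership_status})`));
```

Against a real collection, the same filter would presumably be a single query on the persisted summary fields, with no per-entry-point traversal.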


Decision

New execution_chains collection

Create a new MongoDB collection that persists assembled execution chains as first-class entities.

Schema

{
  _id: "chain-uuid",
  tenant_id: String,
  name: String,                      // Human-readable (e.g., "BR: Auto-close incidents → Entra ID")
  anchor_entity_id: String,          // Entry point entity_id (stable root)
  entity_refs: [
    {
      entity_id: String,
      entity_type: String,           // resource | automation | connection | credential | identity | role | permission
      role: String                   // entry_point | code_component | outbound_target | auth_credential | destination_identity | trigger_resource | authorized_role | authorized_permission | target_resource
    }
  ],
  summary: {
    trigger: String,                 // What initiates the chain (e.g., "incident.update")
    destination: String,             // Where the chain terminates (e.g., "graph.microsoft.com")
    egress_category: String,         // identity_provider | itsm | monitoring | unknown
    blast_radius_domains: [String],  // Domains reachable via this chain
    ownership_status: String,        // owned | orphaned | degraded
    total_roles: Number,             // Count of roles in the chain's identity
    max_sensitivity: String,         // highest sensitivity level across chain entities
    canonical_permissions: { reads: [String], writes: [String] } // OAA-aligned permission labels
  },
  composition_hash: String,          // SHA256 of sorted entity_id:role pairs
  first_detected_at: Date,
  last_seen_at: Date,
  sync_version: Number
}
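For readers working in the TypeScript codebase, the schema above could be mirrored as types roughly like this. The type names (`ExecutionChain`, `EntityRef`, and so on) and every value in `exampleChain` except the ones quoted in the schema comments are hypothetical; the actual adapter types may differ.

```typescript
type EntityType =
  | "resource" | "automation" | "connection" | "credential"
  | "identity" | "role" | "permission";

type ChainRole =
  | "entry_point" | "code_component" | "outbound_target" | "auth_credential"
  | "destination_identity" | "trigger_resource" | "authorized_role"
  | "authorized_permission" | "target_resource";

interface EntityRef {
  entity_id: string;
  entity_type: EntityType;
  role: ChainRole;
}

interface ExecutionChain {
  _id: string;
  tenant_id: string;
  name: string;
  anchor_entity_id: string; // entry point entity_id (stable root)
  entity_refs: EntityRef[];
  summary: {
    trigger: string;
    destination: string;
    egress_category: "identity_provider" | "itsm" | "monitoring" | "unknown";
    blast_radius_domains: string[];
    ownership_status: "owned" | "orphaned" | "degraded";
    total_roles: number;
    max_sensitivity: string; // levels are not enumerated in this ADR
    canonical_permissions: { reads: string[]; writes: string[] };
  };
  composition_hash: string;
  first_detected_at: Date;
  last_seen_at: Date;
  sync_version: number;
}

// Hypothetical document; IDs, hash, dates, and most summary values are made up.
const exampleChain: ExecutionChain = {
  _id: "chain-uuid",
  tenant_id: "tenant-001",
  name: "BR: Auto-close incidents → Entra ID",
  anchor_entity_id: "entity-001",
  entity_refs: [
    { entity_id: "entity-001", entity_type: "automation", role: "entry_point" },
    { entity_id: "entity-002", entity_type: "automation", role: "code_component" },
  ],
  summary: {
    trigger: "incident.update",
    destination: "graph.microsoft.com",
    egress_category: "identity_provider",
    blast_radius_domains: ["hr"],
    ownership_status: "owned",
    total_roles: 1,
    max_sensitivity: "high",
    canonical_permissions: { reads: [], writes: ["hr_case.write"] },
  },
  composition_hash: "placeholder-hash",
  first_detected_at: new Date("2026-02-13T00:00:00Z"),
  last_seen_at: new Date("2026-02-13T00:00:00Z"),
  sync_version: 1,
};
```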

Chain identity

Chain identity is anchored to the entry point entity (e.g., a Business Rule's sys_id). This provides stable identity even when downstream components change:

  • If a Script Include is replaced, the chain updates but keeps the same _id (anchored to the Business Rule)
  • If the Business Rule itself is deleted, the chain document is not removed; it is simply no longer seen (last_seen_at stops updating)
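The two cases above can be sketched with an in-memory store keyed by tenant and anchor. This is an illustration under stated assumptions, not the platform's implementation: `StoredChain`, `AssembledChain`, `upsertChain`, the Map-based store, and the choice to bump `sync_version` only when the composition changes are all hypothetical.

```typescript
import { randomUUID } from "node:crypto";

interface StoredChain {
  _id: string;
  tenant_id: string;
  anchor_entity_id: string;
  composition_hash: string;
  first_detected_at: Date;
  last_seen_at: Date;
  sync_version: number;
}

// Hypothetical shape of a freshly assembled chain before it is persisted.
interface AssembledChain {
  tenant_id: string;
  anchor_entity_id: string;
  composition_hash: string;
  sync_version: number;
}

// Identity is looked up by (tenant_id, anchor_entity_id), never by composition.
function upsertChain(
  store: Map<string, StoredChain>,
  assembled: AssembledChain,
  now: Date,
): StoredChain {
  const key = `${assembled.tenant_id}:${assembled.anchor_entity_id}`;
  const existing = store.get(key);
  if (existing === undefined) {
    const created: StoredChain = {
      _id: randomUUID(),
      ...assembled,
      first_detected_at: now,
      last_seen_at: now,
    };
    store.set(key, created);
    return created;
  }
  if (existing.composition_hash !== assembled.composition_hash) {
    // Downstream component changed: same _id, new composition (assumed in-place update).
    existing.composition_hash = assembled.composition_hash;
    existing.sync_version = assembled.sync_version;
  }
  existing.last_seen_at = now;
  return existing;
}

const chainStore = new Map<string, StoredChain>();
const firstSync = upsertChain(
  chainStore,
  { tenant_id: "t1", anchor_entity_id: "br-1", composition_hash: "hash-a", sync_version: 1 },
  new Date("2026-02-13T00:00:00Z"),
);
const firstId = firstSync._id;
// Script Include replaced: the hash differs, the anchor is unchanged.
const secondSync = upsertChain(
  chainStore,
  { tenant_id: "t1", anchor_entity_id: "br-1", composition_hash: "hash-b", sync_version: 2 },
  new Date("2026-02-14T00:00:00Z"),
);
console.log(secondSync._id === firstId);
```

If the entry point is deleted, later syncs never call `upsertChain` for that anchor, so the stored document keeps its last `last_seen_at` value.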

Chain roles

Each entity in the chain has a role describing its position:

| Role | Entity Type | Description | Example |
| --- | --- | --- | --- |
| entry_point | automation | Trigger: the automation that initiates the chain | Business Rule |
| code_component | automation | Logic: automations called within the chain | Script Include |
| outbound_target | connection | Where the chain makes external calls | REST Message |
| auth_credential | credential | Authentication material used for external calls | OAuth Profile |
| destination_identity | identity | The authenticating entity at the chain's terminus | Service Principal |
| trigger_resource | resource | Resource or event that triggers the entry-point automation | incident table |
| authorized_role | role | Role held by the destination identity | hr_admin |
| authorized_permission | permission | Permission granted by an authorized role | hr_case.write |
| target_resource | resource | Resource at the end of the authorization path | hr_case table |
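One way to keep entity refs consistent with the roles listed above is a lookup from role to its expected entity type. The names `ENTITY_TYPE_BY_ROLE` and `isConsistentRef` are hypothetical; the mapping itself is copied from the role list.

```typescript
type EntityType =
  | "resource" | "automation" | "connection" | "credential"
  | "identity" | "role" | "permission";

type ChainRole =
  | "entry_point" | "code_component" | "outbound_target" | "auth_credential"
  | "destination_identity" | "trigger_resource" | "authorized_role"
  | "authorized_permission" | "target_resource";

// Role -> entity type, as listed in the Chain roles section.
const ENTITY_TYPE_BY_ROLE: Record<ChainRole, EntityType> = {
  entry_point: "automation",
  code_component: "automation",
  outbound_target: "connection",
  auth_credential: "credential",
  destination_identity: "identity",
  trigger_resource: "resource",
  authorized_role: "role",
  authorized_permission: "permission",
  target_resource: "resource",
};

// True when the ref's entity_type is the one expected for its role.
// Unknown roles are rejected.
function isConsistentRef(ref: { entity_type: string; role: string }): boolean {
  const expected = (ENTITY_TYPE_BY_ROLE as Record<string, unknown>)[ref.role];
  return expected === ref.entity_type;
}

console.log(isConsistentRef({ entity_type: "connection", role: "outbound_target" }));
```

Note that the mapping only goes one way: `automation` and `resource` each back two roles, so the role cannot be derived from the entity type alone.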

Assembly process

Chains are assembled platform-side during sync processing:

  1. BFS from each automation entity with role: entry_point
  2. Follow typed edges: TRIGGERS_ON (reverse, to find trigger resource) -> CALLS -> INVOKES -> USES -> AUTHENTICATES_AS -> HAS_ROLE -> GRANTS -> APPLIES_TO
  3. Collect entity refs with roles (see Chain Roles table above for entity_type → role mapping)
  4. Compute composition_hash (SHA256 of sorted entity_id:role pairs)
  5. Upsert into execution_chains: if composition_hash matches the existing chain for the same anchor_entity_id, update last_seen_at; otherwise create a new version
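Steps 1 to 3 can be sketched as a breadth-first walk over a typed edge list. Several things here are assumptions rather than facts from the codebase: the `Edge` shape, the `assembleChain` helper, the mapping from edge type to the role of the entity it reaches (inferred from the edge order and the role descriptions), and the storage direction of TRIGGERS_ON (assumed to point from the resource to the automation, so it is walked backwards from the entry point only).

```typescript
type EdgeType =
  | "TRIGGERS_ON" | "CALLS" | "INVOKES" | "USES"
  | "AUTHENTICATES_AS" | "HAS_ROLE" | "GRANTS" | "APPLIES_TO";

interface Edge { from: string; to: string; type: EdgeType; }
interface EntityRef { entity_id: string; role: string; }

// Assumed: the role of an entity is determined by the edge that reaches it.
const ROLE_BY_EDGE: Record<EdgeType, string> = {
  TRIGGERS_ON: "trigger_resource",
  CALLS: "code_component",
  INVOKES: "outbound_target",
  USES: "auth_credential",
  AUTHENTICATES_AS: "destination_identity",
  HAS_ROLE: "authorized_role",
  GRANTS: "authorized_permission",
  APPLIES_TO: "target_resource",
};

function assembleChain(entryPointId: string, edges: Edge[]): EntityRef[] {
  const refs: EntityRef[] = [{ entity_id: entryPointId, role: "entry_point" }];
  const visited = new Set<string>([entryPointId]);
  const queue: string[] = [entryPointId];

  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const edge of edges) {
      const reverse = edge.type === "TRIGGERS_ON";
      // Only the entry point's own trigger is collected.
      if (reverse && current !== entryPointId) continue;
      const source = reverse ? edge.to : edge.from;
      const target = reverse ? edge.from : edge.to;
      if (source !== current || visited.has(target)) continue;
      visited.add(target);
      refs.push({ entity_id: target, role: ROLE_BY_EDGE[edge.type] });
      // The trigger resource is a leaf; every other entity is expanded further.
      if (!reverse) queue.push(target);
    }
  }
  return refs;
}

// Hypothetical graph shaped like the examples in the Chain roles section.
const sampleEdges: Edge[] = [
  { from: "incident-table", to: "business-rule", type: "TRIGGERS_ON" },
  { from: "business-rule", to: "script-include", type: "CALLS" },
  { from: "script-include", to: "rest-message", type: "INVOKES" },
  { from: "rest-message", to: "oauth-profile", type: "USES" },
  { from: "oauth-profile", to: "service-principal", type: "AUTHENTICATES_AS" },
  { from: "service-principal", to: "hr_admin", type: "HAS_ROLE" },
  { from: "hr_admin", to: "hr_case.write", type: "GRANTS" },
  { from: "hr_case.write", to: "hr-case-table", type: "APPLIES_TO" },
];

const chainRefs = assembleChain("business-rule", sampleEdges);
console.log(chainRefs);
```

Scanning the whole edge list for every dequeued node keeps the sketch short; a real assembler would presumably index edges by source entity first.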

Composition fingerprint

The composition_hash enables efficient change detection:

SHA256(sort([
  "entity-001:entry_point",
  "entity-002:code_component",
  "entity-003:outbound_target",
  "entity-004:auth_credential",
  "entity-005:destination_identity"
]))

If any entity is added, removed, or changes role, the hash changes, triggering a new chain version.
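A minimal sketch of the fingerprint using Node's built-in crypto module follows. The ADR does not say how the sorted pairs are serialized before hashing, so joining them with a newline and emitting a hex digest are assumptions, as is the `compositionHash` name.

```typescript
import { createHash } from "node:crypto";

interface EntityRef { entity_id: string; role: string; }

// SHA-256 over the sorted "entity_id:role" pairs.
// Assumption: pairs are joined with "\n" and the digest is hex-encoded.
function compositionHash(refs: EntityRef[]): string {
  const pairs = refs.map((r) => `${r.entity_id}:${r.role}`).sort();
  return createHash("sha256").update(pairs.join("\n")).digest("hex");
}

const baseRefs: EntityRef[] = [
  { entity_id: "entity-001", role: "entry_point" },
  { entity_id: "entity-002", role: "code_component" },
  { entity_id: "entity-003", role: "outbound_target" },
  { entity_id: "entity-004", role: "auth_credential" },
  { entity_id: "entity-005", role: "destination_identity" },
];

// Traversal order does not matter because the pairs are sorted first.
const shuffledRefs = [...baseRefs].reverse();
console.log(compositionHash(baseRefs) === compositionHash(shuffledRefs));
```

Sorting before hashing is what makes the fingerprint independent of BFS visit order, so two syncs that discover the same entities in a different order still produce the same hash.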

Phase 2: Temporal tracking (future)

Two additional collections for temporal comparison:

  • execution_chain_versions — snapshots of chain composition at each sync version, enabling "what changed" queries
  • execution_chain_events — lifecycle events (created, modified, entity_added, entity_removed, ownership_changed) for was-to-is delta views

Phase 2 enables queries like: "This chain gained access to HR data since last review" or "Show me all chains that changed ownership in the last 30 days."
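Since Phase 2 is future work, the following is only a sketch of how two compositions might be compared to produce entity_added and entity_removed events. The `ChainEvent` shape and the `diffComposition` helper are hypothetical and do not describe a committed design.

```typescript
interface EntityRef { entity_id: string; role: string; }

interface ChainEvent {
  type: "entity_added" | "entity_removed";
  entity_id: string;
  role: string;
}

// Compare two compositions by their "entity_id:role" pairs.
function diffComposition(before: EntityRef[], after: EntityRef[]): ChainEvent[] {
  const key = (r: EntityRef): string => `${r.entity_id}:${r.role}`;
  const beforeKeys = new Set(before.map(key));
  const afterKeys = new Set(after.map(key));
  const events: ChainEvent[] = [];
  for (const r of after) {
    if (!beforeKeys.has(key(r))) {
      events.push({ type: "entity_added", entity_id: r.entity_id, role: r.role });
    }
  }
  for (const r of before) {
    if (!afterKeys.has(key(r))) {
      events.push({ type: "entity_removed", entity_id: r.entity_id, role: r.role });
    }
  }
  return events;
}

// Hypothetical example: a Script Include (entity-002) is replaced by entity-006.
const previousVersion: EntityRef[] = [
  { entity_id: "entity-001", role: "entry_point" },
  { entity_id: "entity-002", role: "code_component" },
];
const currentVersion: EntityRef[] = [
  { entity_id: "entity-001", role: "entry_point" },
  { entity_id: "entity-006", role: "code_component" },
];

const chainEvents = diffComposition(previousVersion, currentVersion);
console.log(chainEvents);
```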


Consequences

Positive

  • CISOs can list, filter, and search automation chains — standard MongoDB queries against a flat collection, no graph traversal required
  • Chain identity survives entity rotation — OAuth client_id change or SP credential rotation updates the chain but preserves its identity
  • Chain-level findings become possible — aggregate ownership gaps, blast radius expansion, and sensitivity drift can be evaluated at the chain level
  • Composition fingerprint enables efficient sync — only chains that actually changed are re-evaluated
  • Phase 2 temporal tracking enables "this chain gained access to HR data since last review"

Negative

  • One more collection to manage — acceptable trade-off per 6/6 team consensus
  • Assembly adds processing time during sync — BFS for each entry point; mitigated by running after entity upsert, not during
  • Stale chains — chains for deleted entry points will linger until garbage collection removes them (tracked via last_seen_at)

Neutral

  • Read-only projection — the execution_chains collection is derived from the entity graph; the graph remains the source of truth
  • No impact on existing APIs — chain endpoints are additive (new routes, not modifications to existing ones)

When to Reconsider

  • If chain assembly becomes a performance bottleneck during sync (would require incremental/streaming assembly)
  • If chain identity anchored to entry point proves insufficient (e.g., chains that share entry points but diverge downstream)

Analysis

Round 3 Synthesis — 5 agents, 5,760 lines, 6/6 unanimous