Access Chain — Identity-Anchored Control Primitive

GitHub Issue: SecurityV0/sv0-platform#216

Trigger: Sergey's March 26 feedback: "We should re-examine whether the current one-row-per-path presentation is the right unit for risk, drift, and remediation."

Reframe (March 31): Founder response elevated the question from grouping/noise-reduction to defining the canonical product unit. The access chain is the thing the customer acts on — not a grouped view over paths.

We show actual access chains based on observed execution, not just assigned permissions.

1. Current Data Model

Path Primary Key (4-tuple)

AuthorityPathDoc._id = SHA256(tenant_id : workload_id : identity_id : destination_id)

One row per (workload, identity, destination) triple within a tenant; tenant_id is the fourth component of the key. Multiple roles/actions reaching the same destination through the same identity are merged into a single path via composition_hash.

Existing Grouping Primitives

Primitive	Key	Indexed	Groups by
`path_lineage_id`	SHA256(tenant, workload, destination)	Yes	All identities reaching same destination from same workload
`identity_id`	Entity ID	Yes	All paths through one identity (queryable, not aggregated)
`workload_id`	Entity ID	Yes (compound)	All paths from one workload
`composition_hash`	SHA256(identity, roles, actions)	No	Change detection for upserts

What's Missing

No aggregation key for (identity + all destinations) or (identity + all data domains)
No API endpoint: "show all destinations reachable by identity X"
No UI component for collapsed/expanded identity grouping
No grouped drift delta showing role changes per identity across all affected paths
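
The missing "all destinations reachable by identity X" aggregation can be sketched as a query-time fold over path rows. This is a minimal illustration only: `PathRow` is a simplified, hypothetical view of `AuthorityPathDoc`, and `destinationsByIdentity` is a helper name invented here.

```typescript
// Hypothetical, simplified view of an authority path row.
interface PathRow {
  identity_id: string | null;
  workload_id: string;
  destination_id: string;
}

// Returns identity_id -> sorted list of distinct destination IDs.
// Unbound paths (identity_id === null) are skipped in this sketch.
function destinationsByIdentity(paths: PathRow[]): Map<string, string[]> {
  const sets = new Map<string, Set<string>>();
  paths.forEach((p) => {
    if (p.identity_id === null) return;
    const seen = sets.get(p.identity_id) ?? new Set<string>();
    seen.add(p.destination_id);
    sets.set(p.identity_id, seen);
  });
  const result = new Map<string, string[]>();
  sets.forEach((dests, id) => result.set(id, Array.from(dests).sort()));
  return result;
}
```

The same fold keyed on data domain instead of destination would cover the (identity + all data domains) case.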

2. Noise Analysis — Demo Data

Path Distribution

Metric	Count
Total authority paths	29
Unique identities (with binding)	7
Paths without identity (unbound)	6
Paths with identity	23
Avg paths per identity	3.3
Max paths for single identity	6 (id-svc-ascribe-prod)
Shared identity across workloads	1 (id-svc-foundry: 5 paths across 2 workloads)

Identity Fan-Out Detail

Identity	Workload(s)	Paths	Destinations	Roles
id-svc-finance	wl-invoice-rule	5	3 (financial-api, invoice-archive, apar-ledger)	5
id-svc-ascribe-prod	wl-ascribe-summarizer	6	5 (clinical-notes, oncology, psych, billing, health-files)	2
id-svc-foundry	wl-foundry-agent + wl-foundry-provisioner	5	5 (azure-openai, azure-func, sn-incident-api, logic-app, sn-incident-table)	4
id-svc-hr-sync	wl-hr-sync	2	2	2
id-svc-audit	wl-audit-export	2	2	1
id-svc-it-ops	wl-it-router	2	2	2
id-svc-sec-audit	wl-sec-logger	1	1	1

Noise Observations

id-svc-finance appears as 5 separate rows. The real story is: "one service account reaches 3 financial systems via 5 roles." That's one remediation conversation, not five.
id-svc-ascribe-prod appears as 6 rows touching clinical data. The risk is the combined reachable surface (5 healthcare destinations), not 6 individual path risks.
id-svc-foundry spans 2 workloads (agent + provisioner) — 5 rows. This is actually the most important case: one identity shared across workloads, meaning compromise of that identity affects both.
Remediation overlap: In the current flat view, the action "review roles assigned to id-svc-finance" would appear identically on 5 separate path rows.
Drift fragmentation: If id-svc-finance gains a new role, scope_drift fires on all 5 paths separately. The analyst sees 5 drift findings when the root cause is one role change on one identity.
Exposure is combinatorial. Risk is not the sum of isolated paths — it is the union of reachable surface. Several individually modest paths can create a materially different class of risk when combined across domains and systems.
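
The drift fragmentation described above can be illustrated by collapsing per-path `scope_drift` findings into one summary per identity. This is a sketch under assumptions: the `PathFinding` shape and the `collapseDriftByIdentity` helper are hypothetical and only carry the fields needed for the illustration.

```typescript
// Hypothetical minimal shape of a path-level finding.
interface PathFinding {
  finding_type: string;
  path_id: string;
  identity_id: string | null;
}

interface IdentityDriftSummary {
  identity_id: string;
  affected_path_ids: string[];
}

// Groups scope_drift findings by identity so the analyst sees one drift
// item per identity, with the affected paths kept as supporting evidence.
function collapseDriftByIdentity(findings: PathFinding[]): IdentityDriftSummary[] {
  const byIdentity = new Map<string, string[]>();
  findings.forEach((f) => {
    if (f.finding_type !== "scope_drift" || f.identity_id === null) return;
    const paths = byIdentity.get(f.identity_id) ?? [];
    if (paths.indexOf(f.path_id) === -1) paths.push(f.path_id);
    byIdentity.set(f.identity_id, paths);
  });
  const out: IdentityDriftSummary[] = [];
  byIdentity.forEach((affected_path_ids, identity_id) =>
    out.push({ identity_id, affected_path_ids })
  );
  return out;
}
```

For the id-svc-finance case, five path-level drift findings would collapse into a single summary listing five affected paths.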

Projected Scale

At production scale (100+ workloads, 50+ identities), the flat path list could grow to 500-2000 rows. Access chains would reduce this to 50-200 objects — a 5-10x reduction — and each object is directly actionable.

3. Impact Assessment

3.1 Evaluator Rules

14 entity-level rules: Evaluate on EntityDoc, not path. Reference entity.execution_paths[] for severity/evidence. No impact from access chain introduction.

10 path-level rules: Evaluate on individual AuthorityPathDoc. Finding IDs are keyed by path._id:

stableFindingId(tenantId, findingType, path._id)

Impact Area	Severity	Detail
Finding ID stability	HIGH if changed	Switching from path-keyed to identity-keyed finding IDs would orphan all historical findings
Path-level evaluation logic	None	Rules check individual path fields (sensitivity, execution_30d, via_roles) — still valid per-path
Drift comparison	None	scope_drift compares baseline vs current via_roles per path — works regardless of presentation

Conclusion: Path-level rules must continue to operate per-path. The access chain is a presentation/query layer — not a change to the evaluation primitive.

3.2 Risk Clusters

Risk cluster service groups paths by finding type combinations. Key metrics:

path_count    = matchingPathIds.length       // Stays as evidence count within access chains
identity_count = unique identity_ids          // Becomes access chain count
workload_count = unique workload_ids          // Stays meaningful as facet

Impact Area	Severity	Detail
path_count metric	Low	Stays valid — counts individual paths in cluster, even within access chains
identity_count	Changes meaning	Currently a derived count; becomes the primary access chain count
Cluster remediation	Medium	Currently deduplicates by `workloadName || workload_id`, not by identity. `runs_as_name` is optionally carried but is not the dedup key. Access-chain-level remediation requires a semantic change to deduplicate by identity.
Governance checklist	None	Already flags identity reuse: "N identities across M paths"

Conclusion: Clusters already track identity_count. Making access chains the primary dimension is a natural fit.
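
As an illustration of the three cluster metrics, the sketch below derives them from a list of matching paths. The `ClusterPath` shape and the `clusterMetrics` helper are hypothetical. It also assumes unbound paths do not contribute to `identity_count`; if unbound paths fall back to workload-keyed chains, the access chain count would be higher than this number.

```typescript
// Hypothetical minimal shape of a path matched by a risk cluster.
interface ClusterPath {
  _id: string;
  identity_id: string | null;
  workload_id: string;
}

interface ClusterMetrics {
  path_count: number;
  identity_count: number;
  workload_count: number;
}

function clusterMetrics(matchingPaths: ClusterPath[]): ClusterMetrics {
  const identities = new Set<string>();
  const workloads = new Set<string>();
  matchingPaths.forEach((p) => {
    if (p.identity_id !== null) identities.add(p.identity_id); // unbound paths skipped
    workloads.add(p.workload_id);
  });
  return {
    path_count: matchingPaths.length,
    identity_count: identities.size,
    workload_count: workloads.size,
  };
}
```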

3.3 Evidence Packs

Evidence sections reference entity.execution_paths[] (resource_id level), not path._id. The only path-ID dependency is in PathRemediationAction (remediation guidance).

Impact Area	Severity	Detail
authority_snapshot section	None	Lists all execution_paths for entity — identity-agnostic
scope_drift_detail section	None	Maps role changes to affected resources — identity-aware already
blast_radius section	None	Groups by sensitivity/domain — no path IDs
Remediation actions	Low	Include path_id for targeting — could include access_chain_id instead

Conclusion: Evidence packs are effectively unaffected. They're entity-scoped with resource-level detail.

3.4 API Layer

Current API: GET /api/v1/authority-paths returns flat list with identity_id filter support.

The access chain model needs a dedicated endpoint: GET /api/v1/access-chains. Flat path endpoint remains for backwards compatibility and evaluator-level queries.

3.5 UI

Currently flat tables in both AuthorityPathsListPage and RiskClusterDetailPage. One existing grouping computation:

// RiskClusterDetailPage — PathInlineExpand (lines 282-290)
const sameIdentityPaths = allPaths.filter((p) => p.identity?.id === identityId);
const distinctRoles = new Set(sameIdentityPaths.flatMap((p) => p.via_roles));
// Displays: "Identity total: N roles across M paths"

This is already doing identity-level aggregation — but only as an annotation inside expanded rows, not as a top-level object.

4. Option Analysis

Option A: Access chain as virtual first-class object

Approach: AuthorityPathDoc stays as-is. Introduce AccessChain as a virtual first-class object computed at query time. The access chain IS the product unit — paths are expandable supporting evidence within it.

Canonical axis for v1: identity. This is a product decision, not an open question. For the current wedge, identity is the anchor. Workloads, destinations, and data domains are context inside the access chain object. Unbound paths (where identity_id is null) fall back to workload as the grouping key, clearly labelled "[Unbound]".

Data model changes:

None to AuthorityPathDoc
New virtual type: AccessChain (computed at query time)

interface AccessChain {
  identity: { id: string; display_name: string; source_system: string } | null;
  workloads: Array<{ id: string; display_name: string }>;
  destinations: Array<{
    id: string;
    display_name: string;
    data_domain: string;
    sensitivity: string;
  }>;
  combined_roles: string[];           // union of via_roles across all paths
  combined_actions: string[];         // union of actions across all paths
  path_ids: string[];                 // underlying path IDs (supporting evidence)
  path_count: number;
  total_execution_30d: number;        // sum across paths
  last_execution_at: string | null;   // max across paths
  ownership_status: string;           // worst-case across paths
  max_finding_severity: string | null;
  finding_types: string[];            // union across paths
  active_finding_count: number;       // sum across paths
  behavior_pattern: string;           // see section 4.3
  chain_rank: number;                 // see section 4.2
}
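
A minimal sketch of the query-time computation follows. It covers only the structural fields (grouping key, unions, sums, max timestamp) and leaves findings, ranking, and behavior out. `PathInput`, `AccessChainCore`, and `buildAccessChains` are hypothetical names, the `"identity:"` / `"unbound:"` key prefixes are an assumption, and timestamps are assumed to be ISO 8601 UTC strings so that string comparison orders them correctly.

```typescript
// Hypothetical subset of AuthorityPathDoc fields used by the aggregation.
interface PathInput {
  _id: string;
  identity_id: string | null;
  workload_id: string;
  destination_id: string;
  via_roles: string[];
  execution_30d: number;
  last_execution_at: string | null; // ISO 8601 UTC (assumption)
}

// Hypothetical structural core of an AccessChain.
interface AccessChainCore {
  chain_key: string;
  identity_id: string | null;
  workload_ids: string[];
  destination_ids: string[];
  combined_roles: string[];
  path_ids: string[];
  path_count: number;
  total_execution_30d: number;
  last_execution_at: string | null;
}

function pushUnique(list: string[], value: string): void {
  if (list.indexOf(value) === -1) list.push(value);
}

function buildAccessChains(paths: PathInput[]): AccessChainCore[] {
  const chains = new Map<string, AccessChainCore>();
  paths.forEach((p) => {
    // Identity is the canonical axis; unbound paths fall back to workload.
    const key = p.identity_id !== null
      ? "identity:" + p.identity_id
      : "unbound:" + p.workload_id;
    let chain = chains.get(key);
    if (chain === undefined) {
      chain = {
        chain_key: key, identity_id: p.identity_id, workload_ids: [],
        destination_ids: [], combined_roles: [], path_ids: [],
        path_count: 0, total_execution_30d: 0, last_execution_at: null,
      };
      chains.set(key, chain);
    }
    const c = chain;
    pushUnique(c.workload_ids, p.workload_id);
    pushUnique(c.destination_ids, p.destination_id);
    p.via_roles.forEach((r) => pushUnique(c.combined_roles, r));
    c.path_ids.push(p._id);
    c.path_count += 1;
    c.total_execution_30d += p.execution_30d;
    if (p.last_execution_at !== null &&
        (c.last_execution_at === null || p.last_execution_at > c.last_execution_at)) {
      c.last_execution_at = p.last_execution_at;
    }
  });
  return Array.from(chains.values());
}
```

Because the grouping key is derived per request, nothing is written back to `AuthorityPathDoc`, which is the property Option A depends on.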

API changes:

New endpoint: GET /api/v1/access-chains?workload_id=&identity_id=&...
Existing flat endpoint unchanged (backwards compatible)

Evaluator impact: None. Path-level findings continue per-path. Access chains aggregate findings for presentation.

Evidence impact: None. Evidence packs stay entity/path-scoped.

Risk cluster impact: Minimal. Could add access_chain_count alongside path_count in cluster metadata.

4.1 One Remediation Object per Access Chain

Each access chain produces one primary remediation object. This is not visual dedup of path-level actions — it is a semantically distinct remediation primitive.

Example:

Reduce svc-finance from 5 roles to 2, with impact across 3 destinations.

Not 5 path-level actions that happen to collapse visually.

The access-chain remediation object contains:

Identity: which service account or principal to act on
Current state: N roles reaching M destinations across K workloads
Recommended action: reduce to target role set, with justification per removed role
Impact scope: which destinations and workloads are affected by the change
Supporting evidence: individual path-level findings that drive the recommendation

Path-level PathRemediationAction records continue to exist as evidence. The access-chain remediation object is the unit the customer sees and acts on. The relationship is: one AccessChainRemediation references N PathRemediationAction records and may produce one MitigationActionDoc (see #215).
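
One possible shape for the access-chain remediation object is sketched below. The `AccessChainRemediation` fields, the `ChainRemediationInput` type, and the `buildChainRemediation` helper are all hypothetical; the real relationship to `PathRemediationAction` and `MitigationActionDoc` is still to be defined. Per-role justification is omitted for brevity.

```typescript
interface ChainRemediationInput {
  identity_id: string;
  identity_display_name: string;
  current_roles: string[];
  target_roles: string[];
  destination_ids: string[];
  workload_ids: string[];
  path_action_ids: string[]; // IDs of supporting PathRemediationAction records
}

interface AccessChainRemediation {
  identity_id: string;
  current_roles: string[];
  target_roles: string[];
  removed_roles: string[];
  affected_destination_ids: string[];
  affected_workload_ids: string[];
  path_action_ids: string[];
  summary: string;
}

function buildChainRemediation(input: ChainRemediationInput): AccessChainRemediation {
  // Roles present today that are not part of the target role set.
  const removed = input.current_roles.filter(
    (r) => input.target_roles.indexOf(r) === -1
  );
  return {
    identity_id: input.identity_id,
    current_roles: input.current_roles,
    target_roles: input.target_roles,
    removed_roles: removed,
    affected_destination_ids: input.destination_ids,
    affected_workload_ids: input.workload_ids,
    path_action_ids: input.path_action_ids,
    summary:
      "Reduce " + input.identity_display_name +
      " from " + input.current_roles.length +
      " roles to " + input.target_roles.length +
      ", with impact across " + input.destination_ids.length + " destinations.",
  };
}
```

The point of the sketch is that the object is built from chain-level state (current versus target role set), with path-level actions attached only by reference.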

4.2 Access Chain Ranking Model

max_finding_severity and active_finding_count are not sufficient for prioritization. Access chains are ranked by a composite model driven by these factors:

Factor	Signal	Weight rationale
Blast radius	Number of distinct destinations reachable, weighted by data domain diversity	More destinations across more domains = wider damage if compromised
Sensitivity	Worst-case `sensitivity` across all destinations (restricted > confidential > internal > public)	A chain reaching one restricted system outranks one reaching five internal systems
Execution intensity	`total_execution_30d` normalized against peer chains in the same tenant	High-activity chains are confirmed attack surface, not theoretical
Drift	Count of `scope_drift` findings across paths, weighted by recency	Active drift signals loss of control — prioritize chains that are changing now
Cross-workload reuse	Number of distinct workloads sharing this identity	Shared identities are force multipliers — compromise spreads across workload boundaries
Ownership quality	`ownership_status` — orphaned > unknown > owned	Orphaned chains have no accountable owner and are hardest to remediate

The ranking model is deterministic (no ML, no probabilistic scoring). Implementation computes a composite rank per chain based on these factors and sorts the access chain list accordingly.
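
To make "deterministic composite rank" concrete, here is one possible sketch. Everything about the scoring is an assumption: each factor is normalized to the 0..1 range, blast radius is modeled as destinations times data domains, the six factors are averaged with equal weights, and ties are broken by `chain_id`. Drift recency weighting is left out. `RankInput`, `RankedChain`, and `rankChains` are hypothetical names.

```typescript
type Sensitivity = "public" | "internal" | "confidential" | "restricted";
type Ownership = "owned" | "unknown" | "orphaned";

interface RankInput {
  chain_id: string;
  destination_count: number;
  data_domain_count: number;
  worst_sensitivity: Sensitivity;
  total_execution_30d: number;
  scope_drift_count: number;
  workload_count: number;
  ownership_status: Ownership;
}

interface RankedChain { chain_id: string; score: number; chain_rank: number; }

const SENSITIVITY_SCORE: Record<Sensitivity, number> =
  { public: 0, internal: 1, confidential: 2, restricted: 3 };
const OWNERSHIP_SCORE: Record<Ownership, number> =
  { owned: 0, unknown: 1, orphaned: 2 };

function rankChains(chains: RankInput[]): RankedChain[] {
  // Peer maximum per factor, floored at 1 to avoid division by zero.
  const peak = (f: (c: RankInput) => number): number =>
    chains.reduce((m, c) => Math.max(m, f(c)), 1);
  const blast = (c: RankInput): number => c.destination_count * c.data_domain_count;
  const maxBlast = peak(blast);
  const maxExec = peak((c) => c.total_execution_30d);
  const maxDrift = peak((c) => c.scope_drift_count);
  const maxReuse = peak((c) => c.workload_count);

  const scored = chains.map((c) => ({
    chain_id: c.chain_id,
    score: (
      blast(c) / maxBlast +
      SENSITIVITY_SCORE[c.worst_sensitivity] / 3 +
      c.total_execution_30d / maxExec +
      c.scope_drift_count / maxDrift +
      c.workload_count / maxReuse +
      OWNERSHIP_SCORE[c.ownership_status] / 2
    ) / 6, // equal weights (assumption)
  }));
  // Highest score first; chain_id breaks ties so the order is reproducible.
  scored.sort((a, b) =>
    b.score - a.score || (a.chain_id < b.chain_id ? -1 : a.chain_id > b.chain_id ? 1 : 0)
  );
  return scored.map((s, i) => ({ chain_id: s.chain_id, score: s.score, chain_rank: i + 1 }));
}
```

Normalizing against peers in the same tenant means a chain's score can change when other chains change, even if the chain itself does not; that trade-off would need to be accepted or designed around.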

4.3 Behavior and Drift Narrative

Aggregating execution_30d and last_execution_at is not enough to tell the story of an access chain over time. Each access chain carries a behavior pattern classification:

Newly active: First observed execution within the last 30 days. May indicate a newly provisioned identity or a previously dormant one that has started executing.
Steadily active: Consistent execution volume over multiple observation windows. This is the baseline — expected behavior for a legitimate workload.
Bursty: Execution spikes followed by quiet periods. May indicate batch jobs, incident response automation, or anomalous usage patterns.
Expanding over time: The set of destinations or roles is growing across snapshots. More destinations or more roles than the previous baseline — drift in action.
Dormant but dangerous: No recent execution, but the identity retains broad permissions. These are standing privileges that could be exploited without triggering execution-based alerts.

The behavior pattern is derived from the execution history across snapshots (not a single point-in-time metric). It is displayed prominently on the access chain card to give the analyst immediate context about the chain's trajectory.
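
The classification could be sketched as a small decision function over snapshot history. The thresholds and the precedence order below are placeholders, not defined rules: expansion is checked first, any chain with no execution in the latest window is treated as dormant (the "broad permissions" test is not modeled), and "bursty" means the peak window is at least `burstRatio` times the mean. `ChainSnapshot` and `classifyBehavior` are hypothetical names.

```typescript
type BehaviorPattern =
  | "newly_active"
  | "steadily_active"
  | "bursty"
  | "expanding_over_time"
  | "dormant_but_dangerous";

// One observation window for a chain, oldest first in the history array.
interface ChainSnapshot {
  execution_count: number;
  role_count: number;
  destination_count: number;
}

function classifyBehavior(history: ChainSnapshot[], burstRatio = 2): BehaviorPattern {
  if (history.length === 0) {
    throw new Error("history must contain at least one snapshot");
  }
  const latest = history[history.length - 1];
  const earlier = history.slice(0, -1);
  const previous = earlier.length > 0 ? earlier[earlier.length - 1] : null;

  // Footprint grew relative to the previous snapshot.
  if (previous !== null &&
      (latest.role_count > previous.role_count ||
       latest.destination_count > previous.destination_count)) {
    return "expanding_over_time";
  }
  // No execution in the latest window, but the chain still exists.
  if (latest.execution_count === 0) return "dormant_but_dangerous";
  // Executing now, never before (also true for a single-snapshot history).
  if (earlier.every((s) => s.execution_count === 0)) return "newly_active";

  const counts = history.map((s) => s.execution_count);
  const mean = counts.reduce((a, b) => a + b, 0) / counts.length;
  const max = counts.reduce((a, b) => Math.max(a, b), 0);
  return max / mean >= burstRatio ? "bursty" : "steadily_active";
}
```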

Option A Summary

Pros:

Zero migration risk — path-level findings, IDs, evidence all unchanged
Backwards compatible API
Can ship incrementally (API first, then UI)
Access chain is the product unit from day one — not a toggle layered on top of flat paths
Ranking model and behavior narrative make each chain self-explanatory

Cons:

Access chain is computed at query time (no materialized document)
Two views coexist, so "which view is canonical for reports and evidence packs?" needs an explicit answer: access chain for customer-facing output, paths for evaluator internals

Option B: New primary entity — AccessChain as stored document

Approach: Materialize access chains as a new MongoDB collection alongside paths.

Data model changes:

New collection: access_chains
New document type: AccessChainDoc (materialized, not virtual)
New ID builder: buildAccessChainId(tenantId, identityId)
Path documents gain access_chain_id foreign key

API changes:

New resource: /api/v1/access-chains with full CRUD-like query support
Paths become children: /api/v1/access-chains/:id/paths
Findings could be queried at chain level: /api/v1/access-chains/:id/findings

Evaluator impact:

Path-level rules continue per-path (finding IDs unchanged)
Could add chain-level rules in the future (e.g., "this identity's combined blast radius exceeds threshold")
Would need chain refresh after path materialization

Evidence impact:

Evidence packs could reference access_chain_id in addition to path_id
New evidence section: "access chain snapshot" showing all destinations

Risk cluster impact:

Clusters could group by chain instead of path
identity_count becomes chain_count (same semantics, clearer naming)
Remediation targets by chain instead of path

Pros:

Clean, queryable entity with its own lifecycle
Enables chain-level findings, drift tracking, and remediation
Natural unit for "what can this identity reach?" question
Could support future "access review" workflow (approve/reject per access chain)

Cons:

New collection, new materialization step, new indexes
Chain-to-path sync complexity (what if paths change but chain isn't refreshed?)
Finding ID namespace decision: chain-scoped findings would need new IDs
More infrastructure to build and maintain
Risk of premature abstraction if the grouping key changes

5. Recommendation

Option A — access chain as virtual first-class object — is the right first step.

This is not a grouping toggle. It is an access-chain-first product model implemented via a virtual computation layer.

Rationale:

Lower risk. No migration, no new collection, no finding ID changes. Ship it and validate with real users before committing to a materialized entity.
Identity is the canonical axis for v1. This is a product decision per founder direction — not an open question. Workloads, destinations, and data domains are facets inside the access chain.
Demo data confirms the value. Even with 29 paths, access chains reduce the list to 7+6=13 objects (7 identity chains plus the 6 unbound paths), each directly actionable. The remediation deduplication alone justifies the model.
Infrastructure exists. The cluster detail page already computes identity totals. The API already supports identity_id filtering. The leap to an access chain endpoint is small.
Path to Option B is clear. If access chains prove to be the right permanent model, materializing AccessChainDoc from the virtual logic is straightforward. Option A is a reversible bet; Option B is not.

Proposed Sequence

Phase	Scope	Effort
Phase 1	API: `GET /api/v1/access-chains` returns `AccessChain[]` with identity as canonical axis	2-3 days
Phase 2	UI: Access-chain-first list page (replaces flat path list as default view)	2-3 days
Phase 3	UI: access-chain default on RiskClusterDetailPage, cluster navigation by chain	1-2 days
Phase 4	Remediation: one remediation object per access chain (deduplicated by identity). Must define how chain-level actions relate to path-level `PathRemediationAction` and `MitigationActionDoc` (#215).	3-4 days
Phase 5	Ranking + behavior: access chain ranking model and behavior pattern classification. Must define derivation rules for composite rank and how behavior patterns are computed from snapshot history.	3-4 days

Phases 1-3 are implementation-ready (~1 week). Phases 4-5 need one more design pass before implementation because they introduce access-chain-level remediation, ranking, and behavior as new product concepts. Total: ~2 weeks of focused work, with a design checkpoint between Phase 3 and Phase 4.

API Contract Sketch

GET /api/v1/access-chains?status=active

Response:
{
  "data": [
    {
      "identity": { "id": "id-svc-finance", "display_name": "svc-finance", "source_system": "entra_id" },
      "workloads": [{ "id": "wl-invoice-rule", "display_name": "Invoice Rule Engine" }],
      "destinations": [
        { "id": "res-financial-api", "display_name": "Financial API", "data_domain": "Financial", "sensitivity": "restricted" },
        { "id": "res-invoice-archive", "display_name": "Invoice Archive", "data_domain": "Financial", "sensitivity": "confidential" },
        { "id": "res-apar-ledger", "display_name": "AP/AR Ledger", "data_domain": "Financial", "sensitivity": "restricted" }
      ],
      "combined_roles": ["sn-role-finance-read", "sn-role-invoice-view", "sn-role-ap-write", "sn-role-ar-write", "sn-role-ledger-admin"],
      "path_count": 5,
      "path_ids": ["abc123", "def456", "..."],
      "total_execution_30d": 847,
      "last_execution_at": "2026-03-25T14:30:00Z",
      "ownership_status": "owned",
      "max_finding_severity": "high",
      "finding_types": ["scope_drift", "reachable_sensitive_domain"],
      "active_finding_count": 10,
      "behavior_pattern": "expanding_over_time",
      "chain_rank": 1
    }
  ],
  "meta": {
    "total_chains": 13,
    "total_paths": 29
  }
}

UX Model — Access-Chain-First

The product UX centers on the access chain, not on a table of paths with a grouping toggle. The mental model for every screen is:

One identity — who is this?
Actual access chain — what can it reach, via which roles?
What it reaches — destinations, data domains, sensitivity levels
Why it matters — ranking factors, behavior pattern, drift narrative
What to do next — one remediation action per chain

Access Chain List Page

The primary view is a ranked list of access chain cards. Each card is a self-contained summary:

┌─────────────────────────────────────────────────────────────────┐
│ #1  svc-finance                                          Owned  │
│     Invoice Rule Engine                                         │
│                                                                 │
│     5 roles → 3 destinations (Financial)                        │
│     847 exec/30d · Expanding over time                          │
│     Findings: scope_drift, reachable_sensitive_domain           │
│                                                                 │
│     Action: Reduce from 5 roles to 2,                           │
│             impact across 3 destinations                        │
├─────────────────────────────────────────────────────────────────┤
│ #2  svc-ascribe-prod                                  Orphaned  │
│     Ascribe Summarizer                                          │
│                                                                 │
│     2 roles → 5 destinations (Healthcare)                       │
│     312 exec/30d · Steadily active                              │
│     Findings: reachable_sensitive_domain                        │
│                                                                 │
│     Action: Assign owner, review clinical data access           │
├─────────────────────────────────────────────────────────────────┤
│ #3  svc-foundry                                       Orphaned  │
│     Foundry Agent + Foundry Provisioner                         │
│     !! Shared across 2 workloads                                │
│                                                                 │
│     4 roles → 5 destinations (Infrastructure, IT)               │
│     715 exec/30d · Bursty                                       │
│     Findings: scope_drift, cross_workload_identity              │
│                                                                 │
│     Action: Split into per-workload identities                  │
└─────────────────────────────────────────────────────────────────┘

Access Chain Detail Page

Clicking into a chain shows the full detail:

┌─────────────────────────────────────────────────────────────────┐
│ svc-finance · Invoice Rule Engine                        Owned  │
│                                                                 │
│ CHAIN SUMMARY                                                   │
│ 5 roles → 3 destinations │ 847 exec/30d │ Expanding over time   │
│ Rank: #1 of 13 chains                                          │
│                                                                 │
│ WHY IT MATTERS                                                  │
│ · Blast radius: 3 financial systems across 1 data domain        │
│ · Sensitivity: restricted (Financial API, AP/AR Ledger)         │
│ · Drift: scope expanded — 2 new roles since baseline            │
│ · Execution: 847 calls/30d, trending up                         │
│                                                                 │
│ WHAT TO DO NEXT                                                 │
│ Reduce svc-finance from 5 roles to 2, with impact across       │
│ 3 destinations. Remove: ap-write, ar-write, ledger-admin.       │
│ Keep: finance-read, invoice-view.                               │
│                                                                 │
│ DESTINATIONS (expandable supporting evidence)                   │
│ ▸ Financial API       restricted  │ 421 exec │ scope_drift      │
│ ▸ Invoice Archive     confidential│ 312 exec │ scope_drift      │
│ ▸ AP/AR Ledger        restricted  │ 114 exec │ scope_drift      │
│                                                                 │
│ ROLES                                                           │
│ finance-read, invoice-view, ap-write, ar-write, ledger-admin    │
└─────────────────────────────────────────────────────────────────┘

Expanding a destination row shows the underlying path-level detail: specific roles, actions, execution counts, and individual findings. This is the supporting evidence layer — not the primary frame.

6. Risks and Open Questions

Unbound paths. Paths with identity_id = null don't form access chains naturally. Proposed: group by workload for unbound paths, clearly labelled "[Unbound]".
Cross-workload identity sharing. id-svc-foundry spans 2 workloads. The access chain should surface this prominently — it's a higher-risk pattern and a force multiplier.
Pagination. Access chain response may not fit cursor-based pagination cleanly (chains have variable path counts). May need offset pagination or full-result with client-side virtualization.
Reports and exports. If reports currently list flat paths, they'll need an access-chain variant. Defer until after Phase 3.
Grouped drift aggregation. When drift is shown at the access chain level, how are severity, finding counts, and finding types derived? Rules: (a) max_finding_severity = worst across paths, (b) active_finding_count = sum across paths (with dedup against entity-level findings to avoid inflation), (c) finding_types = union across paths.
Behavior pattern derivation. Behavior classification requires access to multiple snapshots. Phase 5 must define which snapshot fields are compared and the threshold logic for each pattern category.
Ranking model calibration. The composite rank needs weight calibration against real customer environments. Start with equal weights, adjust based on feedback.
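
The three aggregation rules proposed under grouped drift aggregation, (a) through (c), can be sketched as follows. The severity scale and its ordering are assumptions (only "high" appears in this document), and deduplication is modeled simply as counting each `finding_id` once, which may not match how entity-level and path-level findings actually overlap. `ActiveFinding` and `summarizeChainFindings` are hypothetical names.

```typescript
// Assumed severity scale, lowest to highest.
type Severity = "low" | "medium" | "high" | "critical";
const SEVERITY_ORDER: Severity[] = ["low", "medium", "high", "critical"];

interface ActiveFinding {
  finding_id: string;
  finding_type: string;
  severity: Severity;
}

interface ChainFindingSummary {
  max_finding_severity: Severity | null; // (a) worst across paths
  active_finding_count: number;          // (b) sum across paths, deduplicated
  finding_types: string[];               // (c) union across paths
}

function summarizeChainFindings(findings: ActiveFinding[]): ChainFindingSummary {
  const seenIds = new Set<string>();
  const types = new Set<string>();
  let worst: Severity | null = null;
  for (const f of findings) {
    if (seenIds.has(f.finding_id)) continue; // same finding reported twice
    seenIds.add(f.finding_id);
    types.add(f.finding_type);
    if (worst === null ||
        SEVERITY_ORDER.indexOf(f.severity) > SEVERITY_ORDER.indexOf(worst)) {
      worst = f.severity;
    }
  }
  return {
    max_finding_severity: worst,
    active_finding_count: seenIds.size,
    finding_types: Array.from(types).sort(),
  };
}
```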

Next Action

Status: research-complete

Decision needed from: Sergey (product direction), Ivan (engineering scope)

Options:

Adopt Phase 1-3 — API access chain endpoint + UI access-chain-first pages. ~1 week. Validates the model with minimal risk. Implementation-ready now.
Adopt Phase 1-5 — Full sequence including remediation, ranking, and behavior. ~2 weeks. Phase 4-5 require an additional design pass — they introduce access-chain remediation and ranking as new product concepts.
Defer — Wait for more connector data to validate access chain model at scale.
