Access Chain — Identity-Anchored Control Primitive
GitHub Issue: SecurityV0/sv0-platform#216
Trigger: Sergey's March 26 feedback: "We should re-examine whether the current one-row-per-path presentation is the right unit for risk, drift, and remediation."
Reframe (March 31): Founder response elevated the question from grouping/noise-reduction to defining the canonical product unit. The access chain is the thing the customer acts on — not a grouped view over paths.
We show actual access chains based on observed execution, not just assigned permissions.
1. Current Data Model
Path Primary Key (4-tuple)
AuthorityPathDoc._id = SHA256(tenant_id : workload_id : identity_id : destination_id)
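One row per (workload, identity, destination) triple within a tenant; tenant_id is the fourth component of the key. Multiple roles/actions reaching the same destination through the same identity are merged into a single path, and composition_hash detects changes to that merged set on upsert.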
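The key construction above can be sketched as follows. This is a minimal illustration, not the platform's actual ID builder: the function names, the ":" separator, the hex encoding, and the empty-string placeholder for unbound paths are all assumptions.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: joins key parts with ":" and hashes them.
// The real separator and encoding used by the platform are assumptions here.
function sha256Key(...parts: string[]): string {
  return createHash("sha256").update(parts.join(":"), "utf8").digest("hex");
}

// Path primary key: tenant + workload + identity + destination (the 4-tuple).
function buildPathId(
  tenantId: string,
  workloadId: string,
  identityId: string | null,
  destinationId: string,
): string {
  // Unbound paths have no identity; the empty segment is an assumed placeholder.
  return sha256Key(tenantId, workloadId, identityId ?? "", destinationId);
}

// Lineage key: same as the path key but without the identity component,
// so every identity reaching one destination from one workload shares it.
function buildPathLineageId(
  tenantId: string,
  workloadId: string,
  destinationId: string,
): string {
  return sha256Key(tenantId, workloadId, destinationId);
}
```

Two paths that differ only by identity get different path IDs but the same lineage ID, which is what makes path_lineage_id a grouping key across identities rather than across destinations.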
Existing Grouping Primitives
| Primitive | Key | Indexed | Groups by |
|---|---|---|---|
| path_lineage_id | SHA256(tenant, workload, destination) | Yes | All identities reaching same destination from same workload |
| identity_id | Entity ID | Yes | All paths through one identity (queryable, not aggregated) |
| workload_id | Entity ID | Yes (compound) | All paths from one workload |
| composition_hash | SHA256(identity, roles, actions) | No | Change detection for upserts |
What's Missing
- No aggregation key for (identity + all destinations) or (identity + all data domains)
- No API endpoint: "show all destinations reachable by identity X"
- No UI component for collapsed/expanded identity grouping
- No grouped drift delta showing role changes per identity across all affected paths
2. Noise Analysis — Demo Data
Path Distribution
| Metric | Count |
|---|---|
| Total authority paths | 29 |
| Unique identities (with binding) | 7 |
| Paths without identity (unbound) | 6 |
| Paths with identity | 23 |
| Avg paths per identity | 3.3 |
| Max paths for single identity | 6 (id-svc-ascribe-prod) |
| Shared identity across workloads | 1 (id-svc-foundry: 5 paths across 2 workloads) |
Identity Fan-Out Detail
| Identity | Workload(s) | Paths | Destinations | Roles |
|---|---|---|---|---|
| id-svc-finance | wl-invoice-rule | 5 | 3 (financial-api, invoice-archive, apar-ledger) | 5 |
| id-svc-ascribe-prod | wl-ascribe-summarizer | 6 | 5 (clinical-notes, oncology, psych, billing, health-files) | 2 |
| id-svc-foundry | wl-foundry-agent + wl-foundry-provisioner | 5 | 5 (azure-openai, azure-func, sn-incident-api, logic-app, sn-incident-table) | 4 |
| id-svc-hr-sync | wl-hr-sync | 2 | 2 | 2 |
| id-svc-audit | wl-audit-export | 2 | 2 | 1 |
| id-svc-it-ops | wl-it-router | 2 | 2 | 2 |
| id-svc-sec-audit | wl-sec-logger | 1 | 1 | 1 |
Noise Observations
- id-svc-finance appears as 5 separate rows. The real story is: "one service account reaches 3 financial systems via 5 roles." That's one remediation conversation, not five.
- id-svc-ascribe-prod appears as 6 rows touching clinical data. The risk is the combined reachable surface (5 healthcare destinations), not 6 individual path risks.
- id-svc-foundry spans 2 workloads (agent + provisioner): 5 rows. This is actually the most important case: one identity shared across workloads, meaning compromise of that identity affects both.
- Remediation overlap: In the current flat view, the action "review roles assigned to id-svc-finance" would appear identically on 5 separate path rows.
- Drift fragmentation: If id-svc-finance gains a new role, scope_drift fires on all 5 paths separately. The analyst sees 5 drift findings when the root cause is one role change on one identity.
- Exposure is combinatorial. Risk is not the sum of isolated paths; it is the union of reachable surface. Several individually modest paths can create a materially different class of risk when combined across domains and systems.
Projected Scale
At production scale (100+ workloads, 50+ identities), the flat path list could grow to 500-2000 rows. Access chains would reduce this to 50-200 objects — a 5-10x reduction — and each object is directly actionable.
3. Impact Assessment
3.1 Evaluator Rules
14 entity-level rules: Evaluate on EntityDoc, not path. Reference entity.execution_paths[] for severity/evidence. No impact from access chain introduction.
10 path-level rules: Evaluate on individual AuthorityPathDoc. Finding IDs are keyed by path._id:
stableFindingId(tenantId, findingType, path._id)
| Impact Area | Severity | Detail |
|---|---|---|
| Finding ID stability | HIGH if changed | Switching from path-keyed to identity-keyed finding IDs would orphan all historical findings |
| Path-level evaluation logic | None | Rules check individual path fields (sensitivity, execution_30d, via_roles) — still valid per-path |
| Drift comparison | None | scope_drift compares baseline vs current via_roles per path — works regardless of presentation |
Conclusion: Path-level rules must continue to operate per-path. The access chain is a presentation/query layer — not a change to the evaluation primitive.
3.2 Risk Clusters
Risk cluster service groups paths by finding type combinations. Key metrics:
path_count = matchingPathIds.length // Stays as evidence count within access chains
identity_count = unique identity_ids // Becomes access chain count
workload_count = unique workload_ids // Stays meaningful as facet
| Impact Area | Severity | Detail |
|---|---|---|
| path_count metric | Low | Stays valid — counts individual paths in cluster, even within access chains |
| identity_count | Changes meaning | Currently a derived count; becomes the primary access chain count |
| Cluster remediation | Medium | Currently deduplicates by workloadName || workload_id, not by identity. runs_as_name is optionally carried but not the dedup key. Access-chain-level remediation requires semantic change to dedup by identity. |
| Governance checklist | None | Already flags identity reuse: "N identities across M paths" |
Conclusion: Clusters already track identity_count. Making access chains the primary dimension is a natural fit.
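The semantic change to cluster remediation dedup can be illustrated with a small sketch. The `ClusterPathRef` shape and the `dedupBy`, `workloadKey`, and `identityKey` helpers are hypothetical names introduced for illustration; only the `workloadName || workload_id` key comes from the current behavior described above, and the workload fallback for unbound paths is an assumption.

```typescript
// Hypothetical, trimmed-down reference to a path inside a risk cluster.
interface ClusterPathRef {
  path_id: string;
  workload_id: string;
  workloadName?: string;
  identity_id: string | null;
}

// Keeps the first item seen for each key, preserving input order.
function dedupBy<T>(items: T[], keyOf: (item: T) => string): T[] {
  const seen = new Map<string, T>();
  for (const item of items) {
    const key = keyOf(item);
    if (!seen.has(key)) seen.set(key, item);
  }
  return Array.from(seen.values());
}

// Current behavior: one remediation entry per workload.
const workloadKey = (p: ClusterPathRef): string => p.workloadName || p.workload_id;

// Access-chain behavior: one entry per identity; unbound paths fall back to workload.
const identityKey = (p: ClusterPathRef): string =>
  p.identity_id ?? `unbound:${workloadKey(p)}`;

// Illustrative rows: one shared identity used by two workloads.
const foundryPaths: ClusterPathRef[] = [
  { path_id: "p1", workload_id: "wl-foundry-agent", identity_id: "id-svc-foundry" },
  { path_id: "p2", workload_id: "wl-foundry-provisioner", identity_id: "id-svc-foundry" },
];

const perWorkload = dedupBy(foundryPaths, workloadKey); // 2 entries
const perIdentity = dedupBy(foundryPaths, identityKey); // 1 entry
```

For a shared identity such as id-svc-foundry, workload-keyed dedup yields two remediation entries while identity-keyed dedup yields one, which is the behavior the access-chain model needs.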
3.3 Evidence Packs
Evidence sections reference entity.execution_paths[] (resource_id level), not path._id. The only path-ID dependency is in PathRemediationAction (remediation guidance).
| Impact Area | Severity | Detail |
|---|---|---|
| authority_snapshot section | None | Lists all execution_paths for entity — identity-agnostic |
| scope_drift_detail section | None | Maps role changes to affected resources — identity-aware already |
| blast_radius section | None | Groups by sensitivity/domain — no path IDs |
| Remediation actions | Low | Include path_id for targeting — could include access_chain_id instead |
Conclusion: Evidence packs are effectively unaffected. They're entity-scoped with resource-level detail.
3.4 API Layer
Current API: GET /api/v1/authority-paths returns flat list with identity_id filter support.
The access chain model needs a dedicated endpoint: GET /api/v1/access-chains. Flat path endpoint remains for backwards compatibility and evaluator-level queries.
3.5 UI
Currently flat tables in both AuthorityPathsListPage and RiskClusterDetailPage. One existing grouping computation:
// RiskClusterDetailPage — PathInlineExpand (lines 282-290)
const sameIdentityPaths = allPaths.filter((p) => p.identity?.id === identityId);
const distinctRoles = new Set(sameIdentityPaths.flatMap((p) => p.via_roles));
// Displays: "Identity total: N roles across M paths"
This is already doing identity-level aggregation — but only as an annotation inside expanded rows, not as a top-level object.
4. Option Analysis
Option A: Access chain as virtual first-class object
Approach: AuthorityPathDoc stays as-is. Introduce AccessChain as a virtual first-class object computed at query time. The access chain IS the product unit — paths are expandable supporting evidence within it.
Canonical axis for v1: identity. This is a product decision, not an open question. For the current wedge, identity is the anchor. Workloads, destinations, and data domains are context inside the access chain object. Unbound paths (where identity_id is null) fall back to workload as the grouping key, clearly labelled "[Unbound]".
Data model changes:
- None to AuthorityPathDoc
- New virtual type: AccessChain (computed at query time)
interface AccessChain {
  identity: { id: string; display_name: string; source_system: string } | null;
  workloads: Array<{ id: string; display_name: string }>;
  destinations: Array<{
    id: string;
    display_name: string;
    data_domain: string;
    sensitivity: string;
  }>;
  combined_roles: string[];          // union of via_roles across all paths
  combined_actions: string[];        // union of actions across all paths
  path_ids: string[];                // underlying path IDs (supporting evidence)
  path_count: number;
  total_execution_30d: number;       // sum across paths
  last_execution_at: string | null;  // max across paths
  ownership_status: string;          // worst-case across paths
  max_finding_severity: string | null;
  finding_types: string[];           // union across paths
  active_finding_count: number;      // sum across paths
  behavior_pattern: string;          // see section 4.3
  chain_rank: number;                // see section 4.2
}
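A minimal sketch of the query-time computation, assuming a simplified path shape. `PathInput`, `AccessChainCore`, and `buildAccessChains` are hypothetical names; the sketch covers only the fields derivable from paths alone (findings, ownership, ranking, and behavior are left out) and assumes ISO-8601 UTC timestamps in a uniform format.

```typescript
// Hypothetical, trimmed-down view of a path row; field names are assumptions.
interface PathInput {
  _id: string;
  identity: { id: string; display_name: string; source_system: string } | null;
  workload: { id: string; display_name: string };
  destination: { id: string; display_name: string; data_domain: string; sensitivity: string };
  via_roles: string[];
  actions: string[];
  execution_30d: number;
  last_execution_at: string | null;
}

// Subset of AccessChain that can be derived from paths alone.
interface AccessChainCore {
  identity: PathInput["identity"];
  workloads: PathInput["workload"][];
  destinations: PathInput["destination"][];
  combined_roles: string[];
  combined_actions: string[];
  path_ids: string[];
  path_count: number;
  total_execution_30d: number;
  last_execution_at: string | null;
}

function buildAccessChains(paths: PathInput[]): AccessChainCore[] {
  // Group by identity; unbound paths fall back to their workload.
  const groups = new Map<string, PathInput[]>();
  for (const p of paths) {
    const key = p.identity ? `identity:${p.identity.id}` : `unbound:${p.workload.id}`;
    const bucket = groups.get(key);
    if (bucket) bucket.push(p);
    else groups.set(key, [p]);
  }

  const chains: AccessChainCore[] = [];
  for (const group of Array.from(groups.values())) {
    const workloads = new Map<string, PathInput["workload"]>();
    const destinations = new Map<string, PathInput["destination"]>();
    const roles = new Set<string>();
    const actions = new Set<string>();
    let identity: PathInput["identity"] = null;
    let total = 0;
    let last: string | null = null;
    for (const p of group) {
      identity = p.identity;
      workloads.set(p.workload.id, p.workload);
      destinations.set(p.destination.id, p.destination);
      p.via_roles.forEach((r) => roles.add(r));
      p.actions.forEach((a) => actions.add(a));
      total += p.execution_30d;
      // Uniform ISO-8601 UTC strings compare correctly as plain strings.
      if (p.last_execution_at !== null && (last === null || p.last_execution_at > last)) {
        last = p.last_execution_at;
      }
    }
    chains.push({
      identity,
      workloads: Array.from(workloads.values()),
      destinations: Array.from(destinations.values()),
      combined_roles: Array.from(roles).sort(),
      combined_actions: Array.from(actions).sort(),
      path_ids: group.map((p) => p._id),
      path_count: group.length,
      total_execution_30d: total,
      last_execution_at: last,
    });
  }
  return chains;
}
```

Because the grouping runs over whatever paths the query returns, the underlying AuthorityPathDoc records and their IDs are never modified; the chain is purely a derived view.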
API changes:
- New endpoint: GET /api/v1/access-chains?workload_id=&identity_id=&...
- Existing flat endpoint unchanged (backwards compatible)
Evaluator impact: None. Path-level findings continue per-path. Access chains aggregate findings for presentation.
Evidence impact: None. Evidence packs stay entity/path-scoped.
Risk cluster impact: Minimal. Could add access_chain_count alongside path_count in cluster metadata.
4.1 One Remediation Object per Access Chain
Each access chain produces one primary remediation object. This is not visual dedup of path-level actions — it is a semantically distinct remediation primitive.
Example:
Reduce svc-finance from 5 roles to 2, with impact across 3 destinations.
Not 5 path-level actions that happen to collapse visually.
The access-chain remediation object contains:
- Identity: which service account or principal to act on
- Current state: N roles reaching M destinations across K workloads
- Recommended action: reduce to target role set, with justification per removed role
- Impact scope: which destinations and workloads are affected by the change
- Supporting evidence: individual path-level findings that drive the recommendation
Path-level PathRemediationAction records continue to exist as evidence. The access-chain remediation object is the unit the customer sees and acts on. The relationship is: one AccessChainRemediation references N PathRemediationAction records and may produce one MitigationActionDoc (see #215).
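One possible shape for the access-chain remediation object is sketched below. `RoleAssessment`, the `AccessChainRemediation` field names, and `buildChainRemediation` are assumptions for illustration; the real PathRemediationAction schema is not shown in this document, and the role-to-destination mapping in the sample data is invented.

```typescript
// Hypothetical per-role assessment derived from path-level remediation evidence.
interface RoleAssessment {
  path_id: string;
  workload_id: string;
  destination_id: string;
  role: string;
  keep: boolean;
  justification: string;
}

// Assumed shape; mirrors the five parts listed above.
interface AccessChainRemediation {
  identity_id: string;
  current_state: { roles: number; destinations: number; workloads: number };
  keep_roles: string[];
  remove_roles: Array<{ role: string; justification: string }>;
  impact_scope: { destination_ids: string[]; workload_ids: string[] };
  supporting_path_ids: string[];
  summary: string;
}

function buildChainRemediation(
  identityId: string,
  displayName: string,
  assessments: RoleAssessment[],
): AccessChainRemediation {
  const uniq = (values: string[]): string[] => Array.from(new Set(values)).sort();
  const removed = assessments.filter((a) => !a.keep);
  const allRoles = uniq(assessments.map((a) => a.role));
  const removeRoles = uniq(removed.map((a) => a.role));
  // A role is kept only if no assessment recommends removing it.
  const keepRoles = allRoles.filter((r) => !removeRoles.includes(r));
  // Impact scope: destinations and workloads touched by at least one removed role.
  const impactDestinations = uniq(removed.map((a) => a.destination_id));
  const impactWorkloads = uniq(removed.map((a) => a.workload_id));
  return {
    identity_id: identityId,
    current_state: {
      roles: allRoles.length,
      destinations: uniq(assessments.map((a) => a.destination_id)).length,
      workloads: uniq(assessments.map((a) => a.workload_id)).length,
    },
    keep_roles: keepRoles,
    remove_roles: removeRoles.map((role) => ({
      role,
      justification: removed.find((a) => a.role === role)?.justification ?? "",
    })),
    impact_scope: { destination_ids: impactDestinations, workload_ids: impactWorkloads },
    supporting_path_ids: uniq(assessments.map((a) => a.path_id)),
    summary:
      `Reduce ${displayName} from ${allRoles.length} roles to ${keepRoles.length}, ` +
      `with impact across ${impactDestinations.length} destinations.`,
  };
}
```

The object is built from path-level evidence but expresses a single decision about the identity, so it stays one action even when the evidence spans many paths.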
4.2 Access Chain Ranking Model
max_finding_severity and active_finding_count are not sufficient for prioritization. Access chains are ranked by a composite model driven by these factors:
| Factor | Signal | Weight rationale |
|---|---|---|
| Blast radius | Number of distinct destinations reachable, weighted by data domain diversity | More destinations across more domains = wider damage if compromised |
| Sensitivity | Worst-case sensitivity across all destinations (restricted > confidential > internal > public) | A chain reaching one restricted system outranks one reaching five internal systems |
| Execution intensity | total_execution_30d normalized against peer chains in the same tenant | High-activity chains are confirmed attack surface, not theoretical |
| Drift | Count of scope_drift findings across paths, weighted by recency | Active drift signals loss of control — prioritize chains that are changing now |
| Cross-workload reuse | Number of distinct workloads sharing this identity | Shared identities are force multipliers — compromise spreads across workload boundaries |
| Ownership quality | ownership_status — orphaned > unknown > owned | Orphaned chains have no accountable owner and are hardest to remediate |
The ranking model is deterministic (no ML, no probabilistic scoring). Implementation computes a composite rank per chain based on these factors and sorts the access chain list accordingly.
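A deterministic ranking along these lines could look like the sketch below. Everything about the scoring is an assumption pending Phase 5: `ChainSignals` and `rankChains` are hypothetical names, the sketch sorts by sensitivity tier first (so that a chain reaching a restricted system always outranks one reaching only internal systems, as the table states), then by an equal-weight mean of the remaining factors normalized against peer chains. Recency weighting for drift is omitted.

```typescript
type Sensitivity = "public" | "internal" | "confidential" | "restricted";
type Ownership = "owned" | "unknown" | "orphaned";

// Hypothetical per-chain inputs to the ranking; names are assumptions.
interface ChainSignals {
  id: string;
  destination_count: number;
  domain_count: number;
  max_sensitivity: Sensitivity;
  total_execution_30d: number;
  drift_finding_count: number;
  workload_count: number;
  ownership_status: Ownership;
}

const SENSITIVITY_TIER: Record<Sensitivity, number> = {
  public: 0, internal: 1, confidential: 2, restricted: 3,
};
const OWNERSHIP_TIER: Record<Ownership, number> = { owned: 0, unknown: 1, orphaned: 2 };

function rankChains(chains: ChainSignals[]): Array<{ id: string; chain_rank: number; score: number }> {
  const maxOf = (f: (c: ChainSignals) => number): number => Math.max(0, ...chains.map(f));
  const norm = (value: number, max: number): number => (max > 0 ? value / max : 0);

  // Assumed blast-radius proxy: destinations multiplied by distinct data domains.
  const blast = (c: ChainSignals): number => c.destination_count * c.domain_count;
  const reuse = (c: ChainSignals): number => Math.max(0, c.workload_count - 1);
  const maxBlast = maxOf(blast);
  const maxExec = maxOf((c) => c.total_execution_30d);
  const maxDrift = maxOf((c) => c.drift_finding_count);
  const maxReuse = maxOf(reuse);

  const scored = chains.map((c) => ({
    id: c.id,
    tier: SENSITIVITY_TIER[c.max_sensitivity],
    // Equal weights across the five non-sensitivity factors.
    score:
      (norm(blast(c), maxBlast) +
        norm(c.total_execution_30d, maxExec) +
        norm(c.drift_finding_count, maxDrift) +
        norm(reuse(c), maxReuse) +
        OWNERSHIP_TIER[c.ownership_status] / 2) / 5,
  }));

  // Sensitivity tier first, then composite score, then id as a stable tie-break.
  scored.sort(
    (a, b) => b.tier - a.tier || b.score - a.score || (a.id < b.id ? -1 : a.id > b.id ? 1 : 0),
  );
  return scored.map((s, i) => ({ id: s.id, chain_rank: i + 1, score: s.score }));
}
```

The explicit tie-break on id keeps the output stable across runs, which matters if chain_rank is shown to users or diffed between snapshots.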
4.3 Behavior and Drift Narrative
Aggregating execution_30d and last_execution_at is not enough to tell the story of an access chain over time. Each access chain carries a behavior pattern classification:
- Newly active: First observed execution within the last 30 days. May indicate a newly provisioned identity or a previously dormant one that has started executing.
- Steadily active: Consistent execution volume over multiple observation windows. This is the baseline: expected behavior for a legitimate workload.
- Bursty: Execution spikes followed by quiet periods. May indicate batch jobs, incident response automation, or anomalous usage patterns.
- Expanding over time: The set of destinations or roles is growing across snapshots. More destinations or more roles than the previous baseline is drift in action.
- Dormant but dangerous: No recent execution, but the identity retains broad permissions. These are standing privileges that could be exploited without triggering execution-based alerts.
The behavior pattern is derived from the execution history across snapshots (not a single point-in-time metric). It is displayed prominently on the access chain card to give the analyst immediate context about the chain's trajectory.
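The classification could be derived from snapshot history roughly as follows. This is only a sketch: the `ChainSnapshot` shape, the precedence order of the checks, the comparison against the oldest snapshot as baseline, and the coefficient-of-variation threshold for "bursty" are all assumptions. The snake_case labels follow the `expanding_over_time` value used in the API contract sketch.

```typescript
type BehaviorPattern =
  | "newly_active"
  | "steadily_active"
  | "bursty"
  | "expanding_over_time"
  | "dormant_but_dangerous";

// Hypothetical per-window summary of one access chain, ordered oldest to newest.
interface ChainSnapshot {
  execution_count: number;
  role_count: number;
  destination_count: number;
}

// Assumed threshold: standard deviation above the mean counts as bursty.
const BURSTY_CV_THRESHOLD = 1.0;

function classifyBehavior(history: ChainSnapshot[]): BehaviorPattern {
  const first = history[0];
  const latest = history[history.length - 1];
  if (!first || !latest) throw new Error("at least one snapshot is required");

  // No execution in the latest window while the chain still holds roles.
  if (latest.execution_count === 0) return "dormant_but_dangerous";

  // Executing now, but never in any earlier window.
  const earlier = history.slice(0, -1);
  if (earlier.every((s) => s.execution_count === 0)) return "newly_active";

  // Roles or destinations grew relative to the oldest snapshot (assumed baseline).
  if (
    latest.role_count > first.role_count ||
    latest.destination_count > first.destination_count
  ) {
    return "expanding_over_time";
  }

  // Coefficient of variation of execution volume separates bursty from steady.
  const counts = history.map((s) => s.execution_count);
  const mean = counts.reduce((a, b) => a + b, 0) / counts.length;
  const variance =
    counts.reduce((acc, c) => acc + (c - mean) * (c - mean), 0) / counts.length;
  return Math.sqrt(variance) / mean > BURSTY_CV_THRESHOLD ? "bursty" : "steadily_active";
}
```

The checks are ordered so that each chain gets exactly one label; a chain that is both expanding and bursty is reported as expanding, on the assumption that scope growth is the more actionable signal.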
Option A Summary
Pros:
- Zero migration risk: path-level findings, IDs, evidence all unchanged
- Backwards compatible API
- Can ship incrementally (API first, then UI)
- Access chain is the product unit from day one, not a toggle layered on top of flat paths
- Ranking model and behavior narrative make each chain self-explanatory
Cons:
- Access chain is computed at query time (no materialized document)
- "Which view is canonical for reports and evidence packs?" Answer: access chain for customer-facing, paths for evaluator internals
Option B: New primary entity — AccessChain as stored document
Approach: Materialize access chains as a new MongoDB collection alongside paths.
Data model changes:
- New collection: access_chains
- New document type: AccessChainDoc (materialized, not virtual)
- New ID builder: buildAccessChainId(tenantId, identityId)
- Path documents gain access_chain_id foreign key
API changes:
- New resource: /api/v1/access-chains with full CRUD-like query support
- Paths become children: /api/v1/access-chains/:id/paths
- Findings could be queried at chain level: /api/v1/access-chains/:id/findings
Evaluator impact:
- Path-level rules continue per-path (finding IDs unchanged)
- Could add chain-level rules in the future (e.g., "this identity's combined blast radius exceeds threshold")
- Would need chain refresh after path materialization
Evidence impact:
- Evidence packs could reference access_chain_id in addition to path_id
- New evidence section: "access chain snapshot" showing all destinations
Risk cluster impact:
- Clusters could group by chain instead of path
- identity_count becomes chain_count (same semantics, clearer naming)
- Remediation targets by chain instead of path
Pros:
- Clean, queryable entity with its own lifecycle
- Enables chain-level findings, drift tracking, and remediation
- Natural unit for "what can this identity reach?" question
- Could support future "access review" workflow (approve/reject per access chain)
Cons:
- New collection, new materialization step, new indexes
- Chain-to-path sync complexity (what if paths change but chain isn't refreshed?)
- Finding ID namespace decision: chain-scoped findings would need new IDs
- More infrastructure to build and maintain
- Risk of premature abstraction if the grouping key changes
5. Recommendation
Option A — access chain as virtual first-class object — is the right first step.
This is not a grouping toggle. It is an access-chain-first product model implemented via a virtual computation layer.
Rationale:
- Lower risk. No migration, no new collection, no finding ID changes. Ship it and validate with real users before committing to a materialized entity.
- Identity is the canonical axis for v1. This is a product decision per founder direction, not an open question. Workloads, destinations, and data domains are facets inside the access chain.
- Demo data confirms the value. Even with 29 paths, access chains reduce the list to 7+6=13 objects (7 identity-anchored chains plus 6 unbound paths), each directly actionable. The remediation deduplication alone justifies the model.
- Infrastructure exists. The cluster detail page already computes identity totals. The API already supports identity_id filtering. The leap to an access chain endpoint is small.
- Path to Option B is clear. If access chains prove to be the right permanent model, materializing AccessChainDoc from the virtual logic is straightforward. Option A is a reversible bet; Option B is not.
Proposed Sequence
| Phase | Scope | Effort |
|---|---|---|
| Phase 1 | API: GET /api/v1/access-chains returns AccessChain[] with identity as canonical axis | 2-3 days |
| Phase 2 | UI: Access-chain-first list page (replaces flat path list as default view) | 2-3 days |
| Phase 3 | UI: access-chain default on RiskClusterDetailPage, cluster navigation by chain | 1-2 days |
| Phase 4 | Remediation: one remediation object per access chain (deduplicated by identity). Must define how chain-level actions relate to path-level PathRemediationAction and MitigationActionDoc (#215). | 3-4 days |
| Phase 5 | Ranking + behavior: access chain ranking model and behavior pattern classification. Must define derivation rules for composite rank and how behavior patterns are computed from snapshot history. | 3-4 days |
Phase 1-3 are implementation-ready (~1 week). Phase 4-5 need one more design pass before implementation — they introduce access-chain-level remediation, ranking, and behavior as new product concepts. Total: ~2 weeks of focused work, with a design checkpoint between Phase 3 and Phase 4.
API Contract Sketch
GET /api/v1/access-chains?status=active
Response:
{
  "data": [
    {
      "identity": { "id": "id-svc-finance", "display_name": "svc-finance", "source_system": "entra_id" },
      "workloads": [{ "id": "wl-invoice-rule", "display_name": "Invoice Rule Engine" }],
      "destinations": [
        { "id": "res-financial-api", "display_name": "Financial API", "data_domain": "Financial", "sensitivity": "restricted" },
        { "id": "res-invoice-archive", "display_name": "Invoice Archive", "data_domain": "Financial", "sensitivity": "confidential" },
        { "id": "res-apar-ledger", "display_name": "AP/AR Ledger", "data_domain": "Financial", "sensitivity": "restricted" }
      ],
      "combined_roles": ["sn-role-finance-read", "sn-role-invoice-view", "sn-role-ap-write", "sn-role-ar-write", "sn-role-ledger-admin"],
      "path_count": 5,
      "path_ids": ["abc123", "def456", "..."],
      "total_execution_30d": 847,
      "last_execution_at": "2026-03-25T14:30:00Z",
      "ownership_status": "owned",
      "max_finding_severity": "high",
      "finding_types": ["scope_drift", "reachable_sensitive_domain"],
      "active_finding_count": 10,
      "behavior_pattern": "expanding_over_time",
      "chain_rank": 1
    }
  ],
  "meta": {
    "total_chains": 13,
    "total_paths": 29
  }
}
UX Model — Access-Chain-First
The product UX centers on the access chain, not on a table of paths with a grouping toggle. The mental model for every screen is:
- One identity — who is this?
- Actual access chain — what can it reach, via which roles?
- What it reaches — destinations, data domains, sensitivity levels
- Why it matters — ranking factors, behavior pattern, drift narrative
- What to do next — one remediation action per chain
Access Chain List Page
The primary view is a ranked list of access chain cards. Each card is a self-contained summary:
┌─────────────────────────────────────────────────────────────────┐
│ #1 svc-finance Owned │
│ Invoice Rule Engine │
│ │
│ 5 roles → 3 destinations (Financial) │
│ 847 exec/30d · Expanding over time │
│ Findings: scope_drift, reachable_sensitive_domain │
│ │
│ Action: Reduce from 5 roles to 2, │
│ impact across 3 destinations │
├─────────────────────────────────────────────────────────────────┤
│ #2 svc-ascribe-prod Orphaned │
│ Ascribe Summarizer │
│ │
│ 2 roles → 5 destinations (Healthcare) │
│ 312 exec/30d · Steadily active │
│ Findings: reachable_sensitive_domain │
│ │
│ Action: Assign owner, review clinical data access │
├─────────────────────────────────────────────────────────────────┤
│ #3 svc-foundry Orphaned │
│ Foundry Agent + Foundry Provisioner │
│ !! Shared across 2 workloads │
│ │
│ 4 roles → 5 destinations (Infrastructure, IT) │
│ 715 exec/30d · Bursty │
│ Findings: scope_drift, cross_workload_identity │
│ │
│ Action: Split into per-workload identities │
└─────────────────────────────────────────────────────────────────┘
Access Chain Detail Page
Clicking into a chain shows the full detail:
┌─────────────────────────────────────────────────────────────────┐
│ svc-finance · Invoice Rule Engine Owned │
│ │
│ CHAIN SUMMARY │
│ 5 roles → 3 destinations │ 847 exec/30d │ Expanding over time │
│ Rank: #1 of 13 chains │
│ │
│ WHY IT MATTERS │
│ · Blast radius: 3 financial systems across 1 data domain │
│ · Sensitivity: restricted (Financial API, AP/AR Ledger) │
│ · Drift: scope expanded — 2 new roles since baseline │
│ · Execution: 847 calls/30d, trending up │
│ │
│ WHAT TO DO NEXT │
│ Reduce svc-finance from 5 roles to 2, with impact across │
│ 3 destinations. Remove: ap-write, ar-write, ledger-admin. │
│ Keep: finance-read, invoice-view. │
│ │
│ DESTINATIONS (expandable supporting evidence) │
│ ▸ Financial API restricted │ 421 exec │ scope_drift │
│ ▸ Invoice Archive confidential│ 312 exec │ scope_drift │
│ ▸ AP/AR Ledger restricted │ 114 exec │ scope_drift │
│ │
│ ROLES │
│ finance-read, invoice-view, ap-write, ar-write, ledger-admin │
└─────────────────────────────────────────────────────────────────┘
Expanding a destination row shows the underlying path-level detail: specific roles, actions, execution counts, and individual findings. This is the supporting evidence layer — not the primary frame.
6. Risks and Open Questions
- Unbound paths. Paths with identity_id = null don't form access chains naturally. Proposed: group by workload for unbound paths, clearly labelled "[Unbound]".
- Cross-workload identity sharing. id-svc-foundry spans 2 workloads. The access chain should surface this prominently: it's a higher-risk pattern and a force multiplier.
- Pagination. Access chain response may not fit cursor-based pagination cleanly (chains have variable path counts). May need offset pagination or full-result with client-side virtualization.
- Reports and exports. If reports currently list flat paths, they'll need an access-chain variant. Defer until after Phase 3.
- Grouped drift aggregation. When drift is shown at the access chain level, how are severity, finding counts, and finding types derived? Rules: (a) max_finding_severity = worst across paths, (b) active_finding_count = sum across paths (with dedup against entity-level findings to avoid inflation), (c) finding_types = union across paths.
- Behavior pattern derivation. Behavior classification requires access to multiple snapshots. Phase 5 must define which snapshot fields are compared and the threshold logic for each pattern category.
- Ranking model calibration. The composite rank needs weight calibration against real customer environments. Start with equal weights, adjust based on feedback.
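The grouped drift aggregation rules (a) to (c) above can be sketched as a single pass over a chain's path-level findings. The `PathFinding` shape, the four-level severity scale, and the interpretation of "dedup against entity-level findings" (a path finding is not counted if the same finding type is already reported at entity level) are assumptions, not confirmed behavior.

```typescript
// Assumed severity scale; the document only shows "high" as a concrete value.
type Severity = "low" | "medium" | "high" | "critical";
const SEVERITY_RANK: Record<Severity, number> = { low: 0, medium: 1, high: 2, critical: 3 };

// Hypothetical, trimmed-down path-level finding.
interface PathFinding {
  path_id: string;
  finding_type: string;
  severity: Severity;
}

interface ChainFindingSummary {
  max_finding_severity: Severity | null;
  active_finding_count: number;
  finding_types: string[];
}

function aggregateFindings(
  pathFindings: PathFinding[],
  entityLevelTypes: ReadonlySet<string>,
): ChainFindingSummary {
  let worst: Severity | null = null;
  let count = 0;
  const types = new Set<string>();
  for (const f of pathFindings) {
    // (c) finding_types: union across paths.
    types.add(f.finding_type);
    // (a) max_finding_severity: worst across paths.
    if (worst === null || SEVERITY_RANK[f.severity] > SEVERITY_RANK[worst]) {
      worst = f.severity;
    }
    // (b) active_finding_count: sum across paths, skipping types already
    // reported at entity level (assumed dedup rule).
    if (!entityLevelTypes.has(f.finding_type)) count += 1;
  }
  return {
    max_finding_severity: worst,
    active_finding_count: count,
    finding_types: Array.from(types).sort(),
  };
}
```

Path-level finding IDs are untouched by this aggregation; the summary only changes what the access chain card displays.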
Next Action
Status: research-complete
Decision needed from: Sergey (product direction), Ivan (engineering scope)
Options:
- Adopt Phase 1-3: API access chain endpoint + UI access-chain-first pages. ~1 week. Validates the model with minimal risk. Implementation-ready now.
- Adopt Phase 1-5: Full sequence including remediation, ranking, and behavior. ~2 weeks. Phase 4-5 require an additional design pass because they introduce access-chain remediation and ranking as new product concepts.
- Defer: Wait for more connector data to validate access chain model at scale.