
Exposure Aggregation APIs: Implementation Plan

Date: 2026-02-25
Status: Draft v2 (addresses review findings)
Depends on: None (foundational; must complete before Gap 2/3 UI work)
Effort estimate: 10-14 hours
Owner: TBD
Target: Week 1 (by Mar 4)


Problem Description

The sv0-platform W1 UI has four key pages — Overview, Risk Clusters, Exposures List, and Exposure Detail — that are already built as React components. The backend APIs powering them (/api/v1/posture/summary, /api/v1/posture/risk-clusters, /api/v1/exposures, /api/v1/exposures/:id) exist and serve data, but they diverge from the W1 specification in ways that weaken the demo narrative.

Current State

All 4 endpoints are implemented and the UI consumes them. The gaps are in data shape and computation method:

| Endpoint | What's There | What's Missing |
| --- | --- | --- |
| posture/summary | Path-centric counts (active_paths, dormant_paths) | Identity-level counts (active_autonomous, dormant_authority) and workload-level counts (autonomous, operator_assisted, human_triggered) |
| posture/risk-clusters | Finding-type cluster matching with path counts | identity_count, workload_count, sensitive_domains, priority classification (P0/P1/P2) |
| exposures | Entity-with-findings grouping | Authority-path-based computation, deterministic EXP-{hash} IDs, execution_count_30d, data_domains, egress_category |
| exposures/:id | Entity + findings + authority paths | ownership_breakdown, workload_metadata, identity_bindings, execution_evidence_summary, path_summary panels |

Why It Matters

Without identity-level and workload-level aggregations, the Overview page cannot answer "How many autonomous identities exist?" — the fundamental W1 question. Without enriched exposure detail, CISOs cannot drill into a single workload and see its complete authority context in one view.


Architectural Analysis

Approach: Evolve Incrementally

The existing implementations work end-to-end (seed -> API -> UI). Rather than rewriting to match the spec exactly, we enhance the current path-centric architecture with additive fields. This avoids breaking the working UI.

Key principle: all changes are additive — existing response fields stay in place, new fields are added alongside. The UI can adopt new fields incrementally.

Data Flow

All required queries can be served by existing StorageAdapter methods:

| Query Need | Existing Method |
| --- | --- |
| Count workloads by execution_mode | countEntities(tenantId, { entityType: "workload", executionMode: "..." }) |
| Fetch active authority paths | queryAuthorityPaths(tenantId, { status: "active" }) |
| Count authority paths by workload | countAuthorityPaths(tenantId, { workloadId: id }) |
| Batch entity lookup | getEntitiesByIds(tenantId, ids) |
| Query findings by entity | queryFindings(tenantId, { entityId, status: "active" }) |
| Count execution evidence | countExecutionEvidence(tenantId, { entityId }) |
| Get posture snapshot | getPostureSnapshotBefore(tenantId, date) |

One type extension needed: PostureSnapshotDoc (in src/domain/posture/types.ts) needs optional fields for delta computation:

export interface PostureSnapshotDoc {
  // ... existing fields ...
  active_autonomous_identities?: number;
  workload_counts?: {
    autonomous: number;
    operator_assisted: number;
    human_triggered: number;
  };
}

Endpoint Specifications

1. GET /api/v1/posture/summary

Current response shape (keep all existing fields):

{
  "data": {
    "active_paths": 15,
    "dormant_paths": 3,
    "total_executions_30d": 1200,
    "ownership_invalid_count": 5,
    "executed_invalid_ownership_count": 2,
    "delta": { "new_paths": 3, "removed_paths": 1, "new_orphaned": 2 },
    "last_refresh": "2026-02-25T10:00:00Z"
  }
}

New fields to add:

{
  "data": {
    // ... existing fields unchanged ...
    "identity_counts": {
      "active_autonomous": 8,
      "dormant_authority": 3
    },
    "workload_counts": {
      "autonomous": 6,
      "operator_assisted": 12,
      "human_triggered": 45
    },
    "delta": {
      // ... existing delta fields ...
      "new_autonomous_identities": 2,
      "ownership_invalidations": 3
    }
  }
}

Computation logic for new fields:

  • identity_counts.active_autonomous: Collect distinct identity_id values from active authority paths where current_state.execution_30d > 0.
  • identity_counts.dormant_authority: Same, but where execution_30d === 0 AND first_seen_at is more than 90 days in the past.
  • workload_counts.*: Call countEntities(tenantId, { entityType: "workload", executionMode: "autonomous" }) for each mode.
  • delta.new_autonomous_identities: Compare against prior PostureSnapshotDoc.active_autonomous_identities.
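
The identity-count rules above can be sketched as a pure function over already-fetched active paths. This is a minimal sketch, not the service's actual code: the PathLike shape, the countIdentities name, and the injectable now parameter are hypothetical. It reads the dormancy rule as "first seen more than 90 days ago" and applies the two rules independently, so an identity with one executing path and one old idle path is counted in both buckets; the plan does not say whether that overlap is intended.

```typescript
// Hypothetical minimal shape; the real AuthorityPathDoc has more fields.
interface PathLike {
  identity_id: string;
  first_seen_at: string; // ISO-8601 timestamp
  current_state: { execution_30d: number };
}

const DORMANT_AGE_MS = 90 * 24 * 60 * 60 * 1000;

// Counts distinct identities per rule. The rules are applied independently,
// so one identity can appear in both sets.
function countIdentities(activePaths: PathLike[], now: Date = new Date()) {
  const active = new Set<string>();
  const dormant = new Set<string>();
  for (const path of activePaths) {
    const ageMs = now.getTime() - Date.parse(path.first_seen_at);
    if (path.current_state.execution_30d > 0) {
      active.add(path.identity_id);
    } else if (path.current_state.execution_30d === 0 && ageMs > DORMANT_AGE_MS) {
      dormant.add(path.identity_id);
    }
  }
  return { active_autonomous: active.size, dormant_authority: dormant.size };
}
```

Passing now explicitly keeps the 90-day cutoff deterministic in unit tests.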

Implementation file: src/services/posture-service.ts

2. GET /api/v1/posture/risk-clusters

Current response already returns RiskClusterResult[] with cluster_key, label, description, severity, finding_types, path_count, total_execution_30d, ownership_breakdown, sensitivity_breakdown, oldest_finding_days, new_paths_30d.

New fields per cluster:

{
  "identity_count": 5,
  "workload_count": 7,
  "sensitive_domains": ["financial", "identity"],
  "priority": "P0"
}

Computation logic:

  • identity_count: For each cluster's matched path IDs, collect distinct identity_id values from AuthorityPathDoc records.
  • workload_count: Same but for workload_id.
  • sensitive_domains: Collect distinct data_domain values from matched paths where sensitivity is "confidential" or "restricted".
  • priority: Deterministic: P0 = severity "critical" + paths > 0; P1 = severity "high" + paths > 0; P2 = everything else with paths > 0.
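
A minimal sketch of the per-cluster enrichment described above, assuming the matched paths for a cluster are already in memory. The ClusterPath shape and the function names are hypothetical. The rules only define priority for clusters with at least one path, so this sketch returns null for an empty cluster; that choice, and sorting sensitive_domains for stable output, are assumptions.

```typescript
type Priority = "P0" | "P1" | "P2";

// Hypothetical minimal path shape for illustration.
interface ClusterPath {
  identity_id: string;
  workload_id: string;
  data_domain: string;
  sensitivity: string;
}

// Returns null when the cluster has no paths: the plan leaves that case undefined.
function classifyPriority(severity: string, pathCount: number): Priority | null {
  if (pathCount <= 0) return null;
  if (severity === "critical") return "P0";
  if (severity === "high") return "P1";
  return "P2";
}

function enrichCluster(severity: string, paths: ClusterPath[]) {
  const sensitive = paths.filter(
    (p) => p.sensitivity === "confidential" || p.sensitivity === "restricted",
  );
  return {
    identity_count: new Set(paths.map((p) => p.identity_id)).size,
    workload_count: new Set(paths.map((p) => p.workload_id)).size,
    // Sorted only so the output is stable across runs (an assumption).
    sensitive_domains: Array.from(new Set(sensitive.map((p) => p.data_domain))).sort(),
    priority: classifyPriority(severity, paths.length),
  };
}
```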

Implementation file: src/services/risk-cluster-service.ts

3. GET /api/v1/exposures

Current issues: Groups by entity (workload with findings) rather than by authority path. Uses entity_id as exposure ID. Does not trace workload -> RUNS_AS -> identity chain.

Rewrite to authority-path-based computation.

Exposure grain decision: workload-level, not workload+identity.

A workload with multiple RUNS_AS identities (e.g., a Function App with both a system-assigned MI and user-assigned MI) produces ONE exposure row, not N. The identities field is an array. This avoids splitting what is operationally one workload into confusing duplicate rows.

If a workload has zero identities (unknown_identity_binding), it still gets one exposure row with identities: [] and identity_binding: "unknown".

Target response shape:

{
  "data": [
    {
      "id": "EXP-a1b2c3d4",
      "workload_id": "...",
      "workload_name": "Invoice Processing Rule",
      "workload_type": "workload",
      "source_system": "servicenow",
      "identities": [
        { "id": "...", "name": "svc-finance-api", "source_system": "entra_id" }
      ],
      "identity_binding": "bound",
      "path_count": 3,
      "finding_count": 5,
      "finding_types": ["orphaned_ownership", "reachable_sensitive_domain"],
      "max_severity": "critical",
      "sensitive_domain_count": 2,
      "data_domains": ["finance", "customer"],
      "execution_count_30d": 120,
      "last_execution_at": "2026-02-25T10:00:00Z",
      "ownership_status": "orphaned",
      "egress_category": "internal",
      "execution_mode": "triggered",
      "last_evaluated_at": "2026-02-25T10:00:00Z"
    }
  ],
  "cursor": { "has_more": true, "next": "..." },
  "meta": { "total_count": 45 }
}

Why identities is an array: A workload may have RUNS_AS relationships to multiple identities (system-assigned MI + user-assigned MI, or different SPs per environment). Collapsing to a single identity_id would silently drop authority paths from secondary identities. The UI displays the primary identity name with a "+N more" badge if multiple exist. The identity_binding field reflects the composite state: "bound" if any identity resolved, "unknown" if zero identities.

Deterministic ID: EXP-{sha256(tenant_id + workload_id).slice(0, 8)} — hashed on workload only, NOT on identity. One workload = one exposure ID regardless of how many identities it uses.
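
The ID formula can be sketched with Node's built-in crypto module. The exposureId name is hypothetical, and two details are assumptions the plan does not spell out: the digest is hex-encoded, and the two inputs are concatenated with no separator.

```typescript
import { createHash } from "node:crypto";

// Assumptions: hex digest, plain string concatenation with no separator.
function exposureId(tenantId: string, workloadId: string): string {
  const digest = createHash("sha256").update(tenantId + workloadId).digest("hex");
  return `EXP-${digest.slice(0, 8)}`;
}
```

Eight hex characters give 32 bits, which is ample at demo scale. Plain concatenation does mean that ("ab", "c") and ("a", "bc") produce the same ID; if tenant and workload IDs are not fixed-length, adding a delimiter between them would remove that ambiguity.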

Computation logic:

  1. Query active authority paths, group by workload_id.
  2. For each workload group: collect distinct identity_ids into an array, collect destination data_domains, sensitivity levels; sum execution_30d; find max last_execution_at; count paths.
  3. Batch-fetch workload entities AND identity entities for display names and properties.
  4. Query active findings grouped by entity_id for all workload IDs.
  5. Merge path-derived stats with finding data.
  6. Generate deterministic ID: EXP-{sha256(tenant_id + workload_id).slice(0, 8)}.
  7. Handle "identity unknown" case: workloads with findings but NO authority paths get identities: [], identity_binding: "unknown", path_count: 0.
  8. Support filters: severity, cluster, egress_category, ownership_status, execution_mode.
  9. Sort by max_severity desc, then finding_count desc.
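
Steps 1, 2, and 9 above can be sketched as an in-memory grouping pass plus a sort comparator. This is a sketch under stated assumptions rather than the route's implementation: the ExposurePath and ExposureStats shapes and both function names are hypothetical, timestamps are assumed to be ISO-8601 UTC strings of uniform format (so string comparison orders them), and the severity ranking below "high" is assumed because the plan only names "critical" and "high".

```typescript
// Hypothetical minimal path shape for illustration.
interface ExposurePath {
  workload_id: string;
  identity_id: string;
  data_domain: string;
  execution_30d: number;
  last_execution_at: string | null;
}

interface ExposureStats {
  workload_id: string;
  identity_ids: string[];
  data_domains: string[];
  execution_count_30d: number;
  last_execution_at: string | null;
  path_count: number;
}

// One stats row per workload, regardless of how many identities it runs as.
function groupByWorkload(paths: ExposurePath[]): ExposureStats[] {
  const groups = new Map<string, ExposureStats>();
  for (const p of paths) {
    let g = groups.get(p.workload_id);
    if (!g) {
      g = {
        workload_id: p.workload_id,
        identity_ids: [],
        data_domains: [],
        execution_count_30d: 0,
        last_execution_at: null,
        path_count: 0,
      };
      groups.set(p.workload_id, g);
    }
    if (!g.identity_ids.includes(p.identity_id)) g.identity_ids.push(p.identity_id);
    if (!g.data_domains.includes(p.data_domain)) g.data_domains.push(p.data_domain);
    g.execution_count_30d += p.execution_30d;
    if (p.last_execution_at && (!g.last_execution_at || p.last_execution_at > g.last_execution_at)) {
      g.last_execution_at = p.last_execution_at;
    }
    g.path_count += 1;
  }
  return Array.from(groups.values());
}

// Assumed severity scale; unknown values sort last.
const SEVERITY_RANK: Record<string, number> = { critical: 4, high: 3, medium: 2, low: 1 };

// Sort by max_severity desc, then finding_count desc.
function compareExposures(
  a: { max_severity: string; finding_count: number },
  b: { max_severity: string; finding_count: number },
): number {
  const bySeverity = (SEVERITY_RANK[b.max_severity] ?? 0) - (SEVERITY_RANK[a.max_severity] ?? 0);
  return bySeverity !== 0 ? bySeverity : b.finding_count - a.finding_count;
}
```

Workloads that have findings but no authority paths (step 7) never appear in the grouped output, so they would need to be merged in separately from the findings query.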

Implementation file: src/api/routes/exposures.ts

4. GET /api/v1/exposures/:id

The current implementation works but is missing five panels. Add:

{
  "data": {
    // ... existing fields (entity, findings, authority_paths, evidence_completeness, etc.) ...
    "ownership_breakdown": {
      "owners": [
        { "id": "...", "name": "Bob Chen", "type": "individual", "status": "active", "role": "primary" }
      ],
      "effective_status": "owned"
    },
    "workload_metadata": {
      "source_system": "servicenow",
      "source_id": "wl-invoice-rule",
      "artifact_identifier": "Business Rule: Invoice Processing",
      "created_at": "2025-11-01T00:00:00Z",
      "last_synced_at": "2026-02-25T10:00:00Z"
    },
    "identity_bindings": [
      {
        "relationship_type": "RUNS_AS",
        "identity_id": "...",
        "identity_name": "svc-finance-api",
        "protocol": "client_credentials",
        "target_system": "entra_id"
      }
    ],
    "execution_evidence_summary": {
      "total_direct_records": 8,
      "total_related_records": 12,
      "last_execution_at": "2026-02-25T08:00:00Z",
      "execution_count_30d": 120
    },
    "path_summary": {
      "total_paths": 3,
      "sensitive_paths": 2,
      "dormant_paths": 0,
      "total_execution_30d": 120
    }
  }
}

Computation logic:

  • ownership_breakdown: Resolve OWNED_BY/CREATED_BY relationships from entity, batch-fetch owner entities, return status and display name.
  • workload_metadata: From entity doc (source_system, source_id, created_at, last_synced_at). artifact_identifier from properties.description.
  • identity_bindings: Array of RUNS_AS relationships (one per identity). Each includes protocol from cross-system auth properties. Consistent with the list-level identities array.
  • execution_evidence_summary: Call countExecutionEvidence() for the entity and its RUNS_AS targets.
  • path_summary: Query authority paths for this workload.
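
As an illustration of the path_summary panel, the following sketch reduces a workload's authority paths to the four counters in the response shape. The plan does not define "sensitive" or "dormant" at the path level, so both definitions here are assumptions: sensitive reuses the risk-cluster rule (sensitivity is "confidential" or "restricted"), and dormant means zero executions in the last 30 days. The DetailPath shape and summarizePaths name are hypothetical.

```typescript
// Hypothetical minimal path shape for illustration.
interface DetailPath {
  sensitivity: string;
  execution_30d: number;
}

// Assumptions, not confirmed by the plan:
//  - "sensitive" = sensitivity is "confidential" or "restricted";
//  - "dormant"   = zero executions in the last 30 days.
function summarizePaths(paths: DetailPath[]) {
  return {
    total_paths: paths.length,
    sensitive_paths: paths.filter(
      (p) => p.sensitivity === "confidential" || p.sensitivity === "restricted",
    ).length,
    dormant_paths: paths.filter((p) => p.execution_30d === 0).length,
    total_execution_30d: paths.reduce((sum, p) => sum + p.execution_30d, 0),
  };
}
```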

Implementation file: src/api/routes/exposures.ts


Demo Data Requirements

Current Seed State (seed-demo-w1.ts)

The seed already creates 8-9 workloads, 5 identities, 17 resources, 26+ execution evidence records, and 6 risk clusters. This is sufficient for all 4 endpoints.

Changes Needed

  1. Workload execution_mode coverage: Currently only "triggered" and "autonomous" are used. Add:

    • Change wl-sec-logger to execution_mode: "operator_assisted" (from "triggered")
    • Add or modify one workload with execution_mode: "human_triggered"
    • This populates all 3 workload_counts categories on the Overview page
  2. Posture snapshot extension: The posture snapshot upserted at the end of the seed needs active_autonomous_identities and workload_counts fields for delta computation.

  3. Multi-identity workload: Add at least one workload with two RUNS_AS identities (e.g., wl-data-pipeline runs as both id-svc-etl and id-svc-etl-staging) to verify the identities array and identity_bindings array render correctly. This also validates that exposure ID is stable (hashed on workload, not identity).

  4. Verify display_name coverage: Ensure all role and resource entities in the seed have display_name properties so the ownership_breakdown and identity_bindings panels show human-readable names.

Expected Demo Output

After seeding, the Overview page should show:

  • Autonomous identities: 3-5 (with delta indicator)
  • Dormant authority: 1-2
  • Workload breakdown: autonomous: 3-4, operator_assisted: 1-2, human_triggered: 1
  • Risk clusters: 6 clusters with identity_count and priority badges

Implementation Steps

| Step | File(s) | Description | Effort |
| --- | --- | --- | --- |
| 1 | src/domain/posture/types.ts | Extend PostureSnapshotDoc with optional identity/workload count fields | 5 min |
| 2 | src/services/posture-service.ts | Add identity_counts and workload_counts computation; extend delta | 1-2 hr |
| 2b | src/workers/handlers/evaluate-findings.ts | Update snapshot writer to persist new fields (see Snapshot Writer section below) | 30 min |
| 3 | src/services/risk-cluster-service.ts | Add identity_count, workload_count, sensitive_domains, priority per cluster | 30 min |
| 4 | src/api/routes/exposures.ts | Rewrite exposures list (workload-grain, identities array); rewrite detail to add 5 panels | 2-3 hr |
| 5 | ui/src/api/api-types.ts | Update PathPostureSummary, PathRiskCluster, ExposureSummary, ExposureDetail types | 30 min |
| 6 | ui/src/pages/OverviewPage.tsx | Display identity_counts, workload_counts in stat cards | 1-2 hr |
| 6b | ui/src/pages/ClustersListPage.tsx | Display identity_count, workload_count, sensitive_domains, priority badge on cluster cards | 1 hr |
| 7 | ui/src/pages/ExposureDetailPage.tsx | Render ownership_breakdown, workload_metadata, identity_binding, identities array, etc. | 1 hr |
| 8 | scripts/seed-demo-w1.ts | Add execution_mode diversity, multi-identity workload, extend posture snapshot | 1 hr |
| 9 | Tests | Unit + integration for all 4 endpoints + snapshot writer | 2-3 hr |

Steps 2/2b, 3, and 8 can run in parallel. Step 4 depends on Step 1. Steps 5-7 depend on Steps 2-4.


Snapshot Writer Updates (Critical — addresses delta computation)

The evaluate-findings handler (src/workers/handlers/evaluate-findings.ts, lines 50-80) writes posture snapshots after each evaluation run. It currently persists:

await storageAdapter.upsertPostureSnapshot({
  tenant_id, snapshot_at, sync_id,
  active_paths, dormant_paths, total_executions_30d,
  ownership_invalid_count, cluster_executions,
});

This must be updated to also persist the new fields. Otherwise PostureService.getPostureSummary() will compute deltas against prior snapshots that lack active_autonomous_identities and workload_counts, producing null deltas indefinitely.

Change in evaluate-findings.ts (Step 2b):

// After computing posture summary (which now includes identity_counts/workload_counts):
await storageAdapter.upsertPostureSnapshot({
  tenant_id, snapshot_at, sync_id,
  active_paths, dormant_paths, total_executions_30d,
  ownership_invalid_count, cluster_executions,
  // NEW fields:
  active_autonomous_identities: posture.data.identity_counts?.active_autonomous ?? 0,
  workload_counts: posture.data.workload_counts ?? undefined,
});

The seed script must also match: seed-demo-w1.ts manually inserts posture snapshots at the end. These must include the new fields to ensure the first demo load shows meaningful deltas.
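
To illustrate why a prior snapshot without the new field yields a null delta, here is a minimal sketch of the comparison. The SnapshotLike shape and the function name are hypothetical, and treating new_autonomous_identities as a simple difference (which can go negative) is an assumption; the plan only says the current count is compared against the prior snapshot.

```typescript
// Hypothetical minimal snapshot shape; the field is absent on older snapshots.
interface SnapshotLike {
  active_autonomous_identities?: number;
}

// Returns null when there is no prior snapshot or it predates the new field,
// so callers can tell "no baseline" apart from "no change".
function newAutonomousIdentitiesDelta(
  currentCount: number,
  priorSnapshot: SnapshotLike | null,
): number | null {
  const prior = priorSnapshot?.active_autonomous_identities ?? null;
  return prior === null ? null : currentCount - prior;
}
```

With this shape, the delta stays null until at least one snapshot has been written with the new field, which is why both the handler and the seed need to persist it.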


Risk-Cluster UI Adoption

The review correctly identified that cluster card UI changes were under-scoped. The new fields (identity_count, workload_count, sensitive_domains, priority) are backend-ready but need to appear in the ClustersListPage.

Step 6b: ClustersListPage changes:

Each cluster card currently shows path_count, total_execution_30d, severity. Enhance to also display:

  • Priority badge (P0=red, P1=amber, P2=blue) in the card header
  • Identity count and workload count as secondary stats below path_count
  • Sensitive domains as small tags (e.g., "financial", "identity") at the bottom of the card
  • Sort clusters by priority (P0 first), then by path_count desc
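
The sort order in the last bullet can be expressed as a small comparator. This is a sketch with a hypothetical ClusterCard shape and function name; it copies the input so that the array held in component state is not mutated.

```typescript
// Hypothetical minimal card shape for illustration.
interface ClusterCard {
  cluster_key: string;
  priority: "P0" | "P1" | "P2";
  path_count: number;
}

const PRIORITY_ORDER = { P0: 0, P1: 1, P2: 2 } as const;

// P0 first, then P1, then P2; ties broken by path_count descending.
function sortClusters<T extends ClusterCard>(clusters: T[]): T[] {
  return clusters.slice().sort((a, b) => {
    const byPriority = PRIORITY_ORDER[a.priority] - PRIORITY_ORDER[b.priority];
    return byPriority !== 0 ? byPriority : b.path_count - a.path_count;
  });
}
```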

Risks

| Risk | Mitigation |
| --- | --- |
| Breaking existing UI | All changes are additive: existing fields stay, new fields are added alongside. UI can adopt incrementally. |
| Exposure ID format change | Support both EXP-{hash} and raw entity_id in detail endpoint. Update list to use new format, update UI links. |
| Performance at scale | Authority path queries capped at 5000 (demo-scale: 20-50 paths). Add truncated flag. Future: MongoDB aggregation pipeline for server-side grouping. |
| Posture snapshot backward compat | Handle null/undefined for new fields in prior snapshots using existing priorSnapshot?.field ?? null pattern. |
| Multi-identity workloads | Exposure grain is workload-level. identities is an array. UI shows primary + "+N more" badge. ID is hashed on workload only. |
| Snapshot writer out of sync | Step 2b explicitly updates the evaluate-findings handler to persist new snapshot fields. Tested alongside Step 9. |