Exposure Aggregation APIs: Implementation Plan

Date: 2026-02-25
Status: Draft v2 (addresses review findings)
Depends on: None (foundational; must complete before Gap 2/3 UI work)
Effort estimate: 10-14 hours
Owner: TBD
Target: Week 1 (by Mar 4)

Problem Description

The sv0-platform W1 UI has four key pages — Overview, Risk Clusters, Exposures List, and Exposure Detail — that are already built as React components. The backend APIs powering them (/api/v1/posture/summary, /api/v1/posture/risk-clusters, /api/v1/exposures, /api/v1/exposures/:id) exist and serve data, but they diverge from the W1 specification in ways that weaken the demo narrative.

Current State

All 4 endpoints are implemented and the UI consumes them. The gaps are in data shape and computation method:

Endpoint	What's There	What's Missing
`posture/summary`	Path-centric counts (active_paths, dormant_paths)	Identity-level counts (active_autonomous, dormant_authority) and workload-level counts (autonomous, operator_assisted, human_triggered)
`posture/risk-clusters`	Finding-type cluster matching with path counts	`identity_count`, `workload_count`, `sensitive_domains`, `priority` classification (P0/P1/P2)
`exposures`	Entity-with-findings grouping	Authority-path-based computation, deterministic `EXP-{hash}` IDs, `execution_count_30d`, `data_domains`, `egress_category`
`exposures/:id`	Entity + findings + authority paths	`ownership_breakdown`, `workload_metadata`, `identity_bindings`, `execution_evidence_summary`, `path_summary` panels

Why It Matters

Without identity-level and workload-level aggregations, the Overview page cannot answer "How many autonomous identities exist?" — the fundamental W1 question. Without enriched exposure detail, CISOs cannot drill into a single workload and see its complete authority context in one view.

Architectural Analysis

Approach: Evolve Incrementally

The existing implementations work end-to-end (seed -> API -> UI). Rather than rewriting to match the spec exactly, we enhance the current path-centric architecture with additive fields. This avoids breaking the working UI.

Key principle: all changes are additive — existing response fields stay in place, new fields are added alongside. The UI can adopt new fields incrementally.

Data Flow

All required queries can be served by existing StorageAdapter methods:

Query Need	Existing Method
Count workloads by execution_mode	`countEntities(tenantId, { entityType: "workload", executionMode: "..." })`
Fetch active authority paths	`queryAuthorityPaths(tenantId, { status: "active" })`
Count authority paths by workload	`countAuthorityPaths(tenantId, { workloadId: id })`
Batch entity lookup	`getEntitiesByIds(tenantId, ids)`
Query findings by entity	`queryFindings(tenantId, { entityId, status: "active" })`
Count execution evidence	`countExecutionEvidence(tenantId, { entityId })`
Get posture snapshot	`getPostureSnapshotBefore(tenantId, date)`

One type extension needed: PostureSnapshotDoc (in src/domain/posture/types.ts) needs optional fields for delta computation:

export interface PostureSnapshotDoc {
  // ... existing fields ...
  active_autonomous_identities?: number;
  workload_counts?: {
    autonomous: number;
    operator_assisted: number;
    human_triggered: number;
  };
}

Endpoint Specifications

1. `GET /api/v1/posture/summary`

Current response shape (keep all existing fields):

{
  "data": {
    "active_paths": 15,
    "dormant_paths": 3,
    "total_executions_30d": 1200,
    "ownership_invalid_count": 5,
    "executed_invalid_ownership_count": 2,
    "delta": { "new_paths": 3, "removed_paths": 1, "new_orphaned": 2 },
    "last_refresh": "2026-02-25T10:00:00Z"
  }
}

New fields to add:

{
  "data": {
    // ... existing fields unchanged ...
    "identity_counts": {
      "active_autonomous": 8,
      "dormant_authority": 3
    },
    "workload_counts": {
      "autonomous": 6,
      "operator_assisted": 12,
      "human_triggered": 45
    },
    "delta": {
      // ... existing delta fields ...
      "new_autonomous_identities": 2,
      "ownership_invalidations": 3
    }
  }
}

Computation logic for new fields:

identity_counts.active_autonomous: Collect distinct identity_id values from active authority paths where current_state.execution_30d > 0.
identity_counts.dormant_authority: Same source set (active authority paths), but where execution_30d === 0 AND first_seen_at is more than 90 days in the past.
workload_counts.*: Call countEntities(tenantId, { entityType: "workload", executionMode: "autonomous" }) for each mode.
delta.new_autonomous_identities: Compare against prior PostureSnapshotDoc.active_autonomous_identities.
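
A minimal sketch of the two identity counts described above. It assumes a simplified path shape (`PathLike`, a hypothetical stand-in for AuthorityPathDoc) and that first_seen_at is an ISO-8601 string; `computeIdentityCounts` is an illustrative name, not an existing function. Read literally, the rules let an identity with one executed path and one long-idle path appear in both counts; whether that is intended is left open here.

```typescript
// Hypothetical, simplified path shape; the real AuthorityPathDoc has more fields.
interface PathLike {
  identity_id: string;
  first_seen_at: string; // assumed ISO-8601 timestamp
  current_state: { execution_30d: number };
}

const DAY_MS = 24 * 60 * 60 * 1000;

function computeIdentityCounts(
  activePaths: PathLike[],
  now: Date = new Date(),
): { active_autonomous: number; dormant_authority: number } {
  // Paths first seen before this instant are old enough to count as dormant.
  const dormancyCutoff = now.getTime() - 90 * DAY_MS;
  const active = new Set<string>();
  const dormant = new Set<string>();

  for (const path of activePaths) {
    if (path.current_state.execution_30d > 0) {
      active.add(path.identity_id);
    } else if (Date.parse(path.first_seen_at) < dormancyCutoff) {
      dormant.add(path.identity_id);
    }
  }

  // Sets deduplicate identities that appear on several paths.
  return { active_autonomous: active.size, dormant_authority: dormant.size };
}
```

Passing `now` explicitly keeps the 90-day window deterministic in unit tests.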

Implementation file: src/services/posture-service.ts

2. `GET /api/v1/posture/risk-clusters`

Current response already returns RiskClusterResult[] with cluster_key, label, description, severity, finding_types, path_count, total_execution_30d, ownership_breakdown, sensitivity_breakdown, oldest_finding_days, new_paths_30d.

New fields per cluster:

{
  "identity_count": 5,
  "workload_count": 7,
  "sensitive_domains": ["financial", "identity"],
  "priority": "P0"
}

Computation logic:

identity_count: For each cluster's matched path IDs, collect distinct identity_id values from AuthorityPathDoc records.
workload_count: Same but for workload_id.
sensitive_domains: Collect distinct data_domain values from matched paths where sensitivity is "confidential" or "restricted".
priority: Deterministic: P0 = severity "critical" + paths > 0; P1 = severity "high" + paths > 0; P2 = everything else with paths > 0.
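
The per-cluster fields above could be derived roughly as follows. This is a sketch under assumptions: `ClusterPath` is a hypothetical reduced view of AuthorityPathDoc, and `enrichCluster` and `classifyPriority` are illustrative names. The plan does not say what priority a cluster with zero paths receives, so the sketch returns null in that case as a placeholder.

```typescript
type Priority = "P0" | "P1" | "P2";

// Hypothetical reduced path shape, limited to the fields the cluster stats need.
interface ClusterPath {
  identity_id: string;
  workload_id: string;
  data_domain: string;
  sensitivity: string;
}

function classifyPriority(severity: string, pathCount: number): Priority | null {
  if (pathCount <= 0) return null; // assumption: the plan leaves this case undefined
  if (severity === "critical") return "P0";
  if (severity === "high") return "P1";
  return "P2";
}

function enrichCluster(severity: string, paths: ClusterPath[]) {
  const identities = new Set(paths.map((p) => p.identity_id));
  const workloads = new Set(paths.map((p) => p.workload_id));
  const sensitiveDomains = new Set(
    paths
      .filter((p) => p.sensitivity === "confidential" || p.sensitivity === "restricted")
      .map((p) => p.data_domain),
  );

  return {
    identity_count: identities.size,
    workload_count: workloads.size,
    // Sorted so the response is stable across runs.
    sensitive_domains: Array.from(sensitiveDomains).sort(),
    priority: classifyPriority(severity, paths.length),
  };
}
```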

Implementation file: src/services/risk-cluster-service.ts

3. `GET /api/v1/exposures`

Current issues: Groups by entity (workload with findings) rather than by authority path. Uses entity_id as exposure ID. Does not trace workload -> RUNS_AS -> identity chain.

Rewrite to authority-path-based computation.

Exposure grain decision: workload-level, not workload+identity.

A workload with multiple RUNS_AS identities (e.g., a Function App with both a system-assigned MI and user-assigned MI) produces ONE exposure row, not N. The identities field is an array. This avoids splitting what is operationally one workload into confusing duplicate rows.

If a workload has zero identities (unknown_identity_binding), it still gets one exposure row with identities: [] and identity_binding: "unknown".

Target response shape:

{
  "data": [
    {
      "id": "EXP-a1b2c3d4",
      "workload_id": "...",
      "workload_name": "Invoice Processing Rule",
      "workload_type": "workload",
      "source_system": "servicenow",
      "identities": [
        { "id": "...", "name": "svc-finance-api", "source_system": "entra_id" }
      ],
      "identity_binding": "bound",
      "path_count": 3,
      "finding_count": 5,
      "finding_types": ["orphaned_ownership", "reachable_sensitive_domain"],
      "max_severity": "critical",
      "sensitive_domain_count": 2,
      "data_domains": ["finance", "customer"],
      "execution_count_30d": 120,
      "last_execution_at": "2026-02-25T10:00:00Z",
      "ownership_status": "orphaned",
      "egress_category": "internal",
      "execution_mode": "triggered",
      "last_evaluated_at": "2026-02-25T10:00:00Z"
    }
  ],
  "cursor": { "has_more": true, "next": "..." },
  "meta": { "total_count": 45 }
}

Why identities is an array: A workload may have RUNS_AS relationships to multiple identities (system-assigned MI + user-assigned MI, or different SPs per environment). Collapsing to a single identity_id would silently drop authority paths from secondary identities. The UI displays the primary identity name with a "+N more" badge if multiple exist. The identity_binding field reflects the composite state: "bound" if any identity resolved, "unknown" if zero identities.
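
As a small illustration of the display rule and composite binding state, the helpers below are a sketch only. They assume the first array element is the "primary" identity (the plan does not define how the primary is chosen) and use a placeholder label for the empty case; `identityLabel` and `identityBinding` are hypothetical names.

```typescript
interface IdentityRef {
  id: string;
  name: string;
  source_system: string;
}

// Composite binding state: "bound" if any identity resolved, otherwise "unknown".
function identityBinding(identities: IdentityRef[]): "bound" | "unknown" {
  return identities.length > 0 ? "bound" : "unknown";
}

// Label for the list row: primary name plus a "+N more" suffix when needed.
function identityLabel(identities: IdentityRef[]): string {
  const primary = identities[0]; // assumption: first element is the primary identity
  if (primary === undefined) return "Unknown identity"; // placeholder wording
  const extra = identities.length - 1;
  return extra > 0 ? `${primary.name} +${extra} more` : primary.name;
}
```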

Deterministic ID: EXP-{sha256(tenant_id + workload_id).slice(0, 8)} — hashed on workload only, NOT on identity. One workload = one exposure ID regardless of how many identities it uses.
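
A minimal sketch of the ID formula, assuming Node's built-in crypto module and a hex-encoded digest (the example ID "EXP-a1b2c3d4" suggests hex, but the encoding is not stated). `exposureId` is an illustrative name.

```typescript
import { createHash } from "node:crypto";

// Builds EXP-{first 8 hex chars of sha256(tenant_id + workload_id)}.
// Identity IDs are deliberately not part of the input.
function exposureId(tenantId: string, workloadId: string): string {
  const digest = createHash("sha256")
    .update(tenantId + workloadId) // plain concatenation, as written in the plan
    .digest("hex");
  return `EXP-${digest.slice(0, 8)}`;
}
```

Plain concatenation means the pairs ("ab", "c") and ("a", "bc") hash to the same value. Inserting a separator between the two parts would rule that out, though that goes beyond what the plan specifies.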

Computation logic:

Query active authority paths, group by workload_id.
For each workload group: collect distinct identity_ids into an array, collect destination data_domains, sensitivity levels; sum execution_30d; find max last_execution_at; count paths.
Batch-fetch workload entities AND identity entities for display names and properties.
Query active findings grouped by entity_id for all workload IDs.
Merge path-derived stats with finding data.
Generate deterministic ID: EXP-{sha256(tenant_id + workload_id).slice(0, 8)}.
Handle "identity unknown" case: workloads with findings but NO authority paths get identities: [], identity_binding: "unknown", path_count: 0.
Support filters: severity, cluster, egress_category, ownership_status, execution_mode.
Sort by max_severity desc, then finding_count desc.
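
The grouping and sorting steps above can be sketched as below. `ActivePath` is a hypothetical flattened view of an authority path, and the function names are illustrative. The sketch assumes timestamps are ISO-8601 UTC strings in a single format, and that a path may carry a null identity_id when the binding is unknown. Only "critical" and "high" severities appear in this plan; the "medium" and "low" ranks are assumptions.

```typescript
// Hypothetical, simplified path shape; the real AuthorityPathDoc has more fields.
interface ActivePath {
  workload_id: string;
  identity_id: string | null;
  data_domain: string;
  execution_30d: number;
  last_execution_at: string | null; // ISO-8601 UTC, so string order equals time order
}

interface WorkloadStats {
  workload_id: string;
  identity_ids: string[];
  data_domains: string[];
  execution_count_30d: number;
  last_execution_at: string | null;
  path_count: number;
}

interface Accumulator {
  identityIds: Set<string>;
  dataDomains: Set<string>;
  executions: number;
  lastExecutionAt: string | null;
  pathCount: number;
}

// One output row per workload, regardless of how many identities it runs as.
function groupByWorkload(paths: ActivePath[]): WorkloadStats[] {
  const groups = new Map<string, Accumulator>();
  for (const path of paths) {
    let acc = groups.get(path.workload_id);
    if (acc === undefined) {
      acc = {
        identityIds: new Set<string>(),
        dataDomains: new Set<string>(),
        executions: 0,
        lastExecutionAt: null,
        pathCount: 0,
      };
      groups.set(path.workload_id, acc);
    }
    if (path.identity_id !== null) acc.identityIds.add(path.identity_id);
    acc.dataDomains.add(path.data_domain);
    acc.executions += path.execution_30d;
    acc.pathCount += 1;
    const seen = path.last_execution_at;
    if (seen !== null && (acc.lastExecutionAt === null || seen > acc.lastExecutionAt)) {
      acc.lastExecutionAt = seen;
    }
  }
  return Array.from(groups, ([workloadId, acc]) => ({
    workload_id: workloadId,
    identity_ids: Array.from(acc.identityIds),
    data_domains: Array.from(acc.dataDomains),
    execution_count_30d: acc.executions,
    last_execution_at: acc.lastExecutionAt,
    path_count: acc.pathCount,
  }));
}

// "medium" and "low" are assumed ranks; unknown severities sort last.
const SEVERITY_RANK: Record<string, number> = { critical: 4, high: 3, medium: 2, low: 1 };

// Comparator: max_severity descending, then finding_count descending.
function compareExposures(
  a: { max_severity: string; finding_count: number },
  b: { max_severity: string; finding_count: number },
): number {
  const bySeverity = (SEVERITY_RANK[b.max_severity] ?? 0) - (SEVERITY_RANK[a.max_severity] ?? 0);
  return bySeverity !== 0 ? bySeverity : b.finding_count - a.finding_count;
}
```

Grouping in application code fits the demo-scale path counts; the Risks table already notes a server-side aggregation pipeline as the future option for larger tenants.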

Implementation file: src/api/routes/exposures.ts

4. `GET /api/v1/exposures/:id`

The current implementation works but is missing five panels. Add:

{
  "data": {
    // ... existing fields (entity, findings, authority_paths, evidence_completeness, etc.) ...
    "ownership_breakdown": {
      "owners": [
        { "id": "...", "name": "Bob Chen", "type": "individual", "status": "active", "role": "primary" }
      ],
      "effective_status": "owned"
    },
    "workload_metadata": {
      "source_system": "servicenow",
      "source_id": "wl-invoice-rule",
      "artifact_identifier": "Business Rule: Invoice Processing",
      "created_at": "2025-11-01T00:00:00Z",
      "last_synced_at": "2026-02-25T10:00:00Z"
    },
    "identity_bindings": [
      {
        "relationship_type": "RUNS_AS",
        "identity_id": "...",
        "identity_name": "svc-finance-api",
        "protocol": "client_credentials",
        "target_system": "entra_id"
      }
    ],
    "execution_evidence_summary": {
      "total_direct_records": 8,
      "total_related_records": 12,
      "last_execution_at": "2026-02-25T08:00:00Z",
      "execution_count_30d": 120
    },
    "path_summary": {
      "total_paths": 3,
      "sensitive_paths": 2,
      "dormant_paths": 0,
      "total_execution_30d": 120
    }
  }
}

Computation logic:

ownership_breakdown: Resolve OWNED_BY/CREATED_BY relationships from entity, batch-fetch owner entities, return status and display name.
workload_metadata: From entity doc (source_system, source_id, created_at, last_synced_at). artifact_identifier from properties.description.
identity_bindings: Array of RUNS_AS relationships (one per identity). Each includes protocol from cross-system auth properties. Consistent with the list-level identities array.
execution_evidence_summary: Call countExecutionEvidence() for the entity and its RUNS_AS targets.
path_summary: Query authority paths for this workload.

Implementation file: src/api/routes/exposures.ts

Demo Data Requirements

Current Seed State (seed-demo-w1.ts)

The seed already creates 8-9 workloads, 5 identities, 17 resources, 26+ execution evidence records, and 6 risk clusters. This is sufficient for all 4 endpoints.

Changes Needed

Workload execution_mode coverage: Currently only "triggered" and "autonomous" are used. Add:
- Change wl-sec-logger to execution_mode: "operator_assisted" (from "triggered")
- Add or modify one workload with execution_mode: "human_triggered"
- This populates all 3 workload_counts categories on the Overview page
Posture snapshot extension: The posture snapshot upserted at the end of the seed needs active_autonomous_identities and workload_counts fields for delta computation.
Multi-identity workload: Add at least one workload with two RUNS_AS identities (e.g., wl-data-pipeline runs as both id-svc-etl and id-svc-etl-staging) to verify the identities array and identity_bindings array render correctly. This also validates that exposure ID is stable (hashed on workload, not identity).
Verify display_name coverage: Ensure all role and resource entities in the seed have display_name properties so the ownership_breakdown and identity_bindings panels show human-readable names.

Expected Demo Output

After seeding, the Overview page should show:

Autonomous identities: 3-5 (with delta indicator)
Dormant authority: 1-2
Workload breakdown: autonomous: 3-4, operator_assisted: 1-2, human_triggered: 1
Risk clusters: 6 clusters with identity_count and priority badges

Implementation Steps

Step	File(s)	Description	Effort
1	`src/domain/posture/types.ts`	Extend PostureSnapshotDoc with optional identity/workload count fields	5 min
2	`src/services/posture-service.ts`	Add identity_counts and workload_counts computation; extend delta	1-2 hr
2b	`src/workers/handlers/evaluate-findings.ts`	Update snapshot writer to persist new fields (see Snapshot Writer section below)	30 min
3	`src/services/risk-cluster-service.ts`	Add identity_count, workload_count, sensitive_domains, priority per cluster	30 min
4	`src/api/routes/exposures.ts`	Rewrite exposures list (workload-grain, identities array); rewrite detail to add 5 panels	2-3 hr
5	`ui/src/api/api-types.ts`	Update PathPostureSummary, PathRiskCluster, ExposureSummary, ExposureDetail types	30 min
6	`ui/src/pages/OverviewPage.tsx`	Display identity_counts, workload_counts in stat cards	1-2 hr
6b	`ui/src/pages/ClustersListPage.tsx`	Display identity_count, workload_count, sensitive_domains, priority badge on cluster cards	1 hr
7	`ui/src/pages/ExposureDetailPage.tsx`	Render ownership_breakdown, workload_metadata, identity_bindings, identities array, etc.	1 hr
8	`scripts/seed-demo-w1.ts`	Add execution_mode diversity, multi-identity workload, extend posture snapshot	1 hr
9	Tests	Unit + integration for all 4 endpoints + snapshot writer	2-3 hr

Steps 2/2b, 3, and 8 can run in parallel. Step 4 depends on Step 1. Steps 5-7 depend on Steps 2-4.

Snapshot Writer Updates (Critical — addresses delta computation)

The evaluate-findings handler (src/workers/handlers/evaluate-findings.ts, lines 50-80) writes posture snapshots after each evaluation run. It currently persists:

await storageAdapter.upsertPostureSnapshot({
  tenant_id, snapshot_at, sync_id,
  active_paths, dormant_paths, total_executions_30d,
  ownership_invalid_count, cluster_executions,
});

This must be updated to also persist the new fields. Otherwise PostureService.getPostureSummary() will compute deltas against prior snapshots that lack active_autonomous_identities and workload_counts, producing null deltas indefinitely.

Change in evaluate-findings.ts (Step 2b):

// After computing posture summary (which now includes identity_counts/workload_counts):
await storageAdapter.upsertPostureSnapshot({
  tenant_id, snapshot_at, sync_id,
  active_paths, dormant_paths, total_executions_30d,
  ownership_invalid_count, cluster_executions,
  // NEW fields:
  active_autonomous_identities: posture.data.identity_counts?.active_autonomous ?? 0,
  workload_counts: posture.data.workload_counts, // optional field; stays undefined when the summary lacks it
});

The seed script must also match: seed-demo-w1.ts manually inserts posture snapshots at the end. These must include the new fields to ensure the first demo load shows meaningful deltas.
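
To illustrate why snapshots without the new fields yield null deltas, here is a sketch of the comparison. It assumes the delta is a simple subtraction of the prior snapshot's count from the current one, which the plan does not state; `PriorSnapshot` and `newAutonomousIdentitiesDelta` are hypothetical names.

```typescript
// Hypothetical slice of PostureSnapshotDoc; only the field used for this delta.
interface PriorSnapshot {
  active_autonomous_identities?: number;
}

// Returns null when there is no prior snapshot or it predates the new field.
function newAutonomousIdentitiesDelta(
  currentCount: number,
  priorSnapshot: PriorSnapshot | null,
): number | null {
  const previous = priorSnapshot?.active_autonomous_identities ?? null;
  return previous === null ? null : currentCount - previous;
}
```

Once the writer and the seed both persist active_autonomous_identities, the first branch stops firing for new snapshots and the delta becomes a number.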

Risk-Cluster UI Adoption

The review correctly identified that cluster card UI changes were under-scoped. The new fields (identity_count, workload_count, sensitive_domains, priority) are backend-ready but need to appear in the ClustersListPage.

Step 6b: ClustersListPage changes:

Each cluster card currently shows path_count, total_execution_30d, severity. Enhance to also display:

Priority badge (P0=red, P1=amber, P2=blue) in the card header
Identity count and workload count as secondary stats below path_count
Sensitive domains as small tags (e.g., "financial", "identity") at the bottom of the card
Sort clusters by priority (P0 first), then by path_count desc
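
The card ordering in the last item could be implemented with a comparator along these lines. This is a sketch: `ClusterCard` is a hypothetical minimal shape and `sortClusters` is an illustrative name.

```typescript
type Priority = "P0" | "P1" | "P2";

// Hypothetical minimal card shape; real cluster objects carry more fields.
interface ClusterCard {
  priority: Priority;
  path_count: number;
}

const PRIORITY_ORDER: Record<Priority, number> = { P0: 0, P1: 1, P2: 2 };

// Returns a new array: P0 first, then P1, then P2; ties broken by path_count descending.
function sortClusters<T extends ClusterCard>(clusters: T[]): T[] {
  return clusters.slice().sort((a, b) => {
    const byPriority = PRIORITY_ORDER[a.priority] - PRIORITY_ORDER[b.priority];
    return byPriority !== 0 ? byPriority : b.path_count - a.path_count;
  });
}
```

Copying with `slice()` before sorting avoids mutating the array held in component state.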

Risks

Risk	Mitigation
Breaking existing UI	All changes are additive — existing fields stay, new fields added alongside. UI can adopt incrementally.
Exposure ID format change	Support both `EXP-{hash}` and raw `entity_id` in detail endpoint. Update list to use new format, update UI links.
Performance at scale	Authority path queries capped at 5000 (demo-scale: 20-50 paths). Add `truncated` flag. Future: MongoDB aggregation pipeline for server-side grouping.
Posture snapshot backward compat	Handle null/undefined for new fields in prior snapshots using existing `priorSnapshot?.field ?? null` pattern.
Multi-identity workloads	Exposure grain is workload-level. `identities` is an array. UI shows primary + "+N more" badge. ID is hashed on workload only.
Snapshot writer out of sync	Step 2b explicitly updates the evaluate-findings handler to persist the new snapshot fields. The Step 9 tests cover the snapshot writer.
