Implementation Plan: Synthesize Data-Plane Authority Chain for Azure Foundry Connector

Date: 2026-03-03
Status: Draft v2 (addresses all v1 review findings)
Scope: sv0-connectors (azure-foundry, sv0_azure shared lib)
Platform changes: None required
Review history: v1 → v2: 5 findings (3 blocking, 1 high, 1 advisory). See §13.

1. Problem Statement

The azure-foundry connector discovers AI agents, their managed identities, and ARM RBAC role assignments for those identities. The platform's path materializer then walks the standard authority chain:

workload → RUNS_AS → identity → HAS_ROLE → role → GRANTS → permission → APPLIES_TO → resource

to produce authority paths. The problem: the ARM RBAC query returns no role assignments for Foundry managed identities, because Foundry grants access through an implicit data-plane binding rather than through ARM role assignments.

Concrete example: A Foundry project sv0-foundry-project has a system-assigned managed identity (entra-sp-{pid}). An AI agent my-agent within the project runs as this identity. The connector emits:

workload (ai_agent: "my-agent")
  RUNS_AS → identity (entra-sp-{pid})

But the chain stops there. The identity has no ARM role assignments, so HAS_ROLE never fires, and execution_paths remains [].
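The break in the chain can be reproduced with a minimal sketch of the walk. This is not the platform's materializer (which is TypeScript); the edge tuple shape, the function name `walk_authority_chain`, and the placeholder node IDs are assumptions made for illustration only.

```python
from collections import defaultdict

# Edge types followed in order, taken from the standard authority chain.
CHAIN = ["RUNS_AS", "HAS_ROLE", "GRANTS", "APPLIES_TO"]


def walk_authority_chain(edges, workload_id):
    """Return every resource reachable from workload_id via the full chain.

    `edges` is a list of (edge_type, source_id, target_id) tuples
    (a hypothetical shape chosen for this sketch).
    """
    index = defaultdict(list)
    for edge_type, source, target in edges:
        index[(edge_type, source)].append(target)

    frontier = [workload_id]
    for edge_type in CHAIN:
        frontier = [t for node in frontier for t in index[(edge_type, node)]]
        if not frontier:
            return []  # the chain broke at this hop
    return frontier


# What the connector emits today: only the RUNS_AS edge exists.
current_edges = [("RUNS_AS", "my-agent", "entra-sp-pid")]
print(walk_authority_chain(current_edges, "my-agent"))  # []
```

With no HAS_ROLE edge leaving the identity, the walk stops at the second hop and returns no resources, which is how an empty execution_paths list arises.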

Current state in the database: 12 entities (6 identities, 6 resources), 6 orphaned_ownership findings, zero authority paths.

2. Root Cause Analysis

Azure AI Foundry uses a two-tier access model:

Tier	What it controls	How access is granted	Visible via ARM RBAC?
Control plane (ARM)	Create/modify the project, hub, AIServices account	Standard ARM role assignments (Owner, Contributor, Reader)	Yes
Data plane	Invoke agents, use connections, run endpoints within a project	Implicit binding — the project's managed identity is automatically granted access to all project resources	No

The connector only queries ARM RBAC (arm_client.get_role_assignments_for_principal()). Data-plane access is invisible to this API. The identity genuinely has no ARM role assignments — its access exists only at the Foundry service layer.

This is analogous to how an Azure SQL managed identity gets data-plane access to its database without any ARM role, or how a Key Vault access policy grants secret access without an ARM role assignment.

3. Solution: Synthetic Data-Plane Authority Chain

For each managed identity bound to a Foundry project, synthesize authority nodes that model the implicit data-plane access:

identity (entra-sp-{pid})
  HAS_ROLE → role (synthetic: "Foundry Project Member")
    GRANTS → permission (actions: [execute, read, connect])
      APPLIES_TO → resource (the project workspace)

This uses existing node types (role, permission, resource) and edge types (HAS_ROLE, GRANTS, APPLIES_TO) already defined in the platform schema (Data Model). The path materializer requires no changes — it already walks this chain.

Why "synthetic"?

The word "synthetic" distinguishes these nodes from nodes backed by a real ARM role assignment. All synthetic nodes and edges carry:

properties.synthetic: true
properties.source: "foundry_data_plane"

This allows the platform (and future rules/evaluators) to distinguish between explicit ARM authority and implicit data-plane authority.
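As an illustration of how a downstream rule might use these two properties, the sketch below separates explicit from synthetic nodes. The dict-based node shape and the helper names `is_synthetic` and `split_by_provenance` are hypothetical; only the `synthetic` and `source` property values come from this plan.

```python
def is_synthetic(node: dict) -> bool:
    """True when a node (or edge) carries the synthetic data-plane marker."""
    props = node.get("properties", {})
    return props.get("synthetic") is True and props.get("source") == "foundry_data_plane"


def split_by_provenance(nodes):
    """Partition nodes into (explicit ARM authority, implicit data-plane authority)."""
    explicit = [n for n in nodes if not is_synthetic(n)]
    implicit = [n for n in nodes if is_synthetic(n)]
    return explicit, implicit


# Hypothetical sample nodes for demonstration.
sample_nodes = [
    {"nodeId": "arm-role-1", "properties": {"role_name": "Reader"}},
    {"nodeId": "foundry-role-1",
     "properties": {"synthetic": True, "source": "foundry_data_plane"}},
]
explicit_nodes, implicit_nodes = split_by_provenance(sample_nodes)
print([n["nodeId"] for n in explicit_nodes])  # ['arm-role-1']
print([n["nodeId"] for n in implicit_nodes])  # ['foundry-role-1']
```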

Why this works

The connector already discovers:

- Which managed identity is bound to which project (the RUNS_AS edge from workspace → identity)
- Which connections exist in each project (via data-plane API, when Azure AI User role is granted)
- Which agents exist and what they invoke

The missing piece is the authority link from identity → resource. The synthetic chain fills this gap using data we already have.

4. Prerequisites: Azure AI User Role

Before the synthetic chain adds value, the connector needs the Azure AI User role on the AIServices accounts being scanned. Without it:

- Agent discovery returns None (the request is rejected with HTTP 401)
- Connection discovery returns None
- Execution evidence (run summaries) is unavailable

With it:

- Agents are listed with model, tool types, and identity binding
- Connections are listed with endpoint URLs and auth methods
- Run summaries provide execution counts and timestamps

See Pilot Permissions for the full permission matrix.

Action: Grant Azure AI User to the scanner's App Registration on each AIServices account. This is a manual one-time step (~2 minutes per account in the Azure Portal).

5. Implementation Tasks

Task 1: Shared Library — `sv0_azure`

New file: shared/sv0_azure/sv0_azure/foundry_actions.py

A Foundry-specific action normalizer, parallel to the existing arm_roles.py. Uses execute (not invoke) as the primary action keyword so it is recognized by WRITE_LEVEL_ACTIONS in both the connector (arm_roles.py:58) and the platform evaluator (privilege-justification-gap.ts:9):

FOUNDRY_DATA_PLANE_ACTIONS = {
    "Foundry Project Member": ["execute", "read", "connect"],
}

def normalize_foundry_action(role_name: str) -> NormalizedAction:
    """Normalize a synthetic Foundry data-plane role to standard actions."""
    actions = FOUNDRY_DATA_PLANE_ACTIONS.get(role_name, ["execute", "read"])
    return NormalizedAction(
        action=actions[0],  # "execute" — write-level, triggers privilege_justification_gap
        actions=actions,
        is_fallback=role_name not in FOUNDRY_DATA_PLANE_ACTIONS,
    )

(v2 change — addresses review finding 5: invoke is not in WRITE_LEVEL_ACTIONS, so it would never trigger write-level findings. AI agents can take actions, call tools, and write data — execute correctly classifies this.)
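To show the lookup and fallback behavior in isolation, here is a self-contained sketch. The `NormalizedAction` dataclass below is a hypothetical stand-in for the shared-library type (its real definition is not shown in this plan), and "Some Future Role" is an invented role name used only to exercise the fallback branch.

```python
from dataclasses import dataclass


@dataclass
class NormalizedAction:
    """Hypothetical stand-in; field names follow the usage in this plan."""
    action: str
    actions: list[str]
    is_fallback: bool = False


FOUNDRY_DATA_PLANE_ACTIONS = {
    "Foundry Project Member": ["execute", "read", "connect"],
}


def normalize_foundry_action(role_name: str) -> NormalizedAction:
    """Map a synthetic Foundry role name to its normalized actions."""
    actions = FOUNDRY_DATA_PLANE_ACTIONS.get(role_name, ["execute", "read"])
    return NormalizedAction(
        action=actions[0],  # primary action; "execute" in both branches
        actions=actions,
        is_fallback=role_name not in FOUNDRY_DATA_PLANE_ACTIONS,
    )


known = normalize_foundry_action("Foundry Project Member")
unknown = normalize_foundry_action("Some Future Role")  # invented name
print(known.action, known.is_fallback)       # execute False
print(unknown.actions, unknown.is_fallback)  # ['execute', 'read'] True
```

Note that the fallback still reports `execute` as the primary action, so an unrecognized synthetic role is treated as write-level rather than silently downgraded.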

Edit: shared/sv0_azure/sv0_azure/node_ids.py

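Add stable node ID and source ID generators for synthetic nodes. The hash input is the full principal GUID (not a truncated prefix), which matches the sp_node_id() convention. The SHA-256 digest itself is cut to 16 hex characters (64 bits), which keeps birthday-paradox collisions negligible at realistic identity counts: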

import hashlib

def foundry_role_node_id(project_resource_id: str, principal_id: str) -> str:
    """Synthetic data-plane role for a managed identity in a Foundry project."""
    content = f"foundry-dp-role:{principal_id}:{project_resource_id}"
    return f"foundry-role-{hashlib.sha256(content.encode()).hexdigest()[:16]}"

def foundry_role_source_id(project_resource_id: str, principal_id: str) -> str:
    """Deterministic sourceId for synthetic role — used by platform upsert filter."""
    return f"foundry-dp-role:{principal_id}:{project_resource_id}"

def foundry_permission_node_id(project_resource_id: str, principal_id: str) -> str:
    """Synthetic data-plane permission for a managed identity in a Foundry project."""
    content = f"foundry-dp-perm:{principal_id}:{project_resource_id}"
    return f"foundry-perm-{hashlib.sha256(content.encode()).hexdigest()[:16]}"

def foundry_permission_source_id(project_resource_id: str, principal_id: str) -> str:
    """Deterministic sourceId for synthetic permission — used by platform upsert filter."""
    return f"foundry-dp-perm:{principal_id}:{project_resource_id}"

(v2 change — addresses review findings 1 and 4: adds deterministic sourceId schemes required by entity-adapter.ts upsert filter (tenant_id, source_system, source_id), and uses content-addressed SHA-256 hashes instead of truncated GUIDs to avoid collision risk.)

Why sourceId matters: The platform persists entities by (source_system, source_id), not nodeId (entity-adapter.ts:30-37). nodeId is ephemeral — only used within a single sync for edge resolution (graph-transformer.ts:116). Without deterministic sourceIds, each scan would create new entities instead of updating existing ones.
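The effect of a deterministic sourceId on persistence can be sketched with an in-memory store keyed the same way as the upsert filter. The `upsert` helper, the dict store, the tenant ID, and the placeholder resource and principal IDs are all assumptions for illustration; the real logic lives in entity-adapter.ts.

```python
import uuid


def foundry_role_source_id(project_resource_id: str, principal_id: str) -> str:
    """Deterministic sourceId scheme from this plan."""
    return f"foundry-dp-role:{principal_id}:{project_resource_id}"


def upsert(store: dict, tenant_id: str, source_system: str, source_id: str, payload: dict) -> None:
    """Insert or overwrite by (tenant_id, source_system, source_id)."""
    store[(tenant_id, source_system, source_id)] = payload


stable_store, churning_store = {}, {}
for scan in range(2):
    # Deterministic sourceId: the second scan updates the same entity.
    upsert(stable_store, "tenant-1", "azure",
           foundry_role_source_id("/placeholder/project", "placeholder-pid"),
           {"scan": scan})
    # Non-deterministic sourceId: every scan creates a new entity.
    upsert(churning_store, "tenant-1", "azure", f"role-{uuid.uuid4()}", {"scan": scan})

print(len(stable_store), len(churning_store))  # 1 2
```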

Task 2: Connector — `azure-foundry` Edge Resolver

Edit: edge_resolver.py

Add resolve_data_plane_authority_edges(). The scope_resource_type is derived dynamically from workspace.kind using the same mapping as _emit_project_node() in the transformer — not hardcoded:

# Reuse the existing workspace kind → ARM resource type mapping
_WORKSPACE_KIND_TO_RESOURCE_TYPE = {
    "FoundryProject": "Microsoft.CognitiveServices/accounts/projects",
    "AIServices": "Microsoft.CognitiveServices/accounts",
    # "Project" and "Hub" fall through to default:
}
_DEFAULT_RESOURCE_TYPE = "Microsoft.MachineLearningServices/workspaces"

def resolve_data_plane_authority_edges(
    self,
    workspace: FoundryWorkspace,
    principal_id: str,
) -> list[NormalizedEdge]:
    """
    Synthesize HAS_ROLE → GRANTS → APPLIES_TO chain for a managed identity's
    implicit data-plane access to a Foundry project.
    """
    role_id = foundry_role_node_id(workspace.resource_id, principal_id)
    perm_id = foundry_permission_node_id(workspace.resource_id, principal_id)
    ws_id = self._workspace_node_id(workspace)
    sp_id = self._sp_node_id(principal_id)

    resource_type = _WORKSPACE_KIND_TO_RESOURCE_TYPE.get(
        workspace.kind, _DEFAULT_RESOURCE_TYPE
    )

    return [
        self._edge("HAS_ROLE", sp_id, role_id, {
            "scope": workspace.resource_id,
            "scope_resource_type": resource_type,
            "synthetic": True,
            "source": "foundry_data_plane",
        }),
        self._edge("GRANTS", role_id, perm_id),
        self._edge("APPLIES_TO", perm_id, ws_id),
    ]

(v2 change, addresses review finding 3: the connector handles workspace kinds that map to 3 distinct ARM resource types. FoundryProject → Microsoft.CognitiveServices/accounts/projects, AIServices → Microsoft.CognitiveServices/accounts, and Project/Hub → Microsoft.MachineLearningServices/workspaces. Hardcoding any single type mislabels the other kinds; the v1 hardcoding of Microsoft.MachineLearningServices/workspaces would mislabel modern Foundry projects.)
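The kind-aware lookup can be exercised on its own. In this sketch the mapping and default are copied from the plan, while the helper name `resource_type_for_kind` is hypothetical (the plan performs the lookup inline in the edge resolver).

```python
_WORKSPACE_KIND_TO_RESOURCE_TYPE = {
    "FoundryProject": "Microsoft.CognitiveServices/accounts/projects",
    "AIServices": "Microsoft.CognitiveServices/accounts",
    # "Project" and "Hub" fall through to the default below.
}
_DEFAULT_RESOURCE_TYPE = "Microsoft.MachineLearningServices/workspaces"


def resource_type_for_kind(kind: str) -> str:
    """Resolve a workspace kind to its ARM resource type (hypothetical helper)."""
    return _WORKSPACE_KIND_TO_RESOURCE_TYPE.get(kind, _DEFAULT_RESOURCE_TYPE)


for kind in ("FoundryProject", "AIServices", "Project", "Hub"):
    print(f"{kind:15} -> {resource_type_for_kind(kind)}")
```

One consequence of the default branch is that an unrecognized kind is also labeled as a Machine Learning workspace, so a test for unknown kinds may be worth adding.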

Task 3: Connector — `azure-foundry` Transformer

Edit: transformer.py

Add two new emitter methods and update the existing workspace emitter. Every property the path materializer reads uses snake_case to match its contract (path-materializer.ts:129,153-155):

_emit_foundry_role_node(workspace, principal_id) emits a role node:
- nodeId: from foundry_role_node_id()
- sourceSystem: "azure"
- sourceId: from foundry_role_source_id(), e.g. "foundry-dp-role:{pid}:{resource_id}"
- nodeType: "role"
- displayName: "Foundry Project Member"
- properties.role_name: "Foundry Project Member" (snake_case; the materializer reads role.properties.role_name)
- properties.roleDefinitionId: "foundry-data-plane" (synthetic, no real ARM role definition)
- properties.scope: workspace.resource_id
- properties.synthetic: true
- properties.source: "foundry_data_plane"

_emit_foundry_permission_node(workspace, principal_id) emits a permission node:
- nodeId: from foundry_permission_node_id()
- sourceSystem: "azure"
- sourceId: from foundry_permission_source_id(), e.g. "foundry-dp-perm:{pid}:{resource_id}"
- nodeType: "permission"
- displayName: "Foundry data-plane access on {project_name}"
- properties.normalized_action: "execute" (from normalize_foundry_action(); write-level)
- properties.actions: ["execute", "read", "connect"]
- properties.action_classification: "explicit"
- properties.synthetic: true
- properties.source: "foundry_data_plane"

Update existing workspace resource nodes so that _emit_project_node() includes the properties the materializer reads:
- properties.resource_name: workspace.display_name (materializer reads resource.properties.resource_name)
- properties.business_domain: "azure" (materializer reads resource.properties.business_domain)
- properties.sensitivity: "internal" (AIServices/Foundry projects; the materializer reads resource.properties.sensitivity. Without this, normalizeSensitivity() defaults to "unknown" and privilege_justification_gap silently skips the path)

(v2 change, addresses review finding 2: this is also a pre-existing bug in the current ARM role node emission, which uses roleName (camelCase) instead of role_name. The ARM role nodes should be fixed in the same PR as a follow-up task.)

Wire into transform(), in the managed identity loop, after the existing ARM RBAC chain emission:

# Synthesize data-plane authority chain (always, alongside any ARM access)
role_node = self._emit_foundry_role_node(workspace, pid)
perm_node = self._emit_foundry_permission_node(workspace, pid)
dp_edges = self.edge_resolver.resolve_data_plane_authority_edges(workspace, pid)
self._graph.edges.extend(dp_edges)

Fix existing ARM role property names (follow-up in same PR):
- transformer.py:310-311: Change "roleName" → "role_name" on ARM role nodes to match the materializer contract. The entra-servicenow connector already uses role_name correctly (transformer.py:818).

Evidence completeness: Add a new source key data_plane_authority with status available (always synthesized when workspace + identity data exists). This makes it explicit in the graph output that the authority chain is synthetic.

Task 4: Tests

New/edited test files in tests/

Test	What it verifies
`test_data_plane_authority_chain_emitted`	Project with managed identity produces synthetic role → permission → resource chain with correct `sourceId`, `sourceSystem`, snake_case properties
`test_data_plane_authority_node_ids_stable`	Same input produces same `nodeId` and `sourceId` across multiple transforms (deduplication)
`test_data_plane_authority_source_ids_deterministic`	(v2) Verify `sourceId` matches expected scheme `foundry-dp-role:{pid}:{resource_id}` and survives upsert round-trip
`test_data_plane_authority_with_arm_roles`	Both ARM and synthetic chains coexist when ARM RBAC returns data
`test_resolve_data_plane_authority_edges`	Edge resolver produces correct edge types, source/target IDs, and kind-aware `scope_resource_type` for all 3 workspace kinds
`test_foundry_action_normalization`	`normalize_foundry_action()` returns `execute` as primary action (write-level), correct fallback behavior
`test_workspace_resource_node_properties`	(v2) Workspace resource nodes carry `resource_name`, `business_domain`, `sensitivity` (snake_case)
`test_role_node_snake_case_properties`	(v2) Both synthetic and ARM role nodes use `role_name` (not `roleName`)

6. Expected Result After Implementation

Running the azure-foundry connector with Azure AI User role should produce:

workload (ai_agent: "my-agent")
  RUNS_AS → identity (entra-sp-{pid})
    HAS_ROLE → role (synthetic: "Foundry Project Member")
      GRANTS → permission (actions: [execute, read, connect])
        APPLIES_TO → resource (foundry-workspace-{id})

The platform's path materializer walks this chain and produces:

{
  "execution_paths": [{
    "resource_id": "foundry-workspace-{id}",
    "resource_name": "sv0-foundry-project",
    "business_domain": "azure",
    "sensitivity": "internal",
    "via_roles": ["Foundry Project Member"],
    "actions": ["execute"],
    "via_identity": "entra-sp-{pid}",
    "auth_chain_depth": 0
  }]
}

Because execute is in WRITE_LEVEL_ACTIONS, the privilege_justification_gap evaluator will check for write-level execution evidence. If no write-level evidence is observed, it fires an action_mismatch finding — which is the correct behavior for an AI agent with execution authority.
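The gating behavior described here can be paraphrased in a few lines. This is only a sketch of the rule as this plan describes it, not the evaluator's code (which is TypeScript); the function name `evaluate_gap`, the finding dict shape, and the contents of the stand-in `WRITE_LEVEL_ACTIONS` set beyond `execute` are assumptions.

```python
# Stand-in set: this plan only confirms that "execute" is a member and
# "invoke" is not. The other entries are assumed for illustration.
WRITE_LEVEL_ACTIONS = {"write", "delete", "execute"}


def evaluate_gap(path: dict, observed_actions: list[str]):
    """Return an action_mismatch finding, or None when the rule does not fire."""
    # Paths with unknown sensitivity are skipped, as noted in Task 3.
    if path.get("sensitivity", "unknown") == "unknown":
        return None
    granted_write = [a for a in path["actions"] if a in WRITE_LEVEL_ACTIONS]
    if not granted_write:
        return None  # no write-level authority granted, nothing to justify
    if any(a in WRITE_LEVEL_ACTIONS for a in observed_actions):
        return None  # write-level use was observed, authority is justified
    return {"finding": "action_mismatch", "granted": granted_write}


path = {"sensitivity": "internal", "actions": ["execute"]}
print(evaluate_gap(path, observed_actions=["read"]))
print(evaluate_gap({**path, "actions": ["invoke"]}, observed_actions=["read"]))  # None
```

The second call shows why v1's `invoke` keyword never produced a finding: it is filtered out before the evidence check is reached.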

Additionally, with Azure AI User:

- Agent discovery returns actual agents (not None)
- Connection discovery returns actual connections with endpoint URLs
- Agent → connection INVOKES edges populate (including via_tool property)
- Execution evidence (run summaries) may become available
- USES edges link connections to credentials

7. What Does NOT Change

Component	Why it's already correct
Platform path materializer (`path-materializer.ts`)	Already follows `RUNS_AS → HAS_ROLE → GRANTS → APPLIES_TO` and forwarding edges (`INVOKES`, `USES`). No changes needed.
Platform ingestion / schema (`types.ts`)	Synthetic nodes use existing `NormalizedNodeType` and `NormalizedEdgeType` values. No schema changes.
Entra-servicenow connector	Already emits standard HAS_ROLE → GRANTS → APPLIES_TO from ARM RBAC. Its authority model is different (OAuth + function key, not data-plane binding). No changes needed.
NormalizedGraph schema	Uses existing node types (`role`, `permission`, `resource`) and edge types (`HAS_ROLE`, `GRANTS`, `APPLIES_TO`).
Docker Compose / deployment	No infrastructure changes.

8. Entra-ServiceNow Connector Audit

The entra-servicenow connector was audited for the same gap. Result: no changes needed.

- It already emits the full HAS_ROLE → GRANTS → APPLIES_TO chain from ARM RBAC (see the function key authority path plan for the v2 fix).
- Its authority model is explicit ARM role assignments, not implicit data-plane binding.
- ServiceNow OAuth chains use a different pattern (Graph API permissions → Permission nodes) that already works.

The ServiceNow dev instance returning zero data is a separate issue (instance likely hibernating or not configured with test data).

9. Documentation Updates

This document

Serves as the design rationale and implementation record.

Connector SETUP.md (`sv0-connectors/integrations/azure-foundry/SETUP.md`)

Add section explaining:

- Azure AI User role requirement and why
- What the synthetic data-plane authority chain represents
- How it differs from ARM RBAC authority (implicit vs explicit)

Data model docs (if applicable)

The Data Model already defines role, permission, resource nodes and HAS_ROLE, GRANTS, APPLIES_TO edges. The synthetic chain uses these existing types. A brief note may be added to clarify that some connectors produce synthetic authority chains where the source system uses implicit access.

10. PR Structure

PR 1: sv0-connectors

feat: synthesize data-plane authority chain for Foundry managed identities

Foundry grants project access via implicit data-plane binding, not ARM
RBAC. Emit synthetic role/permission/resource nodes so the platform's
path materializer can compute authority paths for AI agents.

Adds foundry_actions.py to shared lib, extends transformer and edge
resolver with data-plane chain synthesis.

PR 2: sv0-documentation (this document)

docs: add implementation plan for Foundry data-plane authority chain

11. Risk Assessment

Risk	Likelihood	Impact	Mitigation
Synthetic nodes misidentified as real ARM roles	Low	Medium	All synthetic nodes carry `synthetic: true` and `source: "foundry_data_plane"`
Node ID collisions across projects	Very Low	Low	(v2) Content-addressed SHA-256 hashes — no truncation of principal GUID
Entity churn from unstable sourceId	Very Low	Medium	(v2) Deterministic `sourceId` scheme defined: `foundry-dp-role:{pid}:{resource_id}`
Workspace type mislabeling	Very Low	Low	(v2) Kind-aware resource type mapping, not hardcoded
Action list doesn't match platform evaluator expectations	Very Low	Low	(v2) Uses `execute` (in `WRITE_LEVEL_ACTIONS`), not `invoke`
Synthetic chain creates false authority paths	Very Low	Medium	Chain only emits for identities already bound to projects via RUNS_AS — no speculative access
Breaking existing ARM RBAC chain (if ARM returns data later)	Very Low	High	Synthetic chain is additive, emitted alongside ARM chain — both coexist

12. Open Questions

Question	Current Answer
Should each connection get its own permission node?	Not in v2. One synthetic permission per (identity, project) pair is sufficient. Per-connection granularity can be added later if needed for least-privilege analysis.
Should the data model docs be updated?	Deferred. The synthetic chain uses existing types. A note about synthetic authority is a nice-to-have, not a blocker.

Resolved in v2

Question	Resolution
Should `invoke` be classified as write-level?	Resolved: use `execute` instead. AI agents can take actions, call tools, and write data. `execute` is already in `WRITE_LEVEL_ACTIONS` on both connector and platform sides, so Foundry agents will correctly trigger `privilege_justification_gap` findings.

13. Review Findings (v1 → v2)

#	Severity	Finding	Resolution
1	Blocking	No `sourceId` scheme defined — platform persists by `(source_system, source_id)`, not `nodeId`. Would cause entity churn on every scan.	Added `foundry_role_source_id()` and `foundry_permission_source_id()` to `node_ids.py`. Deterministic scheme: `foundry-dp-role:{pid}:{resource_id}`.
2	Blocking	camelCase property names (`roleName`) — path materializer reads `role_name`, `resource_name`, `business_domain`, `sensitivity` (all snake_case). Without these, execution paths resolve to IDs and `unknown` metadata. Sensitivity `unknown` causes `privilege_justification_gap` to silently skip.	All synthetic node properties use snake_case. Workspace resource nodes now carry `resource_name`, `business_domain`, `sensitivity`. Follow-up: fix existing ARM role `roleName` → `role_name` in same PR.
3	Blocking	Hardcoded `scope_resource_type` to `Microsoft.MachineLearningServices/workspaces` — connector handles 3 workspace kinds with 3 distinct ARM resource types.	Edge resolver now derives type from `workspace.kind` using the same mapping as `_emit_project_node()`.
4	High	Node ID uses `principal_id[:8]` (32 bits): roughly 39% collision probability at ~65K identities, reaching 50% at ~77K. `sp_node_id()` uses full GUID.	Switched to content-addressed SHA-256 hash of full `{principal_id}:{resource_id}`. No truncation of input.
5	Advisory	`invoke` not in `WRITE_LEVEL_ACTIONS` — agents would never trigger write-level findings.	Changed to `execute`, which is in `WRITE_LEVEL_ACTIONS` in both `arm_roles.py` and `privilege-justification-gap.ts`.
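The collision estimate behind finding 4 can be checked with the standard birthday-bound approximation, p ≈ 1 − exp(−n² / 2^(b+1)) for n items in a b-bit space. The helper names below are hypothetical; the 32-bit case models the v1 `principal_id[:8]` prefix and the 64-bit case models the 16-hex-character digest suffix used in v2.

```python
import math


def collision_probability(n_items: int, bits: int) -> float:
    """Birthday-bound approximation of at least one collision among n_items."""
    exponent = (n_items ** 2) / (2.0 * 2 ** bits)
    return -math.expm1(-exponent)  # 1 - exp(-x), accurate for tiny x


def items_for_probability(p: float, bits: int) -> float:
    """Approximate item count at which the collision probability reaches p."""
    return math.sqrt(2.0 * 2 ** bits * math.log(1.0 / (1.0 - p)))


print(round(collision_probability(65_536, 32), 3))  # 0.393
print(round(items_for_probability(0.5, 32)))        # about 77 thousand
print(collision_probability(1_000_000, 64))         # on the order of 1e-8
```

Under this approximation a 32-bit prefix reaches even odds of a collision at roughly 77 thousand identities, while a 64-bit suffix stays far below one in a million even at a million identities.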
