Implementation Plan: Synthesize Data-Plane Authority Chain for Azure Foundry Connector
Date: 2026-03-03
Status: Draft v2 (addresses all v1 review findings)
Scope: sv0-connectors (azure-foundry, sv0_azure shared lib)
Platform changes: None required
Review history: v1 → v2: 5 findings (3 blocking, 1 high, 1 advisory). See §13.
1. Problem Statement
The azure-foundry connector discovers AI agents, their managed identities, and ARM RBAC role assignments for those identities. The platform's path materializer then walks the standard authority chain:
workload → RUNS_AS → identity → HAS_ROLE → role → GRANTS → permission → APPLIES_TO → resource
to produce authority paths. The problem: ARM RBAC returns no role assignments for Foundry managed identities, because Foundry grants access through implicit data-plane binding rather than ARM role assignments.
Concrete example: A Foundry project sv0-foundry-project has a system-assigned managed identity (entra-sp-{pid}). An AI agent my-agent within the project runs as this identity. The connector emits:
    workload (ai_agent: "my-agent")
      RUNS_AS → identity (entra-sp-{pid})
But the chain stops there. The identity has no ARM role assignments, so HAS_ROLE never fires, and execution_paths remains [].
Current state in the database: 12 entities (6 identities, 6 resources), 6 orphaned_ownership findings, zero authority paths.
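To make the failure mode concrete, here is a minimal Python sketch of a chain walk over an in-memory edge list. It is not the platform's materializer (which lives in path-materializer.ts); the walk_authority_chain helper and the tuple-based edge format are assumptions made purely for illustration.

```python
from collections import defaultdict

# The standard authority chain, in the order the walk follows it.
CHAIN = ["RUNS_AS", "HAS_ROLE", "GRANTS", "APPLIES_TO"]


def walk_authority_chain(edges, workload_id):
    """Return every resource reachable from workload_id via the full chain.

    edges is a list of (edge_type, source_id, target_id) tuples; this shape
    is hypothetical and chosen only to keep the sketch small.
    """
    by_type_and_source = defaultdict(list)
    for edge_type, source_id, target_id in edges:
        by_type_and_source[(edge_type, source_id)].append(target_id)

    frontier = [workload_id]
    for edge_type in CHAIN:
        # Advance one hop; an empty frontier means the chain is broken here.
        frontier = [
            target
            for node in frontier
            for target in by_type_and_source[(edge_type, node)]
        ]
    return frontier


# What the connector emits today: the chain stops after RUNS_AS.
current_edges = [("RUNS_AS", "my-agent", "entra-sp-pid")]
print(walk_authority_chain(current_edges, "my-agent"))  # []
```

With only the RUNS_AS edge present, the frontier empties at the HAS_ROLE hop, which corresponds to the empty execution_paths described above.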
2. Root Cause Analysis
Azure AI Foundry uses a two-tier access model:
| Tier | What it controls | How access is granted | Visible via ARM RBAC? |
|---|---|---|---|
| Control plane (ARM) | Create/modify the project, hub, AIServices account | Standard ARM role assignments (Owner, Contributor, Reader) | Yes |
| Data plane | Invoke agents, use connections, run endpoints within a project | Implicit binding — the project's managed identity is automatically granted access to all project resources | No |
The connector only queries ARM RBAC (arm_client.get_role_assignments_for_principal()). Data-plane access is invisible to this API. The identity genuinely has no ARM role assignments — its access exists only at the Foundry service layer.
This is analogous to how an Azure SQL managed identity gets data-plane access to its database without any ARM role, or how a Key Vault access policy grants secret access without an ARM role assignment.
3. Solution: Synthetic Data-Plane Authority Chain
For each managed identity bound to a Foundry project, synthesize authority nodes that model the implicit data-plane access:
    identity (entra-sp-{pid})
      HAS_ROLE → role (synthetic: "Foundry Project Member")
        GRANTS → permission (actions: [execute, read, connect])
          APPLIES_TO → resource (the project workspace)
This uses existing node types (role, permission, resource) and edge types (HAS_ROLE, GRANTS, APPLIES_TO) already defined in the platform schema (Data Model). The path materializer requires no changes — it already walks this chain.
Why "synthetic"?
The word "synthetic" distinguishes these nodes from nodes backed by a real ARM role assignment. All synthetic nodes and edges carry:
- properties.synthetic: true
- properties.source: "foundry_data_plane"
This allows the platform (and future rules/evaluators) to distinguish between explicit ARM authority and implicit data-plane authority.
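As a rough illustration of how a downstream rule could use these markers, the sketch below partitions nodes by properties.synthetic. The dictionary node shape and the split_by_origin helper are assumptions, not platform APIs.

```python
def split_by_origin(nodes):
    """Partition nodes into (explicit, synthetic) using properties.synthetic."""
    explicit, synthetic = [], []
    for node in nodes:
        props = node.get("properties", {})
        # A missing flag is treated as explicit (ARM-backed) authority.
        if props.get("synthetic") is True:
            synthetic.append(node)
        else:
            explicit.append(node)
    return explicit, synthetic


# Hypothetical nodes: one ARM-backed role and one synthetic Foundry role.
nodes = [
    {"nodeId": "role-arm-1", "properties": {"role_name": "Reader"}},
    {
        "nodeId": "foundry-role-abc",
        "properties": {
            "role_name": "Foundry Project Member",
            "synthetic": True,
            "source": "foundry_data_plane",
        },
    },
]
explicit, synthetic = split_by_origin(nodes)
print([n["nodeId"] for n in synthetic])  # ['foundry-role-abc']
```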
Why this works
The connector already discovers:
- Which managed identity is bound to which project (the RUNS_AS edge from workspace → identity)
- Which connections exist in each project (via data-plane API, when Azure AI User role is granted)
- Which agents exist and what they invoke
The missing piece is the authority link from identity → resource. The synthetic chain fills this gap using data we already have.
4. Prerequisites: Azure AI User Role
Before the synthetic chain adds value, the connector needs Azure AI User role on the AIServices accounts being scanned. Without it:
- Agent discovery returns None (401 Forbidden)
- Connection discovery returns None
- Execution evidence (run summaries) is unavailable
With it:
- Agents are listed with model, tool types, identity binding
- Connections are listed with endpoint URLs and auth methods
- Run summaries provide execution counts and timestamps
See Pilot Permissions for the full permission matrix.
Action: Grant Azure AI User to the scanner's App Registration on each AIServices account. This is a manual one-time step (~2 minutes per account in the Azure Portal).
5. Implementation Tasks
Task 1: Shared Library — sv0_azure
New file: shared/sv0_azure/sv0_azure/foundry_actions.py
A Foundry-specific action normalizer, parallel to the existing arm_roles.py. Uses execute (not invoke) as the primary action keyword so it is recognized by WRITE_LEVEL_ACTIONS in both the connector (arm_roles.py:58) and the platform evaluator (privilege-justification-gap.ts:9):
    FOUNDRY_DATA_PLANE_ACTIONS = {
        "Foundry Project Member": ["execute", "read", "connect"],
    }


    def normalize_foundry_action(role_name: str) -> NormalizedAction:
        """Normalize a synthetic Foundry data-plane role to standard actions."""
        actions = FOUNDRY_DATA_PLANE_ACTIONS.get(role_name, ["execute", "read"])
        return NormalizedAction(
            action=actions[0],  # "execute": write-level, triggers privilege_justification_gap
            actions=actions,
            is_fallback=role_name not in FOUNDRY_DATA_PLANE_ACTIONS,
        )
(v2 change — addresses review finding 5: invoke is not in WRITE_LEVEL_ACTIONS, so it would never trigger write-level findings. AI agents can take actions, call tools, and write data — execute correctly classifies this.)
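The behaviour of the normalizer, including the fallback path, can be sketched in isolation. The NormalizedAction dataclass below is a stand-in defined only for this example; the real class in the shared library may carry different or additional fields, and the role name "Some Future Role" is made up.

```python
from dataclasses import dataclass


# Stand-in for the shared library's NormalizedAction (assumption: the real
# class exposes at least these three fields).
@dataclass(frozen=True)
class NormalizedAction:
    action: str
    actions: list
    is_fallback: bool


FOUNDRY_DATA_PLANE_ACTIONS = {
    "Foundry Project Member": ["execute", "read", "connect"],
}


def normalize_foundry_action(role_name: str) -> NormalizedAction:
    """Normalize a synthetic Foundry data-plane role to standard actions."""
    actions = FOUNDRY_DATA_PLANE_ACTIONS.get(role_name, ["execute", "read"])
    return NormalizedAction(
        action=actions[0],
        actions=actions,
        is_fallback=role_name not in FOUNDRY_DATA_PLANE_ACTIONS,
    )


known = normalize_foundry_action("Foundry Project Member")
unknown = normalize_foundry_action("Some Future Role")  # hypothetical role name
print(known.action, known.is_fallback)        # execute False
print(unknown.actions, unknown.is_fallback)   # ['execute', 'read'] True
```

Note that the fallback still leads with execute, so an unrecognized synthetic role is treated as write-level rather than silently downgraded.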
Edit: shared/sv0_azure/sv0_azure/node_ids.py
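Add stable node ID and source ID generators for synthetic nodes. The hash input uses the full principal GUID (not truncated) to match the sp_node_id() convention; the SHA-256 digest is then shortened to 16 hex characters (64 bits), which keeps birthday-paradox collisions negligible at realistic identity counts: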
    import hashlib


    def foundry_role_node_id(project_resource_id: str, principal_id: str) -> str:
        """Synthetic data-plane role for a managed identity in a Foundry project."""
        content = f"foundry-dp-role:{principal_id}:{project_resource_id}"
        return f"foundry-role-{hashlib.sha256(content.encode()).hexdigest()[:16]}"


    def foundry_role_source_id(project_resource_id: str, principal_id: str) -> str:
        """Deterministic sourceId for synthetic role, used by platform upsert filter."""
        return f"foundry-dp-role:{principal_id}:{project_resource_id}"


    def foundry_permission_node_id(project_resource_id: str, principal_id: str) -> str:
        """Synthetic data-plane permission for a managed identity in a Foundry project."""
        content = f"foundry-dp-perm:{principal_id}:{project_resource_id}"
        return f"foundry-perm-{hashlib.sha256(content.encode()).hexdigest()[:16]}"


    def foundry_permission_source_id(project_resource_id: str, principal_id: str) -> str:
        """Deterministic sourceId for synthetic permission, used by platform upsert filter."""
        return f"foundry-dp-perm:{principal_id}:{project_resource_id}"
(v2 change, addresses review findings 1 and 4: adds the deterministic sourceId schemes required by the entity-adapter.ts upsert filter (tenant_id, source_system, source_id), and derives node IDs from SHA-256 hashes of the full principal GUID and resource ID instead of truncated GUIDs, which makes collisions far less likely.)
Why sourceId matters: The platform persists entities by (source_system, source_id), not nodeId (entity-adapter.ts:30-37). nodeId is ephemeral — only used within a single sync for edge resolution (graph-transformer.ts:116). Without deterministic sourceIds, each scan would create new entities instead of updating existing ones.
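The entity-churn problem can be illustrated with a toy in-memory store. This is only a sketch: the upsert helper, the dictionary store, and the placeholder tenant, resource ID, and GUID are assumptions and do not reflect the actual entity-adapter.ts implementation.

```python
import uuid


def foundry_role_source_id(project_resource_id: str, principal_id: str) -> str:
    """Deterministic sourceId, mirroring the scheme defined in this plan."""
    return f"foundry-dp-role:{principal_id}:{project_resource_id}"


def upsert(store, tenant_id, source_system, source_id, payload):
    """Insert or update an entity keyed by (tenant_id, source_system, source_id)."""
    store[(tenant_id, source_system, source_id)] = payload


project = "/subscriptions/000/example-project"        # placeholder resource ID
principal = "11111111-2222-3333-4444-555555555555"    # placeholder GUID

stable, churning = {}, {}
for scan in range(3):
    # Deterministic sourceId: every scan targets the same row.
    upsert(stable, "t1", "azure", foundry_role_source_id(project, principal), {"scan": scan})
    # Random sourceId: every scan creates a new row.
    upsert(churning, "t1", "azure", str(uuid.uuid4()), {"scan": scan})

print(len(stable), len(churning))  # 1 3
```

Three scans leave one entity in the deterministic store and three in the other, which is the churn the sourceId scheme is meant to prevent.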
Task 2: Connector — azure-foundry Edge Resolver
Edit: edge_resolver.py
Add resolve_data_plane_authority_edges(). The scope_resource_type is derived dynamically from workspace.kind using the same mapping as _emit_project_node() in the transformer — not hardcoded:
    # Reuse the existing workspace kind → ARM resource type mapping
    _WORKSPACE_KIND_TO_RESOURCE_TYPE = {
        "FoundryProject": "Microsoft.CognitiveServices/accounts/projects",
        "AIServices": "Microsoft.CognitiveServices/accounts",
        # "Project" and "Hub" fall through to the default below
    }
    _DEFAULT_RESOURCE_TYPE = "Microsoft.MachineLearningServices/workspaces"


    # Method on the existing edge resolver class
    def resolve_data_plane_authority_edges(
        self,
        workspace: FoundryWorkspace,
        principal_id: str,
    ) -> list[NormalizedEdge]:
        """
        Synthesize HAS_ROLE → GRANTS → APPLIES_TO chain for a managed identity's
        implicit data-plane access to a Foundry project.
        """
        role_id = foundry_role_node_id(workspace.resource_id, principal_id)
        perm_id = foundry_permission_node_id(workspace.resource_id, principal_id)
        ws_id = self._workspace_node_id(workspace)
        sp_id = self._sp_node_id(principal_id)
        resource_type = _WORKSPACE_KIND_TO_RESOURCE_TYPE.get(
            workspace.kind, _DEFAULT_RESOURCE_TYPE
        )
        return [
            self._edge("HAS_ROLE", sp_id, role_id, {
                "scope": workspace.resource_id,
                "scope_resource_type": resource_type,
                "synthetic": True,
                "source": "foundry_data_plane",
            }),
            self._edge("GRANTS", role_id, perm_id),
            self._edge("APPLIES_TO", perm_id, ws_id),
        ]
(v2 change — addresses review finding 3: the connector handles 3 workspace kinds mapping to 3 distinct ARM resource types. FoundryProject → Microsoft.CognitiveServices/accounts/projects, AIServices → Microsoft.CognitiveServices/accounts, Project/Hub → Microsoft.MachineLearningServices/workspaces. Hardcoding any single type would mislabel modern Foundry projects.)
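The kind-aware lookup can be exercised on its own, outside the resolver. In the sketch below the mapping and default are copied from the snippet above, while the scope_resource_type wrapper is a hypothetical helper added only for the demonstration.

```python
_WORKSPACE_KIND_TO_RESOURCE_TYPE = {
    "FoundryProject": "Microsoft.CognitiveServices/accounts/projects",
    "AIServices": "Microsoft.CognitiveServices/accounts",
}
_DEFAULT_RESOURCE_TYPE = "Microsoft.MachineLearningServices/workspaces"


def scope_resource_type(kind: str) -> str:
    """Map a workspace kind to its ARM resource type, with a default fallback."""
    return _WORKSPACE_KIND_TO_RESOURCE_TYPE.get(kind, _DEFAULT_RESOURCE_TYPE)


# "Project" and "Hub" are not in the mapping and resolve to the default.
for kind in ("FoundryProject", "AIServices", "Project", "Hub"):
    print(f"{kind:15} -> {scope_resource_type(kind)}")
```

A dictionary with a default keeps the classic hub and project kinds working without listing them, at the cost that an unexpected kind is also labeled as a Machine Learning workspace.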
Task 3: Connector — azure-foundry Transformer
Edit: transformer.py
Add three new emitter methods. All property names use snake_case to match the path materializer contract (path-materializer.ts:129,153-155):
- _emit_foundry_role_node(workspace, principal_id): emits a role node:
  - nodeId: from foundry_role_node_id()
  - sourceSystem: "azure"
  - sourceId: from foundry_role_source_id(), e.g. "foundry-dp-role:{pid}:{resource_id}"
  - nodeType: "role"
  - displayName: "Foundry Project Member"
  - properties.role_name: "Foundry Project Member" (snake_case; materializer reads role.properties.role_name)
  - properties.roleDefinitionId: "foundry-data-plane" (synthetic, no real ARM role definition)
  - properties.scope: workspace.resource_id
  - properties.synthetic: true
  - properties.source: "foundry_data_plane"
- _emit_foundry_permission_node(workspace, principal_id): emits a permission node:
  - nodeId: from foundry_permission_node_id()
  - sourceSystem: "azure"
  - sourceId: from foundry_permission_source_id(), e.g. "foundry-dp-perm:{pid}:{resource_id}"
  - nodeType: "permission"
  - displayName: "Foundry data-plane access on {project_name}"
  - properties.normalized_action: "execute" (from normalize_foundry_action(); write-level)
  - properties.actions: ["execute", "read", "connect"]
  - properties.action_classification: "explicit"
  - properties.synthetic: true
  - properties.source: "foundry_data_plane"
- Update existing workspace resource nodes: ensure _emit_project_node() includes the properties the materializer reads:
  - properties.resource_name: workspace.display_name (materializer reads resource.properties.resource_name)
  - properties.business_domain: "azure" (materializer reads resource.properties.business_domain)
  - properties.sensitivity: "internal" (AIServices/Foundry projects; materializer reads resource.properties.sensitivity; without this, normalizeSensitivity() defaults to "unknown" and privilege_justification_gap silently skips the path)

  (v2 change, addresses review finding 2: this is also a pre-existing bug in the current ARM role node emission, which uses roleName (camelCase) instead of role_name. The ARM role nodes should be fixed in the same PR as a follow-up task.)
- Wire into transform(): in the managed identity loop, after existing ARM RBAC chain emission:

      # Synthesize data-plane authority chain (always, alongside any ARM access)
      role_node = self._emit_foundry_role_node(workspace, pid)
      perm_node = self._emit_foundry_permission_node(workspace, pid)
      dp_edges = self.edge_resolver.resolve_data_plane_authority_edges(workspace, pid)
      self._graph.edges.extend(dp_edges)

- Fix existing ARM role property names (follow-up in same PR): transformer.py:310-311: change "roleName" → "role_name" on ARM role nodes to match the materializer contract. The entra-servicenow connector already uses role_name correctly (transformer.py:818).
Evidence completeness: Add a new source key data_plane_authority with status available (always synthesized when workspace + identity data exists). This makes it explicit in the graph output that the authority chain is synthetic.
Task 4: Tests
New/edited test files in tests/
| Test | What it verifies |
|---|---|
| test_data_plane_authority_chain_emitted | Project with managed identity produces synthetic role → permission → resource chain with correct sourceId, sourceSystem, snake_case properties |
| test_data_plane_authority_node_ids_stable | Same input produces same nodeId and sourceId across multiple transforms (deduplication) |
| test_data_plane_authority_source_ids_deterministic | (v2) Verify sourceId matches expected scheme foundry-dp-role:{pid}:{resource_id} and survives upsert round-trip |
| test_data_plane_authority_with_arm_roles | Both ARM and synthetic chains coexist when ARM RBAC returns data |
| test_resolve_data_plane_authority_edges | Edge resolver produces correct edge types, source/target IDs, and kind-aware scope_resource_type for all 3 workspace kinds |
| test_foundry_action_normalization | normalize_foundry_action() returns execute as primary action (write-level), correct fallback behavior |
| test_workspace_resource_node_properties | (v2) Workspace resource nodes carry resource_name, business_domain, sensitivity (snake_case) |
| test_role_node_snake_case_properties | (v2) Both synthetic and ARM role nodes use role_name (not roleName) |
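
Two of the ID-related tests could look roughly like the pytest-style sketch below. The ID functions are re-declared locally so the example is self-contained; in the real suite they would be imported from the shared library, and the resource IDs and GUID used here are placeholders.

```python
import hashlib


def foundry_role_node_id(project_resource_id: str, principal_id: str) -> str:
    content = f"foundry-dp-role:{principal_id}:{project_resource_id}"
    return f"foundry-role-{hashlib.sha256(content.encode()).hexdigest()[:16]}"


def foundry_role_source_id(project_resource_id: str, principal_id: str) -> str:
    return f"foundry-dp-role:{principal_id}:{project_resource_id}"


# Placeholder inputs, not real Azure identifiers.
PROJECT_A = "/subscriptions/000/projects/project-a"
PROJECT_B = "/subscriptions/000/projects/project-b"
PID = "11111111-2222-3333-4444-555555555555"


def test_data_plane_authority_node_ids_stable():
    first = foundry_role_node_id(PROJECT_A, PID)
    second = foundry_role_node_id(PROJECT_A, PID)
    assert first == second
    assert first.startswith("foundry-role-")
    assert len(first) == len("foundry-role-") + 16
    # Same principal in a different project must yield a distinct node.
    assert foundry_role_node_id(PROJECT_B, PID) != first


def test_data_plane_authority_source_ids_deterministic():
    expected = f"foundry-dp-role:{PID}:{PROJECT_A}"
    assert foundry_role_source_id(PROJECT_A, PID) == expected
```

The upsert round-trip part of the second test needs a platform fixture and is left out of this sketch.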
6. Expected Result After Implementation
Running the azure-foundry connector with Azure AI User role should produce:
    workload (ai_agent: "my-agent")
      RUNS_AS → identity (entra-sp-{pid})
        HAS_ROLE → role (synthetic: "Foundry Project Member")
          GRANTS → permission (actions: [execute, read, connect])
            APPLIES_TO → resource (foundry-workspace-{id})
The platform's path materializer walks this chain and produces:
    {
      "execution_paths": [{
        "resource_id": "foundry-workspace-{id}",
        "resource_name": "sv0-foundry-project",
        "business_domain": "azure",
        "sensitivity": "internal",
        "via_roles": ["Foundry Project Member"],
        "actions": ["execute"],
        "via_identity": "entra-sp-{pid}",
        "auth_chain_depth": 0
      }]
    }
Because execute is in WRITE_LEVEL_ACTIONS, the privilege_justification_gap evaluator will check for write-level execution evidence. If no write-level evidence is observed, it fires an action_mismatch finding — which is the correct behavior for an AI agent with execution authority.
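The evaluator behaviour described above can be approximated in a few lines of Python. This is a hedged sketch, not the logic in privilege-justification-gap.ts: the function name, the shape of observed_actions, and every member of WRITE_LEVEL_ACTIONS other than execute are assumptions.

```python
# Only "execute" (member) and "invoke" (non-member) are confirmed by this plan;
# "write" and "delete" are placeholder entries.
WRITE_LEVEL_ACTIONS = {"execute", "write", "delete"}


def check_privilege_justification_gap(path, observed_actions):
    """Return a finding name, or None when nothing fires.

    path is one execution_paths entry; observed_actions is the set of
    actions seen in execution evidence (hypothetical shape).
    """
    if path.get("sensitivity", "unknown") == "unknown":
        return None  # mirrors the silent skip for unknown sensitivity
    if WRITE_LEVEL_ACTIONS.isdisjoint(path["actions"]):
        return None  # no write-level authority granted, nothing to justify
    if WRITE_LEVEL_ACTIONS.isdisjoint(observed_actions):
        return "action_mismatch"  # write-level authority, no write-level evidence
    return None


path = {"sensitivity": "internal", "actions": ["execute"]}
print(check_privilege_justification_gap(path, {"read"}))     # action_mismatch
print(check_privilege_justification_gap(path, {"execute"}))  # None
```

Swapping execute for invoke in the path's actions makes the second check return early, which is the gap review finding 5 pointed out.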
Additionally, with Azure AI User:
- Agent discovery returns actual agents (not None)
- Connection discovery returns actual connections with endpoint URLs
- Agent → connection INVOKES edges populate (including via_tool property)
- Execution evidence (run summaries) may become available
- USES edges link connections to credentials
7. What Does NOT Change
| Component | Why it's already correct |
|---|---|
| Platform path materializer (path-materializer.ts) | Already follows RUNS_AS → HAS_ROLE → GRANTS → APPLIES_TO and forwarding edges (INVOKES, USES). No changes needed. |
| Platform ingestion / schema (types.ts) | Synthetic nodes use existing NormalizedNodeType and NormalizedEdgeType values. No schema changes. |
| Entra-servicenow connector | Already emits standard HAS_ROLE → GRANTS → APPLIES_TO from ARM RBAC. Its authority model is different (OAuth + function key, not data-plane binding). No changes needed. |
| NormalizedGraph schema | Uses existing node types (role, permission, resource) and edge types (HAS_ROLE, GRANTS, APPLIES_TO). |
| Docker Compose / deployment | No infrastructure changes. |
8. Entra-ServiceNow Connector Audit
The entra-servicenow connector was audited for the same gap. Result: no changes needed.
- It already emits the full HAS_ROLE → GRANTS → APPLIES_TO chain from ARM RBAC (see the function key authority path plan for the v2 fix).
- Its authority model is explicit ARM role assignments, not implicit data-plane binding.
- ServiceNow OAuth chains use a different pattern (Graph API permissions → Permission nodes) that already works.
The ServiceNow dev instance returning zero data is a separate issue (instance likely hibernating or not configured with test data).
9. Documentation Updates
This document
Serves as the design rationale and implementation record.
Connector SETUP.md (sv0-connectors/integrations/azure-foundry/SETUP.md)
Add section explaining:
- Azure AI User role requirement and why
- What the synthetic data-plane authority chain represents
- How it differs from ARM RBAC authority (implicit vs explicit)
Data model docs (if applicable)
The Data Model already defines role, permission, resource nodes and HAS_ROLE, GRANTS, APPLIES_TO edges. The synthetic chain uses these existing types. A brief note may be added to clarify that some connectors produce synthetic authority chains where the source system uses implicit access.
10. PR Structure
PR 1: sv0-connectors
feat: synthesize data-plane authority chain for Foundry managed identities
Foundry grants project access via implicit data-plane binding, not ARM
RBAC. Emit synthetic role/permission/resource nodes so the platform's
path materializer can compute authority paths for AI agents.
Adds foundry_actions.py to shared lib, extends transformer and edge
resolver with data-plane chain synthesis.
PR 2: sv0-documentation (this document)
docs: add implementation plan for Foundry data-plane authority chain
11. Risk Assessment
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Synthetic nodes misidentified as real ARM roles | Low | Medium | All synthetic nodes carry synthetic: true and source: "foundry_data_plane" |
| Node ID collisions across projects | Very Low | Low | (v2) Content-addressed SHA-256 hashes — no truncation of principal GUID |
| Entity churn from unstable sourceId | Very Low | Medium | (v2) Deterministic sourceId scheme defined: foundry-dp-role:{pid}:{resource_id} |
| Workspace type mislabeling | Very Low | Low | (v2) Kind-aware resource type mapping, not hardcoded |
| Action list doesn't match platform evaluator expectations | Very Low | Low | (v2) Uses execute (in WRITE_LEVEL_ACTIONS), not invoke |
| Synthetic chain creates false authority paths | Very Low | Medium | Chain only emits for identities already bound to projects via RUNS_AS — no speculative access |
| Breaking existing ARM RBAC chain (if ARM returns data later) | Very Low | High | Synthetic chain is additive, emitted alongside ARM chain — both coexist |
12. Open Questions
| Question | Current Answer |
|---|---|
| Should each connection get its own permission node? | Not in v2. One synthetic permission per (identity, project) pair is sufficient. Per-connection granularity can be added later if needed for least-privilege analysis. |
| Should the data model docs be updated? | Deferred. The synthetic chain uses existing types. A note about synthetic authority is a nice-to-have, not a blocker. |
Resolved in v2
| Question | Resolution |
|---|---|
| Should invoke be classified as write-level? | Resolved: use execute instead. AI agents can take actions, call tools, and write data. execute is already in WRITE_LEVEL_ACTIONS on both connector and platform sides, so Foundry agents will correctly trigger privilege_justification_gap findings. |
13. Review Findings (v1 → v2)
| # | Severity | Finding | Resolution |
|---|---|---|---|
| 1 | Blocking | No sourceId scheme defined — platform persists by (source_system, source_id), not nodeId. Would cause entity churn on every scan. | Added foundry_role_source_id() and foundry_permission_source_id() to node_ids.py. Deterministic scheme: foundry-dp-role:{pid}:{resource_id}. |
| 2 | Blocking | camelCase property names (roleName) — path materializer reads role_name, resource_name, business_domain, sensitivity (all snake_case). Without these, execution paths resolve to IDs and unknown metadata. Sensitivity unknown causes privilege_justification_gap to silently skip. | All synthetic node properties use snake_case. Workspace resource nodes now carry resource_name, business_domain, sensitivity. Follow-up: fix existing ARM role roleName → role_name in same PR. |
| 3 | Blocking | Hardcoded scope_resource_type to Microsoft.MachineLearningServices/workspaces — connector handles 3 workspace kinds with 3 distinct ARM resource types. | Edge resolver now derives type from workspace.kind using the same mapping as _emit_project_node(). |
| 4 | High | Node ID uses principal_id[:8] (32 bits): by the birthday bound, collision probability reaches about 39% at ~65K identities and 50% at ~77K. sp_node_id() uses full GUID. | Switched to content-addressed SHA-256 hash of full {principal_id}:{resource_id}. No truncation of input. |
| 5 | Advisory | invoke not in WRITE_LEVEL_ACTIONS — agents would never trigger write-level findings. | Changed to execute, which is in WRITE_LEVEL_ACTIONS in both arm_roles.py and privilege-justification-gap.ts. |
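
The collision estimate behind finding 4 follows from the standard birthday-bound approximation, p ≈ 1 - exp(-n² / (2 · 2^bits)). The short Python sketch below evaluates it for a 32-bit prefix and for the 64-bit truncated digest used in v2; the helper names are illustrative and the one-million-identity figure is an arbitrary example.

```python
import math


def collision_probability(n_items: int, bits: int) -> float:
    """Birthday-bound approximation: 1 - exp(-n^2 / (2 * 2^bits))."""
    return -math.expm1(-(n_items ** 2) / (2 * 2 ** bits))


def items_for_probability(p: float, bits: int) -> float:
    """Number of items at which the collision probability reaches p."""
    return math.sqrt(2 * 2 ** bits * math.log(1 / (1 - p)))


# 32-bit prefix (principal_id[:8]): collisions become likely in the tens of thousands.
print(round(collision_probability(65_536, 32), 3))  # 0.393
print(round(items_for_probability(0.5, 32)))        # 77163

# 64-bit truncated SHA-256 digest: still negligible at one million identities.
print(collision_probability(1_000_000, 64))
```

The last line prints a value on the order of 1e-8, which is why truncating the digest to 16 hex characters is acceptable while truncating the GUID to 8 was not.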