Skip to main content

Architecture Plan: Shared Azure Modules Across Connectors

Date: 2026-02-25 Status: Draft v2 (addresses all peer review findings from v1) Scope: sv0-connectors (cross-connector shared package) Origin: Code review of function-key authority paths implementation revealed significant Azure code duplication between entra-servicenow and azure-foundry connectors, including one active bug caused by divergent implementations. Review history: v1 → v2: 8 findings (1 critical, 3 high, 4 medium). All resolved — see section 10.


1. Problem Statement

The entra-servicenow and azure-foundry connectors independently implement overlapping Azure functionality: ARM RBAC role resolution, ARM scope parsing, Entra Graph SP lookups, credential type detection, authentication setup, and node ID conventions. This duplication violates DRY and has already produced a concrete bug.

Active Bug: Divergent ARM Action Normalization

The two connectors normalize ARM role names to platform action keywords differently:

entra-servicenow (transformer.py, recently fixed):

_ARM_ROLE_ACTIONS = {
"Contributor": ["read", "write", "delete"],
"User Access Administrator": ["admin"],
"Key Vault Crypto User": ["read", "execute"],
...
}
# Picks highest-privilege action from explicit mapping

azure-foundry (transformer.py):

def _normalize_arm_action(role_name: str) -> str:
lower = role_name.lower()
if "owner" in lower: return "owner"
if "contributor" in lower: return "write"
if "user" in lower or "use" in lower: return "use" # ← BUG
...

Impact: The platform evaluator privilege_justification_gap checks hasWriteActions() against WRITE_LEVEL_ACTIONS = {"write", "admin", "delete", "update", "create", "execute"}. The string "use" is NOT in that set. So azure-foundry silently misses write-level findings for:

ARM Roleentra-servicenow (correct)azure-foundry (wrong)Should trigger write finding?
User Access Administrator"admin""use"Yes — admin is write-level
Key Vault Crypto User"execute""use"Yes — execute is write-level
Key Vault Secrets User"read""use"No — read-only, but "use" is still wrong
Cognitive Services User"read" (unmapped, but read-only role)"use"No — but "use" is not a recognized action

This is not hypothetical — Foundry workspaces commonly have "Cognitive Services User" and "Cognitive Services OpenAI User" role assignments on their managed identities.

Duplication Inventory

Capabilityentra-servicenowazure-foundryLines duplicated
ARM list_subscriptions()azure_client.py:420-448arm_client.py:55-82~30
ARM get_role_assignments_for_principal()azure_client.py:450-510arm_client.py:84-130~50
ARM role name resolution (UUID → name)azure_client.py:512-560arm_client.py:132-170~40
Well-known role UUID cache5 entries60+ entriesSubset problem
ARM scope parsingtransformer.py:870-940 (inline)arm_client.py:_parse_scope()~40
ARM role → action normalizationtransformer.py:69-106transformer.py:_normalize_arm_action()~30
ARM resource sensitivitytransformer.py:108-116Not presentMissing in foundry
Entra SP lookup by object IDazure_client.py:562-590azure_client.py:get_service_principal()~30
Entra SP ownersazure_client.py:170-210azure_client.py:get_service_principal_owners()~30
Entra user detailsazure_client.py:212-250azure_client.py:get_user()~25
Credential type detectionazure_client.py:_extract_sp_data()azure_client.py:_parse_sp()~15
Auth setup (ClientSecretCredential)azure_client.py:__init__()azure_client.py + arm_client.py~20
Retry session factoryazure_client.py:__init__()arm_client.py:__init__()~10
Node ID: SPentra-sp-{principal_id}entra-sp-{principal_id}Convention only
Node ID: ownerentra-user-{object_id}entra-user-{object_id}Convention only
Platform submissionInline in cli/main.pyplatform_client.py~40

Total: ~360 lines of functionally duplicated code, plus one active bug from divergent implementations.


2. Proposed Architecture

Package Structure

sv0-connectors/
├── shared/
│ ├── sv0_common/ # Connector-agnostic utilities
│ │ ├── __init__.py
│ │ ├── py.typed
│ │ ├── pyproject.toml
│ │ └── platform_client.py # NormalizedGraph submission to sv0-platform
│ │
│ └── sv0_azure/ # Azure-specific shared code
│ ├── __init__.py
│ ├── py.typed # PEP 561 marker
│ ├── pyproject.toml # Installable package
│ │
│ ├── auth.py # Credential + retry session factory
│ ├── arm.py # ARM client: subscriptions, role assignments
│ ├── arm_scope.py # parse_arm_scope() utility
│ ├── arm_roles.py # Role UUID cache, role→action mapping, sensitivity
│ ├── entra.py # Graph client: SP lookup, owners, users, sign-ins
│ └── node_ids.py # Canonical node ID generation

├── integrations/
│ ├── entra-servicenow/
│ │ └── pyproject.toml # depends on sv0_azure, sv0_common
│ └── azure-foundry/
│ └── pyproject.toml # depends on sv0_azure, sv0_common

Package separation rationale: platform_client.py is transport-layer code (HTTP POST to sv0-platform API). It has no Azure dependency and will be needed by any future non-Azure connector (e.g., a GCP or AWS connector). Placing it in sv0_common avoids forcing non-Azure connectors to depend on azure-identity just to submit graphs. sv0_azure depends on sv0_common (for shared retry session logic); connectors depend on both.

Dependency Model

sv0_azure  ← entra-servicenow
← azure-foundry
← (future connectors)

Each connector installs sv0_azure as an editable local dependency during development:

# Development install (both the shared package and the connector)
cd shared/sv0_azure && pip install -e .
cd integrations/entra-servicenow && pip install -e ".[dev,test]"

Note on pyproject.toml dependency declaration: PEP 440 file:// URIs are not portable across pip versions and environments. Instead of declaring a file:// dependency in pyproject.toml (which pip rejects as "non-local file URIs are not supported"), connectors declare sv0_azure as a bare dependency by name:

# integrations/entra-servicenow/pyproject.toml
[project]
dependencies = [
"sv0-azure>=0.1.0",
...
]

The editable install (pip install -e ../../shared/sv0_azure) satisfies this requirement during local development. For CI, the shared package is installed explicitly before the connector (see section 6.1).


3. Module Specifications

3.1 auth.py — Credential & Session Factory

Purpose: Eliminate duplicated ClientSecretCredential setup and retry session creation.

from azure.identity import ClientSecretCredential

GRAPH_SCOPE = "https://graph.microsoft.com/.default"
ARM_SCOPE = "https://management.azure.com/.default"

def azure_credential(
tenant_id: str, client_id: str, client_secret: str
) -> ClientSecretCredential:
"""Create a ClientSecretCredential for Azure API access."""

def retry_session(
retries: int = 3,
backoff_factor: float = 0.5,
status_forcelist: tuple[int, ...] = (429, 500, 502, 503, 504),
) -> requests.Session:
"""Create a requests.Session with retry strategy."""

def bearer_token(credential: ClientSecretCredential, scope: str) -> str:
"""Acquire a bearer token for the given scope."""

What moves here:

  • ClientSecretCredential initialization (both connectors)
  • requests.Session + HTTPAdapter + Retry setup (both connectors)
  • Token acquisition pattern (both connectors)

What stays in connectors:

  • Config loading (env vars, validation) — connector-specific
  • Connector-specific scopes (Foundry data-plane scopes like ai.azure.com, ml.azure.com)

3.2 arm.py — ARM Management Client

Purpose: Single ARM client for subscription enumeration and RBAC role assignment queries.

class ArmClient:
def __init__(self, credential: ClientSecretCredential) -> None: ...

def list_subscriptions(self) -> list[str]:
"""List all accessible subscription IDs. Cached after first call."""

def get_role_assignments_for_principal(
self, subscription_id: str, principal_id: str
) -> list[ArmRoleAssignment]:
"""Query ARM RBAC assignments for a principal in a subscription."""

@dataclass
class ArmRoleAssignment:
assignment_id: str
principal_id: str
principal_type: str # "ServicePrincipal", "User", "Group"
role_definition_id: str # Full ARM path
role_name: str # Resolved human-readable name
scope: str # Full ARM scope string

What moves here:

  • list_subscriptions() from both connectors
  • get_role_assignments_for_principal() from both connectors
  • _resolve_role_name() / _resolve_arm_role_name() — unified
  • _WELL_KNOWN_ROLES — merged to 60+ entries (azure-foundry's superset)

What stays in connectors:

  • resolve_function_app_identity() stays in entra-servicenow's azure_client.py — it is Function App-specific discovery logic that only entra-servicenow uses, and it composes on top of ArmClient.list_subscriptions() from the shared module
  • Foundry-specific ARM queries (list_workspaces, list_aiservices, list_projects)
  • entra-servicenow bulk SP fetch

3.3 arm_scope.py — ARM Scope Parsing

Purpose: Parse ARM scope strings into structured components. Currently inline in entra-servicenow and a standalone function in azure-foundry.

@dataclass
class ArmScope:
raw: str # Original scope string
level: str # "resource" | "resource_group" | "subscription"
subscription_id: str | None # Extracted if present
resource_group: str | None # Extracted if present
resource_type: str | None # e.g. "Microsoft.Web/sites" — None for subscription scope
resource_name: str | None # e.g. "myapp" — None for subscription scope

@property
def display_name(self) -> str:
"""Human-readable label for this scope."""
if self.level == "resource" and self.resource_type and self.resource_name:
return f"{self.resource_type}/{self.resource_name}"
if self.level == "resource_group" and self.resource_name:
return f"ResourceGroup/{self.resource_name}"
if self.level == "subscription" and self.subscription_id:
return f"Subscription/{self.subscription_id[:8]}..."
return self.raw

def parse_arm_scope(scope: str) -> ArmScope | None:
"""Parse an ARM scope string into structured components.

Handles three levels:
- Resource: /subscriptions/.../providers/Microsoft.Web/sites/myapp
→ resource_type="Microsoft.Web/sites", resource_name="myapp"
- Resource group: /subscriptions/.../resourceGroups/myRG
→ resource_type=None, resource_name="myRG"
- Subscription: /subscriptions/{sub-id}
→ resource_type=None, resource_name=None

Returns None for empty or unparseable scopes.

Note: resource_type and resource_name are None for subscription-level
scopes and resource_type is None for resource-group scopes, since those
levels have no provider/type. Callers should check the `level` field
or use `display_name` for presentation.
"""

def scope_hash(scope: str) -> str:
"""Deterministic 12-char hash of an ARM scope for use in node IDs."""

Design note on optional fields: resource_type and resource_name are str | None rather than always-present strings because subscription-level scopes genuinely have no resource type or name. Using None makes it explicit that callers must handle each scope level, rather than hiding the absence behind sentinel strings like "Microsoft.Subscription" or "unknown". The level field provides the discriminant for pattern matching.

What moves here:

  • Scope parsing logic from entra-servicenow transformer.py:870-940
  • _parse_scope() from azure-foundry arm_client.py
  • scope_hash() from entra-servicenow transformer.py:872

3.4 arm_roles.py — Role Actions, Sensitivity, and Classification

Purpose: Unified ARM role → action mapping and resource sensitivity classification. This module fixes the active bug in azure-foundry.

# --- Role → Action Mapping ---
# Maps well-known ARM built-in role names to platform action keywords.
# These keywords align with the platform evaluator's WRITE_LEVEL_ACTIONS set:
# {"write", "admin", "delete", "update", "create", "execute"}
ARM_ROLE_ACTIONS: dict[str, list[str]] = {
# Core
"Owner": ["read", "write", "delete", "admin"],
"Contributor": ["read", "write", "delete"],
"Reader": ["read"],
"User Access Administrator": ["admin"],
# Storage
"Storage Blob Data Owner": ["read", "write", "delete", "admin"],
"Storage Blob Data Contributor": ["read", "write", "delete"],
"Storage Blob Data Reader": ["read"],
# Key Vault
"Key Vault Administrator": ["read", "write", "delete", "admin"],
"Key Vault Secrets User": ["read"],
"Key Vault Crypto User": ["read", "execute"],
# SQL
"SQL DB Contributor": ["read", "write", "delete"],
"SQL Server Contributor": ["read", "write", "delete"],
# Compute / Web
"Website Contributor": ["read", "write", "delete"],
"Virtual Machine Contributor": ["read", "write", "delete"],
# AI / Cognitive Services
"Cognitive Services User": ["read"],
"Cognitive Services Contributor": ["read", "write", "delete"],
"Cognitive Services OpenAI User": ["read"],
"Cognitive Services OpenAI Contributor": ["read", "write", "delete"],
}

ARM_ROLE_DEFAULT_ACTIONS: list[str] = ["read", "write"]

WRITE_LEVEL_ACTIONS: frozenset[str] = frozenset(
{"write", "admin", "delete", "update", "create", "execute"}
)

@dataclass
class NormalizedAction:
"""Result of ARM role → action normalization."""
action: str # Highest-privilege action keyword
actions: list[str] # Full action list for evidence/display
is_fallback: bool # True if role_name was not in ARM_ROLE_ACTIONS

def normalize_arm_action(role_name: str) -> NormalizedAction:
"""Return the highest-privilege action for an ARM role.

Uses explicit mapping (ARM_ROLE_ACTIONS) with privilege precedence:
if any action is in WRITE_LEVEL_ACTIONS, return that one.
Otherwise return the first action (typically "read").

Unknown roles default to ["read", "write"] (conservative — ensures
write-level findings fire). The `is_fallback` flag is set to True so
callers can track, log, or flag unmapped roles for review.
Connectors should propagate `is_fallback` into the permission node's
properties (e.g. `"action_classification": "fallback"`) so that
downstream consumers can distinguish explicit from inferred actions.
"""

# --- Resource Sensitivity ---
ARM_SENSITIVITY: dict[str, str] = {
"Microsoft.KeyVault/vaults": "confidential",
"Microsoft.Sql/servers": "restricted",
"Microsoft.Storage/storageAccounts": "restricted",
"Microsoft.Web/sites": "internal",
"Microsoft.Compute/virtualMachines": "internal",
"Microsoft.CognitiveServices/accounts": "internal",
"Microsoft.MachineLearningServices/workspaces": "internal",
"Microsoft.Resources/resourceGroups": "unknown",
"Microsoft.Subscription": "unknown",
}

def classify_arm_sensitivity(resource_type: str) -> str:
"""Return sensitivity classification for an ARM resource type."""

What moves here:

  • _ARM_ROLE_ACTIONS, _ARM_ROLE_DEFAULT_ACTIONS, _WRITE_LEVEL_ACTIONS from entra-servicenow
  • _normalize_arm_action() from azure-foundry (replaced by explicit mapping)
  • _ARM_SENSITIVITY from entra-servicenow (extended with Cognitive Services and ML types for foundry)

Bug fix: azure-foundry switches from substring-matching ("user" in lower → "use") to the explicit mapping. "Cognitive Services User" now correctly returns "read" instead of "use". "User Access Administrator" correctly returns "admin" instead of "use".

3.5 entra.py — Entra Graph Client (SP Lookups)

Purpose: Shared Microsoft Graph queries for service principal details, owners, and user info.

@dataclass
class EntraServicePrincipal:
object_id: str
app_id: str
display_name: str
service_principal_type: str # "Application", "ManagedIdentity", "Legacy"
account_enabled: bool
credential_type: str # "none", "secret", "certificate", "secret_and_certificate"
created_at: datetime | None
sign_in_audience: str
owners: list[EntraOwner] = field(default_factory=list)
last_sign_in_at: datetime | None = None

@dataclass
class EntraOwner:
object_id: str
display_name: str
user_principal_name: str | None
owner_type: str # "user", "servicePrincipal"

class EntraClient:
def __init__(self, credential: ClientSecretCredential) -> None: ...

def get_service_principal(self, object_id: str) -> EntraServicePrincipal | None:
"""Fetch SP by Entra object ID. Returns None on 404."""

def get_service_principal_owners(self, object_id: str) -> list[EntraOwner]:
"""Get owners (users and SPs) of a service principal."""

def get_user(self, object_id: str) -> dict | None:
"""Get user details: display_name, UPN, account_enabled, job_title."""

def get_last_sign_in(self, object_id: str) -> datetime | None:
"""Most recent sign-in timestamp. Returns None if P1/P2 unavailable."""

@staticmethod
def detect_credential_type(raw_sp: dict) -> str:
"""Infer credential type from keyCredentials + passwordCredentials."""

What moves here:

  • get_service_principal_by_id() from entra-servicenow / get_service_principal() from azure-foundry
  • get_service_principal_owners() from both
  • get_user_details() / get_user() from both
  • detect_credential_type() logic from both
  • EntraServicePrincipal and EntraOwner dataclasses from azure-foundry (already cleaner)

What stays in connectors:

  • entra-servicenow's bulk get_service_principals() (tenant-wide scan, specific to that connector)
  • entra-servicenow's get_app_role_assignments(), get_oauth2_permission_grants(), get_audit_logs() (Graph permission-specific, not needed by foundry)
  • entra-servicenow's bulk get_service_principal_sign_ins() (foundry only needs single-SP sign-in)

3.6 node_ids.py — Canonical Node ID Generation

Purpose: Enforce consistent node ID conventions so cross-connector graph merges work correctly.

def sp_node_id(principal_id: str) -> str:
"""Canonical node ID for an Entra service principal."""
return f"entra-sp-{principal_id}"

def owner_node_id(object_id: str) -> str:
"""Canonical node ID for an Entra user/owner."""
return f"entra-user-{object_id}"

def arm_role_node_id(assignment_id: str, fallback_name: str = "") -> str:
"""Canonical node ID for an ARM role assignment."""
return f"arm-role-{assignment_id or fallback_name}"

def arm_resource_node_id(scope: str) -> str:
"""Canonical node ID for an ARM resource. Uses scope hash."""
return f"arm-resource-{scope_hash(scope)}"

def arm_permission_node_id(role_name: str, scope: str) -> str:
"""Canonical node ID for an ARM permission (role × scope)."""
h = hashlib.sha256(f"arm:{role_name}:{scope}".encode()).hexdigest()[:16]
return f"arm-perm-{h}"

Why this matters: If both connectors discover the same managed identity (e.g., a Foundry workspace's MI that also appears in a ServiceNow REST Message's Function App resolution), the platform must merge them into one entity. Consistent node IDs are the mechanism. Today this works by coincidence (both use entra-sp-{id}). A shared module makes it a contract.

3.7 platform_client.py — Platform Submission

Purpose: Shared HTTP client for submitting NormalizedGraph to sv0-platform.

class PlatformClient:
def __init__(
self,
base_url: str = "http://localhost:3000",
tenant_id: str = "default",
api_key: str | None = None,
) -> None: ...

def health_check(self) -> bool:
"""GET /health"""

def submit_normalized_graph(self, graph: dict) -> dict:
"""POST /api/v1/ingest/normalized-graph"""

Source: Lift from azure-foundry's platform_client.py (already clean, well-structured).


4. Migration Strategy

Phase 1: Extract arm_roles.py (fixes active bug)

Effort: Small. Impact: Fixes azure-foundry write-level finding gap.

  1. Create shared/sv0_azure/arm_roles.py with unified constants and normalize_arm_action()
  2. Update azure-foundry transformer.py to from sv0_azure.arm_roles import normalize_arm_action
  3. Update entra-servicenow transformer.py to from sv0_azure.arm_roles import ARM_ROLE_ACTIONS, ...
  4. Delete _normalize_arm_action() from azure-foundry
  5. Delete _ARM_ROLE_ACTIONS, _WRITE_LEVEL_ACTIONS, _ARM_SENSITIVITY from entra-servicenow
  6. Run both test suites

Phase 2: Extract arm_scope.py + arm.py

Effort: Medium. Impact: Eliminates ~120 lines of duplication, unifies well-known role cache.

  1. Create shared/sv0_azure/arm_scope.py with parse_arm_scope() and scope_hash()
  2. Create shared/sv0_azure/arm.py with unified ArmClient
  3. Merge well-known role UUID caches (azure-foundry's 60+ is the superset)
  4. Refactor both connectors' ARM calls to use shared client
  5. resolve_function_app_identity() stays in entra-servicenow's azure_client.py — it calls shared ArmClient.list_subscriptions() but the Function App-specific discovery logic remains local (see section 3.2)
  6. azure-foundry deletes arm_client.py entirely (fully replaced)

Phase 3: Extract auth.py + entra.py

Effort: Medium. Impact: Eliminates ~100 lines of duplication, shared dataclasses.

  1. Create shared/sv0_azure/auth.py and entra.py
  2. Both connectors adopt EntraServicePrincipal and EntraOwner dataclasses
  3. entra-servicenow keeps its bulk Graph methods (tenant-wide SP scan, audit logs, etc.)
  4. azure-foundry deletes azure_client.py (fully replaced by shared entra.py)

Phase 4: Extract node_ids.py + platform_client.py

Effort: Small. Impact: Convention enforcement, code reuse.

  1. Create shared modules
  2. Both connectors adopt shared node ID functions
  3. entra-servicenow adopts shared PlatformClient

5. What Does NOT Change

ComponentWhy it stays connector-specific
entra-servicenow correlator.pyCross-service matching (Azure SP ↔ SN OAuth by client_id) is connector-specific logic
entra-servicenow permission_mapper.pyOAA canonical permission mapping for Microsoft Graph permissions — not ARM-related
entra-servicenow servicenow_client.pyServiceNow REST API — no Azure overlap
azure-foundry foundry_client.pyFoundry workspace/agent/connection discovery — connector-specific
azure-foundry egress_classifier.pyFoundry connection endpoint classification — connector-specific
azure-foundry edge_resolver.pyFoundry-specific edge resolution (agents → connections → identities)
azure-foundry discoverer.pyFoundry-specific orchestration
Transformer core logicEach connector's transform() method is specific to its entity model

6. Packaging & CI

6.0 Package Definitions

# shared/sv0_common/pyproject.toml
[project]
name = "sv0-common"
version = "0.1.0"
requires-python = ">=3.11"
dependencies = [
"requests>=2.31.0",
]

[build-system]
requires = ["setuptools", "wheel"]
build-backend = "setuptools.build_meta"
# shared/sv0_azure/pyproject.toml
[project]
name = "sv0-azure"
version = "0.1.0"
requires-python = ">=3.11"
dependencies = [
"sv0-common>=0.1.0",
"azure-identity>=1.15.0",
"requests>=2.31.0",
]

[build-system]
requires = ["setuptools", "wheel"]
build-backend = "setuptools.build_meta"

Build backend is setuptools (not hatchling) to match both existing connectors.

6.1 CI Install Path

Current state: The entra-servicenow CI workflows use two different install strategies:

WorkflowInstall commandSource
entra-servicenow-ci.yml (line 39)pip install -r requirements.txt -r requirements-dev.txtrequirements files
entra-servicenow-scan.yml (line 58)pip install -r requirements.txtrequirements files
entra-servicenow-quality.yml (line 43)pip install -e ".[dev,test]"pyproject.toml extras

azure-foundry has no CI workflows — only local Makefile targets.

Problem: After Phase 1, connectors will from sv0_azure.arm_roles import .... If CI only runs pip install -r requirements.txt, sv0_azure is not installed, and imports fail immediately.

Fix: Each workflow's install step must install the shared packages before the connector. The shared packages are added to the requirements.txt files as editable local installs:

# integrations/entra-servicenow/requirements.txt (add at top)
-e ../../shared/sv0_common
-e ../../shared/sv0_azure

This works for both the requirements.txt and pyproject.toml install paths because pip processes -e lines in requirements files. The pyproject.toml extras path (pip install -e ".[dev,test]") also needs a pre-install step in CI:

# entra-servicenow-quality.yml — updated install step
- name: Install shared packages
run: |
pip install -e ../../shared/sv0_common
pip install -e ../../shared/sv0_azure
- name: Install connector
run: pip install -e ".[dev,test]"

6.2 CI Trigger Coverage for Shared Code

Current state: All three workflows are path-filtered to integrations/entra-servicenow/** only:

# entra-servicenow-ci.yml lines 6-13
on:
push:
paths:
- 'integrations/entra-servicenow/**'
- '.github/workflows/entra-servicenow-ci.yml'

Problem: Changes to shared/sv0_azure/** won't trigger any CI. A breaking change to the shared module would pass PR checks if only shared files are modified.

Fix: Add shared/** to path filters in all three entra-servicenow workflows, and create a new azure-foundry CI workflow:

# entra-servicenow-ci.yml — updated path filters
on:
push:
paths:
- 'integrations/entra-servicenow/**'
- 'shared/**'
- '.github/workflows/entra-servicenow-ci.yml'
pull_request:
paths:
- 'integrations/entra-servicenow/**'
- 'shared/**'
- '.github/workflows/entra-servicenow-ci.yml'

Same change for entra-servicenow-quality.yml and entra-servicenow-scan.yml.

Additionally, create .github/workflows/azure-foundry-ci.yml (modeled on entra-servicenow-ci.yml) with path filters for both integrations/azure-foundry/** and shared/**. This ensures shared code changes are validated against both consumers.

6.3 Working Directory for CI

All three existing workflows use working-directory: integrations/entra-servicenow. The shared packages are at ../../shared/ relative to that directory. The -e ../../shared/sv0_azure install path resolves correctly from the connector's working directory.

No packaging. Each connector adds shared/ to sys.path.

Pros: Zero packaging overhead. Cons: Fragile imports, no dependency declaration, harder for IDEs, CI cache invalidation unclear.

Recommendation: Option A (editable local packages) with the CI fixes described above.


7. Risk Assessment

RiskLikelihoodImpactMitigation
Breaking existing tests during extractionMediumMediumPhase approach — extract one module at a time, run both test suites after each
Import path confusion during developmentLowLowEditable install (pip install -e) makes imports work naturally
Shared package version driftLowMediumSingle monorepo — changes to shared code are always in the same PR as connector updates
Over-abstraction — shared code becomes too genericLowMediumOnly extract what both connectors already implement. No speculative interfaces
Connector-specific needs force shared module changesMediumLowShared module exposes data + functions, not opinionated frameworks. Connectors compose freely

8. Verification Plan

Verification criteria by phase type

Not all phases are pure refactors. Phase 1 intentionally changes azure-foundry's normalization output (that's the bug fix). The verification criteria must match:

PhaseTypeentra-servicenow graphazure-foundry graph
Phase 1Bug fix + extractionIdentical to baselineExpected diff in normalized_action values
Phase 2Pure refactorIdentical to Phase 1 outputIdentical to Phase 1 output
Phase 3Pure refactorIdentical to Phase 2 outputIdentical to Phase 2 output
Phase 4Pure refactorIdentical to Phase 3 outputIdentical to Phase 3 output

Per-phase verification (Phases 2-4, pure refactors)

After each extraction phase:

  1. cd integrations/entra-servicenow && pytest — all 296+ tests pass
  2. cd integrations/azure-foundry && pytest — all tests pass
  3. cd integrations/entra-servicenow && make lint — clean
  4. cd integrations/azure-foundry && make lint — clean
  5. Both connectors produce identical NormalizedGraph output compared to previous phase baseline:
    # Save baseline before starting phase
    entra-servicenow --all --graph-json reports/baseline-phaseN.json
    azure-foundry --all --graph-json reports/baseline-phaseN.json

    # After extraction: strict diff (must be empty)
    entra-servicenow --all --graph-json reports/after-phaseN.json
    diff <(jq -S . reports/baseline-phaseN.json) <(jq -S . reports/after-phaseN.json)
    # Exit code 0 = identical

Phase 1 specific verification (bug fix — expected diff)

Phase 1 intentionally changes azure-foundry output. The verification uses an expected-diff policy rather than strict equivalence:

# Step 1: Save azure-foundry baseline BEFORE Phase 1
azure-foundry --all --graph-json reports/foundry-before.json

# Step 2: Apply Phase 1, then generate new output
azure-foundry --all --graph-json reports/foundry-after.json

# Step 3: Extract permission node diffs only
python3 -c "
import json

before = json.load(open('reports/foundry-before.json'))
after = json.load(open('reports/foundry-after.json'))

before_perms = {n['nodeId']: n for n in before['nodes'] if n.get('nodeType') == 'permission'}
after_perms = {n['nodeId']: n for n in after['nodes'] if n.get('nodeType') == 'permission'}

# Verify: only normalized_action changed, and only from wrong to correct values
EXPECTED_CHANGES = {
'use': {'read', 'admin', 'execute'}, # 'use' was the bug; these are valid replacements
}

for nid in after_perms:
old_action = before_perms.get(nid, {}).get('properties', {}).get('normalized_action')
new_action = after_perms[nid]['properties'].get('normalized_action')
if old_action != new_action:
assert old_action in EXPECTED_CHANGES, f'Unexpected change from {old_action}'
assert new_action in EXPECTED_CHANGES[old_action], f'Unexpected new value {new_action}'
print(f' EXPECTED: {nid}: {old_action!r} -> {new_action!r}')

# Verify: non-permission nodes are identical
before_other = [n for n in before['nodes'] if n.get('nodeType') != 'permission']
after_other = [n for n in after['nodes'] if n.get('nodeType') != 'permission']
assert before_other == after_other, 'Non-permission nodes changed unexpectedly'
print('Phase 1 verification passed: only expected permission changes')
"
# Step 4: entra-servicenow must be STRICTLY identical (no bug fix, pure extraction)
entra-servicenow --all --graph-json reports/entra-before.json
# ... apply Phase 1 ...
entra-servicenow --all --graph-json reports/entra-after.json
diff <(jq -S . reports/entra-before.json) <(jq -S . reports/entra-after.json)
# Must be empty — entra-servicenow already had correct normalization

9. Files Changed (Per Phase)

Phase 1

FileChange
shared/sv0_common/__init__.py (new)Package init
shared/sv0_common/pyproject.toml (new)Package definition (requests only)
shared/sv0_azure/__init__.py (new)Package init
shared/sv0_azure/pyproject.toml (new)Package definition (azure-identity, sv0-common)
shared/sv0_azure/arm_roles.py (new)Unified role actions, sensitivity, NormalizedAction, normalize_arm_action()
integrations/azure-foundry/src/.../transformer.pyImport from sv0_azure.arm_roles, delete local _normalize_arm_action()
integrations/entra-servicenow/src/.../transformer.pyImport from sv0_azure.arm_roles, delete local constants
integrations/entra-servicenow/requirements.txtAdd -e ../../shared/sv0_common and -e ../../shared/sv0_azure
integrations/entra-servicenow/requirements-dev.txt(no change — shared packages are production deps)
.github/workflows/entra-servicenow-ci.ymlAdd shared/** to path filters, add shared install step
.github/workflows/entra-servicenow-quality.ymlAdd shared/** to path filters, add shared install step
.github/workflows/entra-servicenow-scan.ymlAdd shared/** to path filters, add shared install step
.github/workflows/azure-foundry-ci.yml (new)CI workflow for azure-foundry (path filters: integrations/azure-foundry/**, shared/**)

Phase 2

FileChange
shared/sv0_azure/arm_scope.py (new)ArmScope dataclass, parse_arm_scope(), scope_hash()
shared/sv0_azure/arm.py (new)Unified ArmClient with merged 60+ well-known role cache
integrations/azure-foundry/src/.../arm_client.pyDelete (replaced by shared)
integrations/entra-servicenow/src/.../azure_client.pyRemove ARM methods, import shared ArmClient
Both transformersUse parse_arm_scope() from shared

Phase 3

FileChange
shared/sv0_azure/auth.py (new)Credential + session factory
shared/sv0_azure/entra.py (new)Entra Graph client + dataclasses
integrations/azure-foundry/src/.../azure_client.pyDelete (replaced by shared)
integrations/entra-servicenow/src/.../azure_client.pyRemove duplicated methods, keep bulk scan methods

Phase 4

FileChange
shared/sv0_azure/node_ids.py (new)Canonical node ID functions
shared/sv0_common/platform_client.py (new)Lift from azure-foundry (Azure-independent)
Both transformers + edge resolversUse shared node ID functions
integrations/azure-foundry/src/.../platform_client.pyDelete (moved to shared)

Platform changes: None required. The shared module affects only connector internals. NormalizedGraph output format is unchanged (except the intentional Phase 1 bug fix in azure-foundry).


10. Peer Review Findings (v1 → v2)

#SeverityFindingResolution
1Criticalsv0-azure @ file://../../shared/sv0_azure is invalid pip syntax — file:// with relative paths is rejected as "non-local file URIs are not supported"Removed file:// URI from pyproject.toml. Connectors declare sv0-azure>=0.1.0 as a named dependency; the editable install (pip install -e ../../shared/sv0_azure) satisfies it. CI installs shared packages explicitly before the connector. (Section 2, 6.1)
2HighCI workflows install from requirements.txt / requirements-dev.txt, not pyproject extras — shared imports would fail in CIAdded -e ../../shared/sv0_common and -e ../../shared/sv0_azure to requirements.txt. Added explicit shared install steps to quality workflow. (Section 6.1)
3HighWorkflow path filters only match integrations/entra-servicenow/** — changes to shared/ won't trigger testsAdded shared/** to path filters in all three entra-servicenow workflows. Created new azure-foundry-ci.yml workflow with both integrations/azure-foundry/** and shared/** triggers. (Section 6.2)
4HighVerification plan requires "identical NormalizedGraph output" but Phase 1 intentionally changes azure-foundry normalization to fix the bugReplaced strict-equivalence with a two-tier verification policy: Phase 1 uses an expected-diff policy (only normalized_action changes from "use" to correct values are allowed); Phases 2-4 use strict graph equivalence against the previous phase baseline. entra-servicenow is always strict (no output change). (Section 8)
5Mediumresolve_function_app_identity() listed in shared ArmClient spec (section 3.2) but migration says it may stay local (section 4, Phase 2 step 5) — contradictoryRemoved from shared ArmClient spec. It stays in entra-servicenow's azure_client.py — it is Function App-specific discovery logic that only one connector uses, composing on top of shared ArmClient.list_subscriptions(). (Section 3.2)
6MediumArmScope.resource_type and resource_name are required strings but subscription/RG scopes have no resource type — type inconsistencyChanged to `str
7MediumUnknown-role fallback to ["read", "write"] is conservative but creates noise for custom roles with no way to distinguish inferred from explicitAdded NormalizedAction dataclass with is_fallback: bool field. Connectors propagate this into permission node properties (e.g. "action_classification": "fallback") so downstream consumers can filter or flag. (Section 3.4)
8Mediumplatform_client.py in sv0_azure is a leaky abstraction — non-Azure connectors would need to depend on an Azure package for generic HTTP transportMoved platform_client.py to new shared/sv0_common/ package. sv0_common has no Azure dependency (only requests). sv0_azure depends on sv0_common. Any future non-Azure connector can depend on sv0_common alone. (Section 2)

Open questions (resolved)

QuestionAnswer
Should requirements.txt or pyproject.toml be the canonical dependency source?Both coexist today and serve different CI workflows. Keep both; add shared package references to requirements.txt for ci/scan workflows and explicit install steps for quality workflow.
Should azure-foundry get its own CI workflow?Yes. It has none today, and shared code changes must be validated against both consumers. Phase 1 creates azure-foundry-ci.yml.
Is setuptools or hatchling the right build backend for shared packages?setuptools — both existing connectors use it. No reason to introduce a new build tool.