Architecture Plan: Shared Azure Modules Across Connectors
Date: 2026-02-25 Status: Draft v2 (addresses all peer review findings from v1) Scope: sv0-connectors (cross-connector shared package) Origin: Code review of function-key authority paths implementation revealed significant Azure code duplication between entra-servicenow and azure-foundry connectors, including one active bug caused by divergent implementations. Review history: v1 → v2: 8 findings (1 critical, 3 high, 4 medium). All resolved — see section 10.
1. Problem Statement
The entra-servicenow and azure-foundry connectors independently implement overlapping Azure functionality: ARM RBAC role resolution, ARM scope parsing, Entra Graph SP lookups, credential type detection, authentication setup, and node ID conventions. This duplication violates DRY and has already produced a concrete bug.
Active Bug: Divergent ARM Action Normalization
The two connectors normalize ARM role names to platform action keywords differently:
entra-servicenow (transformer.py, recently fixed):
_ARM_ROLE_ACTIONS = {
"Contributor": ["read", "write", "delete"],
"User Access Administrator": ["admin"],
"Key Vault Crypto User": ["read", "execute"],
...
}
# Picks highest-privilege action from explicit mapping
azure-foundry (transformer.py):
def _normalize_arm_action(role_name: str) -> str:
lower = role_name.lower()
if "owner" in lower: return "owner"
if "contributor" in lower: return "write"
if "user" in lower or "use" in lower: return "use" # ← BUG
...
Impact: The platform evaluator privilege_justification_gap checks hasWriteActions() against WRITE_LEVEL_ACTIONS = {"write", "admin", "delete", "update", "create", "execute"}. The string "use" is NOT in that set. So azure-foundry silently misses write-level findings for:
| ARM Role | entra-servicenow (correct) | azure-foundry (wrong) | Should trigger write finding? |
|---|---|---|---|
| User Access Administrator | "admin" | "use" | Yes — admin is write-level |
| Key Vault Crypto User | "execute" | "use" | Yes — execute is write-level |
| Key Vault Secrets User | "read" | "use" | No — read-only, but "use" is still wrong |
| Cognitive Services User | "read" (unmapped, but read-only role) | "use" | No — but "use" is not a recognized action |
This is not hypothetical — Foundry workspaces commonly have "Cognitive Services User" and "Cognitive Services OpenAI User" role assignments on their managed identities.
Duplication Inventory
| Capability | entra-servicenow | azure-foundry | Lines duplicated |
|---|---|---|---|
ARM list_subscriptions() | azure_client.py:420-448 | arm_client.py:55-82 | ~30 |
ARM get_role_assignments_for_principal() | azure_client.py:450-510 | arm_client.py:84-130 | ~50 |
| ARM role name resolution (UUID → name) | azure_client.py:512-560 | arm_client.py:132-170 | ~40 |
| Well-known role UUID cache | 5 entries | 60+ entries | Subset problem |
| ARM scope parsing | transformer.py:870-940 (inline) | arm_client.py:_parse_scope() | ~40 |
| ARM role → action normalization | transformer.py:69-106 | transformer.py:_normalize_arm_action() | ~30 |
| ARM resource sensitivity | transformer.py:108-116 | Not present | Missing in foundry |
| Entra SP lookup by object ID | azure_client.py:562-590 | azure_client.py:get_service_principal() | ~30 |
| Entra SP owners | azure_client.py:170-210 | azure_client.py:get_service_principal_owners() | ~30 |
| Entra user details | azure_client.py:212-250 | azure_client.py:get_user() | ~25 |
| Credential type detection | azure_client.py:_extract_sp_data() | azure_client.py:_parse_sp() | ~15 |
| Auth setup (ClientSecretCredential) | azure_client.py:__init__() | azure_client.py + arm_client.py | ~20 |
| Retry session factory | azure_client.py:__init__() | arm_client.py:__init__() | ~10 |
| Node ID: SP | entra-sp-{principal_id} | entra-sp-{principal_id} | Convention only |
| Node ID: owner | entra-user-{object_id} | entra-user-{object_id} | Convention only |
| Platform submission | Inline in cli/main.py | platform_client.py | ~40 |
Total: ~360 lines of functionally duplicated code, plus one active bug from divergent implementations.
2. Proposed Architecture
Package Structure
sv0-connectors/
├── shared/
│ ├── sv0_common/ # Connector-agnostic utilities
│ │ ├── __init__.py
│ │ ├── py.typed
│ │ ├── pyproject.toml
│ │ └── platform_client.py # NormalizedGraph submission to sv0-platform
│ │
│ └── sv0_azure/ # Azure-specific shared code
│ ├── __init__.py
│ ├── py.typed # PEP 561 marker
│ ├── pyproject.toml # Installable package
│ │
│ ├── auth.py # Credential + retry session factory
│ ├── arm.py # ARM client: subscriptions, role assignments
│ ├── arm_scope.py # parse_arm_scope() utility
│ ├── arm_roles.py # Role UUID cache, role→action mapping, sensitivity
│ ├── entra.py # Graph client: SP lookup, owners, users, sign-ins
│ └── node_ids.py # Canonical node ID generation
│
├── integrations/
│ ├── entra-servicenow/
│ │ └── pyproject.toml # depends on sv0_azure, sv0_common
│ └── azure-foundry/
│ └── pyproject.toml # depends on sv0_azure, sv0_common
Package separation rationale: platform_client.py is transport-layer code (HTTP POST to sv0-platform API). It has no Azure dependency and will be needed by any future non-Azure connector (e.g., a GCP or AWS connector). Placing it in sv0_common avoids forcing non-Azure connectors to depend on azure-identity just to submit graphs. sv0_azure depends on sv0_common (for shared retry session logic); connectors depend on both.
Dependency Model
sv0_azure ← entra-servicenow
← azure-foundry
← (future connectors)
Each connector installs sv0_azure as an editable local dependency during development:
# Development install (both the shared package and the connector)
cd shared/sv0_azure && pip install -e .
cd integrations/entra-servicenow && pip install -e ".[dev,test]"
Note on pyproject.toml dependency declaration: PEP 440 file:// URIs are not portable across pip versions and environments. Instead of declaring a file:// dependency in pyproject.toml (which pip rejects as "non-local file URIs are not supported"), connectors declare sv0_azure as a bare dependency by name:
# integrations/entra-servicenow/pyproject.toml
[project]
dependencies = [
"sv0-azure>=0.1.0",
...
]
The editable install (pip install -e ../../shared/sv0_azure) satisfies this requirement during local development. For CI, the shared package is installed explicitly before the connector (see section 6.1).
3. Module Specifications
3.1 auth.py — Credential & Session Factory
Purpose: Eliminate duplicated ClientSecretCredential setup and retry session creation.
from azure.identity import ClientSecretCredential
GRAPH_SCOPE = "https://graph.microsoft.com/.default"
ARM_SCOPE = "https://management.azure.com/.default"
def azure_credential(
tenant_id: str, client_id: str, client_secret: str
) -> ClientSecretCredential:
"""Create a ClientSecretCredential for Azure API access."""
def retry_session(
retries: int = 3,
backoff_factor: float = 0.5,
status_forcelist: tuple[int, ...] = (429, 500, 502, 503, 504),
) -> requests.Session:
"""Create a requests.Session with retry strategy."""
def bearer_token(credential: ClientSecretCredential, scope: str) -> str:
"""Acquire a bearer token for the given scope."""
What moves here:
ClientSecretCredentialinitialization (both connectors)requests.Session+HTTPAdapter+Retrysetup (both connectors)- Token acquisition pattern (both connectors)
What stays in connectors:
- Config loading (env vars, validation) — connector-specific
- Connector-specific scopes (Foundry data-plane scopes like
ai.azure.com,ml.azure.com)
3.2 arm.py — ARM Management Client
Purpose: Single ARM client for subscription enumeration and RBAC role assignment queries.
class ArmClient:
def __init__(self, credential: ClientSecretCredential) -> None: ...
def list_subscriptions(self) -> list[str]:
"""List all accessible subscription IDs. Cached after first call."""
def get_role_assignments_for_principal(
self, subscription_id: str, principal_id: str
) -> list[ArmRoleAssignment]:
"""Query ARM RBAC assignments for a principal in a subscription."""
@dataclass
class ArmRoleAssignment:
assignment_id: str
principal_id: str
principal_type: str # "ServicePrincipal", "User", "Group"
role_definition_id: str # Full ARM path
role_name: str # Resolved human-readable name
scope: str # Full ARM scope string
What moves here:
list_subscriptions()from both connectorsget_role_assignments_for_principal()from both connectors_resolve_role_name()/_resolve_arm_role_name()— unified_WELL_KNOWN_ROLES— merged to 60+ entries (azure-foundry's superset)
What stays in connectors:
resolve_function_app_identity()stays in entra-servicenow'sazure_client.py— it is Function App-specific discovery logic that only entra-servicenow uses, and it composes on top ofArmClient.list_subscriptions()from the shared module- Foundry-specific ARM queries (
list_workspaces,list_aiservices,list_projects) - entra-servicenow bulk SP fetch
3.3 arm_scope.py — ARM Scope Parsing
Purpose: Parse ARM scope strings into structured components. Currently inline in entra-servicenow and a standalone function in azure-foundry.
@dataclass
class ArmScope:
raw: str # Original scope string
level: str # "resource" | "resource_group" | "subscription"
subscription_id: str | None # Extracted if present
resource_group: str | None # Extracted if present
resource_type: str | None # e.g. "Microsoft.Web/sites" — None for subscription scope
resource_name: str | None # e.g. "myapp" — None for subscription scope
@property
def display_name(self) -> str:
"""Human-readable label for this scope."""
if self.level == "resource" and self.resource_type and self.resource_name:
return f"{self.resource_type}/{self.resource_name}"
if self.level == "resource_group" and self.resource_name:
return f"ResourceGroup/{self.resource_name}"
if self.level == "subscription" and self.subscription_id:
return f"Subscription/{self.subscription_id[:8]}..."
return self.raw
def parse_arm_scope(scope: str) -> ArmScope | None:
"""Parse an ARM scope string into structured components.
Handles three levels:
- Resource: /subscriptions/.../providers/Microsoft.Web/sites/myapp
→ resource_type="Microsoft.Web/sites", resource_name="myapp"
- Resource group: /subscriptions/.../resourceGroups/myRG
→ resource_type=None, resource_name="myRG"
- Subscription: /subscriptions/{sub-id}
→ resource_type=None, resource_name=None
Returns None for empty or unparseable scopes.
Note: resource_type and resource_name are None for subscription-level
scopes and resource_type is None for resource-group scopes, since those
levels have no provider/type. Callers should check the `level` field
or use `display_name` for presentation.
"""
def scope_hash(scope: str) -> str:
"""Deterministic 12-char hash of an ARM scope for use in node IDs."""
Design note on optional fields: resource_type and resource_name are str | None rather than always-present strings because subscription-level scopes genuinely have no resource type or name. Using None makes it explicit that callers must handle each scope level, rather than hiding the absence behind sentinel strings like "Microsoft.Subscription" or "unknown". The level field provides the discriminant for pattern matching.
What moves here:
- Scope parsing logic from entra-servicenow
transformer.py:870-940 _parse_scope()from azure-foundryarm_client.pyscope_hash()from entra-servicenowtransformer.py:872
3.4 arm_roles.py — Role Actions, Sensitivity, and Classification
Purpose: Unified ARM role → action mapping and resource sensitivity classification. This module fixes the active bug in azure-foundry.
# --- Role → Action Mapping ---
# Maps well-known ARM built-in role names to platform action keywords.
# These keywords align with the platform evaluator's WRITE_LEVEL_ACTIONS set:
# {"write", "admin", "delete", "update", "create", "execute"}
ARM_ROLE_ACTIONS: dict[str, list[str]] = {
# Core
"Owner": ["read", "write", "delete", "admin"],
"Contributor": ["read", "write", "delete"],
"Reader": ["read"],
"User Access Administrator": ["admin"],
# Storage
"Storage Blob Data Owner": ["read", "write", "delete", "admin"],
"Storage Blob Data Contributor": ["read", "write", "delete"],
"Storage Blob Data Reader": ["read"],
# Key Vault
"Key Vault Administrator": ["read", "write", "delete", "admin"],
"Key Vault Secrets User": ["read"],
"Key Vault Crypto User": ["read", "execute"],
# SQL
"SQL DB Contributor": ["read", "write", "delete"],
"SQL Server Contributor": ["read", "write", "delete"],
# Compute / Web
"Website Contributor": ["read", "write", "delete"],
"Virtual Machine Contributor": ["read", "write", "delete"],
# AI / Cognitive Services
"Cognitive Services User": ["read"],
"Cognitive Services Contributor": ["read", "write", "delete"],
"Cognitive Services OpenAI User": ["read"],
"Cognitive Services OpenAI Contributor": ["read", "write", "delete"],
}
ARM_ROLE_DEFAULT_ACTIONS: list[str] = ["read", "write"]
WRITE_LEVEL_ACTIONS: frozenset[str] = frozenset(
{"write", "admin", "delete", "update", "create", "execute"}
)
@dataclass
class NormalizedAction:
"""Result of ARM role → action normalization."""
action: str # Highest-privilege action keyword
actions: list[str] # Full action list for evidence/display
is_fallback: bool # True if role_name was not in ARM_ROLE_ACTIONS
def normalize_arm_action(role_name: str) -> NormalizedAction:
"""Return the highest-privilege action for an ARM role.
Uses explicit mapping (ARM_ROLE_ACTIONS) with privilege precedence:
if any action is in WRITE_LEVEL_ACTIONS, return that one.
Otherwise return the first action (typically "read").
Unknown roles default to ["read", "write"] (conservative — ensures
write-level findings fire). The `is_fallback` flag is set to True so
callers can track, log, or flag unmapped roles for review.
Connectors should propagate `is_fallback` into the permission node's
properties (e.g. `"action_classification": "fallback"`) so that
downstream consumers can distinguish explicit from inferred actions.
"""
# --- Resource Sensitivity ---
ARM_SENSITIVITY: dict[str, str] = {
"Microsoft.KeyVault/vaults": "confidential",
"Microsoft.Sql/servers": "restricted",
"Microsoft.Storage/storageAccounts": "restricted",
"Microsoft.Web/sites": "internal",
"Microsoft.Compute/virtualMachines": "internal",
"Microsoft.CognitiveServices/accounts": "internal",
"Microsoft.MachineLearningServices/workspaces": "internal",
"Microsoft.Resources/resourceGroups": "unknown",
"Microsoft.Subscription": "unknown",
}
def classify_arm_sensitivity(resource_type: str) -> str:
"""Return sensitivity classification for an ARM resource type."""
What moves here:
_ARM_ROLE_ACTIONS,_ARM_ROLE_DEFAULT_ACTIONS,_WRITE_LEVEL_ACTIONSfrom entra-servicenow_normalize_arm_action()from azure-foundry (replaced by explicit mapping)_ARM_SENSITIVITYfrom entra-servicenow (extended with Cognitive Services and ML types for foundry)
Bug fix: azure-foundry switches from substring-matching ("user" in lower → "use") to the explicit mapping. "Cognitive Services User" now correctly returns "read" instead of "use". "User Access Administrator" correctly returns "admin" instead of "use".
3.5 entra.py — Entra Graph Client (SP Lookups)
Purpose: Shared Microsoft Graph queries for service principal details, owners, and user info.
@dataclass
class EntraServicePrincipal:
object_id: str
app_id: str
display_name: str
service_principal_type: str # "Application", "ManagedIdentity", "Legacy"
account_enabled: bool
credential_type: str # "none", "secret", "certificate", "secret_and_certificate"
created_at: datetime | None
sign_in_audience: str
owners: list[EntraOwner] = field(default_factory=list)
last_sign_in_at: datetime | None = None
@dataclass
class EntraOwner:
object_id: str
display_name: str
user_principal_name: str | None
owner_type: str # "user", "servicePrincipal"
class EntraClient:
def __init__(self, credential: ClientSecretCredential) -> None: ...
def get_service_principal(self, object_id: str) -> EntraServicePrincipal | None:
"""Fetch SP by Entra object ID. Returns None on 404."""
def get_service_principal_owners(self, object_id: str) -> list[EntraOwner]:
"""Get owners (users and SPs) of a service principal."""
def get_user(self, object_id: str) -> dict | None:
"""Get user details: display_name, UPN, account_enabled, job_title."""
def get_last_sign_in(self, object_id: str) -> datetime | None:
"""Most recent sign-in timestamp. Returns None if P1/P2 unavailable."""
@staticmethod
def detect_credential_type(raw_sp: dict) -> str:
"""Infer credential type from keyCredentials + passwordCredentials."""
What moves here:
get_service_principal_by_id()from entra-servicenow /get_service_principal()from azure-foundryget_service_principal_owners()from bothget_user_details()/get_user()from bothdetect_credential_type()logic from bothEntraServicePrincipalandEntraOwnerdataclasses from azure-foundry (already cleaner)
What stays in connectors:
- entra-servicenow's bulk
get_service_principals()(tenant-wide scan, specific to that connector) - entra-servicenow's
get_app_role_assignments(),get_oauth2_permission_grants(),get_audit_logs()(Graph permission-specific, not needed by foundry) - entra-servicenow's bulk
get_service_principal_sign_ins()(foundry only needs single-SP sign-in)
3.6 node_ids.py — Canonical Node ID Generation
Purpose: Enforce consistent node ID conventions so cross-connector graph merges work correctly.
def sp_node_id(principal_id: str) -> str:
"""Canonical node ID for an Entra service principal."""
return f"entra-sp-{principal_id}"
def owner_node_id(object_id: str) -> str:
"""Canonical node ID for an Entra user/owner."""
return f"entra-user-{object_id}"
def arm_role_node_id(assignment_id: str, fallback_name: str = "") -> str:
"""Canonical node ID for an ARM role assignment."""
return f"arm-role-{assignment_id or fallback_name}"
def arm_resource_node_id(scope: str) -> str:
"""Canonical node ID for an ARM resource. Uses scope hash."""
return f"arm-resource-{scope_hash(scope)}"
def arm_permission_node_id(role_name: str, scope: str) -> str:
"""Canonical node ID for an ARM permission (role × scope)."""
h = hashlib.sha256(f"arm:{role_name}:{scope}".encode()).hexdigest()[:16]
return f"arm-perm-{h}"
Why this matters: If both connectors discover the same managed identity (e.g., a Foundry workspace's MI that also appears in a ServiceNow REST Message's Function App resolution), the platform must merge them into one entity. Consistent node IDs are the mechanism. Today this works by coincidence (both use entra-sp-{id}). A shared module makes it a contract.
3.7 platform_client.py — Platform Submission
Purpose: Shared HTTP client for submitting NormalizedGraph to sv0-platform.
class PlatformClient:
def __init__(
self,
base_url: str = "http://localhost:3000",
tenant_id: str = "default",
api_key: str | None = None,
) -> None: ...
def health_check(self) -> bool:
"""GET /health"""
def submit_normalized_graph(self, graph: dict) -> dict:
"""POST /api/v1/ingest/normalized-graph"""
Source: Lift from azure-foundry's platform_client.py (already clean, well-structured).
4. Migration Strategy
Phase 1: Extract arm_roles.py (fixes active bug)
Effort: Small. Impact: Fixes azure-foundry write-level finding gap.
- Create
shared/sv0_azure/arm_roles.pywith unified constants andnormalize_arm_action() - Update azure-foundry
transformer.pytofrom sv0_azure.arm_roles import normalize_arm_action - Update entra-servicenow
transformer.pytofrom sv0_azure.arm_roles import ARM_ROLE_ACTIONS, ... - Delete
_normalize_arm_action()from azure-foundry - Delete
_ARM_ROLE_ACTIONS,_WRITE_LEVEL_ACTIONS,_ARM_SENSITIVITYfrom entra-servicenow - Run both test suites
Phase 2: Extract arm_scope.py + arm.py
Effort: Medium. Impact: Eliminates ~120 lines of duplication, unifies well-known role cache.
- Create
shared/sv0_azure/arm_scope.pywithparse_arm_scope()andscope_hash() - Create
shared/sv0_azure/arm.pywith unifiedArmClient - Merge well-known role UUID caches (azure-foundry's 60+ is the superset)
- Refactor both connectors' ARM calls to use shared client
resolve_function_app_identity()stays in entra-servicenow'sazure_client.py— it calls sharedArmClient.list_subscriptions()but the Function App-specific discovery logic remains local (see section 3.2)- azure-foundry deletes
arm_client.pyentirely (fully replaced)
Phase 3: Extract auth.py + entra.py
Effort: Medium. Impact: Eliminates ~100 lines of duplication, shared dataclasses.
- Create
shared/sv0_azure/auth.pyandentra.py - Both connectors adopt
EntraServicePrincipalandEntraOwnerdataclasses - entra-servicenow keeps its bulk Graph methods (tenant-wide SP scan, audit logs, etc.)
- azure-foundry deletes
azure_client.py(fully replaced by sharedentra.py)
Phase 4: Extract node_ids.py + platform_client.py
Effort: Small. Impact: Convention enforcement, code reuse.
- Create shared modules
- Both connectors adopt shared node ID functions
- entra-servicenow adopts shared
PlatformClient
5. What Does NOT Change
| Component | Why it stays connector-specific |
|---|---|
entra-servicenow correlator.py | Cross-service matching (Azure SP ↔ SN OAuth by client_id) is connector-specific logic |
entra-servicenow permission_mapper.py | OAA canonical permission mapping for Microsoft Graph permissions — not ARM-related |
entra-servicenow servicenow_client.py | ServiceNow REST API — no Azure overlap |
azure-foundry foundry_client.py | Foundry workspace/agent/connection discovery — connector-specific |
azure-foundry egress_classifier.py | Foundry connection endpoint classification — connector-specific |
azure-foundry edge_resolver.py | Foundry-specific edge resolution (agents → connections → identities) |
azure-foundry discoverer.py | Foundry-specific orchestration |
| Transformer core logic | Each connector's transform() method is specific to its entity model |
6. Packaging & CI
6.0 Package Definitions
# shared/sv0_common/pyproject.toml
[project]
name = "sv0-common"
version = "0.1.0"
requires-python = ">=3.11"
dependencies = [
"requests>=2.31.0",
]
[build-system]
requires = ["setuptools", "wheel"]
build-backend = "setuptools.build_meta"
# shared/sv0_azure/pyproject.toml
[project]
name = "sv0-azure"
version = "0.1.0"
requires-python = ">=3.11"
dependencies = [
"sv0-common>=0.1.0",
"azure-identity>=1.15.0",
"requests>=2.31.0",
]
[build-system]
requires = ["setuptools", "wheel"]
build-backend = "setuptools.build_meta"
Build backend is setuptools (not hatchling) to match both existing connectors.
6.1 CI Install Path
Current state: The entra-servicenow CI workflows use two different install strategies:
| Workflow | Install command | Source |
|---|---|---|
entra-servicenow-ci.yml (line 39) | pip install -r requirements.txt -r requirements-dev.txt | requirements files |
entra-servicenow-scan.yml (line 58) | pip install -r requirements.txt | requirements files |
entra-servicenow-quality.yml (line 43) | pip install -e ".[dev,test]" | pyproject.toml extras |
azure-foundry has no CI workflows — only local Makefile targets.
Problem: After Phase 1, connectors will from sv0_azure.arm_roles import .... If CI only runs pip install -r requirements.txt, sv0_azure is not installed, and imports fail immediately.
Fix: Each workflow's install step must install the shared packages before the connector. The shared packages are added to the requirements.txt files as editable local installs:
# integrations/entra-servicenow/requirements.txt (add at top)
-e ../../shared/sv0_common
-e ../../shared/sv0_azure
This works for both the requirements.txt and pyproject.toml install paths because pip processes -e lines in requirements files. The pyproject.toml extras path (pip install -e ".[dev,test]") also needs a pre-install step in CI:
# entra-servicenow-quality.yml — updated install step
- name: Install shared packages
run: |
pip install -e ../../shared/sv0_common
pip install -e ../../shared/sv0_azure
- name: Install connector
run: pip install -e ".[dev,test]"
6.2 CI Trigger Coverage for Shared Code
Current state: All three workflows are path-filtered to integrations/entra-servicenow/** only:
# entra-servicenow-ci.yml lines 6-13
on:
push:
paths:
- 'integrations/entra-servicenow/**'
- '.github/workflows/entra-servicenow-ci.yml'
Problem: Changes to shared/sv0_azure/** won't trigger any CI. A breaking change to the shared module would pass PR checks if only shared files are modified.
Fix: Add shared/** to path filters in all three entra-servicenow workflows, and create a new azure-foundry CI workflow:
# entra-servicenow-ci.yml — updated path filters
on:
push:
paths:
- 'integrations/entra-servicenow/**'
- 'shared/**'
- '.github/workflows/entra-servicenow-ci.yml'
pull_request:
paths:
- 'integrations/entra-servicenow/**'
- 'shared/**'
- '.github/workflows/entra-servicenow-ci.yml'
Same change for entra-servicenow-quality.yml and entra-servicenow-scan.yml.
Additionally, create .github/workflows/azure-foundry-ci.yml (modeled on entra-servicenow-ci.yml) with path filters for both integrations/azure-foundry/** and shared/**. This ensures shared code changes are validated against both consumers.
6.3 Working Directory for CI
All three existing workflows use working-directory: integrations/entra-servicenow. The shared packages are at ../../shared/ relative to that directory. The -e ../../shared/sv0_azure install path resolves correctly from the connector's working directory.
6.4 Option B: Simple Python Path Import (Not Recommended)
No packaging. Each connector adds shared/ to sys.path.
Pros: Zero packaging overhead. Cons: Fragile imports, no dependency declaration, harder for IDEs, CI cache invalidation unclear.
Recommendation: Option A (editable local packages) with the CI fixes described above.
7. Risk Assessment
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Breaking existing tests during extraction | Medium | Medium | Phase approach — extract one module at a time, run both test suites after each |
| Import path confusion during development | Low | Low | Editable install (pip install -e) makes imports work naturally |
| Shared package version drift | Low | Medium | Single monorepo — changes to shared code are always in the same PR as connector updates |
| Over-abstraction — shared code becomes too generic | Low | Medium | Only extract what both connectors already implement. No speculative interfaces |
| Connector-specific needs force shared module changes | Medium | Low | Shared module exposes data + functions, not opinionated frameworks. Connectors compose freely |
8. Verification Plan
Verification criteria by phase type
Not all phases are pure refactors. Phase 1 intentionally changes azure-foundry's normalization output (that's the bug fix). The verification criteria must match:
| Phase | Type | entra-servicenow graph | azure-foundry graph |
|---|---|---|---|
| Phase 1 | Bug fix + extraction | Identical to baseline | Expected diff in normalized_action values |
| Phase 2 | Pure refactor | Identical to Phase 1 output | Identical to Phase 1 output |
| Phase 3 | Pure refactor | Identical to Phase 2 output | Identical to Phase 2 output |
| Phase 4 | Pure refactor | Identical to Phase 3 output | Identical to Phase 3 output |
Per-phase verification (Phases 2-4, pure refactors)
After each extraction phase:
cd integrations/entra-servicenow && pytest— all 296+ tests passcd integrations/azure-foundry && pytest— all tests passcd integrations/entra-servicenow && make lint— cleancd integrations/azure-foundry && make lint— clean- Both connectors produce identical NormalizedGraph output compared to previous phase baseline:
# Save baseline before starting phase
entra-servicenow --all --graph-json reports/baseline-phaseN.json
azure-foundry --all --graph-json reports/baseline-phaseN.json
# After extraction: strict diff (must be empty)
entra-servicenow --all --graph-json reports/after-phaseN.json
diff <(jq -S . reports/baseline-phaseN.json) <(jq -S . reports/after-phaseN.json)
# Exit code 0 = identical
Phase 1 specific verification (bug fix — expected diff)
Phase 1 intentionally changes azure-foundry output. The verification uses an expected-diff policy rather than strict equivalence:
# Step 1: Save azure-foundry baseline BEFORE Phase 1
azure-foundry --all --graph-json reports/foundry-before.json
# Step 2: Apply Phase 1, then generate new output
azure-foundry --all --graph-json reports/foundry-after.json
# Step 3: Extract permission node diffs only
python3 -c "
import json
before = json.load(open('reports/foundry-before.json'))
after = json.load(open('reports/foundry-after.json'))
before_perms = {n['nodeId']: n for n in before['nodes'] if n.get('nodeType') == 'permission'}
after_perms = {n['nodeId']: n for n in after['nodes'] if n.get('nodeType') == 'permission'}
# Verify: only normalized_action changed, and only from wrong to correct values
EXPECTED_CHANGES = {
'use': {'read', 'admin', 'execute'}, # 'use' was the bug; these are valid replacements
}
for nid in after_perms:
old_action = before_perms.get(nid, {}).get('properties', {}).get('normalized_action')
new_action = after_perms[nid]['properties'].get('normalized_action')
if old_action != new_action:
assert old_action in EXPECTED_CHANGES, f'Unexpected change from {old_action}'
assert new_action in EXPECTED_CHANGES[old_action], f'Unexpected new value {new_action}'
print(f' EXPECTED: {nid}: {old_action!r} -> {new_action!r}')
# Verify: non-permission nodes are identical
before_other = [n for n in before['nodes'] if n.get('nodeType') != 'permission']
after_other = [n for n in after['nodes'] if n.get('nodeType') != 'permission']
assert before_other == after_other, 'Non-permission nodes changed unexpectedly'
print('Phase 1 verification passed: only expected permission changes')
"
# Step 4: entra-servicenow must be STRICTLY identical (no bug fix, pure extraction)
entra-servicenow --all --graph-json reports/entra-before.json
# ... apply Phase 1 ...
entra-servicenow --all --graph-json reports/entra-after.json
diff <(jq -S . reports/entra-before.json) <(jq -S . reports/entra-after.json)
# Must be empty — entra-servicenow already had correct normalization
9. Files Changed (Per Phase)
Phase 1
| File | Change |
|---|---|
shared/sv0_common/__init__.py (new) | Package init |
shared/sv0_common/pyproject.toml (new) | Package definition (requests only) |
shared/sv0_azure/__init__.py (new) | Package init |
shared/sv0_azure/pyproject.toml (new) | Package definition (azure-identity, sv0-common) |
shared/sv0_azure/arm_roles.py (new) | Unified role actions, sensitivity, NormalizedAction, normalize_arm_action() |
integrations/azure-foundry/src/.../transformer.py | Import from sv0_azure.arm_roles, delete local _normalize_arm_action() |
integrations/entra-servicenow/src/.../transformer.py | Import from sv0_azure.arm_roles, delete local constants |
integrations/entra-servicenow/requirements.txt | Add -e ../../shared/sv0_common and -e ../../shared/sv0_azure |
integrations/entra-servicenow/requirements-dev.txt | (no change — shared packages are production deps) |
.github/workflows/entra-servicenow-ci.yml | Add shared/** to path filters, add shared install step |
.github/workflows/entra-servicenow-quality.yml | Add shared/** to path filters, add shared install step |
.github/workflows/entra-servicenow-scan.yml | Add shared/** to path filters, add shared install step |
.github/workflows/azure-foundry-ci.yml (new) | CI workflow for azure-foundry (path filters: integrations/azure-foundry/**, shared/**) |
Phase 2
| File | Change |
|---|---|
shared/sv0_azure/arm_scope.py (new) | ArmScope dataclass, parse_arm_scope(), scope_hash() |
shared/sv0_azure/arm.py (new) | Unified ArmClient with merged 60+ well-known role cache |
integrations/azure-foundry/src/.../arm_client.py | Delete (replaced by shared) |
integrations/entra-servicenow/src/.../azure_client.py | Remove ARM methods, import shared ArmClient |
| Both transformers | Use parse_arm_scope() from shared |
Phase 3
| File | Change |
|---|---|
shared/sv0_azure/auth.py (new) | Credential + session factory |
shared/sv0_azure/entra.py (new) | Entra Graph client + dataclasses |
integrations/azure-foundry/src/.../azure_client.py | Delete (replaced by shared) |
integrations/entra-servicenow/src/.../azure_client.py | Remove duplicated methods, keep bulk scan methods |
Phase 4
| File | Change |
|---|---|
shared/sv0_azure/node_ids.py (new) | Canonical node ID functions |
shared/sv0_common/platform_client.py (new) | Lift from azure-foundry (Azure-independent) |
| Both transformers + edge resolvers | Use shared node ID functions |
integrations/azure-foundry/src/.../platform_client.py | Delete (moved to shared) |
Platform changes: None required. The shared module affects only connector internals. NormalizedGraph output format is unchanged (except the intentional Phase 1 bug fix in azure-foundry).
10. Peer Review Findings (v1 → v2)
| # | Severity | Finding | Resolution |
|---|---|---|---|
| 1 | Critical | sv0-azure @ file://../../shared/sv0_azure is invalid pip syntax — file:// with relative paths is rejected as "non-local file URIs are not supported" | Removed file:// URI from pyproject.toml. Connectors declare sv0-azure>=0.1.0 as a named dependency; the editable install (pip install -e ../../shared/sv0_azure) satisfies it. CI installs shared packages explicitly before the connector. (Section 2, 6.1) |
| 2 | High | CI workflows install from requirements.txt / requirements-dev.txt, not pyproject extras — shared imports would fail in CI | Added -e ../../shared/sv0_common and -e ../../shared/sv0_azure to requirements.txt. Added explicit shared install steps to quality workflow. (Section 6.1) |
| 3 | High | Workflow path filters only match integrations/entra-servicenow/** — changes to shared/ won't trigger tests | Added shared/** to path filters in all three entra-servicenow workflows. Created new azure-foundry-ci.yml workflow with both integrations/azure-foundry/** and shared/** triggers. (Section 6.2) |
| 4 | High | Verification plan requires "identical NormalizedGraph output" but Phase 1 intentionally changes azure-foundry normalization to fix the bug | Replaced strict-equivalence with a two-tier verification policy: Phase 1 uses an expected-diff policy (only normalized_action changes from "use" to correct values are allowed); Phases 2-4 use strict graph equivalence against the previous phase baseline. entra-servicenow is always strict (no output change). (Section 8) |
| 5 | Medium | resolve_function_app_identity() listed in shared ArmClient spec (section 3.2) but migration says it may stay local (section 4, Phase 2 step 5) — contradictory | Removed from shared ArmClient spec. It stays in entra-servicenow's azure_client.py — it is Function App-specific discovery logic that only one connector uses, composing on top of shared ArmClient.list_subscriptions(). (Section 3.2) |
| 6 | Medium | ArmScope.resource_type and resource_name are required strings but subscription/RG scopes have no resource type — type inconsistency | Changed to `str |
| 7 | Medium | Unknown-role fallback to ["read", "write"] is conservative but creates noise for custom roles with no way to distinguish inferred from explicit | Added NormalizedAction dataclass with is_fallback: bool field. Connectors propagate this into permission node properties (e.g. "action_classification": "fallback") so downstream consumers can filter or flag. (Section 3.4) |
| 8 | Medium | platform_client.py in sv0_azure is a leaky abstraction — non-Azure connectors would need to depend on an Azure package for generic HTTP transport | Moved platform_client.py to new shared/sv0_common/ package. sv0_common has no Azure dependency (only requests). sv0_azure depends on sv0_common. Any future non-Azure connector can depend on sv0_common alone. (Section 2) |
Open questions (resolved)
| Question | Answer |
|---|---|
Should requirements.txt or pyproject.toml be the canonical dependency source? | Both coexist today and serve different CI workflows. Keep both; add shared package references to requirements.txt for ci/scan workflows and explicit install steps for quality workflow. |
| Should azure-foundry get its own CI workflow? | Yes. It has none today, and shared code changes must be validated against both consumers. Phase 1 creates azure-foundry-ci.yml. |
Is setuptools or hatchling the right build backend for shared packages? | setuptools — both existing connectors use it. No reason to introduce a new build tool. |