
ADR-004: Import-by-Type Connector Architecture

Status

Accepted (2026-02-10)

Context

The Entra-ServiceNow connector originally used two heavy dataclasses — Integration and ExecutionChain — that pre-linked discovered entities into chains before the transformer could process them. For example, an ExecutionChain bundled together a REST Message, its OAuth entity, matching Azure SP, calling Business Rules, Script Includes, Scheduled Jobs, owner details, and sign-in data into a single object.

The transformer then decomposed these pre-linked chains back into individual NormalizedGraph nodes and edges. This round-trip (link → decompose) was redundant and made the codebase harder to reason about:

  • Adding a new entity type (e.g., Flow Designer flows) required threading it through the chain-building logic
  • The correlator mixed two concerns: entity discovery and relationship resolution
  • Test scenarios needed to construct complex chain objects with many nested fields
  • The data flow was opaque — it wasn't clear which entities came from which source

Decision

Replace pre-linked chains with flat entity discovery (DiscoveredEntities) and explicit edge resolution (EdgeResolver).

New data flow

ServiceNow (SN) client → entity dicts by type → EdgeResolver (client_id match, script-text search) → DiscoveredEntities → Transformer → NormalizedGraph

Key types

  • DiscoveredEntities: Flat container with lists of entities by type (business_rules, script_includes, scheduled_jobs, flows, rest_messages, oauth_entities, azure_sps, etc.) plus resolved edge lists (auth_edges, caller_edges)
  • EdgeResolver: Explicit resolution of cross-entity relationships. resolve_auth_edges() matches OAuth entities to Azure SPs by client_id. resolve_caller_edges() matches automations to REST messages by script-text search.
  • ResolvedEdge: Edge with provenance properties (matching field, matching value, issuing system) preserved for evidence packs
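As a rough illustration of how these three types could fit together, here is a minimal sketch. Only the class names, the list names, and the resolve_auth_edges() method name come from this ADR. The dict keys ("id", "client_id"), the source_id/target_id fields, the property key names, and the "entra" label are assumptions made for the example, and the real DiscoveredEntities has more lists than shown.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ResolvedEdge:
    """Edge between two entities; provenance lives in `properties`."""
    source_id: str  # assumed field name
    target_id: str  # assumed field name
    properties: dict[str, Any] = field(default_factory=dict)


@dataclass
class DiscoveredEntities:
    """Flat container: one list per entity type, plus resolved edge lists."""
    rest_messages: list[dict] = field(default_factory=list)
    oauth_entities: list[dict] = field(default_factory=list)
    azure_sps: list[dict] = field(default_factory=list)
    auth_edges: list[ResolvedEdge] = field(default_factory=list)
    caller_edges: list[ResolvedEdge] = field(default_factory=list)


class EdgeResolver:
    def resolve_auth_edges(
        self, oauth_entities: list[dict], azure_sps: list[dict]
    ) -> list[ResolvedEdge]:
        """Match OAuth entities to Azure SPs that share a client_id."""
        # Index SPs once so each OAuth entity is a dictionary lookup.
        sps_by_client_id = {
            sp["client_id"]: sp for sp in azure_sps if sp.get("client_id")
        }
        edges = []
        for oauth in oauth_entities:
            client_id = oauth.get("client_id")
            sp = sps_by_client_id.get(client_id)
            if sp is None:
                continue  # unmatched entities stay in their list, without an edge
            edges.append(ResolvedEdge(
                source_id=oauth["id"],
                target_id=sp["id"],
                properties={
                    "matching_field": "client_id",
                    "matching_value": client_id,
                    "issuing_system": "entra",  # assumed label
                },
            ))
        return edges


entities = DiscoveredEntities(
    oauth_entities=[
        {"id": "oauth-1", "client_id": "abc-123"},
        {"id": "oauth-2", "client_id": "no-match"},
    ],
    azure_sps=[{"id": "sp-1", "client_id": "abc-123"}],
)
entities.auth_edges = EdgeResolver().resolve_auth_edges(
    entities.oauth_entities, entities.azure_sps
)
```

In this sketch an OAuth entity with no matching SP is still present in oauth_entities; it simply has no auth edge, which keeps discovery and relationship resolution separate.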

Migration strategy

The transformer has a transform_entities() method that accepts DiscoveredEntities and internally bridges to the existing transform() method via _build_legacy_objects(). This adapter pattern guarantees output parity during migration. The legacy Integration/ExecutionChain classes remain in the codebase for the adapter but are no longer the primary interface.
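The bridging described above might look roughly like the following. This is a heavily simplified sketch, not the actual transformer: the method names transform_entities(), transform(), and _build_legacy_objects() come from this ADR, but the ExecutionChain shown here is a two-field stand-in, auth edges are reduced to (oauth_id, sp_id) pairs, and the output is a plain dict rather than a NormalizedGraph.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DiscoveredEntities:
    oauth_entities: list[dict] = field(default_factory=list)
    azure_sps: list[dict] = field(default_factory=list)
    # Simplified for the sketch: (oauth_id, sp_id) pairs.
    auth_edges: list[tuple[str, str]] = field(default_factory=list)


@dataclass
class ExecutionChain:
    """Stand-in for the legacy pre-linked chain (the real one has many more fields)."""
    oauth_entity: dict
    azure_sp: Optional[dict] = None


class Transformer:
    def transform(self, chains: list[ExecutionChain]) -> dict:
        """Legacy path: decompose pre-linked chains into nodes and edges."""
        nodes, edges = set(), set()
        for chain in chains:
            nodes.add(chain.oauth_entity["id"])
            if chain.azure_sp is not None:
                nodes.add(chain.azure_sp["id"])
                edges.add((chain.oauth_entity["id"], chain.azure_sp["id"]))
        return {"nodes": sorted(nodes), "edges": sorted(edges)}

    def _build_legacy_objects(self, entities: DiscoveredEntities) -> list[ExecutionChain]:
        """Adapter: rebuild legacy chains from flat lists and resolved edges."""
        sps_by_id = {sp["id"]: sp for sp in entities.azure_sps}
        sp_id_for_oauth = dict(entities.auth_edges)
        return [
            ExecutionChain(oauth, sps_by_id.get(sp_id_for_oauth.get(oauth["id"])))
            for oauth in entities.oauth_entities
        ]

    def transform_entities(self, entities: DiscoveredEntities) -> dict:
        """New entry point: same output, because it delegates to transform()."""
        return self.transform(self._build_legacy_objects(entities))


oauth = {"id": "oauth-1"}
sp = {"id": "sp-1"}
flat = DiscoveredEntities(
    oauth_entities=[oauth], azure_sps=[sp], auth_edges=[("oauth-1", "sp-1")]
)
transformer = Transformer()
via_adapter = transformer.transform_entities(flat)
via_legacy = transformer.transform([ExecutionChain(oauth, sp)])
```

Because transform_entities() only reshapes its input and then calls transform(), a parity check reduces to comparing via_adapter with via_legacy for the same underlying entities.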

Rationale

Why flat entities over pre-linked chains

  • Simpler connector code: Discovery functions return entities directly without needing to know about chain structure
  • Explicit edge resolution: Cross-entity relationships are resolved in a dedicated step with clear matching rules, not buried in correlator logic
  • Testability: Entity builders in test scenarios are straightforward lists, not deeply nested objects
  • Extensibility: Adding new entity types (e.g., flows) just means adding a new list field to DiscoveredEntities
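To make the "clear matching rules" and "straightforward lists" points concrete, here is a hedged sketch of a script-text search and a test-style check against plain lists. The standalone function, the dict keys ("id", "name", "script"), and the plain substring rule are assumptions for illustration; the actual resolve_caller_edges() on EdgeResolver may match differently.

```python
def resolve_caller_edges(automations: list[dict], rest_messages: list[dict]) -> list[dict]:
    """Link an automation to a REST message when the message name appears in its script."""
    edges = []
    for automation in automations:
        script = automation.get("script") or ""
        for message in rest_messages:
            name = message.get("name")
            # Assumed matching rule: plain substring search on the script text.
            if name and name in script:
                edges.append({
                    "source_id": automation["id"],
                    "target_id": message["id"],
                    "properties": {
                        "matching_field": "script",
                        "matching_value": name,
                    },
                })
    return edges


# Test inputs are flat lists of dicts; no nested chain objects are needed.
business_rules = [
    {"id": "br-1", "script": "new sn_ws.RESTMessageV2('Entra Graph', 'get');"},
    {"id": "br-2", "script": "gs.info('no outbound call');"},
]
rest_messages = [{"id": "rm-1", "name": "Entra Graph"}]

edges = resolve_caller_edges(business_rules, rest_messages)
assert [(e["source_id"], e["target_id"]) for e in edges] == [("br-1", "rm-1")]
```

A new automation type such as flows could be passed through the same function as another list, which is the extensibility property the bullets above describe.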

Why preserve provenance on edges

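ResolvedEdge.properties carries evidence references (matching field, matching value, issuing/target system IDs) that flow through to evidence packs. Storing them in a dict with named keys, rather than a bare tuple, keeps each value labeled, so an evidence pack can state why an edge exists without relying on positional conventions.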
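One way to picture the benefit is a small helper that turns an edge into a human-readable evidence line. This is only a sketch: evidence_line() is a hypothetical helper, and the field names and property keys are assumed rather than taken from the codebase.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ResolvedEdge:
    source_id: str  # assumed field name
    target_id: str  # assumed field name
    properties: dict[str, Any] = field(default_factory=dict)


def evidence_line(edge: ResolvedEdge) -> str:
    """Hypothetical helper: explain why an edge exists, using its named properties."""
    props = edge.properties
    return (
        f"{edge.source_id} -> {edge.target_id}: "
        f"matched on {props.get('matching_field', 'unknown')}"
        f"={props.get('matching_value', 'unknown')!r} "
        f"(issuing system: {props.get('issuing_system', 'unknown')})"
    )


edge = ResolvedEdge(
    source_id="oauth-1",
    target_id="sp-1",
    properties={
        "matching_field": "client_id",
        "matching_value": "abc-123",
        "issuing_system": "entra",  # assumed label
    },
)
line = evidence_line(edge)
```

With a bare tuple such as ("client_id", "abc-123", "entra"), the helper would have to rely on position, and adding a new provenance value would shift every consumer. Named keys let new values be added without breaking existing readers.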

Why adapter pattern for migration

Rewriting the full transformer was an unnecessary risk. The adapter (_build_legacy_objects()) reconstructs legacy chain objects from flat entities and delegates to the battle-tested transform(), so both paths produce identical NormalizedGraph output.

Consequences

  • Positive: Connector code is ~40% simpler. Adding Flow Designer support required only adding a flows list to DiscoveredEntities and a few lines in the transformer.
  • Positive: Edge resolution logic is unit-testable in isolation (23 tests for EdgeResolver).
  • Negative: The adapter layer adds intermediate conversion overhead. This is acceptable for the current scale and can be removed when the transformer is rewritten to process entities directly.
  • Neutral: Legacy Integration/ExecutionChain classes remain in correlator.py for the adapter. They can be removed in a future cleanup.