Critical Architectural Review: SaaS Connector ETL Pipeline
Date: 2026-02-20
Reviewer: Staff+ Software Architect
Focus: CISO Security Perspective (Execution-risk + Auditability) under SaaS Constraints
A. High-Level Architecture Map
Components:
- Source System (ServiceNow): Contains identities (sys_user), workloads (sys_script, sysauto_script), outbound integrations (sys_rest_message), and credentials (oauth_entity). Constraint: Read-only access via API. No custom instrumentation or structural changes allowed.
- Target System (Azure AI Foundry / Fabric): Ingress point for ServiceNow calls. Authenticates via Service Principals (SPN) / Managed Identities. Executes jobs, maintains agents. Constraint: Read-only access via API.
- Execution/Storage (SV0 Platform): Ingests the Normalized Graph via REST API (/api/v1/ingest/normalized-graph), applies diffs, evaluates rules, and generates SHA256-hashed evidence_packs.
SaaS Visibility Boundaries:
As a SaaS provider, we are bounded by the default telemetry emitted by customer environments. We cannot force customers to inject X-Correlation-ID headers or alter their core logging levels simply to satisfy our graph linkage.
B. Execution-Path Walkthrough (SaaS Reality)
- Trigger: A ServiceNow Business Rule fires on a table event.
  - Evidence: SN sys_audit or sysevent tables (if enabled for the target table).
- Workload Execution: The Business Rule calls a Script Include, which invokes a sys_rest_message.
  - Evidence: Difficult to prove definitively without outbound HTTP logs. We rely on structural scraping (the script can call this endpoint) rather than runtime proof (the script did call this endpoint).
- Outbound Call (Egress) -> Azure Ingress: The REST Message uses an oauth_profile to get a token and calls Azure.
  - Evidence: Entra ID Sign-in logs for the corresponding Service Principal (client_id).
- Platform Ingestion: Azure Connector constructs the NormalizedGraph representing the Azure state; ServiceNow Connector constructs the NormalizedGraph representing the SN state.
- Correlation (The SaaS Hop): Because we lack a shared trace ID across the SN-to-Azure boundary, correlation is probabilistic based on configuration rather than deterministic based on execution. We match the SN OAuth Entity client_id to the Entra ID Service Principal appId.
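The configuration-based match in the last step can be sketched as a simple join. This is a minimal illustration, not the platform's actual correlator: the `OAuthEntity` and `ServicePrincipal` record shapes and the `link_by_client_id` helper are hypothetical, and normalizing the IDs by trimming whitespace and lowercasing is an assumption about how the two systems format them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OAuthEntity:
    """Hypothetical projection of a ServiceNow oauth_entity record."""
    sys_id: str
    client_id: str


@dataclass(frozen=True)
class ServicePrincipal:
    """Hypothetical projection of an Entra ID service principal."""
    object_id: str
    app_id: str


def _norm(identifier: str) -> str:
    # Assumption: IDs are compared case-insensitively, ignoring stray whitespace.
    return identifier.strip().lower()


def link_by_client_id(entities, principals):
    """Return (oauth_entity.sys_id, service_principal.object_id) pairs
    whose client_id and appId match after normalization."""
    by_app_id = {}
    for sp in principals:
        by_app_id.setdefault(_norm(sp.app_id), []).append(sp)

    links = []
    for ent in entities:
        for sp in by_app_id.get(_norm(ent.client_id), []):
            links.append((ent.sys_id, sp.object_id))
    return links
```

A link produced this way shows only that the credential is configured on both sides. It says nothing about whether a call was actually made.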
C. Evidence Gaps & The "Surfaced Uncertainty" Strategy
Since we cannot mandate configuration changes, we must adopt an "Embrace and Surface Uncertainty" strategy. When execution evidence is weak, the platform must explicitly downgrade the "Evidence Confidence" score of that specific authority_path.
| Evidence Gap | SaaS Reality | Mitigation Strategy (Platform UI & Logic) |
|---|---|---|
| 1. No Outbound Trace ID | We cannot force SN scripts to send a trace ID to Azure. The exact user trigger cannot be cryptographically linked to the exact Azure execution. | Surface the Gap: UI must display "Correlation: Structural (Implicit)" rather than "Deterministic". We prove the capability exists, not the exact causal chain per run. |
| 2. Unreliable Outbound Payloads | Customers rarely log full HTTP bodies out of ServiceNow due to PII/storage concerns. We cannot see what data was sent. | Compensating Control: Focus on State Diffing. Compare the target state in Fabric before and after the timestamp of the Entra Service Principal login to infer what was changed. |
| 3. Execution Fidelity Loss | We know a script contains .setValue(), but we don't know if that code path actually triggered in production. | Probabilistic Inference: Use time-window correlation. If SN sys_audit shows incident #123 updated at 10:01:00, and the Azure SPN logged in at 10:01:02, draw a dotted "Inferred Execution" edge in the graph. |
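One way to sketch the downgrade logic behind this table: each authority path receives a confidence label based on the strongest evidence available, and the UI text is derived from that label. The enum values mirror the ones proposed in the recommendations of this review. The `classify_path` function, its two boolean inputs, and the precedence between them are illustrative assumptions.

```python
from enum import Enum


class EvidenceConfidence(Enum):
    STRUCTURAL = "structural"
    TEMPORAL_INFERRED = "temporal_inferred"
    DETERMINISTIC = "deterministic"


# UI strings: the first two are taken from the table above,
# the deterministic label is an assumed counterpart.
UI_LABELS = {
    EvidenceConfidence.STRUCTURAL: "Correlation: Structural (Implicit)",
    EvidenceConfidence.TEMPORAL_INFERRED: "Inferred Execution",
    EvidenceConfidence.DETERMINISTIC: "Correlation: Deterministic",
}


def classify_path(has_shared_trace_id: bool, has_temporal_match: bool) -> EvidenceConfidence:
    """Pick a confidence label for an authority path (hypothetical precedence)."""
    if has_shared_trace_id:
        # Only a trace ID observed on both sides proves the causal chain.
        return EvidenceConfidence.DETERMINISTIC
    if has_temporal_match:
        # Timestamps line up, but the link is still an inference.
        return EvidenceConfidence.TEMPORAL_INFERRED
    # Configuration shows the capability exists and nothing more.
    return EvidenceConfidence.STRUCTURAL
```

Under the SaaS constraints described here, `has_shared_trace_id` would be false for most customers, so the default outcome is a downgraded label and not a silent assumption of certainty.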
D. Deterministic Correlation Strategy (Without Instrumentation)
Since we cannot rely on standard Span IDs propagated across customer systems, our correlation engine (correlator.py) must be hardened to use environmental anchors:
- Identity Anchoring (Strong): The primary linchpin is the OAuth client_id. If a ServiceNow oauth_entity holds the credential for Entra ID Application X, any execution in Azure by X provides a strong structural link back to the ServiceNow tenant.
- Temporal Slicing (Weak/Probabilistic):
  - Connector A (ServiceNow) pulls the sys_updated_on for business rules.
  - Connector B (Azure) pulls the last_sign_in_at for Service Principals.
  - If a business rule triggers, resulting in an outbound call, we attempt to match the SN event timestamp with the Entra sign-in timestamp (within a 5-minute sliding window).
  - UI Representation: These temporal links must be visually distinct (e.g., dashed lines) from deterministic structural links (solid lines) in the SV0 Platform UI.
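The temporal slicing step could look roughly like the sketch below. It is not the contents of correlator.py: the `temporal_matches` function is hypothetical, and it assumes that a sign-in must occur at or after the ServiceNow event to count as a match. The review specifies only the 5-minute window, not its direction.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=5)  # window size taken from the review


def temporal_matches(sn_events, sign_ins, window=WINDOW):
    """Pair each ServiceNow event timestamp with every Entra sign-in
    timestamp that falls in [event, event + window]."""
    ordered_sign_ins = sorted(sign_ins)
    pairs = []
    for event in sorted(sn_events):
        for sign_in in ordered_sign_ins:
            if sign_in < event:
                continue  # assumption: the sign-in cannot precede its trigger
            if sign_in - event > window:
                break  # sorted input: later sign-ins are even further away
            pairs.append((event, sign_in))
    return pairs


if __name__ == "__main__":
    event = datetime(2026, 2, 20, 10, 1, 0)
    sign_in = datetime(2026, 2, 20, 10, 1, 2)
    print(temporal_matches([event], [sign_in]))
```

Every pair returned is a candidate for a dashed "inferred" edge only. On a busy tenant, several events may match the same sign-in, which is why these links should never be presented as deterministic.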
E. SaaS-Grade ETL Data Correctness
We cannot control the source telemetry, but we must absolutely control our ingestion pipeline's integrity.
Run Manifests & Invariants:
Every connector_syncs document requires a cryptographic manifest:
- Inputs: SHA256 of the raw Extracted Graph received via /api/v1/ingest/normalized-graph.
- Outputs: Entity/Edge counts, sum of execution_evidence nodes committed to the DB.
- Invariant: nodesCreated + nodesUpdated + nodesDeleted == Total Discovered Nodes. If this invariant fails, the sync_ingestion queue worker must halt and mark the sync as degraded. This guarantees that we lose none of the data the customer did give us.
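A minimal sketch of such a manifest and its invariant check follows. The field names other than nodesCreated, nodesUpdated, and nodesDeleted are assumptions, as are the `build_manifest` and `verify_manifest` helpers and the `InvariantViolation` exception. The real connector_syncs schema may differ.

```python
import hashlib


class InvariantViolation(Exception):
    """Raised so the (hypothetical) queue worker halts the sync."""


def build_manifest(raw_payload: bytes, created: int, updated: int,
                   deleted: int, discovered: int) -> dict:
    """Record the input hash and output counts for one sync run."""
    return {
        "inputSha256": hashlib.sha256(raw_payload).hexdigest(),
        "nodesCreated": created,
        "nodesUpdated": updated,
        "nodesDeleted": deleted,
        "totalDiscovered": discovered,  # assumed field name
        "status": "pending",            # assumed field name
    }


def verify_manifest(manifest: dict) -> None:
    """Enforce nodesCreated + nodesUpdated + nodesDeleted == total discovered."""
    accounted = (manifest["nodesCreated"]
                 + manifest["nodesUpdated"]
                 + manifest["nodesDeleted"])
    if accounted != manifest["totalDiscovered"]:
        manifest["status"] = "degraded"
        raise InvariantViolation(
            f"accounted for {accounted} nodes, "
            f"discovered {manifest['totalDiscovered']}"
        )
    manifest["status"] = "ok"
```

Raising an exception after marking the manifest keeps the two required behaviors together: the sync is flagged as degraded, and the worker cannot continue past the failed check by accident.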
F. Threat Model: Execution Evidence Fraud
Threat: A malicious actor compromises the customer's ServiceNow instance and alters the sys_audit logs or script definitions to hide unauthorized calls to Azure Fabric.
SaaS Defense:
- Immutability of the Target: We pull Entra ID sign-in logs and Azure Activity logs independently of ServiceNow. Even if SN logs are wiped, the Entra SP login and subsequent Fabric actions are recorded on the Microsoft side.
- Cross-Checking: The SV0 Platform flags anomalies. If Entra shows 500 logins for SPN 'A', but ServiceNow shows 0 business rule triggers for the associated OAuth entity, an "Evidence Discrepancy" alert is generated.
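The cross-check can be illustrated with a small comparison of per-credential counts. The `evidence_discrepancies` function and its input shape (a mapping from client ID to count) are hypothetical. Treating any non-zero sign-in count with zero ServiceNow triggers as a discrepancy is also an assumption, since the review gives only the 500-versus-0 example.

```python
def evidence_discrepancies(entra_sign_ins: dict, sn_triggers: dict) -> list:
    """Return client IDs that signed in on the Entra side while
    ServiceNow recorded no business rule triggers for them."""
    return sorted(
        client_id
        for client_id, sign_in_count in entra_sign_ins.items()
        if sign_in_count > 0 and sn_triggers.get(client_id, 0) == 0
    )


if __name__ == "__main__":
    entra = {"A": 500, "B": 3}
    servicenow = {"B": 3}
    # Each returned ID would raise an "Evidence Discrepancy" alert.
    print(evidence_discrepancies(entra, servicenow))
```

A production rule would likely need a tolerance and a time range, because the same Service Principal may legitimately be used by callers other than ServiceNow.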
G. Recommendations & Product Roadmap
1. "Best Effort" Advisor Engine:
Build an advisory module within the platform. When viewing an authority path, if evidence is weak, display a contextual recommendation: "To achieve deterministic tracing for this path, advise the customer to enable glide.outbound_http.log.body for REST message X." This turns a gap into a consulting opportunity rather than a technical blocker.
2. Visualizing "Confidence Scores":
The NormalizedGraph schema must be updated to include an evidenceConfidence enum (e.g., STRUCTURAL, TEMPORAL_INFERRED, DETERMINISTIC). The frontend UI should use this score to color-code the reliability of the execution chains presented to the auditor.
3. State-Delta Inferencing:
Shift focus away from perfect execution tracing. If we can't prove how the data was changed (due to lack of trace IDs), double down on proving what changed by ensuring the Diff Engine in the SV0 platform takes extremely high-fidelity snapshots of the source and target states.
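As a rough illustration of state-delta inference, the sketch below compares two snapshots keyed by entity ID. It is not the SV0 Diff Engine: the `diff_snapshots` function and the flat dictionary snapshot shape are assumptions made for brevity.

```python
def diff_snapshots(before: dict, after: dict) -> dict:
    """Compare two snapshots (entity ID -> state) and report what changed."""
    before_keys = before.keys()
    after_keys = after.keys()
    return {
        # Entities present only in the later snapshot.
        "added": {k: after[k] for k in after_keys - before_keys},
        # Entities present only in the earlier snapshot.
        "removed": {k: before[k] for k in before_keys - after_keys},
        # Entities present in both whose state differs: (old, new).
        "changed": {
            k: (before[k], after[k])
            for k in before_keys & after_keys
            if before[k] != after[k]
        },
    }
```

Taking the "before" and "after" snapshots on either side of an Entra Service Principal sign-in timestamp gives evidence of what changed even when the causal chain itself cannot be proven.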