Execution Flow Data Modeling — Integrator Analysis

Date: 2026-02-13
Author: Integrator (ServiceNow + Entra ID domain expert)
Status: Draft for Architect review
Related: 00-synthesis.md


Executive Summary

The connector currently collects, but does not fully emit, four kinds of execution provenance data:

  1. CALLS edges (BR→SI, Job→SI) — collected in correlator, consumed in transformer logic, but never emitted as NormalizedEdges
  2. Trigger record examples — fetched from source tables (INC0010023, etc.), stored in ExecutionChain.trigger_examples, never referenced
  3. Record mutation details — field-level writes, setWorkflow(false) suppression, mutation types — partially emitted as table resource nodes, losing granularity
  4. HTTP method details — REST message endpoint URLs emitted, but HTTP methods (GET/POST), headers, auth types stored but not exposed

These gaps prevent the platform from answering:

  • "What ServiceNow artifact actually invoked this REST message?" (BR→SI CALLS chain)
  • "Which real incident record triggered this automation?" (evidence trail from trigger to execution)
  • "What downstream triggers did this automation suppress with setWorkflow(false)?" (security-significant workflow bypass)
  • "What fields did this automation modify in the incident table?" (blast radius of local mutations)

Recommendation: Emit all execution provenance data as first-class entities and edges. The platform schema is extensible — no breaking changes required. Implementation effort: ~3-4 days connector work, minimal platform changes.


1. CALLS Edge Emission

Current State

What the connector does:

  • correlator.py:545-557 (EdgeResolver.resolve_indirect_caller_edges()) creates ResolvedEdge objects with edge_type="CALLS" for BR→SI and Job→SI relationships
  • transformer.py:228-234 — Consumes CALLS edges to build indirect_business_rules and indirect_scheduled_jobs lists
  • transformer.py:612-731 — Emits both BR and SI as separate nodes with parallel EXECUTES_ON → REST Message edges, collapsing the invocation chain

Problem: The actual call graph (BR invokes SI, SI invokes REST Message) is flattened to (BR → REST Message, SI → REST Message). The platform can see that both artifacts exist but cannot determine which one actually invoked the other.

Example from chain report:

BUSINESS RULE: Auto-route identity tickets via Entra
Calls: AzureGraphRouter ← This relationship is NOT in NormalizedGraph

SCRIPT INCLUDE: AzureGraphRouter

REST MESSAGE: Graph - sn-ticket-router

Emitted graph:

BR -[EXECUTES_ON]-> REST Message
SI -[EXECUTES_ON]-> REST Message ← Parallel edges, no BR→SI link
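
To make the loss concrete, here is a minimal sketch using plain tuples rather than the connector's real edge objects. The tuple layout and the helper name callers_of are assumptions for illustration only. It shows that the caller of an SI is recoverable only when a CALLS edge is present.

```python
# Hypothetical edge tuples: (source, edge_type, target).
# These are illustrative stand-ins, not NormalizedEdge objects.
flattened = [
    ("BR", "EXECUTES_ON", "REST Message"),
    ("SI", "EXECUTES_ON", "REST Message"),
]
with_calls = [
    ("BR", "CALLS", "SI"),
    ("SI", "EXECUTES_ON", "REST Message"),
]


def callers_of(edges, target):
    """Return the artifacts that invoke `target` through a CALLS edge."""
    return [src for src, edge_type, dst in edges
            if edge_type == "CALLS" and dst == target]


print(callers_of(flattened, "SI"))   # the flattened graph has no CALLS edges
print(callers_of(with_calls, "SI"))
```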

Technical Analysis

Platform schema compatibility: The platform's NormalizedEdgeType enum (ingestion/types.ts:12-26) does not currently include CALLS as a valid edge type. The enum is:

export type NormalizedEdgeType =
  | "OWNED_BY" | "BELONGS_TO" | "HAS_ROLE" | "GRANTS" | "APPLIES_TO"
  | "AUTHENTICATES_TO" | "AUTHENTICATES_VIA" | "EXECUTES_ON"
  | "RUNS_AS" | "TRIGGERS_ON" | "CREATED_BY"
  | "DELEGATES_TO" | "APPROVED_BY" | "MEMBER_OF";

Is this extensible? Yes — adding a new edge type is non-breaking. The platform ingestion layer validates edge types at runtime and would reject unknown types, but existing edges are unaffected. This is a schema extension, not a breaking change.
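
The runtime check described above can be sketched as follows. This is a Python stand-in for the TypeScript validator, and partition_edges is a hypothetical helper. Whether the real platform rejects a single edge or the whole payload is not stated in this document, so the sketch assumes per-edge rejection.

```python
# Edge types copied from the enum above; the validator itself is an assumption.
KNOWN_EDGE_TYPES = {
    "OWNED_BY", "BELONGS_TO", "HAS_ROLE", "GRANTS", "APPLIES_TO",
    "AUTHENTICATES_TO", "AUTHENTICATES_VIA", "EXECUTES_ON",
    "RUNS_AS", "TRIGGERS_ON", "CREATED_BY",
    "DELEGATES_TO", "APPROVED_BY", "MEMBER_OF",
}


def partition_edges(edges, allowed):
    """Split edges into (accepted, rejected) by edge type."""
    accepted = [e for e in edges if e["edge_type"] in allowed]
    rejected = [e for e in edges if e["edge_type"] not in allowed]
    return accepted, rejected


edges = [{"edge_type": "EXECUTES_ON"}, {"edge_type": "CALLS"}]

# Before the extension: CALLS is rejected, EXECUTES_ON is unaffected.
old_ok, old_rejected = partition_edges(edges, KNOWN_EDGE_TYPES)

# After the extension: both edges pass, and nothing about EXECUTES_ON changes.
new_ok, new_rejected = partition_edges(edges, KNOWN_EDGE_TYPES | {"CALLS"})
```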

Precedent: The RUNS_AS and TRIGGERS_ON edge types were added for automation modeling (01-data-model.md:535-541) without breaking existing integrations.

Proposed Solution

Step 1: Extend platform edge type enum (sv0-platform/src/ingestion/types.ts)

export type NormalizedEdgeType =
  | "OWNED_BY" | "BELONGS_TO" | "HAS_ROLE" | "GRANTS" | "APPLIES_TO"
  | "AUTHENTICATES_TO" | "AUTHENTICATES_VIA" | "EXECUTES_ON"
  | "RUNS_AS" | "TRIGGERS_ON" | "CREATED_BY"
  | "DELEGATES_TO" | "APPROVED_BY" | "MEMBER_OF"
  | "CALLS"; // ← New: automation invocation (BR→SI, Job→SI, SI→SI)

Step 2: Emit CALLS edges in transformer (transformer.py:~690, after SI node creation)

# In _process_execution_chain(), after processing script_includes:
for si in chain.script_includes:
    si_sys_id = si.get("sys_id", si.get("name", "unknown"))
    si_node_id = f"sn-si-{si_sys_id}"

    # ... existing SI node creation ...

    # NEW: Emit CALLS edges from BRs/Jobs that invoke this SI
    for caller_edge in entities.caller_edges:
        if (caller_edge.edge_type == "CALLS"
                and caller_edge.target_id == si_sys_id
                and caller_edge.properties.get("target_type") == "script_include"):

            caller_type = caller_edge.properties.get("caller_type")
            caller_prefix = {
                "business_rule": "sn-br-",
                "scheduled_job": "sn-job-",
            }.get(caller_type, "")

            if caller_prefix:
                caller_node_id = f"{caller_prefix}{caller_edge.source_id}"
                self._add_edge(
                    edge_type="CALLS",
                    source_node_id=caller_node_id,
                    target_node_id=si_node_id,
                    properties={
                        "caller_type": caller_type,
                        "target_name": si.get("name", ""),
                    },
                )

Step 3: Update path materializer (sv0-platform/src/domain/graph/path-materializer.ts)

// In _traverseExecutionPath(), add CALLS edge handling:
if (edge.edgeType === 'CALLS') {
  // Follow invocation chain: BR → SI → REST Message
  // Don't consume auth depth budget (this is code invocation, not delegation)
  const targetNode = await this._getNode(edge.targetNodeId);
  if (targetNode) {
    path.push({ type: 'invocation', edge, node: targetNode });
    await this._traverseExecutionPath(targetNode, depth, maxDepth, visited, path);
  }
}

Semantic Considerations

CALLS vs EXECUTES_ON:

  • EXECUTES_ON — "this automation performs actions on this resource/endpoint" (writes to incident table, calls Graph API)
  • CALLS — "this automation invokes this other automation artifact" (BR invokes SI, SI invokes another SI)

CALLS vs RUNS_AS:

  • RUNS_AS — "this automation executes as this identity" (identity binding for permission inheritance)
  • CALLS — "this automation invokes this artifact" (code-level invocation, no identity change)

Path materializer impact: The path materializer must distinguish between:

  1. Code invocation (CALLS) — follow the chain without consuming auth depth
  2. Identity delegation (AUTHENTICATES_TO) — consumes auth depth, tracks trust chain position
  3. Permission inheritance (RUNS_AS) — borrows target identity's paths, doesn't consume depth

Why CALLS doesn't consume depth: It's not delegation — it's the same execution context invoking a subroutine. The BR and SI are both "system" code running in ServiceNow's server process. Only when the REST Message authenticates to an external system (AUTHENTICATES_TO) does the auth chain depth increment.
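
The depth rule can be sketched in a few lines of Python. This is a simplified model, not the platform's TypeScript path materializer: the graph shape (a dict of adjacency lists), the traverse function, and the default max_depth are assumptions. RUNS_AS path borrowing is not modeled here; RUNS_AS is simply treated as not consuming depth.

```python
# Assumption: only AUTHENTICATES_TO consumes auth depth, as described above.
DEPTH_CONSUMING = {"AUTHENTICATES_TO"}


def traverse(graph, node, depth=0, max_depth=2, visited=None):
    """Yield (node, auth_depth) pairs reachable from `node`.

    `graph` maps a node id to a list of (edge_type, target) pairs.
    """
    visited = set() if visited is None else visited
    if node in visited:
        return
    visited.add(node)
    yield node, depth
    for edge_type, target in graph.get(node, []):
        # CALLS, EXECUTES_ON and RUNS_AS keep the current depth.
        next_depth = depth + 1 if edge_type in DEPTH_CONSUMING else depth
        if next_depth > max_depth:
            continue
        yield from traverse(graph, target, next_depth, max_depth, visited)


graph = {
    "BR": [("CALLS", "SI")],
    "SI": [("EXECUTES_ON", "REST Message")],
    "REST Message": [("AUTHENTICATES_TO", "SP: sn-ticket-router")],
}
print(list(traverse(graph, "BR")))
```

In this model the BR, the SI, and the REST Message all sit at depth 0, and the depth increments only at the external authentication hop.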

Implementation Effort

| Component | Change | Effort | Risk |
|---|---|---|---|
| Platform types.ts | Add "CALLS" to NormalizedEdgeType enum | 5 min | None (non-breaking) |
| Platform path materializer | Add CALLS edge traversal logic | 1 hour | Low (similar to existing edge handlers) |
| Connector transformer.py | Emit CALLS edges from caller_edges list | 2 hours | Low (data already available) |
| Connector tests | Add test cases for CALLS edge emission | 1 hour | None |
| Platform tests | Add path materializer tests for CALLS | 1 hour | None |
| Total | | ~6 hours | Low |

Backward compatibility: Fully backward compatible. Existing graphs without CALLS edges continue to work. New connector syncs emit CALLS edges; platform path materializer follows them if present.


2. Trigger Record Modeling

Current State

What the connector collects: get_recent_trigger_examples() (servicenow_client.py:897-950) fetches:

# Fields: sys_id, table, number, short_description, created_by, created_on
# Example: INC0010023, "joiner alice.joiner@smedvedsecurityv0..."

Stored in ExecutionChain.trigger_examples (correlator.py:184) but never referenced in transformer.py.

What the chain report shows (but platform never sees):

┌─────────────────────────────────────────────────────────────────┐
│ TRIGGER RECORD: incident/INC0010023
│ Created by: admin
│ Created on: 2026-02-05 04:07:18
│ Description: joiner alice.joiner@smedvedsecurityv0.onmicro...
└──────────────────────────────┬──────────────────────────────────┘

│ BUSINESS RULE: Auto-route identity tickets via Entra

This is first-party execution evidence — proof that a specific record triggered this automation at a specific time.

Platform Entity Model

The platform already has execution_evidence as a first-class node type (ingestion/types.ts:7, 01-data-model.md:285-308):

export type NormalizedNodeType =
  | "autonomous_identity"
  | "human_identity"
  | "role" | "permission" | "resource" | "credential"
  | "execution_evidence"; // ← Already exists

ExecutionEvidence properties (01-data-model.md:298-307):

  • source_table — where the evidence was fetched from (e.g., incident)
  • source_record_id — ID in the source system (e.g., INC0010023)
  • source_timestamp — when the execution occurred
  • evidence_type — api_call, flow_execution, scheduled_job, sign_in, trigger_record (NEW)
  • action — what was done (e.g., "trigger:incident.insert")
  • target_resource — what was acted upon (e.g., "incident")
  • outcome — success, failure, unknown
  • payload_hash — SHA256 of the source record content

Relationship: Execution evidence is linked to identities via EXECUTES_ON edges (01-data-model.md:461).
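
The payload_hash property could be computed along these lines. The document only specifies "SHA256 of the source record content"; the canonical JSON serialization (sorted keys, compact separators) and the function name payload_hash are assumptions made so that the same record always hashes to the same value.

```python
import hashlib
import json


def payload_hash(record: dict) -> str:
    """Return the SHA-256 hex digest of a record.

    The record is serialized as canonical JSON (sorted keys, no extra
    whitespace) so that key order does not change the hash. This
    canonicalization is an assumption, not a documented platform rule.
    """
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


example = {"number": "INC0010023", "table": "incident", "created_by": "admin"}
print(payload_hash(example))
```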

Proposed Solution

Option A: Emit trigger records as execution_evidence nodes (RECOMMENDED)

Create an execution_evidence node for each trigger example and link it to the BR it triggered:

# In _process_execution_chain(), after BR node creation:
for br in chain.business_rules + chain.indirect_business_rules:
    # Mirrors the sys_id fallback used for script includes
    br_sys_id = br.get("sys_id", br.get("name", "unknown"))
    br_node_id = f"sn-br-{br_sys_id}"
    # ... existing BR node creation ...

    # NEW: Emit trigger record evidence
    for trigger_rec in chain.trigger_examples:
        if trigger_rec.get("table") == br.get("table"):  # Match on trigger table
            trigger_node_id = self._add_node(
                node_id=f"evidence-trigger-{trigger_rec['sys_id']}",
                node_type="execution_evidence",
                source_system="servicenow",
                source_id=trigger_rec["sys_id"],
                display_name=f"Trigger: {trigger_rec.get('number', trigger_rec['sys_id'])}",
                status="active",
                created_at=trigger_rec.get("created_on"),
                properties={
                    "source_table": trigger_rec["table"],
                    # Record number (e.g. INC0010023), falling back to sys_id
                    "source_record_id": trigger_rec.get("number", trigger_rec["sys_id"]),
                    "source_timestamp": trigger_rec.get("created_on"),
                    "evidence_type": "trigger_record",
                    "action": f"trigger:{trigger_rec['table']}.insert",
                    "target_resource": trigger_rec["table"],
                    "outcome": "success",
                    "created_by": trigger_rec.get("created_by", ""),
                    "short_description": trigger_rec.get("short_description", "")[:200],
                },
            )

            # Link trigger record → BR via TRIGGERS_ON (reuse existing edge type)
            self._add_edge(
                edge_type="TRIGGERS_ON",
                source_node_id=trigger_node_id,
                target_node_id=br_node_id,
                properties={
                    "triggerType": "record_event",
                    "timestamp": trigger_rec.get("created_on"),
                },
            )

Option B: Embed trigger examples in TRIGGERS_ON edge properties

Store trigger examples as a trigger_examples array in the existing TRIGGERS_ON edge properties:

self._add_edge(
    edge_type="TRIGGERS_ON",
    source_node_id=br_node_id,
    target_node_id=table_node_id,
    properties={
        "triggerType": "event",
        "events": events or ["record_change"],
        "trigger_examples": [  # NEW
            {
                "sys_id": rec["sys_id"],
                "number": rec.get("number"),
                "created_on": rec["created_on"],
                "created_by": rec["created_by"],
            }
            for rec in chain.trigger_examples[:5]  # Limit to 5 most recent
        ],
    },
)

Option C: New edge type TRIGGERED_BY

Introduce a dedicated edge type between the trigger record and the BR instead of reusing TRIGGERS_ON:

execution_evidence (trigger record) -[TRIGGERED_BY]-> BR

Recommendation: Option A (execution_evidence nodes)

Why:

  1. Consistency with existing patterns — Azure sign-ins are already modeled as execution_evidence nodes (transformer.py:1442-1492)
  2. First-class queryability — trigger records become searchable entities, not nested properties
  3. Evidence completeness tracking — can declare whether trigger examples are available/unavailable per entity type
  4. Temporal analysis — trigger timestamps become first-class temporal markers for drift detection

Data volume concern: If a BR triggers 10,000 times/day, emitting all trigger examples would create 10,000 evidence nodes per sync. Mitigation: Limit to N most recent examples (5-10) per BR, or add a connector config flag --include-trigger-examples (default: false).

Alternative for high-volume scenarios: Use Option B (embed in edge properties) for trigger-heavy automations, Option A for low-volume or security-critical triggers.
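
The "N most recent" mitigation could look like the sketch below. The helper name most_recent_examples is hypothetical. It assumes created_on is a 'YYYY-MM-DD HH:MM:SS' string, as in the chain report example, so that lexicographic order matches chronological order.

```python
def most_recent_examples(trigger_examples, limit=5):
    """Keep the `limit` newest trigger records.

    Assumes `created_on` is a 'YYYY-MM-DD HH:MM:SS' string, which sorts
    chronologically when compared as text. Records without a timestamp
    sort last.
    """
    ordered = sorted(trigger_examples,
                     key=lambda rec: rec.get("created_on", ""),
                     reverse=True)
    return ordered[:max(limit, 0)]


examples = [
    {"number": "INC0010021", "created_on": "2026-02-03 10:00:00"},
    {"number": "INC0010023", "created_on": "2026-02-05 04:07:18"},
    {"number": "INC0010022", "created_on": "2026-02-04 09:30:00"},
]
print([rec["number"] for rec in most_recent_examples(examples, limit=2)])
```

Capping in the transformer rather than in the client keeps the full list available to the chain report while bounding what reaches the platform.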

Implementation Effort

| Component | Change | Effort | Risk |
|---|---|---|---|
| Connector transformer.py | Add trigger record → execution_evidence node emission | 2 hours | Low (pattern exists for sign-ins) |
| Connector transformer.py | Link trigger evidence → BR via TRIGGERS_ON edge | 30 min | Low |
| Connector config | Add --max-trigger-examples flag (default: 5) | 30 min | None |
| Connector tests | Add test cases for trigger evidence emission | 1 hour | None |
| Platform ingestion | No changes (execution_evidence already supported) | 0 | None |
| Total | | ~4 hours | Low |

3. Record Mutation / Effect Modeling

Current State

What the connector collects: analyze_script_mutations() (servicenow_client.py:38-112) detects:

{
    "table": "incident",
    "fields_modified": ["assignment_group", "u_auto_routed", "work_notes"],
    "mutation_types": ["update"],
    "workflow_suppressed": True,  # setWorkflow(false) detected
}

Stored in ExecutionChain.local_mutations (correlator.py:207, transformer.py:802-817).

What the transformer emits: Only the table name becomes a resource node (transformer.py:802-817). Field-level writes and workflow suppression are lost.

Example from chain report (not in platform graph):

┌─────────────────────────────────────────────────────────────────┐
│ RECORD CHANGES (after API response): │
│ Table: incident
│ Fields: assignment_group, u_auto_routed, u_autorouted, work_notes
│ Action: update
│ ⚠ setWorkflow(false) — downstream triggers suppressed
└─────────────────────────────────────────────────────────────────┘

Semantic Question: What Are "Record Changes"?

Two interpretations:

  1. Egress effects — what the automation WRITES to the target system (Entra Graph, AWS, GitHub)

    • Example: "Creates user in Entra" or "Pushes to GitHub repo"
    • This is external blast radius (cross-system impact)
  2. Local effects — what the automation WRITES back to ServiceNow after processing

    • Example: "Updates incident.assignment_group after calling Entra Graph API"
    • This is internal blast radius (same-system side effects)

Chain report example is LOCAL: The BR calls Graph API (egress), receives response, then updates the incident record in ServiceNow (local mutation). The mutation is after the external call, not part of it.

Why this matters for modeling:

  • Egress effects are typically captured in the target resource (the external API endpoint)
  • Local effects need to be modeled as properties of the automation node or separate mutation edges

Proposed Solutions

Option A: Automation node properties (SIMPLEST)

Add mutation details as properties on the automation node:

br_node_id = self._add_node(
    node_id=f"sn-br-{br_sys_id}",
    node_type="autonomous_identity",
    source_system="servicenow",
    source_id=br_sys_id,
    display_name=br.get("name", "Unknown Business Rule"),
    status="active",
    properties={
        "identitySubtype": "business_rule",
        "automation_type": "business_rule",
        "table": br.get("table", ""),
        # NEW:
        "local_mutations": [
            {
                "table": "incident",
                "fields_modified": ["assignment_group", "u_auto_routed", "work_notes"],
                "mutation_types": ["update"],
                "workflow_suppressed": True,
            }
        ],
        "workflow_suppression_count": 1,  # Aggregate signal
    },
)

Pros:

  • No schema changes
  • Queryable via node properties filter
  • Low implementation effort

Cons:

  • Mutations not first-class entities (harder to query "all automations that modify assignment_group")
  • No temporal tracking of when mutations were detected

Option B: New edge type MODIFIES (MORE EXPRESSIVE)

Create edges from automation → table resource with mutation metadata:

for mutation in chain.local_mutations:
    table_node_id = self._add_node(
        node_id=f"sn-table-{mutation['table']}",
        node_type="resource",
        source_system="servicenow",
        source_id=mutation["table"],
        display_name=mutation["table"],
        status="active",
        properties={"resourceType": "table"},
    )

    self._add_edge(
        edge_type="MODIFIES",  # NEW edge type
        source_node_id=br_node_id,
        target_node_id=table_node_id,
        properties={
            "fields_modified": mutation["fields_modified"],
            "mutation_types": mutation["mutation_types"],
            "workflow_suppressed": mutation["workflow_suppressed"],
        },
    )

Pros:

  • First-class relationship (can query "all automations with a MODIFIES edge to incident")
  • Edge properties carry full mutation context
  • Enables blast radius queries: "show all tables modified by orphaned automations"

Cons:

  • Requires platform schema extension (add "MODIFIES" to NormalizedEdgeType)
  • Higher implementation effort (schema change + path materializer updates)

Option C: Mutation evidence nodes (MOST GRANULAR)

Create execution_evidence nodes for each mutation operation:

for mutation in chain.local_mutations:
    mutation_node_id = self._add_node(
        node_id=f"evidence-mutation-{br_sys_id}-{mutation['table']}",
        node_type="execution_evidence",
        source_system="servicenow",
        source_id=f"mutation:{br_sys_id}:{mutation['table']}",
        display_name=f"Mutation: {mutation['table']} by {br.get('name')}",
        status="active",
        properties={
            "source_table": mutation["table"],
            "evidence_type": "local_mutation",
            "action": f"{mutation['mutation_types'][0]}:{mutation['table']}",
            "target_resource": mutation["table"],
            "fields_modified": mutation["fields_modified"],
            "workflow_suppressed": mutation["workflow_suppressed"],
        },
    )

    self._add_edge(
        edge_type="EXECUTES_ON",
        source_node_id=br_node_id,
        target_node_id=mutation_node_id,
    )

Pros:

  • Most granular (one evidence node per mutation)
  • Consistent with other execution evidence patterns
  • Temporal tracking built-in (node createdAt)

Cons:

  • Highest node count (one per mutation per automation)
  • Execution evidence type may be overloaded (mixing "proof of execution" with "proof of mutation")

Recommendation: Option A for MVP, Option B for production

Short-term (MVP): Use Option A (node properties) to quickly surface mutation data without schema changes.

Long-term (production): Use Option B (MODIFIES edge) for:

  • First-class queryability
  • Blast radius analysis ("which automations modify HR tables?")
  • Path materializer integration (follow MODIFIES edges to compute local blast radius)

Why not Option C: Execution evidence is conceptually "proof that something happened" (sign-ins, trigger records, API calls). Mutations are what the automation does (capabilities), not evidence that it executed. Mixing these semantics would confuse the evidence model.
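
As an illustration of the queryability argument for Option B, the sketch below answers "which automations modify assignment_group on incident?" over a list of emitted edges. The edge dict shape mirrors the _add_edge keyword arguments used above; the function automations_modifying and the sample node ids are hypothetical.

```python
def automations_modifying(edges, table, field):
    """Return source node ids of MODIFIES edges that write `field` on `table`."""
    return sorted({
        e["source_node_id"]
        for e in edges
        if e["edge_type"] == "MODIFIES"
        and e["target_node_id"] == f"sn-table-{table}"
        and field in e["properties"].get("fields_modified", [])
    })


edges = [
    {"edge_type": "MODIFIES", "source_node_id": "sn-br-001",
     "target_node_id": "sn-table-incident",
     "properties": {"fields_modified": ["assignment_group", "work_notes"]}},
    {"edge_type": "MODIFIES", "source_node_id": "sn-br-002",
     "target_node_id": "sn-table-incident",
     "properties": {"fields_modified": ["work_notes"]}},
    {"edge_type": "TRIGGERS_ON", "source_node_id": "sn-br-001",
     "target_node_id": "sn-table-incident", "properties": {}},
]
print(automations_modifying(edges, "incident", "assignment_group"))
```

With Option A the same question requires scanning a nested list inside every automation node's properties.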

setWorkflow(false) — Security Significance

What it means: ServiceNow's setWorkflow(false) suppresses downstream Business Rules and workflows when a record is updated. This is security-significant because:

  1. Audit trails may be bypassed (no BR fires to log the change)
  2. Approval workflows may be skipped (e.g., assignment group change without manager approval)
  3. Notifications may be suppressed (on-call engineer not alerted)

How to surface this:

  • Option A: Add workflow_suppression_count to automation node properties (aggregate signal)
  • Option B: Add workflow_suppressed: true to MODIFIES edge properties (per-mutation granularity)
  • Finding rule: Create a workflow_suppression finding type: "This automation bypasses downstream audit/approval controls"

Recommendation: Use Option B (per-mutation flag) + create a platform finding rule that fires when workflow_suppressed: true is detected on a MODIFIES edge from an automation with ownership_status: orphaned or security_relevance: active_external.
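
A minimal sketch of the recommended rule, written in Python for consistency with the connector examples. The real rule would live in the platform evaluator; the function name, the nodes/edges input shapes, and the finding dict layout are assumptions.

```python
def workflow_suppression_findings(nodes, edges):
    """Flag MODIFIES edges with workflow_suppressed=True whose source
    automation is orphaned or has active external egress.

    nodes: dict mapping node_id -> properties dict
    edges: list of edge dicts (edge_type, source_node_id, target_node_id, properties)
    """
    findings = []
    for e in edges:
        if e["edge_type"] != "MODIFIES":
            continue
        if not e["properties"].get("workflow_suppressed"):
            continue
        props = nodes.get(e["source_node_id"], {})
        if (props.get("ownership_status") == "orphaned"
                or props.get("security_relevance") == "active_external"):
            findings.append({
                "finding_type": "workflow_suppression",
                "automation": e["source_node_id"],
                "table": e["target_node_id"],
            })
    return findings


nodes = {
    "sn-br-001": {"ownership_status": "orphaned"},
    "sn-br-002": {"ownership_status": "owned", "security_relevance": "internal"},
}
edges = [
    {"edge_type": "MODIFIES", "source_node_id": "sn-br-001",
     "target_node_id": "sn-table-incident",
     "properties": {"workflow_suppressed": True}},
    {"edge_type": "MODIFIES", "source_node_id": "sn-br-002",
     "target_node_id": "sn-table-incident",
     "properties": {"workflow_suppressed": True}},
]
print(workflow_suppression_findings(nodes, edges))
```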

Implementation Effort

| Component | Change | Effort | Risk |
|---|---|---|---|
| Option A (node properties) | | | |
| Connector transformer.py | Add local_mutations to automation node properties | 1 hour | Low |
| Connector tests | Test mutation property emission | 30 min | None |
| Option B (MODIFIES edge) | | | |
| Platform types.ts | Add "MODIFIES" to NormalizedEdgeType | 5 min | None |
| Connector transformer.py | Emit MODIFIES edges with mutation metadata | 2 hours | Low |
| Platform path materializer | Add MODIFIES edge traversal | 1 hour | Low |
| Connector tests | Test MODIFIES edge emission | 1 hour | None |
| Platform evaluator | Add workflow_suppression finding rule | 2 hours | Medium (new rule logic) |
| Total (Option A) | | ~1.5 hours | Low |
| Total (Option B) | | ~7 hours | Low |

4. HTTP Method Details

Current State

What the connector collects: ExecutionChain.http_methods (correlator.py:167) stores:

[
    {
        "name": "GET User",
        "http_method": "GET",
        "endpoint": "https://graph.microsoft.com/v1.0/users/{id}",
        "rest_endpoint": "https://graph.microsoft.com",
        "sys_id": "abc123...",
        "auth_type": "oauth2",
        "headers": {"Accept": "application/json"},
    },
    ...
]

What the transformer emits: Only the endpoint URL is included in the REST message resource node (transformer.py:576-601):

rest_msg_node_id = self._add_node(
    node_id=f"sn-restmsg-{rest_msg['sys_id']}",
    node_type="resource",
    source_system="servicenow",
    source_id=rest_msg["sys_id"],
    display_name=rest_msg.get("name", "Unknown REST Message"),
    status="active",
    properties={
        "resourceType": "rest_message",
        "endpoint_url": endpoint_url,  # ← Only URL, no method/headers
        "authentication_type": rest_msg.get("authentication_type", ""),
    },
)

Use Cases for HTTP Method Details

1. Egress classification accuracy:

  • GET https://graph.microsoft.com/v1.0/users — read-only, lower risk
  • POST https://graph.microsoft.com/v1.0/users — creates users, higher risk
  • DELETE https://graph.microsoft.com/v1.0/groups/{id}/members/{id} — removes group members, security-significant

2. Blast radius determination:

  • REST message "Graph API Sync" has 10 HTTP methods — 8 GET, 2 POST
  • Blast radius should reflect write operations, not just "accesses Graph API"

3. Evidence completeness:

  • REST message has 5 configured methods but only 2 have been used in the last 30 days (from execution evidence)
  • Scope drift: methods added over time without re-approval
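
The blast radius and evidence completeness use cases can be sketched together. The method dicts follow the ExecutionChain.http_methods shape shown earlier; the helper summarize_methods and the used_names input (method names observed in execution evidence) are assumptions, since this document does not say how usage is matched to methods.

```python
WRITE_METHODS = {"POST", "PUT", "PATCH", "DELETE"}


def summarize_methods(configured, used_names):
    """Summarize configured HTTP methods against observed usage.

    configured: list of method dicts (name, http_method, ...)
    used_names: set of method names seen in execution evidence (assumed input)
    """
    writes = [m for m in configured
              if m.get("http_method", "").upper() in WRITE_METHODS]
    unused = [m.get("name", "") for m in configured
              if m.get("name", "") not in used_names]
    return {
        "method_count": len(configured),
        "write_method_count": len(writes),
        "unused_methods": unused,
    }


configured = [
    {"name": "GET User", "http_method": "GET"},
    {"name": "Create User", "http_method": "POST"},
    {"name": "Remove Member", "http_method": "DELETE"},
]
print(summarize_methods(configured, used_names={"GET User"}))
```

Here two of three methods are writes and neither write has recent usage, which is the kind of configured-but-unused capability the scope drift bullet describes.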

Proposed Solution

Option: Add http_methods array to REST message node properties

rest_msg_node_id = self._add_node(
    node_id=f"sn-restmsg-{rest_msg['sys_id']}",
    node_type="resource",
    source_system="servicenow",
    source_id=rest_msg["sys_id"],
    display_name=rest_msg.get("name", "Unknown REST Message"),
    status="active",
    properties={
        "resourceType": "rest_message",
        "endpoint_url": endpoint_url,
        "authentication_type": rest_msg.get("authentication_type", ""),
        # NEW:
        "http_methods": [
            {
                "name": method.get("name", ""),
                "http_method": method.get("http_method", "GET"),
                "endpoint": method.get("endpoint", ""),
                "auth_type": method.get("auth_type", ""),
            }
            for method in chain.http_methods
        ],
        "method_count": len(chain.http_methods),
        "write_method_count": len([
            m for m in chain.http_methods
            if m.get("http_method") in ["POST", "PUT", "PATCH", "DELETE"]
        ]),
    },
)

Why not separate HTTP method nodes?

  • HTTP methods are not independent execution artifacts — they're configuration details of the REST message
  • Creating one node per method would inflate node count without adding execution path value
  • Methods don't have separate ownership/lifecycle — they inherit from the REST message

Egress classifier update: Modify egress_classifier.py to consider HTTP method when classifying risk:

def classify_egress(endpoint_url: str, instance_host: str, http_methods: list) -> dict:
    # ... existing URL-based classification ...

    # NEW: Adjust category based on methods
    write_methods = [m for m in http_methods if m.get("http_method") in ["POST", "PUT", "PATCH", "DELETE"]]
    if write_methods and category == "external":
        # External write operations are higher risk than reads
        return {
            "egress_category": "external_write",
            "write_method_count": len(write_methods),
        }

Implementation Effort

| Component | Change | Effort | Risk |
|---|---|---|---|
| Connector transformer.py | Add http_methods array to REST message properties | 1 hour | Low |
| Connector egress_classifier.py | Update classifier to consider HTTP methods | 2 hours | Medium (risk scoring logic) |
| Connector tests | Test HTTP method property emission | 30 min | None |
| Platform UI | Display HTTP methods in resource detail view | 1 hour | Low |
| Total | | ~4.5 hours | Low |

5. Data Quality: "automation_disabled" Owner

Context

The chain report shows:

REST MESSAGE: Graph - sn-ticket-router-no-owner
Created by: automation_disabled
Updated by: automation_disabled

BUSINESS RULE: Auto-route identity tickets-Enta-no-own
Created by: automation_disabled

Question: Is "automation_disabled" a real ServiceNow user, a system account, or a disabled account?

ServiceNow Ground Truth

Common ServiceNow system accounts:

| Username | Type | Purpose |
|---|---|---|
| admin | Human/system | Default admin account (often used for automation) |
| system | System | ServiceNow platform itself (scheduled jobs, system tasks) |
| guest | System | Unauthenticated access (usually disabled) |
| maint | System | Maintenance operations |
| integration_user | Human/service | Dedicated account for external integrations |

"automation_disabled" is NOT a standard ServiceNow system account. It's either:

  1. A custom service account created by an admin (likely disabled after use)
  2. A renamed/repurposed account
  3. An account name that suggests it SHOULD be disabled but may not be

How to verify: Query the sys_user table for this user:

// ServiceNow script to check account state
var gr = new GlideRecord('sys_user');
gr.addQuery('user_name', 'automation_disabled');
gr.query();
if (gr.next()) {
    gs.info('User: ' + gr.user_name);
    gs.info('Active: ' + gr.active);
    gs.info('Locked out: ' + gr.locked_out);
    gs.info('Last login: ' + gr.last_login_time);
    gs.info('Email: ' + gr.email);
}

Connector Behavior

Current state: The connector fetches sys_created_by and emits it as a human_identity node with status="disabled" if no user details are found (transformer.py:1320-1331):

# Conservative default: unknown creator account state should not be assumed active.
creator_node_id = self._add_node(
    node_id=f"sn-user-{creator_username}",
    node_type="human_identity",
    source_system="servicenow",
    source_id=creator_username,
    display_name=creator_username,
    status="disabled",  # ← Conservative default
    properties={},
)

Problem: If "automation_disabled" is a functional service account that's still active but named misleadingly, the connector incorrectly marks it as disabled, triggering a false-positive orphaned_ownership finding.

Solution: The connector should:

  1. Fetch full sys_user record for the creator
  2. Use the actual active field from ServiceNow to set status
  3. Log a warning if the username contains "disabled" but the account is active

Code change (servicenow_client.py):

def get_user_details(self, username: str) -> dict | None:
    """Get full sys_user record for a username."""
    users = self._get_table(
        "sys_user",
        query=f"user_name={username}",
        fields=["sys_id", "user_name", "name", "email", "active", "locked_out", "last_login_time"],
        limit=1,
    )
    if not users:
        logger.warning(f"User {username} not found in sys_user table")
        return None

    user = users[0]

    # Detect suspicious naming
    if "disabled" in username.lower() and user.get("active") == "true":
        logger.warning(
            f"User {username} has 'disabled' in name but account is ACTIVE "
            f"(last login: {user.get('last_login_time', 'never')})"
        )

    return user
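
On the transformer side, the fetched record could drive the node status roughly as follows. The helper status_from_user is hypothetical. Treating a locked-out account as disabled is an assumption, as is the 'true'/'false' string encoding of boolean fields (consistent with the comparison in get_user_details above).

```python
def status_from_user(user):
    """Map a sys_user record (or None) to a node status string.

    Assumes boolean fields arrive as the strings 'true'/'false'.
    A missing record keeps the existing conservative default.
    """
    if user is None:
        return "disabled"  # unknown account state is not assumed active
    active = str(user.get("active", "")).lower() == "true"
    locked = str(user.get("locked_out", "")).lower() == "true"
    # Assumption: a locked-out account is reported as disabled.
    return "active" if active and not locked else "disabled"


print(status_from_user({"user_name": "automation_disabled", "active": "true",
                        "locked_out": "false"}))
```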

Recommendation

Action: Enhance the connector to fetch full sys_user details for all creators and use the actual active field.

Finding rule: Create a suspicious_ownership finding that fires when:

  • An automation's creator username contains "disabled", "test", "temp", or "demo"
  • The account is still active
  • The automation is actively executing

Rationale: This indicates operational neglect — an account that should have been disabled after testing is now the de facto owner of production automation.
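
The three conditions can be expressed as a small predicate. This is a sketch only: the function name and the boolean inputs are assumptions, and plain substring matching will also flag names such as "contest_admin", so a production rule would likely need token-aware matching.

```python
# Tokens taken from the rule definition above.
SUSPICIOUS_TOKENS = ("disabled", "test", "temp", "demo")


def is_suspicious_owner(username, account_active, automation_executing):
    """Return True when all three suspicious_ownership conditions hold."""
    name = username.lower()
    has_token = any(token in name for token in SUSPICIOUS_TOKENS)
    return has_token and account_active and automation_executing


print(is_suspicious_owner("automation_disabled", True, True))
```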

Implementation Effort

| Component | Change | Effort | Risk |
|---|---|---|---|
| Connector servicenow_client.py | Add get_user_details() method | 1 hour | Low |
| Connector transformer.py | Use user.active field instead of conservative default | 1 hour | Low |
| Connector transformer.py | Log warning for suspicious usernames | 30 min | None |
| Platform evaluator | Add suspicious_ownership finding rule | 2 hours | Medium |
| Total | | ~4.5 hours | Low |

6. Implementation Plan

Phase 1: Quick Wins (No Platform Changes) — ~10 hours

Goal: Surface existing data without schema changes

  1. HTTP method details (4.5 hours)

    • Add http_methods array to REST message node properties
    • Update tests
  2. Local mutations as node properties (1.5 hours)

    • Add local_mutations array to automation node properties
    • Include workflow_suppression_count aggregate
  3. Trigger records as execution_evidence nodes (4 hours)

    • Emit trigger examples as execution_evidence nodes
    • Link via TRIGGERS_ON edge (reuse existing edge type)
    • Add --max-trigger-examples config flag

Deliverables: All execution provenance data visible in platform, no breaking changes.

Phase 2: Schema Extensions (Platform Changes Required) — ~13 hours

Goal: Make provenance data first-class queryable entities

  1. CALLS edge emission (6 hours)

    • Extend platform NormalizedEdgeType enum
    • Update path materializer to traverse CALLS edges
    • Emit CALLS edges in transformer
    • Add tests
  2. MODIFIES edge for local mutations (7 hours)

    • Extend platform NormalizedEdgeType enum
    • Emit MODIFIES edges with mutation metadata
    • Update path materializer to compute local blast radius
    • Add workflow_suppression finding rule
    • Add tests

Deliverables: Execution provenance is first-class in the data model, enabling advanced queries and findings.

Phase 3: Data Quality Improvements — ~4.5 hours

  1. Resolve "automation_disabled" ownership ambiguity (4.5 hours)
    • Enhance connector to fetch full sys_user details
    • Use actual active field instead of conservative default
    • Add suspicious_ownership finding rule

Deliverables: Accurate ownership status, fewer false positives.

Total Effort: ~28 hours (3.5 engineering days)

Risk Assessment:

  • Low risk: Phases 1 and 3 (no platform schema changes)
  • Medium risk: Phase 2 (schema extensions, but non-breaking)

Rollout Strategy:

  1. Phase 1 → Immediate (connector-only changes, backward compatible)
  2. Phase 2 → After Phase 1 validation (platform + connector coordination)
  3. Phase 3 → Parallel to Phase 2 (independent data quality work)

7. Backward Compatibility Analysis

Will adding new edges/nodes break existing ingestion?

Answer: No — fully backward compatible.

Why:

  1. New edge types (CALLS, MODIFIES) are additive — existing graphs without these edges continue to work
  2. New node types (execution_evidence for triggers) already exist in the platform schema
  3. New node properties (http_methods, local_mutations) are optional — ingestion doesn't enforce property schemas
  4. Path materializer changes are additive — new edge types are only traversed if present

Testing approach:

  1. Run existing connector test suite against new transformer code → all pass
  2. Ingest old graph (without new edges) into new platform → no errors
  3. Ingest new graph (with new edges) into old platform → edge validation rejects unknown types (expected)
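  4. Mixed deployment: new connector + old platform → same outcome as step 3: the platform's runtime validation rejects the new edge types rather than silently ignoring them, so the platform change must ship before the connector starts emitting them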

Migration path:

  • Day 1: Deploy Phase 1 connector changes (node properties only) — no platform coordination needed
  • Day 7: Deploy Phase 2 platform changes (new edge types); the connector is not yet emitting these edges
  • Day 8: Deploy Phase 2 connector changes (emit new edges) — platform already accepts them

Rollback safety: If new code causes issues, rolling back the connector is safe — old transformer will emit old schema, platform continues working.


8. Open Questions for Architect

  1. CALLS edge semantics: Should CALLS consume auth depth budget, or is it code-level invocation without delegation? (Recommendation: no depth consumption, similar to RUNS_AS)

  2. Trigger record volume: If a BR triggers 10,000 times/day, should we emit all trigger examples or limit to N most recent? (Recommendation: limit to 5-10 per automation, add config flag)

  3. Mutation evidence vs node properties: Is "local mutation" an execution capability (node property) or execution evidence (evidence node)? (Recommendation: capability, not evidence — use MODIFIES edge)

  4. workflow_suppression finding rule: Should this trigger only for orphaned automations, or any automation with external egress? (Recommendation: any automation with security_relevance: active_external + workflow_suppressed: true)

  5. HTTP method write classification: Should POST/PUT/PATCH/DELETE methods upgrade egress_category from "external" to "external_write"? (Recommendation: yes — write operations are higher risk)

  6. "automation_disabled" ownership: Should the connector emit a warning when a creator username contains "disabled" but the account is active? (Recommendation: yes — this is a data quality red flag)
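
For question 2, the recommended cap could look like the sketch below. The function name `cap_trigger_examples`, the `limit` parameter (standing in for the proposed config flag), and the two extra sample records are assumptions made for illustration. The timestamp format is the one shown for INC0010023 in Appendix A.

```python
from datetime import datetime

# Timestamp format as it appears in the document's example record.
SN_TIMESTAMP = "%Y-%m-%d %H:%M:%S"

def cap_trigger_examples(records, limit=5):
    """Keep only the `limit` most recently created trigger records."""
    def created_on(record):
        return datetime.strptime(record["sys_created_on"], SN_TIMESTAMP)
    return sorted(records, key=created_on, reverse=True)[:limit]

# Illustrative records; only INC0010023 comes from the document.
records = [
    {"number": "INC0010021", "sys_created_on": "2026-02-03 11:15:00"},
    {"number": "INC0010023", "sys_created_on": "2026-02-05 04:07:18"},
    {"number": "INC0010022", "sys_created_on": "2026-02-04 09:30:42"},
]
latest = cap_trigger_examples(records, limit=2)
print([r["number"] for r in latest])
```

Sorting by creation time rather than by record number avoids relying on number sequences being monotonic across tables.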


Appendix A: Example Transformed Graph (With Proposed Changes)

Before (current state):

BR: Auto-route identity tickets via Entra
-[EXECUTES_ON]-> REST Message: Graph - sn-ticket-router
-[TRIGGERS_ON]-> Table: incident
-[RUNS_AS]-> SP: sn-ticket-router

SI: AzureGraphRouter
-[EXECUTES_ON]-> REST Message: Graph - sn-ticket-router ← Duplicate edge
-[RUNS_AS]-> SP: sn-ticket-router

After (with CALLS, trigger evidence, mutations):

Trigger Evidence: incident/INC0010023
created_by: admin, created_on: 2026-02-05 04:07:18
-[TRIGGERS_ON]-> BR: Auto-route identity tickets via Entra

BR: Auto-route identity tickets via Entra
properties: {
local_mutations: [{table: "incident", fields: ["assignment_group", "work_notes"], workflow_suppressed: true}]
}
-[CALLS]-> SI: AzureGraphRouter ← NEW: explicit invocation
-[TRIGGERS_ON]-> Table: incident
-[RUNS_AS]-> SP: sn-ticket-router
-[MODIFIES]-> Table: incident (fields: ["assignment_group", "work_notes"], workflow_suppressed: true)

SI: AzureGraphRouter
-[EXECUTES_ON]-> REST Message: Graph - sn-ticket-router
-[RUNS_AS]-> SP: sn-ticket-router

REST Message: Graph - sn-ticket-router
properties: {
endpoint_url: "https://graph.microsoft.com",
http_methods: [
{name: "GET User", http_method: "GET", endpoint: "/v1.0/users/{id}"},
{name: "GET Groups", http_method: "GET", endpoint: "/v1.0/groups"}
],
method_count: 2,
write_method_count: 0
}
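
The method_count and write_method_count properties above could be derived as in the following sketch. It assumes the recommendation in open question 5 is accepted (write methods upgrade egress_category to "external_write"). The helper name `summarize_http_methods` is hypothetical.

```python
# Methods treated as writes, per open question 5.
WRITE_METHODS = {"POST", "PUT", "PATCH", "DELETE"}

def summarize_http_methods(http_methods):
    """Count methods and writes, and derive the egress category."""
    writes = sum(
        1 for method in http_methods
        if method["http_method"].upper() in WRITE_METHODS
    )
    return {
        "method_count": len(http_methods),
        "write_method_count": writes,
        # Assumption: any write method upgrades the category.
        "egress_category": "external_write" if writes else "external",
    }

# The two methods from the REST Message example above.
http_methods = [
    {"name": "GET User", "http_method": "GET", "endpoint": "/v1.0/users/{id}"},
    {"name": "GET Groups", "http_method": "GET", "endpoint": "/v1.0/groups"},
]
summary = summarize_http_methods(http_methods)
print(summary)
```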

Query examples enabled by new edges:

  1. "Show all automations that invoke AzureGraphRouter"

    MATCH (auto)-[:CALLS]->(si:`Script Include` {name: "AzureGraphRouter"})
    RETURN auto
  2. "Show all automations that suppress workflows when modifying HR tables"

    MATCH (auto)-[m:MODIFIES]->(table:Resource)
    WHERE table.business_domain = "hr" AND m.workflow_suppressed = true
    RETURN auto, table
  3. "Show trigger records for orphaned automations"

    MATCH (evidence:ExecutionEvidence)-[:TRIGGERS_ON]->(auto:Identity)
    WHERE auto.ownership_status = "orphaned" AND evidence.evidence_type = "trigger_record"
    RETURN evidence, auto

Appendix B: ServiceNow Data Availability

What data is available for trigger records?

ServiceNow Table API returns:

  • sys_id — unique record ID (e.g., abc123...)
  • number — human-readable record number (e.g., INC0010023)
  • sys_created_by — username who created the record
  • sys_created_on — timestamp of creation
  • sys_updated_by — username who last updated the record
  • sys_updated_on — timestamp of last update
  • Table-specific fields: short_description, caller_id, assignment_group, etc.

Fields NOT available without additional queries:

  • Full user details for sys_created_by (requires join to sys_user)
  • Audit trail of who/what triggered the BR (ServiceNow doesn't log "BR X fired because record Y was created")
  • Actual field values that matched the BR's condition (e.g., "short_description CONTAINS 'joiner'")

Connector limitation: servicenow_client.py:897-950 fetches trigger examples but doesn't verify that these records actually triggered the BR. It assumes: "recent records from this table that match the BR's condition would have triggered it."

Implication: Trigger evidence is inferred, not proven. We should set evidence_type: "inferred_trigger_record" to be explicit about this.
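
A minimal sketch of how a Table API record might be mapped to an evidence node that carries this label. The output shape, the `inferred_for` key, and the function name `build_trigger_evidence` are hypothetical; only the sys_* field names and the evidence_type value come from this appendix. The sys_id below is a placeholder, not a real record ID.

```python
def build_trigger_evidence(table, record, automation_name):
    """Map a Table API record to an inferred trigger evidence node."""
    return {
        "node_type": "execution_evidence",
        # Inferred, not proven: the connector does not verify the trigger.
        "evidence_type": "inferred_trigger_record",
        "label": f"{table}/{record['number']}",
        "source_sys_id": record["sys_id"],
        "created_by": record["sys_created_by"],
        "created_on": record["sys_created_on"],
        "inferred_for": automation_name,
    }

record = {
    "sys_id": "placeholder-sys-id",
    "number": "INC0010023",
    "sys_created_by": "admin",
    "sys_created_on": "2026-02-05 04:07:18",
}
evidence = build_trigger_evidence(
    "incident", record, "Auto-route identity tickets via Entra"
)
print(evidence["label"], evidence["evidence_type"])
```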

What data is available for local mutations?

ServiceNow script content analysis returns:

  • table — GlideRecord target table name
  • fields_modified — field names passed to setValue()
  • mutation_types — update, insert, delete (from script text analysis)
  • workflow_suppressed — boolean (setWorkflow(false) detected)

Fields NOT available:

  • Actual field values written (requires runtime instrumentation)
  • Number of records modified per execution (requires transaction logs)
  • Whether mutations actually succeeded (requires error handling analysis)

Connector limitation: servicenow_client.py:38-112 uses regex-based static analysis. Dynamic table names (new GlideRecord(tableName)) and indirect field references (gr.setValue(fieldName, value)) are missed.

Implication: Mutation detection is best-effort, not exhaustive. We should include a note in evidence completeness: "Local mutations detected via static script analysis; dynamic references not captured."
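
The limitation can be reproduced with a small regex-based detector. This is a sketch written for illustration, not the code in servicenow_client.py; the patterns and the `detect_mutations` name are assumptions. It finds literal table and field names and misses the dynamic forms described above.

```python
import re

# Match only string-literal arguments, as a regex-based analyzer would.
TABLE_RE = re.compile(r"new\s+GlideRecord\(\s*['\"](\w+)['\"]\s*\)")
FIELD_RE = re.compile(r"\.setValue\(\s*['\"](\w+)['\"]")
SUPPRESS_RE = re.compile(r"\.setWorkflow\(\s*false\s*\)")

def detect_mutations(script):
    """Best-effort static detection of local mutations in a script body."""
    return {
        "tables": sorted(set(TABLE_RE.findall(script))),
        "fields_modified": sorted(set(FIELD_RE.findall(script))),
        "workflow_suppressed": bool(SUPPRESS_RE.search(script)),
    }

literal_script = """
var gr = new GlideRecord('incident');
gr.get(current.sys_id);
gr.setValue('assignment_group', groupId);
gr.setWorkflow(false);
gr.update();
"""

dynamic_script = """
var gr = new GlideRecord(tableName);
gr.setValue(fieldName, value);
gr.update();
"""

print(detect_mutations(literal_script))
print(detect_mutations(dynamic_script))
```

The second script performs the same kind of update, yet the detector reports no table and no fields, which is why the completeness note is needed.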


End of Integrator Analysis