Execution Flow Data Modeling — Integrator Analysis
Date: 2026-02-13
Author: Integrator (ServiceNow + Entra ID domain expert)
Status: Draft for Architect review
Related: 00-synthesis.md
Executive Summary
The connector currently collects but does not emit critical execution provenance data:
- CALLS edges (BR→SI, Job→SI): collected in the correlator and consumed in transformer logic, but never emitted as NormalizedEdges
- Trigger record examples: fetched from source tables (INC0010023, etc.) and stored in ExecutionChain.trigger_examples, but never referenced
- Record mutation details (field-level writes, setWorkflow(false) suppression, mutation types): only partially emitted as table resource nodes, losing granularity
- HTTP method details: REST message endpoint URLs are emitted, but each method's HTTP verb (GET/POST), headers, and auth type are stored without being exposed
These gaps prevent the platform from answering:
- "What ServiceNow artifact actually invoked this REST message?" (BR→SI CALLS chain)
- "Which real incident record triggered this automation?" (evidence trail from trigger to execution)
- "What downstream triggers did this automation suppress with setWorkflow(false)?" (security-significant workflow bypass)
- "What fields did this automation modify in the incident table?" (blast radius of local mutations)
Recommendation: Emit all execution provenance data as first-class entities and edges. The platform schema is extensible — no breaking changes required. Implementation effort: ~3-4 days connector work, minimal platform changes.
1. CALLS Edge Emission
Current State
What the connector does:
- correlator.py:545-557: EdgeResolver.resolve_indirect_caller_edges() creates ResolvedEdge objects with edge_type="CALLS" for BR→SI and Job→SI relationships
- transformer.py:228-234: Consumes CALLS edges to build the indirect_business_rules and indirect_scheduled_jobs lists
- transformer.py:612-731: Emits both BR and SI as separate nodes with parallel EXECUTES_ON → REST Message edges, collapsing the invocation chain
Problem: The actual call graph (BR invokes SI, SI invokes REST Message) is flattened to (BR → REST Message, SI → REST Message). The platform can see that both artifacts exist but cannot determine which one actually invoked the other.
Example from chain report:
BUSINESS RULE: Auto-route identity tickets via Entra
Calls: AzureGraphRouter ← This relationship is NOT in NormalizedGraph
↓
SCRIPT INCLUDE: AzureGraphRouter
↓
REST MESSAGE: Graph - sn-ticket-router
Emitted graph:
BR -[EXECUTES_ON]-> REST Message
SI -[EXECUTES_ON]-> REST Message ← Parallel edges, no BR→SI link
Technical Analysis
Platform schema compatibility:
The platform's NormalizedEdgeType enum (ingestion/types.ts:12-26) does not currently include CALLS as a valid edge type. The enum is:
export type NormalizedEdgeType =
| "OWNED_BY" | "BELONGS_TO" | "HAS_ROLE" | "GRANTS" | "APPLIES_TO"
| "AUTHENTICATES_TO" | "AUTHENTICATES_VIA" | "EXECUTES_ON"
| "RUNS_AS" | "TRIGGERS_ON" | "CREATED_BY"
| "DELEGATES_TO" | "APPROVED_BY" | "MEMBER_OF";
Is this extensible? Yes — adding a new edge type is non-breaking. The platform ingestion layer validates edge types at runtime and would reject unknown types, but existing edges are unaffected. This is a schema extension, not a breaking change.
Precedent: The RUNS_AS and TRIGGERS_ON edge types were added for automation modeling (01-data-model.md:535-541) without breaking existing integrations.
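To make the "additive, non-breaking" claim concrete, here is a minimal sketch of runtime edge type validation. It is written in Python for illustration only (the platform code is TypeScript), and the `Edge` class and `partition_edges` helper are hypothetical names, not the platform's actual ingestion API.

```python
from dataclasses import dataclass

# Edge types copied from the NormalizedEdgeType listing above.
KNOWN_EDGE_TYPES = frozenset({
    "OWNED_BY", "BELONGS_TO", "HAS_ROLE", "GRANTS", "APPLIES_TO",
    "AUTHENTICATES_TO", "AUTHENTICATES_VIA", "EXECUTES_ON",
    "RUNS_AS", "TRIGGERS_ON", "CREATED_BY",
    "DELEGATES_TO", "APPROVED_BY", "MEMBER_OF",
})


@dataclass(frozen=True)
class Edge:  # hypothetical stand-in for a NormalizedEdge
    edge_type: str
    source_node_id: str
    target_node_id: str


def partition_edges(edges, allowed=KNOWN_EDGE_TYPES):
    """Split edges into (accepted, rejected) by edge type."""
    accepted = [e for e in edges if e.edge_type in allowed]
    rejected = [e for e in edges if e.edge_type not in allowed]
    return accepted, rejected


edges = [
    Edge("EXECUTES_ON", "sn-br-1", "sn-restmsg-1"),
    Edge("CALLS", "sn-br-1", "sn-si-1"),
]

# Before the extension: the existing edge passes, CALLS is rejected.
before_ok, before_rejected = partition_edges(edges)

# After the extension: both pass, and nothing that passed before is affected.
after_ok, after_rejected = partition_edges(edges, KNOWN_EDGE_TYPES | {"CALLS"})
```

The practical consequence is a deployment ordering constraint: a platform that has not yet been extended rejects CALLS edges, so the platform change has to land before the connector starts emitting them.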
Proposed Solution
Step 1: Extend platform edge type enum (sv0-platform/src/ingestion/types.ts)
export type NormalizedEdgeType =
| "OWNED_BY" | "BELONGS_TO" | "HAS_ROLE" | "GRANTS" | "APPLIES_TO"
| "AUTHENTICATES_TO" | "AUTHENTICATES_VIA" | "EXECUTES_ON"
| "RUNS_AS" | "TRIGGERS_ON" | "CREATED_BY"
| "DELEGATES_TO" | "APPROVED_BY" | "MEMBER_OF"
| "CALLS"; // ← New: automation invocation (BR→SI, Job→SI, SI→SI)
Step 2: Emit CALLS edges in transformer (transformer.py:~690, after SI node creation)
# In _process_execution_chain(), after processing script_includes:
for si in chain.script_includes:
    si_sys_id = si.get("sys_id", si.get("name", "unknown"))
    si_node_id = f"sn-si-{si_sys_id}"
    # ... existing SI node creation ...

    # NEW: Emit CALLS edges from BRs/Jobs that invoke this SI
    for caller_edge in entities.caller_edges:
        if (caller_edge.edge_type == "CALLS"
                and caller_edge.target_id == si_sys_id
                and caller_edge.properties.get("target_type") == "script_include"):
            caller_type = caller_edge.properties.get("caller_type")
            # Caller types without a prefix mapping (e.g. SI→SI) are skipped here
            caller_prefix = {
                "business_rule": "sn-br-",
                "scheduled_job": "sn-job-",
            }.get(caller_type, "")
            if caller_prefix:
                caller_node_id = f"{caller_prefix}{caller_edge.source_id}"
                self._add_edge(
                    edge_type="CALLS",
                    source_node_id=caller_node_id,
                    target_node_id=si_node_id,
                    properties={
                        "caller_type": caller_type,
                        "target_name": si.get("name", ""),
                    },
                )
Step 3: Update path materializer (sv0-platform/src/domain/graph/path-materializer.ts)
// In _traverseExecutionPath(), add CALLS edge handling:
if (edge.edgeType === 'CALLS') {
  // Follow invocation chain: BR → SI → REST Message
  // Don't consume auth depth budget (this is code invocation, not delegation)
  const targetNode = await this._getNode(edge.targetNodeId);
  if (targetNode) {
    path.push({ type: 'invocation', edge, node: targetNode });
    // depth is passed through unchanged; visited must guard against SI → SI call cycles
    await this._traverseExecutionPath(targetNode, depth, maxDepth, visited, path);
  }
}
Semantic Considerations
CALLS vs EXECUTES_ON:
- EXECUTES_ON: "this automation performs actions on this resource/endpoint" (writes to incident table, calls Graph API)
- CALLS: "this automation invokes this other automation artifact" (BR invokes SI, SI invokes another SI)
CALLS vs RUNS_AS:
- RUNS_AS: "this automation executes as this identity" (identity binding for permission inheritance)
- CALLS: "this automation invokes this artifact" (code-level invocation, no identity change)
Path materializer impact: The path materializer must distinguish between:
- Code invocation (CALLS) — follow the chain without consuming auth depth
- Identity delegation (AUTHENTICATES_TO) — consumes auth depth, tracks trust chain position
- Permission inheritance (RUNS_AS) — borrows target identity's paths, doesn't consume depth
Why CALLS doesn't consume depth: It's not delegation — it's the same execution context invoking a subroutine. The BR and SI are both "system" code running in ServiceNow's server process. Only when the REST Message authenticates to an external system (AUTHENTICATES_TO) does the auth chain depth increment.
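The depth rule above can be sketched as a small traversal in which only AUTHENTICATES_TO edges cost a unit of auth depth. This is an illustrative Python model, not the path materializer itself; the `DEPTH_COST` table, the `(edge_type, source, target)` tuple shape, and the node IDs are assumptions made for the example.

```python
from collections import deque

# Assumption: only AUTHENTICATES_TO consumes auth depth. Edge types missing
# from this table are treated as zero-cost in this sketch.
DEPTH_COST = {"AUTHENTICATES_TO": 1, "CALLS": 0, "RUNS_AS": 0, "EXECUTES_ON": 0}


def reachable_with_depth(edges, start, max_auth_depth):
    """Return {node_id: minimum auth depth needed to reach it} from start."""
    adjacency = {}
    for edge_type, source, target in edges:
        adjacency.setdefault(source, []).append((edge_type, target))

    best = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for edge_type, target in adjacency.get(node, []):
            depth = best[node] + DEPTH_COST.get(edge_type, 0)
            if depth > max_auth_depth:
                continue
            # Revisit a node only when it is reached with a lower depth;
            # this also terminates on SI -> SI call cycles.
            if target not in best or depth < best[target]:
                best[target] = depth
                queue.append(target)
    return best


edges = [
    ("CALLS", "br", "si"),
    ("CALLS", "si", "si"),  # self-call, exercises the cycle guard
    ("EXECUTES_ON", "si", "restmsg"),
    ("AUTHENTICATES_TO", "restmsg", "entra-app"),
]
result = reachable_with_depth(edges, "br", max_auth_depth=1)
```

With a budget of 1, the BR reaches the Entra application at depth 1 even though the path has three hops, because the two code-level hops are free. With a budget of 0 the traversal stops at the REST message.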
Implementation Effort
| Component | Change | Effort | Risk |
|---|---|---|---|
| Platform types.ts | Add "CALLS" to NormalizedEdgeType enum | 5 min | None (non-breaking) |
| Platform path materializer | Add CALLS edge traversal logic | 1 hour | Low (similar to existing edge handlers) |
| Connector transformer.py | Emit CALLS edges from caller_edges list | 2 hours | Low (data already available) |
| Connector tests | Add test cases for CALLS edge emission | 1 hour | None |
| Platform tests | Add path materializer tests for CALLS | 1 hour | None |
| Total | | ~6 hours | Low |
Backward compatibility: Fully backward compatible. Existing graphs without CALLS edges continue to work. New connector syncs emit CALLS edges; platform path materializer follows them if present.
2. Trigger Record Modeling
Current State
What the connector collects:
servicenow_client.py:897-950 — get_recent_trigger_examples() fetches:
# Fields: sys_id, table, number, short_description, created_by, created_on
# Example: INC0010023, "joiner alice.joiner@smedvedsecurityv0..."
Stored in ExecutionChain.trigger_examples (correlator.py:184) but never referenced in transformer.py.
What the chain report shows (but platform never sees):
┌─────────────────────────────────────────────────────────────────┐
│ TRIGGER RECORD: incident/INC0010023
│ Created by: admin
│ Created on: 2026-02-05 04:07:18
│ Description: joiner alice.joiner@smedvedsecurityv0.onmicro...
└──────────────────────────────┬──────────────────────────────────┘
▼
│ BUSINESS RULE: Auto-route identity tickets via Entra
This is first-party execution evidence — proof that a specific record triggered this automation at a specific time.
Platform Entity Model
The platform already has execution_evidence as a first-class node type (ingestion/types.ts:7, 01-data-model.md:285-308):
export type NormalizedNodeType =
| "autonomous_identity"
| "human_identity"
| "role" | "permission" | "resource" | "credential"
| "execution_evidence"; // ← Already exists
ExecutionEvidence properties (01-data-model.md:298-307):
- source_table: where the evidence was fetched from (e.g., incident)
- source_record_id: ID in the source system (e.g., INC0010023)
- source_timestamp: when the execution occurred
- evidence_type: api_call, flow_execution, scheduled_job, sign_in, trigger_record (NEW)
- action: what was done (e.g., "trigger:incident.insert")
- target_resource: what was acted upon (e.g., "incident")
- outcome: success, failure, unknown
- payload_hash: SHA256 of the source record content
Relationship: Execution evidence is linked to identities via EXECUTES_ON edges (01-data-model.md:461).
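For the payload_hash property, one plausible way to get a stable SHA-256 over a source record is to hash a canonical JSON encoding. The canonicalization below (sorted keys, compact separators, UTF-8) is an assumption for illustration; the data model reference only specifies "SHA256 of the source record content", and `payload_hash` here is a hypothetical helper name.

```python
import hashlib
import json


def payload_hash(record: dict) -> str:
    """SHA-256 hex digest over a canonical JSON encoding of the record."""
    canonical = json.dumps(
        record,
        sort_keys=True,          # key order must not change the hash
        separators=(",", ":"),   # no whitespace variation
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


record = {"sys_id": "abc123", "number": "INC0010023", "created_by": "admin"}
digest = payload_hash(record)
```

Sorting keys matters because the same record fetched twice may arrive with fields in a different order, and the hash should only change when the content does.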
Proposed Solution
Option A: Emit trigger records as execution_evidence nodes (RECOMMENDED)
Create execution_evidence nodes for each trigger example, linked to the BR it triggered:
# In _process_execution_chain(), after BR node creation:
for br in chain.business_rules + chain.indirect_business_rules:
    # Assumed fallback, mirroring the SI loop; the existing code may already define this
    br_sys_id = br.get("sys_id", br.get("name", "unknown"))
    br_node_id = f"sn-br-{br_sys_id}"
    # ... existing BR node creation ...

    # NEW: Emit trigger record evidence
    for trigger_rec in chain.trigger_examples:
        if trigger_rec.get("table") == br.get("table"):  # Match on trigger table
            trigger_node_id = self._add_node(
                node_id=f"evidence-trigger-{trigger_rec['sys_id']}",
                node_type="execution_evidence",
                source_system="servicenow",
                source_id=trigger_rec["sys_id"],
                display_name=f"Trigger: {trigger_rec.get('number', trigger_rec['sys_id'])}",
                status="active",
                created_at=trigger_rec.get("created_on"),
                properties={
                    "source_table": trigger_rec["table"],
                    "source_record_id": trigger_rec.get("number", trigger_rec["sys_id"]),
                    "source_timestamp": trigger_rec["created_on"],
                    "evidence_type": "trigger_record",
                    "action": f"trigger:{trigger_rec['table']}.insert",
                    "target_resource": trigger_rec["table"],
                    "outcome": "success",
                    "created_by": trigger_rec.get("created_by", ""),
                    "short_description": trigger_rec.get("short_description", "")[:200],
                },
            )
            # Link trigger record → BR via TRIGGERS_ON (reuse existing edge type)
            self._add_edge(
                edge_type="TRIGGERS_ON",
                source_node_id=trigger_node_id,
                target_node_id=br_node_id,
                properties={
                    "triggerType": "record_event",
                    "timestamp": trigger_rec["created_on"],
                },
            )
Option B: Embed trigger examples in TRIGGERS_ON edge properties
Store trigger examples as a trigger_examples array in the existing TRIGGERS_ON edge properties:
self._add_edge(
edge_type="TRIGGERS_ON",
source_node_id=br_node_id,
target_node_id=table_node_id,
properties={
"triggerType": "event",
"events": events or ["record_change"],
"trigger_examples": [ # NEW
{
"sys_id": rec["sys_id"],
"number": rec.get("number"),
"created_on": rec["created_on"],
"created_by": rec["created_by"],
}
for rec in chain.trigger_examples[:5] # Limit to 5 most recent
]
},
)
Option C: New edge type TRIGGERED_BY
Introduce a dedicated edge type that links the trigger record to the BR it fired, rather than reusing TRIGGERS_ON:
execution_evidence (trigger record) -[TRIGGERED_BY]-> BR
Recommendation: Option A (execution_evidence nodes)
Why:
- Consistency with existing patterns — Azure sign-ins are already modeled as execution_evidence nodes (transformer.py:1442-1492)
- First-class queryability — trigger records become searchable entities, not nested properties
- Evidence completeness tracking — can declare whether trigger examples are available/unavailable per entity type
- Temporal analysis — trigger timestamps become first-class temporal markers for drift detection
Data volume concern:
If a BR triggers 10,000 times/day, emitting all trigger examples would create 10,000 evidence nodes per sync. Mitigation: Limit to N most recent examples (5-10) per BR, or add a connector config flag --include-trigger-examples (default: false).
Alternative for high-volume scenarios: Use Option B (embed in edge properties) for trigger-heavy automations, Option A for low-volume or security-critical triggers.
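The "N most recent" mitigation can be sketched as a small selection helper. This assumes trigger records carry `table` and `created_on` keys as in the field list above, and that `created_on` uses the "YYYY-MM-DD HH:MM:SS" format seen in the chain report; `most_recent_triggers` is a hypothetical name, not an existing connector function.

```python
def most_recent_triggers(trigger_examples, table, limit=5):
    """Return up to `limit` trigger records for `table`, newest first."""
    matching = [
        t for t in trigger_examples
        if t.get("table") == table and t.get("created_on")
    ]
    # "YYYY-MM-DD HH:MM:SS" strings sort chronologically as plain text.
    matching.sort(key=lambda t: t["created_on"], reverse=True)
    return matching[:limit]


examples = [
    {"table": "incident", "number": "INC1", "created_on": "2026-02-05 04:07:18"},
    {"table": "incident", "number": "INC2", "created_on": "2026-02-06 10:00:00"},
    {"table": "change_request", "number": "CHG1", "created_on": "2026-02-07 09:00:00"},
    {"table": "incident", "number": "INC3", "created_on": "2026-02-04 01:00:00"},
]
latest = most_recent_triggers(examples, "incident", limit=2)
```

Capping per BR and per table keeps the evidence node count bounded by the number of automations rather than by trigger volume.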
Implementation Effort
| Component | Change | Effort | Risk |
|---|---|---|---|
| Connector transformer.py | Add trigger record → execution_evidence node emission | 2 hours | Low (pattern exists for sign-ins) |
| Connector transformer.py | Link trigger evidence → BR via TRIGGERS_ON edge | 30 min | Low |
| Connector config | Add --max-trigger-examples flag (default: 5) | 30 min | None |
| Connector tests | Add test cases for trigger evidence emission | 1 hour | None |
| Platform ingestion | No changes (execution_evidence already supported) | 0 | None |
| Total | | ~4 hours | Low |
3. Record Mutation / Effect Modeling
Current State
What the connector collects:
servicenow_client.py:38-112 — analyze_script_mutations() detects:
{
"table": "incident",
"fields_modified": ["assignment_group", "u_auto_routed", "work_notes"],
"mutation_types": ["update"],
"workflow_suppressed": True # setWorkflow(false) detected
}
Stored in ExecutionChain.local_mutations (correlator.py:207, transformer.py:802-817).
What the transformer emits: Only the table name becomes a resource node (transformer.py:802-817). Field-level writes and workflow suppression are lost.
Example from chain report (not in platform graph):
┌─────────────────────────────────────────────────────────────────┐
│ RECORD CHANGES (after API response): │
│ Table: incident
│ Fields: assignment_group, u_auto_routed, u_autorouted, work_notes
│ Action: update
│ ⚠ setWorkflow(false) — downstream triggers suppressed
└─────────────────────────────────────────────────────────────────┘
Semantic Question: What Are "Record Changes"?
Two interpretations:
1. Egress effects: what the automation WRITES to the target system (Entra Graph, AWS, GitHub)
   - Example: "Creates user in Entra" or "Pushes to GitHub repo"
   - This is external blast radius (cross-system impact)
2. Local effects: what the automation WRITES back to ServiceNow after processing
   - Example: "Updates incident.assignment_group after calling Entra Graph API"
   - This is internal blast radius (same-system side effects)
Chain report example is LOCAL: The BR calls Graph API (egress), receives response, then updates the incident record in ServiceNow (local mutation). The mutation is after the external call, not part of it.
Why this matters for modeling:
- Egress effects are typically captured in the target resource (the external API endpoint)
- Local effects need to be modeled as properties of the automation node or separate mutation edges
Proposed Solutions
Option A: Automation node properties (SIMPLEST)
Add mutation details as properties on the automation node:
br_node_id = self._add_node(
node_id=f"sn-br-{br_sys_id}",
node_type="autonomous_identity",
source_system="servicenow",
source_id=br_sys_id,
display_name=br.get("name", "Unknown Business Rule"),
status="active",
properties={
"identitySubtype": "business_rule",
"automation_type": "business_rule",
"table": br.get("table", ""),
# NEW:
"local_mutations": [
{
"table": "incident",
"fields_modified": ["assignment_group", "u_auto_routed", "work_notes"],
"mutation_types": ["update"],
"workflow_suppressed": True
}
],
"workflow_suppression_count": 1, # Aggregate signal
},
)
Pros:
- No schema changes
- Queryable via node properties filter
- Low implementation effort
Cons:
- Mutations not first-class entities (harder to query "all automations that modify assignment_group")
- No temporal tracking of when mutations were detected
Option B: New edge type MODIFIES (MORE EXPRESSIVE)
Create edges from automation → table resource with mutation metadata:
for mutation in chain.local_mutations:
    table_node_id = self._add_node(
        node_id=f"sn-table-{mutation['table']}",
        node_type="resource",
        source_system="servicenow",
        source_id=mutation["table"],
        display_name=mutation["table"],
        status="active",
        properties={"resourceType": "table"},
    )
    self._add_edge(
        edge_type="MODIFIES",  # NEW edge type
        source_node_id=br_node_id,
        target_node_id=table_node_id,
        properties={
            "fields_modified": mutation["fields_modified"],
            "mutation_types": mutation["mutation_types"],
            "workflow_suppressed": mutation["workflow_suppressed"],
        },
    )
Pros:
- First-class relationship (can query "all automations with a MODIFIES edge to incident")
- Edge properties carry full mutation context
- Enables blast radius queries: "show all tables modified by orphaned automations"
Cons:
- Requires platform schema extension (add "MODIFIES" to NormalizedEdgeType)
- Higher implementation effort (schema change + path materializer updates)
Option C: Mutation evidence nodes (MOST GRANULAR)
Create execution_evidence nodes for each mutation operation:
for mutation in chain.local_mutations:
    mutation_node_id = self._add_node(
        node_id=f"evidence-mutation-{br_sys_id}-{mutation['table']}",
        node_type="execution_evidence",
        source_system="servicenow",
        source_id=f"mutation:{br_sys_id}:{mutation['table']}",
        display_name=f"Mutation: {mutation['table']} by {br.get('name')}",
        status="active",
        properties={
            "source_table": mutation["table"],
            "evidence_type": "local_mutation",
            "action": f"{mutation['mutation_types'][0]}:{mutation['table']}",
            "target_resource": mutation["table"],
            "fields_modified": mutation["fields_modified"],
            "workflow_suppressed": mutation["workflow_suppressed"],
        },
    )
    self._add_edge(
        edge_type="EXECUTES_ON",
        source_node_id=br_node_id,
        target_node_id=mutation_node_id,
    )
Pros:
- Most granular (one evidence node per mutation)
- Consistent with other execution evidence patterns
- Temporal tracking built-in (node createdAt)
Cons:
- Highest node count (one per mutation per automation)
- Execution evidence type may be overloaded (mixing "proof of execution" with "proof of mutation")
Recommendation: Option A for MVP, Option B for production
Short-term (MVP): Use Option A (node properties) to quickly surface mutation data without schema changes.
Long-term (production): Use Option B (MODIFIES edge) for:
- First-class queryability
- Blast radius analysis ("which automations modify HR tables?")
- Path materializer integration (follow MODIFIES edges to compute local blast radius)
Why not Option C: Execution evidence is conceptually "proof that something happened" (sign-ins, trigger records, API calls). Mutations are what the automation does (capabilities), not evidence that it executed. Mixing these semantics would confuse the evidence model.
setWorkflow(false) — Security Significance
What it means: ServiceNow's setWorkflow(false) suppresses downstream Business Rules and workflows when a record is updated. This is security-significant because:
- Audit trails may be bypassed (no BR fires to log the change)
- Approval workflows may be skipped (e.g., assignment group change without manager approval)
- Notifications may be suppressed (on-call engineer not alerted)
How to surface this:
- Option A: Add workflow_suppression_count to automation node properties (aggregate signal)
- Option B: Add workflow_suppressed: true to MODIFIES edge properties (per-mutation granularity)
- Finding rule: Create a workflow_suppression finding type: "This automation bypasses downstream audit/approval controls"
Recommendation: Use Option B (per-mutation flag) + create a platform finding rule that fires when workflow_suppressed: true is detected on a MODIFIES edge from an automation with ownership_status: orphaned or security_relevance: active_external.
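The recommended rule can be sketched as a filter over MODIFIES edges. This is a Python illustration of the rule logic only; the platform evaluator's real interfaces are not shown in this document, so the dict shapes for nodes and edges and the `workflow_suppression_findings` name are assumptions.

```python
def workflow_suppression_findings(nodes, edges):
    """Return one finding per workflow-suppressing MODIFIES edge from a risky automation.

    `nodes` maps node_id -> properties dict; `edges` is a list of edge dicts.
    """
    findings = []
    for edge in edges:
        if edge["edge_type"] != "MODIFIES":
            continue
        if not edge.get("properties", {}).get("workflow_suppressed"):
            continue
        source_props = nodes.get(edge["source_node_id"], {})
        risky = (
            source_props.get("ownership_status") == "orphaned"
            or source_props.get("security_relevance") == "active_external"
        )
        if risky:
            findings.append({
                "finding_type": "workflow_suppression",
                "automation": edge["source_node_id"],
                "table": edge["target_node_id"],
                "fields_modified": edge["properties"].get("fields_modified", []),
            })
    return findings


nodes = {
    "sn-br-1": {"ownership_status": "orphaned"},
    "sn-br-2": {"ownership_status": "owned", "security_relevance": "internal"},
}
edges = [
    {"edge_type": "MODIFIES", "source_node_id": "sn-br-1",
     "target_node_id": "sn-table-incident",
     "properties": {"workflow_suppressed": True, "fields_modified": ["assignment_group"]}},
    {"edge_type": "MODIFIES", "source_node_id": "sn-br-2",
     "target_node_id": "sn-table-incident",
     "properties": {"workflow_suppressed": True, "fields_modified": ["work_notes"]}},
    {"edge_type": "EXECUTES_ON", "source_node_id": "sn-br-1",
     "target_node_id": "sn-restmsg-1", "properties": {}},
]
findings = workflow_suppression_findings(nodes, edges)
```

Keeping the flag on the edge means the finding can name the exact table and fields affected, which an aggregate count on the node cannot.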
Implementation Effort
| Component | Change | Effort | Risk |
|---|---|---|---|
| Option A (node properties) | | | |
| Connector transformer.py | Add local_mutations to automation node properties | 1 hour | Low |
| Connector tests | Test mutation property emission | 30 min | None |
| Option B (MODIFIES edge) | | | |
| Platform types.ts | Add "MODIFIES" to NormalizedEdgeType | 5 min | None |
| Connector transformer.py | Emit MODIFIES edges with mutation metadata | 2 hours | Low |
| Platform path materializer | Add MODIFIES edge traversal | 1 hour | Low |
| Connector tests | Test MODIFIES edge emission | 1 hour | None |
| Platform evaluator | Add workflow_suppression finding rule | 2 hours | Medium (new rule logic) |
| Total (Option A) | | ~1.5 hours | Low |
| Total (Option B) | | ~7 hours | Low |
4. HTTP Method Details
Current State
What the connector collects:
ExecutionChain.http_methods (correlator.py:167) stores:
[
{
"name": "GET User",
"http_method": "GET",
"endpoint": "https://graph.microsoft.com/v1.0/users/{id}",
"rest_endpoint": "https://graph.microsoft.com",
"sys_id": "abc123...",
"auth_type": "oauth2",
"headers": {"Accept": "application/json"},
},
...
]
What the transformer emits: Only the endpoint URL is included in the REST message resource node (transformer.py:576-601):
rest_msg_node_id = self._add_node(
node_id=f"sn-restmsg-{rest_msg['sys_id']}",
node_type="resource",
source_system="servicenow",
source_id=rest_msg["sys_id"],
display_name=rest_msg.get("name", "Unknown REST Message"),
status="active",
properties={
"resourceType": "rest_message",
"endpoint_url": endpoint_url, # ← Only URL, no method/headers
"authentication_type": rest_msg.get("authentication_type", ""),
},
)
Use Cases for HTTP Method Details
1. Egress classification accuracy:
- GET https://graph.microsoft.com/v1.0/users is read-only (lower risk)
- POST https://graph.microsoft.com/v1.0/users creates users (higher risk)
- DELETE https://graph.microsoft.com/v1.0/groups/{id}/members/{id} removes group members (security-significant)
2. Blast radius determination:
- REST message "Graph API Sync" has 10 HTTP methods — 8 GET, 2 POST
- Blast radius should reflect write operations, not just "accesses Graph API"
3. Evidence completeness:
- REST message has 5 configured methods but only 2 have been used in the last 30 days (from execution evidence)
- Scope drift: methods added over time without re-approval
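The second and third use cases reduce to simple set arithmetic over the configured methods. The sketch below assumes each configured method is a dict with `name` and `http_method` keys as in the http_methods sample above, and that observed usage is available as a set of method names; how usage is actually keyed in execution evidence is not specified here, so `used_names` and `summarize_methods` are hypothetical.

```python
WRITE_METHODS = {"POST", "PUT", "PATCH", "DELETE"}


def summarize_methods(configured, used_names):
    """Summarize configured vs. observed HTTP methods for one REST message."""
    writes = [
        m for m in configured
        if m.get("http_method", "").upper() in WRITE_METHODS
    ]
    unused = [m["name"] for m in configured if m["name"] not in used_names]
    return {
        "method_count": len(configured),
        "write_method_count": len(writes),
        "unused_methods": unused,
    }


configured = [
    {"name": "GET User", "http_method": "GET"},
    {"name": "Create User", "http_method": "POST"},
    {"name": "Remove Member", "http_method": "DELETE"},
]
summary = summarize_methods(configured, used_names={"GET User"})
```

Here the REST message exposes two write-capable methods that have no observed use, which is the kind of unused capability the scope-drift use case is meant to surface.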
Proposed Solution
Option: Add http_methods array to REST message node properties
rest_msg_node_id = self._add_node(
    node_id=f"sn-restmsg-{rest_msg['sys_id']}",
    node_type="resource",
    source_system="servicenow",
    source_id=rest_msg["sys_id"],
    display_name=rest_msg.get("name", "Unknown REST Message"),
    status="active",
    properties={
        "resourceType": "rest_message",
        "endpoint_url": endpoint_url,
        "authentication_type": rest_msg.get("authentication_type", ""),
        # NEW:
        "http_methods": [
            {
                "name": method.get("name", ""),
                "http_method": method.get("http_method", "GET"),
                "endpoint": method.get("endpoint", ""),
                "auth_type": method.get("auth_type", ""),
            }
            for method in chain.http_methods
        ],
        "method_count": len(chain.http_methods),
        "write_method_count": len([
            m for m in chain.http_methods
            if m.get("http_method") in ["POST", "PUT", "PATCH", "DELETE"]
        ]),
    },
)
Why not separate HTTP method nodes?
- HTTP methods are not independent execution artifacts — they're configuration details of the REST message
- Creating one node per method would inflate node count without adding execution path value
- Methods don't have separate ownership/lifecycle — they inherit from the REST message
Egress classifier update:
Modify egress_classifier.py to consider HTTP method when classifying risk:
def classify_egress(endpoint_url: str, instance_host: str, http_methods: list) -> dict:
    # ... existing URL-based classification (sets `category`) ...

    # NEW: Adjust category based on methods
    write_methods = [
        m for m in http_methods
        if m.get("http_method") in ["POST", "PUT", "PATCH", "DELETE"]
    ]
    if write_methods and category == "external":
        # External write operations are higher risk than reads
        return {
            "egress_category": "external_write",
            "write_method_count": len(write_methods),
        }
    # ... otherwise return the existing classification result ...
Implementation Effort
| Component | Change | Effort | Risk |
|---|---|---|---|
| Connector transformer.py | Add http_methods array to REST message properties | 1 hour | Low |
| Connector egress_classifier.py | Update classifier to consider HTTP methods | 2 hours | Medium (risk scoring logic) |
| Connector tests | Test HTTP method property emission | 30 min | None |
| Platform UI | Display HTTP methods in resource detail view | 1 hour | Low |
| Total | | ~4.5 hours | Low |
5. Data Quality: "automation_disabled" Owner
Context
The chain report shows:
REST MESSAGE: Graph - sn-ticket-router-no-owner
Created by: automation_disabled
Updated by: automation_disabled
BUSINESS RULE: Auto-route identity tickets-Enta-no-own
Created by: automation_disabled
Question: Is "automation_disabled" a real ServiceNow user, a system account, or a disabled account?
ServiceNow Ground Truth
Common ServiceNow system accounts:
| Username | Type | Purpose |
|---|---|---|
| admin | Human/system | Default admin account (often used for automation) |
| system | System | ServiceNow platform itself (scheduled jobs, system tasks) |
| guest | System | Unauthenticated access (usually disabled) |
| maint | System | Maintenance operations |
| integration_user | Human/service | Dedicated account for external integrations |
"automation_disabled" is NOT a standard ServiceNow system account. It's either:
- A custom service account created by an admin (likely disabled after use)
- A renamed/repurposed account
- An account name that suggests it SHOULD be disabled but may not be
How to verify:
Query the sys_user table for this user:
// ServiceNow script to check account state
var gr = new GlideRecord('sys_user');
gr.addQuery('user_name', 'automation_disabled');
gr.query();
if (gr.next()) {
gs.info('User: ' + gr.user_name);
gs.info('Active: ' + gr.active);
gs.info('Locked out: ' + gr.locked_out);
gs.info('Last login: ' + gr.last_login_time);
gs.info('Email: ' + gr.email);
}
Connector Behavior
Current state:
The connector fetches sys_created_by and emits it as a human_identity node with status="disabled" if no user details are found (transformer.py:1320-1331):
# Conservative default: unknown creator account state should not be assumed active.
creator_node_id = self._add_node(
node_id=f"sn-user-{creator_username}",
node_type="human_identity",
source_system="servicenow",
source_id=creator_username,
display_name=creator_username,
status="disabled", # ← Conservative default
properties={},
)
Problem: If "automation_disabled" is a functional service account that's still active but named misleadingly, the connector incorrectly marks it as disabled, triggering a false-positive orphaned_ownership finding.
Solution: The connector should:
- Fetch full sys_user record for the creator
- Use the actual active field from ServiceNow to set status
- Log a warning if the username contains "disabled" but the account is active
Code change (servicenow_client.py):
def get_user_details(self, username: str) -> dict | None:
    """Get full sys_user record for a username."""
    users = self._get_table(
        "sys_user",
        query=f"user_name={username}",
        fields=["sys_id", "user_name", "name", "email", "active", "locked_out", "last_login_time"],
        limit=1,
    )
    if not users:
        logger.warning(f"User {username} not found in sys_user table")
        return None
    user = users[0]
    # Detect suspicious naming
    if "disabled" in username.lower() and user.get("active") == "true":
        logger.warning(
            f"User {username} has 'disabled' in name but account is ACTIVE "
            f"(last login: {user.get('last_login_time', 'never')})"
        )
    return user
Recommendation
Action: Enhance the connector to fetch full sys_user details for all creators and use the actual active field.
Finding rule: Create a suspicious_ownership finding that fires when:
- An automation's creator username contains "disabled", "test", "temp", or "demo"
- The account is still active
- The automation is actively executing
Rationale: This indicates operational neglect — an account that should have been disabled after testing is now the de facto owner of production automation.
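The three conditions of the proposed rule can be expressed as a single predicate. This is a sketch of the rule logic only: the function name and boolean inputs are hypothetical, and plain substring matching is an assumption that would also flag names such as "contest_user", so a production rule would likely need token-boundary matching.

```python
SUSPICIOUS_TOKENS = ("disabled", "test", "temp", "demo")


def is_suspicious_owner(creator_username, account_active, automation_active):
    """True when an active account with a throwaway-looking name owns a running automation."""
    name = creator_username.lower()
    has_token = any(token in name for token in SUSPICIOUS_TOKENS)
    return has_token and account_active and automation_active


flagged = is_suspicious_owner("automation_disabled", account_active=True, automation_active=True)
```

Note that the rule deliberately does not fire when the account really is inactive; that case is already covered by the orphaned_ownership finding mentioned earlier.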
Implementation Effort
| Component | Change | Effort | Risk |
|---|---|---|---|
| Connector servicenow_client.py | Add get_user_details() method | 1 hour | Low |
| Connector transformer.py | Use user.active field instead of conservative default | 1 hour | Low |
| Connector transformer.py | Log warning for suspicious usernames | 30 min | None |
| Platform evaluator | Add suspicious_ownership finding rule | 2 hours | Medium |
| Total | | ~4.5 hours | Low |
6. Implementation Plan
Phase 1: Quick Wins (No Platform Changes) — ~10 hours
Goal: Surface existing data without schema changes
1. HTTP method details (4.5 hours)
   - Add http_methods array to REST message node properties
   - Update tests
2. Local mutations as node properties (1.5 hours)
   - Add local_mutations array to automation node properties
   - Include workflow_suppression_count aggregate
3. Trigger records as execution_evidence nodes (4 hours)
   - Emit trigger examples as execution_evidence nodes
   - Link via TRIGGERS_ON edge (reuse existing edge type)
   - Add --max-trigger-examples config flag
Deliverables: All execution provenance data visible in platform, no breaking changes.
Phase 2: Schema Extensions (Platform Changes Required) — ~13 hours
Goal: Make provenance data first-class queryable entities
1. CALLS edge emission (6 hours)
   - Extend platform NormalizedEdgeType enum
   - Update path materializer to traverse CALLS edges
   - Emit CALLS edges in transformer
   - Add tests
2. MODIFIES edge for local mutations (7 hours)
   - Extend platform NormalizedEdgeType enum
   - Emit MODIFIES edges with mutation metadata
   - Update path materializer to compute local blast radius
   - Add workflow_suppression finding rule
   - Add tests
Deliverables: Execution provenance is first-class in the data model, enabling advanced queries and findings.
Phase 3: Data Quality Improvements — ~4.5 hours
1. Resolve "automation_disabled" ownership ambiguity (4.5 hours)
   - Enhance connector to fetch full sys_user details
   - Use actual active field instead of conservative default
   - Add suspicious_ownership finding rule
Deliverables: Accurate ownership status, fewer false positives.
Total Effort: ~28 hours (3.5 engineering days)
Risk Assessment:
- Low risk: Phases 1 and 3 (no platform schema changes)
- Medium risk: Phase 2 (schema extensions, but non-breaking)
Rollout Strategy:
- Phase 1 → Immediate (connector-only changes, backward compatible)
- Phase 2 → After Phase 1 validation (platform + connector coordination)
- Phase 3 → Parallel to Phase 2 (independent data quality work)
7. Backward Compatibility Analysis
Will adding new edges/nodes break existing ingestion?
Answer: No — fully backward compatible.
Why:
- New edge types (CALLS, MODIFIES) are additive — existing graphs without these edges continue to work
- New node types (execution_evidence for triggers) already exist in the platform schema
- New node properties (http_methods, local_mutations) are optional — ingestion doesn't enforce property schemas
- Path materializer changes are additive — new edge types are only traversed if present
Testing approach:
- Run existing connector test suite against new transformer code → all pass
- Ingest old graph (without new edges) into new platform → no errors
- Ingest new graph (with new edges) into old platform → edge validation rejects unknown types (expected)
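- Mixed deployment: new connector + old platform → Phase 1 node properties ingest cleanly, but CALLS/MODIFIES edges are rejected by edge type validation, so the Phase 2 connector changes must not ship before the platform changes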
Migration path:
- Day 1: Deploy Phase 1 connector changes (node properties only) — no platform coordination needed
- Day 7: Deploy Phase 2 platform changes (new edge types); the connector is not yet emitting these edges, so existing syncs are unaffected
- Day 8: Deploy Phase 2 connector changes (emit new edges); the platform already accepts them
Rollback safety: If new code causes issues, rolling back the connector is safe — old transformer will emit old schema, platform continues working.
8. Open Questions for Architect
1. CALLS edge semantics: Should CALLS consume auth depth budget, or is it code-level invocation without delegation? (Recommendation: no depth consumption, similar to RUNS_AS)
2. Trigger record volume: If a BR triggers 10,000 times/day, should we emit all trigger examples or limit to N most recent? (Recommendation: limit to 5-10 per automation, add config flag)
3. Mutation evidence vs node properties: Is "local mutation" an execution capability (node property) or execution evidence (evidence node)? (Recommendation: capability, not evidence; use MODIFIES edge)
4. workflow_suppression finding rule: Should this trigger only for orphaned automations, or any automation with external egress? (Recommendation: any automation with security_relevance: active_external + workflow_suppressed: true)
5. HTTP method write classification: Should POST/PUT/PATCH/DELETE methods upgrade egress_category from "external" to "external_write"? (Recommendation: yes; write operations are higher risk)
6. "automation_disabled" ownership: Should the connector emit a warning when a creator username contains "disabled" but the account is active? (Recommendation: yes; this is a data quality red flag)
Appendix A: Example Transformed Graph (With Proposed Changes)
Before (current state):
BR: Auto-route identity tickets via Entra
  -[EXECUTES_ON]-> REST Message: Graph - sn-ticket-router
  -[TRIGGERS_ON]-> Table: incident
  -[RUNS_AS]-> SP: sn-ticket-router
SI: AzureGraphRouter
  -[EXECUTES_ON]-> REST Message: Graph - sn-ticket-router ← Duplicate edge
  -[RUNS_AS]-> SP: sn-ticket-router
After (with CALLS, trigger evidence, mutations):
Trigger Evidence: incident/INC0010023
  created_by: admin, created_on: 2026-02-05 04:07:18
  -[TRIGGERS_ON]-> BR: Auto-route identity tickets via Entra
BR: Auto-route identity tickets via Entra
  properties: {
    local_mutations: [{table: "incident", fields: ["assignment_group", "work_notes"], workflow_suppressed: true}]
  }
  -[CALLS]-> SI: AzureGraphRouter ← NEW: explicit invocation
  -[TRIGGERS_ON]-> Table: incident
  -[RUNS_AS]-> SP: sn-ticket-router
  -[MODIFIES]-> Table: incident (fields: ["assignment_group", "work_notes"], workflow_suppressed: true)
SI: AzureGraphRouter
  -[EXECUTES_ON]-> REST Message: Graph - sn-ticket-router
  -[RUNS_AS]-> SP: sn-ticket-router
REST Message: Graph - sn-ticket-router
  properties: {
    endpoint_url: "https://graph.microsoft.com",
    http_methods: [
      {name: "GET User", http_method: "GET", endpoint: "/v1.0/users/{id}"},
      {name: "GET Groups", http_method: "GET", endpoint: "/v1.0/groups"}
    ],
    method_count: 2,
    write_method_count: 0
  }
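The `method_count` and `write_method_count` properties on the REST Message node can be derived directly from the `http_methods` list. The sketch below shows one way to do that and also applies the recommendation from the open question on HTTP method write classification (upgrading `egress_category` to "external_write" when any write method is present). The function name `summarize_http_methods` is hypothetical, and the upgrade rule is still a proposal awaiting Architect review.

```python
# Hypothetical helper: not existing connector code.
WRITE_METHODS = {"POST", "PUT", "PATCH", "DELETE"}


def summarize_http_methods(http_methods):
    """Derive count properties and an egress category from HTTP method entries."""
    write_count = sum(
        1 for method in http_methods
        if method["http_method"].upper() in WRITE_METHODS
    )
    return {
        "method_count": len(http_methods),
        "write_method_count": write_count,
        # Proposed rule (open question): any write method upgrades the category.
        "egress_category": "external_write" if write_count else "external",
    }


# The two methods from the REST Message example above.
graph_methods = [
    {"name": "GET User", "http_method": "GET", "endpoint": "/v1.0/users/{id}"},
    {"name": "GET Groups", "http_method": "GET", "endpoint": "/v1.0/groups"},
]
summary = summarize_http_methods(graph_methods)
print(summary)
```

For the example REST Message this yields two methods, zero write methods, and the category "external"; adding a single PATCH method would move it to "external_write".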
Query examples enabled by new edges:
- "Show all automations that invoke AzureGraphRouter"
  MATCH (auto)-[:CALLS]->(si:`Script Include` {name: "AzureGraphRouter"})
  RETURN auto
- "Show all automations that suppress workflows when modifying HR tables"
  MATCH (auto)-[m:MODIFIES]->(table:Resource)
  WHERE table.business_domain = "hr" AND m.workflow_suppressed = true
  RETURN auto, table
- "Show trigger records for orphaned automations"
  MATCH (evidence:ExecutionEvidence)-[:TRIGGERS_ON]->(auto:Identity)
  WHERE auto.ownership_status = "orphaned" AND evidence.evidence_type = "trigger_record"
  RETURN evidence, auto
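To make the effect of the new edges concrete without a graph database, the "after" graph from this appendix can be held as a plain edge list and filtered in Python. This is only an illustrative sketch: the `Edge` tuple and the helper names `callers_of` and `workflow_suppressing_writes` are hypothetical and do not correspond to the connector's NormalizedEdge type or the platform's query layer.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    """Illustrative edge shape (hypothetical, not the connector's NormalizedEdge)."""
    source: str
    edge_type: str
    target: str
    properties: dict


BR = "BR: Auto-route identity tickets via Entra"
SI = "SI: AzureGraphRouter"

# Edges from the "after" graph in this appendix.
EDGES = [
    Edge(BR, "CALLS", SI, {}),
    Edge(BR, "TRIGGERS_ON", "Table: incident", {}),
    Edge(BR, "RUNS_AS", "SP: sn-ticket-router", {}),
    Edge(BR, "MODIFIES", "Table: incident",
         {"fields": ["assignment_group", "work_notes"], "workflow_suppressed": True}),
    Edge(SI, "EXECUTES_ON", "REST Message: Graph - sn-ticket-router", {}),
    Edge(SI, "RUNS_AS", "SP: sn-ticket-router", {}),
]


def callers_of(edges, target):
    """Return the sources of all CALLS edges that point at `target`."""
    return sorted({e.source for e in edges
                   if e.edge_type == "CALLS" and e.target == target})


def workflow_suppressing_writes(edges):
    """Return (automation, table) pairs for MODIFIES edges that suppress workflows."""
    return [(e.source, e.target) for e in edges
            if e.edge_type == "MODIFIES" and e.properties.get("workflow_suppressed")]


print(callers_of(EDGES, SI))
print(workflow_suppressing_writes(EDGES))
```

Against the "before" graph neither helper returns anything, because that graph contains no CALLS or MODIFIES edges; that is the gap the proposed edges close.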
Appendix B: ServiceNow Data Availability
What data is available for trigger records?
ServiceNow Table API returns:
- sys_id: unique record ID (e.g., abc123...)
- number: human-readable record number (e.g., INC0010023)
- sys_created_by: username who created the record
- sys_created_on: timestamp of creation
- sys_updated_by: username who last updated the record
- sys_updated_on: timestamp of last update
- Table-specific fields: short_description, caller_id, assignment_group, etc.
Fields NOT available without additional queries:
- Full user details for sys_created_by (requires join to sys_user)
- Audit trail of who/what triggered the BR (ServiceNow doesn't log "BR X fired because record Y was created")
- Actual field values that matched the BR's condition (e.g., "short_description CONTAINS 'joiner'")
Connector limitation:
servicenow_client.py:897-950 fetches trigger examples but doesn't verify that these records actually triggered the BR. It assumes: "recent records from this table that match the BR's condition would have triggered it."
Implication: Trigger evidence is inferred, not proven. We should set evidence_type: "inferred_trigger_record" to be explicit about this.
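A minimal sketch of how Table API records could be turned into evidence nodes that carry the inferred label, combined with the "limit to N most recent" recommendation from the open questions. The function name `build_inferred_trigger_evidence`, the `max_examples` parameter, the output dictionary shape, and the `sys_id` values are all hypothetical; only the field names (`sys_id`, `number`, `sys_created_by`, `sys_created_on`) and the timestamp format come from this document.

```python
from datetime import datetime

# Timestamp format as it appears in the Appendix A example.
SN_TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def build_inferred_trigger_evidence(table, records, automation, max_examples=5):
    """Turn Table API records into evidence nodes, newest first, capped."""
    if max_examples <= 0:
        return []
    newest_first = sorted(
        records,
        key=lambda r: datetime.strptime(r["sys_created_on"], SN_TIMESTAMP_FORMAT),
        reverse=True,
    )
    return [
        {
            "node_type": "execution_evidence",
            # Inferred, not proven: the record matches the BR's table and
            # condition, but ServiceNow does not log that it fired the BR.
            "evidence_type": "inferred_trigger_record",
            "table": table,
            "sys_id": r["sys_id"],
            "number": r.get("number"),
            "created_by": r.get("sys_created_by"),
            "created_on": r["sys_created_on"],
            "triggers": automation,
        }
        for r in newest_first[:max_examples]
    ]


# Eight made-up records; sys_id values are placeholders, not real IDs.
records = [
    {"sys_id": f"hypothetical-sys-id-{i}", "number": f"INC{10020 + i:07d}",
     "sys_created_by": "admin", "sys_created_on": f"2026-02-05 04:07:{10 + i:02d}"}
    for i in range(8)
]
evidence = build_inferred_trigger_evidence(
    "incident", records, "BR: Auto-route identity tickets via Entra", max_examples=5
)
print([node["number"] for node in evidence])
```

Capping at construction time keeps the emitted graph small for high-volume business rules, while the explicit `evidence_type` lets downstream queries distinguish inferred records from any proven trigger data added later.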
What data is available for local mutations?
ServiceNow script content analysis returns:
- table: GlideRecord target table name
- fields_modified: field names passed to setValue()
- mutation_types: update, insert, delete (from script text analysis)
- workflow_suppressed: boolean (setWorkflow(false) detected)
Fields NOT available:
- Actual field values written (requires runtime instrumentation)
- Number of records modified per execution (requires transaction logs)
- Whether mutations actually succeeded (requires error handling analysis)
Connector limitation:
servicenow_client.py:38-112 uses regex-based static analysis. Dynamic table names (new GlideRecord(tableName)) and indirect field references (gr.setValue(fieldName, value)) are missed.
Implication: Mutation detection is best-effort, not exhaustive. We should include a note in evidence completeness: "Local mutations detected via static script analysis; dynamic references not captured."
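The limitation is easier to see with a small regex-based analyzer. The sketch below is not the code in servicenow_client.py; the patterns and the `analyze_script` function are hypothetical and written only to illustrate how literal table and field names are captured while variable references are missed. It adds a `has_dynamic_references` flag, which is one possible way to drive the completeness note.

```python
import re

# Literal (quoted) table and field names.
GLIDE_RECORD_RE = re.compile(r"""new\s+GlideRecord\(\s*(['"])(?P<table>[\w.]+)\1\s*\)""")
SET_VALUE_RE = re.compile(r"""\.setValue\(\s*(['"])(?P<field>\w+)\1\s*,""")
SET_WORKFLOW_FALSE_RE = re.compile(r"\.setWorkflow\(\s*false\s*\)")
# Bare identifiers in the same positions: present, but not resolvable statically.
DYNAMIC_TABLE_RE = re.compile(r"new\s+GlideRecord\(\s*[A-Za-z_$][\w$]*\s*\)")
DYNAMIC_FIELD_RE = re.compile(r"\.setValue\(\s*[A-Za-z_$][\w$]*\s*,")


def analyze_script(script):
    """Best-effort static scan of a ServiceNow script for local mutations."""
    return {
        "tables": sorted({m.group("table") for m in GLIDE_RECORD_RE.finditer(script)}),
        "fields_modified": sorted({m.group("field") for m in SET_VALUE_RE.finditer(script)}),
        "workflow_suppressed": bool(SET_WORKFLOW_FALSE_RE.search(script)),
        "has_dynamic_references": bool(
            DYNAMIC_TABLE_RE.search(script) or DYNAMIC_FIELD_RE.search(script)
        ),
    }


STATIC_SCRIPT = """
var gr = new GlideRecord('incident');
gr.get(current.sys_id);
gr.setValue('assignment_group', groupId);
gr.setValue("work_notes", "Routed by automation");
gr.setWorkflow(false);
gr.update();
"""

DYNAMIC_SCRIPT = """
var gr = new GlideRecord(tableName);
gr.setValue(fieldName, value);
gr.update();
"""

print(analyze_script(STATIC_SCRIPT))
print(analyze_script(DYNAMIC_SCRIPT))
```

The first script yields the incident table, both field names, and the suppression flag. The second yields no tables or fields at all, and only the `has_dynamic_references` flag signals that mutations exist but could not be resolved.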
End of Integrator Analysis