Architecture and Data Model Review

Date: 2026-02-07

Scope

This review evaluates the SecurityV0 architecture, data model, and evidence strategy against the PRD constraints for deterministic, evidence-grade findings. It incorporates new automation research and ServiceNow evidence sources.

Strengths That Should Be Preserved

  • Deterministic scope and non-goals keep the MVP credible and enforce evidence-grade discipline.
  • AUTHENTICATES_TO enables cross-system execution paths and is the correct abstraction for Entra-to-ServiceNow chains.
  • Audit-log provenance provides strong traceability for findings and evidence packs.
  • Materialized execution paths enable fast blast-radius queries at small tenant scale.
  • Connector contract cleanly separates extraction from storage and allows parallel development.

Evidence-Grade Blocking Gaps

  • Baseline storage design breaks at scale. Current baselines embed all entities in a single document, which will exceed MongoDB's 16 MB per-document limit at entity counts well below real-world tenant sizes.
  • Ownership model mismatch. The data model treats Owners as distinct entities, but the connector schema lacks an Owner node type, which makes OWNED_BY semantics ambiguous and compromises deterministic ownership decay.
  • Execution evidence is not a first-class artifact. The PRD requires proof of autonomous execution, but the model only stores EXECUTES_ON edges without immutable evidence records or source log linkage.
  • Embedded execution paths will hit document size limits. execution_paths and accessible_by arrays can exceed limits for high-fan-out identities and resources.
  • No-inference constraint conflicts with drift narratives. Scope drift and approval language must be backed by explicit approval records or labeled as unavailable.
  • raw_api_response risks policy violations. Storing raw API responses without redaction risks secrets or regulated data exposure, violating metadata-only constraints.

Architecture Improvements (Prioritized)

P0: Must Fix for MVP Integrity

  • Redesign baselines. Store baselines as per-entity documents or bounded chunks keyed by baseline_id to avoid the 16MB limit and enable partial retrieval.
  • Align ownership modeling. Add an Owner node type in the normalized schema or enforce owner_type on human/team nodes with clear OWNED_BY semantics. Do not overload sys_created_by as owner.
  • Add execution evidence artifacts. Introduce an execution_evidence entity with source_table, sys_id, source_timestamp, and payload_hash. Link EXECUTES_ON and EXECUTES edges to these records.
  • Add deterministic link evidence for AUTHENTICATES_TO. Include source IDs from OAuth Application Registry and token mapping configuration (client_id, user field) to prove Entra-to-ServiceNow linkage.
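
The baseline redesign above could be approached by splitting a baseline into bounded chunk documents. The following is a minimal sketch, not the project's implementation: the function name chunk_baseline, the chunk_index field, and the 1 MiB budget are all assumptions chosen for illustration, and JSON length is used only as a rough proxy for the stored document size.

```python
import json
from typing import Iterator

# Hypothetical budget, chosen to stay far below MongoDB's 16 MB document limit.
MAX_CHUNK_BYTES = 1 * 1024 * 1024


def chunk_baseline(baseline_id: str, entities: list[dict],
                   max_bytes: int = MAX_CHUNK_BYTES) -> Iterator[dict]:
    """Yield chunk documents keyed by (baseline_id, chunk_index)."""
    chunk: list[dict] = []
    size = 0
    index = 0
    for entity in entities:
        # JSON length only approximates the encoded size of the stored document.
        entity_size = len(json.dumps(entity, sort_keys=True).encode("utf-8"))
        # Close the current chunk before it would exceed the budget.
        if chunk and size + entity_size > max_bytes:
            yield {"baseline_id": baseline_id, "chunk_index": index, "entities": chunk}
            chunk, size, index = [], 0, index + 1
        chunk.append(entity)
        size += entity_size
    if chunk:
        yield {"baseline_id": baseline_id, "chunk_index": index, "entities": chunk}
```

Each chunk can be written as its own document, so a query on baseline_id retrieves a baseline piece by piece. An entity larger than the budget still lands in a chunk of its own, so a real implementation would need a separate guard for that case.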

P1: Required for Realistic Tenant Scale

  • Move execution_paths and accessible_by into a dedicated collection when thresholds are exceeded. Keep embedded arrays only for small tenants with size guards.
  • Introduce evidence completeness flags. If transaction logs or role audit logs are not enabled, findings should explicitly note incomplete history rather than imply approval or execution.
  • Define a redaction policy. Replace raw_api_response with a hashed payload and a field-level allowlist for persisted metadata.
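
One way to read the redaction item above is as a function that keeps only allowlisted fields and replaces the rest of the payload with a hash. This is a sketch under assumptions: the allowlist contents and the names redact and metadata are hypothetical, and SHA-256 over canonical JSON is one possible hashing choice rather than a documented requirement.

```python
import hashlib
import json

# Hypothetical allowlist; the real one would come from the redaction policy.
ALLOWED_FIELDS = frozenset({"sys_id", "name", "active", "sys_updated_on"})


def redact(raw: dict, allowed: frozenset = ALLOWED_FIELDS) -> dict:
    """Return allowlisted metadata plus a hash of the full raw payload."""
    # Canonical JSON (sorted keys, fixed separators) keeps the hash stable
    # regardless of the key order in the API response.
    canonical = json.dumps(raw, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return {
        "metadata": {k: v for k, v in raw.items() if k in allowed},
        "payload_hash": hashlib.sha256(canonical).hexdigest(),
    }
```

The hash lets a later audit confirm that a stored record matches a re-fetched payload without persisting the payload itself. A plain hash of a low-entropy secret can still be guessed, so the policy may need a keyed hash for sensitive sources.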

P2: Improves Deterministic Drift and Automation Coverage

  • Extend drift detection to automation artifacts. Track changes in flow triggers, schedules, run_as identity, and activation states as deterministic drift signals.
  • Add automation-centric events. Examples include automation_created, automation_updated, run_as_changed, schedule_changed, flow_published.
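
Deterministic drift on automation artifacts can be sketched as a field-by-field comparison of two snapshots. The event names below are taken from the list above; the snapshot field names (run_as, schedule, trigger, active, id) and the function diff_automation are assumptions for illustration. flow_published is left out because it would need an explicit publish signal that this sketch does not model.

```python
from typing import Optional

# Tracked snapshot fields (hypothetical names) mapped to the event emitted on change.
FIELD_EVENTS = {
    "run_as": "run_as_changed",
    "schedule": "schedule_changed",
    "trigger": "automation_updated",
    "active": "automation_updated",
}


def diff_automation(before: Optional[dict], after: dict) -> list[dict]:
    """Compare two snapshots of one automation and return drift events."""
    if before is None:
        # No prior snapshot: the artifact is new in this baseline.
        return [{"event": "automation_created", "automation_id": after["id"]}]
    events = []
    for field, event in FIELD_EVENTS.items():
        old, new = before.get(field), after.get(field)
        if old != new:
            events.append({
                "event": event,
                "automation_id": after["id"],
                "field": field,
                "old": old,
                "new": new,
            })
    return events
```

Because every event carries the old and new value of a specific field, a finding can cite the exact change instead of inferring intent.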

Automation Coverage Additions

New automation research requires explicit modeling of ServiceNow automation artifacts beyond identities and roles.

Automation Artifacts As First-Class Entities

  • Flow Designer flows
  • Business Rules
  • Script Actions
  • Scheduled Jobs
  • Script Includes
  • Workflow Activities

Required Relationships

  • CREATED_BY (automation to human)
  • RUNS_AS (automation to identity)
  • TRIGGERS_ON (automation to resource or event)
  • EXECUTES (automation to execution_evidence)
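
The EXECUTES relationship and the execution_evidence entity might be represented as immutable records along these lines. The four evidence fields come from the P0 recommendation; everything else is an assumption for illustration, including the Edge shape, the evidence_id format, and the use of sys_created_on as the source timestamp.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutionEvidence:
    """Immutable pointer to one source log row."""
    source_table: str
    sys_id: str
    source_timestamp: str
    payload_hash: str

    @property
    def evidence_id(self) -> str:
        # Hypothetical ID format: table name plus the row's sys_id.
        return f"{self.source_table}:{self.sys_id}"


@dataclass(frozen=True)
class Edge:
    type: str
    source_id: str
    target_id: str


def evidence_from_row(source_table: str, row: dict) -> ExecutionEvidence:
    """Build an evidence record from a log row without keeping the row itself."""
    canonical = json.dumps(row, sort_keys=True).encode("utf-8")
    return ExecutionEvidence(
        source_table=source_table,
        sys_id=row["sys_id"],
        # Assumption: the row's creation time serves as the source timestamp.
        source_timestamp=row["sys_created_on"],
        payload_hash=hashlib.sha256(canonical).hexdigest(),
    )


def executes_edge(automation_id: str, evidence: ExecutionEvidence) -> Edge:
    """Link an automation to the evidence record that proves it ran."""
    return Edge("EXECUTES", automation_id, evidence.evidence_id)
```

Frozen dataclasses reject attribute assignment after construction, which mirrors the requirement that evidence records are not edited once captured.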

Identity Subtype for System Execution

Many ServiceNow automations run as System. Model a system identity subtype as a privileged non-human identity (NHI) so that blast-radius calculations include it.

Data Model Alignment Issues

  • OWNED_BY should represent accountable ownership, not mere creation. Use CREATED_BY for sys_created_by. Ownership should be explicit and reassigned when possible.
  • ownership_state should be derived by the evaluator, not ingested from connectors.
  • AUTHENTICATES_VIA and AUTHENTICATES_TO should carry evidence references for deterministic linkage.
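
To illustrate deriving ownership_state in the evaluator, the sketch below computes it from OWNED_BY edges only and deliberately ignores CREATED_BY. The state names ("owned", "orphaned", "unowned"), the edge dictionary shape, and the active_owner_ids input are hypothetical; the source does not define the actual state values.

```python
def derive_ownership_state(entity_id: str, edges: list[dict],
                           active_owner_ids: set[str]) -> str:
    """Derive a state for one entity from explicit OWNED_BY edges."""
    owners = [
        e["target_id"]
        for e in edges
        # CREATED_BY edges are skipped on purpose: creation is not ownership.
        if e["type"] == "OWNED_BY" and e["source_id"] == entity_id
    ]
    if not owners:
        return "unowned"    # no accountable owner was ever recorded
    if any(owner in active_owner_ids for owner in owners):
        return "owned"      # at least one recorded owner is still active
    return "orphaned"       # owners exist on record, but none is active
```

Since the function depends only on edges and the set of active owners, the same inputs always yield the same state, which keeps ownership decay reproducible across runs.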

Evidence Chain Requirements (ServiceNow)

Required evidence sources for deterministic claims:

  • syslog_transaction for inbound REST execution evidence.
  • sys_flow_context for Flow Designer execution evidence.
  • ecc_queue for MID Server execution evidence.
  • sys_audit_role for role change evidence when enabled.
  • sys_user_has_role for current role state.
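
The evidence sources above can feed the completeness flags recommended under P1. In this sketch the table names are the ones listed; the claim keys, the "available"/"unavailable" labels, and the function name evidence_completeness are assumptions for illustration.

```python
# Claim type (hypothetical keys) mapped to the ServiceNow table that backs it.
EVIDENCE_SOURCES = {
    "inbound_rest_execution": "syslog_transaction",
    "flow_execution": "sys_flow_context",
    "mid_server_execution": "ecc_queue",
    "role_change": "sys_audit_role",
    "current_role_state": "sys_user_has_role",
}


def evidence_completeness(available_tables: set[str]) -> dict[str, str]:
    """Flag each claim type by whether its source table can be read."""
    return {
        claim: "available" if table in available_tables else "unavailable"
        for claim, table in EVIDENCE_SOURCES.items()
    }
```

A finding that relies on a claim type flagged "unavailable" can then state that the history is incomplete instead of implying that no change or execution occurred.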

Open Questions

  • Which ServiceNow tables and fields will be approved as authoritative evidence in the MVP?
  • Is sys_audit_role enabled in the target instances, and is syslog_transaction accessible for API ingestion?
  • How will Entra appId (client_id) be mapped to ServiceNow oauth_entity records deterministically?
  • What scale thresholds trigger externalization of execution paths and baselines?
  • What approval system is authoritative for role expansion or scope change evidence?

Next Action

  • Update core architecture docs to incorporate baseline redesign, evidence artifacts, and automation entities.
  • Add a formal evidence schema to the data model and define evidence completeness semantics.
  • Validate ServiceNow table availability and field mappings in a real instance.

Status: adopted and shipped. Findings incorporated into docs/architecture/01-data-model.md, 05-connectors.md, and associated ADRs. No further action required.