Execution Evidence Feasibility Study

Date: 2026-02-27
Trigger: Authority Paths Primer shifts to execution-determined model
Review rounds: Initial analysis → Codex cross-review → Pessimist/Optimist dual review
Verdict: Hybrid "Evidence-Graded" model required. Pure execution-determined is not deliverable today. With focused investment, ~70% Grade A/A- coverage (~60% Grade A plus ~10% Grade A-) is achievable in 6 months for well-configured customers.

Executive Summary — CEO View

30-second read:

The Primer's vision — "only show what was exercised" — is the right aspiration and a strong differentiator. No competitor grades authority paths by evidence quality. But we cannot claim "execution-determined" today without risking credibility.

What we recommend instead: Evidence-Graded Authority Paths.

"SecurityV0 found 142 authority paths. 14 have confirmed execution evidence (Grade A). 38 have inferred activity (Grade B). 90 have standing authority but no execution proof (Grade C). Those 90 paths are your evidence blind spot — authority that exists but cannot be confirmed or denied."

Why this is stronger than the pure vision:

Honest — we never claim execution proof where we only have configuration. CISOs will test us.
Differentiated — every competitor shows paths flat. We grade by evidence quality. Nobody else does this.
Progressive — as evidence improves, paths upgrade from C → B → A. Customers see measurable progress.
Turns gaps into sales — "You have 90 paths where you can't even tell if they're being used. That's a governance gap." Our limitation becomes the customer's problem to solve.

The trajectory: With 6 months of focused work, well-configured customers can reach ~70% Grade A/A- (~60% Grade A, ~10% Grade A-). The remaining ~30% (Grade B and C) is honestly labeled, and the Grade C blind spots are themselves a finding no competitor surfaces.

Executive Summary — CISO View

What SecurityV0 can prove today (Grade A):

ServiceNow Flow Designer executions — deterministic execution records with state (Completed/Error/Failed)
Business Rule executions with structured syslog — deterministic log entries with script name + record ID
Service principal authentication — sign-in logs proving "SP X authenticated to Microsoft Graph" (already integrated)

What we can infer today (Grade B):

Business Rule probably ran — trigger table has recent matching records
Scheduled job ran at time T — timestamp only, no outcome
AI agent was active — aggregate thread count in Foundry

What we can only show as configuration (Grade C):

Business Rules without logging — configured to run, but no proof they did
REST message destinations — configured endpoint, but no proof a call was made
Latent role authority — SP has Contributor but maybe only used Reader

What we are building next (upgrades existing paths):

Foundry per-run tool calls (Grade B → A) — which tools the agent called, with what arguments
ServiceNow sys_audit integration (Grade B/C → A) — record-level proof of BR execution
Path-level evidence attribution — prove which specific path was exercised, not just which workload ran
Logging readiness detection — automatically surface which evidence sources are missing and what to enable

What requires your action to unlock:

Azure Monitor diagnostic settings for MicrosoftGraphActivityLogs → enables "which Graph API endpoint was called by which SP"
ServiceNow outbound HTTP logging (glide.rest.outbound_log_level=elevated) → enables cross-system call proof
Azure AI User role grant on Foundry projects → enables per-run execution evidence

The honest position: We grade every authority path by evidence quality. Grade A is deterministic. Grade B is strong inference. Grade C is configuration-only. We never claim certainty where we don't have it. The evidence gap itself is a finding.

Evidence Tier Framework

Tier	What it proves	Confidence	Example
Tier 1 — Direct	"This identity called this API at this time"	`DETERMINISTIC`	Entra sign-in log: SP `abc` authenticated to `Microsoft Graph` at `2026-02-27T10:42:00Z`
Tier 2 — Correlated	"This automation probably ran because a related record exists"	`TEMPORAL_INFERRED`	An incident record exists in the table this Flow writes to, created in the last 24h
Tier 3 — Structural	"This automation is configured to run and has the authority to do so"	`STRUCTURAL`	Business rule is active, has RUNS_AS to a SP with Reader role on Microsoft Graph

These map to the EvidenceConfidenceLevel type already defined in sv0-platform/src/domain/evidence/types.ts (line 14).

Connector-by-Connector Reality Check

Entra ID — Tier 1 partially integrated, Tier 1+ achievable with customer setup

Evidence Type	Available via API?	Integrated?	Tier
SP sign-in logs (`auditLogs/signIns`)	Yes — Microsoft Graph API	Yes (partial — see bug below)	Tier 1
SP account state (enabled/disabled)	Yes	Yes	Tier 3
Credential type and expiry	Yes	Yes	Tier 3
Role assignments	Yes	Yes	Tier 3
Which API scopes were exercised	Yes — Microsoft Graph activity logs	No	Tier 1 (customer setup)
Which specific API endpoint was called	Yes — activity logs (`requestUri`)	No	Tier 1 (customer setup)
Token roles and scopes used	Partial — activity logs include `roles`, `scopes`	No	Tier 1 (customer setup)

What's already integrated: The transformer at entra-servicenow/core/transformer.py line 1919 (_add_sign_in_node) already creates execution_evidence nodes from sign-in records. The target_resource field is populated with resource_display_name (line 1957). Sign-in evidence links to the SP node via an EVIDENCES edge (line 1964).

Known bug (sign-in pagination): The Azure client at azure_client.py line 180 requests top=100 and never follows odata_next_link, although the SP listing (lines 71-76) paginates correctly. Sign-in queries therefore silently drop every record beyond the first 100, so any SP with more than 100 sign-ins in the 30-day window loses evidence. Quick fix, ~1 session.

Microsoft Graph activity logs (verified against Microsoft docs): Provide per-request detail:

servicePrincipalId — which SP made the call
requestUri — exact API endpoint (e.g., GET /users/upn)
roles — app roles in the token
scopes — delegated scopes used
clientAuthMethod — how the SP authenticated
responseStatusCode — whether the call succeeded
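To illustrate how the fields above could become per-SP evidence, here is a small sketch that groups successfully called endpoints by service principal. The function name `endpoints_by_sp` and the sample rows are hypothetical; the field names are the ones listed above, and the choice to count only 2xx responses as exercised authority is an assumption.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def endpoints_by_sp(rows: Iterable[dict]) -> Dict[str, Set[str]]:
    """Group request URIs that returned 2xx by the calling service principal."""
    called: Dict[str, Set[str]] = defaultdict(set)
    for row in rows:
        sp_id = row.get("servicePrincipalId")
        uri = row.get("requestUri")
        status = row.get("responseStatusCode")
        if not sp_id or not uri or status is None:
            continue  # incomplete row: no attribution possible
        # Assumption: only successful calls count as exercised authority.
        if 200 <= int(status) < 300:
            called[sp_id].add(uri)
    return dict(called)


# Illustrative rows only; IDs and URIs are made up.
sample_rows = [
    {"servicePrincipalId": "sp-1", "requestUri": "/v1.0/users", "responseStatusCode": 200},
    {"servicePrincipalId": "sp-1", "requestUri": "/v1.0/groups", "responseStatusCode": 403},
    {"servicePrincipalId": "sp-2", "requestUri": "/v1.0/applications", "responseStatusCode": "200"},
]
exercised = endpoints_by_sp(sample_rows)
```

A denied call (403) shows an attempt, not exercised authority, so it is left out of the result here; a real integration might record it as a separate signal.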

Customer dependency for activity logs:

Entra ID P1 or P2 license (many enterprises already have this)
Azure Monitor diagnostic settings configured to export MicrosoftGraphActivityLogs
Routed to a Log Analytics workspace or storage account we can query

Important: Activity logs are queried via the Azure Monitor Query API — a different API surface from Microsoft Graph. This is not a simple Graph API extension; it requires a new authentication flow and SDK integration.

Many enterprise customers already have Azure Monitor configured for compliance. We should detect what's already there before asking them to create new settings (see "Logging Readiness Checks" below).

ServiceNow — Patchwork, mostly Tier 2-3

Automation Type	Evidence Available	Tier	Status
Flow Designer Flows	`sys_flow_context` — execution records with state	Tier 1	Integrated
Scheduled Jobs	`sys_trigger.last_action` — timestamp (no outcome)	Tier 2	Integrated
Business Rules (structured logging)	`sys_log` — syslog entries with script name + record ID	Tier 1 (opt-in)	Integrated
Business Rules (no logging)	Trigger record heuristic — "table has recent records"	Tier 2	Integrated
Business Rules (sys_audit)	`sys_audit` — record-level modifications	Tier 1	Phase 2 planned
Script Includes	No direct evidence — invoked by BRs/Flows	Tier 3	Inherits from caller
Outbound REST calls	`sys_outbound_http_log` — URL, method, timestamp, response	Tier 1	Phase 3 planned
Script Actions, Transform Scripts, Inactivity Monitors, SLA Escalations	None	Tier 3	Not planned

The trigger record heuristic problem: For BRs without structured logging, the evidence is "the table this BR triggers on had records created recently." For high-traffic tables (Incident, Task), this is tautological — the Incident table always has records. A CISO who understands ServiceNow will see through this immediately.

Outbound REST dependency: Requires the customer to enable glide.rest.outbound_log_level=elevated, which is not a default setting. Even when enabled, sys_outbound_http_log records the URL and method but not the authenticated identity on the destination side, so we still cannot complete the cross-system chain.

~60% of ServiceNow automation types have NO evidence integration — these are not edge cases. Script Actions, Transform Scripts, Inactivity Monitors, and SLA Escalations can represent hundreds of automated behaviors in a typical instance.

Azure Foundry — Tier 1 achievable via Runs API

Correction (verified against codebase + Microsoft API docs)

The initial assessment understated Foundry's capabilities. The Runs API provides rich per-execution data. The connector's foundry_client.py (lines 618-628) already documents this upgrade path as Gap G03/G04.

Evidence Type	Available?	Integrated?	Tier
Agent execution count (30d)	Yes — thread listing	Yes	Tier 2
Last run timestamp	Yes — latest thread timestamp	Yes	Tier 2
Per-run status and model	Yes — `GET /threads/{id}/runs`	No	Tier 1
Which tools the agent invoked	Yes — `tools[]` with function definitions	No	Tier 1
Actual tool calls with arguments	Yes — Run Steps API: `step.tool_calls[].function.name + arguments`	No	Tier 1
Token usage per run	Yes — `usage.prompt_tokens`, `usage.completion_tokens`	No	Tier 2
Outbound HTTP calls from agent	No — not in Runs API	N/A	Invisible
Logic App execution triggered by agent	Separate Azure service	N/A	Invisible

Correction on tool_calls location: For completed runs, tool call details are in the Run Steps endpoint (GET /threads/{id}/runs/{runId}/steps), not in required_action.submit_tool_outputs (which only appears during requires_action status). The feasibility review's initial JSON example showed the requires_action case, not the common completed case.

What remains invisible: We can see the agent called a tool named search_kb with specific arguments, but NOT the outbound HTTP calls that tool made. The tool-call boundary is the evidence ceiling.

Integration complexity (~2-3 sessions): Not trivial because of:

N+1 API fan-out — for each thread, call Runs endpoint. Active agents have hundreds of threads.
Two API versioning patterns — agents/v1.0/ (legacy) vs / (AI Foundry). See foundry_client.py lines 663-668.
Rate limiting — Foundry data-plane endpoints have different throttling from ARM. No backoff tuning exists.
Run status state machine — runs can be in_progress, requires_action, cancelling, expired, failed, not just completed.
Run Steps API is a separate endpoint from Runs API — needs its own call per run.
New transformer logic + tests.
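The fan-out and status handling described above can be sketched as follows. This is an illustration under stated assumptions, not the connector's code: `list_runs` and `list_steps` are hypothetical callables standing in for the Runs and Run Steps endpoints, and the step shape follows this document's `step.tool_calls[].function` notation, which may be nested differently in the real payload.

```python
from typing import Callable, Dict, Iterable, List


def collect_tool_calls(
    thread_ids: Iterable[str],
    list_runs: Callable[[str], List[dict]],
    list_steps: Callable[[str, str], List[dict]],
) -> List[Dict[str, str]]:
    """Return one evidence row per tool call made in a completed run."""
    evidence: List[Dict[str, str]] = []
    for thread_id in thread_ids:
        # One Runs call per thread: this is the N+1 fan-out.
        for run in list_runs(thread_id):
            # in_progress, requires_action, cancelling, expired and failed
            # runs are skipped; only completed runs count as evidence here.
            if run.get("status") != "completed":
                continue
            # Run Steps is a separate endpoint, so one more call per run.
            for step in list_steps(thread_id, run["id"]):
                for call in step.get("tool_calls", []):
                    function = call.get("function") or {}
                    evidence.append({
                        "thread_id": thread_id,
                        "run_id": run["id"],
                        "tool_name": function.get("name", ""),
                        "arguments": function.get("arguments", ""),
                    })
    return evidence


# Illustrative in-memory data standing in for the two endpoints.
_RUNS = {"t1": [{"id": "r1", "status": "completed"},
                {"id": "r2", "status": "failed"}]}
_STEPS = {("t1", "r1"): [
    {"tool_calls": [{"function": {"name": "search_kb", "arguments": '{"q": "vpn"}'}}]}
]}

tool_evidence = collect_tool_calls(
    ["t1"],
    list_runs=lambda t: _RUNS.get(t, []),
    list_steps=lambda t, r: _STEPS.get((t, r), []),
)
```

Rate limiting and backoff are deliberately left out; they would wrap the two callables rather than change this loop.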

Cross-System Correlation — Fundamentally Limited

ServiceNow BR fires → calls REST Message → hits Azure Function →
Azure Function calls Graph API → authenticates as SP → accesses data

Current state: Each connector sees its own fragment. No shared correlation IDs, no request/response matching, no end-to-end execution trace.

Cross-system hop	Evidence available?	Status
SN BR → SN REST Message call	Only if outbound logging enabled	Not integrated
SN REST Message → Azure Function	No correlation	Not feasible without both sides logging
Azure Function → Entra SP sign-in	SP sign-in log exists	Not integrated
Foundry Agent → Logic App → SN	No correlation mechanism	Not feasible today

Hard limit: Cross-system correlation requires either (a) injecting correlation headers into customer request flows (we can't — read-only) or (b) both sides independently logging with enough detail to match by timestamp + endpoint. Option (b) is possible but requires customer setup on both ends.

Workaround — Temporal Correlation Engine (see "Creative Solutions" below): Combine timestamps from independent sources. Not as strong as a correlation ID, but stronger than either signal alone.

Platform Architectural Gaps

Beyond connector evidence, the platform itself has structural limitations that would prevent execution-determined paths even with perfect Tier 1 evidence.

1. Path Materializer is Structural-Only

Verified in sv0-platform/src/ingestion/authority-path-materializer.ts (lines 103-253): the materializer performs pure graph traversal following structural edges (HAS_ROLE → GRANTS → APPLIES_TO). It never accesses execution evidence.

Impact: Even if we collect Tier 1 evidence, the materializer produces the same paths — it doesn't filter or prioritize by observed execution. "Only show what was exercised" requires a new mode that starts from evidence and works backward to construct observed paths.

2. Evidence is Workload-Level, Not Path-Level

Verified in sv0-platform/src/domain/evidence/types.ts (lines 16-36): ExecutionEvidenceDoc has entity_id (the workload) and target_resource (free-text string) but NO destination_id foreign key.

The path-evaluator.ts line 9 explicitly acknowledges this:

// NOTE: execution_30d and last_execution_at are workload-level. Evidence records (ExecutionEvidenceDoc) have entity_id but NOT destination_id, so path-specific attribution is not possible with the current data model.

Verified in authority-path-materializer.ts computeCurrentState() (lines 209-253): evidence counts are summed across the workload's entity_id plus all RUNS_AS targets. Every AuthorityPathDoc for that workload gets the same execution_30d value.

Impact: If a workload has 3 authority paths to 3 different destinations, all 3 get the same evidence grade. We can prove the workload ran — not which path it took. A CISO asking "prove this specific path to Finance data was exercised" will see through this immediately.

3. `destination_id` Migration is Non-Trivial

The target_resource field is a free-text string (e.g., "Microsoft Graph", "Azure Key Vault"). Mapping this to a destination_id entity FK requires a deterministic resolution strategy. "Microsoft Graph" must resolve to a specific entity ID — but what about "Azure Key Vault" vs "Azure Key Vault (contoso-vault)"? What about historical records that need backfill?

Non-trivial migration (~3-4 sessions) because:

Schema migration on existing records
Deterministic target_resource → destination_id mapping strategy
Updating ALL connector transformers to populate the FK going forward
Backfill strategy for historical records (potentially non-deterministic)
Tests
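One possible shape for the deterministic mapping step, as a sketch only. The resolver name, the catalog structure, and every entity ID below are hypothetical. The design assumption is that an unknown or ambiguous name resolves to nothing, so the resolver never guesses and ambiguous historical records stay flagged for review.

```python
from typing import Dict, List, Optional


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so lookups are case-insensitive."""
    return " ".join(name.lower().split())


def resolve_destination_id(
    target_resource: str,
    catalog: Dict[str, List[str]],
) -> Optional[str]:
    """Map a free-text target_resource to exactly one entity ID, or None.

    catalog maps a normalized display name to every entity ID carrying that
    name. Returning None for unknown or ambiguous names keeps the mapping
    deterministic: the resolver never guesses.
    """
    candidates = catalog.get(normalize(target_resource), [])
    return candidates[0] if len(candidates) == 1 else None


# Hypothetical catalog; the entity IDs are made up for illustration.
CATALOG = {
    "microsoft graph": ["dest-graph"],
    "azure key vault": ["dest-kv-contoso", "dest-kv-fabrikam"],
    "azure key vault (contoso-vault)": ["dest-kv-contoso"],
}

graph_id = resolve_destination_id("Microsoft Graph", CATALOG)
ambiguous = resolve_destination_id("Azure Key Vault", CATALOG)
specific = resolve_destination_id("Azure Key Vault (contoso-vault)", CATALOG)
```

Here the bare "Azure Key Vault" string stays unresolved because two vaults share that name, which is the backfill case the list above calls potentially non-deterministic.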

What This Means for the Grade Model

Grade A/B/C labeling works TODAY at workload level. We can say: "This workload has Tier 1 evidence — Grade A." This is useful but imprecise.

Path-specific grading ("this specific route through Identity Y to Destination Z was exercised") requires both the destination_id migration and an execution-aware materializer. Until then, Grade A must honestly say "Workload execution confirmed" — not "Path execution confirmed."
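As a sketch of what path-level attribution could look like once both pieces exist: start from evidence records and keep only the structural paths they match. The `Path` and `Evidence` classes are simplified stand-ins, not the platform's AuthorityPathDoc or ExecutionEvidenceDoc, and the `destination_id` field on evidence is the hypothetical foreign key that does not exist today.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Path:
    workload_id: str
    identity_id: str      # identity the workload runs as
    destination_id: str


@dataclass(frozen=True)
class Evidence:
    entity_id: str        # workload or RUNS_AS identity that produced it
    destination_id: str   # hypothetical FK, not in the current data model


def observed_paths(paths: Iterable[Path], evidence: Iterable[Evidence]) -> List[Path]:
    """Keep only paths whose workload or identity has evidence for that destination."""
    seen = {(e.entity_id, e.destination_id) for e in evidence}
    return [
        p for p in paths
        if (p.workload_id, p.destination_id) in seen
        or (p.identity_id, p.destination_id) in seen
    ]


# One workload with three structural paths; evidence covers only two of them.
all_paths = [
    Path("wl-1", "sp-1", "dest-finance"),
    Path("wl-1", "sp-1", "dest-hr"),
    Path("wl-1", "sp-1", "dest-graph"),
]
records = [Evidence("wl-1", "dest-finance"), Evidence("sp-1", "dest-graph")]
observed = observed_paths(all_paths, records)
```

With today's workload-level model all three paths would share one grade; in this sketch the path to `dest-hr` drops out because no evidence names that destination.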

The Honest Assessment

What we CAN show today

Scenario	Evidence quality	Honest label
Flow Designer executed	Tier 1 — deterministic execution record	"Workload execution confirmed"
BR with structured syslog executed	Tier 1 — deterministic log entry	"Workload execution confirmed"
SP authenticated to a resource	Tier 1 — sign-in log (integrated, with pagination bug)	"Authentication confirmed"
Scheduled job last ran at time T	Tier 2 — timestamp only	"Last scheduled run"
BR probably ran (trigger table has records)	Tier 2 — circumstantial	"Execution inferred"
Foundry agent ran X times	Tier 2 — aggregate count	"Agent activity detected"

What becomes available with focused integration work

Scenario	Evidence quality	Priority	Honest label
Foundry agent called specific tools	Tier 1 — tool name + arguments	Immediate	"Tool invocation confirmed"
BR modified specific records (sys_audit)	Tier 1 — record-level audit trail	Near-term	"Record modification confirmed"
Outbound REST call hit specific URL	Tier 1 — URL, method, response code	Near-term	"Outbound call confirmed" (customer enables logging)
SP called specific Graph API endpoint	Tier 1 — requestUri + scopes	Medium-term	"API call confirmed" (customer enables activity logs)

What is fundamentally impossible (read-only, no-agent)

Scenario	Why	Workaround
Cross-system request correlation (end-to-end trace)	No shared correlation IDs. Cannot inject headers.	Temporal correlation heuristics
What a BR script did internally (GlideRecord calls)	Script execution is opaque	ServiceNow scoped app (future)
What a Foundry tool did after being invoked	Tool-call boundary is the evidence ceiling	Azure Function evidence proxy (future)
Runtime token scopes at call time	Would require token interception	Activity logs provide token roles/scopes (customer setup)

Risk Assessment

Risk	Severity	Impact	Mitigation
Claiming "execution-determined" but delivering Tier 2/3	Critical	CISOs will test claims. "You said observed, but this is a heuristic."	Use evidence grades. Never say "observed" for Tier 2/3 paths.
Evidence is workload-level, not path-level	High	All paths from a workload get same grade. "Prove THIS path" fails.	Label as "Workload execution confirmed." Build `destination_id` in the medium term (next quarter).
Path materializer ignores evidence	High	"Observed" paths are structurally derived, not evidence-derived.	Build execution-aware materializer mode in the medium term (next quarter).
Trigger-record heuristic is tautological	High	"Incident table has records" is not evidence a BR ran.	Replace with sys_audit (Phase 2). Mark trigger-record as Grade B with caveat.
Outbound REST requires customer action	High	Most customers won't have elevated logging on.	Detect and surface as blind-spot finding. Provide setup script.
Graph activity logs require customer setup	Medium	Tier 1 API-call evidence depends on Entra P1/P2 + Azure Monitor.	Detect existing Azure Monitor config. Provide setup wizard.
Sign-in API pagination bug	Medium	Silently drops data for active SPs (>100 sign-ins in 30d).	Fix pagination — quick fix, ~1 session.
Cross-system correlation not feasible	Medium	No end-to-end execution lineage proof.	Temporal correlation engine (partial mitigation).
~60% of SN automation types have no evidence	Medium	Large evidence gaps in inventory.	Surface as blind-spot findings. ServiceNow scoped app (future).

Recommendation: Evidence-Graded Authority Paths

Do not claim "execution-determined." Instead, build an honest tiered model that turns our evidence gaps into a feature.

The Grade Model

Grade	Label	What it means	Example
A	"Execution confirmed"	Tier 1 deterministic evidence exists	Flow execution record, syslog entry, Foundry run with tool calls
A-	"Execution highly probable"	Multiple Tier 2 signals from different systems, temporally correlated	SN trigger record + Entra sign-in within 5s window on connected path
B	"Execution inferred"	Single Tier 2 signal exists	Trigger records active, scheduled job timestamp, aggregate agent count
C	"Authority exists, unconfirmed"	Tier 3 configuration-only	BR is active with REST Message configured, SP has roles, no execution proof

Important caveat: Until destination_id is implemented, Grades A through B apply at the workload level. All paths from the same workload share the same grade. The label must honestly reflect this.
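The grade table can be read as a small, ordered rule set. The sketch below is one way to express it, assuming each signal is reduced to a pair of evidence tier and source system; the function name `grade` and the `Signal` alias are hypothetical, and the grade applies at workload level until `destination_id` exists.

```python
from typing import Iterable, Tuple

Signal = Tuple[int, str]  # (evidence tier 1-3, source system name)


def grade(signals: Iterable[Signal], temporally_correlated: bool = False) -> str:
    """Apply the grade table top-down: the strongest matching rule wins."""
    signals = list(signals)
    # Grade A: any Tier 1 deterministic evidence.
    if any(tier == 1 for tier, _ in signals):
        return "A"
    tier2_systems = {system for tier, system in signals if tier == 2}
    # Grade A-: Tier 2 signals from different systems, temporally correlated.
    if len(tier2_systems) >= 2 and temporally_correlated:
        return "A-"
    # Grade B: at least one Tier 2 signal.
    if tier2_systems:
        return "B"
    # Grade C: configuration-only (Tier 3) or nothing at all.
    return "C"


confirmed = grade([(1, "servicenow"), (3, "entra")])
correlated = grade([(2, "servicenow"), (2, "foundry")], temporally_correlated=True)
inferred = grade([(2, "servicenow"), (2, "foundry")])
unconfirmed = grade([(3, "entra")])
```

Because the rules are ordered and have no weights, the same inputs always produce the same grade, which keeps the result auditable.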

How this maps to the Primer's vision

Primer concept	How we deliver it
"Only show what was exercised" (default)	Default view shows Grade A/A- paths. Grade B with "inferred" label. Grade C in "potential" view.
"Observed Authority Path"	Grade A/A- paths
"Potential Authority Path"	Grade C paths + latent roles on Grade A/B paths
"Standing authority / risk posture"	Delta between Grade A (observed) and Grade C (potential)

Why this is a stronger market position

Honest — we never claim execution proof where we only have configuration evidence.
Progressive — as evidence sources are enabled, paths upgrade from C → B → A. Measurable improvement.
Differentiated — no competitor grades evidence quality. Astrix, Entro, Token Security, Veza — they all show paths flat.
CISO-friendly — "14 Grade-A paths are actively executing against Finance data" is more credible than "142 authority paths exist."
Turns gaps into sales — "You have 90 paths where you can't even tell if they're being used. That's your governance gap."

Creative Solutions — New Ideas from Dual Review

1. Evidence Blind Spots as First-Class Findings (P0)

Turn every evidence gap into a governance finding the customer must address:

New finding type: evidence_blind_spot

"47 authority paths have evidence blind spots — authority that exists but cannot be confirmed or denied."
Per-path detail: "This path has NO execution evidence because: (1) Outbound REST logging is disabled in ServiceNow, (2) Graph Activity Logs are not exported from Azure Monitor."
Each blind spot includes a specific remediation: "Enable glide.rest.outbound_log_level = elevated to upgrade 23 authority paths from Grade C to Grade A."

Why this is powerful: No competitor tells you WHERE your evidence gaps are. They either show everything flat or hide the gaps. We surface the gap itself as a finding, turning our limitation into the customer's problem to solve — and giving them a clear path to fix it.

Scope: New finding type + evaluation rule + UX. ~1-2 sessions.
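A rough sketch of how such findings might be derived, assuming each path record carries its grade and the evidence sources that would upgrade it. The keys `grade` and `missing_sources`, the source identifiers, and the function name are all hypothetical; only the `evidence_blind_spot` type name comes from the text above.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set


def blind_spot_findings(
    paths: Iterable[dict],
    disabled_sources: Set[str],
    remediation: Dict[str, str],
) -> List[dict]:
    """Emit one evidence_blind_spot finding per disabled evidence source."""
    affected: Counter = Counter()
    for path in paths:
        if path["grade"] != "C":
            continue  # only configuration-only paths are blind spots
        for source in set(path["missing_sources"]) & disabled_sources:
            affected[source] += 1
    return [
        {
            "type": "evidence_blind_spot",
            "source": source,
            "paths_affected": count,
            "remediation": remediation.get(source, "No remediation documented"),
        }
        for source, count in sorted(affected.items())
    ]


# Illustrative inputs; identifiers are made up.
example_paths = [
    {"id": "p1", "grade": "C", "missing_sources": ["sn_outbound_log", "graph_activity_logs"]},
    {"id": "p2", "grade": "C", "missing_sources": ["sn_outbound_log"]},
    {"id": "p3", "grade": "A", "missing_sources": ["sn_outbound_log"]},
]
findings = blind_spot_findings(
    example_paths,
    disabled_sources={"sn_outbound_log"},
    remediation={"sn_outbound_log": "Enable glide.rest.outbound_log_level = elevated"},
)
```

Counting per source rather than per path is what allows a message of the form "enable X to upgrade N authority paths".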

2. Logging Readiness Checks (P0)

Automatically detect what logging is and isn't enabled before a customer discovers gaps in a demo:

Check	How
ServiceNow outbound logging	Query `sys_properties` for `glide.rest.outbound_log_level`
Azure diagnostic settings	`GET /providers/microsoft.aadiam/diagnosticSettings` (we already hold ARM tokens via `foundry_client._arm_token()`)
Foundry run-level access	Surface 401/403 from thread listing as a typed finding (already handled at `foundry_client.py` line 673)
Entra sign-in log access	Check permission availability

Surface in a "Connector Health" dashboard: "ServiceNow: outbound logging OFF (23 paths affected) | Entra: sign-in logs ON, activity logs OFF (8 SPs affected)"

Scope: ~1 session per check, or all four in ~2 sessions.
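As an illustration of the first check and the dashboard line, the sketch below interprets rows returned from a `sys_properties` query and renders a health summary. Both function names are hypothetical, and the row shape (`name` and `value` keys) is an assumption about how the query result is decoded.

```python
from typing import Iterable, List, Tuple


def outbound_logging_enabled(sys_properties_rows: Iterable[dict]) -> bool:
    """True when glide.rest.outbound_log_level is set to 'elevated'."""
    for row in sys_properties_rows:
        if row.get("name") == "glide.rest.outbound_log_level":
            return str(row.get("value", "")).strip().lower() == "elevated"
    return False  # property absent: treat as not enabled


def health_line(connector: str, checks: List[Tuple[str, bool, int]]) -> str:
    """Render checks as 'name ON' or 'name OFF (N paths affected)'."""
    parts = [
        f"{name} ON" if ok else f"{name} OFF ({affected} paths affected)"
        for name, ok, affected in checks
    ]
    return f"{connector}: " + ", ".join(parts)


# Illustrative input: the property exists but is not set to elevated.
rows = [{"name": "glide.rest.outbound_log_level", "value": "basic"}]
enabled = outbound_logging_enabled(rows)
line = health_line("ServiceNow", [("outbound logging", enabled, 23)])
```

Treating a missing property as "not enabled" is the conservative choice: the check reports a blind spot unless it can positively confirm logging.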

3. Pre-Built Setup Scripts (P1)

Ship configuration scripts customers can run:

enable-entra-activity-logs.ps1 — creates Azure Monitor diagnostic setting for MicrosoftGraphActivityLogs
enable-sn-outbound-logging.js — ServiceNow fix script for glide.rest.outbound_log_level = elevated
grant-foundry-access.ps1 — assigns Azure AI User role

Each script has a "check" mode. The platform runs checks automatically and links to the relevant script from each blind-spot finding.

Scope: ~1 session.

4. Logging Maturity Score (P1)

Per-connector scoring with "how to upgrade" guidance:

Level	What's enabled	Score
0 — Blind	No execution evidence sources	0/5
1 — Basic	SP authentication logs (sign-ins)	1/5
2 — Intermediate	+ Flow execution records + scheduled job timestamps	2/5
3 — Good	+ Outbound HTTP logging + sys_audit records	3/5
4 — Strong	+ Cross-system temporal correlation enabled	4/5
5 — Complete	+ Graph Activity Logs with per-API-call detail	5/5

This turns evidence improvement into a remediation workflow CISOs already understand. "Your ServiceNow connector is at Level 2. Enable outbound logging to reach Level 3 and upgrade 23 paths."
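Since each level in the table adds to the previous one, the score can be computed cumulatively. A minimal sketch follows; the source identifiers are hypothetical labels for the rows of the table, not existing configuration keys.

```python
from typing import List, Set, Tuple

# Each level lists the sources it adds on top of the previous level.
LEVELS: List[Tuple[int, Set[str]]] = [
    (1, {"sign_in_logs"}),
    (2, {"flow_execution_records", "scheduled_job_timestamps"}),
    (3, {"outbound_http_logging", "sys_audit"}),
    (4, {"temporal_correlation"}),
    (5, {"graph_activity_logs"}),
]


def maturity_level(enabled: Set[str]) -> int:
    """Return the highest level whose requirements, and all below it, are met."""
    level = 0
    for number, required in LEVELS:
        if not required <= enabled:
            break  # levels are cumulative, so stop at the first gap
        level = number
    return level


level_two = maturity_level(
    {"sign_in_logs", "flow_execution_records", "scheduled_job_timestamps"}
)
# Activity logs alone do not skip the missing Level 3 sources.
still_two = maturity_level(
    {"sign_in_logs", "flow_execution_records", "scheduled_job_timestamps",
     "graph_activity_logs"}
)
```

Stopping at the first gap means the score always points at the next thing to enable, which is what the "how to upgrade" guidance needs.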

5. Temporal Correlation Engine (P1)

Combine independent Tier 2 signals into higher-confidence composite evidence:

Rule: If evidence from two different systems falls within a configurable window (e.g., 5 seconds) AND the systems are connected by known edges → score as Grade A- ("multi-source correlated").
Example: SN trigger record at T + Entra SP sign-in at T+3s on a connected authority path = probable execution.
Cron pattern: sys_trigger.run_period = every 15 minutes + SP authenticates at :00, :15, :30, :45 = near-deterministic.

This is NOT probabilistic scoring — it's a deterministic rule. "If evidence A and B exist within window W and share a connected path, composite grade = X." Explicit, repeatable, auditable. Compliant with the deterministic design constraint.

Scope: Post-ingestion enrichment step. ~2-3 sessions.
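The rule above can be sketched as a deterministic pairing pass over timestamped signals. The `Signal` class, the function name, and the sample IDs are hypothetical; the 5-second default mirrors the example window in the text and would be configurable.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Signal:
    system: str       # e.g. "servicenow", "entra"
    entity_id: str    # graph node the evidence is attached to
    at: datetime


def correlated_pairs(
    signals: Iterable[Signal],
    connected: Set[Tuple[str, str]],
    window: timedelta = timedelta(seconds=5),
) -> List[Tuple[Signal, Signal]]:
    """Pairs from different systems, on a known edge, within the window."""
    ordered = sorted(signals, key=lambda s: s.at)
    pairs: List[Tuple[Signal, Signal]] = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second.at - first.at > window:
                break  # sorted by time, so later signals are further away
            if first.system == second.system:
                continue
            edge = (first.entity_id, second.entity_id)
            if edge in connected or edge[::-1] in connected:
                pairs.append((first, second))
    return pairs


t0 = datetime(2026, 2, 27, 10, 42, 0)
sample = [
    Signal("servicenow", "br-1", t0),
    Signal("entra", "sp-1", t0 + timedelta(seconds=3)),
    Signal("entra", "sp-1", t0 + timedelta(seconds=60)),
]
matches = correlated_pairs(sample, connected={("br-1", "sp-1")})
```

Each returned pair is a candidate for Grade A-. No probabilities are involved: a pair either satisfies all three conditions or it does not.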

6. ServiceNow Scoped App — "Evidence Collector" (W2, P2)

A lightweight scoped app deployed to customer instances:

Custom table x_sv0_exec_log — records BR execution: sys_id, timestamp, trigger record, target table, run_as user, duration_ms.
Outbound REST interceptor — wraps sn_ws.RESTMessageV2 calls, logs destination URL, method, response code, calling BR/SI.
REST API endpoint — GET /api/x_sv0/evidence/latest?since={timestamp} for single-call evidence pull.

Solves the fundamental problem: ServiceNow's native logging is scattered across 5+ tables. This unifies evidence collection into a purpose-built table. Distributed via ServiceNow Store or as an Update Set.

7. Observability Stack Integration (W2-W3, P3)

Many enterprise customers already forward logs to Datadog, Splunk, or Azure Log Analytics:

Azure Log Analytics: If customers have diagnostic settings configured (many do for compliance), query their existing workspace via the Azure Monitor Query API — no new logging setup required, just a read permission grant.
Splunk: Query for Azure AD sign-in logs already forwarded: index=azure_ad sourcetype=SignInLogs.
Datadog APM: Pull traces where the caller is an NHI.

Key insight: The review frames Graph Activity Logs as "customer must set up Azure Monitor." But many enterprise customers already have Azure Monitor configured. Detect what's already there before asking them to create new settings.

8. OpenTelemetry as the Correlation Protocol (W3, P3)

OTel traces map directly to authority path executions:

Trace = authority path execution instance
Span = one hop (BR fires → REST call → SP authenticates → API called)
Define SecurityV0-specific semantic conventions: sv0.workload.id, sv0.identity.id, sv0.destination.id

If customers instrument their Azure Functions or Logic Apps with OTel (increasingly common), SecurityV0 consumes traces and automatically correlates with discovered authority paths. Solves cross-system correlation without a proprietary agent — we piggyback on observability infrastructure the customer is already deploying.

What to Build — Priority Roadmap

Sized in Claude Code sessions (~1 session = one focused agentic build cycle).

Immediate (current sprint)

Action	Sessions	Impact
Fix sign-in log pagination bug (`azure_client.py` line 180)	~1	Data completeness for active SPs
Evidence blind spots as findings	~1-2	Turns evidence gaps into selling points
Logging readiness checks (all connectors)	~2	Prevents demo embarrassment
Evidence grade labels (A/B/C) in UX	~2-3	Grade computation, API changes, UX rendering, tests

Near-term (next 2 sprints)

Action	Sessions	Impact
Foundry Runs + Run Steps API integration	~2-3	Upgrades Foundry paths from B → A
Pre-built setup scripts	~1	Reduces customer friction
Logging maturity score	~1	Visual progress + remediation workflow
Phase 2: sys_audit for BRs	~2-3	Upgrades BR paths from B/C → A
Temporal correlation engine	~2-3	Multi-source Tier 2 → Grade A-
Phase 3: sys_outbound_http_log	~1-2	Cross-system call proof (if customer enables)

Medium-term (next quarter)

Action	Sessions	Impact
`destination_id` on ExecutionEvidenceDoc	~3-4	Enables path-level evidence attribution
Execution-aware path materializer	~3-5	"Only show what was exercised" becomes real
Azure Monitor Query API integration (activity logs)	~2-3	Tier 1 API-call evidence for Entra (customer setup)
Evidence completeness bar in path detail UX	~1-2	Per-path breakdown of what evidence exists vs missing

Future (W2-W3)

Action	Dependency	Impact
ServiceNow scoped app	Customer deployment required	Permanent solution for BR evidence gaps
Observability stack integration	Customer has Datadog/Splunk/LogAnalytics	Leverage existing infrastructure
OpenTelemetry conventions	OTel adoption in customer apps	Cross-system correlation without proprietary agents

6-Month Evidence Trajectory

Well-configured customer (Entra P2 + outbound logging + Foundry access + scoped app)

Timeline	Grade A	Grade A-	Grade B	Grade C	Key milestone
Today	~15%	—	~25%	~60%	Flows + syslog BRs only
Month 1	~20%	—	~25%	~55%	+ blind spot findings + pagination fix + evidence grades
Month 2	~30%	~5%	~20%	~45%	+ Foundry Runs API + temporal correlation
Month 3	~40%	~5%	~15%	~40%	+ sys_audit + `destination_id` + materializer
Month 6	~60%	~10%	~10%	~20%	+ scoped app + activity logs + observability

Minimal-configuration customer (no P2, no outbound logging, no scoped app)

Timeline	Grade A	Grade A-	Grade B	Grade C	Key milestone
Today	~15%	—	~25%	~60%	Same baseline
Month 6	~25%	~5%	~25%	~45%	Foundry Runs + temporal correlation only

The difference between these two tables IS the value proposition of the logging maturity score and the blind-spot findings: "Here's what you're missing, and here's exactly how to get it."

What to Tell the CEO

The Primer's vision is the right aspiration. It is not fully deliverable today. Here is the honest story:

"SecurityV0 shows authority paths graded by evidence quality. Grade A paths have confirmed execution. Grade B paths have strong inference. Grade C paths have standing authority but no execution proof.

Our differentiator is that we grade the evidence. Everyone else shows all paths equally. We show which ones we can prove — and critically, we show where the evidence gaps are. Those gaps are themselves a finding: 'You have 90 paths where you can't even tell if they're being used.'

With customer-assisted evidence (enabling the right logging), we can reach ~70% Grade A/A- (~60% Grade A, ~10% Grade A-) within 6 months. The remaining ~30% is honestly labeled. That honesty is stronger than claiming full visibility and having a CISO find a gap."

What to Tell Design Partners

Quiron (regulatory risk, high identity depth):

"We show deterministic execution evidence for Flows and signed-in identities today. For Business Rules, we have strong inference now and deterministic proof (sys_audit) coming next sprint. We'll be transparent about evidence quality on every path — and we'll show you exactly where your evidence blind spots are."

TPx (operational overload, "show me what I don't know"):

"We show what actually executed, what we can infer, and crucially — what exists but we can't confirm ever ran. That gap is your governance blind spot. We also show you exactly what to enable to close the gap."

Deloitte (explainability, moving AI to prod):

"Every authority path has an evidence grade. Grade A is deterministic — court-admissible. Grade B is strong inference. We never claim certainty where we don't have it. The evidence grade itself is auditable — we can show exactly what signals produced it."

Sources

Authority Paths Primer (Notion, synced 2026-02-27)
Independent Codex model review (cross-review, 2026-02-27)
Pessimist architect / Optimist solutions thinker dual review (2026-02-27)
Codebase verification:
- Sign-in integration: sv0-connectors/.../core/transformer.py line 1919 (_add_sign_in_node)
- Pagination bug: sv0-connectors/.../adapters/azure_client.py line 180 (no odata_next_link)
- SP pagination (correct): azure_client.py lines 71-76
- Evidence types: sv0-platform/src/domain/evidence/types.ts lines 14-36
- Path materializer: sv0-platform/src/ingestion/authority-path-materializer.ts lines 103-253
- Path evaluator NOTE: sv0-platform/src/evaluator/path-evaluator.ts lines 9-12
- Foundry upgrade path: sv0-connectors/.../adapters/foundry_client.py lines 618-628
- Foundry thread listing: foundry_client.py lines 601-714
Microsoft Graph activity logs: Azure Monitor diagnostic settings documentation
Microsoft Assistants API: Runs endpoint + Run Steps endpoint
ETL Pipeline Strengthening Plan: docs/analysis/2026-02-20-etl-pipeline-strengthening-plan.md
NHI competitive landscape: Astrix, Entro, Token Security, Saviynt, Veza, Permiso
OpenTelemetry Tracing Specification
