Evidence Classification Model -- Research Brief
Executive Summary
Sergey's feedback on the March 2026 sprint review identified a fundamental gap: when SecurityV0 surfaces a finding, users cannot tell whether the claim is proven from observed execution logs or inferred from the structure of the identity graph. Today, the platform has the raw machinery -- EvidenceConfidenceLevel on execution evidence records, EvidenceCompletenessSection on findings, and structured evidence_refs -- but these are implementation details invisible to the user. No single field on a finding answers: "Is this proven?"
This research proposes an EvidenceClassification enum with five values (observed_execution, observed_absence, correlated_pattern, structural_authority, inferred_capability) and an EvidenceClaim type that wraps every finding in a human-readable claim statement, a classification, a runtime confidence overlay, and per-section confidence metadata. Each of the 14 rule files (producing 15 finding types) is analyzed and assigned a classification based on what data it actually checks.
The impact is threefold: (1) every finding in the UI can display a trust indicator ("Proven" vs. "Inferred"), (2) API consumers can filter and sort by evidence class, and (3) the platform satisfies the four-part requirement: what is proven, why it matters, what to do first, and who owns it. The classification is deterministic -- it is a combination of the rule's static data access pattern and the runtime confidence of the underlying evidence records, not a probabilistic model.
Note: Evidence classification alone does not solve the scanability problem identified by Sergey (too many clicks to understand a finding). See #217 (Wiz UX research) and #224 (expand default-visible info) for the complementary efforts needed to surface the "why/action/owner" payload without requiring detail-view clicks.
Current State Analysis
Existing Infrastructure
The platform already tracks evidence quality at multiple layers, but they are disconnected:
Layer 1: Execution Evidence Records (ExecutionEvidenceDoc in src/domain/evidence/types.ts)
- confidence?: EvidenceConfidenceLevel -- set by the connector, values: DETERMINISTIC, TEMPORAL_INFERRED, STRUCTURAL
- proof_notes?: string -- human-readable explanation of what this evidence does and does not prove
- Populated during ingestion via resolveConfidence() in src/ingestion/graph-transformer.ts
Layer 2: Evidence Completeness (EvidenceCompletenessSection in src/domain/evidence-packs/types.ts)
- Six categories: current_roles, role_history, execution_evidence, ownership_records, approval_records, credential_state
- Each has an EvidenceAvailability value: available, unavailable_not_enabled, unavailable_no_access, unavailable_not_applicable, partial
- Free-text notes per category
- Populated by each evaluator rule via defaultEvidenceCompleteness()
Layer 3: Finding-Level References (FindingDoc in src/domain/findings/types.ts)
- evidence_refs: Record<string, unknown> -- rule-specific structured data (counts, IDs, timestamps)
- deterministic_explanation: string -- human-readable but not structured as a claim
- evidence_completeness: EvidenceCompletenessSection -- completeness, not classification
What is missing:
- No field says whether a finding is based on observed execution logs vs. structural graph analysis
- No claim statement that a non-technical stakeholder can read
- No way to filter findings by "how sure are we?"
- The EvidenceConfidenceLevel on individual evidence records never propagates to the finding level
Per-Rule Evidence Analysis
1. dormant_authority (dormant-authority.ts)
What it checks: Queries execution evidence records for the entity and its RUNS_AS targets. Compares the most recent source_timestamp against a 90-day threshold. Also checks the connector-computed last_observed_execution_timestamp property.
Evidence basis: Observed execution log timestamps (or absence thereof). The rule queries real ExecutionEvidenceDoc records and makes a temporal determination.
evidence_completeness: Sets execution_evidence: "available" with a note about the age of the most recent record or the absence of records.
evidence_refs: execution_path_count, last_evidence_timestamp, threshold_days
Classification: observed_absence -- The finding fires when execution evidence is absent or stale. Although the rule queries real ExecutionEvidenceDoc records, the claim is about absence of recent activity, not about observed activity. Labelling this observed_execution ("Proven from execution logs") would mislead a user who sees "no logs found." The observed_absence classification makes the claim honest: "We looked and confirmed nothing recent exists."
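The threshold comparison described for dormant_authority can be sketched as follows. This is a minimal illustration, not the rule's actual code: checkDormancy and DormancyCheck are hypothetical names, the timestamps array stands in for the source_timestamp values of the queried evidence records, and treating "strictly older than the threshold" as dormant is an assumption.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical result shape; field names mirror the evidence_refs listed above.
interface DormancyCheck {
  dormant: boolean;
  last_evidence_timestamp: Date | null;
  days_since_activity: number | null;
  threshold_days: number;
}

// `timestamps` stands in for the source_timestamp values of the evidence
// records found for the entity and its RUNS_AS targets.
function checkDormancy(timestamps: Date[], now: Date, thresholdDays = 90): DormancyCheck {
  if (timestamps.length === 0) {
    // No records at all: the query itself confirms the absence.
    return {
      dormant: true,
      last_evidence_timestamp: null,
      days_since_activity: null,
      threshold_days: thresholdDays,
    };
  }
  const latestMs = Math.max(...timestamps.map((t) => t.getTime()));
  const days = Math.floor((now.getTime() - latestMs) / MS_PER_DAY);
  return {
    // Assumption: strictly older than the threshold counts as dormant.
    dormant: days > thresholdDays,
    last_evidence_timestamp: new Date(latestMs),
    days_since_activity: days,
    threshold_days: thresholdDays,
  };
}
```

Passing the current time in as a parameter keeps the check deterministic, which matches the document's framing of the classification as a deterministic derivation.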
2. external_egress (external-egress.ts)
What it checks: Reads entity.properties.egress_category and checks for the value "external". No execution evidence is queried.
Evidence basis: A connector-assigned property on the entity. The connector determines egress category from integration metadata (e.g., HTTP connector target URL classification).
evidence_completeness: Sets current_roles: "available" only.
evidence_refs: egress_category, execution_path_count
Classification: structural_authority -- The finding is based on a structural property of the entity in the graph. The egress classification comes from connector metadata, not from observed runtime data.
3. llm_egress (llm-egress.ts)
What it checks: Reads entity.properties.egress_category and checks for the value "llm". Identical pattern to external_egress.
Evidence basis: Connector-assigned entity property.
evidence_completeness: Sets current_roles: "available" only.
evidence_refs: egress_category, execution_path_count
Classification: structural_authority -- Same as external_egress. The LLM endpoint classification is a structural property from the connector.
4. orphaned_ownership (orphaned-ownership.ts)
What it checks: Reads OWNED_BY and CREATED_BY relationships. Fetches owner entities and checks their status property against a list of non-active statuses (deleted, departed, disabled, disbanded, restructured, expired). Also produces ownership_degraded findings when primary owners are non-active but secondary/inherited owners remain.
Evidence basis: Graph structure (relationships) plus entity status properties. No execution evidence, no version history.
evidence_completeness: Sets current_roles: "available", ownership_records: "available".
evidence_refs: relationships (filtered OWNED_BY/CREATED_BY), execution_path_count
Classification: structural_authority -- Based entirely on the current graph structure: who owns what and whether those owners are active. The status values come from the source system via the connector.
5. ownership_degraded (emitted by orphaned-ownership.ts)
What it checks: Same rule as orphaned_ownership -- fires the ownership_degraded variant when primary owners are non-active but secondary/inherited owners are still active.
Evidence basis: Same as orphaned_ownership.
Classification: structural_authority -- Same rationale.
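The split between orphaned_ownership and ownership_degraded might look roughly like the sketch below. The OwnerLike shape, the classifyOwnership name, and the exact branching are assumptions made for illustration; only the non-active status list and the "primary owners non-active, secondary/inherited owners remain" condition come from the descriptions above.

```typescript
// Non-active statuses as listed for the orphaned_ownership rule.
const NON_ACTIVE_STATUSES = new Set([
  "deleted", "departed", "disabled", "disbanded", "restructured", "expired",
]);

// Hypothetical owner shape; the real owner entities carry more properties.
interface OwnerLike {
  id: string;
  status: string;
  ownership_level: "primary" | "secondary" | "inherited";
}

type OwnershipVerdict = "orphaned_ownership" | "ownership_degraded" | null;

function classifyOwnership(owners: OwnerLike[]): OwnershipVerdict {
  // Assumption: entities with no owners at all are left to ownership_unknown.
  if (owners.length === 0) return null;

  const active = owners.filter((o) => !NON_ACTIVE_STATUSES.has(o.status));
  // Assumption: "orphaned" means no active owner of any level remains.
  if (active.length === 0) return "orphaned_ownership";

  const primaries = owners.filter((o) => o.ownership_level === "primary");
  const primariesGone =
    primaries.length > 0 && primaries.every((o) => NON_ACTIVE_STATUSES.has(o.status));
  // Primary owners are non-active, but a secondary/inherited owner is still active.
  return primariesGone ? "ownership_degraded" : null;
}
```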
6. ownership_ambiguous (ownership-ambiguous.ts)
What it checks: Reads OWNED_BY relationships and fetches owner entities. Checks whether all owners are group/team entities (by owner_type or entity_type). Also checks version history to rule out cases where an individual owner previously existed.
Evidence basis: Graph structure (current relationships + entity types) and version history to confirm this is not a degradation.
evidence_completeness: Sets ownership_records: "available".
evidence_refs: owner_ids, owner_types, version_count_checked
Classification: structural_authority -- Based on entity types and relationship structure. The version history check is used to differentiate from ownership_degraded, but the finding itself is about the current structural state.
7. ownership_unknown (ownership-unknown.ts)
What it checks: Verifies that the entity has execution paths (it matters who owns it) but has no OWNED_BY relationships, no CREATED_BY relationships, and no ownership-indicating properties (sys_created_by, created_by, owner, managed_by).
Evidence basis: Absence of metadata -- a metadata quality signal.
evidence_completeness: Sets ownership_records: "unavailable_no_access" with note "No ownership metadata available from source systems".
evidence_refs: execution_path_count, has_owned_by, has_created_by, has_ownership_properties
Classification: structural_authority -- The finding is about missing graph structure. The rule does not infer anything; it reports what is not there.
8. privilege_justification_gap (privilege-justification-gap.ts)
What it checks: Identifies elevated (confidential/restricted) execution paths and queries execution evidence to determine whether the granted write-level actions are actually being used. Detects two gap types: no_activity (no evidence for a resource at all) and action_mismatch (evidence exists but only shows read-level actions when write-level is granted).
Evidence basis: Combination of graph structure (execution paths, sensitivity levels, granted actions) and execution evidence records (observed actions). This is the only rule that compares granted actions against observed actions; the drift rules also consult execution evidence, but only to set an "exercised" flag.
evidence_completeness: Sets current_roles: "available", execution_evidence: "available".
evidence_refs: elevated_paths_total, gap_count, no_activity_count, action_mismatch_count, observed_evidence_count, gap_resources
Classification: correlated_pattern -- The finding correlates two data sources: structural authority grants and observed execution logs. A gap means the structure says "can write" but the evidence says "only reads" (or nothing). This is neither pure observation nor pure structure -- it is pattern correlation.
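The two gap types can be illustrated with a small sketch. The shapes ElevatedPath and ObservedAction, the findGaps function, and the set of action names treated as write-level are all assumptions for this example; the rule's real inputs are execution paths and ExecutionEvidenceDoc records.

```typescript
type GapType = "no_activity" | "action_mismatch";

// Hypothetical shapes; the real execution path and evidence types are richer.
interface ElevatedPath { resource_id: string; granted_actions: string[]; }
interface ObservedAction { resource_id: string; action: string; }
interface Gap { resource_id: string; gap_type: GapType; }

// Assumption: which action names count as write-level is illustrative only.
const WRITE_ACTIONS = new Set(["write", "update", "delete", "create"]);

function findGaps(paths: ElevatedPath[], observed: ObservedAction[]): Gap[] {
  const gaps: Gap[] = [];
  for (const path of paths) {
    const grantsWrite = path.granted_actions.some((a) => WRITE_ACTIONS.has(a));
    if (!grantsWrite) continue; // only write-level grants are in scope

    const seen = observed.filter((o) => o.resource_id === path.resource_id);
    if (seen.length === 0) {
      // No evidence for this resource at all.
      gaps.push({ resource_id: path.resource_id, gap_type: "no_activity" });
    } else if (!seen.some((o) => WRITE_ACTIONS.has(o.action))) {
      // Evidence exists, but it shows only read-level actions.
      gaps.push({ resource_id: path.resource_id, gap_type: "action_mismatch" });
    }
  }
  return gaps;
}
```

The counts in evidence_refs (gap_count, no_activity_count, action_mismatch_count) would then follow from tallying the returned array.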
9. reachable_sensitive_domain (reachable-sensitive-domain.ts)
What it checks: Reads execution_paths and filters for paths with sensitivity of confidential, restricted, or high. No execution evidence is queried.
Evidence basis: Purely structural -- the computed execution paths with their sensitivity labels.
evidence_completeness: Sets current_roles: "available".
evidence_refs: sensitive_path_count, total_path_count, sensitivity_levels, business_domains
Classification: structural_authority -- The finding reports what an entity can reach based on its role grants and path computation. Whether it actually accesses those resources is not checked by this rule.
10. unknown_identity_binding (unknown-identity-binding.ts)
What it checks: For workload entities, checks for RUNS_AS relationships. Three failure modes: (a) no RUNS_AS at all, (b) RUNS_AS targets exist but cannot be resolved to known entities, (c) multiple RUNS_AS targets resolve -- ambiguous binding.
Evidence basis: Graph structure only -- relationship existence and entity resolution.
evidence_completeness: Sets credential_state: "available" with a note about the specific failure mode.
evidence_refs: runs_as_targets, resolved_count
Classification: structural_authority -- Based entirely on graph relationship structure.
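A compact way to picture the three failure modes is shown below. checkBinding and BindingFailure are hypothetical names, and representing entity resolution as a lookup in a set of known IDs is a simplification of whatever the rule actually does.

```typescript
type BindingFailure = "missing" | "unresolvable" | "ambiguous";

// Returns the failure mode, or null when exactly one RUNS_AS target resolves.
function checkBinding(
  runsAsTargets: string[],
  knownEntityIds: ReadonlySet<string>,
): BindingFailure | null {
  // (a) no RUNS_AS relationship at all
  if (runsAsTargets.length === 0) return "missing";

  const resolved = runsAsTargets.filter((id) => knownEntityIds.has(id));
  // (b) targets exist but none resolve to a known entity
  if (resolved.length === 0) return "unresolvable";
  // (c) more than one target resolves, so the binding is ambiguous
  if (resolved.length > 1) return "ambiguous";

  return null;
}
```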
11. unproven_execution (unproven-execution.ts)
What it checks: For workload entities with execution paths, checks execution_count_30d property, then queries direct execution evidence, then checks evidence for RUNS_AS targets. Fires only when no evidence exists anywhere.
Evidence basis: Absence of execution evidence across the entity and all linked identities.
evidence_completeness: Sets execution_evidence: "available" with note about the query scope.
evidence_refs: execution_path_count, runs_as_target_count
Classification: observed_absence -- The finding fires when zero execution evidence records exist across the entity and all linked identities. The claim is about confirmed absence ("we queried and found nothing"), not about observed activity. Using observed_execution ("Proven from execution logs") would be misleading when the entire point is that no logs were found.
12. unresolved_cross_system_auth (unresolved-auth.ts)
What it checks: Reads identity_binding_status property and identitySubtype/workloadSubtype property. Only fires on oauth_app entities with identity_binding_status: "unlinked" -- meaning the connector could not match the OAuth app's client_id to a governed Azure service principal.
Evidence basis: Connector-determined binding status. The connector ran a join operation (client_id matching) and reported the result.
evidence_completeness: Sets credential_state: "available".
evidence_refs: identity_binding_status, identity_subtype, identity_type (fallback when subtype is absent), execution_path_count
Classification: structural_authority -- The rule reads a connector-set property (identity_binding_status) and a subtype filter. This is the same pattern as external_egress (which reads egress_category): a connector-assigned property on the entity. The connector performed the cross-system join and recorded the result as a property; the evaluator rule simply reads that property. Both should be classified consistently as structural_authority.
13. scope_drift (scope-drift.ts)
What it checks: Compares current relationships (HAS_ROLE, GRANTS, USES, RUNS_AS) against the oldest version in history. Detects added/removed targets. Also queries execution evidence (1 record) to determine if authority is "exercised". Checks whether new roles reach sensitive domains via execution paths.
Evidence basis: Version history (temporal comparison) + execution evidence (exercised flag) + execution path sensitivity.
evidence_completeness: Sets current_roles: "available", role_history: "partial" (with note about version diff inference), execution_evidence and credential_state conditionally.
evidence_refs: current_role_count, baseline_role_count, added_role_targets, baseline_version_date, sensitive_domains_affected, exercised, drift_categories, grants_added, uses_added, uses_removed, runs_as_added, runs_as_removed.
Classification: correlated_pattern -- Correlates temporal drift (version history diffs) with structural authority (execution paths, sensitivity) and optionally with observed execution. The "exercised" flag elevates severity but even without it, the finding correlates two data sources.
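The baseline-versus-current comparison behind scope_drift amounts to a set difference per relationship type. The sketch below assumes relationship targets are available as arrays of IDs keyed by relationship type; RelationshipTargets, diffTargets, and diffRelationships are hypothetical names, not functions from the rule file.

```typescript
// Hypothetical shape: relationship target IDs grouped by relationship type.
type RelationshipTargets = Record<string, string[]>;

interface TargetDiff { added: string[]; removed: string[]; }

// Relationship types the rule compares, as listed above.
const DRIFT_RELATIONSHIPS = ["HAS_ROLE", "GRANTS", "USES", "RUNS_AS"] as const;

function diffTargets(baseline: string[], current: string[]): TargetDiff {
  const before = new Set(baseline);
  const after = new Set(current);
  return {
    added: current.filter((t) => !before.has(t)),
    removed: baseline.filter((t) => !after.has(t)),
  };
}

// Compare the oldest version (baseline) against the current relationships.
function diffRelationships(
  baseline: RelationshipTargets,
  current: RelationshipTargets,
): Record<string, TargetDiff> {
  const result: Record<string, TargetDiff> = {};
  for (const rel of DRIFT_RELATIONSHIPS) {
    result[rel] = diffTargets(baseline[rel] ?? [], current[rel] ?? []);
  }
  return result;
}
```

Fields such as added_role_targets, uses_removed, or runs_as_added in evidence_refs would correspond to the added and removed arrays for the matching relationship type.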
14. reachability_drift (reachability-drift.ts)
What it checks: Compares current execution_paths against the oldest version's paths. Identifies new destinations and new business domains. Queries execution evidence to determine if authority is exercised. Checks for sensitive domain impact.
Evidence basis: Version history (temporal comparison) + execution paths + execution evidence.
evidence_completeness: Sets current_roles: "available", role_history: "partial", execution_evidence conditionally.
evidence_refs: baseline_destination_count, current_destination_count, new_destination_ids, new_destination_names, new_domains, sensitive_domains_affected, baseline_version_date, exercised
Classification: correlated_pattern -- Same pattern as scope_drift: correlates temporal change with structural authority and optional execution observation.
15. ownership_drift (ownership-drift.ts)
What it checks: Compares current OWNED_BY relationships against the oldest version in history. Identifies removed owners and owners whose status changed from active to non-active since baseline.
Evidence basis: Version history (temporal comparison) + entity status properties.
evidence_completeness: Sets ownership_records: "available", role_history: "partial" (inferred from version diffs).
evidence_refs: baseline_owner_count, current_owner_count, removed_owner_ids, removed_owner_names, disabled_owner_ids, disabled_owner_names, baseline_version_date
Classification: structural_authority -- Although the rule uses version history to detect which owners were removed, the finding's claim is about the current state: owners are gone or disabled now. This is the same pattern as ownership_ambiguous (which also checks version history for differentiation but reports on current structure). Both are classified as structural_authority for consistency.
Proposed Type Definitions
EvidenceClassification Enum
/**
* Classification of the evidence basis for a finding.
*
* Determines how the finding's claim was derived and what level of
* trust a reviewer should place in it.
*
* Values are ordered from highest to lowest confidence:
* 1. observed_execution -- based on actual execution log records showing activity
* 2. observed_absence -- based on querying execution logs and confirming absence of activity
* 3. correlated_pattern -- correlates multiple data sources (e.g., version history + execution evidence)
* 4. structural_authority -- based on the static graph structure (roles, paths, relationships, connector-set properties)
* 5. inferred_capability -- connector or platform inferred a capability from indirect signals
*/
export const EVIDENCE_CLASSIFICATIONS = [
"observed_execution",
"observed_absence",
"correlated_pattern",
"structural_authority",
"inferred_capability"
] as const;
export type EvidenceClassification = (typeof EVIDENCE_CLASSIFICATIONS)[number];
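Because the array is ordered from highest to lowest confidence, an index lookup is enough to support filtering and sorting by evidence class. The following is a minimal sketch under that assumption; classificationRank, FindingSummary, sortByEvidenceStrength, and filterAtLeast are hypothetical names introduced only for illustration.

```typescript
const EVIDENCE_CLASSIFICATIONS = [
  "observed_execution",
  "observed_absence",
  "correlated_pattern",
  "structural_authority",
  "inferred_capability",
] as const;
type EvidenceClassification = (typeof EVIDENCE_CLASSIFICATIONS)[number];

// Hypothetical helper: position in the ordered list (0 = highest confidence).
function classificationRank(c: EvidenceClassification): number {
  return EVIDENCE_CLASSIFICATIONS.indexOf(c);
}

// Hypothetical minimal finding shape used only for this illustration.
interface FindingSummary {
  id: string;
  effective_classification: EvidenceClassification;
}

// Sort strongest evidence first; returns a new array.
function sortByEvidenceStrength(findings: FindingSummary[]): FindingSummary[] {
  return [...findings].sort(
    (a, b) =>
      classificationRank(a.effective_classification) -
      classificationRank(b.effective_classification),
  );
}

// Keep findings whose evidence is at least as strong as `floor`.
function filterAtLeast(
  findings: FindingSummary[],
  floor: EvidenceClassification,
): FindingSummary[] {
  const limit = classificationRank(floor);
  return findings.filter(
    (f) => classificationRank(f.effective_classification) <= limit,
  );
}
```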
EvidenceClaim Type
/**
* A structured claim statement attached to every finding.
*
* Answers Sergey's four questions:
* 1. What is proven vs. inferred? -> classification + basis
* 2. Why does it matter? -> business_impact
* 3. What is the safest first action? -> recommended_action
* 4. Who should own that action? -> owner_role
*/
export interface EvidenceClaim {
/**
* One-sentence claim statement suitable for display in a finding card.
* Written in plain language for a non-technical reviewer.
*
* @example "This workload has write access to the Payroll domain but
* has only been observed reading data in the last 90 days."
*/
claim_statement: string;
/**
* Static classification of the evidence basis (rule-level).
* Determined by the rule's data access pattern and does not change at runtime.
*/
classification: EvidenceClassification;
/**
* Runtime confidence derived from the actual EvidenceConfidenceLevel
* of the underlying evidence records consulted when producing this finding.
* Maps EvidenceConfidenceLevel -> EvidenceClassification:
* DETERMINISTIC -> observed_execution
* TEMPORAL_INFERRED -> correlated_pattern
* STRUCTURAL -> structural_authority
*
* When no execution evidence records are consulted, this field is omitted.
*/
runtime_confidence?: EvidenceClassification;
/**
* Effective classification shown to the user.
* Computed as the weaker of (classification, runtime_confidence): the value
* with the higher index in EVIDENCE_CLASSIFICATIONS (lower index = higher confidence).
* When runtime_confidence is absent, equals classification.
*/
effective_classification: EvidenceClassification;
/**
* Human-readable label for the effective classification.
* Used in the UI trust indicator.
*
* @example "Proven from execution logs"
* @example "Confirmed absent from execution logs"
* @example "Derived from graph structure"
*/
classification_label: string;
/**
* What data sources contributed to this finding.
* Each entry describes one source of evidence.
*
* @example ["Execution evidence: 3 records from ServiceNow (most recent: 2026-02-15)"]
* @example ["Graph structure: 2 OWNED_BY relationships, both targets status=departed"]
*/
basis: string[];
/**
* One-sentence business impact statement.
* Explains why a business stakeholder should care.
*
* @example "Unreviewed access to restricted financial data increases
* regulatory exposure under SOX controls."
*/
business_impact: string;
/**
* The safest first action to take.
* Not the full remediation plan -- just the lowest-risk next step.
*
* Cross-reference: This maps to `MitigationActionDoc.action` from the
* ownership workflow research (#215). Both should share vocabulary to
* ensure consistency between evidence claims and mitigation workflows.
*
* @example "Review the 3 execution paths to Payroll resources and
* confirm whether write access is still required."
*/
recommended_action: string;
/**
* The role that should own this action.
* Maps to an organizational function, not a specific person.
*
* This is a static recommendation from DEFAULT_OWNER_ROLES. It should
* eventually resolve through the `OwnershipAssignmentDoc` defined in
* the ownership workflow research (#215). The text label alone does NOT
* satisfy the "who should own that action" requirement -- it needs
* integration with the ownership assignment workflow once that ships.
*
* @example "Application Owner"
* @example "Identity Governance Team"
* @example "Security Operations"
*/
owner_role: string;
/**
* Per-section evidence availability snapshot from the finding's evidence completeness.
* This is the standard `EvidenceCompletenessSection` (the same shape already on `FindingDoc`),
* NOT the extended `ClassifiedEvidenceCompleteness` type defined later in this document.
* The `ClassifiedEvidenceCompleteness` type lives on the *evidence pack*, not on the finding.
*
* In other words:
* - `EvidenceClaim.section_confidence` (on the finding) = lightweight availability snapshot
* - `ClassifiedEvidenceCompleteness` (on the evidence pack) = extended version with
* `contributed_to_classification` and `influence_note` per section
*/
section_confidence: EvidenceCompletenessSection;
}
Classification Label Map
/**
* Human-readable labels for each evidence classification.
* Used in UI trust indicators and API responses.
*/
export const CLASSIFICATION_LABELS: Record<EvidenceClassification, string> = {
observed_execution: "Proven from execution logs",
observed_absence: "Confirmed absent from execution logs",
correlated_pattern: "Correlated across data sources",
structural_authority: "Derived from graph structure",
inferred_capability: "Inferred from indirect signals"
};
Owner Role Map
Cross-reference: These default owner roles are static text recommendations. They do not replace the ownership workflow from #215. Once OwnershipAssignmentDoc is implemented, the owner_role on EvidenceClaim should resolve through that workflow rather than relying solely on this lookup table. The vocabulary used here should align with the role taxonomy defined in the ownership research.
/**
* Default owner role for each finding type.
* Can be overridden per-tenant in configuration.
* See #215 (ownership workflow research) for how these map to OwnershipAssignmentDoc.
*/
export const DEFAULT_OWNER_ROLES: Record<FindingType, string> = {
orphaned_ownership: "Identity Governance Team",
ownership_degraded: "Identity Governance Team",
ownership_ambiguous: "Identity Governance Team",
ownership_unknown: "Identity Governance Team",
ownership_drift: "Identity Governance Team",
dormant_authority: "Application Owner",
privilege_justification_gap: "Application Owner",
unresolved_cross_system_auth: "Security Operations",
unproven_execution: "Application Owner",
unknown_identity_binding: "Security Operations",
reachable_sensitive_domain: "Data Protection Officer",
llm_egress: "Security Operations",
external_egress: "Security Operations",
scope_drift: "Application Owner",
reachability_drift: "Application Owner"
};
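The per-tenant override mentioned in the comment above could be resolved with a simple fallback lookup. This sketch uses a three-value subset of FindingType to stay short, and both OwnerRoleOverrides and resolveOwnerRole are hypothetical; the document does not specify how tenant configuration is stored.

```typescript
// Subset of finding types, for illustration only.
type FindingType = "dormant_authority" | "llm_egress" | "orphaned_ownership";

const DEFAULT_OWNER_ROLES: Record<FindingType, string> = {
  dormant_authority: "Application Owner",
  llm_egress: "Security Operations",
  orphaned_ownership: "Identity Governance Team",
};

// Hypothetical tenant configuration shape: any finding type may be overridden.
type OwnerRoleOverrides = Partial<Record<FindingType, string>>;

// Tenant override wins; otherwise fall back to the static default.
function resolveOwnerRole(
  findingType: FindingType,
  overrides: OwnerRoleOverrides = {},
): string {
  return overrides[findingType] ?? DEFAULT_OWNER_ROLES[findingType];
}
```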
Integration with FindingDoc
The EvidenceClaim field should be added to FindingDoc as an optional field during migration, becoming required once all rules populate it:
export interface FindingDoc {
// ... existing fields ...
/**
* Structured evidence claim. Populated by the evaluator when the
* finding is created or updated. Contains the classification,
* claim statement, business impact, and recommended action.
*
* Optional during migration (Phase 1). Required after Phase 2.
*/
evidence_claim?: EvidenceClaim;
}
The RuleFindingCandidate in src/evaluator/types.ts should also gain the field. During Phase 1, the field is optional to avoid a big-bang migration (all 14 rule files would need updating simultaneously before anything compiles). Rules are updated incrementally; once all rules populate the field, it becomes required in Phase 2:
export interface RuleFindingCandidate {
// ... existing fields ...
/**
* Evidence claim metadata.
* Optional during Phase 1 (incremental rule migration).
* Required after Phase 2 (all rules populated).
*/
evidenceClaim?: EvidenceClaim;
}
Classification Table
| Finding Type | Rule File | Evidence Basis | Classification | Default Severity | Owner Role | Notes |
|---|---|---|---|---|---|---|
| dormant_authority | dormant-authority.ts | Execution evidence timestamps, RUNS_AS traversal, connector-computed last_observed_execution_timestamp | observed_absence | high | Application Owner | Fires on absence of recent execution evidence |
| external_egress | external-egress.ts | egress_category entity property from connector | structural_authority | medium | Security Operations | Connector classifies egress target; no runtime verification |
| llm_egress | llm-egress.ts | egress_category entity property from connector | structural_authority | high | Security Operations | Same pattern as external_egress |
| orphaned_ownership | orphaned-ownership.ts | OWNED_BY/CREATED_BY relationships, owner entity status | structural_authority | critical | Identity Governance Team | Status values come from source system via connector |
| ownership_degraded | orphaned-ownership.ts | OWNED_BY relationships, owner entity status, ownership_level | structural_authority | high | Identity Governance Team | Emitted by same rule as orphaned_ownership |
| ownership_ambiguous | ownership-ambiguous.ts | OWNED_BY relationships, owner entity types, version history | structural_authority | medium | Identity Governance Team | Version history used for differentiation only |
| ownership_unknown | ownership-unknown.ts | Absence of OWNED_BY/CREATED_BY relationships and ownership properties | structural_authority | medium | Identity Governance Team | Metadata quality signal |
| privilege_justification_gap | privilege-justification-gap.ts | Execution paths (sensitivity, actions) + execution evidence (observed actions) | correlated_pattern | medium | Application Owner | Only rule that correlates granted vs. observed actions |
| reachable_sensitive_domain | reachable-sensitive-domain.ts | Execution paths with elevated sensitivity | structural_authority | medium/high | Data Protection Officer | Severity depends on restricted vs. confidential |
| unknown_identity_binding | unknown-identity-binding.ts | RUNS_AS relationships, entity resolution | structural_authority | high | Security Operations | Three failure modes: missing, unresolvable, ambiguous |
| unproven_execution | unproven-execution.ts | Execution evidence query (zero results), RUNS_AS traversal, execution_count_30d | observed_absence | high | Application Owner | Fires on confirmed absence of any execution evidence |
| unresolved_cross_system_auth | unresolved-auth.ts | Connector-set identity_binding_status, identity_subtype, identity_type (fallback) | structural_authority | medium | Security Operations | Reads connector-set property, same pattern as external_egress |
| scope_drift | scope-drift.ts | Version history diffs (HAS_ROLE, GRANTS, USES, RUNS_AS) + execution evidence + path sensitivity. Refs: grants_added, uses_added, uses_removed, runs_as_added, runs_as_removed | correlated_pattern | medium-critical | Application Owner | Severity escalates with execution evidence and sensitive domains |
| reachability_drift | reachability-drift.ts | Version history diffs (execution_paths) + execution evidence + domain sensitivity | correlated_pattern | medium-critical | Application Owner | Same escalation pattern as scope_drift |
| ownership_drift | ownership-drift.ts | Version history diffs (OWNED_BY) + owner entity status | structural_authority | medium/high | Identity Governance Team | Claim is about current state (owners gone/disabled now); version history used for detection, not for the claim |
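The Classification column above can be captured as a static lookup, which is one way a rule's static classification could be declared in code. STATIC_CLASSIFICATIONS and countByClassification are hypothetical names; a real implementation would presumably key the map by FindingType rather than string.

```typescript
type EvidenceClassification =
  | "observed_execution"
  | "observed_absence"
  | "correlated_pattern"
  | "structural_authority"
  | "inferred_capability";

// Static classification per finding type, transcribed from the table above.
const STATIC_CLASSIFICATIONS: Record<string, EvidenceClassification> = {
  dormant_authority: "observed_absence",
  external_egress: "structural_authority",
  llm_egress: "structural_authority",
  orphaned_ownership: "structural_authority",
  ownership_degraded: "structural_authority",
  ownership_ambiguous: "structural_authority",
  ownership_unknown: "structural_authority",
  privilege_justification_gap: "correlated_pattern",
  reachable_sensitive_domain: "structural_authority",
  unknown_identity_binding: "structural_authority",
  unproven_execution: "observed_absence",
  unresolved_cross_system_auth: "structural_authority",
  scope_drift: "correlated_pattern",
  reachability_drift: "correlated_pattern",
  ownership_drift: "structural_authority",
};

// Tally how many finding types fall under each classification.
function countByClassification(
  map: Record<string, EvidenceClassification>,
): Partial<Record<EvidenceClassification, number>> {
  const counts: Partial<Record<EvidenceClassification, number>> = {};
  for (const key of Object.keys(map)) {
    const c = map[key];
    if (c === undefined) continue;
    counts[c] = (counts[c] ?? 0) + 1;
  }
  return counts;
}
```

Tallying the table this way shows that no finding type is statically assigned observed_execution or inferred_capability; in the model described in this document, observed_execution appears only as a runtime confidence value.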
Evidence Pack Integration
Per-Section Confidence
The existing EvidenceCompletenessSection already tracks six categories. The proposal adds a confidence interpretation layer on top:
/**
* Per-section confidence metadata for an evidence pack.
* Extends the existing EvidenceCompletenessSection with
* classification-relevant context.
*/
export interface SectionConfidenceMetadata {
/** The existing availability status */
availability: EvidenceAvailability;
/**
* Whether this section contributed to the finding classification.
* Not all sections are relevant to every finding type.
*/
contributed_to_classification: boolean;
/**
* How this section influenced the classification.
* Only populated when contributed_to_classification is true.
*
* @example "Execution evidence records provided the temporal threshold comparison"
* @example "Role history inferred from version diffs (not from audit logs)"
*/
influence_note?: string;
}
/**
* Extended evidence completeness with classification metadata.
* Carried in the evidence pack, not in the finding itself.
*/
export interface ClassifiedEvidenceCompleteness {
sections: Record<keyof Omit<EvidenceCompletenessSection, "notes">, SectionConfidenceMetadata>;
notes: Record<string, string>;
}
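One possible way to derive the extended structure from the existing completeness section is sketched below. The local EvidenceCompletenessSection shape (six availability fields plus a notes record) is an assumption inferred from the description earlier in this document, and classifyCompleteness is a hypothetical helper.

```typescript
type EvidenceAvailability =
  | "available"
  | "unavailable_not_enabled"
  | "unavailable_no_access"
  | "unavailable_not_applicable"
  | "partial";

type SectionKey =
  | "current_roles" | "role_history" | "execution_evidence"
  | "ownership_records" | "approval_records" | "credential_state";

// Assumed shape of the existing type: six availability fields plus notes.
type EvidenceCompletenessSection =
  Record<SectionKey, EvidenceAvailability> & { notes: Record<string, string> };

interface SectionConfidenceMetadata {
  availability: EvidenceAvailability;
  contributed_to_classification: boolean;
  influence_note?: string;
}

interface ClassifiedEvidenceCompleteness {
  sections: Record<SectionKey, SectionConfidenceMetadata>;
  notes: Record<string, string>;
}

const SECTION_KEYS: SectionKey[] = [
  "current_roles", "role_history", "execution_evidence",
  "ownership_records", "approval_records", "credential_state",
];

// `influences` names the sections that contributed, with a note for each.
function classifyCompleteness(
  base: EvidenceCompletenessSection,
  influences: Partial<Record<SectionKey, string>>,
): ClassifiedEvidenceCompleteness {
  const sections = {} as Record<SectionKey, SectionConfidenceMetadata>;
  for (const key of SECTION_KEYS) {
    const note = influences[key];
    sections[key] = note !== undefined
      ? { availability: base[key], contributed_to_classification: true, influence_note: note }
      : { availability: base[key], contributed_to_classification: false };
  }
  return { sections, notes: { ...base.notes } };
}
```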
Claim Derivation
The overall finding classification is derived deterministically from the rule's data access pattern:
- Rule declares its static classification. Each rule knows what data it accesses. The static classification is a property of the rule that does not change between invocations.
- Runtime confidence overlays the static classification. When a rule consults execution evidence records, the EvidenceConfidenceLevel of those records is mapped to an EvidenceClassification and stored as runtime_confidence. The effective_classification shown to the user is the weaker of (static classification, runtime confidence), that is, the one that appears later in the EVIDENCE_CLASSIFICATIONS ordering. For example, a scope_drift finding (static: correlated_pattern) backed by STRUCTURAL-confidence evidence records gets an effective classification of structural_authority, not correlated_pattern. When no execution evidence records are consulted, runtime_confidence is omitted and effective_classification equals the static classification.
- Severity may vary, static classification does not. A scope_drift finding is always statically correlated_pattern regardless of whether execution evidence was found. The severity escalates when execution evidence exists. The effective classification may differ from the static one based on runtime confidence.
- Claim statement is templated per rule. Each rule provides a template function that interpolates rule-specific values (counts, names, timestamps) into a human-readable sentence.
- Business impact is derived from the affected domains. When a finding touches sensitive business domains (from execution paths), the business impact references those domains. When it does not, a generic impact statement is used.
Example claim derivation for dormant_authority:
// In dormant-authority.ts evaluate():
const staticClassification = "observed_absence" as const;
// confidence is optional on evidence records, so guard before mapping.
const runtimeConfidence = mostRecent?.confidence
? mapConfidenceToClassification(mostRecent.confidence) // e.g., DETERMINISTIC -> observed_execution
: undefined;
const effectiveClassification = runtimeConfidence
? minClassification(staticClassification, runtimeConfidence)
: staticClassification;
const claim: EvidenceClaim = {
claim_statement: mostRecent
? `This ${label.toLowerCase()} has ${paths.length} execution path(s) but has not been active for ${daysSinceActivity} days.`
: `This ${label.toLowerCase()} has ${paths.length} execution path(s) but has never produced execution evidence.`,
classification: staticClassification,
runtime_confidence: runtimeConfidence,
effective_classification: effectiveClassification,
classification_label: CLASSIFICATION_LABELS[effectiveClassification],
basis: mostRecent
? [`Execution evidence: most recent record at ${mostRecent.source_timestamp.toISOString()} (${daysSinceActivity} days ago)`]
: ["Execution evidence: queried entity and RUNS_AS targets, zero records found"],
business_impact: buildBusinessImpact(paths),
recommended_action: "Review the execution paths and confirm whether this authority is still needed. If not, revoke the associated roles.",
owner_role: DEFAULT_OWNER_ROLES.dormant_authority,
section_confidence: evidenceCompleteness
};
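The example above calls buildBusinessImpact without showing it. A minimal sketch of what such a helper might do, following the "derived from the affected domains" rule, is given here. The ExecutionPathLike shape and both impact sentences are illustrative assumptions; only the sensitivity levels (confidential, restricted, high) come from this document.

```typescript
// Hypothetical path shape; only the fields this sketch needs.
interface ExecutionPathLike {
  business_domain?: string;
  sensitivity?: string;
}

// Sensitivity levels treated as elevated by reachable_sensitive_domain.
const SENSITIVE_LEVELS = new Set(["confidential", "restricted", "high"]);

function buildBusinessImpact(paths: ExecutionPathLike[]): string {
  const domains = new Set<string>();
  for (const p of paths) {
    if (p.business_domain && p.sensitivity && SENSITIVE_LEVELS.has(p.sensitivity)) {
      domains.add(p.business_domain);
    }
  }
  if (domains.size === 0) {
    // Generic fallback wording; illustrative only.
    return "Unreviewed authority widens the attack surface without a confirmed business need.";
  }
  // Sorted so the same set of paths always yields the same sentence.
  const list = Array.from(domains).sort().join(", ");
  return `Authority reaches sensitive business domain(s): ${list}.`;
}
```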
Two-Layer Confidence Model
Classification is not purely a static per-rule property. A rule classified as correlated_pattern (e.g., scope_drift) may, at runtime, consult execution evidence records whose EvidenceConfidenceLevel is only STRUCTURAL. In that case, the user-facing classification should reflect the weaker runtime confidence, not the optimistic static label.
The model works as follows:
- Static rule classification (classification field): Determined by the rule's data access pattern. Immutable per rule -- dormant_authority is always observed_absence, scope_drift is always correlated_pattern, etc.
- Runtime evidence confidence (runtime_confidence field): Derived from the actual EvidenceConfidenceLevel of the execution evidence records consulted during evaluation. The mapping is:

  | EvidenceConfidenceLevel | Maps to EvidenceClassification |
  |---|---|
  | DETERMINISTIC | observed_execution |
  | TEMPORAL_INFERRED | correlated_pattern |
  | STRUCTURAL | structural_authority |

  When a rule consults multiple evidence records, the minimum confidence (weakest link) is used. When a rule does not consult any execution evidence records, runtime_confidence is omitted.
- Effective classification (effective_classification field): The minimum of (static classification, runtime confidence) using the EVIDENCE_CLASSIFICATIONS ordering. This is the value shown to the user and used for filtering/sorting.
Example: A scope_drift finding (static: correlated_pattern) backed by execution evidence with confidence: STRUCTURAL gets:
- classification: correlated_pattern
- runtime_confidence: structural_authority
- effective_classification: structural_authority (the weaker of the two)
- classification_label: "Derived from graph structure"
Example: A dormant_authority finding (static: observed_absence) where the most recent evidence record has confidence: DETERMINISTIC gets:
- classification: observed_absence
- runtime_confidence: observed_execution
- effective_classification: observed_absence (absence is weaker than presence)
- classification_label: "Confirmed absent from execution logs"
This ensures the user-facing label never overstates the actual evidence quality.
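The two-layer rules can be sketched as follows. This is an illustration under stated assumptions, not the final implementation: EvidenceConfidenceLevel is modeled here as a union of three string literals, and runtimeConfidenceOf and effectiveClassification are hypothetical helper names. The ordering and the mapping come from the text above.

```typescript
// Ordered strongest to weakest, following the EVIDENCE_CLASSIFICATIONS ordering.
const EVIDENCE_CLASSIFICATIONS = [
  "observed_execution",
  "observed_absence",
  "correlated_pattern",
  "structural_authority",
  "inferred_capability",
] as const;
type EvidenceClassification = (typeof EVIDENCE_CLASSIFICATIONS)[number];

// Assumption: the confidence level can be represented by these string literals.
type EvidenceConfidenceLevel = "DETERMINISTIC" | "TEMPORAL_INFERRED" | "STRUCTURAL";

const CONFIDENCE_MAP: Record<EvidenceConfidenceLevel, EvidenceClassification> = {
  DETERMINISTIC: "observed_execution",
  TEMPORAL_INFERRED: "correlated_pattern",
  STRUCTURAL: "structural_authority",
};

function mapConfidenceToClassification(level: EvidenceConfidenceLevel): EvidenceClassification {
  return CONFIDENCE_MAP[level];
}

// "Minimum" means weakest: the value that appears later in the ordering.
function minClassification(a: EvidenceClassification, b: EvidenceClassification): EvidenceClassification {
  return EVIDENCE_CLASSIFICATIONS.indexOf(a) >= EVIDENCE_CLASSIFICATIONS.indexOf(b) ? a : b;
}

// Weakest link across all consulted records; undefined when none were consulted.
function runtimeConfidenceOf(levels: EvidenceConfidenceLevel[]): EvidenceClassification | undefined {
  if (levels.length === 0) return undefined;
  return levels.map((level) => mapConfidenceToClassification(level)).reduce((a, b) => minClassification(a, b));
}

function effectiveClassification(
  staticClassification: EvidenceClassification,
  runtimeConfidence?: EvidenceClassification,
): EvidenceClassification {
  return runtimeConfidence ? minClassification(staticClassification, runtimeConfidence) : staticClassification;
}

// scope_drift backed by one DETERMINISTIC and one STRUCTURAL record: downgraded.
const scopeDriftEffective = effectiveClassification(
  "correlated_pattern",
  runtimeConfidenceOf(["DETERMINISTIC", "STRUCTURAL"]),
);
// dormant_authority with DETERMINISTIC evidence: stays at observed_absence.
const dormantEffective = effectiveClassification("observed_absence", runtimeConfidenceOf(["DETERMINISTIC"]));
```

Because the runtime layer is combined with a minimum, it can only weaken the static label and never strengthen it.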
API Contract Changes
GET /findings/:id
Add evidence_claim to the detail response:
{
  "data": {
    "id": "abc123",
    "finding_type": "dormant_authority",
    "severity": "high",
    // ... existing fields ...
    // NEW: structured evidence claim
    "evidence_claim": {
      "claim_statement": "This workload has 4 execution paths but has not been active for 127 days.",
      "classification": "observed_absence",
      "runtime_confidence": "observed_execution",
      "effective_classification": "observed_absence",
      "classification_label": "Confirmed absent from execution logs",
      "basis": [
        "Execution evidence: most recent record at 2025-11-19T14:22:00Z (127 days ago)",
        "Checked 2 linked identities via RUNS_AS"
      ],
      "business_impact": "Dormant authority to Payroll and HR domains creates standing risk without active justification.",
      "recommended_action": "Review the 4 execution paths and confirm whether this authority is still needed. If not, revoke the associated roles.",
      "owner_role": "Application Owner",
      "section_confidence": {
        "current_roles": "unavailable_not_applicable",
        "role_history": "unavailable_not_applicable",
        "execution_evidence": "available",
        "ownership_records": "unavailable_not_applicable",
        "approval_records": "unavailable_not_applicable",
        "credential_state": "unavailable_not_applicable",
        "notes": {
          "execution_evidence": "Most recent evidence is 127 days old (beyond 90-day threshold)"
        }
      }
    }
  }
}
GET /findings (list)
Add summary classification fields to the normalized list response:
{
  "data": [
    {
      "id": "abc123",
      "finding_type": "dormant_authority",
      "severity": "high",
      // ... existing fields ...
      // NEW: classification summary (lightweight, no full claim)
      "evidence_classification": "observed_absence",
      "effective_classification": "observed_absence",
      "evidence_classification_label": "Confirmed absent from execution logs"
    }
  ],
  "meta": {
    "total_count": 42,
    "bySeverity": { "high": 12, "medium": 20, "critical": 5, "low": 5 },
    "byType": { /* ... */ },
    // NEW: classification distribution
    "byClassification": {
      "observed_execution": 5,
      "observed_absence": 6,
      "correlated_pattern": 15,
      "structural_authority": 16
    }
  }
}
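One way the byClassification distribution could be computed is sketched below. It assumes the count is keyed on effective_classification (the value the brief uses for filtering) and that classifications with no findings are omitted, as in the sample above; FindingListItem and countByClassification are hypothetical names.

```typescript
type EvidenceClassification =
  | "observed_execution"
  | "observed_absence"
  | "correlated_pattern"
  | "structural_authority"
  | "inferred_capability";

// Hypothetical minimal list item; real list items carry many more fields.
interface FindingListItem {
  id: string;
  effective_classification?: EvidenceClassification;
}

function countByClassification(items: FindingListItem[]): Partial<Record<EvidenceClassification, number>> {
  const counts: Partial<Record<EvidenceClassification, number>> = {};
  for (const item of items) {
    const classification = item.effective_classification;
    if (!classification) continue; // findings without a claim yet are not counted
    counts[classification] = (counts[classification] ?? 0) + 1;
  }
  return counts;
}
```

In production the same distribution would more likely come from a database aggregation than from an in-memory pass; the sketch only pins down the intended semantics.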
GET /exposures and GET /exposures/:id
Add the highest-confidence classification across all findings for an exposure:
{
  "data": {
    "id": "EXP-abc123",
    // ... existing fields ...
    // NEW: classifications present across the exposure's findings, plus the highest-confidence one
    "evidence_classifications": ["observed_execution", "structural_authority"],
    "primary_classification": "observed_execution"
  }
}
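A sketch of how the exposure summary could be derived from its findings. It assumes primary_classification is the strongest classification present (matching the example, where observed_execution wins over structural_authority) and that an exposure with no classified findings reports null; summarizeExposure is a hypothetical name.

```typescript
// Ordered strongest to weakest.
const EVIDENCE_CLASSIFICATIONS = [
  "observed_execution",
  "observed_absence",
  "correlated_pattern",
  "structural_authority",
  "inferred_capability",
] as const;
type EvidenceClassification = (typeof EVIDENCE_CLASSIFICATIONS)[number];

interface ExposureClassificationSummary {
  evidence_classifications: EvidenceClassification[];
  primary_classification: EvidenceClassification | null;
}

function summarizeExposure(findingClassifications: EvidenceClassification[]): ExposureClassificationSummary {
  // Keep each classification that occurs at least once, strongest first.
  const present = EVIDENCE_CLASSIFICATIONS.filter(
    (classification) => findingClassifications.indexOf(classification) !== -1,
  );
  return {
    evidence_classifications: present,
    primary_classification: present[0] ?? null, // assumption: null when nothing is classified
  };
}
```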
Filtering and Sorting
New query parameters for GET /findings:
| Parameter | Type | Description |
|---|---|---|
| classification | string | Filter by evidence classification. Comma-separated for multiple values. |
| sort=classification | string | Sort by classification confidence order (observed_execution > observed_absence > correlated_pattern > structural_authority > inferred_capability). |
Example: GET /api/v1/findings?classification=observed_execution,correlated_pattern&sort=classification
Scanability caveat: Adding a classification label to the findings list improves filtering but does not, by itself, solve the "too many clicks" problem. The "why/action/owner" payload (business_impact, recommended_action, owner_role) is still only available in the detail response. Evidence classification is one piece of the puzzle; see #217 (Wiz UX research) for user-experience direction and #224 (expand default-visible info in list view) for the complementary API work needed.
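The filter and sort parameters could be handled along these lines. This is a sketch: parseClassificationParam and sortByClassification are hypothetical names, and rejecting unknown values with an error is an assumption, since the brief does not say whether they should be rejected or ignored.

```typescript
// Ordered strongest to weakest.
const EVIDENCE_CLASSIFICATIONS = [
  "observed_execution",
  "observed_absence",
  "correlated_pattern",
  "structural_authority",
  "inferred_capability",
] as const;
type EvidenceClassification = (typeof EVIDENCE_CLASSIFICATIONS)[number];

function isClassification(value: string): value is EvidenceClassification {
  return (EVIDENCE_CLASSIFICATIONS as readonly string[]).indexOf(value) !== -1;
}

// Parse the comma-separated ?classification= value into a de-duplicated list.
function parseClassificationParam(raw: string | undefined): EvidenceClassification[] {
  if (!raw) return [];
  const parsed: EvidenceClassification[] = [];
  for (const part of raw.split(",")) {
    const value = part.trim();
    if (value === "") continue;
    if (!isClassification(value)) throw new Error(`Unknown classification: ${value}`);
    if (parsed.indexOf(value) === -1) parsed.push(value);
  }
  return parsed;
}

// Sort a copy of the items, strongest classification first.
function sortByClassification<T extends { effective_classification: EvidenceClassification }>(items: T[]): T[] {
  const rank = (c: EvidenceClassification): number => EVIDENCE_CLASSIFICATIONS.indexOf(c);
  return items.slice().sort((a, b) => rank(a.effective_classification) - rank(b.effective_classification));
}
```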
MongoDB Index Requirement
The classification filter requires a compound index on the findings collection:
{ tenant_id: 1, "evidence_claim.effective_classification": 1, status: 1 }
This supports the GET /findings?classification=... query without a collection scan. The index should be created as part of the Phase 2 API migration.
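As an illustration of a query shape this index can serve, the sketch below builds the filter document as a plain object. buildFindingsFilter and FindingsFilter are hypothetical names, and treating status as a string field is an assumption.

```typescript
// Index key specification from the text. With the official MongoDB Node.js driver
// it could be passed to collection.createIndex(FINDINGS_CLASSIFICATION_INDEX).
const FINDINGS_CLASSIFICATION_INDEX = {
  tenant_id: 1,
  "evidence_claim.effective_classification": 1,
  status: 1,
} as const;

interface FindingsFilter {
  tenant_id: string;
  "evidence_claim.effective_classification"?: { $in: string[] };
  status?: string; // assumption: status is stored as a string
}

// Build a filter whose fields follow the index key order: tenant first,
// then classification, then status.
function buildFindingsFilter(tenantId: string, classifications: string[], status?: string): FindingsFilter {
  const filter: FindingsFilter = { tenant_id: tenantId };
  if (classifications.length > 0) {
    filter["evidence_claim.effective_classification"] = { $in: classifications };
  }
  if (status !== undefined) filter.status = status;
  return filter;
}
```

Because tenant_id is the index prefix, every classification query stays scoped to one tenant before the classification values are matched.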
UI Direction
Specific UI treatment should be informed by the Wiz UX research (#217). Sergey explicitly asked for UX research before proposing a final solution. The following are directional principles only:
- Make evidence classification visible. The classification (and its human-readable label) should be surfaced wherever findings appear -- list views, detail views, and exposure summaries.
- Don't bury it. Classification should be as prominent as severity, not hidden behind an expand/collapse.
- Use plain-English labels. Labels like "Confirmed absent from execution logs" and "Derived from graph structure" are more useful than enum values. The classification_label field exists for this purpose.
- Distinguish observed-presence from observed-absence. Users should understand the difference between "we saw this happen" (observed_execution) and "we looked and confirmed it didn't happen" (observed_absence). The labels and any visual treatment should not conflate these.
- Show the two-layer model when relevant. When runtime_confidence differs from classification, the UI should make it possible to understand why the effective classification was downgraded (e.g., "Rule classification: correlated pattern, but underlying evidence is structural-only").
Detailed component design (badges, colors, icons, layouts, panels) is deferred until after #217 research is complete.
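Independent of visual design, the downgrade explanation itself can be derived from the claim fields. The sketch below is one possible text helper; describeDowngrade and the phrase table are hypothetical, and it only reports a downgrade when the effective classification actually differs from the static one.

```typescript
type EvidenceClassification =
  | "observed_execution"
  | "observed_absence"
  | "correlated_pattern"
  | "structural_authority"
  | "inferred_capability";

interface ClaimLayers {
  classification: EvidenceClassification;
  runtime_confidence?: EvidenceClassification;
  effective_classification: EvidenceClassification;
}

// Hypothetical phrases for the three values runtime_confidence can take.
const EVIDENCE_PHRASES: Partial<Record<EvidenceClassification, string>> = {
  observed_execution: "directly observed",
  correlated_pattern: "temporally inferred",
  structural_authority: "structural-only",
};

// Returns an explanation only when the runtime layer weakened the static label.
function describeDowngrade(claim: ClaimLayers): string | undefined {
  const runtime = claim.runtime_confidence;
  if (runtime === undefined || claim.effective_classification === claim.classification) {
    return undefined;
  }
  const ruleName = claim.classification.replace(/_/g, " ");
  const evidence = EVIDENCE_PHRASES[runtime] ?? runtime.replace(/_/g, " ");
  return `Rule classification: ${ruleName}, but underlying evidence is ${evidence}`;
}
```

For the dormant_authority example, where runtime_confidence is stronger than the static label, the helper returns undefined because nothing was downgraded.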
Implementation Sequence
Phase 1: Types and Rule Changes (Sprint S+1)
Deliverables:
- Add EvidenceClassification, EvidenceClaim, CLASSIFICATION_LABELS, DEFAULT_OWNER_ROLES to src/domain/findings/types.ts
- Add evidence_claim?: EvidenceClaim to FindingDoc
- Add evidenceClaim?: EvidenceClaim to RuleFindingCandidate (optional during Phase 1 to allow incremental rule migration)
- Add mapConfidenceToClassification() and minClassification() utilities for the two-layer confidence model
- Incrementally update evaluator rules to populate evidenceClaim (can be done rule-by-rule; no big-bang required since the field is optional)
- Add a buildBusinessImpact(paths: ExecutionPath[]): string utility that generates business impact statements from affected domains
- Add a buildClaimStatement(findingType: FindingType, templateVars: Record<string, unknown>): string utility for templated claim generation
- Update src/evaluator/orchestrator to pass evidenceClaim through to FindingDoc
- Migration: backfill evidence_claim for existing findings (offline script -- see Backfill Strategy below)
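A minimal sketch of what the templated buildClaimStatement utility might look like. The {name} placeholder syntax and the template table are assumptions, FindingType is stood in by string, and only a dormant_authority template (adapted from the claim derivation example) is included.

```typescript
type FindingType = string; // stand-in for the real FindingType union

// Hypothetical template table; placeholders use {name} syntax.
const CLAIM_TEMPLATES: Record<string, string | undefined> = {
  dormant_authority:
    "This {label} has {pathCount} execution path(s) but has not been active for {days} days.",
};

function buildClaimStatement(findingType: FindingType, templateVars: Record<string, unknown>): string {
  const template = CLAIM_TEMPLATES[findingType];
  if (template === undefined) {
    throw new Error(`No claim template for finding type: ${findingType}`);
  }
  // Replace each {name} with its value; fail loudly rather than emit a half-filled claim.
  return template.replace(/\{(\w+)\}/g, (_match: string, key: string) => {
    if (!(key in templateVars)) throw new Error(`Missing template variable: ${key}`);
    return String(templateVars[key]);
  });
}

const exampleClaim = buildClaimStatement("dormant_authority", { label: "workload", pathCount: 4, days: 127 });
```

Throwing on a missing variable is a design choice for the sketch: a claim statement with an unfilled placeholder would undermine the trust indicator it is meant to support.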
Backfill Strategy:
The backfill script re-evaluates existing findings to populate evidence_claim. Key decisions:
- Non-authoritative marking: Backfilled claims are inherently ahistorical -- they reflect the state at backfill time, not the posture that originally produced the finding. For a product positioning itself around deterministic truth and repeatable outputs, this distinction matters. Backfilled claims MUST carry backfilled: true and claim_generated_at (distinct from the finding's created_at). The UI should render these with a subtle indicator (e.g., "Classification generated retroactively") so users know the claim was not computed at evaluation time.
- Entity state constraints: The backfill uses current entity state because historical snapshots are not stored. This means an old finding may carry a claim based on today's posture. To mitigate: (a) the backfilled flag alerts consumers, (b) the runtime_confidence field reflects current evidence availability, which may differ from the original, and (c) if the static rule classification and runtime_confidence produce a different effective_classification than would have been computed originally, the backfilled flag signals that this is expected.
- Deleted entities: When a finding's target entity has been deleted, the backfill script cannot produce a meaningful claim. These findings should have evidence_claim set to null with a note in the finding's deterministic_explanation field: "Entity deleted; evidence claim could not be generated retroactively." They will appear as unclassified in the UI until resolved/archived.
- Idempotency: The script must be idempotent -- re-running it on a finding that already has evidence_claim should be a no-op unless the claim schema version has changed.
- Scope limitation: Backfilled claims should NOT be used for audit or compliance purposes. They are a best-effort enrichment for UI display. Only claims generated at evaluation time (where backfilled is false/absent) carry full deterministic trust.
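The per-finding decision the backfill script has to make can be sketched as below. The schema_version and entity_deleted fields, the CLAIM_SCHEMA_VERSION constant, and both function names are hypothetical; the sketch also assumes that an existing claim on a deleted entity is left untouched rather than overwritten with null.

```typescript
const CLAIM_SCHEMA_VERSION = 1; // hypothetical current claim schema version

interface StoredClaim {
  backfilled?: boolean;
  claim_generated_at?: string;
  schema_version?: number; // assumed field used for the idempotency check
}

interface StoredFinding {
  id: string;
  entity_deleted: boolean; // assumed to be resolved by the script before planning
  evidence_claim?: StoredClaim | null;
}

type BackfillAction = "skip" | "generate" | "mark_unclassified";

function planBackfill(finding: StoredFinding): BackfillAction {
  const claim = finding.evidence_claim;
  // Idempotency: a claim at the current schema version is never rewritten.
  if (claim && claim.schema_version === CLAIM_SCHEMA_VERSION) return "skip";
  if (finding.entity_deleted) {
    // Nothing can be regenerated; only findings with no claim field need the null marker.
    return claim === undefined ? "mark_unclassified" : "skip";
  }
  return "generate";
}

// Stamp a freshly generated claim as backfilled, with its own generation time.
function stampBackfilled(claim: StoredClaim, now: Date): StoredClaim {
  return {
    ...claim,
    backfilled: true,
    claim_generated_at: now.toISOString(),
    schema_version: CLAIM_SCHEMA_VERSION,
  };
}
```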
Estimated effort: 4-5 days for one engineer (extra day for backfill script and two-layer confidence utilities).
Phase 2: API Exposure and Connector-Report Classification (Sprint S+1)
Deliverables:
- Make evidenceClaim required on RuleFindingCandidate (all rules now populate it)
- Update GET /findings/:id to include evidence_claim in response
- Update GET /findings to include evidence_classification, effective_classification, and evidence_classification_label in list items
- Add byClassification to findings list meta
- Add classification query parameter for filtering
- Add classification as a valid sort field
- Create MongoDB compound index: { tenant_id: 1, "evidence_claim.effective_classification": 1, status: 1 }
- Update GET /exposures and GET /exposures/:id to include classification summary
- Update OpenAPI/Zod schemas for all changed endpoints
- Connector-report findings: Classify all connector-report findings (those coming through FindingsStore in-memory rather than evaluator rules) as structural_authority by default. Add a classification field to the connector report schema so that connectors can override the default. Connector-report findings should produce an EvidenceClaim with a generic claim statement derived from the connector's description field and a basis of ["Connector detection logic"].
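The connector-report default can be sketched as follows. ConnectorReportFinding and ConnectorClaim are hypothetical, trimmed-down shapes, and using the description text verbatim as the claim statement is an assumption about what "derived from" means here.

```typescript
type EvidenceClassification =
  | "observed_execution"
  | "observed_absence"
  | "correlated_pattern"
  | "structural_authority"
  | "inferred_capability";

// Hypothetical, trimmed-down connector report entry.
interface ConnectorReportFinding {
  description: string;
  classification?: EvidenceClassification; // optional override supplied by the connector
}

// Only the claim fields this deliverable talks about.
interface ConnectorClaim {
  claim_statement: string;
  classification: EvidenceClassification;
  basis: string[];
}

function claimFromConnectorReport(report: ConnectorReportFinding): ConnectorClaim {
  return {
    claim_statement: report.description, // assumption: description is used verbatim
    classification: report.classification ?? "structural_authority",
    basis: ["Connector detection logic"],
  };
}
```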
Estimated effort: 3-4 days for one engineer (extra time for connector-report integration).
Phase 3: UI Visualization (Sprint S+2)
Prerequisite: Wiz UX research (#217) should be complete before finalizing UI deliverables. Phase 3 scope will be defined based on #217 findings.
Tentative deliverables (subject to #217 research):
- Surface effective_classification and classification_label in finding list and detail views
- Classification filter in the findings list
- Exposure detail: classification summary
- Specific component design (badge style, colors, icons, layout) to be determined after #217
Estimated effort: TBD after #217.
Open Questions
- Should classification be immutable per finding type, or can it change at runtime? Resolved: the two-layer model (static rule classification + runtime evidence confidence) addresses this. The static classification is immutable per rule. The effective_classification can differ based on the runtime confidence of underlying evidence records. See "Claim Derivation" section.
- Should correlated_pattern findings show which correlation elevated them? For example, a scope_drift finding where execution evidence was found could display "Correlated: version history + execution logs" while one without execution evidence could display "Correlated: version history + path sensitivity". This adds complexity to the basis field but makes it more informative. Recommendation: yes, populate basis dynamically based on which data sources were actually present.
- How should connector-report findings (non-evaluator) be classified? Resolved: moved to Phase 2 implementation plan. Connector-report findings default to structural_authority (consistent with how evaluator rules that read connector-set properties are classified). Connectors can override via a classification field in the report schema.
- Should the owner_role be tenant-configurable? The DEFAULT_OWNER_ROLES map provides sensible defaults, but different organizations have different role names. Recommendation: make it configurable via tenant settings in a future phase. For Phase 1, use defaults.
- What about findings that span multiple entities? Currently, every finding is scoped to a single entity. If a future rule produces cross-entity findings, the claim statement and owner role may need to handle multiple subjects. Recommendation: defer until a concrete cross-entity rule is designed.
- How does this interact with compliance_references? The existing ComplianceReference field on findings maps finding types to compliance frameworks (SOX, SOC2, etc.). The business_impact field in EvidenceClaim should reference these frameworks when available. Recommendation: have the claim builder consult getComplianceReferences(findingType) and include relevant framework names in the business impact statement.
References
Source Files
- Evidence types: src/domain/evidence/types.ts -- EvidenceConfidenceLevel, ExecutionEvidenceDoc
- Finding types: src/domain/findings/types.ts -- FindingDoc, FindingType, FINDING_TYPES
- Evidence pack types: src/domain/evidence-packs/types.ts -- EvidenceCompletenessSection, EvidencePackContent, EvidencePackDoc
- Evidence sections builder: src/evidence/sections.ts -- buildEvidencePackContent(), buildEvidenceCompleteness()
- Evaluator types: src/evaluator/types.ts -- RuleFindingCandidate, FindingRule, EvaluationContext
- Graph transformer: src/ingestion/graph-transformer.ts -- resolveConfidence()
- Ingest route: src/api/routes/ingest.ts -- NormalizedGraphSchema, EVIDENCE_CONFIDENCE_LEVELS
- Findings route: src/api/routes/findings.ts -- createFindingsRoutes()
- Exposures route: src/api/routes/exposures.ts -- createExposureRoutes()
- Evaluator rules index: src/evaluator/rules/index.ts -- ALL_RULES
- Entity types: src/domain/entities/types.ts -- EntityDoc, ExecutionPath, EntityVersionDoc
Evaluator Rules (14 rule files producing 15 finding types)
| # | Rule | File |
|---|---|---|
| 1 | dormant_authority | src/evaluator/rules/dormant-authority.ts |
| 2 | external_egress | src/evaluator/rules/external-egress.ts |
| 3 | llm_egress | src/evaluator/rules/llm-egress.ts |
| 4 | orphaned_ownership | src/evaluator/rules/orphaned-ownership.ts |
| 5 | ownership_degraded | src/evaluator/rules/orphaned-ownership.ts (variant) |
| 6 | ownership_ambiguous | src/evaluator/rules/ownership-ambiguous.ts |
| 7 | ownership_unknown | src/evaluator/rules/ownership-unknown.ts |
| 8 | privilege_justification_gap | src/evaluator/rules/privilege-justification-gap.ts |
| 9 | reachable_sensitive_domain | src/evaluator/rules/reachable-sensitive-domain.ts |
| 10 | unknown_identity_binding | src/evaluator/rules/unknown-identity-binding.ts |
| 11 | unproven_execution | src/evaluator/rules/unproven-execution.ts |
| 12 | unresolved_cross_system_auth | src/evaluator/rules/unresolved-auth.ts |
| 13 | scope_drift | src/evaluator/rules/scope-drift.ts |
| 14 | reachability_drift | src/evaluator/rules/reachability-drift.ts |
| 15 | ownership_drift | src/evaluator/rules/ownership-drift.ts |
GitHub
- Issue: #214