Evidence Model Separation: Claim Type vs Evidence Strength

Problem Statement

The current EvidenceClassification type conflates two orthogonal concerns into a single enum:

Claim type — What are we asserting? (execution happened, permission exists structurally, capability is inferred, absence confirmed)
Evidence strength — How confident are we? (deterministic proof, strong correlation, structural derivation, inference)

The existing model defines five values that each encode both dimensions simultaneously:

export const EVIDENCE_CLASSIFICATIONS = [
  "observed_execution",
  "observed_absence",
  "correlated_pattern",
  "structural_authority",
  "inferred_capability",
] as const;

As Sergey noted in his March 31 founder feedback, the platform must distinguish between "the type of claim being made and the strength of the supporting evidence." These are separate axes. A claim about execution can have deterministic or correlated evidence. A claim about permission can have structural or inferred evidence. Flattening them into a single dimension loses information and constrains how the UI can present trust to users.

Current Model Analysis

Each of the five current classifications implicitly encodes a (claim_type, strength) pair:

Current Classification	Implicit Claim Type	Implicit Evidence Strength	Rank
`observed_execution`	Execution happened	Deterministic	0
`observed_absence`	Execution did NOT happen	Deterministic	1
`correlated_pattern`	Execution likely	Correlation	2
`structural_authority`	Permission exists	Structural derivation	3
`inferred_capability`	Execution possible	Inference	4

This is a 2D matrix being flattened into a 1D ranking. The flattening creates several problems:

No way to express new combinations. A structurally-derived execution claim (e.g., "this identity executed based on permission + temporal correlation") has no classification. It would need to be forced into either correlated_pattern or structural_authority, losing precision.
Ranking conflates confidence with claim semantics. observed_absence (rank 1) ranks below observed_execution (rank 0) even though both are deterministic, so the ranking reflects claim importance rather than evidence quality.
UI cannot independently display trust. The badge system (PR #242) must map a single value to both "what kind of thing is this?" and "how much should I trust it?" — forcing color to carry two meanings.
effective_classification logic is constrained. The current weakestClassification function compares two values on a single axis, but "weakest" is only meaningful for evidence strength, not for claim type.

Proposed Two-Axis Model

Claim Types (what we assert)

Claim Type	Meaning
`execution_observed`	We saw this identity execute this action in source system logs
`execution_absent`	We confirmed this identity did NOT execute this action
`permission_exists`	Structural authority exists (role, permission, or policy grant)
`capability_inferred`	Indirect signals suggest this identity could perform this action

Claim types are mutually exclusive for a given evidence claim. Each claim asserts exactly one thing.

Evidence Strength (how confident we are)

Evidence Strength	Meaning
`deterministic`	Direct proof from source system logs or records
`correlated`	Cross-source pattern match, high confidence
`structural`	Derived from graph structure, permissions, or configuration
`inferred`	Indirect signals only, lowest confidence

Evidence strength is orderable — deterministic > correlated > structural > inferred. This ordering is what the renamed weakestStrength function should operate on (currently weakestClassification).
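
A minimal sketch of how the two axes could be declared, following the same `as const` pattern as EVIDENCE_CLASSIFICATIONS. ClaimType and EvidenceStrength are the type names used later in this document; CLAIM_TYPES, EVIDENCE_STRENGTHS, and strengthRank are hypothetical names, and deriving the rank from the array index is an assumption rather than a decided design.

```typescript
// Hypothetical constant names; the values come from the two tables above.
const CLAIM_TYPES = [
  "execution_observed",
  "execution_absent",
  "permission_exists",
  "capability_inferred",
] as const;
type ClaimType = (typeof CLAIM_TYPES)[number];

// Ordered strongest to weakest, so the array index doubles as the rank.
const EVIDENCE_STRENGTHS = [
  "deterministic",
  "correlated",
  "structural",
  "inferred",
] as const;
type EvidenceStrength = (typeof EVIDENCE_STRENGTHS)[number];

// Lower rank means stronger evidence (0 = deterministic), an assumed convention
// that mirrors the rank column of the current model.
function strengthRank(strength: EvidenceStrength): number {
  return EVIDENCE_STRENGTHS.indexOf(strength);
}
```

Only the strength axis gets a rank here, since claim types are mutually exclusive labels rather than points on a scale.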

The 2D Matrix

This separation allows the full matrix of valid combinations:

Claim \ Strength	deterministic	correlated	structural	inferred
execution_observed	Direct log proof	Cross-source correlation	—	—
execution_absent	Confirmed no logs	Absence across sources	—	—
permission_exists	—	—	Role/permission graph	Indirect permission signal
capability_inferred	—	—	Structural reachability	Behavioral inference

Not all cells are valid. Execution claims (execution_observed, execution_absent) require deterministic or correlated evidence. Permission and capability claims (permission_exists, capability_inferred) require structural or inferred evidence. This constraint should be encoded in the type system.
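
One way the matrix constraint might be encoded is a mapped type that lists the strengths each claim type admits, plus a runtime mirror for data the compiler never sees. This is a sketch, not the adopted design: ValidStrengths, ClaimAxes, VALID_STRENGTHS, and isValidPair are hypothetical names.

```typescript
type ClaimType =
  | "execution_observed"
  | "execution_absent"
  | "permission_exists"
  | "capability_inferred";
type EvidenceStrength = "deterministic" | "correlated" | "structural" | "inferred";

// Strengths each claim type admits, mirroring the non-empty cells of the matrix.
type ValidStrengths = {
  execution_observed: "deterministic" | "correlated";
  execution_absent: "deterministic" | "correlated";
  permission_exists: "structural" | "inferred";
  capability_inferred: "structural" | "inferred";
};

// Union of every valid (claim_type, evidence_strength) pair.
type ClaimAxes = {
  [C in ClaimType]: { claim_type: C; evidence_strength: ValidStrengths[C] };
}[ClaimType];

// Runtime mirror of the same table, for untyped input such as stored documents.
const VALID_STRENGTHS: { [C in ClaimType]: readonly ValidStrengths[C][] } = {
  execution_observed: ["deterministic", "correlated"],
  execution_absent: ["deterministic", "correlated"],
  permission_exists: ["structural", "inferred"],
  capability_inferred: ["structural", "inferred"],
};

function isValidPair(claimType: ClaimType, strength: EvidenceStrength): boolean {
  const allowed: readonly EvidenceStrength[] = VALID_STRENGTHS[claimType];
  return allowed.some((s) => s === strength);
}

// Compiles: structural evidence is allowed for a permission claim.
const structuralPermission: ClaimAxes = {
  claim_type: "permission_exists",
  evidence_strength: "structural",
};

// Would not compile: structural evidence is not allowed for an execution claim.
// const invalid: ClaimAxes = { claim_type: "execution_observed", evidence_strength: "structural" };
```

With this shape, an invalid cell is rejected at compile time wherever claims are constructed in code, and isValidPair covers the cases where the pair arrives as plain data.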

Mapping from Current Model

The migration is deterministic — each old value maps to exactly one (claim_type, strength) pair:

Current	claim_type	evidence_strength
`observed_execution`	`execution_observed`	`deterministic`
`observed_absence`	`execution_absent`	`deterministic`
`correlated_pattern`	`execution_observed`	`correlated`
`structural_authority`	`permission_exists`	`structural`
`inferred_capability`	`capability_inferred`	`inferred`
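
The table above translates directly into a lookup. The sketch below assumes a hypothetical LEGACY_TO_AXES constant and migrateClassification helper; only the five legacy values and their target pairs come from this document.

```typescript
type EvidenceClassification =
  | "observed_execution"
  | "observed_absence"
  | "correlated_pattern"
  | "structural_authority"
  | "inferred_capability";
type ClaimType =
  | "execution_observed"
  | "execution_absent"
  | "permission_exists"
  | "capability_inferred";
type EvidenceStrength = "deterministic" | "correlated" | "structural" | "inferred";

interface ClaimAxes {
  claim_type: ClaimType;
  evidence_strength: EvidenceStrength;
}

// One entry per legacy value. Record<...> makes the compiler flag a missing key.
const LEGACY_TO_AXES: Record<EvidenceClassification, ClaimAxes> = {
  observed_execution: { claim_type: "execution_observed", evidence_strength: "deterministic" },
  observed_absence: { claim_type: "execution_absent", evidence_strength: "deterministic" },
  correlated_pattern: { claim_type: "execution_observed", evidence_strength: "correlated" },
  structural_authority: { claim_type: "permission_exists", evidence_strength: "structural" },
  inferred_capability: { claim_type: "capability_inferred", evidence_strength: "inferred" },
};

// Hypothetical helper: translate one legacy value into the two-axis pair.
function migrateClassification(classification: EvidenceClassification): ClaimAxes {
  return LEGACY_TO_AXES[classification];
}
```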

Aggregation Semantics

Evidence Strength Aggregation

When an access chain contains multiple evidence claims, the effective strength is the weakest strength across all claims in the chain. This is the existing weakestClassification logic, renamed to weakestStrength. It operates purely on the strength axis:

deterministic > correlated > structural > inferred

If a chain has one deterministic claim and one inferred claim, the effective strength is inferred. This is conservative by design — the chain is only as strong as its weakest link.
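
The weakest-link rule can be sketched as follows. The two-argument shape of weakestStrength follows the description of weakestClassification as comparing two values; STRENGTH_ORDER and chainStrength are hypothetical names, and returning undefined for an empty chain is an assumption.

```typescript
type EvidenceStrength = "deterministic" | "correlated" | "structural" | "inferred";

// Strongest first; a higher index means weaker evidence.
const STRENGTH_ORDER: readonly EvidenceStrength[] = [
  "deterministic",
  "correlated",
  "structural",
  "inferred",
];

// Return whichever of the two strengths is weaker.
function weakestStrength(a: EvidenceStrength, b: EvidenceStrength): EvidenceStrength {
  return STRENGTH_ORDER.indexOf(a) >= STRENGTH_ORDER.indexOf(b) ? a : b;
}

// Fold the pairwise comparison over every claim in a chain.
// An empty chain has no strength, so it yields undefined (assumed behavior).
function chainStrength(strengths: readonly EvidenceStrength[]): EvidenceStrength | undefined {
  if (strengths.length === 0) return undefined;
  return strengths.reduce(weakestStrength);
}
```

For the example above, chainStrength(["deterministic", "inferred"]) evaluates to "inferred".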

Claim Type Aggregation

Claim type does not aggregate the same way. An access chain may contain claims of different types — e.g., an execution_observed claim (identity ran a query) alongside a permission_exists claim (identity has a role grant). These are not comparable on a single axis.

For access chain summarization, the effective claim type follows a priority rule — the chain is characterized by the strongest assertion it contains:

execution_observed — chain includes observed activity (strongest: proves usage)
execution_absent — chain includes confirmed non-usage
permission_exists — chain is structural only (permission granted, no execution data)
capability_inferred — chain is entirely inferred (weakest: no direct evidence)

If a chain has both execution_observed and capability_inferred claims, the effective claim type is execution_observed — the chain demonstrably includes real activity, even if some paths within it are inferred.

This is semantically different from strength aggregation: strength takes the weakest (conservative for trust), claim type takes the strongest (most informative for triage).
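
The priority rule for claim types could look like the sketch below. CLAIM_PRIORITY and effectiveClaimType are hypothetical names; the order is the one listed above, and returning undefined for an empty chain is an assumption.

```typescript
type ClaimType =
  | "execution_observed"
  | "execution_absent"
  | "permission_exists"
  | "capability_inferred";

// Highest-priority assertion first, as listed in the priority rule.
const CLAIM_PRIORITY: readonly ClaimType[] = [
  "execution_observed",
  "execution_absent",
  "permission_exists",
  "capability_inferred",
];

// The first priority entry that appears anywhere in the chain wins.
function effectiveClaimType(claimTypes: readonly ClaimType[]): ClaimType | undefined {
  for (const candidate of CLAIM_PRIORITY) {
    if (claimTypes.indexOf(candidate) !== -1) return candidate;
  }
  return undefined; // empty chain: nothing to summarize (assumed behavior)
}
```

Unlike strength aggregation, the result depends only on which claim types are present, not on how many claims carry each type.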

Decided: The UI shows the effective claim type as the primary label, with an inline count breakdown (e.g., "4 observed · 1 inferred") visible without hover. This keeps cards scannable while answering "how much of this chain is proven?" without requiring the detail view.
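
The inline breakdown might be produced along these lines. This is only a sketch: claimBreakdown, SHORT_LABEL, and DISPLAY_ORDER are hypothetical names, and the short words "absent" and "permission" are assumed wording, since the decision above only gives "observed" and "inferred" as examples.

```typescript
type ClaimType =
  | "execution_observed"
  | "execution_absent"
  | "permission_exists"
  | "capability_inferred";

const SHORT_LABEL: Record<ClaimType, string> = {
  execution_observed: "observed",
  execution_absent: "absent", // assumed wording
  permission_exists: "permission", // assumed wording
  capability_inferred: "inferred",
};

// Same order as the claim type priority rule, so the effective type is listed first.
const DISPLAY_ORDER: readonly ClaimType[] = [
  "execution_observed",
  "execution_absent",
  "permission_exists",
  "capability_inferred",
];

// Count claims per type and join the non-zero counts with a middle dot.
function claimBreakdown(claimTypes: readonly ClaimType[]): string {
  const parts: string[] = [];
  for (const claimType of DISPLAY_ORDER) {
    const count = claimTypes.filter((t) => t === claimType).length;
    if (count > 0) parts.push(`${count} ${SHORT_LABEL[claimType]}`);
  }
  return parts.join(" · ");
}
```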

User-Facing Trust Language

The four evidence strength values need simplified, non-technical labels for the UI. Proposed mapping:

Evidence Strength	User-Facing Label	Badge Color	Meaning for the User
`deterministic`	Confirmed	Green	Direct proof from source systems
`correlated`	Likely	Blue	Correlated across data sources
`structural`	Configured	Amber	Derived from permissions and configuration
`inferred`	Possible	Gray	Inferred from indirect signals

"Configured" was chosen over "Structural" because it communicates to non-technical users that the evidence comes from how systems are set up (roles, permissions, policies) rather than from observed behavior. "Derived" was rejected as too vague; "Granted" implies a deliberate human act, which doesn't cover inherited or default permissions. Decided: "Configured".

Claim type should be communicated via icon or label, not color:

Execution observed → activity/log icon
Execution absent → empty/check icon
Permission exists → key/lock icon
Capability inferred → question/signal icon

This gives users two independent visual channels: color for "how much do I trust this?" and icon for "what kind of claim is this?"
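
A sketch of the two visual channels as independent lookups. The labels and colors come from the table above; STRENGTH_BADGE, CLAIM_ICON, and badgeFor are hypothetical names, and the icon identifiers are placeholders that pick one of the two options listed for each claim type.

```typescript
type ClaimType =
  | "execution_observed"
  | "execution_absent"
  | "permission_exists"
  | "capability_inferred";
type EvidenceStrength = "deterministic" | "correlated" | "structural" | "inferred";

// Color channel: answers "how much do I trust this?"
const STRENGTH_BADGE: Record<EvidenceStrength, { label: string; color: string }> = {
  deterministic: { label: "Confirmed", color: "green" },
  correlated: { label: "Likely", color: "blue" },
  structural: { label: "Configured", color: "amber" },
  inferred: { label: "Possible", color: "gray" },
};

// Icon channel: answers "what kind of claim is this?" (placeholder icon names)
const CLAIM_ICON: Record<ClaimType, string> = {
  execution_observed: "activity",
  execution_absent: "check",
  permission_exists: "key",
  capability_inferred: "question",
};

// Combine both channels; neither lookup depends on the other axis.
function badgeFor(claimType: ClaimType, strength: EvidenceStrength) {
  return { ...STRENGTH_BADGE[strength], icon: CLAIM_ICON[claimType] };
}
```

Because each lookup reads exactly one axis, changing a label or color never affects the icon, and the reverse also holds.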

Impact Assessment

`EvidenceClaim` Interface

Add a claim_type field and rename classification and its derived fields to their strength equivalents:

export interface EvidenceClaim {
  claim_statement: string;
  claim_type: ClaimType;           // NEW
  evidence_strength: EvidenceStrength; // RENAMED from classification
  runtime_confidence?: EvidenceStrength;
  effective_strength: EvidenceStrength; // RENAMED from effective_classification
  strength_rank: number;           // RENAMED from classification_rank
  strength_label: string;          // RENAMED from classification_label
  basis: string[];
  business_impact: string;
  recommended_action: string;
  section_confidence: EvidenceCompletenessSection;
}

Effective Strength Logic (replaces `effective_classification`)

The computeEffectiveClassification function becomes computeEffectiveStrength and operates only on the strength axis. The weakestClassification function becomes weakestStrength. Claim type does not participate in strength computation — it passes through unchanged.
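
A sketch of what the renamed function might look like. This document does not spell out how the effective value is derived, so the rule used here (the weaker of evidence_strength and runtime_confidence when the latter is present) is an assumption, as is the StrengthInputs shape.

```typescript
type ClaimType =
  | "execution_observed"
  | "execution_absent"
  | "permission_exists"
  | "capability_inferred";
type EvidenceStrength = "deterministic" | "correlated" | "structural" | "inferred";

const STRENGTH_ORDER: readonly EvidenceStrength[] = [
  "deterministic",
  "correlated",
  "structural",
  "inferred",
];

function weakestStrength(a: EvidenceStrength, b: EvidenceStrength): EvidenceStrength {
  return STRENGTH_ORDER.indexOf(a) >= STRENGTH_ORDER.indexOf(b) ? a : b;
}

// Hypothetical subset of EvidenceClaim: just the fields this computation reads.
interface StrengthInputs {
  claim_type: ClaimType;
  evidence_strength: EvidenceStrength;
  runtime_confidence?: EvidenceStrength;
}

// Assumed rule: runtime confidence can only weaken the declared strength.
// claim_type is deliberately never read; it passes through unchanged.
function computeEffectiveStrength(claim: StrengthInputs): EvidenceStrength {
  if (claim.runtime_confidence === undefined) return claim.evidence_strength;
  return weakestStrength(claim.evidence_strength, claim.runtime_confidence);
}
```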

Evaluator Rules

All 24 evaluator rule files call buildEvidenceClaim with a classification parameter. Each call must be updated to provide both claim_type and evidence_strength. Since every rule currently passes a single classification that deterministically maps to one (claim_type, strength) pair, this is a mechanical change.

UI Badges (PR #242)

The current badge system maps classification to a single color. Under the new model:

Badge color maps to evidence_strength (Confirmed=green, Likely=blue, Configured=amber, Possible=gray)
A separate icon or label maps to claim_type

Evidence Packs

section_confidence in evidence packs is unaffected. It measures data availability (which connector sections returned data), not claim strength. No changes needed.

API Responses

evidence_classification field → evidence_strength
New field: evidence_claim_type
effective_classification → effective_strength
classification_rank → strength_rank
classification_label → strength_label

Migration Feasibility

Since each current classification value deterministically maps to exactly one (claim_type, evidence_strength) pair, migration can be done in a single pass:

Compute the new fields from the old field for all stored documents.
No ambiguity, no manual review needed.
Backwards compatibility can be maintained by computing the old field from the new pair during the transition period.
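
The backwards-compatibility direction can be sketched as a reverse lookup. AXES_TO_LEGACY and toLegacyClassification are hypothetical names. The sketch returns undefined for valid pairs that have no legacy value (for example, execution_absent with correlated); how the transition period should treat those pairs is not settled here and is left as an open assumption.

```typescript
type EvidenceClassification =
  | "observed_execution"
  | "observed_absence"
  | "correlated_pattern"
  | "structural_authority"
  | "inferred_capability";
type ClaimType =
  | "execution_observed"
  | "execution_absent"
  | "permission_exists"
  | "capability_inferred";
type EvidenceStrength = "deterministic" | "correlated" | "structural" | "inferred";

type AxesKey = `${ClaimType}:${EvidenceStrength}`;

// Only the five pairs produced by the forward migration have a legacy value.
const AXES_TO_LEGACY: Partial<Record<AxesKey, EvidenceClassification>> = {
  "execution_observed:deterministic": "observed_execution",
  "execution_absent:deterministic": "observed_absence",
  "execution_observed:correlated": "correlated_pattern",
  "permission_exists:structural": "structural_authority",
  "capability_inferred:inferred": "inferred_capability",
};

// Compute the old field from the new pair; undefined means "no legacy equivalent".
function toLegacyClassification(
  claimType: ClaimType,
  strength: EvidenceStrength,
): EvidenceClassification | undefined {
  const key: AxesKey = `${claimType}:${strength}`;
  return AXES_TO_LEGACY[key];
}
```

Every pair written by the single-pass migration round-trips to its original value, so stored documents are unaffected; the undefined case can only arise for claims created after the new combinations come into use.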

Migration Path

Phase 1: Add New Fields (Backwards Compatible)

Add claim_type and evidence_strength to the EvidenceClaim interface as optional fields
Populate new fields alongside existing ones in buildEvidenceClaim
API returns both old and new field names
No breaking changes for consumers

Phase 2: Update Evaluator Rules

Update all 24 evaluator rule files to explicitly provide claim_type and evidence_strength
Update buildEvidenceClaim signature to require both new fields
Rename computeEffectiveClassification to computeEffectiveStrength and weakestClassification to weakestStrength

Phase 3: Update UI

Update badge rendering to use evidence_strength for color
Add claim type icons
Update user-facing labels to Confirmed / Likely / Configured / Possible
Update any filtering or sorting logic to use new field names

Phase 4: Deprecate Old Fields

Mark classification, effective_classification, classification_rank, and classification_label as deprecated in the API
Remove old fields from the EvidenceClaim interface
Remove backwards-compatibility mapping
Clean up stored documents

Next Action

Status: adopted

Decisions made:

Evidence model: Full two-axis separation (claim type + evidence strength). Not a cosmetic rename — the 2D matrix and independent UI channels are required.
User-facing labels: Confirmed / Likely / Configured / Possible. "Configured" chosen over "Structural", "Derived", and "Granted".
Claim type display: Effective claim type as primary label, with inline count breakdown (e.g., "4 observed · 1 inferred").
Timing: Implement before access chain UI work so the new model is the foundation, not a retrofit.

Implementation:

Create GitHub issues in sv0-platform for each migration phase
Phase 1 (backwards-compatible new fields) can start immediately
Phases 2–4 follow sequentially per the migration path above
