Evidence Model Separation: Claim Type vs Evidence Strength
Problem Statement
The current EvidenceClassification type conflates two orthogonal concerns into a single enum:
- Claim type: What are we asserting? (execution happened, permission exists structurally, capability is inferred, absence confirmed)
- Evidence strength: How confident are we? (deterministic proof, strong correlation, structural derivation, inference)
The existing model defines five values that each encode both dimensions simultaneously:
export const EVIDENCE_CLASSIFICATIONS = [
  "observed_execution",
  "observed_absence",
  "correlated_pattern",
  "structural_authority",
  "inferred_capability",
] as const;
As Sergey noted in his March 31 founder feedback, the platform must distinguish between "the type of claim being made and the strength of the supporting evidence." These are separate axes: a claim about execution can have deterministic or correlated evidence, and a claim about permission can have structural or inferred evidence. Flattening them into a single dimension loses information and constrains how the UI can present trust to users.
Current Model Analysis
Each of the five current classifications implicitly encodes a (claim_type, strength) pair:
| Current Classification | Implicit Claim Type | Implicit Evidence Strength | Rank |
|---|---|---|---|
| observed_execution | Execution happened | Deterministic | 0 |
| observed_absence | Execution did NOT happen | Deterministic | 1 |
| correlated_pattern | Execution likely | Correlation | 2 |
| structural_authority | Permission exists | Structural derivation | 3 |
| inferred_capability | Execution possible | Inference | 4 |
This is a 2D matrix being flattened into a 1D ranking. The flattening creates several problems:
- No way to express new combinations. A structurally derived execution claim (e.g., "this identity executed based on permission + temporal correlation") has no classification. It would need to be forced into either correlated_pattern or structural_authority, losing precision.
- Ranking conflates confidence with claim semantics. observed_absence (rank 1) is ranked lower than observed_execution (rank 0) even though both are deterministic. The ranking is really about claim importance, not evidence quality.
- UI cannot independently display trust. The badge system (PR #242) must map a single value to both "what kind of thing is this?" and "how much should I trust it?", forcing color to carry two meanings.
- effective_classification logic is constrained. The current weakestClassification function compares two values on a single axis, but "weakest" is only meaningful for evidence strength, not for claim type.
Proposed Two-Axis Model
Claim Types (what we assert)
| Claim Type | Meaning |
|---|---|
| execution_observed | We saw this identity execute this action in source system logs |
| execution_absent | We confirmed this identity did NOT execute this action |
| permission_exists | Structural authority exists (role, permission, or policy grant) |
| capability_inferred | Indirect signals suggest this identity could perform this action |
Claim types are mutually exclusive for a given evidence claim. Each claim asserts exactly one thing.
Evidence Strength (how confident we are)
| Evidence Strength | Meaning |
|---|---|
| deterministic | Direct proof from source system logs or records |
| correlated | Cross-source pattern match, high confidence |
| structural | Derived from graph structure, permissions, or configuration |
| inferred | Indirect signals only, lowest confidence |
Evidence strength is orderable — deterministic > correlated > structural > inferred. This ordering is what the renamed weakestStrength function should operate on (currently weakestClassification).
The 2D Matrix
This separation allows the full matrix of valid combinations:
| Claim \ Strength | deterministic | correlated | structural | inferred |
|---|---|---|---|---|
| execution_observed | Direct log proof | Cross-source correlation | — | — |
| execution_absent | Confirmed no logs | Absence across sources | — | — |
| permission_exists | — | — | Role/permission graph | Indirect permission signal |
| capability_inferred | — | — | Structural reachability | Behavioral inference |
Not all cells are valid. Execution claims (execution_observed, execution_absent) require deterministic or correlated evidence. Permission and capability claims (permission_exists, capability_inferred) require structural or inferred evidence. This constraint should be encoded in the type system.
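One way the valid-cell constraint could be encoded is a discriminated union backed by a runtime lookup for untyped data. This is a minimal sketch, not the adopted implementation: the value names come from the tables above, while `ExecutionClaimType`, `AuthorityClaimType`, `TypedClaim`, `VALID_STRENGTHS`, and `isValidCombination` are hypothetical names introduced for illustration.

```typescript
// Value names follow the claim type and evidence strength tables.
// The grouping type names below are assumptions made for this sketch.
type ExecutionClaimType = "execution_observed" | "execution_absent";
type AuthorityClaimType = "permission_exists" | "capability_inferred";
type ClaimType = ExecutionClaimType | AuthorityClaimType;

type EvidenceStrength = "deterministic" | "correlated" | "structural" | "inferred";

// Compile-time encoding: each branch pairs a group of claim types
// with the only strengths the matrix allows for that group.
type TypedClaim =
  | { claim_type: ExecutionClaimType; evidence_strength: "deterministic" | "correlated" }
  | { claim_type: AuthorityClaimType; evidence_strength: "structural" | "inferred" };

// Runtime mirror of the same matrix, for data that arrives untyped
// (stored documents, API input).
const VALID_STRENGTHS: Record<ClaimType, readonly EvidenceStrength[]> = {
  execution_observed: ["deterministic", "correlated"],
  execution_absent: ["deterministic", "correlated"],
  permission_exists: ["structural", "inferred"],
  capability_inferred: ["structural", "inferred"],
};

function isValidCombination(claimType: ClaimType, strength: EvidenceStrength): boolean {
  return VALID_STRENGTHS[claimType].indexOf(strength) !== -1;
}

// Accepted by the compiler: a correlated execution claim is a valid cell.
const validExample: TypedClaim = {
  claim_type: "execution_observed",
  evidence_strength: "correlated",
};

// Rejected by the compiler if uncommented: execution claims cannot carry structural evidence.
// const invalidExample: TypedClaim = { claim_type: "execution_observed", evidence_strength: "structural" };
```

The union catches invalid pairs at authoring time (for example in evaluator rules), while the lookup table covers values read from storage, where the compiler cannot help.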
Mapping from Current Model
The migration is deterministic — each old value maps to exactly one (claim_type, strength) pair:
| Current | claim_type | evidence_strength |
|---|---|---|
| observed_execution | execution_observed | deterministic |
| observed_absence | execution_absent | deterministic |
| correlated_pattern | execution_observed | correlated |
| structural_authority | permission_exists | structural |
| inferred_capability | capability_inferred | inferred |
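
The mapping table above translates directly into a lookup. The following is a sketch under stated assumptions: `EVIDENCE_CLASSIFICATIONS` is the existing constant quoted earlier, while `ClaimAxes`, `LEGACY_TO_AXES`, and `migrateClassification` are hypothetical names used only for illustration.

```typescript
// Existing constant, as quoted in the problem statement.
const EVIDENCE_CLASSIFICATIONS = [
  "observed_execution",
  "observed_absence",
  "correlated_pattern",
  "structural_authority",
  "inferred_capability",
] as const;
type EvidenceClassification = (typeof EVIDENCE_CLASSIFICATIONS)[number];

type ClaimType =
  | "execution_observed"
  | "execution_absent"
  | "permission_exists"
  | "capability_inferred";
type EvidenceStrength = "deterministic" | "correlated" | "structural" | "inferred";

// Hypothetical container for the two new axes.
interface ClaimAxes {
  claim_type: ClaimType;
  evidence_strength: EvidenceStrength;
}

// One entry per row of the mapping table. Record<...> makes the compiler
// reject the table if any legacy value is missing.
const LEGACY_TO_AXES: Record<EvidenceClassification, ClaimAxes> = {
  observed_execution: { claim_type: "execution_observed", evidence_strength: "deterministic" },
  observed_absence: { claim_type: "execution_absent", evidence_strength: "deterministic" },
  correlated_pattern: { claim_type: "execution_observed", evidence_strength: "correlated" },
  structural_authority: { claim_type: "permission_exists", evidence_strength: "structural" },
  inferred_capability: { claim_type: "capability_inferred", evidence_strength: "inferred" },
};

function migrateClassification(classification: EvidenceClassification): ClaimAxes {
  return LEGACY_TO_AXES[classification];
}
```

Because the table is total over the five legacy values, the migration needs no fallback branch.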
Aggregation Semantics
Evidence Strength Aggregation
When an access chain contains multiple evidence claims, the effective strength is the weakest strength across all claims in the chain. This is the existing weakestClassification logic, renamed to weakestStrength. It operates purely on the strength axis:
deterministic > correlated > structural > inferred
If a chain has one deterministic claim and one inferred claim, the effective strength is inferred. This is conservative by design — the chain is only as strong as its weakest link.
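A minimal sketch of the weakest-link rule follows. It assumes that strength order can be represented by position in an ordered list; `STRENGTH_ORDER`, `strengthRank`, and `chainStrength` are hypothetical helper names, and only `weakestStrength` is named in this document.

```typescript
// Ordered strongest to weakest, matching:
// deterministic > correlated > structural > inferred
const STRENGTH_ORDER = ["deterministic", "correlated", "structural", "inferred"] as const;
type EvidenceStrength = (typeof STRENGTH_ORDER)[number];

// Assumption: a higher index means weaker evidence.
function strengthRank(strength: EvidenceStrength): number {
  return STRENGTH_ORDER.indexOf(strength);
}

// Returns whichever of the two strengths is weaker.
function weakestStrength(a: EvidenceStrength, b: EvidenceStrength): EvidenceStrength {
  return strengthRank(a) >= strengthRank(b) ? a : b;
}

// Folds weakestStrength over every claim in a chain.
// An empty chain has no strength, so the result is undefined.
function chainStrength(strengths: readonly EvidenceStrength[]): EvidenceStrength | undefined {
  if (strengths.length === 0) return undefined;
  return strengths.reduce(weakestStrength);
}
```

For the example above, `chainStrength(["deterministic", "inferred"])` yields `"inferred"`.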
Claim Type Aggregation
Claim type does not aggregate the same way. An access chain may contain claims of different types — e.g., an execution_observed claim (identity ran a query) alongside a permission_exists claim (identity has a role grant). These are not comparable on a single axis.
For access chain summarization, the effective claim type follows a priority rule — the chain is characterized by the strongest assertion it contains:
1. execution_observed: chain includes observed activity (strongest: proves usage)
2. execution_absent: chain includes confirmed non-usage
3. permission_exists: chain is structural only (permission granted, no execution data)
4. capability_inferred: chain is entirely inferred (weakest: no direct evidence)
If a chain has both execution_observed and capability_inferred claims, the effective claim type is execution_observed — the chain demonstrably includes real activity, even if some paths within it are inferred.
This is semantically different from strength aggregation: strength takes the weakest (conservative for trust), claim type takes the strongest (most informative for triage).
Decided: The UI shows the effective claim type as the primary label, with an inline count breakdown (e.g., "4 observed · 1 inferred") visible without hover. This keeps cards scannable while answering "how much of this chain is proven?" without requiring the detail view.
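The priority rule and the inline breakdown could be sketched as follows. Several things here are assumptions: that the breakdown counts claims by claim type, the short labels "absent" and "permitted" (only "observed" and "inferred" appear in the example above), and the names `CLAIM_TYPE_PRIORITY`, `SHORT_LABEL`, `effectiveClaimType`, and `claimTypeBreakdown`.

```typescript
// Priority order from the rule above, strongest assertion first.
const CLAIM_TYPE_PRIORITY = [
  "execution_observed",
  "execution_absent",
  "permission_exists",
  "capability_inferred",
] as const;
type ClaimType = (typeof CLAIM_TYPE_PRIORITY)[number];

// Hypothetical short labels for the breakdown text.
const SHORT_LABEL: Record<ClaimType, string> = {
  execution_observed: "observed",
  execution_absent: "absent",
  permission_exists: "permitted",
  capability_inferred: "inferred",
};

// Returns the highest-priority claim type present in the chain,
// or undefined for an empty chain.
function effectiveClaimType(types: readonly ClaimType[]): ClaimType | undefined {
  for (const candidate of CLAIM_TYPE_PRIORITY) {
    if (types.indexOf(candidate) !== -1) return candidate;
  }
  return undefined;
}

// Builds the count breakdown in priority order, skipping types with zero claims.
function claimTypeBreakdown(types: readonly ClaimType[]): string {
  const parts: string[] = [];
  for (const candidate of CLAIM_TYPE_PRIORITY) {
    const count = types.filter((t) => t === candidate).length;
    if (count > 0) parts.push(`${count} ${SHORT_LABEL[candidate]}`);
  }
  return parts.join(" · ");
}
```

With four execution_observed claims and one capability_inferred claim, the effective claim type is execution_observed and the breakdown reads "4 observed · 1 inferred".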
User-Facing Trust Language
The four evidence strength values need simplified, non-technical labels for the UI. Proposed mapping:
| Evidence Strength | User-Facing Label | Badge Color | Meaning for the User |
|---|---|---|---|
| deterministic | Confirmed | Green | Direct proof from source systems |
| correlated | Likely | Blue | Correlated across data sources |
| structural | Configured | Amber | Derived from permissions and configuration |
| inferred | Possible | Gray | Inferred from indirect signals |
"Configured" was chosen over "Structural" because it communicates to non-technical users that the evidence comes from how systems are set up (roles, permissions, policies) rather than from observed behavior. "Derived" was rejected as too vague; "Granted" implies a deliberate human act, which doesn't cover inherited or default permissions. Decided: "Configured".
Claim type should be communicated via icon or label, not color:
- Execution observed → activity/log icon
- Execution absent → empty/check icon
- Permission exists → key/lock icon
- Capability inferred → question/signal icon
This gives users two independent visual channels: color for "how much do I trust this?" and icon for "what kind of claim is this?"
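The two visual channels can be kept independent by resolving them from separate lookup tables. This is an illustrative sketch: the labels and colors come from the table above, but the icon identifiers and the names `STRENGTH_BADGES`, `CLAIM_TYPE_ICONS`, and `describeBadge` are hypothetical and do not refer to the existing badge component.

```typescript
type ClaimType =
  | "execution_observed"
  | "execution_absent"
  | "permission_exists"
  | "capability_inferred";
type EvidenceStrength = "deterministic" | "correlated" | "structural" | "inferred";

interface StrengthBadge {
  label: string;
  color: string;
}

// Color channel: driven only by evidence strength.
const STRENGTH_BADGES: Record<EvidenceStrength, StrengthBadge> = {
  deterministic: { label: "Confirmed", color: "green" },
  correlated: { label: "Likely", color: "blue" },
  structural: { label: "Configured", color: "amber" },
  inferred: { label: "Possible", color: "gray" },
};

// Icon channel: driven only by claim type. Icon names are placeholders.
const CLAIM_TYPE_ICONS: Record<ClaimType, string> = {
  execution_observed: "activity",
  execution_absent: "empty-check",
  permission_exists: "key",
  capability_inferred: "signal",
};

// Combines both channels without letting either influence the other.
function describeBadge(claimType: ClaimType, strength: EvidenceStrength) {
  const badge = STRENGTH_BADGES[strength];
  return { label: badge.label, color: badge.color, icon: CLAIM_TYPE_ICONS[claimType] };
}
```

Keeping the tables separate means a new claim type only needs an icon entry, and a relabeled strength only touches the badge table.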
Impact Assessment
EvidenceClaim Interface
Add claim_type field, rename classification to evidence_strength:
export interface EvidenceClaim {
  claim_statement: string;
  claim_type: ClaimType; // NEW
  evidence_strength: EvidenceStrength; // RENAMED from classification
  runtime_confidence?: EvidenceStrength;
  effective_strength: EvidenceStrength; // RENAMED from effective_classification
  strength_rank: number; // RENAMED from classification_rank
  strength_label: string; // RENAMED from classification_label
  basis: string[];
  business_impact: string;
  recommended_action: string;
  section_confidence: EvidenceCompletenessSection;
}
Effective Strength Logic (replaces effective_classification)
The computeEffectiveClassification function becomes computeEffectiveStrength and operates only on the strength axis. The weakestClassification function becomes weakestStrength. Claim type does not participate in strength computation — it passes through unchanged.
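As a rough illustration of claim type passing through unchanged, the sketch below assumes that the effective strength is the weaker of evidence_strength and runtime_confidence when the latter is present. That rule is a guess based on the interface fields, not a description of the existing function, and `ClaimInput` and the signature shown are hypothetical.

```typescript
type ClaimType =
  | "execution_observed"
  | "execution_absent"
  | "permission_exists"
  | "capability_inferred";

const STRENGTH_ORDER = ["deterministic", "correlated", "structural", "inferred"] as const;
type EvidenceStrength = (typeof STRENGTH_ORDER)[number];

// Hypothetical input shape: the subset of EvidenceClaim fields this sketch needs.
interface ClaimInput {
  claim_type: ClaimType;
  evidence_strength: EvidenceStrength;
  runtime_confidence?: EvidenceStrength;
}

function weakestStrength(a: EvidenceStrength, b: EvidenceStrength): EvidenceStrength {
  return STRENGTH_ORDER.indexOf(a) >= STRENGTH_ORDER.indexOf(b) ? a : b;
}

// Assumed rule: downgrade to runtime_confidence only when it is present and weaker.
// claim_type is copied through and never compared.
function computeEffectiveStrength(claim: ClaimInput): {
  claim_type: ClaimType;
  effective_strength: EvidenceStrength;
} {
  const effective =
    claim.runtime_confidence === undefined
      ? claim.evidence_strength
      : weakestStrength(claim.evidence_strength, claim.runtime_confidence);
  return { claim_type: claim.claim_type, effective_strength: effective };
}
```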
Evaluator Rules
All 24 evaluator rule files call buildEvidenceClaim with a classification parameter. Each call must be updated to provide both claim_type and evidence_strength. Since every rule currently passes a single classification that deterministically maps to one (claim_type, strength) pair, this is a mechanical change.
UI Badges (PR #242)
The current badge system maps classification to a single color. Under the new model:
- Badge color maps to evidence_strength (Confirmed=green, Likely=blue, Configured=amber, Possible=gray)
- A separate icon or label maps to claim_type
Evidence Packs
section_confidence in evidence packs is unaffected. It measures data availability (which connector sections returned data), not claim strength. No changes needed.
API Responses
- evidence_classification field → evidence_strength
- New field: evidence_claim_type
- effective_classification → effective_strength
- classification_rank → strength_rank
- classification_label → strength_label
Migration Feasibility
Since each current classification value deterministically maps to exactly one (claim_type, evidence_strength) pair, migration can be done in a single pass:
- Compute the new fields from the old field for all stored documents.
- No ambiguity, no manual review needed.
- Backwards compatibility can be maintained by computing the old field from the new pair during the transition period.
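
Computing the old field from the new pair could look like the sketch below. The five entries are the mapping table inverted; `AXES_TO_LEGACY` and `toLegacyClassification` are hypothetical names. Note that the two-axis model allows pairs with no legacy equivalent (for example capability_inferred with structural evidence), and this document does not say how those should be reported during the transition, so the sketch returns undefined for them rather than guessing.

```typescript
type EvidenceClassification =
  | "observed_execution"
  | "observed_absence"
  | "correlated_pattern"
  | "structural_authority"
  | "inferred_capability";
type ClaimType =
  | "execution_observed"
  | "execution_absent"
  | "permission_exists"
  | "capability_inferred";
type EvidenceStrength = "deterministic" | "correlated" | "structural" | "inferred";

// Inverse of the migration mapping, keyed by "claim_type:evidence_strength".
const AXES_TO_LEGACY: Record<string, EvidenceClassification | undefined> = {
  "execution_observed:deterministic": "observed_execution",
  "execution_absent:deterministic": "observed_absence",
  "execution_observed:correlated": "correlated_pattern",
  "permission_exists:structural": "structural_authority",
  "capability_inferred:inferred": "inferred_capability",
};

// Returns the legacy value for migrated data, or undefined for pairs
// that only exist in the two-axis model.
function toLegacyClassification(
  claimType: ClaimType,
  strength: EvidenceStrength,
): EvidenceClassification | undefined {
  return AXES_TO_LEGACY[`${claimType}:${strength}`];
}
```

Every document produced by the single-pass migration round-trips cleanly; only claims authored natively in the new model can hit the undefined case.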
Migration Path
Phase 1: Add New Fields (Backwards Compatible)
- Add claim_type and evidence_strength fields to EvidenceClaim interface as optional fields
- Populate new fields alongside existing ones in buildEvidenceClaim
- API returns both old and new field names
- No breaking changes for consumers
Phase 2: Update Evaluator Rules
- Update all 24 evaluator rule files to explicitly provide claim_type and evidence_strength
- Update buildEvidenceClaim signature to require both new fields
- Rename computeEffectiveClassification to computeEffectiveStrength
Phase 3: Update UI
- Update badge rendering to use evidence_strength for color
- Add claim type icons
- Update user-facing labels to Confirmed / Likely / Configured / Possible
- Update any filtering or sorting logic to use new field names
Phase 4: Deprecate Old Fields
- Mark classification, effective_classification, classification_rank, and classification_label as deprecated in API
- Remove old fields from EvidenceClaim interface
- Remove backwards-compatibility mapping
- Clean up stored documents
Next Action
Status: adopted
Decisions made:
- Evidence model: Full two-axis separation (claim type + evidence strength). This is not a cosmetic rename: the 2D matrix and independent UI channels are required.
- User-facing labels: Confirmed / Likely / Configured / Possible. "Configured" chosen over "Structural", "Derived", and "Granted".
- Claim type display: Effective claim type as primary label, with inline count breakdown (e.g., "4 observed · 1 inferred").
- Timing: Implement before access chain UI work so the new model is the foundation, not a retrofit.
Implementation:
- Create GitHub issues in sv0-platform for each migration phase
- Phase 1 (backwards-compatible new fields) can start immediately
- Phases 2–4 follow sequentially per the migration path above