
Canonical Resource Identity for Path-Scoped Execution Evidence

Revision history

  • 2026-04-09 (v1) — initial draft, opened as sv0-documentation#165.
  • 2026-04-09 (v2) — revised after the first cross-review round. Three blocking findings (summarized in §12) are addressed throughout §5 (Proposed design), §6 (Migration), §7 (Test strategy), and §8 (Alternatives rejected). The scope estimate has grown materially. Changes are called out inline with **v2:** prefixes where relevant.
  • 2026-04-10 (v3) — added cross-reference callouts to §5.2 and §5.3 after Codex round 2 re-raised the same three findings. The v2 fixes were already present in §5.1 but readers focused on §5.2/§5.3 could miss them. No design changes; readability improvement only. Codex round 2 logged in §12.

Executive summary

SecurityV0/sv0-platform#302 proposed a small tactical fix — add a destinationResource filter to execution-evidence queries so the authority-path materializer can compute execution_30d per path instead of once per workload. An adversarial architecture review (2026-04-09) of the first-pass design revealed that the data model underneath the proposed filter is fundamentally insufficient for deterministic matching against AWS data. Three silent failures in the current code would have caused the originally-proposed filter to match zero records against any AWS scan, and seven resource-specific ARN normalization landmines would have produced quiet wrongness as AWS data eventually began flowing.

This document proposes a targeted data-model refactor that introduces a first-class canonical resource_key on both EntityDoc and ExecutionEvidenceDoc. The key is computed once at ingest, stored as an indexed field, and consumed by evaluator rules via a single equality check. It replaces the current "match target_resource string against resource_id hash or missing-on-AWS resource_name" logic that has been silently non-functional against AWS data since the AWS connector shipped. The refactor also includes a ~10-line connector-side fix for assumed-role session ARN parsing, which is load-bearing enough that the refactor alone cannot produce correct findings without it.

v2 update: a first cross-review round surfaced three additional blocking findings that expand the scope of this refactor beyond what v1 proposed. In particular:

  1. Path identity, not just the filter, has to move to the canonical key. Adding resource_key to the path payload does not change how AuthorityPathDoc._id or path_lineage_id is computed; both still hash the destination entity's Mongo _id. Two entities that share a canonical key (for example, ECS task definition revisions 17 and 18) therefore still materialize as two distinct path documents, and evidence counts duplicate across them. The v2 design rewrites path identity to hash resource_key instead.
  2. Evidence grain matters, and the current platform counts documents rather than events. execution-evidence-adapter.countExecutionEvidence literally calls countDocuments, so an aggregated CloudTrail evidence record (the shape sv0-connectors#31 currently proposes) counts as 1 regardless of how many underlying events it represents. The v2 design adds a new execution_count: number field on ExecutionEvidenceDoc and changes the materializer to sum that field instead of counting documents. This gives #306 a hard dependency on the CloudTrail extractor's grain choice in #31.
  3. Platform-only canonicalization is not future-proof for non-AWS connectors. Azure Foundry writes agent.name into target_resource; Entra/ServiceNow writes display names like "incident" or record numbers like "INC0012345". Neither can be canonicalized into a stable identifier without connector-side knowledge. The v2 design introduces a new optional target_resource_key field on the ExecutionEvidence NormalizedGraph node; connectors populate it explicitly, and the platform's helper only falls back to deriving from sourceId for ARN-shaped sources (AWS).

Scope is intentionally narrower than a full sv0-platform#300 cross-connector identity correlation: only the resource side and the minimum viable principal-side fix are included. The goal is to make #302 correct, not to solve all identity problems.

This document is written to be cross-reviewed by multiple models before any implementation begins. Sections marked "Open question" identify decisions that are explicitly unresolved.

1. Problem statement

1.1 What #302 set out to solve

Today, AuthorityPath.current_state.execution_30d is computed once per workload and copied to every authority path originating from that workload. From src/ingestion/authority-path-materializer.ts:57-116:

const currentState = await computeCurrentState(
  workload,
  tenantId,
  thirtyDaysAgo,
  storageAdapter,
);

for (const ep of executionPaths) {
  // ...
  pathsToUpsert.push({
    // ...
    current_state: currentState, // SAME for all paths from this workload
  });
}

A Lambda that executed 50 times in the last 30 days will have every one of its authority paths showing execution_30d = 50, including paths to destinations the Lambda has never actually touched. A Lambda that ran against DynamoDB daily but has s3:GetObject on a PII bucket it has never read shows identical execution counts on both paths. dormant_authority and unproven_execution cannot fire on the S3 path even though the Lambda has demonstrably never touched that bucket.

#302 wants to fix this by attributing each CloudTrail event to the specific destination it touched.
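
To make the difference concrete, here is a minimal Python sketch of per-workload versus per-path counting. It is not the platform's TypeScript; the event tuples, path names, and function names are hypothetical, and only the 50-execution Lambda scenario comes from the text above.

```python
from collections import Counter

# Hypothetical in-memory evidence: one (workload, destination) pair per event.
events = [("doc-ingest-lambda", "dynamodb-table")] * 50
paths = [
    ("doc-ingest-lambda", "dynamodb-table"),
    ("doc-ingest-lambda", "pii-bucket"),
]

def per_workload_counts(events, paths):
    """Current behaviour: one count per workload, copied to every path."""
    by_workload = Counter(workload for workload, _ in events)
    return {path: by_workload[path[0]] for path in paths}

def per_path_counts(events, paths):
    """Goal of #302: count only events that touched the path's destination."""
    by_pair = Counter(events)
    return {path: by_pair[path] for path in paths}

current = per_workload_counts(events, paths)
wanted = per_path_counts(events, paths)
```

Under the per-workload scheme both paths report the same count; under the per-path scheme the untouched bucket path drops to zero, which is the condition dormant_authority and unproven_execution need in order to fire.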

1.2 The first-pass design

The original design proposed on #302 was:

  1. Extend ExecutionEvidenceQuery with an optional destinationResource?: string filter.
  2. In computeCurrentState(), query evidence filtered by both entityId (the workload principal) and destinationResource (the path's target resource).
  3. Match target_resource against the destination entity's source_id with a fallback to source_name, mirroring what the existing privilege_justification_gap rule does at src/evaluator/rules/privilege-justification-gap.ts:48-50:
     return e.target_resource === path.resource_id || e.target_resource === path.resource_name;
  4. Rewrite dormant_authority and unproven_execution to read the newly path-scoped execution_30d from path.current_state instead of re-querying evidence directly.

The entire design rests on two assumptions:

  • Assumption A: the string equality target_resource === resource_id (or resource_name) is a valid precedent that already works in the codebase.
  • Assumption B: execution evidence records for AWS workloads actually carry a populated target_resource field that identifies the specific resource each CloudTrail event touched.

Both assumptions are false.

2. Three silent failures in the current code

2.1 Failure 1 — privilege_justification_gap is non-functional on AWS

The rule's matcher at src/evaluator/rules/privilege-justification-gap.ts:48-50:

function evidenceMatchesResource(e: ExecutionEvidenceDoc, path: ExecutionPath): boolean {
  return e.target_resource === path.resource_id || e.target_resource === path.resource_name;
}

Both branches of the OR fail for AWS data:

  • path.resource_id is assigned at src/ingestion/path-materializer.ts:158 to resource._id — a 24-character Mongo hex hash computed from sha256(tenantId:sourceSystem:sourceId).slice(0, 24) at src/ingestion/graph-transformer.ts:24-36. This is never an ARN. A CloudTrail event's target_resource field will never equal a hex hash.

  • path.resource_name is assigned at src/ingestion/path-materializer.ts:159 to resource.properties.resource_name ?? resource._id. When the resource_name property is absent on the resource node, this falls through to the same hash.

The AWS connector never populates resource_name on any resource node. Verification: a full grep of integrations/aws/src/sv0_aws/core/transformer.py for any property key equal to "resource_name" returns zero hits. Every AWS resource node lands in Mongo with properties.resource_name === undefined, so the second branch of the OR also falls through to the hash.

By contrast, the entra-servicenow and azure-foundry connectors do populate resource_name (integrations/entra-servicenow/src/entra_servicenow/core/transformer.py:855, 871, 988, 1163, 1866; integrations/azure-foundry/src/azure_foundry/core/transformer.py:498 and adjacent). This is why the rule appears to work in the reference implementation — but that's loose display-name matching ("incident", "Microsoft Graph", "DataInsight Analytics"), which is deterministic in the sense that two strings are compared, but semantically fuzzy in the sense that display names collide across tenants and are not guaranteed stable.

Consequence: privilege_justification_gap has been silently producing zero findings against every AWS scan since the AWS connector shipped. Nobody noticed because of Failure 2 below.
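
The mismatch can be reproduced in a few lines. The sketch below is a Python model of the matcher, not the platform code: it assumes the hash input is the three values joined by ":" (as the notation above suggests), and the tenant ID is made up.

```python
import hashlib

def entity_id(tenant_id: str, source_system: str, source_id: str) -> str:
    """Model of the described scheme: first 24 hex chars of a SHA-256 digest."""
    raw = f"{tenant_id}:{source_system}:{source_id}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()[:24]

def evidence_matches_resource(target_resource: str, resource_id: str, resource_name: str) -> bool:
    """Same shape as the rule's matcher: plain string equality on either field."""
    return target_resource == resource_id or target_resource == resource_name

arn = "arn:aws:s3:::nimbus-customer-data"
resource_id = entity_id("tenant-1", "aws", arn)  # hypothetical tenant ID
# AWS nodes carry no resource_name, so the name falls back to the same hash.
resource_name = resource_id
matched = evidence_matches_resource(arn, resource_id, resource_name)
```

Both comparison targets are 24 hex characters, so an ARN on the evidence side can never equal either of them.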

2.2 Failure 2 — CloudTrail target_resource is never populated for AWS

There are two compounding problems:

No CloudTrail data flows at all. In the AWS connector CLI, extracted_data["cloudtrail_evidence"] is initialized to an empty list at integrations/aws/src/sv0_aws/cli/main.py:146 and never written to. There is no cloudtrail_extractor.py in integrations/aws/src/sv0_aws/extractors/. CloudTrail extraction is tracked as SecurityV0/sv0-connectors#31 and has not been implemented. The --skip-cloudtrail flag referenced elsewhere in the CLI is effectively a no-op because there is no extractor to skip.

Even if CloudTrail data flowed, the transformer would discard the target. The existing _transform_cloudtrail_evidence method at integrations/aws/src/sv0_aws/core/transformer.py:1428 is written on the assumption that CloudTrail data will eventually arrive. But the aggregation in that method reduces events by (principalArn, eventName) and explicitly discards request_parameters and resources before emitting the execution_evidence node (approximately lines 1478-1489). The NormalizedGraph node emitted by this method has a properties bag that does not include target_resource at all.

The platform side confirms the gap. src/ingestion/graph-transformer.ts:227 passes through whatever the node's properties.target_resource contains:

target_resource: (node.properties.target_resource as string) ?? "",

With no AWS connector ever writing that property, the field is unconditionally the empty string on every AWS-sourced execution_evidence record. The _id hash, the entity_id FK, and the top-level source_system are the only populated identifiers.

Consequence: the first-pass design of #302 — "add a destinationResource filter to the evidence query" — filters against a field that is empty for every AWS record. The filter matches zero evidence rows regardless of what arguments are passed to it.

2.3 Failure 3 — Assumed-role principals don't resolve

Even if the target side were fixed, the source side has an independent gap. The AWS connector's helper _get_identity_node_id_from_arn (integrations/aws/src/sv0_aws/core/transformer.py:1768-1780) only handles two ARN shapes:

if ":role/" in arn:
    # arn:aws:iam::acct:role/name → aws_iam_role:acct:name
    ...
elif ":user/" in arn:
    # arn:aws:iam::acct:user/name → aws_iam_user:acct:name
    ...
return None

CloudTrail does not record most principals in those two shapes. It records them as STS assumed-role session ARNs:

arn:aws:sts::<account>:assumed-role/<RoleName>/<session-name>

This shape is produced for:

  • Every Lambda invocation (the execution role is assumed by lambda.amazonaws.com)
  • Every ECS task (task role and execution role are both assumed by ecs-tasks.amazonaws.com)
  • Every EC2 instance using an instance profile
  • Every Step Functions state transition that calls into another AWS service
  • Every Bedrock agent action group call
  • Every role-chaining flow via sts:AssumeRole
  • Every federated web identity session after the initial AssumeRoleWith* call

In practice this is 80–90% of CloudTrail events against a modern AWS workload. The _get_identity_node_id_from_arn helper returns None for every one of them because the ARN's service segment is sts, not iam, and neither :role/ nor :user/ appears in the resource segment.

When the helper returns None, the EVIDENCES edge from the execution_evidence node to the principal entity is never constructed. The graph-transformer.ts:186-205 lookup that normally populates ExecutionEvidenceDoc.entity_id from an incoming EVIDENCES edge falls through to the empty string. Evidence records land in Mongo with entity_id: "". They cannot be retrieved by any of the existing ctx.getExecutionEvidence(entity._id, …) queries because no entity has _id === "".

Consequence: even after Failures 1 and 2 are fixed, the majority of AWS CloudTrail events would still land in the database orphaned from the workload that performed the action. A getExecutionEvidence({entityId: workload._id}) call would return zero rows regardless of how many CloudTrail events actually mentioned that workload's role. The dormant_authority rule would fire on every Lambda in the lab because, as far as the evaluator can tell, no Lambda has any execution evidence.
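
A minimal sketch of what the connector-side fix could look like follows. It is an assumption about the shape of the change, not the actual patch: the function name and regular expressions are illustrative, and only the node-ID formats (aws_iam_role:acct:name, aws_iam_user:acct:name) are taken from the helper's comments. Assumed-role session ARNs do not carry the role's IAM path, which lines up with the existing helper dropping the path from IAM ARNs.

```python
import re
from typing import Optional

# arn:aws:sts::<account>:assumed-role/<RoleName>/<session-name>
_ASSUMED_ROLE = re.compile(r"^arn:[^:]+:sts::(\d+):assumed-role/([^/]+)/.+$")
# arn:aws:iam::<account>:role/<path/>name  or  ...:user/<path/>name
_IAM_PRINCIPAL = re.compile(r"^arn:[^:]+:iam::(\d+):(role|user)/(.+)$")

def identity_node_id_from_arn(arn: str) -> Optional[str]:
    """Resolve a CloudTrail principal ARN to the connector's identity node ID."""
    m = _ASSUMED_ROLE.match(arn)
    if m:
        account, role_name = m.groups()
        # The session name is discarded: every session maps to its role.
        return f"aws_iam_role:{account}:{role_name}"
    m = _IAM_PRINCIPAL.match(arn)
    if m:
        account, kind, path_and_name = m.groups()
        name = path_and_name.split("/")[-1]  # drop the IAM path, as today
        return f"aws_iam_{kind}:{account}:{name}"
    return None  # unknown shape: caller must treat the evidence as unresolved
```

With this shape, a Lambda session ARN and the execution role's own IAM ARN resolve to the same node ID, so the EVIDENCES edge can be built and entity_id is no longer empty.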

2.4 Why these failures are compounding

Each failure, considered individually, looks like a small gap. Together they form a dependency chain:

CloudTrail extractor ships (#31)
└→ writes target_resource on evidence nodes
   └→ principal resolution handles assumed-role ARNs
      └→ entity_id is populated
         └→ evidence records are queryable by workload
            └→ target_resource filter selects per-destination
               └→ evaluator rules fire correctly per-path

Each step is necessary; no single step is sufficient. A fix that addresses only the materializer side leaves the target unresolved. A fix that addresses only the target side leaves principals orphaned. A fix that addresses both sides at the string level inherits the seven resource-specific landmines in §3. The only design that satisfies all three preconditions simultaneously is a deterministic canonical identity that spans both the resource and principal surfaces.

3. Resource-specific landmines

The first-pass design assumed that an ARN string produced by the AWS connector at resource-scan time would equal the ARN string CloudTrail records at event time. It assumed wrong for seven specific cases, each of which is a real shape Nimbus Cloud or Nimbus Enterprise will produce in the demo lab.

3.1 Lambda version qualifiers

Lambda source_id is set by the connector to the unqualified function ARN: arn:aws:lambda:us-east-1:111:function:doc-ingest (integrations/aws/src/sv0_aws/core/transformer.py:535). CloudTrail events for a versioned or aliased invocation record the qualified form: arn:aws:lambda:us-east-1:111:function:doc-ingest:prod or ...:function:doc-ingest:42.

Exact-match fails. The unqualified ARN is semantically correct for attribution (we want all invocations of the function, regardless of alias) but the strings don't compare equal.

3.2 Secrets Manager suffix drift

Secret ARNs from AWS include a trailing 6-character random suffix: arn:aws:secretsmanager:us-east-1:111:secret:doc-ingest/openai-AbC123. The connector stores the full suffixed ARN as source_id (integrations/aws/src/sv0_aws/core/transformer.py:780). CloudTrail GetSecretValue records the same suffixed ARN.

This case happens to work under exact-match today — but the connector's internal node-ID resolver at _resolve_resource_node_id (line 1815) actively strips the suffix when generating an internal lookup key. The suffix is simultaneously "kept" (in source_id) and "stripped" (in the internal nodeId). Any future ingestion-time normalization attempt must commit to one choice; otherwise an inconsistency gets introduced between the resource side and the evidence side and the equality check starts failing silently.

3.3 ECS task definition revision recreation

ECS task definitions are keyed by family:revision, and the ARN includes the revision: arn:aws:ecs:us-east-1:111:task-definition/nimbus-data-pipeline:17. A code deploy that changes the container image creates revision 18. The new revision is a distinct ECS resource; AWS does not mutate revision 17 in place.

The connector emits a new entity node for revision 18 (distinct source_id, distinct entity _id). CloudTrail RunTask records the revision the task was launched with. Evidence collected under revision 18 is orphaned from any query that targets revision 17, and revision 18's workload has no execution_paths materialized yet on the first sync after deploy.

From an attribution standpoint, we want "this task definition family has been executed" — not "this specific revision number." Revision-level granularity is a surface accident, not a business invariant. The canonical key must collapse revisions onto the family.

3.4 ECR image digests

ECR repository source_id is the repository ARN with no image identifier: arn:aws:ecr:us-east-1:111:repository/nimbus-data-pipeline. CloudTrail BatchGetImage and GetDownloadUrlForLayer events record the repository in the resources array and the specific image digest in requestParameters.imageIds[].imageDigest. The target ARN in the CloudTrail payload is the repository ARN — the digest is a separate field.

If an extractor naively concatenates "repository ARN + image digest" when populating target_resource, the resulting string will never match any entity's source_id. If the extractor correctly uses just the repository ARN, exact-match works — but this is a design-time decision that the current non-existent extractor will need to get right.

3.5 S3 object-level access

arn:aws:s3:::bucket is the only ARN form S3 has for a bucket. Object-level access in CloudTrail (GetObject, PutObject) is recorded in requestParameters.bucketName + requestParameters.key, and the object ARN form (arn:aws:s3:::bucket/key) never equals the bucket entity's source_id. Any extractor that wants to attribute evidence to the bucket node must explicitly drop key from the target and use the bucket ARN alone.

The design question: does SecurityV0 model per-object access at all? For the demo narratives the answer is "no — the bucket is the unit of sensitivity," which means the extractor must be written to deliberately drop the object key. Without the canonical-key abstraction, this is a per-extractor decision that can drift over time.

3.6 Cross-account resources

A Nimbus workload in account A writes to a bucket in a partner account B. The CloudTrail trail is in account A. The target bucket is arn:aws:s3:::partner-export-B. The SecurityV0 connector was only scanning account A, so no entity exists in the graph for the partner bucket.

Under exact-match, the evidence record still has a populated target_resource field, but the materializer's filter (entityId + destinationResource) finds no matching path because no path was ever created to a resource we don't know about. The evidence becomes silently orphaned. This is acceptable behavior (we cannot attribute to a node that doesn't exist) — but only if we explicitly track it as "orphaned against an unmodeled target" rather than letting it disappear into a zero-count.

With canonical resource_key, orphaned-against-unmodeled becomes an observable metric: count of evidence records whose resource_key does not match any entity in the tenant. That's a useful health signal for the demo lab and for customers during onboarding.
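
As an illustration only, the metric could be computed along these lines. The document shapes are simplified dicts and the function name is hypothetical; the keys are assembled from the five segments defined in §5.2.

```python
from typing import Iterable

def orphaned_evidence_count(evidence_docs: Iterable[dict], entity_docs: Iterable[dict]) -> int:
    """Count evidence whose resource_key matches no entity in the same tenant."""
    known = {
        (e["tenant_id"], e["resource_key"])
        for e in entity_docs
        if e.get("resource_key")
    }
    # Evidence with no resource_key at all is a different failure and is not counted here.
    return sum(
        1
        for ev in evidence_docs
        if ev.get("resource_key") and (ev["tenant_id"], ev["resource_key"]) not in known
    )

# Keys assembled from <provider>:<service>:<account>:<region>:<local-id>.
modeled = ":".join(["aws", "dynamodb", "087380083467", "us-east-2", "nimbus-support-tickets"])
partner = ":".join(["aws", "s3", "", "", "partner-export-B"])

entities = [{"tenant_id": "t1", "resource_key": modeled}]
evidence = [
    {"tenant_id": "t1", "resource_key": modeled},
    {"tenant_id": "t1", "resource_key": partner},  # bucket in the unscanned account
    {"tenant_id": "t1"},                           # no key: excluded from this metric
]
orphans = orphaned_evidence_count(evidence, entities)
```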

3.7 IAM role paths

IAM roles can have a path: arn:aws:iam::111:role/service-role/MyRole. The connector keeps the full ARN (including the path) as source_id at transformer.py:287, but its internal nodeId helper at line 1774 drops the path (arn.split("/")[-1]). The same asymmetric treatment as the Secrets Manager suffix — kept in one place, stripped in another.

CloudTrail records userIdentity.arn with the path. Exact-match against source_id works. Exact-match against the nodeId doesn't. Any future code that accidentally queries against the wrong side of this pair will silently fail.

4. Why the simpler options fail

Before proposing the refactor, this section documents the two simpler alternatives and explains why each is insufficient.

4.1 Query-time canonicalization

"Normalize the ARN on both sides of the comparison inside the evaluator rule, at the moment the rule runs." Pros: smallest diff, no schema change. Cons:

  • Each rule that reads evidence must implement the same normalization, or normalization drifts across rules. Currently four call sites touch evidence (privilege_justification_gap, authority-path-materializer.computeCurrentState, dormant_authority, unproven_execution), and future rules will add more. Every new rule author must re-learn and re-implement the normalization.
  • Each AWS service has its own normalization rules (Lambda version suffix, Secrets Manager suffix, ECS revision, ECR digest, S3 object key, IAM path). That's a service-lookup table scattered across evaluator files.
  • The project's determinism rule says two queries written by two authors must produce the same result. Scattered normalization gives us no way to enforce that beyond discipline.
  • Query-time normalization also cannot help with the principal-side failure. A CloudTrail event whose entity_id = "" is not reachable by any query regardless of how the target is normalized.

Rejected. Not deterministic in the multi-author sense. Does not solve the principal-side problem.

4.2 Ingest-time string canonicalization without a typed field

"Normalize target_resource at ingest inside graph-transformer.ts, and normalize source_id the same way at ingest. Queries compare raw strings that are now guaranteed to be in canonical form." Pros: normalization lives in one place (ingest); queries become straight equality. Cons:

  • Still reuses target_resource and source_id as the comparison keys. Those fields are already named and typed for their original purposes (the ARN as the source system saw it, the ARN as CloudTrail recorded it). Overloading them for canonical comparison conflates two concerns.
  • Once source_id is canonicalized at ingest, the original ARN is lost. Debugging "where did this entity come from in AWS" becomes harder because the ARN you see in Mongo is now a canonicalized derivative.
  • Does not give us an index. Mongo cannot index a derived string unless it is materialized on the document. Without an index, the per-path evidence query goes from O(matching rows) to O(all rows for workload) on every materialization — which on a large tenant with thousands of events per workload becomes a bottleneck.
  • Still does not address the principal side.

Rejected. Keeps the concern conflation and the index gap.

5. Proposed design: first-class canonical resource_key

5.1 Schema changes

v2 update: this section grew materially in response to cross-review finding #1 (path identity re-keying). What v1 described as "add an indexed field" is actually a change to path identity, path lineage, and the merge key in the path materializer.

5.1.1 New fields

Add four new fields across three domain types:

  • EntityDoc.resource_key?: string — populated on entities whose nodeType is resource-class (resource, workload, or credential when the credential is a stand-in for a resource).
  • ExecutionEvidenceDoc.resource_key?: string — populated at ingest from the connector-supplied target_resource_key (see §5.3) with properties.target_resource as a fallback source for AWS only.
  • ExecutionEvidenceDoc.execution_count: number (new in v2). The number of underlying events this document represents. Defaults to 1 for connectors that emit per-event evidence records (e.g., the current Entra/ServiceNow sign-in connector). Set to >1 by connectors that aggregate multiple events into a single document (e.g., sv0-connectors#31's proposed CloudTrail aggregation by (principal, eventName)). The materializer and the dormancy rules sum this field rather than counting documents.
  • ExecutionPath.resource_key?: string — copied from the destination entity's resource_key at path materialization time.

Add compound indexes:

  • entities: (tenant_id, resource_key), which serves "find me the resource entity for this canonical key" queries from the index instead of a collection scan.
  • execution_evidence: (tenant_id, resource_key), which serves "find me all evidence that targeted this canonical key" queries from the index, including the per-path filter the materializer needs.

Existing indexes stay. source_id is not removed and stays the authoritative source-system identifier on entity records.

5.1.2 Path identity and lineage — v2 addition

src/ingestion/authority-path-materializer.ts:164-194 currently computes path identity and lineage by hashing (tenantId, workloadId, identityId, destinationId) where destinationId is the Mongo _id of the destination entity (assigned at line 90: destination_id: destinationId = ep.resource_id, which in turn is resource._id from path-materializer.ts:158). Two distinct entities — for example ECS task definition revision 17 and revision 18 — have distinct Mongo _ids, so they produce distinct AuthorityPathDoc records even if they share a canonical resource_key. Evidence collected after a deploy (which always runs against the new revision) would attribute to the new path's row but the old path's row would persist with a zero count until it is pruned by sync-version drift, producing duplicate dormancy findings on every deploy.

The v2 design fixes this by re-keying path identity on resource_key instead of destinationId:

// v1 (current) — hashes the destination entity's Mongo _id
buildAuthorityPathId(tenantId, workloadId, identityId, destinationId)
buildPathLineageId(tenantId, workloadId, destinationId)

// v2 — hashes the canonical resource key
buildAuthorityPathId(tenantId, workloadId, identityId, resourceKey)
buildPathLineageId(tenantId, workloadId, resourceKey)

Concretely, two ExecutionPath entries with different resource_id but the same resource_key now collapse into a single AuthorityPathDoc. The merge in mergePathsToSameResource at path-materializer.ts:260 also changes its key from ${path.resource_id}::${path.via_identity ?? ""} to ${path.resource_key}::${path.via_identity ?? ""}.

destination_id is kept as a field on AuthorityPathDoc for backward-compat with anything that dereferences it into the entity collection, but its semantics change: when multiple entities share a resource_key, destination_id points to the first-seen entity (stable) and a new destination_ids: string[] field holds the full set. Rules that need to enumerate every concrete entity behind a canonical identity iterate the destination_ids array; rules that want a representative identity use destination_id.

Required null-safety rule: if an ExecutionPath has no resource_key (see §5.3 for when this happens), the materializer falls back to the v1 identity scheme using the entity's _id. This keeps non-AWS connectors working during their migration window at the cost of still producing duplicate paths for any source that hasn't been taught to emit canonical keys. Once every active connector emits resource_key, the fallback can be removed.
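
The re-keying and the fallback can be modelled in a short Python sketch. This is not the materializer's TypeScript: the _digest helper is a hypothetical stand-in for whatever hashing buildAuthorityPathId uses, and the path dicts are simplified. What the sketch shows is only which inputs feed the identity and how destination_id and destination_ids behave when two entities share a key.

```python
import hashlib
from typing import Dict, List, Optional

def _digest(*parts: str) -> str:
    # Hypothetical stand-in for the platform's hash; only the inputs matter here.
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()[:24]

def build_authority_path_id(tenant_id: str, workload_id: str, identity_id: str,
                            resource_key: Optional[str], resource_id: str) -> str:
    # Null-safety rule: without a canonical key, fall back to the entity _id (v1 scheme).
    tail = ("key", resource_key) if resource_key else ("id", resource_id)
    return _digest(tenant_id, workload_id, identity_id, *tail)

def materialize(tenant_id: str, paths: List[dict]) -> List[dict]:
    merged: Dict[str, dict] = {}
    for p in paths:
        path_id = build_authority_path_id(
            tenant_id, p["workload_id"], p["identity_id"],
            p.get("resource_key"), p["resource_id"],
        )
        doc = merged.setdefault(path_id, {
            "_id": path_id,
            "destination_id": p["resource_id"],  # first-seen entity stays stable
            "destination_ids": [],
        })
        if p["resource_id"] not in doc["destination_ids"]:
            doc["destination_ids"].append(p["resource_id"])
    return list(merged.values())

KEY = "aws:ecs:087380083467:us-east-2:task-definition/nimbus-data-pipeline"
keyed = materialize("t1", [
    {"workload_id": "w1", "identity_id": "i1", "resource_id": "rev17", "resource_key": KEY},
    {"workload_id": "w1", "identity_id": "i1", "resource_id": "rev18", "resource_key": KEY},
])
unkeyed = materialize("t1", [
    {"workload_id": "w1", "identity_id": "i1", "resource_id": "rev17"},
    {"workload_id": "w1", "identity_id": "i1", "resource_id": "rev18"},
])
```

With a shared key the two revisions collapse into one path document whose destination_ids lists both entities; without a key the fallback keeps them as two documents, which is the duplicate-path cost the null-safety rule accepts during migration.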

5.1.3 Query surface additions

In addition to extending ExecutionEvidenceQuery with destinationResourceKey?: string, the storage adapter gains a new method:

sumExecutionEvidenceCount(
  tenantId: string,
  query: Omit<ExecutionEvidenceQuery, "limit" | "afterTs" | "afterId">
): Promise<number>

Implementation uses a Mongo $group + $sum: "$execution_count" aggregation. The existing countExecutionEvidence is kept for callers that genuinely want "how many records" (e.g., health metrics), but the materializer and the dormancy / unproven rules call the new sum-based variant.
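
The difference between the two query methods can be sketched as follows. This is a Python model under stated assumptions: documents are plain dicts, the timestamp field name in the pipeline is a guess, and treating a missing execution_count as 1 (via $ifNull) is one possible choice for records written before the field existed.

```python
from typing import Any, Dict, List

def count_documents(docs: List[dict]) -> int:
    """What countExecutionEvidence measures today: number of records."""
    return len(docs)

def sum_execution_count(docs: List[dict]) -> int:
    """What the materializer needs: number of underlying events."""
    # Records without the field are treated as one event each (assumption).
    return sum(int(d.get("execution_count", 1)) for d in docs)

def sum_pipeline(tenant_id: str, entity_id: str, resource_key: str, since: Any) -> List[Dict[str, Any]]:
    """Aggregation pipeline shape for the same sum, expressed as plain data."""
    return [
        {"$match": {
            "tenant_id": tenant_id,
            "entity_id": entity_id,
            "resource_key": resource_key,
            "timestamp": {"$gte": since},  # field name is an assumption
        }},
        {"$group": {
            "_id": None,
            "total": {"$sum": {"$ifNull": ["$execution_count", 1]}},
        }},
    ]

docs = [{"execution_count": 37}, {"execution_count": 12}, {}]
n_docs = count_documents(docs)
n_events = sum_execution_count(docs)
```

Three aggregated records here stand for 50 events, so a document count would under-report execution_30d by more than an order of magnitude.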

Open question (v2): should countExecutionEvidence be deprecated entirely? Keeping two query methods invites confusion. Proposal: mark it @deprecated with a doc comment pointing at sumExecutionEvidenceCount; remove after one sprint of no new usages.

5.2 Canonical key format

Cross-reference: this section defines the key format. Path identity re-keying (how paths collapse across entities that share a key) is in §5.1.2. Evidence grain (how execution_count is summed rather than document-counted) is in §5.1.3. Both are load-bearing for the proposal — the format alone does not fix the path-duplication or evidence-grain problems.

A typed fingerprint derived from (source_system, source_id, properties):

<provider>:<service>:<account>:<region>:<local-id>
  • provider: aws, azure, entra, servicenow, github, etc. One per connector.
  • service: the AWS service name (s3, lambda, dynamodb, secretsmanager, ecr, ecs, iam, bedrock) or its equivalent in other providers (foundry, graph, table, oauth-app, …).
  • account: the AWS account ID for AWS, the Entra tenant ID for Entra, the ServiceNow instance name for ServiceNow. Always present; empty string (never null) when the provider or the resource's identifier has no account (S3 bucket ARNs, for example).
  • region: AWS region for regional services. Empty string for services whose ARNs carry no region (IAM, S3, CloudFront) and for non-AWS providers that aren't regional.
  • local-id: the service-specific canonical identifier.

Per-AWS-service local-id rules:

Service                      | local-id                 | Normalization
-----------------------------|--------------------------|-----------------------------------------------
S3 bucket                    | bucket name              | lowercased; object path stripped
Lambda function              | function name            | unqualified; alias and version suffix removed
DynamoDB table               | table name               | as-is
Secrets Manager secret       | secret name              | -AbCdEf random suffix removed
SSM parameter                | parameter path           | leading / normalized
ECR repository               | repository name          | image digest excluded
ECS cluster                  | cluster name             | as-is
ECS service                  | <cluster>/<service>      | both names as-is
ECS task definition          | <family>                 | revision number excluded
IAM role                     | role name                | IAM path stripped
IAM user                     | user name                | IAM path stripped
IAM policy                   | policy name              | path stripped
Bedrock agent                | agent ID                 | opaque; as-is
Bedrock KB                   | knowledge base ID        | opaque; as-is
Bedrock flow                 | flow ID                  | opaque; as-is
SNS topic                    | topic name               | as-is
Step Functions state machine | state machine name       | as-is
EventBridge rule             | <event-bus>/<rule-name>  | both names as-is
CloudTrail trail             | trail name               | as-is

Examples for Nimbus Cloud:

aws:s3:::nimbus-customer-data
aws:s3:::nimbus-support-kb
aws:lambda:087380083467:us-east-2:nimbus-customer-data-export
aws:dynamodb:087380083467:us-east-2:nimbus-support-tickets
aws:secretsmanager:087380083467:us-east-2:nimbus/integrations/analytics-vendor-api
aws:ecs:087380083467:us-east-2:task-definition/nimbus-data-pipeline
aws:ecs:087380083467:us-east-2:nimbus-data-processing/pipeline-svc
aws:ecr:087380083467:us-east-2:nimbus-data-pipeline
aws:iam:087380083467::nimbus-data-pipeline-task-role

Key shape invariants:

  • Always five segments, so exactly four separating colons precede the local-id. Empty segments (global services, providers without an account/region concept) are represented as empty strings between colons. This makes parsing trivial and avoids the ambiguity of variable-length keys.
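  • Case-sensitive. Many AWS resource names are case-sensitive (DynamoDB table names, for example), so lowercasing could make two distinct resources such as MyTable and mytable collide. The only exception is a service whose rule in the table above says otherwise (S3 bucket names).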
  • Local-id may contain / and :. ECS task definitions legitimately contain both. Any parser splitting on : must use maxsplit=4 (for a Python split) or equivalent, so that the fifth segment keeps any remaining colons.
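
The rules and invariants above can be sketched for a handful of AWS services as follows. This is an illustrative Python helper, not the platform's or the connector's code: the function names are hypothetical, only six services are covered, and the task-definition/ prefix on ECS keys follows the Nimbus example rather than the bare <family> in the table. Because a key has five segments, the parser splits at most four times so that any ":" inside the local-id survives.

```python
import re
from typing import List, Optional

_SECRET_SUFFIX = re.compile(r"-[A-Za-z0-9]{6}$")  # trailing random suffix on secret ARNs

def canonical_key_from_arn(arn: str) -> Optional[str]:
    """Build <provider>:<service>:<account>:<region>:<local-id> from an AWS ARN."""
    parts = arn.split(":", 5)  # arn, partition, service, region, account, resource
    if len(parts) != 6 or parts[0] != "arn":
        return None
    _, _, service, region, account, resource = parts
    if service == "s3":
        local_id = resource.split("/", 1)[0].lower()   # drop the object key
    elif service == "lambda" and resource.startswith("function:"):
        local_id = resource.split(":")[1]               # drop alias or version
    elif service == "secretsmanager" and resource.startswith("secret:"):
        local_id = _SECRET_SUFFIX.sub("", resource[len("secret:"):])
    elif service == "ecs" and resource.startswith("task-definition/"):
        local_id = resource.rsplit(":", 1)[0]           # drop the revision
    elif service == "ecr" and resource.startswith("repository/"):
        local_id = resource[len("repository/"):]        # repository only, no digest
    elif service == "iam" and resource.startswith("role/"):
        local_id = resource.rsplit("/", 1)[-1]          # drop the IAM path
    else:
        return None  # unknown shape: leave resource_key unset rather than guess
    return ":".join(["aws", service, account, region, local_id])

def parse_key(key: str) -> List[str]:
    """Split a canonical key into its five segments."""
    segments = key.split(":", 4)  # four splits; the local-id keeps any further ':'
    if len(segments) != 5:
        raise ValueError(f"not a five-segment resource key: {key!r}")
    return segments

rev17 = canonical_key_from_arn("arn:aws:ecs:us-east-2:087380083467:task-definition/nimbus-data-pipeline:17")
rev18 = canonical_key_from_arn("arn:aws:ecs:us-east-2:087380083467:task-definition/nimbus-data-pipeline:18")
```

Both revisions yield the same key, which is what lets path identity and evidence attribution collapse onto the task definition family. Returning None for unrecognized shapes is a deliberate choice in this sketch: a missing key is observable, while a wrongly guessed key fails silently.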

5.3 Where normalization lives

Cross-reference: this section describes who populates the canonical key. The schema changes (new fields, indexes, execution_count) are in §5.1.1. Path identity re-keying is in §5.1.2. The sumExecutionEvidenceCount query surface is in §5.1.3. All three sections are required reading — this section alone does not convey the full scope of the v2 design.

v2 update: this section was significantly weakened in response to cross-review finding #3. The v1 claim that "connectors emit raw ARNs; the platform canonicalizes" was only true for AWS. Non-AWS connectors currently write display names into target_resource and cannot be canonicalized by the platform alone.

5.3.1 Current-state reality check

Verified against the actual connector code:

  • Azure Foundry: integrations/azure-foundry/src/azure_foundry/core/transformer.py:498 writes agent.name into target_resource. A Foundry agent name is not instance-scoped — the same display name in two different projects collides. The platform cannot derive a stable key from agent.name alone.
  • Entra / ServiceNow: integrations/entra-servicenow/src/entra_servicenow/core/transformer.py:1993-2190 writes resource_display_name, flow_name, job_name, table (e.g., "incident"), and record_number (e.g., "INC0012345") into target_resource. Most of these are ambiguous across tenants, and some of them (the table names) are literally the same string for every ServiceNow instance in the world.
  • AWS: integrations/aws/src/sv0_aws/core/transformer.py writes resource ARNs into sourceId (e.g., arn:aws:s3:::nimbus-customer-data) and currently writes nothing into target_resource because CloudTrail extraction (sv0-connectors#31) has not shipped. When it does ship, the proposed shape is aggregated-by-(principalArn, eventName) with request_parameters / resources discarded — so even AWS will need connector-side work to populate a canonicalizable target.

Only AWS sourceId is canonicalizable without connector-side knowledge today, and only because boto3's API returns full ARNs. Every other connector will silently produce resource_key: null under a platform-only helper, which means evaluator rules would produce zero findings against those sources.

5.3.2 Hybrid contract: target_resource_key at the connector-platform boundary

The v2 proposal splits the responsibility:

Connector side (for every connector, including AWS):

Add a new optional field target_resource_key: string to the ExecutionEvidence NormalizedNode in the NormalizedGraph Zod schema. Connectors populate this field using their own knowledge of the source system's stable identifier conventions. Examples:

  • AWS CloudTrail: canonicalize userIdentity.arn and the extracted target ARN per the rules in §5.2, emitting aws:s3::::nimbus-customer-data etc.
  • Entra sign-in: combine tenant ID + app ID + resource identifier URI into entra:<tenant>::<app-id>:<resource-uri>.
  • ServiceNow flow: servicenow:<instance>::<table>/<sys_id> — stable per-record identity instead of display name.
  • Azure Foundry: azure:foundry:<subscription>:<region>:<workspace>/<agent-id> — opaque agent ID, not display name.

Connectors that don't know the canonical form leave the field empty; their evidence remains unmatchable until the connector-side code catches up.
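
As a rough illustration of the "small helper" each connector would carry, the sketch below joins five segments into a key and rejects colons anywhere except the local-id. The function name build_resource_key and the Azure values are hypothetical; only the segment order and the example shapes come from this document.

```python
def build_resource_key(provider: str, service: str, account: str,
                       region: str, local_id: str) -> str:
    """Join five segments into a canonical key; only local_id may contain ':'."""
    head = [provider, service, account, region]
    for segment in head:
        if ":" in segment:
            raise ValueError(f"segment {segment!r} must not contain ':'")
    if not provider or not local_id:
        raise ValueError("provider and local_id are required")
    # Case is preserved as given; empty account/region stay as empty strings.
    return ":".join(head + [local_id])


# Azure Foundry key following the shape quoted above (all values made up).
foundry_key = build_resource_key(
    "azure", "foundry", "sub-0001", "eastus", "ws-support/agent-7f3a"
)
```

A connector that cannot fill the segments from stable source-system identifiers would skip this helper and leave target_resource_key empty, as described above.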

Platform side:

src/shared/resource-key.ts becomes a validator + fallback deriver, not a full canonicalizer:

  1. If the connector supplied target_resource_key, use it as-is. The platform optionally runs a sanity check that it matches the expected five-segment shape; a malformed key logs a warning and falls through to the fallback in step 2.
  2. If the connector did not supply target_resource_key, the platform tries to derive one from the resource entity's sourceId. This fallback only works for sources whose sourceId is already in a recognizable canonical shape (AWS ARNs). For other sources it returns null.
  3. If both the connector and the fallback return nothing, resource_key is null on the evidence record. Rules that consume resource_key produce no findings on null-key records (deliberate strict contract — see §5.5).
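
The three steps can be sketched as a single resolution function. This is a Python illustration only (the platform helper lives in TypeScript), and the names resolve_resource_key, has_key_shape, and stub_fallback are hypothetical. The fallback is injected as a callable because the §5.2 derivation rules are not reproduced here, and the role ARN in the stub is an assumed source for the IAM example key.

```python
import logging
from typing import Callable, Optional

log = logging.getLogger("resource_key")


def has_key_shape(key: str) -> bool:
    """True when the key has five segments with a non-empty provider and local-id."""
    parts = key.split(":", 4)
    return len(parts) == 5 and bool(parts[0]) and bool(parts[4])


def resolve_resource_key(
    connector_key: Optional[str],
    source_id: Optional[str],
    derive_from_source_id: Callable[[str], Optional[str]],
) -> Optional[str]:
    # Step 1: a well-formed connector-supplied key is used as-is.
    if connector_key:
        if has_key_shape(connector_key):
            return connector_key
        log.warning("malformed target_resource_key %r, trying fallback", connector_key)
    # Step 2: fallback derivation, which only succeeds for recognizable sourceIds.
    if source_id:
        derived = derive_from_source_id(source_id)
        if derived is not None and has_key_shape(derived):
            return derived
    # Step 3: nothing usable, so the evidence record carries a null key.
    return None


def stub_fallback(source_id: str) -> Optional[str]:
    """Stand-in for the per-service derivation rules, which this sketch omits."""
    known = {
        "arn:aws:iam::087380083467:role/nimbus-data-pipeline-task-role":
            "aws:iam:087380083467::nimbus-data-pipeline-task-role",
    }
    return known.get(source_id)
```

Under this sketch a display name such as "incident" fails the shape check in step 1 and, with no ARN-shaped sourceId behind it, ends at step 3 with a null key.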

Rationale for the split:

  • Single canonical-key spec still exists in this research doc (§5.2) — it is the contract between connectors and platform. Every connector must produce keys in the form specified here.
  • Per-connector interpretation lives in the connector because only the connector knows which source-system fields map to the canonical segments. The platform cannot guess that table=incident should become servicenow:nimbus-dev::incident/INC0012345 without understanding the connector's data model.
  • Determinism is preserved because the key format is centrally specified and tested against vectors; only the population is connector-side.
  • Cross-language coordination is bounded. The spec is in documentation (consumable by both TypeScript and Python). Each connector implements its own small helper that produces keys matching the spec. The platform runs a format validator against vectors from the spec.

Open question (v2): should the canonical-key format validator reject malformed keys at ingest (hard fail) or only warn and set resource_key: null (soft fail)? Hard fail catches connector bugs immediately; soft fail keeps the pipeline flowing during connector migration windows. Proposal: soft fail for now with structured warnings, hard fail behind a feature flag after two sprints of clean operation.

5.3.3 Per-connector migration plan

Because the non-AWS connectors emit display names today, each one needs a small migration to populate target_resource_key. These are tracked as separate issues, not bundled into #306:

| Connector | Today | Migration | Issue |
|---|---|---|---|
| AWS | Writes ARNs into sourceId; target_resource currently empty on execution evidence | Populate target_resource_key from CloudTrail event fields per §5.2. Depends on sv0-connectors#31 shipping CloudTrail extraction. | Folded into sv0-connectors#31 |
| Azure Foundry | Writes agent.name | Use <subscription>:<workspace>/<agent-id> with the agent's opaque ID, not name | New issue, filed after #306 lands |
| Entra sign-in | Writes resource_display_name | Emit <tenant>::<app-id>:<resource-uri> using the app's appId and the resource's stable URI | New issue |
| ServiceNow flow | Writes flow_name | Emit <instance>::flow/<sys_id> with the flow's sys_id (stable) instead of display name | New issue |
| ServiceNow job | Writes job_name | Same pattern: use the job's sys_id | New issue |
| ServiceNow record access | Writes table + record_number | Emit <instance>::<table>/<sys_id> | New issue |

Until each migration lands, that connector's evidence remains unmatched and its rules produce no findings through the resource_key path. For AWS this is the same as the pre-refactor state (per §2.1, "zero actionable findings"). For the other connectors the pre-refactor "loose display-name matches" disappear, so their finding counts drop (see §6.1). The refactor does not introduce new wrongness on non-AWS sources; it just removes the fuzzy display-name fallback that made them look like they worked.

Platform ships first, connector migrations follow. The platform-side work in #306 can land before any connector migration because the null-key path is already specified and the rules fail safely.

5.4 The principal-side fix

Independent of the resource-key refactor, the AWS connector's _get_identity_node_id_from_arn must learn to parse assumed-role session ARNs. This is filed as a separate issue (SecurityV0/sv0-connectors#39) because it is small and self-contained, but it is load-bearing enough that the resource_key refactor alone cannot produce correct findings without it.

The fix is approximately 10 lines of Python, adding a third branch:

elif ":assumed-role/" in arn:
    parts = arn.split(":")
    account_id = parts[4]
    resource = ":".join(parts[5:])  # "assumed-role/<RoleName>/<session-name>"
    role_segment = resource[len("assumed-role/"):]
    role_name = role_segment.split("/", 1)[0]  # drop session
    return f"aws_iam_role:{account_id}:{role_name}"

Determinism note: the session name is intentionally discarded. Sessions are per-invocation; keeping them would produce a distinct "identity" per CloudTrail event. Attribution must flow to the role, not the session.
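
To make the determinism property concrete, here is a standalone sketch of the same parsing as a function, assuming the standard STS shape arn:aws:sts::<account>:assumed-role/<RoleName>/<session-name>. The function name role_node_id_from_session_arn and the session names are hypothetical; the real change is a branch inside _get_identity_node_id_from_arn.

```python
from typing import Optional


def role_node_id_from_session_arn(arn: str) -> Optional[str]:
    """Return the IAM role node ID for an STS assumed-role ARN, else None."""
    parts = arn.split(":", 5)  # arn, partition, service, region, account, resource
    if len(parts) != 6 or not parts[5].startswith("assumed-role/"):
        return None  # not an assumed-role ARN; other branches would handle it
    account_id = parts[4]
    segments = parts[5].split("/")  # ["assumed-role", "<RoleName>", "<session-name>"]
    if len(segments) < 2 or not segments[1]:
        return None
    # The session name (segments[2]) is deliberately ignored.
    return f"aws_iam_role:{account_id}:{segments[1]}"


first = role_node_id_from_session_arn(
    "arn:aws:sts::087380083467:assumed-role/nimbus-data-pipeline-task-role/session-a"
)
second = role_node_id_from_session_arn(
    "arn:aws:sts::087380083467:assumed-role/nimbus-data-pipeline-task-role/session-b"
)
```

Both calls resolve to the same role node, which is the behavior the note above asks for: two sessions of one role must not appear as two identities.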

Out-of-scope: federated identities (federated-user/...), SAML, OIDC post-initial-assumption sessions, AWS account root, service-owned principals (lambda.amazonaws.com). Those require the broader sv0-platform#300 cross-connector identity correlation work. The branch above is the minimum to unblock #302 and #306 against Nimbus lab data.

5.5 Rule rewrites

v2 update: rewrite 1 expands to cover path identity re-keying and the switch from counting documents to summing execution_count. Rewrites 2-4 are unchanged from v1.

Four evaluator rules read execution evidence. All four need updating:

  1. authority-path-materializer.computeCurrentState (src/ingestion/authority-path-materializer.ts:57-116, 209-268) — the bulk of the refactor. Changes:

    • Refactor to accept a path parameter; call once per path inside the loop.
    • Batch-load all evidence for the workload once per iteration, then filter in memory by resource_key per path.
    • Sum execution_count, don't count documents (v2). The current countExecutionEvidence call is replaced with sumExecutionEvidenceCount, which performs a Mongo $group + $sum: "$execution_count" aggregation. For connectors that emit per-event evidence (every current non-CloudTrail source), execution_count = 1 and the result is identical to the old document count. For connectors that emit aggregated evidence (e.g., the sv0-connectors#31 CloudTrail plan), the sum captures the real event volume.
    • Path identity must be rebuilt with the canonical key (v2). buildAuthorityPathId and buildPathLineageId now hash resource_key instead of destinationId. See §5.1.2 for the full rationale. This is the load-bearing fix for cross-review finding #1.
  2. privilege_justification_gap (src/evaluator/rules/privilege-justification-gap.ts:48-50) — replace the broken resource_id || resource_name match with e.resource_key === path.resource_key. The display-name fallback is removed entirely.

  3. dormant_authority (src/evaluator/rules/dormant-authority.ts:10-34) — rewrite to read path.current_state.last_execution_at directly. The rule becomes a pure function over already-materialized state.

  4. unproven_execution (src/evaluator/rules/unproven-execution.ts:29-30) — same treatment as dormant_authority. Read path.current_state.execution_30d; fire when zero. (Note: the "zero" semantics now depend on the summed execution_count, not document count.)

After the rewrite, rules 3 and 4 no longer call ctx.getExecutionEvidence(...) at all. The materializer is the single authoritative source of path-level attribution; rules consume the materialized state.
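
A compact Python sketch of the per-path computation and of a rule reading the materialized state. It is an illustration under assumptions: the Evidence and PathState shapes, the function names, the 30-day window cut-off, and the choice to take last_execution_at over all matching evidence are mine, not taken from authority-path-materializer.ts.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class Evidence:
    resource_key: Optional[str]
    source_timestamp: datetime
    execution_count: int = 1  # per-event evidence counts as one execution


@dataclass
class PathState:
    execution_30d: int
    last_execution_at: Optional[datetime]


def compute_path_state(workload_evidence: List[Evidence],
                       path_resource_key: Optional[str],
                       now: datetime) -> PathState:
    """Per-path state from evidence already batch-loaded for the workload."""
    if path_resource_key is None:
        return PathState(0, None)  # strict contract: null keys never match
    matching = [e for e in workload_evidence if e.resource_key == path_resource_key]
    window_start = now - timedelta(days=30)
    # Sum execution_count instead of counting documents.
    total = sum(e.execution_count for e in matching if e.source_timestamp >= window_start)
    last = max((e.source_timestamp for e in matching), default=None)
    return PathState(total, last)


def unproven_execution_fires(state: PathState) -> bool:
    """Pure function over materialized state; no evidence query needed."""
    return state.execution_30d == 0


now = datetime(2026, 4, 9, tzinfo=timezone.utc)
tickets = "aws:dynamodb:087380083467:us-east-2:nimbus-support-tickets"
evidence = [
    Evidence(tickets, now - timedelta(days=1), execution_count=50),  # aggregated
    Evidence(tickets, now - timedelta(hours=2)),                     # per-event
]
active = compute_path_state(evidence, tickets, now)
dormant = compute_path_state(evidence, "aws:ecr:087380083467:us-east-2:nimbus-data-pipeline", now)
```

The evidence list is loaded once per workload and filtered in memory per path, so the two paths here get different states from the same batch.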

6. Migration

v2 update: the migration plan has to deal with three distinct changes, not one:

  1. New resource_key field on entities and evidence (v1 scope)
  2. New execution_count field on evidence + path identity re-keying (v2 additions)
  3. Per-connector population of target_resource_key in the NormalizedGraph (v2 addition)

6.1 Nimbus-scale migration

Nimbus Cloud is currently the only non-test tenant. All execution_evidence and authority_paths records are recomputed by re-running the connector and re-submitting the graph. No migration script is needed for the fresh data that arrives via the next sync:

  • resource_key and execution_count are populated during ingestion on every fresh evidence record. execution_count defaults to 1 for records that don't carry it, so pre-v2 records (including any legacy Entra/ServiceNow evidence that might exist) keep working.
  • Path identity re-keying (v2, finding #1) is the sharp-edged part. Pre-refactor AuthorityPathDoc documents have _id hashes computed from the old destinationId (Mongo _id) rather than the new resource_key. On the first sync after the refactor ships, the materializer will:
    1. Compute new path IDs for every current ExecutionPath. These new IDs will not match any existing AuthorityPathDoc._id.
    2. Upsert the new documents as fresh (no first_seen_at history carried forward — see acceptance note below).
    3. Flag the orphaned pre-refactor documents as status: removed_by_migration in a one-time cleanup step (not a general-purpose migration script, just a specific cleanup for path_lineage_id values that don't match any current ExecutionPath).
  • Non-AWS connectors emitting target_resource_key is a per-connector migration; see §5.3.3. Until each connector ships its migration, its execution evidence will land with resource_key: null and its rules will produce no findings through the resource_key code path. Users of those connectors will see a drop in finding count compared to the pre-refactor display-name matching — the drop is correct (the old findings were ungrounded in stable identity), but it's a visible behavior change that has to be communicated.
  • privilege_justification_gap becomes strict after the rewrite. Null-key records produce zero findings. Pre-refactor records that persisted with the old matching logic get a one-time cleanup: findings tagged as having been fired against null-key paths are marked status: auto_retired so they don't clutter the active finding list.
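
The path re-keying step can be pictured with a small sketch. The hash construction below is a hypothetical stand-in for buildAuthorityPathId (its real inputs are not reproduced here), and the Mongo _id strings are made up. What the sketch shows is only the set logic: IDs hashed from resource_key cannot equal IDs hashed from the old destination reference, so every pre-refactor ID becomes a cleanup candidate.

```python
import hashlib
from typing import Iterable, Set


def path_id(workload_id: str, destination_ref: str) -> str:
    """Hypothetical path ID: a hash over the workload and a destination reference."""
    return hashlib.sha256(f"{workload_id}\x00{destination_ref}".encode()).hexdigest()


def orphaned_path_ids(existing_ids: Iterable[str], current_ids: Iterable[str]) -> Set[str]:
    """IDs stored before the refactor that no current ExecutionPath reproduces."""
    return set(existing_ids) - set(current_ids)


# Before: one document per destination entity (two task-definition revisions).
pre_refactor = {path_id("wl-1", "mongo-id-rev17"), path_id("wl-1", "mongo-id-rev18")}
# After: both revisions share one canonical key, so one document.
post_refactor = {
    path_id("wl-1", "aws:ecs:087380083467:us-east-2:task-definition/nimbus-data-pipeline")
}
to_flag = orphaned_path_ids(pre_refactor, post_refactor)  # status: removed_by_migration
```

In this toy case both pre-refactor documents are flagged and a single fresh document replaces them, which is also why first_seen_at does not carry over.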

6.2 Acceptance

The first Nimbus re-sync after the refactor ships should reproduce the existing baseline (104 nodes / 113 edges / 15 authority paths / 91 active findings, or whatever the current post-apply numbers are when the refactor lands) minus any findings that were being produced by the silent display-name fallback in non-AWS evidence paths. The exact expected delta depends on how many Entra/ServiceNow findings were previously firing via that fallback — I don't have a number for this because the pre-refactor baseline is dominated by AWS-sourced findings (Nimbus is AWS-only) and the Entra/ServiceNow connectors aren't currently feeding the Nimbus tenant.

Zero data loss on the entity side. Entity records are not re-keyed by this refactor; only the path materialization and evidence queries change. Existing ingested entities keep their Mongo _id values and existing UI deep-links to those IDs continue to work.

Path history semantics: the one cost of path identity re-keying is that first_seen_at resets for any path whose identity changed. For Nimbus this is acceptable because the demo lab's path history has been noisy anyway. For future tenants with long path histories, an explicit rehydration step is needed that reconciles pre-refactor lineage IDs with post-refactor ones by looking up the destination entity's resource_key. That step is not in scope for this refactor — it's deferred to a follow-up ticket to be filed only when a tenant with history-sensitive lineage actually needs it.

6.3 Ordering

The changes ship in four steps, in the following order, each in its own PR (step 4 is one PR per connector):

  1. Schema additions (new fields + indexes + sumExecutionEvidenceCount query method). No behavior change yet. Safe to ship against current data.
  2. Canonical-key helper + graph-transformer.ts population (resource_key derived from sourceId for AWS; null for others). No rule changes yet. Safe against current data.
  3. Path identity re-keying + rule rewrites. This is the behavior-change step; the one that requires the cleanup of pre-refactor path documents. Ship with a feature flag if possible.
  4. Per-connector target_resource_key migrations (sv0-connectors). Each connector gets its own PR; rules for that connector's evidence start firing once its migration lands.

Steps 1-3 land in sv0-platform#306; step 4 is a per-connector workstream.

7. Test strategy

Canonical-key helper (src/shared/resource-key.ts): 100% line coverage via test vectors. Every row of the §5.2 table has at least one vector; the seven resource-specific landmines from §3 each have at least one vector that pins down the intended behavior. Cross-reviewers can contribute additional vectors by writing them into the test file and running the suite.

Ingestion (graph-transformer.ts): unit tests against representative AWS and non-AWS node shapes. Each test asserts that the expected resource_key is written to the entity/evidence record.

Materializer (authority-path-materializer.ts): the four scenarios from the first-pass #302 design, two new ones, plus three v2 additions driven by cross-review findings:

  1. Split dormancy. Workload with two paths; evidence targets only one destination. Path A is active, path B is dormant. Assert per-path execution_30d matches reality.
  2. Split proven. Workload with two paths; both destinations have evidence at different counts. Assert per-path counts differ.
  3. Zero match. Workload has evidence, but no evidence matches any of its paths. All paths show execution_30d = 0.
  4. Backwards compatibility. Workload with one path (the simple pre-refactor case). Assert behavior is unchanged from the old per-workload count for this trivial shape.
  5. Cross-account orphan. Evidence targets a resource whose account wasn't scanned; path-level count is zero but the evidence record still exists and is countable via a new "orphaned against unmodeled target" metric.
  6. Revision churn. ECS task def revision 17 has 30 execution records; the task def is updated to revision 18, which has 3 records. Assert both revisions collapse into the same resource_key and execution_30d reflects the combined 33.

v2 additions:

  1. Path identity collapses across revisions (cross-review finding #1). Same fixture as scenario 6, but the assertion is explicitly about AuthorityPathDoc count. Before the refactor, a scan with revisions 17 and 18 present produces two AuthorityPathDoc rows (one per destinationId). After the refactor, the same scan produces one row whose _id is the canonical-key hash. The test queries Mongo directly and asserts authorityPaths.countDocuments({workload_id, resource_key}) === 1. This test would fail against the v1 design and proves the v2 finding #1 fix.
  2. Aggregated evidence sums correctly (cross-review finding #2). Seed two execution_evidence records for the same workload + same resource_key: one with execution_count = 50 (simulating sv0-connectors#31 aggregation), one with execution_count = 1 (per-event from a different connector). Assert the materialized path.current_state.execution_30d === 51 and path.current_state.last_execution_at is the newer of the two source_timestamp values. This test would fail against any document-counting implementation.
  3. Null resource_key produces no findings (cross-review finding #3 + strict contract). Seed an execution_evidence record that has target_resource populated with a display name (e.g., "incident") and no target_resource_key. The platform-side helper cannot derive a key from this (it's not ARN-shaped), so resource_key is null. Assert that no rule fires against this evidence: dormant_authority, unproven_execution, and privilege_justification_gap all see zero matches. This is the deliberate strict contract — null-key evidence is unmatchable until the connector migrates.
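
One detail worth pinning down in the null-key test: a bare equality check treats two nulls as equal (null === null in TypeScript, None == None in Python). If a path's own resource_key can ever be null, the match therefore needs an explicit guard to honor the strict contract. A minimal sketch of such a predicate follows; keys_match is an illustrative name, not an existing platform function.

```python
from typing import Optional


def keys_match(evidence_key: Optional[str], path_key: Optional[str]) -> bool:
    """Equality on canonical keys where a null on either side never matches."""
    if evidence_key is None or path_key is None:
        return False
    return evidence_key == path_key  # case-sensitive, exact comparison
```

Whether path.resource_key can actually be null depends on how the materializer treats destinations without a canonical key, which this sketch does not assume either way.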

Three new focused storage-adapter tests covering the destinationResourceKey filter and the new sumExecutionEvidenceCount method — one that filters positively with one aggregated record (execution_count = 42) and asserts the sum returns 42, one that filters with no matches and asserts the sum returns 0, and one that combines records with different execution_count values from multiple connectors and asserts the correct sum.
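
For orientation, the aggregation behind sumExecutionEvidenceCount might look like the pipeline below, written as plain Python dictionaries in the form a Mongo driver accepts. The function names and the exact match fields are assumptions. The $ifNull default for records lacking execution_count is also an assumption, added so pre-v2 documents count as 1 even if the field was never written.

```python
from datetime import datetime
from typing import Any, Dict, List


def build_sum_pipeline(tenant_id: str, workload_id: str,
                       resource_key: str, since: datetime) -> List[Dict[str, Any]]:
    """Aggregation stages that sum execution_count for one workload and destination."""
    return [
        {"$match": {
            "tenant_id": tenant_id,
            "workload_id": workload_id,
            "resource_key": resource_key,
            "source_timestamp": {"$gte": since},
        }},
        {"$group": {
            "_id": None,
            # Treat a missing execution_count as 1 (assumed default).
            "total": {"$sum": {"$ifNull": ["$execution_count", 1]}},
        }},
    ]


def total_from_result(docs: List[Dict[str, Any]]) -> int:
    """$group emits no document when $match selects nothing, so default to 0."""
    return docs[0]["total"] if docs else 0
```

The empty-result case is the one the "no matches" storage-adapter test exercises: the aggregation returns an empty cursor, not a document with total 0.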

Rules (privilege_justification_gap, dormant_authority, unproven_execution): fixture-based tests where the same workload produces different findings on different paths. The critical new coverage is "two paths from one workload, different findings" — no existing test covers this.

Full evaluator suite: the 215 existing tests must pass before and after. Any that accidentally depend on the smeared (per-workload) behavior get rewritten, with the reason documented in the test.

End-to-end against Nimbus lab: after the refactor ships, re-scan Nimbus Cloud and re-submit to local platform. Expected deltas are documented in the #306 issue acceptance criteria.

8. Alternatives explicitly rejected

  • (A) Proceed with original #302 design. Rejected because of the three silent failures in §2.
  • (B) Query-time canonicalization. Rejected in §4.1.
  • (C) Ingest-time canonicalization without a typed field. Rejected in §4.2.
  • (D) Fold into sv0-platform#300 cross-connector identity correlation. The review considered this. Rejected as overscoped: the resource-side refactor proposed here stands on its own, and the minimum assumed-role principal fix (§5.4) is a 10-line connector patch that doesn't need the full #300 machinery. Bundling them would make the PR unreviewable and tie this fix to a separate, larger problem.
  • (E) Per-service tagged union instead of a single string key. Considered: instead of resource_key: string, use a structured resource_key: {provider, service, account, region, localId} object. Rejected because the five-segment colon-delimited string is strictly simpler for indexing (Mongo indexes string fields trivially; indexing a nested document requires an index-path-per-field) and for grep-based debugging. The parsing burden on the occasional consumer that needs to decompose the key is lower than the schema-complexity tax of a nested object.
  • (F) Content-addressed hash key. Considered: resource_key = sha256(provider:service:account:region:localId) for opacity. Rejected because SecurityV0's observability story depends on humans being able to grep resource_key:aws:s3::::nimbus-customer-data in logs. Opacity is a cost without a corresponding security benefit.
  • (G) Add resource_key as a filter-only field and leave path identity untouched. Added in v2 in response to cross-review finding #1. The original v1 draft effectively proposed this without stating it — resource_key was on the path payload but buildAuthorityPathId and buildPathLineageId still hashed the destination entity's Mongo _id. Rejected because two entities sharing a canonical key (ECS task definition revisions 17 and 18, a Lambda function across alias promotions, a DynamoDB table after a rename-rebuild pattern) still produce two distinct AuthorityPathDoc rows, so evidence collected under one identity does not attribute to the other and findings duplicate across every revision churn. Path identity has to move to the canonical key for the "stop smearing evidence" goal to actually hold.
  • (H) Change path identity but keep countExecutionEvidence as the count primitive. Added in v2 in response to cross-review finding #2. Proposes fixing path identity but keeping Mongo's countDocuments as the underlying count operation. Rejected because sv0-connectors#31's current CloudTrail plan aggregates events by (principal, eventName) into a single document per cluster. Under countDocuments, a Lambda that ran 1,000 times against one destination still counts as execution_30d = 1, which is indistinguishable from "ran once." The execution_count field and the sumExecutionEvidenceCount query method are the minimum fix; any implementation of #306 that doesn't include them just renames the bug.
  • (I) Platform-only canonicalization, connectors stay untouched. Added in v2 in response to cross-review finding #3. Proposes keeping the helper on the platform side only and deriving resource_key from entity sourceId for every source. Rejected because only AWS sourceId values are ARN-shaped today. Azure Foundry writes agent.name, Entra/ServiceNow writes table names and record numbers — neither can be canonicalized by the platform alone without connector-side knowledge of the source system's stable identifier. The v2 design introduces a connector-side target_resource_key field so each connector populates its own canonical key, with the platform doing fallback derivation only for ARN-shaped sources. Non-AWS connectors get explicit migration tickets per §5.3.3.

9. Risks and mitigations

v2 update: three new risk rows added for the path-identity, evidence-grain, and connector-contract changes.

| Risk | Mitigation |
|---|---|
| Canonical-key rules drift as new AWS services are added | Spec lives in this doc (adopted); PRs adding new services must update the spec and add a test vector; CI fails on missing test vectors |
| A resource has two legitimately-different identities that canonicalize to the same key (false collision) | The local-id rules are designed to avoid this (ECS task def family collapses revisions, which is intentional). Any future service whose identity includes more than name must document the invariant and add regression vectors. |
| Connector emits sourceId in a form the helper doesn't recognize (silent null key) | Helper returns null for unrecognized shapes; a platform-side log warning fires at ingest; evaluator rules produce no findings on null-key entities (deliberate strict contract). |
| Principal side still has gaps for federated/SAML identities after the assumed-role fix | Acknowledged; tracked as sv0-platform#300. Not a regression from current behavior; the current code also fails on these. |
| Nimbus demo findings count changes in a way that surprises a live demo | Acceptance criteria include a before/after baseline capture; any change larger than expected is flagged and investigated before the refactor merges. |
| Mongo index builds are expensive on large collections | Nimbus-only dataset is small; index build is instant. At customer scale this would need to be a rolling index build, not a blocker for ship. |
| (v2) Path identity re-keying resets first_seen_at on existing paths | First sync after the refactor will produce new AuthorityPathDoc._id values that don't match pre-refactor rows. Acknowledged as acceptable for Nimbus (demo lab, noisy history already). At customer scale this needs a rehydration script that reconciles old and new lineage IDs by looking up the destination entity's resource_key; that script is deferred until a history-sensitive tenant actually needs it. |
| (v2) Aggregated-evidence connectors produce wrong counts if they forget to set execution_count | execution_count defaults to 1 in the Zod schema. Connectors that aggregate but don't set execution_count silently undercount their events. Mitigation: the CloudTrail extractor PR (sv0-connectors#31) is blocked from merging until it populates execution_count. A schema validator in ingestion can optionally warn when execution_count === 1 on evidence records from a connector that's on a known-aggregating allowlist (future work). |
| (v2) Non-AWS connectors ship to customers without the target_resource_key migration | Per §5.3.3, Azure Foundry, Entra, and ServiceNow connectors all emit display names today. Until each one ships its migration, its execution evidence lands with resource_key: null and its rules produce zero findings through the canonical path. Mitigation: each migration is tracked as a separate issue; a pre-GA checklist gates customer rollout on every shipped connector having its migration. The platform's null-key contract ensures this is a quiet "no findings yet" rather than a crash. |

10. Open questions (cross-review welcome)

Q1. Should resource_key be non-nullable for resource-class entities? If every AWS resource has a canonical form, there's no reason to allow null. A non-nullable field would force the canonicalization helper to handle every service. The concern: early-phase adoption may need a grace period while new services are added to the helper. Proposal: ship as nullable, enforce non-null after two sprint cycles.

Q2. Should the canonical key include a version/schema field? E.g., aws/v1:s3::::.... This would let future format changes coexist during migration. The concern: YAGNI vs forward-compatibility. Proposal: skip the version prefix for now, and require a new top-level field (resource_key_version) if we ever need to change the format.

Q3. How do we handle resources that legitimately share a canonical key across tenants? A public S3 bucket (e.g., arn:aws:s3:::public-artifact-bucket) has no account segment (aws:s3::::public-artifact-bucket). Two tenants that both reference this bucket would collide. Proposal: tenant is always the outer scope (all queries are (tenant_id, resource_key)), so tenant-level uniqueness is enforced at the index level even if the key itself is globally shared.

Q4. What's the canonical form for aggregation nodes? If we ever model "the set of all S3 buckets in account X" as a synthetic node (for coarse findings), what's its canonical key? Proposal: aggregation nodes are not addressable by resource_key at all; they're a different concept and should have a distinct identifier (aggregate_key?) to prevent accidental matching.

Q5. Does the helper need to be idempotent under re-ingestion? Yes — but this deserves a test vector. The same input must always produce the same key, even across connector restarts, platform upgrades, and different instance hardware. Proposal: the helper is a pure function of its arguments; no Date.now(), no random suffix, no dictionary ordering dependence.

Q6. Are there existing evaluator rules that rely on resource_name being populated that we'd break? The review grepped the four rules that touch evidence. No other rule was checked. An exhaustive resource_name grep of the evaluator directory should be the first step before the refactor lands; any surprises become additional scope.

Q7. How should the canonical key handle case-sensitivity differences between AWS APIs? S3 bucket names are lowercase-only. IAM role names are case-preserving but case-sensitive in ARNs (MyRole ≠ myrole). DynamoDB table names are case-preserving. The current proposal is "preserve case from the source." Concern: if two connectors emit the same logical resource with different casing, their keys diverge and evidence silently fails to match. Proposal: stay case-preserving (matches AWS semantics) and rely on the AWS connector to emit canonical-case names, which it already does by reading them from the AWS APIs.

Q8 (v2, from cross-review finding #1). When path identity re-keys onto resource_key, what happens to the relationship between AuthorityPathDoc.destination_id (a Mongo _id) and the underlying entity documents when multiple entities share a canonical key? Two options: (a) destination_id becomes the first-seen entity's _id with a new destination_ids: string[] field holding the full set; (b) destination_id is removed in favor of resource_key as the only destination reference, and any rule that wants a concrete entity walks the entities collection on demand. Option (a) is less disruptive but adds a field; option (b) is cleaner but breaks any caller that currently dereferences destination_id directly. Proposal: (a) for this refactor, (b) as a follow-up once there are no remaining destination_id dereferences.

Q9 (v2, from cross-review finding #2). Should execution_count be required (non-nullable) or optional-with-default-1 on ExecutionEvidenceDoc? Required makes the connector contract stricter and catches migration bugs immediately; optional is more lenient during the transition. Proposal: optional with default 1 at the Zod schema level, but the materializer treats any record where execution_count is absent and the connector is on a known-aggregating allowlist as a hard error. The allowlist is maintained in config, not code.
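
The Q9 proposal could look roughly like the sketch below. The record shape, the connector field name, the exception type, and the allowlist entry are assumptions for illustration; only execution_count and the allowlist idea come from this document.

```python
from typing import Any, Mapping, Set


class MissingExecutionCount(ValueError):
    """Raised when an aggregating connector omits execution_count."""


def effective_execution_count(record: Mapping[str, Any],
                              aggregating_connectors: Set[str]) -> int:
    """Default to 1 for per-event sources; refuse to guess for aggregating ones."""
    count = record.get("execution_count")
    if count is not None:
        return count
    connector = record.get("connector")  # hypothetical field naming the source connector
    if connector in aggregating_connectors:
        raise MissingExecutionCount(
            f"{connector} aggregates events but sent no execution_count"
        )
    return 1


ALLOWLIST = {"aws-cloudtrail"}  # would be loaded from config, not hard-coded
per_event = effective_execution_count({"connector": "entra"}, ALLOWLIST)
aggregated = effective_execution_count(
    {"connector": "aws-cloudtrail", "execution_count": 1000}, ALLOWLIST
)
```

Note that this check only works if absence is still observable at the materializer: a default of 1 applied at Zod parse time would erase the distinction between "absent" and "explicitly 1" before the check runs.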

Q10 (v2, from cross-review finding #2). sv0-connectors#31 hasn't shipped yet. How should this document's adoption interact with #31's design? Proposal: #306 adds the execution_count field and sumExecutionEvidenceCount API unconditionally (no behavior change while execution_count defaults to 1). #31's CloudTrail extractor must populate execution_count with the real cluster size when it aggregates, and the #31 PR reviewer should block merge if it doesn't. The two PRs don't need to land in a specific order but #31 is incorrect until #306 provides the field to populate.

Q11 (v2, from cross-review finding #3). The Azure Foundry and Entra/ServiceNow target_resource_key migrations are filed as separate issues per §5.3.3. Should any of them be blocking dependencies for #306, or can #306 ship without them? Proposal: #306 is not gated on connector migrations. The platform side is correct on its own because of the null-key strict contract. Connector migrations ship on their own timelines and "turn on" rule firing for each connector as they land. Downside: the post-#306 Nimbus baseline may show fewer findings than pre-#306 on non-AWS sources (removal of display-name fuzzy matches). Upside: no single PR has to include changes across every connector repo.

11. Cross-review mandate

This document exists to be torn apart before any code lands. Specific things cross-reviewers should look for:

  • Have I misread any file:line citation? Every claim in §2 is grounded in a specific line number. If any line has been quoted out of context or misattributed, that's a significant finding.
  • Are there silent failures I haven't found? The first-round review uncovered three, which were folded into the v2 design. Are there four? Five? The v2 scope is larger than v1, so more surface to get wrong.
  • Is the canonical key format right? The per-service table in §5.2 is the most likely place for a subtle mistake. Each row is a decision; each decision has downstream consequences. Does any row break an existing flow?
  • Does the migration story hold? §6 now covers three separate migration surfaces (schema additions, path identity re-keying, per-connector target_resource_key). Each has its own failure mode. Is the ordering in §6.3 actually safe?
  • Is the v2 evidence-count story right? I'm assuming execution_count can be added to ExecutionEvidenceDoc as an optional field with default 1 and that sumExecutionEvidenceCount via Mongo $group performs acceptably. Both claims are unvalidated against a realistic data volume.
  • Is the scope right? Too small? Should #300 be folded in? Too big? Should the path identity re-keying be deferred to a separate PR from the schema additions?
  • Are the open questions (§10) the right open questions? Is there a Q12 I've missed?

The goal of cross-review is not consensus. It is to surface the highest-confidence criticism of each section so the author (me, and whoever implements) can adjust before committing to a direction.

12. Cross-review log

This section records critiques raised during adversarial review and how each was addressed.

Round 1 — 2026-04-09

Reviewer: first cross-review pass (anonymized for the log).

Finding #1 (P1) — Collapsing revisions in resource_key still leaves duplicate authority paths.

This adds resource_key to the path payload, but it does not change path identity. The live platform still merges paths by resource_id and builds authority-path IDs and lineage from destination_id, so ECS revisions 17 and 18 will still materialize as two distinct paths even if they share one canonical key. That means the same execution_30d can be stamped onto both revisions and findings will duplicate after each deploy. The proposal needs to re-key path identity on canonical destination identity, or explicitly keep revision-scoped paths instead of claiming family-scoped attribution.

Citations verified:

  • src/ingestion/path-materializer.ts:260 — mergePathsToSameResource keys by ${path.resource_id}::${path.via_identity ?? ""}, confirmed.
  • src/ingestion/authority-path-materializer.ts:69-90 — buildAuthorityPathId, buildPathLineageId, and the AuthorityPathDoc.destination_id field all use ep.resource_id (the destination entity's Mongo _id), confirmed.
  • src/ingestion/authority-path-materializer.ts:164-194 — both ID builders hash destinationId rather than any canonical form, confirmed.

Finding accepted as a P1 blocker. The v1 draft conflated "evidence can be attributed to the canonical key" with "path identity collapses across entities sharing the canonical key." They are separate changes and both are required. The v2 design addresses this in:

  • §5.1.2 — explicit re-keying of path identity via buildAuthorityPathId and buildPathLineageId
  • §5.5 rewrite #1 — the materializer change is now explicit about path identity
  • §6.1 — migration plan now handles the one-time cleanup of pre-refactor path documents
  • §7 scenario 7 — new test that asserts authorityPaths.countDocuments({workload_id, resource_key}) === 1 for a revision-churn fixture
  • §8 alternative (G) — the "filter-only" design is now explicitly rejected with reasoning that directly references this finding
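
The effect of changing the merge key can be sketched with a small in-memory example. This is an assumption-laden illustration, not the code in path-materializer.ts: `CandidatePath`, `mergeKey`, and `mergePaths` are hypothetical names, the key values are placeholders, and the merge here simply keeps the first path per key, whereas the real merge presumably combines more state. The only detail taken from the citations above is the `${...}::${via_identity ?? ""}` key shape.

```typescript
// Hypothetical path shape; only the fields relevant to merging are shown.
interface CandidatePath {
  resource_id: string;   // per-entity ID, different for every ECS revision
  resource_key: string;  // canonical key shared across revisions (placeholder values)
  via_identity?: string;
}

type KeyField = "resource_id" | "resource_key";

// Same key shape as the cited merge key, with the identity field made selectable.
function mergeKey(path: CandidatePath, field: KeyField): string {
  return `${path[field]}::${path.via_identity ?? ""}`;
}

// Simplified merge: keep the first path seen for each merge key.
function mergePaths(paths: CandidatePath[], field: KeyField): CandidatePath[] {
  const merged = new Map<string, CandidatePath>();
  for (const path of paths) {
    const key = mergeKey(path, field);
    if (!merged.has(key)) {
      merged.set(key, path);
    }
  }
  return Array.from(merged.values());
}

// Revision-churn fixture: two revisions of the same task family.
const revisionChurn: CandidatePath[] = [
  { resource_id: "entity-rev-17", resource_key: "family:web-task", via_identity: "role-a" },
  { resource_id: "entity-rev-18", resource_key: "family:web-task", via_identity: "role-a" },
];

const byResourceId = mergePaths(revisionChurn, "resource_id");
const byResourceKey = mergePaths(revisionChurn, "resource_key");

console.log(byResourceId.length);  // 2 (one path per revision, the duplicate-findings case)
console.log(byResourceKey.length); // 1 (revisions collapse onto one canonical path)
```

The second result is the in-memory analogue of the §7 scenario 7 assertion that exactly one path exists per workload and resource_key.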

Finding #2 (P1) — Matching by resource_key does not fix the evidence-grain bug.

This rewrite assumes evidence can be batch-loaded once and then counted per path by resource_key, but the linked CloudTrail plan still aggregates by principal + eventName and the platform counts execution-evidence documents rather than an executionCount field. Even after target matching is fixed, execution_30d will stay numerically wrong unless the design also changes evidence grain or teaches the platform to sum a numeric count. Please make that dependency explicit here, because the current proposal and test plan overstate what #306 can guarantee.

Citations verified:

  • src/storage/mongo/adapters/execution-evidence-adapter.ts:85-91 — countExecutionEvidence calls this.c.executionEvidence.countDocuments(filter), confirmed. There is no field on ExecutionEvidenceDoc that holds a numeric event count; the method counts documents, not events.
  • src/domain/evidence/types.ts — no existing execution_count, event_count, or cluster_size field. Confirmed.
  • sv0-connectors#31 original proposal — aggregates CloudTrail by (principalArn, eventName), which means one Mongo document per cluster. Confirmed against the issue body.

Finding accepted as a P1 blocker. A design that proposed to "count documents to get execution_30d" would have shipped incorrect counts the moment #31 started emitting aggregated records. The v2 design addresses this in:

  • §5.1.1 — new ExecutionEvidenceDoc.execution_count: number field, default 1
  • §5.1.3 — new sumExecutionEvidenceCount query method via Mongo $group + $sum
  • §5.5 rewrite #1 — materializer sums execution_count instead of counting documents
  • §6.3 — ordering clarifies that the schema addition ships first and the consuming logic second
  • §7 scenario 8 — new test that seeds two records with execution_count = 50 and execution_count = 1 and asserts the sum is 51
  • §8 alternative (H) — the "keep countExecutionEvidence as the count primitive" design is now explicitly rejected
  • §9 — new risk row for connectors that forget to set execution_count
  • §10 Q9, Q10 — new open questions about the schema-level required-ness and the dependency relationship with sv0-connectors#31
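
The difference between counting documents and summing execution_count can be shown with the §7 scenario 8 numbers. The sketch below is an in-memory approximation, not the adapter code: `EvidenceRecord`, `countEvidence`, `sumExecutionCount`, and `buildSumPipeline` are hypothetical names. The pipeline object follows the $group + $sum shape described above; wrapping the field in $ifNull is an added assumption for records that predate the field, since $sum would otherwise contribute nothing for a missing value.

```typescript
// Hypothetical record shape; execution_count is optional to model legacy records.
interface EvidenceRecord {
  resource_key: string | null;
  execution_count?: number;
}

// Document-count primitive: one per matching record, regardless of cluster size.
function countEvidence(records: EvidenceRecord[], resourceKey: string): number {
  return records.filter((r) => r.resource_key === resourceKey).length;
}

// Sum primitive: adds execution_count, treating a missing value as 1.
function sumExecutionCount(records: EvidenceRecord[], resourceKey: string): number {
  return records
    .filter((r) => r.resource_key === resourceKey)
    .reduce((total, r) => total + (r.execution_count ?? 1), 0);
}

// Roughly equivalent aggregation pipeline, written as plain data so no driver is needed.
function buildSumPipeline(resourceKey: string): object[] {
  return [
    { $match: { resource_key: resourceKey } },
    { $group: { _id: null, total: { $sum: { $ifNull: ["$execution_count", 1] } } } },
  ];
}

// One aggregated cluster of 50 events plus one single event (placeholder keys).
const records: EvidenceRecord[] = [
  { resource_key: "key:orders", execution_count: 50 },
  { resource_key: "key:orders", execution_count: 1 },
  { resource_key: "key:other", execution_count: 7 },
];

console.log(countEvidence(records, "key:orders"));     // 2
console.log(sumExecutionCount(records, "key:orders")); // 51
```

Counting documents reports 2 executions where 51 occurred, which is the numeric error this finding describes once a connector emits aggregated records.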

Finding #3 (P2) — Platform-only normalization is too strong for the current connector contract.

The placement rationale here assumes connectors emit raw IDs that the platform can canonicalize later, but that is not true today. Azure Foundry and Entra/ServiceNow execution evidence currently store display names in target_resource, and some AWS shapes only become canonical after interpreting CloudTrail fields in the connector. Without a stronger connector-to-platform target-identity contract, a platform-only helper is not actually future-proof for non-AWS providers and will silently fall back to null keys.

Citations verified:

  • integrations/azure-foundry/src/azure_foundry/core/transformer.py:498 — writes target_resource = agent.name, confirmed.
  • integrations/entra-servicenow/src/entra_servicenow/core/transformer.py:1993-2190 — writes resource_display_name, flow_name, job_name, table (e.g., "incident"), record_number (e.g., "INC0012345"). All display names. Confirmed.

Finding accepted as a P2. Escalated to P1 during v2 authoring because the "silent fall back to null keys" impact is larger than a P2 suggests — every non-AWS connector would silently lose all rule findings that depend on evidence attribution until their migration ships. The v2 design addresses this in:

  • §5.3.1 — new current-state reality check section that grounds the finding in specific connector code
  • §5.3.2 — new hybrid contract with target_resource_key on the ExecutionEvidence NormalizedGraph node; connectors populate, platform falls back for ARN-shaped sources only
  • §5.3.3 — per-connector migration table with separate issue tracking
  • §8 alternative (I) — the "platform-only, connectors untouched" design is now explicitly rejected
  • §9 — new risk row for non-AWS connectors shipping without the migration
  • §10 Q11 — new open question about whether the connector migrations block #306 or ship independently
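
The hybrid contract's resolution order can be sketched as below. This is a minimal sketch under assumptions: `ExecutionEvidenceNode`, `resolveResourceKey`, and `canonicalizeArn` are hypothetical names, the ARN check is a plain prefix test, and `canonicalizeArn` is a stand-in that only trims whitespace because the real per-service rules live in §5.2. The example key and ARN values are placeholders.

```typescript
// Hypothetical node shape; target_resource_key is absent on unmigrated connectors.
interface ExecutionEvidenceNode {
  target_resource: string;
  target_resource_key?: string | null;
}

// Stand-in for the platform's canonical-key helper. The real normalization
// rules are per-service and are NOT reproduced here.
function canonicalizeArn(arn: string): string {
  return arn.trim();
}

// Resolution order: connector-supplied key, then ARN-only platform fallback, then null.
function resolveResourceKey(node: ExecutionEvidenceNode): string | null {
  if (node.target_resource_key) {
    return node.target_resource_key; // connector knows best
  }
  if (node.target_resource.startsWith("arn:")) {
    return canonicalizeArn(node.target_resource); // platform fallback, ARN-shaped only
  }
  return null; // display names and other opaque strings get no key
}

const fromMigratedConnector: ExecutionEvidenceNode = {
  target_resource: "incident",
  target_resource_key: "connector-supplied-key",
};
const fromAwsConnector: ExecutionEvidenceNode = {
  target_resource: "arn:aws:s3:::example-bucket",
};
const fromUnmigratedConnector: ExecutionEvidenceNode = {
  target_resource: "incident",
};

console.log(resolveResourceKey(fromMigratedConnector));   // "connector-supplied-key"
console.log(resolveResourceKey(fromAwsConnector));        // "arn:aws:s3:::example-bucket"
console.log(resolveResourceKey(fromUnmigratedConnector)); // null
```

The third case is the silent null-key fallback the finding warns about: a display name such as "incident" cannot be canonicalized by the platform, so attribution depends on the connector populating the field.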

Scope impact of v2

Rough LoC estimates have grown from v1. Each cell is production LoC + test LoC:

| Layer | v1 | v2 | Delta |
| --- | --- | --- | --- |
| Canonical-key helper + tests | ~150 + ~100 | ~150 + ~100 | 0 |
| Entity/evidence schema + graph-transformer.ts + indexes | ~50 + ~50 | ~80 + ~80 | +30 prod, +30 test (adds execution_count, updates Zod) |
| Materializer + rule rewrites | ~150 + ~200 | ~250 + ~300 | +100 prod, +100 test (adds path identity re-keying, sumExecutionEvidenceCount, three new scenarios) |
| Path identity migration cleanup | 0 | ~50 + ~40 | +50 prod, +40 test (new) |
| Connector-side assumed-role ARN parser | ~10 + ~30 | ~10 + ~30 | 0 |
| target_resource_key Zod schema addition | 0 | ~20 + ~20 | +20 prod, +20 test (new) |
| Totals | ~360 prod + ~380 test | ~560 prod + ~570 test | +200 prod, +190 test |

Per-connector target_resource_key migrations (§5.3.3) are tracked separately and do not count against the #306 budget. Each is estimated at ~50 production LoC + ~40 test LoC, spread across Azure Foundry, Entra, and ServiceNow source-connector PRs.

Round 2 — 2026-04-10 (Codex)

Reviewer: Codex (adversarial design review, no tests run).

Codex re-raised all three findings from round 1. Each is addressed below with a verdict.

Finding #1 (P1) — "Collapsing revisions in resource_key still leaves duplicate authority paths"

This adds resource_key to the path payload, but it does not change path identity. The live platform still merges paths by resource_id and builds authority-path IDs and lineage from destination_id, so ECS revisions 17 and 18 will still materialize as two distinct paths even if they share one canonical key.

Verdict: duplicate of round 1 finding #1, already addressed in v2. §5.1.2 (lines 294-318) explicitly rewrites buildAuthorityPathId and buildPathLineageId to hash resource_key instead of destinationId. The mergePathsToSameResource merge key at path-materializer.ts:260 also changes from resource_id to resource_key. Codex cited lines 347-349 (§5.2, the key format section), which do not discuss path identity; that discussion is in §5.1.2.

Action: added cross-reference callout at the top of §5.2 pointing readers to §5.1.2 and §5.1.3. No design change needed.

Finding #2 (P1) — "Matching by resource_key does not fix the evidence-grain bug"

This rewrite assumes evidence can be batch-loaded once and then counted per path by resource_key, but the linked CloudTrail plan still aggregates by principal + eventName and the platform counts execution-evidence documents rather than an executionCount field.

Verdict: duplicate of round 1 finding #2, already addressed in v2. §5.1.1 (line 274) adds execution_count: number to ExecutionEvidenceDoc, defaulting to 1. §5.1.3 (line 320) adds sumExecutionEvidenceCount using Mongo $group + $sum: "$execution_count". The materializer and dormancy rules call the new sum method instead of countDocuments. Codex cited lines 372-380 (§5.3), which discuss who populates the key, not how evidence is counted; that discussion is in §5.1.1 and §5.1.3.

Action: added cross-reference callout at the top of §5.3 pointing readers to §5.1.1, §5.1.2, and §5.1.3. No design change needed.

Finding #3 (P2) — "Platform-only normalization rationale is too strong for the current connector contract"

Without a stronger connector-to-platform target-identity contract, a platform-only helper is not actually future-proof for non-AWS providers and will silently fall back to null keys.

Verdict: duplicate of round 1 finding #3, already addressed in v2. §5.3.2 introduces the hybrid target_resource_key contract where each connector populates the field using its own knowledge. §5.3.3 has a per-connector migration table. §5.3 itself opens with a v2 update note explaining that the v1 claim was weakened. Codex cited lines 331-346, which are §5.3.1 (the "current-state reality check" subsection added in v2 to document the problem); the solution is in §5.3.2, immediately following.

Action: the cross-reference callout added to §5.3 now also helps here. No design change needed.

Codex round 2 summary

All three findings are duplicates of round 1, already addressed in the v2 revision (commit 1a465c1). The root cause of the re-raise is a readability issue: the v2 fixes are concentrated in §5.1 (schema changes), but the sections Codex focused on (§5.2 key format, §5.3 normalization) did not cross-reference §5.1 strongly enough. Cross-reference callouts have been added to both sections.

No design changes were required. The document's technical content is unchanged from v2.

Round 3 — TBD

If additional cross-review is desired, findings will be logged here with the same structure.

Next Action

Status: research-in-progress

Decision needed from: Ivan (CTO) + cross-review by at least two additional models before flipping to research-complete.

Options:

  1. Adopt — create implementation tickets (already filed as SecurityV0/sv0-platform#306, SecurityV0/sv0-platform#307, SecurityV0/sv0-connectors#39) and begin implementation on the refactor as specified here.
  2. Adopt with modifications — make specific changes to §5 (proposed design) or §10 (open questions) based on cross-review findings, then proceed.
  3. Defer — park the refactor, ship the first-pass #302 design with explicit acknowledgment that it will produce zero findings against AWS data until #306, and come back to the refactor later.
  4. Reject — cross-review surfaces a fundamentally better approach; this document is archived with the reasoning.

GitHub Issues:

  • SecurityV0/sv0-platform#306 — canonical resource_key refactor (implementation tracker; links here)
  • SecurityV0/sv0-platform#307 — privilege_justification_gap silent no-op (depends on #306)
  • SecurityV0/sv0-connectors#39 — assumed-role ARN parser gap (independent, pairs with #306)
  • SecurityV0/sv0-platform#302 — path-scoped execution evidence attribution (original scope; ultimately closed by whichever PR lands the rule rewrites)
  • SecurityV0/sv0-documentation#164 — trailing-whitespace pre-commit hook housekeeping (surfaced by the MDX fix subagent; unrelated to this doc but filed to avoid future damage)

What "adopted" means for this doc: the status in the frontmatter flips to adopted, the Next Action section is updated with the implementation start date and the PR links as they land, and this doc becomes the canonical reference that all four linked issues point back to for rationale.