ADR-002: Single Entities Collection

Status

Accepted (2026-01-27)

Context

During implementation, we evaluated whether the single entities collection approach would scale or whether entities should be split into separate collections.

Options Considered

Approach A (Current): Single entities collection with entity_type discriminator
Approach B: Separate collection per type (identities, owners, roles, permissions, resources, credentials)
Approach C: Type collections + separate relationships edge collection

Concerns Addressed

MongoDB 16MB document limit
Query performance for security investigations
Operational complexity (backups, indexes)
Neo4j migration path
Alignment with materialized execution paths strategy

Decision

Keep single entities collection with embedded relationships.

All entity types (identity, owner, role, permission, resource, credential) are stored in one collection, discriminated by the entity_type field.
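A minimal sketch of what the discriminator layout might look like, using a plain Python list in place of a real MongoDB collection. Every field name other than entity_type (for example _id and name) is an assumption made for illustration, and the find helper only imitates an equality filter; it is not the MongoDB API.

```python
from enum import Enum


class EntityType(str, Enum):
    """The six entity types named in this ADR."""
    IDENTITY = "identity"
    OWNER = "owner"
    ROLE = "role"
    PERMISSION = "permission"
    RESOURCE = "resource"
    CREDENTIAL = "credential"


# One list stands in for the single `entities` collection.
# Field names other than `entity_type` are hypothetical.
entities = [
    {"_id": "id-1", "entity_type": EntityType.IDENTITY.value, "name": "svc-deploy"},
    {"_id": "own-1", "entity_type": EntityType.OWNER.value, "name": "platform-team"},
    {"_id": "role-1", "entity_type": EntityType.ROLE.value, "name": "deployer"},
]


def find(collection, query):
    """Return documents whose fields equal every key/value pair in `query`."""
    return [
        doc for doc in collection
        if all(doc.get(key) == value for key, value in query.items())
    ]


# Filtering on the discriminator selects one entity type from the shared collection.
identities = find(entities, {"entity_type": EntityType.IDENTITY.value})
```

Under this layout, supporting a new entity type would mean adding one enum member rather than creating a new collection.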

Rationale

1. Document Size Not a Concern

Typical identity with 50 execution paths: ~10KB
High-privilege identity with 200 paths: ~40KB
Would need roughly 80,000 paths (at ~200 bytes per path, per the figures above) to approach the 16MB limit
At 1K identities: average 50KB per document (well within limits)
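A back-of-the-envelope check of the headroom, assuming document size grows linearly with path count and that path data dominates the document. Only the 50-path/10KB figure and the 16MB limit come from this ADR; the linear model and the constant names are assumptions.

```python
MONGODB_DOC_LIMIT_BYTES = 16 * 1024 * 1024  # MongoDB's 16MB document limit

# Figure taken from the list above: 50 execution paths is roughly 10KB.
TYPICAL_PATHS = 50
TYPICAL_DOC_BYTES = 10 * 1024


def estimated_doc_bytes(path_count):
    """Estimate document size, assuming size scales linearly with path count."""
    return path_count * TYPICAL_DOC_BYTES // TYPICAL_PATHS


# Number of paths at which the linear estimate reaches the 16MB limit.
paths_to_hit_limit = MONGODB_DOC_LIMIT_BYTES * TYPICAL_PATHS // TYPICAL_DOC_BYTES

print(estimated_doc_bytes(200))  # size estimate for a high-privilege identity
print(paths_to_hit_limit)
```

Under these assumptions a 200-path identity comes out at 40KB, matching the figure above, and the limit is reached only at tens of thousands of paths.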

2. Query Performance Comparison

Query Pattern	Single Collection	Type Collections	Edge Collection
Blast radius	1 query (single document read)	1 query	1-2 queries
Ownership chain	2 queries	2+ queries	3+ queries
On-demand path	4 queries	6 queries	7+ queries
Mixed entity query	1 query	6 queries	6+ queries

Single collection minimizes round-trips for security investigation queries.
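The "mixed entity query" row can be illustrated with an in-memory stand-in that counts round-trips. This is a sketch, not a benchmark: CountingCollection is a hypothetical helper, and the tenant_id field is an assumed filter key.

```python
class CountingCollection:
    """In-memory stand-in for a collection that counts find() round-trips."""

    def __init__(self, docs):
        self.docs = docs
        self.queries = 0

    def find(self, predicate):
        self.queries += 1
        return [doc for doc in self.docs if predicate(doc)]


TYPES = ["identity", "owner", "role", "permission", "resource", "credential"]
docs = [{"_id": f"{t}-1", "entity_type": t, "tenant_id": "t1"} for t in TYPES]


def in_tenant(doc):
    """Hypothetical filter: every entity belonging to tenant t1."""
    return doc["tenant_id"] == "t1"


# Approach A: one collection, so one query returns all matching entity types.
single = CountingCollection(docs)
mixed_a = single.find(in_tenant)

# Approach B: one collection per type, so the same question costs one query per type.
per_type = {
    t: CountingCollection([d for d in docs if d["entity_type"] == t]) for t in TYPES
}
mixed_b = []
for collection in per_type.values():
    mixed_b.extend(collection.find(in_tenant))

queries_a = single.queries
queries_b = sum(c.queries for c in per_type.values())
```

Both approaches return the same six documents; the difference is one round-trip versus six.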

3. Operational Simplicity

Dimension	Single	Type Collections	Edge Collection
Collections to manage	6	11	12+
Entity indexes	8	30+	40+
Write routing	Simple	Complex (6-way)	Very complex

4. Aligns with Materialized Paths

The platform's core performance strategy is pre-computed execution paths stored directly on identity documents. The single-collection approach fits this strategy:

✅ Paths embedded on identities
✅ No join complexity at query time
✅ Application-level traversal is simple

Edge collection would conflict with materialized paths, requiring either:

Keep paths embedded (defeats edge collection purpose)
Move paths to separate collection (extra lookup overhead)
Recompute on query (defeats materialization)
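To make the embedded-paths idea concrete, here is a sketch of an identity document carrying its materialized execution paths, with blast radius derived from that one document. The execution_paths field name, the shape of each path, and the definition of blast radius as "distinct resources reachable" are all assumptions for illustration.

```python
# Hypothetical identity document with materialized execution paths embedded.
identity = {
    "_id": "id-1",
    "entity_type": "identity",
    "execution_paths": [
        {"role_id": "role-1", "permission_id": "perm-read", "resource_id": "res-db"},
        {"role_id": "role-1", "permission_id": "perm-write", "resource_id": "res-db"},
        {"role_id": "role-2", "permission_id": "perm-read", "resource_id": "res-bucket"},
    ],
}


def blast_radius(identity_doc):
    """Return the sorted distinct resources reachable from one identity document.

    No further lookups are needed because the paths are already embedded.
    """
    paths = identity_doc.get("execution_paths", [])
    return sorted({path["resource_id"] for path in paths})


reachable = blast_radius(identity)
```

With an edge collection, the same answer would require either a second lookup for the paths or a traversal at query time, which are the conflicts listed above.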

5. Industry Standard Pattern

A single collection with a discriminator field is the idiomatic MongoDB pattern for polymorphic documents and is widely used in production.

Consequences

Positive

Simplest mental model (one place for entities)
Fewest round-trips for security queries among the options compared
Minimal operational overhead
Clean Neo4j migration (single collection to iterate)
Schema evolution without migration (add enum value)

Negative (Acceptable Trade-offs)

Identity documents grow with execution paths (5-50KB typical)
Non-atomic path materialization across documents (eventual consistency)
Full collection scans if not filtering by entity_type (mitigated by indexes)

When to Reconsider

Trigger to Split Identities

Only if ALL of:

Tenants exceed 10,000 identities AND
Identity documents consistently exceed 100KB AND
Query performance degrades on entity_type scans

Response: Split to 2 collections (identities, other_entities) — minimal change via StorageAdapter.
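The trigger and the response could be sketched as follows. The thresholds come from the list above; TenantStats, its fields, and the StorageAdapter interface shown here are hypothetical, since this ADR names the adapter but does not describe its API.

```python
from dataclasses import dataclass


@dataclass
class TenantStats:
    """Hypothetical per-tenant metrics; how they are gathered is out of scope."""
    identity_count: int
    typical_identity_doc_bytes: int
    entity_type_scans_degraded: bool


def should_split_identities(stats):
    """Return True only when all three triggers hold at once."""
    return (
        stats.identity_count > 10_000
        and stats.typical_identity_doc_bytes > 100 * 1024
        and stats.entity_type_scans_degraded
    )


class StorageAdapter:
    """Hypothetical routing layer: callers never name a collection directly."""

    def __init__(self, split_identities=False):
        self.split_identities = split_identities

    def collection_for(self, entity_type):
        """Map an entity type to the collection that stores it."""
        if not self.split_identities:
            return "entities"
        return "identities" if entity_type == "identity" else "other_entities"
```

If all reads and writes go through something like collection_for, flipping split_identities would be the only routing change needed, which is what makes the two-collection split a small step.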

Trigger for Edge Collection

Never for MongoDB-only architecture.

Only consider if Neo4j already deployed AND need graph analytics on raw relationships.

Related Documents

03-database.md - Full database architecture
ADR-001 - MongoDB-only decision
