ADR-001: MongoDB Only for MVP

Status

Accepted (2026-01-22)

Context

SecurityV0 has two primary data workloads:

Graph traversal - Execution path queries: "What can this identity reach through which roles?"
Temporal analysis - Point-in-time queries, drift detection over time

The question: Should we use a single database (MongoDB) or a hybrid approach (Neo4j for graph + TimescaleDB for temporal)?

Evaluation Criteria

Rich document storage (full policy JSON, raw API responses)
Graph traversal performance (3-5 hop paths typical)
Point-in-time subgraph queries
Operational simplicity
Team familiarity

Decision

MongoDB as the single database for MVP. All data — entities, relationships, versions, events, findings, evidence — lives in MongoDB.

Graph queries are handled through:

Materialized execution paths - Pre-computed on each identity during sync
Denormalized reverse lookups - accessible_by arrays on resource documents
Application-level traversal - Follow relationship references for on-demand queries
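
As a rough illustration of the first two mechanisms, the sketch below recomputes paths and reverse lookups in memory. It is not the project's sync code: the relationship shape, the execution_paths field name, the max_hops default, and the sample entities are all assumptions made for the example. Only the accessible_by name comes from this ADR.

```python
from collections import deque

# Hypothetical relationship documents: one directed edge per document.
relationships = [
    {"from": "user:alice", "to": "role:dev"},
    {"from": "role:dev", "to": "role:deploy"},
    {"from": "role:deploy", "to": "bucket:artifacts"},
    {"from": "user:bob", "to": "bucket:artifacts"},
]


def materialize_paths(identity_id, relationships, max_hops=5):
    """Breadth-first walk from one identity; returns each path as a list of entity ids."""
    outgoing = {}
    for rel in relationships:
        outgoing.setdefault(rel["from"], []).append(rel["to"])

    paths = []
    queue = deque([[identity_id]])
    while queue:
        path = queue.popleft()
        if len(path) - 1 >= max_hops:  # hop budget used up on this branch
            continue
        for target in outgoing.get(path[-1], []):
            if target in path:  # skip cycles
                continue
            new_path = path + [target]
            paths.append(new_path)
            queue.append(new_path)
    return paths


def sync(identities, resources, relationships):
    """Recompute execution_paths on identities and accessible_by on resources."""
    for resource in resources.values():
        resource["accessible_by"] = []
    for identity_id, doc in identities.items():
        doc["execution_paths"] = materialize_paths(identity_id, relationships)
        reached = {path[-1] for path in doc["execution_paths"]}
        for target in reached:
            if target in resources:
                resources[target]["accessible_by"].append(identity_id)


identities = {"user:alice": {}, "user:bob": {}}
resources = {"bucket:artifacts": {}}
sync(identities, resources, relationships)
```

After a sync, "what can this identity reach" reads one identity document and "who can reach this resource" reads one resource document. The cost moves to write time, which is why recomputation appears among the trade-offs.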

Future enhancement: When scale requires it (10,000+ identities, 5+ hop paths), add Neo4j as a thin graph index over MongoDB. The StorageAdapter interface allows this without changes to the connectors or the API.
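
A minimal sketch of how such an abstraction might look follows. The ADR names the StorageAdapter interface but does not specify it, so every method name, the InMemoryAdapter class, and the blast_radius helper are hypothetical.

```python
from typing import List, Optional, Protocol


class StorageAdapter(Protocol):
    """Hypothetical storage contract; the method names are illustrative only."""

    def upsert_entity(self, entity_id: str, doc: dict) -> None: ...

    def get_entity(self, entity_id: str) -> Optional[dict]: ...

    def reachable_from(self, identity_id: str) -> List[str]: ...


class InMemoryAdapter:
    """Stand-in backend; a MongoDB or Neo4j-backed class would expose the same methods."""

    def __init__(self) -> None:
        self._docs = {}

    def upsert_entity(self, entity_id: str, doc: dict) -> None:
        self._docs[entity_id] = dict(doc)

    def get_entity(self, entity_id: str) -> Optional[dict]:
        return self._docs.get(entity_id)

    def reachable_from(self, identity_id: str) -> List[str]:
        # Reads the pre-computed paths stored on the identity document.
        doc = self._docs.get(identity_id) or {}
        return sorted({path[-1] for path in doc.get("execution_paths", [])})


def blast_radius(store: StorageAdapter, identity_id: str) -> int:
    """Caller code depends only on the adapter contract, not on the backend."""
    return len(store.reachable_from(identity_id))


store = InMemoryAdapter()
store.upsert_entity(
    "user:alice",
    {"execution_paths": [["user:alice", "role:dev"], ["user:alice", "role:dev", "bucket:artifacts"]]},
)
```

Because blast_radius only sees the contract, a graph-index backend could later answer reachable_from differently without the caller changing.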

Rationale

MongoDB Advantages

Capability	MongoDB	Neo4j+TimescaleDB
Rich documents	Native (embedded JSON)	Poor (flat properties)
Point-in-time queries	Direct read (versioned docs)	Event reconstruction
Operational complexity	Single system	Two systems to manage
Team familiarity	High	Low
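
To make the "direct read" row concrete, here is a small sketch of a point-in-time lookup over versioned documents, plus a drift comparison between two instants. The version document shape (valid_from, valid_to, policy.actions) and the sample data are assumptions for illustration, not the project's schema.

```python
from datetime import datetime, timezone


def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)


# Hypothetical version documents: one per change, valid over [valid_from, valid_to).
versions = [
    {"entity_id": "role:dev", "valid_from": utc(2026, 1, 1), "valid_to": utc(2026, 1, 10),
     "policy": {"actions": ["s3:GetObject"]}},
    {"entity_id": "role:dev", "valid_from": utc(2026, 1, 10), "valid_to": None,
     "policy": {"actions": ["s3:GetObject", "s3:PutObject"]}},
]


def as_of(versions, entity_id, at):
    """Return the version of an entity that was current at the given instant, if any."""
    for version in versions:
        if version["entity_id"] != entity_id:
            continue
        still_open = version["valid_to"] is None or at < version["valid_to"]
        if version["valid_from"] <= at and still_open:
            return version
    return None


def drift(versions, entity_id, earlier, later):
    """Compare the allowed actions of one entity at two points in time."""
    before = as_of(versions, entity_id, earlier)
    after = as_of(versions, entity_id, later)
    old = set(before["policy"]["actions"]) if before else set()
    new = set(after["policy"]["actions"]) if after else set()
    return {"added": sorted(new - old), "removed": sorted(old - new)}
```

Against MongoDB, the same lookup would likely be one filtered read, for example a filter on entity_id with valid_from at or before the instant and valid_to either null or after it, rather than a replay of events.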

MongoDB Limitations (Acceptable)

$graphLookup is subject to the aggregation stage memory limit (100 MB) and recurses over a single collection
No index-free adjacency (index lookup at each hop)
Variable-length path queries less efficient than native graph DB
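
For reference, a $graphLookup reachability query could be shaped roughly like the pipeline below. The stage options (from, startWith, connectFromField, connectToField, as, maxDepth, depthField) are MongoDB's. The entities collection and the reaches field, assumed here to hold an array of target _id values, are hypothetical.

```python
def reachability_pipeline(identity_id: str, max_hops: int = 3) -> list:
    """Build an aggregation pipeline that follows `reaches` references from one identity."""
    return [
        {"$match": {"_id": identity_id}},
        {
            "$graphLookup": {
                "from": "entities",           # the recursion stays inside this one collection
                "startWith": "$reaches",      # first hop: ids referenced by the identity
                "connectFromField": "reaches",
                "connectToField": "_id",      # each hop is a lookup on _id
                "as": "reachable",
                "maxDepth": max_hops - 1,     # depth 0 is already the first hop
                "depthField": "hops",
            }
        },
        {"$project": {"reachable._id": 1, "reachable.hops": 1}},
    ]


pipeline = reachability_pipeline("user:alice", max_hops=3)
```

With a driver such as PyMongo, the list would be passed to the collection's aggregate method. The from option is where the single-collection constraint shows up, and the whole reachable array has to fit within the stage's memory limit.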

Why These Limitations Are Acceptable

At MVP scale (< 1,000 identities, 2-3 connectors):

Materialized paths turn a blast radius query into a single-document read (O(1) lookups per identity)
Application-level traversal is fast enough for on-demand queries
Reverse queries via denormalized arrays are efficient
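
The on-demand case can be sketched as a frontier-by-frontier walk that issues one batched lookup per hop, so a 3-hop path costs about three round trips. The in-memory entities dict stands in for an indexed collection, and the reaches field, the find_many helper, and the sample data are hypothetical.

```python
# In-memory stand-in for a collection indexed on _id.
entities = {
    "user:alice": {"reaches": ["role:dev"]},
    "role:dev": {"reaches": ["role:deploy"]},
    "role:deploy": {"reaches": ["bucket:artifacts"]},
    "bucket:artifacts": {"reaches": []},
}


def find_many(ids):
    """Stand-in for one indexed batch query, e.g. a find with an $in filter on _id."""
    return {entity_id: entities[entity_id] for entity_id in ids if entity_id in entities}


def find_path(start, goal, max_hops=5):
    """Return (path, lookups): the shortest path found and the number of batch queries used."""
    parent = {start: None}
    frontier = [start]
    lookups = 0
    for _ in range(max_hops):
        if not frontier:
            break
        docs = find_many(frontier)  # one round trip per hop, not per node
        lookups += 1
        next_frontier = []
        for node_id in frontier:
            for target in docs.get(node_id, {}).get("reaches", []):
                if target in parent:  # already visited
                    continue
                parent[target] = node_id
                if target == goal:
                    # Walk the parent links back to the start to rebuild the path.
                    path = [target]
                    while parent[path[-1]] is not None:
                        path.append(parent[path[-1]])
                    return path[::-1], lookups
                next_frontier.append(target)
        frontier = next_frontier
    return None, lookups
```

Batching by frontier keeps the query count tied to path depth rather than graph size, which is one reason this approach may hold up at the 3-5 hop depths the ADR calls typical.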

Why Not Neo4j for MVP

Overkill for MVP scale - Native graph traversal not needed at < 1,000 identities
Poor document support - Flat properties can't store rich policy JSON efficiently
Dual-write complexity - Two databases means consistency challenges
Operational overhead - Two systems to manage, backup, monitor

Consequences

Positive

Simpler mental model (one database)
Rich document storage for policy JSON and raw API responses
Direct point-in-time queries (no event reconstruction)
Team can move fast with familiar technology
StorageAdapter abstraction protects future migration

Negative (Acceptable Trade-offs)

Graph queries require application-level traversal
Materialized paths must be recomputed on sync
Deep path queries (5+ hops) become expensive at scale

Risks Mitigated

✅ StorageAdapter interface isolates connectors/API from storage implementation
✅ Neo4j migration path designed and documented
✅ Materialized paths keep blast radius reads at O(1) lookups as data grows (the cost moves to path recomputation during sync)

When to Reconsider

Add Neo4j when ANY of these occur:

10,000+ identities per tenant with 5+ connectors
5+ hop transitive chains (cross-system paths)
Path recomputation becomes sync bottleneck
Real-time reverse queries required at scale

Related Documents

03-database.md - Full database architecture
ADR-002 - Single collection decision
ADR-003 - Apache AGE rejection
