Deployment and Cloud Strategy Research

Date: 2026-02-07

Scope: Core platform (api, query, trigger evaluator, evidence generator, ui), database, connectors, scheduling/triggers, logging/observability, and autonomous operations support.

1. Decision Criteria

Primary criteria used for the options below:

Speed to first pilot (days/weeks, not months)
Deterministic operability (easy to inspect logs and state via CLI)
Cost predictability at low volume
Migration path without re-platforming core code
Support for autonomous troubleshooting (agents/automation can query logs/metrics with narrow permissions)

2. Deployment Options

Option A: MVP Self-Hosted VM (Hetzner + Docker Compose)

Shape

One VM for app services (api, ui, connectors, trigger, evidence) via Docker Compose
One VM for the data plane (mongodb + backup job), or the same VM as the app services for the earliest MVP
Optional third small VM for observability stack (Loki/Prometheus/Grafana)

Pros

Fastest setup and lowest fixed cost
Full root/SSH control for debugging
Easy to run all services together and iterate architecture quickly

Cons

You own HA, backups, patching, and security hardening
Manual scaling and failover
Higher operational risk for customer-facing production

Fit

Best for internal demo, design-partner pilot, and rapid product iteration

Option B: AWS ECS on EC2 (Production-leaning, lower complexity than EKS)

Shape

ECS services for core platform
Connector workers as separate ECS services or scheduled tasks
MongoDB remains self-managed initially (EC2) or moved to Atlas later
CloudWatch logs/metrics + ECS Exec for diagnostics

Pros

ECS EC2 launch type has no additional ECS control plane charge
Better IAM, networking, and security controls than single VM
Easier ops than EKS

Cons

You still manage EC2 capacity and patching
Operations remain non-trivial while MongoDB is self-managed

Fit

Strong mid-stage target when you want AWS controls without Kubernetes overhead

Option C: AWS ECS on Fargate (Managed container runtime)

Shape

Core services on ECS Fargate
Connector jobs as on-demand/scheduled Fargate tasks
DB likely moved to managed offering for better reliability

Pros

No node management
Clean scaling model per task
Good fit for bursty connectors

Cons

Can cost more than EC2 at steady state
Per-task compute pricing needs tight right-sizing

Fit

Good for production where team bandwidth for infra ops is limited

Option D: AWS EKS (Strategic only if Kubernetes-native operating model is required)

Shape

Full Kubernetes platform for core services and workers
GitOps (Argo CD/Flux) + HPA/KEDA + service mesh (optional)

Pros

Maximum flexibility and ecosystem
Standardized platform if broader org already runs Kubernetes

Cons

Highest platform complexity
EKS cluster fee applies regardless of workload size

Fit

Use only when there is a clear multi-team/platform reason

Option E: Event-Driven Connectors/Triggers with Lambda (Selective)

Shape

Keep core API/query services containerized
Move selected connector ingestion and trigger evaluation jobs to Lambda + EventBridge

Pros

Excellent for bursty workloads
Fine-grained cost for sporadic jobs

Cons

Distributed debugging complexity
Cold start/runtime constraints for heavy workloads

Fit

Good complement later, not a full replacement of core graph platform

3. Recommended Phased Strategy

Phase 0 (MVP, now): Hetzner + Docker Compose

Target timeline: 1-3 weeks

Baseline architecture

VM-1: api, query, trigger, evidence, ui, connector workers
VM-2: mongodb + nightly encrypted backup + restore test
Optional VM-3: observability stack if needed

Critical controls for MVP

Immutable image tags per deploy
Backup + restore drill (weekly)
Basic SLO dashboard (API error rate, connector failure rate, sync latency)
Structured JSON logs with correlation IDs (tenant_id, sync_id, entity_id, finding_id)
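
As an illustration of the structured-logging control above, here is a minimal sketch using Python's standard logging module. The four correlation fields come from the list above; the formatter class, the `build_logger` helper, and the output field names (`ts`, `level`, `logger`, `message`) are assumptions made for the example, not the platform's actual logging code.

```python
import json
import logging

# Correlation fields named in the controls above; everything else is illustrative.
CORRELATION_FIELDS = ("tenant_id", "sync_id", "entity_id", "finding_id")


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Copy correlation IDs only when the caller supplied them via `extra=`.
        for field in CORRELATION_FIELDS:
            value = getattr(record, field, None)
            if value is not None:
                payload[field] = value
        return json.dumps(payload, sort_keys=True)


def build_logger(name: str) -> logging.Logger:
    """Return a logger that writes JSON lines to stderr (hypothetical helper)."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = build_logger("connector.sync")
    log.info("sync started", extra={"tenant_id": "t-001", "sync_id": "s-42"})
```

One JSON object per line keeps the output usable with docker logs on the VM today and with a log query tool later, since both can filter on the same correlation keys.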

Why this is acceptable

Fastest path to pilot while preserving deterministic debugging via SSH + Docker logs

Phase 1 (Pilot to early production): AWS ECS on EC2

Target timeline: after first pilots, before broader customer rollout

Move plan

Migrate app services from Compose to ECS task definitions
Keep connectors as separate services/tasks
Use CloudWatch for centralized logs/metrics
Enable ECS Exec for break-glass diagnostics

Reasoning

Materially better security and operability than raw VMs without full Kubernetes burden

Phase 2 (Production scale): ECS Fargate or EKS by clear trigger

Choose ECS Fargate when

Team wants managed runtime and predictable operational model
Workloads are moderate and scaling is service/task oriented

Choose EKS when

Organization already has Kubernetes platform team
Need advanced Kubernetes-native controls that ECS cannot reasonably satisfy

Phase 3 (Optimization): Selective Lambda for burst jobs

Use Lambda/EventBridge for:

scheduled lightweight connectors
enrichment jobs
periodic housekeeping/reconciliation

Keep long-running graph/API workloads containerized.

4. Observability and Logging Strategy

MVP observability (low cost, high control)

Option MVP-Obs-A: Self-hosted LGTM-style stack (Prometheus in place of Mimir)

Loki + Promtail for logs
Prometheus + Alertmanager for metrics
Grafana for dashboards
Optional Tempo for traces

Pros:

Minimal vendor cost
Full control
Easy CLI access (docker logs, logcli, promtool)

Cons:

You operate it
Retention and scaling need discipline

Option MVP-Obs-B: Managed SaaS light footprint

Grafana Cloud or Better Stack for logs/metrics/traces
Keep app on Hetzner

Pros:

Lower operational overhead
Fast setup and team-friendly UI

Cons:

Ongoing ingest/retention costs
Vendor dependency

Production observability

Option Prod-Obs-A: AWS-native

CloudWatch Logs + CloudWatch metrics + alarms
Optional Amazon Managed Grafana and AMP

Pros:

IAM-native access control
Deep ECS/EKS integration
ECS Exec and the AWS CLI (aws logs) improve remote diagnostics

Cons:

Cost can rise quickly with log volume and query scanning

Option Prod-Obs-B: Hybrid

Keep CloudWatch for platform logs
Route application logs/metrics to Grafana Cloud or Better Stack

Pros:

Better query UX/correlation in some cases
Can reduce operational toil

Cons:

Dual tooling and data egress considerations

5. MVP Cost Comparison And Ready-to-Go Estimate

All numbers below are high-level estimates using public list pricing and simple assumptions.

5.1 Assumptions Used

Region assumptions:
  Hetzner EU pricing from public cloud page.
  AWS us-east-1 style list pricing anchors.
Workload assumptions:
  Core services run 24x7.
  1 pilot tenant, low traffic.
  30 GB/month log ingestion.
  300 GB persistent block storage for MongoDB data and snapshots.
CI/CD assumptions:
  GitHub Actions private repo includes 2,000 free Linux minutes/month; overage priced at $0.008/min.
Exclusions:
  VAT/tax, support plans, data egress spikes, incident-response labor.
  One-time engineering setup cost.

5.2 Unit Cost Anchors (Public Pricing)

Hetzner Cloud examples:
  CPX21 = EUR 9.49/month
  CPX31 = EUR 16.49/month
  CPX41 = EUR 30.49/month
  Hetzner Load Balancer = from EUR 5.39/month
  Hetzner Object Storage = from EUR 4.99/month
AWS:
  ECS on EC2: no additional ECS fee beyond underlying resources.
  EKS control plane: $0.10/hour per cluster.
  Fargate Linux x86 (on-demand): $0.000011244 per vCPU-second, $0.000001235 per GB-second of memory.
  EC2 on-demand (reference class):
    t2.medium = $0.0464/hour
    t2.large = $0.0928/hour
  Application Load Balancer:
    base $0.0225/hour
    LCU $0.008/LCU-hour
  CloudWatch Logs examples:
    ingest $0.50/GB
    archive $0.03/GB-month
GitHub Actions:
  private repos include 2,000 minutes/month free
  Linux 2-core overage $0.008/min
Grafana Cloud Pro:
  $19/month platform fee
  includes 50 GB logs + 50 GB traces
  beyond included logs/traces $0.50/GB
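
The per-second and per-hour anchors above are easier to compare once converted to monthly figures. The sketch below does that conversion using the rates listed in this section. The 730-hour month is an assumption of the example, and the function names are illustrative.

```python
HOURS_PER_MONTH = 730  # assumption: average month length used for estimates


def hourly_to_monthly(rate_per_hour: float, hours: int = HOURS_PER_MONTH) -> float:
    """Convert an hourly list price into a monthly figure."""
    return rate_per_hour * hours


def fargate_monthly(vcpu: float, memory_gb: float,
                    vcpu_per_second: float = 0.000011244,
                    gb_per_second: float = 0.000001235,
                    hours: int = HOURS_PER_MONTH) -> float:
    """Monthly cost of an always-on Fargate footprint at the anchor rates above."""
    seconds = hours * 3600
    return seconds * (vcpu * vcpu_per_second + memory_gb * gb_per_second)


def cloudwatch_ingest_monthly(log_gb: float, per_gb: float = 0.50) -> float:
    """CloudWatch Logs ingestion only; archive storage is billed separately."""
    return log_gb * per_gb


def grafana_cloud_pro_monthly(log_gb: float, platform_fee: float = 19.0,
                              included_gb: float = 50.0,
                              overage_per_gb: float = 0.50) -> float:
    """Platform fee plus log overage beyond the included volume."""
    return platform_fee + max(0.0, log_gb - included_gb) * overage_per_gb


if __name__ == "__main__":
    print(f"EKS control plane: ${hourly_to_monthly(0.10):.2f}/month")
    print(f"t2.medium:         ${hourly_to_monthly(0.0464):.2f}/month")
    print(f"Fargate 3 vCPU/6GB: ${fargate_monthly(3, 6):.2f}/month")
    print(f"CloudWatch 30 GB:  ${cloudwatch_ingest_monthly(30):.2f}/month")
    print(f"Grafana Pro 30 GB: ${grafana_cloud_pro_monthly(30):.2f}/month")
```

Under these assumptions the EKS control plane alone comes to about $73/month, and a 3 vCPU / 6 GB Fargate footprint to about $108/month before any other resources.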

5.3 MVP Option Comparison (Monthly, High-Level)

MVP Option	What is included	Estimated Monthly Cost
A. Hetzner Lean	`1x CPX31` all-in-one app+db, object storage backups	~EUR 21.48
B. Hetzner Ready-to-Go (recommended MVP)	`1x CPX31` app/workers + `1x CPX41` MongoDB + `1x CPX21` observability + LB + object storage	~EUR 66.85
C. AWS ECS on EC2 (MVP production-like)	`2x t2.medium` ECS nodes + `1x t2.large` MongoDB + ALB + 300GB EBS + 3 IPv4 + 30GB CloudWatch logs	~USD 205 to USD 240
D. AWS ECS Fargate + EC2 MongoDB	Fargate core services (`3 vCPU`, `6 GB`) + `1x t2.large` MongoDB + ALB + EBS + IPv4 + CloudWatch logs	~USD 245 to USD 280

Notes:

AWS ranges reflect whether NAT Gateway is needed (+~USD 33/month base before traffic processing) and modest ALB LCU variability.
Option C/D are intentionally sized as practical MVP production baselines, not ultra-minimal single-instance demos.
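
The table totals can be reproduced from the unit anchors in 5.2. The sketch below recomputes options A, B, and C. The EBS rate ($0.08/GB-month) and the public IPv4 rate ($0.005/hour) are not listed in 5.2 and are assumptions of this example, as is the 730-hour month; check them against the AWS pricing pages before relying on the result.

```python
HOURS_PER_MONTH = 730          # assumption: average month length
EBS_USD_PER_GB_MONTH = 0.08    # assumption: not listed in the anchors above
IPV4_USD_PER_HOUR = 0.005      # assumption: not listed in the anchors above
NAT_GATEWAY_BASE_USD = 33.0    # base figure quoted in the notes above

HETZNER_EUR = {
    "CPX21": 9.49,
    "CPX31": 16.49,
    "CPX41": 30.49,
    "load_balancer": 5.39,
    "object_storage": 4.99,
}


def hetzner_total(items: list) -> float:
    """Sum monthly EUR list prices for the named Hetzner line items."""
    return round(sum(HETZNER_EUR[item] for item in items), 2)


def aws_ecs_on_ec2_total(nat_gateway: bool) -> float:
    """Option C: 2x t2.medium + 1x t2.large + ALB base + EBS + IPv4 + logs."""
    ecs_nodes = 2 * 0.0464 * HOURS_PER_MONTH
    mongodb_node = 0.0928 * HOURS_PER_MONTH
    alb_base = 0.0225 * HOURS_PER_MONTH      # excludes LCU charges
    ebs = 300 * EBS_USD_PER_GB_MONTH
    ipv4 = 3 * IPV4_USD_PER_HOUR * HOURS_PER_MONTH
    logs = 30 * 0.50
    nat = NAT_GATEWAY_BASE_USD if nat_gateway else 0.0
    return round(ecs_nodes + mongodb_node + alb_base + ebs + ipv4 + logs + nat, 2)


option_a = hetzner_total(["CPX31", "object_storage"])                    # 21.48
option_b = hetzner_total(
    ["CPX31", "CPX41", "CPX21", "load_balancer", "object_storage"])      # 66.85

if __name__ == "__main__":
    print(f"A: EUR {option_a}")
    print(f"B: EUR {option_b}")
    print(f"C: USD {aws_ecs_on_ec2_total(False)} to {aws_ecs_on_ec2_total(True)}")
```

With these inputs option C lands at roughly USD 202 without a NAT Gateway and USD 235 with one, slightly below the table's range because ALB LCU charges are left out.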

5.4 Ready-to-Go Deployed Solution Estimate (Recommended MVP)

Recommended for first customer pilot:

Platform: Hetzner Compose split-node MVP (CPX31 + CPX41 + CPX21 + LB + Object Storage)
Observability: self-hosted LGTM on the CPX21 node
CI/CD: GitHub Actions auto-deploy to staging on every main commit, prod via protected approval

Estimated monthly run cost:

Infrastructure subtotal: ~EUR 66.85/month
GitHub Actions overage:
  0 if under included 2,000 private Linux minutes
  Example overage (+1,000 min) = ~USD 8/month
Optional managed observability alternative:
  replace self-hosted observability node with Grafana Cloud Pro (USD 19/month), typically reducing ops burden

Planning budget for pilot:

Target run-rate: ~EUR 67 + USD 0 to 20/month (depending on CI and observability choice)
Operationally safe envelope with contingency: ~EUR 90 to 140/month equivalent
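
The CI line item above is simple arithmetic, sketched here so the overage can be re-estimated as build minutes grow. The defaults are the figures from 5.1; the function name is illustrative.

```python
def actions_overage_usd(minutes_used: int,
                        included_minutes: int = 2000,
                        rate_per_minute: float = 0.008) -> float:
    """Monthly GitHub Actions overage for Linux minutes beyond the free allowance."""
    billable = max(0, minutes_used - included_minutes)
    return round(billable * rate_per_minute, 2)


if __name__ == "__main__":
    for used in (1500, 2000, 3000):
        print(f"{used} min -> USD {actions_overage_usd(used):.2f}")
```

At 3,000 minutes the function returns 8.0, which matches the example overage quoted above.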

5.5 Hetzner-Range Competitors (US-Focused)

Reference date: 2026-02-07. Prices are entry-level and can change.

Provider	Entry Price (Approx)	US Presence	Fit Notes
OVHcloud US VPS	`from $4.20/month`	US regions including Hillsboro and Vint Hill	Strong price/perf, daily backups and anti-DDoS included in VPS range.
Vultr Cloud Compute	`from $2.50/month` (IPv6-only) or `$5/month` standard entry	Broad US city coverage (e.g., Atlanta, Chicago, Dallas, Los Angeles, Miami, New York area, Seattle, SF Bay Area)	Good balance of low price and many US regions.
DigitalOcean Droplets	`from $6/month` for 1GB basic droplets	US datacenters in NYC, SFO, ATL	Usually higher than Hetzner at equal specs, but simpler operations and good DX.
Contabo Cloud VPS	`Cloud VPS 10` total shown around `EUR 5.45 to 5.90` with US location fees	US East, US West, US Central	Very low monthly price; verify performance consistency for production workloads.
IONOS Cloud Cubes	`from $5.76 per 30 days`	Newark, NJ (US)	Cost-effective and simple billing model for small footprints.
UpCloud Developer Plans	`from $3.5/month` (USD pricing)	US in Chicago, New York, San Jose	Competitive low end with good US footprint and predictable plans.
AWS Lightsail	`from $3.50/month` (IPv6-only) and `$5/month` standard entry	Multiple US regions	Simple AWS entry point; costs can rise once add-ons/managed services are added.

Selection guidance for SecurityV0 MVP:

Choose Vultr when you want many US regions with low entry cost and simple VM operations.
Choose OVHcloud US when you prioritize cost and built-in VPS protections.
Choose DigitalOcean when developer workflow and operational simplicity are more important than minimum price.
Choose Contabo/IONOS/UpCloud when monthly floor cost is the main driver and you can validate workload behavior early.

6. CLI and Agent Access Requirements

Required for both human and autonomous troubleshooting:

Centralized logs accessible by CLI
Narrow-scoped read-only credentials for diagnostic agents
Correlation IDs across services and jobs
Ability to access runtime shell only via auditable controls

CLI paths by platform

Compose/VM: ssh, docker compose logs, docker logs
ECS: aws logs tail, aws logs start-query, aws ecs execute-command
EKS: kubectl logs, kubectl describe, kubectl top

Security model

No shared root credentials
Role-based temporary credentials (OIDC where possible)
Full audit trail for interactive access (ECS Exec supports CloudTrail auditing)
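
One way to express "narrow-scoped read-only credentials" for a diagnostic agent on AWS is an IAM policy limited to CloudWatch Logs read actions. The sketch below generates such a policy document. The log group prefix, account ID, and helper name are hypothetical, and which actions accept log-group scoping should be verified against the IAM service authorization reference before use.

```python
import json
from typing import Any, Dict


def readonly_log_policy(region: str, account_id: str,
                        log_group_prefix: str) -> Dict[str, Any]:
    """Build an IAM policy document granting read-only CloudWatch Logs access."""
    log_groups = (
        f"arn:aws:logs:{region}:{account_id}:log-group:{log_group_prefix}*"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Read and query only the log groups under the given prefix.
                "Sid": "ReadScopedLogGroups",
                "Effect": "Allow",
                "Action": [
                    "logs:DescribeLogStreams",
                    "logs:FilterLogEvents",
                    "logs:GetLogEvents",
                    "logs:StartQuery",
                ],
                "Resource": log_groups,
            },
            {
                # Assumed here not to support log-group scoping; verify.
                "Sid": "ReadQueryResults",
                "Effect": "Allow",
                "Action": ["logs:DescribeLogGroups", "logs:GetQueryResults"],
                "Resource": "*",
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical account ID and prefix, for illustration only.
    policy = readonly_log_policy("us-east-1", "123456789012", "/ecs/securityv0")
    print(json.dumps(policy, indent=2))
```

The policy deliberately omits ecs:ExecuteCommand, so interactive shell access stays on a separate, audited break-glass role.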

7. CI/CD Plan (MVP Mandatory)

GitHub Actions should deploy automatically to staging on every commit to main; production deploys stay behind manual approval until the pipeline is stable.

MVP pipeline (Hetzner + Compose)

lint-test

Run tests and static checks.

build-publish

Build container images.
Push to registry.

deploy-staging (auto on main)

SSH to staging VM.
Pull latest images.
docker compose up -d.
Run smoke checks.

deploy-prod (manual approval until stable)

Same flow as staging with protected environment approval.
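
The "Run smoke checks" step could be as small as the sketch below, which requests a list of health URLs and reports which ones returned HTTP 200. The endpoint paths are hypothetical, and the document does not specify what the real smoke checks cover.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable, List


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status for a GET request, or 0 if the host is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:  # server answered with 4xx/5xx
        return exc.code
    except OSError:  # covers URLError, timeouts, and connection failures
        return 0


def run_smoke_checks(urls: Iterable[str],
                     fetch: Callable[[str], int] = http_status) -> Dict[str, bool]:
    """Map each URL to True when it answered with HTTP 200."""
    return {url: fetch(url) == 200 for url in urls}


def failed_checks(results: Dict[str, bool]) -> List[str]:
    """List the URLs that did not pass, for the pipeline log."""
    return [url for url, ok in results.items() if not ok]
```

A pipeline step would call `run_smoke_checks` with the staging health URLs and fail the job when `failed_checks` returns a non-empty list. The `fetch` parameter exists so the logic can be exercised without network access.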

Auth and secrets

Prefer OIDC to cloud where applicable.
For VM SSH, use short-lived deploy keys and restricted command scope.

Drift prevention

Store deployment manifests in Git.
Record deployed image digest.
Keep rollback command and previous digest available.
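
A minimal sketch of the digest bookkeeping described above: record each deployed image digest and keep the previous one at hand as the rollback target. The class, its method names, and the registry path in the usage note are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DeployHistory:
    """Ordered list of deployed image digests for one service (oldest first)."""

    service: str
    digests: List[str] = field(default_factory=list)

    def record(self, digest: str) -> None:
        """Append a newly deployed digest, ignoring repeat deploys of the same image."""
        if not digest.startswith("sha256:"):
            raise ValueError(f"expected a sha256 digest, got {digest!r}")
        if not self.digests or self.digests[-1] != digest:
            self.digests.append(digest)

    @property
    def current(self) -> Optional[str]:
        return self.digests[-1] if self.digests else None

    @property
    def previous(self) -> Optional[str]:
        return self.digests[-2] if len(self.digests) >= 2 else None

    def rollback_reference(self, image: str) -> Optional[str]:
        """Digest-pinned image reference to roll back to, if one exists."""
        if self.previous is None:
            return None
        return f"{image}@{self.previous}"
```

For example, after recording two deploys of a hypothetical `registry.example.com/api` image, `rollback_reference` returns the first digest pinned as `image@sha256:...`. Putting that reference in the Compose file and rerunning docker compose up -d restores the earlier build.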

8. Strategic CI/CD Evolution

As platform matures:

Move from SSH-based deploys to GitOps (ECS task defs in Git, or EKS manifests via Argo CD/Flux).
Add canary or blue/green deployment patterns.
Add policy gates for schema migrations and evidence-pack integrity checks.

9. Recommended Plan for SecurityV0

Start with Hetzner Compose MVP for speed, but split DB onto separate node early.
Implement structured logging and correlation IDs before pilot.
Add minimal observability stack now (self-hosted LGTM or managed low-tier).
Enforce auto-deploy via GitHub Actions to staging on each main commit.
Migrate to ECS on EC2 as first production-grade target.
Re-evaluate Fargate vs EKS only when scale/team constraints justify.
Introduce Lambda selectively for bursty connector/trigger jobs.

10. Trigger-Based Reassessment Rules

Re-evaluate platform choice when one or more thresholds are crossed:

10 production tenants
200 connector sync jobs/day
100 GB/day log ingestion
99.9% uptime target with strict recovery objectives
Need multi-region failover

At that point, prioritize managed runtime and managed observability to reduce operations risk.
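
The reassessment rule can be written down as a small check that runs against periodic operating figures. The sketch below uses the thresholds listed above and treats "crossed" as "greater than or equal to", which is an assumption; the field and function names are illustrative.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PlatformSignals:
    """Operating figures compared against the thresholds listed above."""

    production_tenants: int
    connector_sync_jobs_per_day: int
    log_ingestion_gb_per_day: float
    uptime_target_percent: float
    strict_recovery_objectives: bool
    needs_multi_region_failover: bool


def crossed_thresholds(signals: PlatformSignals) -> List[str]:
    """Return the names of every threshold that has been reached."""
    crossed = []
    if signals.production_tenants >= 10:
        crossed.append("production_tenants")
    if signals.connector_sync_jobs_per_day >= 200:
        crossed.append("connector_sync_jobs_per_day")
    if signals.log_ingestion_gb_per_day >= 100:
        crossed.append("log_ingestion_gb_per_day")
    if signals.uptime_target_percent >= 99.9 and signals.strict_recovery_objectives:
        crossed.append("uptime_target")
    if signals.needs_multi_region_failover:
        crossed.append("multi_region_failover")
    return crossed


def should_reassess(signals: PlatformSignals) -> bool:
    """One crossed threshold is enough to trigger a platform review."""
    return bool(crossed_thresholds(signals))
```

A pilot with one tenant and low log volume returns an empty list; reaching any single threshold is enough to trigger the review.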

Sources

AWS ECS pricing: https://aws.amazon.com/ecs/pricing/
AWS EC2 On-Demand pricing: https://aws.amazon.com/ec2/pricing/on-demand/
AWS EKS pricing: https://aws.amazon.com/eks/pricing/
AWS Fargate pricing: https://aws.amazon.com/fargate/pricing/
AWS Lambda pricing: https://aws.amazon.com/lambda/pricing/
AWS Elastic Load Balancing pricing: https://aws.amazon.com/elasticloadbalancing/pricing/
AWS VPC pricing (NAT and IPv4): https://aws.amazon.com/vpc/pricing/
AWS CloudWatch pricing: https://aws.amazon.com/cloudwatch/pricing/
AWS ECS logs to CloudWatch: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/using_awslogs.html
AWS ECS Exec: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-exec.html
Kubernetes kubectl logs: https://kubernetes.io/docs/reference/kubectl/generated/kubectl_logs
GitHub Actions billing: https://docs.github.com/billing/concepts/product-billing/github-actions
GitHub Actions OIDC with AWS: https://docs.github.com/en/actions/deployment/security-hardening-your-deployments/configuring-openid-connect-in-amazon-web-services
Hetzner Cloud pricing: https://www.hetzner.com/cloud
OVHcloud US VPS pricing: https://us.ovhcloud.com/vps/
OVHcloud US locations: https://us.ovhcloud.com/about/global-infrastructure/locations/
Vultr pricing: https://www.vultr.com/pricing/
Vultr datacenter regions: https://www.vultr.com/features/datacenter-regions/
DigitalOcean droplet pricing: https://www.digitalocean.com/products/droplets
DigitalOcean regional availability: https://docs.digitalocean.com/platform/regional-availability/
Contabo location fees: https://contabo.com/en-us/location-fees/
Contabo VPS page: https://contabo.com/en-us/vps-server/
IONOS Cloud Cubes pricing: https://cloud.ionos.com/compute/cloud-cubes
UpCloud pricing (USD): https://upcloud.com/pricing-usd/
UpCloud locations: https://upcloud.com/docs/getting-started/locations/
AWS Lightsail pricing: https://aws.amazon.com/lightsail/pricing/
Grafana Cloud pricing: https://grafana.com/support/plans

Next Action

Status: adopted. The Docker + Colima + GitHub Actions CI/CD model has shipped. Deployment config lives in docker-compose.deploy.yml (Caddy TLS) and .github/workflows/. No further action required.
