Deployment and Cloud Strategy Research
Date: 2026-02-07
Scope: Core platform (api, query, trigger evaluator, evidence generator, ui), database, connectors, scheduling/triggers, logging/observability, and autonomous operations support.
1. Decision Criteria
Primary criteria used for the options below:
- Speed to first pilot (days/weeks, not months)
- Deterministic operability (easy to inspect logs and state via CLI)
- Cost predictability at low volume
- Migration path without re-platforming core code
- Support for autonomous troubleshooting (agents/automation can query logs/metrics with narrow permissions)
2. Deployment Options
Option A: MVP Self-Hosted VM (Hetzner + Docker Compose)
Shape
- One VM for app services (`api`, `ui`, `connectors`, `trigger`, `evidence`) via Docker Compose
- One VM for data plane (`mongodb` + backup job) OR same VM for earliest MVP
- Optional third small VM for observability stack (Loki/Prometheus/Grafana)
Pros
- Fastest setup and lowest fixed cost
- Full root/SSH control for debugging
- Easy to run all services together and iterate architecture quickly
Cons
- You own HA, backups, patching, and security hardening
- Manual scaling and failover
- Higher operational risk for customer-facing production
Fit
- Best for internal demo, design-partner pilot, and rapid product iteration
Option B: AWS ECS on EC2 (Production-leaning, lower complexity than EKS)
Shape
- ECS services for core platform
- Connector workers as separate ECS services or scheduled tasks
- MongoDB remains self-managed initially (EC2) or moved to Atlas later
- CloudWatch logs/metrics + ECS Exec for diagnostics
Pros
- ECS EC2 launch type has no additional ECS control plane charge
- Better IAM, networking, and security controls than single VM
- Easier ops than EKS
Cons
- You still manage EC2 capacity and patching
- Still non-trivial if MongoDB remains self-managed
Fit
- Strong mid-stage target when you want AWS controls without Kubernetes overhead
Option C: AWS ECS on Fargate (Managed container runtime)
Shape
- Core services on ECS Fargate
- Connector jobs as on-demand/scheduled Fargate tasks
- DB likely moved to managed offering for better reliability
Pros
- No node management
- Clean scaling model per task
- Good fit for bursty connectors
Cons
- Can cost more than EC2 at steady state
- Per-task compute pricing needs tight right-sizing
Fit
- Good for production where team bandwidth for infra ops is limited
Option D: AWS EKS (Strategic only if Kubernetes-native operating model is required)
Shape
- Full Kubernetes platform for core services and workers
- GitOps (Argo CD/Flux) + HPA/KEDA + service mesh (optional)
Pros
- Maximum flexibility and ecosystem
- Standardized platform if broader org already runs Kubernetes
Cons
- Highest platform complexity
- EKS cluster fee applies regardless of workload size
Fit
- Use only when there is a clear multi-team/platform reason
Option E: Event-Driven Connectors/Triggers with Lambda (Selective)
Shape
- Keep core API/query services containerized
- Move selected connector ingestion and trigger evaluation jobs to Lambda + EventBridge
Pros
- Excellent for bursty workloads
- Fine-grained cost for sporadic jobs
Cons
- Distributed debugging complexity
- Cold start/runtime constraints for heavy workloads
Fit
- Good complement later, not a full replacement of core graph platform
3. Recommended Phased Strategy
Phase 0 (MVP, now): Hetzner + Docker Compose
Target timeline: 1-3 weeks
Baseline architecture
- VM-1: `api`, `query`, `trigger`, `evidence`, `ui`, connector workers
- VM-2: `mongodb` + nightly encrypted backup + restore test
- Optional VM-3: observability stack if needed
Critical controls for MVP
- Immutable image tags per deploy
- Backup + restore drill (weekly)
- Basic SLO dashboard (API error rate, connector failure rate, sync latency)
- Structured JSON logs with correlation IDs (`tenant_id`, `sync_id`, `entity_id`, `finding_id`)
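The structured-logging control above could be sketched as follows. This is a minimal illustration using only the Python standard library, not the project's actual logging setup; `JsonFormatter` and `build_logger` are hypothetical names, and only the four correlation fields come from the list above.

```python
import json
import logging
from datetime import datetime, timezone

# Correlation fields named in the MVP controls list.
CORRELATION_FIELDS = ("tenant_id", "sync_id", "entity_id", "finding_id")


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Copy only the correlation IDs the caller supplied via `extra=`.
        for field in CORRELATION_FIELDS:
            value = getattr(record, field, None)
            if value is not None:
                payload[field] = value
        return json.dumps(payload, sort_keys=True)


def build_logger(name: str) -> logging.Logger:
    """Return a logger that writes JSON lines to stderr (hypothetical helper)."""
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = build_logger("connector")
    log.info("sync finished", extra={"tenant_id": "t-001", "sync_id": "s-42"})
```

Because every line is a single JSON object, the same output can be filtered by `tenant_id` or `sync_id` whether it is read through `docker logs` on the VM or through a log backend later.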
Why this is acceptable
- Fastest path to pilot while preserving deterministic debugging via SSH + Docker logs
Phase 1 (Pilot to early production): AWS ECS on EC2
Target timeline: after first pilots, before broader customer rollout
Move plan
- Migrate app services from Compose to ECS task definitions
- Keep connectors as separate services/tasks
- Use CloudWatch for centralized logs/metrics
- Enable ECS Exec for break-glass diagnostics
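As a rough sketch of the first and third move-plan items, the function below builds a task definition document for one service with the `awslogs` driver sending container output to CloudWatch. The function name, defaults, and the example registry path are assumptions for illustration; the resulting dictionary follows the general shape ECS accepts when registering a task definition, but sizing and networking would need to be chosen per service.

```python
import json
from typing import Any, Dict, Optional


def ecs_task_definition(
    service: str,
    image: str,
    log_group: str,
    region: str,
    cpu_units: int = 256,        # assumed default, tune per service
    memory_mib: int = 512,       # assumed default, tune per service
    container_port: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a task definition dict for one Compose service (hypothetical helper)."""
    # Enforce the "immutable image tags per deploy" control from Phase 0.
    last_segment = image.rsplit("/", 1)[-1]
    pinned_by_digest = "@sha256:" in image
    has_tag = ":" in last_segment and not last_segment.endswith(":latest")
    if not (pinned_by_digest or has_tag):
        raise ValueError("pin the image to an immutable tag or digest")

    container: Dict[str, Any] = {
        "name": service,
        "image": image,
        "essential": True,
        "cpu": cpu_units,
        "memory": memory_mib,
        "logConfiguration": {
            "logDriver": "awslogs",
            "options": {
                "awslogs-group": log_group,
                "awslogs-region": region,
                "awslogs-stream-prefix": service,
            },
        },
    }
    if container_port is not None:
        container["portMappings"] = [{"containerPort": container_port, "protocol": "tcp"}]

    return {
        "family": service,
        "requiresCompatibilities": ["EC2"],
        "networkMode": "bridge",
        "containerDefinitions": [container],
    }


if __name__ == "__main__":
    # The registry path and tag are placeholders.
    doc = ecs_task_definition(
        "api", "registry.example.com/securityv0/api:2026-02-07-abc123",
        log_group="/ecs/api", region="us-east-1", container_port=8080,
    )
    print(json.dumps(doc, indent=2))
```

Keeping the generated JSON in Git alongside the Compose file gives a reviewable diff for each service as it moves across.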
Reasoning
- Materially better security and operability than raw VMs without full Kubernetes burden
Phase 2 (Production scale): ECS Fargate or EKS by clear trigger
Choose ECS Fargate when
- Team wants managed runtime and predictable operational model
- Workloads are moderate and scaling is service/task oriented
Choose EKS when
- Organization already has Kubernetes platform team
- Need advanced Kubernetes-native controls that ECS cannot reasonably satisfy
Phase 3 (Optimization): Selective Lambda for burst jobs
Use Lambda/EventBridge for:
- scheduled lightweight connectors
- enrichment jobs
- periodic housekeeping/reconciliation
Keep long-running graph/API workloads containerized.
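A scheduled job of this kind might look like the sketch below: a Lambda handler that receives an EventBridge scheduled event and dispatches on the rule name found in the event's `resources` ARN. The rule names and job functions are hypothetical placeholders, and the dispatch-by-rule-name approach is one possible design rather than an established convention of this platform.

```python
from typing import Any, Callable, Dict


def run_housekeeping(scheduled_at: str) -> Dict[str, str]:
    """Placeholder job body (hypothetical); real reconciliation work would go here."""
    return {"job": "housekeeping", "scheduled_at": scheduled_at, "status": "ok"}


def run_enrichment(scheduled_at: str) -> Dict[str, str]:
    """Placeholder job body (hypothetical); real enrichment work would go here."""
    return {"job": "enrichment", "scheduled_at": scheduled_at, "status": "ok"}


# Maps EventBridge rule names (assumed names) to job functions.
JOBS_BY_RULE: Dict[str, Callable[[str], Dict[str, str]]] = {
    "nightly-housekeeping": run_housekeeping,
    "hourly-enrichment": run_enrichment,
}


def handler(event: Dict[str, Any], context: Any) -> Dict[str, str]:
    """Lambda entry point for scheduled rules."""
    # Scheduled events carry the triggering rule ARN in `resources`,
    # in the form arn:aws:events:<region>:<account>:rule/<rule-name>.
    rule_name = event["resources"][0].rsplit("/", 1)[-1]
    job = JOBS_BY_RULE.get(rule_name)
    if job is None:
        raise ValueError(f"no job registered for rule {rule_name!r}")
    return job(event["time"])


if __name__ == "__main__":
    sample_event = {
        "detail-type": "Scheduled Event",
        "source": "aws.events",
        "time": "2026-02-07T00:00:00Z",
        "resources": ["arn:aws:events:us-east-1:123456789012:rule/nightly-housekeeping"],
        "detail": {},
    }
    print(handler(sample_event, None))
```

Each job stays a plain function, so the same code can run from a container worker today and from Lambda later without changes to the job body.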
4. Observability and Logging Strategy
MVP observability (low cost, high control)
Option MVP-Obs-A: Self-hosted LGTM stack
- Loki + Promtail for logs
- Prometheus + Alertmanager for metrics
- Grafana for dashboards
- Optional Tempo for traces
Pros:
- Minimal vendor cost
- Full control
- Easy CLI access (`docker logs`, `logcli`, `promtool`)
Cons:
- You operate it
- Retention and scaling need discipline
Option MVP-Obs-B: Managed SaaS light footprint
- Grafana Cloud or Better Stack for logs/metrics/traces
- Keep app on Hetzner
Pros:
- Lower operational overhead
- Fast setup and team-friendly UI
Cons:
- Ongoing ingest/retention costs
- Vendor dependency
Production observability
Option Prod-Obs-A: AWS-native
- CloudWatch Logs + CloudWatch metrics + alarms
- Optional Amazon Managed Grafana and AMP
Pros:
- IAM-native access control
- Deep ECS/EKS integration
- ECS Exec and CloudWatch CLI improve remote diagnostics
Cons:
- Cost can rise quickly with log volume and query scanning
Option Prod-Obs-B: Hybrid
- Keep CloudWatch for platform logs
- Route application logs/metrics to Grafana Cloud or Better Stack
Pros:
- Better query UX/correlation in some cases
- Can reduce operational toil
Cons:
- Dual tooling and data egress considerations
5. MVP Cost Comparison And Ready-to-Go Estimate
All numbers below are high-level estimates using public list pricing and simple assumptions.
5.1 Assumptions Used
- Region assumptions:
  - Hetzner EU pricing from public cloud page.
  - AWS `us-east-1` style list pricing anchors.
- Workload assumptions:
  - Core services run 24x7.
  - 1 pilot tenant, low traffic.
  - 30 GB/month log ingestion.
  - 300 GB persistent block storage for MongoDB data and snapshots.
- CI/CD assumptions:
  - GitHub Actions private repo includes 2,000 free Linux minutes/month; overage priced at `$0.008/min`.
- Exclusions:
  - VAT/tax, support plans, data egress spikes, incident-response labor.
  - One-time engineering setup cost.
5.2 Unit Cost Anchors (Public Pricing)
- Hetzner Cloud examples:
  - `CPX21` = `EUR 9.49/month`
  - `CPX31` = `EUR 16.49/month`
  - `CPX41` = `EUR 30.49/month`
  - Hetzner Load Balancer = `from EUR 5.39/month`
  - Hetzner Object Storage = `from EUR 4.99/month`
- AWS:
  - ECS on EC2: no additional ECS fee beyond underlying resources.
  - EKS control plane: `$0.10/hour` per cluster.
  - Fargate Linux x86 (on-demand): `vCPU $0.000011244/second`, `memory $0.000001235/GB-second`.
  - EC2 on-demand (reference class):
    - `t2.medium` = `$0.0464/hour`
    - `t2.large` = `$0.0928/hour`
  - Application Load Balancer:
    - base `$0.0225/hour`
    - LCU `$0.008/LCU-hour`
  - CloudWatch Logs examples:
    - ingest `$0.50/GB`
    - archive `$0.03/GB-month`
- GitHub Actions:
  - private repos include 2,000 minutes/month free
  - Linux 2-core overage `$0.008/min`
- Grafana Cloud Pro:
  - `$19/month` platform fee
  - includes 50 GB logs + 50 GB traces
  - beyond included logs/traces `$0.50/GB`
5.3 MVP Option Comparison (Monthly, High-Level)
| MVP Option | What is included | Estimated Monthly Cost |
|---|---|---|
| A. Hetzner Lean | 1x CPX31 all-in-one app+db, object storage backups | ~EUR 21.48 |
| B. Hetzner Ready-to-Go (recommended MVP) | 1x CPX31 app/workers + 1x CPX41 MongoDB + 1x CPX21 observability + LB + object storage | ~EUR 66.85 |
| C. AWS ECS on EC2 (MVP production-like) | 2x t2.medium ECS nodes + 1x t2.large MongoDB + ALB + 300GB EBS + 3 IPv4 + 30GB CloudWatch logs | ~USD 205 to USD 240 |
| D. AWS ECS Fargate + EC2 MongoDB | Fargate core services (3 vCPU, 6 GB) + 1x t2.large MongoDB + ALB + EBS + IPv4 + CloudWatch logs | ~USD 245 to USD 280 |
Notes:
- AWS ranges reflect whether NAT Gateway is needed (`+~USD 33/month` base before traffic processing) and modest ALB LCU variability.
- Option C/D are intentionally sized as practical MVP production baselines, not ultra-minimal single-instance demos.
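The table figures can be reproduced from the unit anchors in section 5.2 with a short calculation. The sketch below is illustrative only. It assumes 730 hours per month, and it uses two prices that are not listed in section 5.2 and should be checked against the AWS pricing pages: an EBS rate of USD 0.08/GB-month and a public IPv4 rate of USD 0.005/hour. It also assumes three IPv4 addresses for option D, since the table does not state a count, and it leaves out NAT Gateway and ALB LCU charges.

```python
HOURS_PER_MONTH = 730  # assumption: average hours in a month

# Unit prices from section 5.2.
HETZNER_EUR = {
    "CPX21": 9.49, "CPX31": 16.49, "CPX41": 30.49,
    "load_balancer": 5.39, "object_storage": 4.99,
}
EC2_USD_PER_HOUR = {"t2.medium": 0.0464, "t2.large": 0.0928}
ALB_BASE_USD_PER_HOUR = 0.0225
CLOUDWATCH_INGEST_USD_PER_GB = 0.50
FARGATE_VCPU_USD_PER_SECOND = 0.000011244
FARGATE_GB_USD_PER_SECOND = 0.000001235

# Assumptions NOT listed in section 5.2.
EBS_USD_PER_GB_MONTH = 0.08   # assumed block storage rate
IPV4_USD_PER_HOUR = 0.005     # assumed public IPv4 rate


def hetzner_lean_eur() -> float:
    """Option A: one CPX31 plus object storage for backups."""
    return round(HETZNER_EUR["CPX31"] + HETZNER_EUR["object_storage"], 2)


def hetzner_ready_to_go_eur() -> float:
    """Option B: CPX31 + CPX41 + CPX21 + load balancer + object storage."""
    return round(sum(HETZNER_EUR.values()), 2)


def aws_shared_usd(ipv4_count: int = 3, ebs_gb: int = 300, log_gb: int = 30) -> float:
    """Parts common to options C and D: MongoDB node, ALB base, EBS, IPv4, logs."""
    return (
        EC2_USD_PER_HOUR["t2.large"] * HOURS_PER_MONTH
        + ALB_BASE_USD_PER_HOUR * HOURS_PER_MONTH
        + ebs_gb * EBS_USD_PER_GB_MONTH
        + ipv4_count * IPV4_USD_PER_HOUR * HOURS_PER_MONTH
        + log_gb * CLOUDWATCH_INGEST_USD_PER_GB
    )


def ecs_on_ec2_usd() -> float:
    """Option C base: shared parts plus two t2.medium ECS nodes."""
    nodes = 2 * EC2_USD_PER_HOUR["t2.medium"] * HOURS_PER_MONTH
    return round(aws_shared_usd() + nodes, 2)


def ecs_fargate_usd(vcpu: float = 3, memory_gb: float = 6) -> float:
    """Option D base: shared parts plus always-on Fargate capacity."""
    seconds = HOURS_PER_MONTH * 3600
    fargate = seconds * (vcpu * FARGATE_VCPU_USD_PER_SECOND
                         + memory_gb * FARGATE_GB_USD_PER_SECOND)
    return round(aws_shared_usd() + fargate, 2)


if __name__ == "__main__":
    print("A (EUR):", hetzner_lean_eur())
    print("B (EUR):", hetzner_ready_to_go_eur())
    print("C base (USD):", ecs_on_ec2_usd())
    print("D base (USD):", ecs_fargate_usd())
```

Under these assumptions the two Hetzner rows match the table exactly, and the AWS base figures land a few dollars under the lower bound of each range, which leaves room for the ALB LCU usage that the table's ranges include.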
5.4 Ready-to-Go Deployed Solution Estimate (Recommended MVP)
Recommended for first customer pilot:
- Platform: Hetzner Compose split-node MVP (`CPX31 + CPX41 + CPX21 + LB + Object Storage`)
- Observability: self-hosted LGTM on the `CPX21` node
- CI/CD: GitHub Actions auto-deploy to staging on every `main` commit, prod via protected approval
Estimated monthly run cost:
- Infrastructure subtotal: ~EUR 66.85/month
- GitHub Actions overage:
  - `0` if under included 2,000 private Linux minutes
  - Example overage (`+1,000 min`) = ~USD 8/month
- Optional managed observability alternative:
  - replace self-hosted observability node with Grafana Cloud Pro (`USD 19/month`), typically reducing ops burden
Planning budget for pilot:
- Target run-rate: `~EUR 67 + USD 0 to 20/month` (depending on CI and observability choice)
- Operationally safe envelope with contingency: ~EUR 90 to 140/month equivalent
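The USD portion of the run-rate depends on two switches: CI minutes above the included quota and whether managed observability is used. A small sketch of that arithmetic follows; the function names are hypothetical, and the EUR part is held at the infrastructure subtotal as in the planning budget line rather than reduced when the observability node is replaced.

```python
from typing import Tuple

INFRA_SUBTOTAL_EUR = 66.85        # from the infrastructure subtotal above
INCLUDED_CI_MINUTES = 2000        # GitHub Actions private-repo quota
CI_OVERAGE_USD_PER_MIN = 0.008    # Linux 2-core overage rate
GRAFANA_CLOUD_PRO_USD = 19.0      # platform fee


def actions_overage_usd(minutes_used: int) -> float:
    """USD charged for CI minutes beyond the included quota."""
    billable = max(0, minutes_used - INCLUDED_CI_MINUTES)
    return round(billable * CI_OVERAGE_USD_PER_MIN, 2)


def pilot_run_rate(minutes_used: int, managed_observability: bool) -> Tuple[float, float]:
    """Return (EUR part, USD part) of the monthly pilot run-rate."""
    usd = actions_overage_usd(minutes_used)
    if managed_observability:
        usd += GRAFANA_CLOUD_PRO_USD
    return INFRA_SUBTOTAL_EUR, round(usd, 2)


if __name__ == "__main__":
    print(pilot_run_rate(minutes_used=1500, managed_observability=False))
    print(pilot_run_rate(minutes_used=3000, managed_observability=True))
```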
5.5 Hetzner-Range Competitors (US-Focused)
Reference date: 2026-02-07. Prices are entry-level and can change.
| Provider | Entry Price (Approx) | US Presence | Fit Notes |
|---|---|---|---|
| OVHcloud US VPS | from $4.20/month | US regions including Hillsboro and Vint Hill | Strong price/perf, daily backups and anti-DDoS included in VPS range. |
| Vultr Cloud Compute | from $2.50/month (IPv6-only) or $5/month standard entry | Broad US city coverage (e.g., Atlanta, Chicago, Dallas, Los Angeles, Miami, New York area, Seattle, SF Bay Area) | Good balance of low price and many US regions. |
| DigitalOcean Droplets | from $6/month for 1GB basic droplets | US datacenters in NYC, SFO, ATL | Usually higher than Hetzner at equal specs, but simpler operations and good DX. |
| Contabo Cloud VPS | Cloud VPS 10 total shown around EUR 5.45 to 5.90 with US location fees | US East, US West, US Central | Very low monthly price; verify performance consistency for production workloads. |
| IONOS Cloud Cubes | from $5.76 per 30 days | Newark, NJ (US) | Cost-effective and simple billing model for small footprints. |
| UpCloud Developer Plans | from $3.50/month (USD pricing) | US in Chicago, New York, San Jose | Competitive low end with good US footprint and predictable plans. |
| AWS Lightsail | from $3.50/month (IPv6-only) and $5/month standard entry | Multiple US regions | Simple AWS entry point; costs can rise once add-ons/managed services are added. |
Selection guidance for SecurityV0 MVP:
- Choose Vultr when you want many US regions with low entry cost and simple VM operations.
- Choose OVHcloud US when you prioritize cost and built-in VPS protections.
- Choose DigitalOcean when developer workflow and operational simplicity are more important than minimum price.
- Choose Contabo/IONOS/UpCloud when monthly floor cost is the main driver and you can validate workload behavior early.
6. CLI and Agent Access Requirements
Required for both human and autonomous troubleshooting:
- Centralized logs accessible by CLI
- Narrow-scoped read-only credentials for diagnostic agents
- Correlation IDs across services and jobs
- Ability to access runtime shell only via auditable controls
CLI paths by platform
- Compose/VM: `ssh`, `docker compose logs`, `docker logs`
- ECS: `aws logs tail`, `aws logs start-query`, `aws ecs execute-command`
- EKS: `kubectl logs`, `kubectl describe`, `kubectl top`
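For agents that need one entry point regardless of platform, the CLI paths above could be wrapped in a small command builder. The sketch below only constructs argument lists and does not execute anything; the function name and the meaning of `target` per platform are assumptions, and `aws logs tail` is assumed to come from AWS CLI v2.

```python
import shlex
from typing import List


def log_tail_command(platform: str, target: str, since: str = "1h") -> List[str]:
    """Build the argv for reading recent logs on a given platform (hypothetical helper)."""
    if platform == "compose":
        # target: Compose service name, run on the VM after `ssh`.
        return ["docker", "compose", "logs", "--since", since, "--no-color", target]
    if platform == "ecs":
        # target: CloudWatch log group name.
        return ["aws", "logs", "tail", target, "--since", since, "--format", "short"]
    if platform == "eks":
        # target: workload reference such as deployment/api.
        return ["kubectl", "logs", target, f"--since={since}", "--all-containers=true"]
    raise ValueError(f"unknown platform: {platform!r}")


if __name__ == "__main__":
    for platform, target in [("compose", "api"), ("ecs", "/ecs/api"), ("eks", "deployment/api")]:
        print(shlex.join(log_tail_command(platform, target)))
```

Returning a list rather than a shell string keeps the command safe to pass to `subprocess.run` without shell interpolation, which matters when the caller is an automated agent.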
Security model
- No shared root credentials
- Role-based temporary credentials (OIDC where possible)
- Full audit trail for interactive access (ECS Exec supports CloudTrail auditing)
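On AWS, the narrow-scoped read-only credential for a diagnostic agent could be expressed as an IAM policy limited to one log group. The sketch below generates such a policy document. It is an assumption-laden starting point, not a reviewed policy: the function name is hypothetical, and the split between log-group-scoped actions and actions granted on `*` reflects the author's understanding that the query-result actions do not accept a log-group resource, which should be verified against the current IAM reference before use.

```python
import json
from typing import Any, Dict


def readonly_logs_policy(region: str, account_id: str, log_group: str) -> Dict[str, Any]:
    """Build a read-only CloudWatch Logs policy for one log group (hypothetical helper)."""
    group_arn = f"arn:aws:logs:{region}:{account_id}:log-group:{log_group}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadOneLogGroup",
                "Effect": "Allow",
                "Action": [
                    "logs:FilterLogEvents",  # used when tailing or filtering logs
                    "logs:GetLogEvents",
                    "logs:StartQuery",       # Logs Insights queries
                ],
                # Cover both the group itself and its streams.
                "Resource": [group_arn, f"{group_arn}:*"],
            },
            {
                "Sid": "ReadQueryResults",
                "Effect": "Allow",
                # Assumed to require a wildcard resource; verify before use.
                "Action": ["logs:GetQueryResults", "logs:StopQuery"],
                "Resource": "*",
            },
        ],
    }


if __name__ == "__main__":
    # The account ID and log group name are placeholders.
    print(json.dumps(readonly_logs_policy("us-east-1", "123456789012", "/ecs/api"), indent=2))
```

Attaching a policy like this to a role assumed with temporary credentials keeps the agent unable to write, delete, or read log groups outside its scope.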
7. CI/CD Plan (MVP Mandatory)
GitHub Actions should perform automatic deployment on commit to main.
MVP pipeline (Hetzner + Compose)
`lint-test`
- Run tests and static checks.
`build-publish`
- Build container images.
- Push to registry.
`deploy-staging` (auto on `main`)
- SSH to staging VM.
- Pull latest images.
- `docker compose up -d`.
- Run smoke checks.
`deploy-prod` (manual approval until stable)
- Same flow as staging with protected environment approval.
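The smoke-check step at the end of each deploy job could be as small as the sketch below: probe a list of health URLs with retries and fail the job if any stay unhealthy. The function names, retry counts, and the example URL are assumptions; the probe is injectable so the retry logic can be exercised without a network.

```python
import sys
import time
import urllib.error
import urllib.request
from typing import Callable, Iterable, List


def http_ok(url: str, timeout: float = 5.0) -> bool:
    """Return True when the URL answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Covers HTTP error statuses, refused connections, and timeouts.
        return False


def smoke_check(
    urls: Iterable[str],
    probe: Callable[[str], bool] = http_ok,
    attempts: int = 5,
    delay_seconds: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Return the URLs still failing after all attempts; an empty list means pass."""
    pending = list(urls)
    for attempt in range(attempts):
        # Re-probe only the endpoints that have not passed yet.
        pending = [url for url in pending if not probe(url)]
        if not pending:
            break
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return pending


if __name__ == "__main__":
    # Placeholder endpoint; pass real health URLs as arguments in the pipeline.
    targets = sys.argv[1:] or ["http://localhost:8080/health"]
    failing = smoke_check(targets)
    if failing:
        print("smoke check failed:", ", ".join(failing))
        sys.exit(1)
    print("smoke check passed")
```

A non-zero exit code is what lets the GitHub Actions job stop before `deploy-prod` is offered for approval.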
Auth and secrets
- Prefer OIDC to cloud where applicable.
- For VM SSH, use short-lived deploy keys and restricted command scope.
Drift prevention
- Store deployment manifests in Git.
- Record deployed image digest.
- Keep rollback command and previous digest available.
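One lightweight way to keep the previous digest available is a per-service deploy history file written by the pipeline. The sketch below is a hypothetical illustration of that idea; the file name, its JSON layout, and the digests shown are assumptions, not an existing artifact of this project.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List, Optional


def _load(history_path: Path) -> Dict[str, List[str]]:
    """Read the history file, treating a missing file as empty."""
    if history_path.exists():
        return json.loads(history_path.read_text())
    return {}


def record_deploy(history_path: Path, service: str, digest: str) -> None:
    """Append the deployed image digest for a service."""
    history = _load(history_path)
    digests = history.setdefault(service, [])
    # Re-deploying the same digest should not push out the rollback target.
    if not digests or digests[-1] != digest:
        digests.append(digest)
    history_path.write_text(json.dumps(history, indent=2))


def rollback_target(history_path: Path, service: str) -> Optional[str]:
    """Return the digest deployed before the current one, if any."""
    digests = _load(history_path).get(service, [])
    return digests[-2] if len(digests) >= 2 else None


if __name__ == "__main__":
    path = Path(tempfile.mkdtemp()) / "deploy-history.json"
    record_deploy(path, "api", "sha256:1111")  # placeholder digests
    record_deploy(path, "api", "sha256:2222")
    print("rollback to:", rollback_target(path, "api"))
```

Committing this file, or storing it next to the Compose manifests, gives both humans and agents a single place to look up what is running and what to roll back to.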
8. Strategic CI/CD Evolution
As the platform matures:
- Move from SSH-based deploys to GitOps (ECS task defs in Git, or EKS manifests via Argo CD/Flux).
- Add canary or blue/green deployment patterns.
- Add policy gates for schema migrations and evidence-pack integrity checks.
9. Recommended Plan for SecurityV0
- Start with Hetzner Compose MVP for speed, but split DB onto separate node early.
- Implement structured logging and correlation IDs before pilot.
- Add minimal observability stack now (self-hosted LGTM or managed low-tier).
- Enforce auto-deploy via GitHub Actions to staging on each `main` commit.
- Migrate to ECS on EC2 as first production-grade target.
- Re-evaluate Fargate vs EKS only when scale/team constraints justify.
- Introduce Lambda selectively for bursty connector/trigger jobs.
10. Trigger-Based Reassessment Rules
Re-evaluate platform choice when one or more thresholds are crossed:
- More than 10 production tenants
- More than 200 connector sync jobs/day
- More than 100 GB/day log ingestion
- Uptime target above 99.9% with strict recovery objectives
- Need multi-region failover
At that point, prioritize managed runtime and managed observability to reduce operations risk.
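The reassessment rules lend themselves to an automated check that can run against monthly metrics. The sketch below assumes each numeric threshold means "strictly greater than" the stated value; the class and field names are hypothetical, and where the input numbers come from is left open.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PlatformSnapshot:
    """Current operating figures (hypothetical structure)."""
    production_tenants: int
    connector_sync_jobs_per_day: int
    log_ingestion_gb_per_day: float
    uptime_target_percent: float
    needs_multi_region_failover: bool = False


def crossed_thresholds(snapshot: PlatformSnapshot) -> List[str]:
    """Return a label for every reassessment rule the snapshot crosses."""
    # Assumption: each numeric rule is a strict "greater than" comparison.
    checks = [
        ("production tenants > 10", snapshot.production_tenants > 10),
        ("connector sync jobs/day > 200", snapshot.connector_sync_jobs_per_day > 200),
        ("log ingestion GB/day > 100", snapshot.log_ingestion_gb_per_day > 100),
        ("uptime target > 99.9%", snapshot.uptime_target_percent > 99.9),
        ("multi-region failover needed", snapshot.needs_multi_region_failover),
    ]
    return [label for label, crossed in checks if crossed]


def should_reassess(snapshot: PlatformSnapshot) -> bool:
    """True when one or more thresholds are crossed."""
    return bool(crossed_thresholds(snapshot))


if __name__ == "__main__":
    pilot = PlatformSnapshot(
        production_tenants=1,
        connector_sync_jobs_per_day=24,
        log_ingestion_gb_per_day=1.0,
        uptime_target_percent=99.5,
    )
    print(should_reassess(pilot), crossed_thresholds(pilot))
```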
Sources
- AWS ECS pricing: https://aws.amazon.com/ecs/pricing/
- AWS EC2 On-Demand pricing: https://aws.amazon.com/ec2/pricing/on-demand/
- AWS EKS pricing: https://aws.amazon.com/eks/pricing/
- AWS Fargate pricing: https://aws.amazon.com/fargate/pricing/
- AWS Lambda pricing: https://aws.amazon.com/lambda/pricing/
- AWS Elastic Load Balancing pricing: https://aws.amazon.com/elasticloadbalancing/pricing/
- AWS VPC pricing (NAT and IPv4): https://aws.amazon.com/vpc/pricing/
- AWS CloudWatch pricing: https://aws.amazon.com/cloudwatch/pricing/
- AWS ECS logs to CloudWatch: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/using_awslogs.html
- AWS ECS Exec: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-exec.html
- Kubernetes `kubectl logs`: https://kubernetes.io/docs/reference/kubectl/generated/kubectl_logs
- GitHub Actions billing: https://docs.github.com/billing/concepts/product-billing/github-actions
- GitHub Actions OIDC with AWS: https://docs.github.com/en/actions/deployment/security-hardening-your-deployments/configuring-openid-connect-in-amazon-web-services
- Hetzner Cloud pricing: https://www.hetzner.com/cloud
- OVHcloud US VPS pricing: https://us.ovhcloud.com/vps/
- OVHcloud US locations: https://us.ovhcloud.com/about/global-infrastructure/locations/
- Vultr pricing: https://www.vultr.com/pricing/
- Vultr datacenter regions: https://www.vultr.com/features/datacenter-regions/
- DigitalOcean droplet pricing: https://www.digitalocean.com/products/droplets
- DigitalOcean regional availability: https://docs.digitalocean.com/platform/regional-availability/
- Contabo location fees: https://contabo.com/en-us/location-fees/
- Contabo VPS page: https://contabo.com/en-us/vps-server/
- IONOS Cloud Cubes pricing: https://cloud.ionos.com/compute/cloud-cubes
- UpCloud pricing (USD): https://upcloud.com/pricing-usd/
- UpCloud locations: https://upcloud.com/docs/getting-started/locations/
- AWS Lightsail pricing: https://aws.amazon.com/lightsail/pricing/
- Grafana Cloud pricing: https://grafana.com/support/plans
Next Action
Status: adopted — shipped
Docker + Colima + GitHub Actions CI/CD model adopted. Deployment config lives in docker-compose.deploy.yml (Caddy TLS) and .github/workflows/. No further action required.