Add dual-mode deployment architecture addendum for P1 (route)
Docker Compose self-hosted mode, install script, auth abstraction, data layer abstraction (SQS→pgmq, Cognito→local JWT, S3→local FS), Caddy auto-TLS, upgrade path, self-hosted BDD specs. 16 story points additional effort. Template for all 6 products.
products/01-llm-cost-router/architecture/dual-mode-addendum.md
# dd0c/route — Dual-Mode Deployment Architecture Addendum

**Date:** March 1, 2026
**Scope:** Self-hosted deployment mode alongside existing cloud-managed architecture

---

## 1. Deployment Modes

dd0c/route supports two deployment modes. The core business logic (proxy engine, router brain, analytics pipeline) is identical in both. Only the infrastructure and auth layers differ.

| Aspect | Cloud-Managed | Self-Hosted |
|--------|---------------|-------------|
| Deployment | ECS Fargate + CDK | Docker Compose |
| Auth | GitHub OAuth + Cognito JWT | Local auth (bcrypt + JWT) |
| Database | RDS PostgreSQL + ElastiCache Redis | PostgreSQL + Redis containers |
| Telemetry DB | TimescaleDB on RDS | TimescaleDB container |
| CDN | CloudFront + S3 | Caddy reverse proxy (auto-TLS) |
| Billing | Stripe Checkout | License key or honor system |
| Analytics | PostHog Cloud | PostHog self-hosted (optional) |
| Updates | Automatic (ECS rolling) | `docker compose pull && docker compose up -d` |
| Monitoring | CloudWatch + PagerDuty | Grafana + Prometheus (bundled) |

## 2. Docker Compose (Self-Hosted)

```yaml
# docker-compose.yml: dd0c/route self-hosted

services:
  proxy:
    image: ghcr.io/dd0c/route-proxy:latest
    ports:
      - "8080:8080"  # Proxy endpoint
    environment:
      - DATABASE_URL=postgres://dd0c:${DB_PASSWORD}@postgres:5432/dd0c
      - REDIS_URL=redis://redis:6379
      - TIMESCALE_URL=postgres://dd0c:${DB_PASSWORD}@timescaledb:5432/dd0c_telemetry
      - AUTH_MODE=local        # local | oauth
      - GOVERNANCE_MODE=audit  # strict | audit
    depends_on:
      postgres:
        condition: service_healthy
      timescaledb:
        condition: service_healthy
      redis:
        condition: service_healthy
    restart: unless-stopped

  api:
    image: ghcr.io/dd0c/route-api:latest
    ports:
      - "3000:3000"  # Dashboard API
    environment:
      - DATABASE_URL=postgres://dd0c:${DB_PASSWORD}@postgres:5432/dd0c
      - REDIS_URL=redis://redis:6379
      - TIMESCALE_URL=postgres://dd0c:${DB_PASSWORD}@timescaledb:5432/dd0c_telemetry
      - AUTH_MODE=local
      - JWT_SECRET=${JWT_SECRET}
    depends_on:
      postgres:
        condition: service_healthy
      timescaledb:
        condition: service_healthy
      redis:
        condition: service_healthy

  worker:
    image: ghcr.io/dd0c/route-worker:latest
    environment:
      - DATABASE_URL=postgres://dd0c:${DB_PASSWORD}@postgres:5432/dd0c
      - TIMESCALE_URL=postgres://dd0c:${DB_PASSWORD}@timescaledb:5432/dd0c_telemetry
      - SLACK_WEBHOOK_URL=${SLACK_WEBHOOK_URL:-}
      - SMTP_URL=${SMTP_URL:-}
    depends_on:
      postgres:
        condition: service_healthy
      timescaledb:
        condition: service_healthy

  dashboard:
    image: ghcr.io/dd0c/route-dashboard:latest
    ports:
      - "3001:80"  # Static SPA
    restart: unless-stopped

  postgres:
    image: postgres:16-alpine
    volumes:
      - pg_data:/var/lib/postgresql/data
      - ./migrations:/docker-entrypoint-initdb.d
    environment:
      - POSTGRES_USER=dd0c
      - POSTGRES_PASSWORD=${DB_PASSWORD}
      - POSTGRES_DB=dd0c
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U dd0c"]
      interval: 5s
      timeout: 3s
      retries: 5

  timescaledb:
    image: timescale/timescaledb:latest-pg16
    volumes:
      - ts_data:/var/lib/postgresql/data
      - ./migrations/timescale:/docker-entrypoint-initdb.d
    environment:
      - POSTGRES_USER=dd0c
      - POSTGRES_PASSWORD=${DB_PASSWORD}
      - POSTGRES_DB=dd0c_telemetry
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U dd0c"]
      interval: 5s
      timeout: 3s
      retries: 5

  redis:
    image: redis:7-alpine
    volumes:
      - redis_data:/data
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 3s
      retries: 5

  caddy:
    image: caddy:2-alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile
      - caddy_data:/data
    depends_on:
      - proxy
      - api
      - dashboard

volumes:
  pg_data:
  ts_data:
  redis_data:
  caddy_data:
```
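
The worker receives `SLACK_WEBHOOK_URL=${SLACK_WEBHOOK_URL:-}` and `SMTP_URL=${SMTP_URL:-}`, so when the operator leaves them out of `.env` the variables are still set inside the container, just empty. A minimal sketch of how the worker might read its settings with that in mind follows; `WorkerSettings`, `MissingVar`, and `load_worker_settings` are hypothetical names for illustration, not taken from the dd0c codebase, and the lookup function is injected so the logic can be exercised without touching the real process environment.

```rust
use std::fmt;

/// Error for a required variable that is absent or blank (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct MissingVar(pub String);

impl fmt::Display for MissingVar {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "required environment variable {} is not set", self.0)
    }
}

/// Hypothetical settings struct mirroring the worker's compose environment.
#[derive(Debug, PartialEq)]
pub struct WorkerSettings {
    pub database_url: String,
    pub timescale_url: String,
    pub slack_webhook_url: Option<String>,
    pub smtp_url: Option<String>,
}

/// Treats an unset variable and an empty or whitespace-only one the same way.
fn optional(lookup: &dyn Fn(&str) -> Option<String>, name: &str) -> Option<String> {
    lookup(name).filter(|v| !v.trim().is_empty())
}

/// Like `optional`, but a missing or blank value is an error.
fn required(lookup: &dyn Fn(&str) -> Option<String>, name: &str) -> Result<String, MissingVar> {
    optional(lookup, name).ok_or_else(|| MissingVar(name.to_string()))
}

/// Builds the settings from any key-to-value lookup function.
pub fn load_worker_settings(
    lookup: &dyn Fn(&str) -> Option<String>,
) -> Result<WorkerSettings, MissingVar> {
    Ok(WorkerSettings {
        database_url: required(lookup, "DATABASE_URL")?,
        timescale_url: required(lookup, "TIMESCALE_URL")?,
        slack_webhook_url: optional(lookup, "SLACK_WEBHOOK_URL"),
        smtp_url: optional(lookup, "SMTP_URL"),
    })
}
```

In a real binary the lookup would typically be `&|k: &str| std::env::var(k).ok()`. Mapping blank values to `None` keeps the alerting code from trying to post to an empty webhook URL.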

## 3. Install Script

```bash
#!/usr/bin/env bash
# install.sh: dd0c/route self-hosted installer
set -euo pipefail

echo "🚀 dd0c/route: Self-Hosted Installer"
echo "======================================"

# Check prerequisites
command -v docker >/dev/null 2>&1 || { echo "❌ Docker required. Install: https://docs.docker.com/get-docker/"; exit 1; }
# Query the Compose V2 plugin directly; `command -v` cannot detect Docker subcommands
docker compose version >/dev/null 2>&1 || { echo "❌ Docker Compose V2 required."; exit 1; }

# Create directory
DD0C_DIR="${DD0C_DIR:-$HOME/.dd0c}"
mkdir -p "$DD0C_DIR"
cd "$DD0C_DIR"

# Generate secrets and write .env on first install only. Regenerating
# DB_PASSWORD on a re-run would lock the services out of the existing database.
if [ ! -f .env ]; then
  DB_PASSWORD=$(openssl rand -hex 16)
  JWT_SECRET=$(openssl rand -hex 32)

  cat > .env << ENVEOF
DB_PASSWORD=$DB_PASSWORD
JWT_SECRET=$JWT_SECRET
# Optional: Slack webhook for alerts
# SLACK_WEBHOOK_URL=https://hooks.slack.com/services/...
# Optional: SMTP for email digests
# SMTP_URL=smtp://user:pass@smtp.example.com:587
ENVEOF
  chmod 600 .env
fi

# Download compose file (-f makes curl fail on HTTP errors instead of saving the error page)
curl -fsSL https://raw.githubusercontent.com/dd0c/route/main/docker-compose.yml -o docker-compose.yml
curl -fsSL https://raw.githubusercontent.com/dd0c/route/main/Caddyfile -o Caddyfile

# Pull and start
docker compose pull
docker compose up -d

echo ""
echo "✅ dd0c/route is running!"
echo ""
echo "  Proxy:     http://localhost:8080/v1/chat/completions"
echo "  Dashboard: http://localhost:3001"
echo "  API:       http://localhost:3000"
echo ""
echo "  Create your first account (the returned JWT lets you create API keys):"
echo "  curl -X POST http://localhost:3000/api/auth/local/signup \\"
echo "    -H 'Content-Type: application/json' \\"
echo "    -d '{\"email\":\"admin@localhost\",\"password\":\"changeme\"}'"
echo ""
echo "  Data stored in: $DD0C_DIR"
echo "  Upgrade: cd $DD0C_DIR && docker compose pull && docker compose up -d"
```
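
The installer sizes its secrets in bytes: `openssl rand -hex 16` yields 16 random bytes (128 bits) printed as 32 hex characters, and `-hex 32` yields 256 bits as 64 characters. The sketch below produces output of the same shape in Rust using only the standard library. It assumes a Unix-like host where `/dev/urandom` exists, and `hex_encode` and `random_hex` are illustrative names rather than part of dd0c/route.

```rust
use std::fs::File;
use std::io::{self, Read};

/// Lower-case hex encoding, two characters per input byte.
pub fn hex_encode(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

/// Reads `n_bytes` from the kernel random source and hex-encodes them.
/// Assumption: /dev/urandom is available (Linux, macOS, BSD).
pub fn random_hex(n_bytes: usize) -> io::Result<String> {
    let mut buf = vec![0u8; n_bytes];
    File::open("/dev/urandom")?.read_exact(&mut buf)?;
    Ok(hex_encode(&buf))
}
```

Calling `random_hex(32)` gives a 64-character string suitable as an HS256 signing key of the same strength as the installer's `JWT_SECRET`.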

## 4. Auth Abstraction Layer

The API authenticates through an `AuthProvider` trait; the implementation is selected at startup from `AUTH_MODE`:

```rust
// src/auth/mod.rs

// A trait with plain `async fn` methods cannot be used as `dyn AuthProvider`.
// The `async_trait` crate is one common way to make the boxed factory below work.
use async_trait::async_trait;

pub enum AuthMode {
    Local, // bcrypt passwords + local JWT
    OAuth, // GitHub OAuth + Cognito JWT
}

#[async_trait]
pub trait AuthProvider: Send + Sync {
    async fn authenticate(&self, req: &Request) -> Result<AuthContext, AuthError>;
    async fn create_user(&self, email: &str, password: Option<&str>) -> Result<User, AuthError>;
}

pub struct LocalAuthProvider { /* PostgreSQL-backed */ }
pub struct OAuthProvider { /* GitHub + Cognito */ }

#[async_trait]
impl AuthProvider for LocalAuthProvider {
    async fn authenticate(&self, req: &Request) -> Result<AuthContext, AuthError> {
        // Extract JWT from Authorization header
        // Verify with local JWT_SECRET (HS256)
        // Return AuthContext with org_id, role
        todo!()
    }

    async fn create_user(&self, email: &str, password: Option<&str>) -> Result<User, AuthError> {
        // Hash the password with bcrypt and store the user in PostgreSQL
        todo!()
    }
}

// OAuthProvider implements the same trait (omitted here).

// Factory
pub fn create_auth_provider(mode: AuthMode, config: &Config) -> Box<dyn AuthProvider> {
    match mode {
        AuthMode::Local => Box::new(LocalAuthProvider::new(config)),
        AuthMode::OAuth => Box::new(OAuthProvider::new(config)),
    }
}
```
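
The factory above takes an `AuthMode`, while the compose file supplies `AUTH_MODE` as a string (`local | oauth`). One way to bridge the two is a `FromStr` implementation, sketched below with the standard library only. The `UnknownAuthMode` error type and the choice to trim whitespace and ignore case are assumptions made for this example; the addendum does not specify how unknown values are handled.

```rust
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AuthMode {
    Local, // bcrypt passwords + local JWT
    OAuth, // GitHub OAuth + Cognito JWT
}

/// Hypothetical error carrying the rejected (normalized) value.
#[derive(Debug, PartialEq, Eq)]
pub struct UnknownAuthMode(pub String);

impl FromStr for AuthMode {
    type Err = UnknownAuthMode;

    /// Accepts "local" or "oauth"; trimming and case-folding are assumptions.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim().to_ascii_lowercase().as_str() {
            "local" => Ok(AuthMode::Local),
            "oauth" => Ok(AuthMode::OAuth),
            other => Err(UnknownAuthMode(other.to_string())),
        }
    }
}
```

With this in place, startup code can write `std::env::var("AUTH_MODE")` followed by `.parse::<AuthMode>()` and fail fast on a typo instead of silently falling back to one of the modes.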

## 5. Data Layer Abstraction

All 6 dd0c products must abstract the data layer so that self-hosted mode runs on PostgreSQL and local services, with no AWS dependencies (no DynamoDB, no Cognito, no SQS):

| Cloud Service | Self-Hosted Replacement |
|---------------|-------------------------|
| DynamoDB | PostgreSQL (same schema, different driver) |
| SQS FIFO | PostgreSQL LISTEN/NOTIFY + pgmq |
| Cognito | Local JWT (HS256) |
| EventBridge | Cron + webhook polling |
| S3 | Local filesystem or MinIO |
| CloudFront | Caddy reverse proxy |
| SES | SMTP relay |
| KMS | Local AES-256-GCM with key file |

```rust
// src/data/mod.rs

// As with AuthProvider, `async_trait` is one way to keep these traits usable as `dyn` objects.
use async_trait::async_trait;

#[async_trait]
pub trait EventQueue: Send + Sync {
    async fn publish(&self, event: Event) -> Result<()>;
    async fn consume(&self, batch_size: usize) -> Result<Vec<Event>>;
}

pub struct SqsFifoQueue { /* AWS SQS */ }
pub struct PgmqQueue { /* PostgreSQL pgmq */ }

#[async_trait]
pub trait ObjectStore: Send + Sync {
    async fn put(&self, key: &str, data: &[u8]) -> Result<()>;
    async fn get(&self, key: &str) -> Result<Vec<u8>>;
}

pub struct S3Store { /* AWS S3 */ }
pub struct LocalFsStore { /* Local filesystem */ }
```
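
To make the S3 to local filesystem swap concrete, here is a minimal sketch of what a `LocalFsStore` could look like. It is deliberately synchronous and uses only the standard library, so it does not implement the async `ObjectStore` trait as written; the `root` field, the `resolve` helper, and the key validation rules are assumptions for illustration. The validation matters because S3 keys are opaque strings, whereas on a filesystem a key such as `../../etc/passwd` would escape the data directory.

```rust
use std::fs;
use std::io::{self, ErrorKind};
use std::path::{Component, Path, PathBuf};

/// Synchronous, std-only stand-in for an object store on local disk.
pub struct LocalFsStore {
    root: PathBuf, // directory that holds all objects (hypothetical field)
}

impl LocalFsStore {
    pub fn new(root: impl Into<PathBuf>) -> Self {
        Self { root: root.into() }
    }

    /// Maps an object key to a path under `root`.
    /// Rejects empty keys, absolute paths, and any `.` prefix or `..` segment.
    fn resolve(&self, key: &str) -> io::Result<PathBuf> {
        let rel = Path::new(key);
        let safe = !key.is_empty()
            && rel.components().all(|c| matches!(c, Component::Normal(_)));
        if !safe {
            return Err(io::Error::new(ErrorKind::InvalidInput, "invalid object key"));
        }
        Ok(self.root.join(rel))
    }

    /// Writes the object, creating intermediate directories as needed.
    pub fn put(&self, key: &str, data: &[u8]) -> io::Result<()> {
        let path = self.resolve(key)?;
        if let Some(parent) = path.parent() {
            fs::create_dir_all(parent)?;
        }
        fs::write(path, data)
    }

    /// Reads the whole object into memory.
    pub fn get(&self, key: &str) -> io::Result<Vec<u8>> {
        fs::read(self.resolve(key)?)
    }
}
```

An async version would wrap the same logic behind the `ObjectStore` trait, for example by delegating to an async filesystem API from whichever runtime the services use.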

## 6. Upgrade Path

Self-hosted users need a safe upgrade mechanism:

```bash
# Upgrade script (run this instead of a bare `docker compose pull` so migrations are applied before the restart)
# 1. Pull new images
docker compose pull

# 2. Run migrations (idempotent)
docker compose run --rm api migrate

# 3. Recreate containers whose image or configuration changed
docker compose up -d --remove-orphans

# 4. Health check
curl -sf http://localhost:8080/health || echo "⚠️ Proxy unhealthy after upgrade"
curl -sf http://localhost:3000/health || echo "⚠️ API unhealthy after upgrade"
```
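
The health check in step 4 probes each endpoint once, immediately after the containers are recreated, so a service that is still starting can be reported as unhealthy. A bounded retry loop is one way to soften that. The sketch below shows the idea in Rust with the standard library only; `wait_until_healthy` is a hypothetical helper, and the probe is passed in as a closure so the retry logic stays independent of how the HTTP request is made.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Calls `probe` up to `max_attempts` times, sleeping `delay` after each failure
/// except the last. Returns the 1-based attempt that succeeded, or `None`.
pub fn wait_until_healthy<F>(mut probe: F, max_attempts: u32, delay: Duration) -> Option<u32>
where
    F: FnMut() -> bool,
{
    for attempt in 1..=max_attempts {
        if probe() {
            return Some(attempt);
        }
        if attempt < max_attempts {
            sleep(delay);
        }
    }
    None
}
```

For example, 12 attempts with a 5 second delay give the stack roughly a minute to settle before the upgrade is declared failed.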

## 7. Self-Hosted BDD Acceptance Specs

```gherkin
Feature: Self-Hosted Installation

  Scenario: Fresh install via install script
    Given a Linux host with Docker and Docker Compose installed
    When the user runs curl -sSL install.dd0c.dev | bash
    Then docker-compose.yml is downloaded to ~/.dd0c
    And .env is generated with random DB_PASSWORD and JWT_SECRET
    And all containers start and pass health checks within 60 seconds
    And the proxy responds to GET /health with 200

  Scenario: Local auth signup and API key generation
    Given dd0c/route is running in self-hosted mode (AUTH_MODE=local)
    When the user POSTs /api/auth/local/signup with email and password
    Then a user account is created with bcrypt-hashed password
    And a JWT is returned (HS256, signed with JWT_SECRET)
    And the user can create an API key via /api/orgs/{id}/api-keys

  Scenario: Upgrade preserves data
    Given dd0c/route is running with existing routing rules and telemetry
    When the user runs docker compose pull && docker compose up -d
    Then all routing rules are preserved
    And all telemetry data is preserved
    And the proxy resumes routing within 10 seconds

  Scenario: Self-hosted works without internet after initial pull
    Given all Docker images are cached locally
    When the host loses internet connectivity
    Then the proxy continues routing requests
    And the dashboard continues serving
    And cost tables use the last cached version

  Scenario: Caddy auto-TLS with custom domain
    Given the Caddyfile is configured with domain "route.example.com"
    And DNS points to the host
    When Caddy starts
    Then a Let's Encrypt TLS certificate is automatically provisioned
    And HTTPS is served on port 443

  Scenario: PostgreSQL data persists across restarts
    Given routing rules and telemetry exist in PostgreSQL
    When docker compose down && docker compose up -d is run
    Then all data is preserved via named volumes
```

## 8. Impact on Existing Epics

| Epic | Change Required | Effort |
|------|-----------------|--------|
| 1 (Proxy) | None (pure Rust, no AWS deps) | 0 |
| 2 (Router) | None (in-memory, no AWS deps) | 0 |
| 3 (Analytics) | Add pgmq as alternative to SQS | 2 pts |
| 4 (Dashboard API) | Add LocalAuthProvider, abstract KMS | 3 pts |
| 5 (Dashboard UI) | Add local login form (email/password) | 2 pts |
| 6 (Shadow CLI) | None (already runs locally) | 0 |
| 7 (Slack/Email) | SMTP fallback for SES | 1 pt |
| 8 (Infrastructure) | New: docker-compose.yml + install.sh + Caddyfile | 5 pts |
| 9 (Onboarding) | New: local signup flow, remove Stripe requirement | 3 pts |
| 10 (TF Tenets) | None (tenets are code-level, not infra-level) | 0 |
| **Total** | | **16 pts** |

---

*This addendum applies to dd0c/route. The same pattern (AuthProvider trait, data layer abstraction, docker-compose, install script) replicates across all 6 dd0c products with product-specific service containers.*