dd0c/products/01-llm-cost-router/architecture/dual-mode-addendum.md
Max Mayfield 96e51054ae Add dual-mode deployment architecture addendum for P1 (route)
Docker Compose self-hosted mode, install script, auth abstraction,
data layer abstraction (SQS→pgmq, Cognito→local JWT, S3→local FS),
Caddy auto-TLS, upgrade path, self-hosted BDD specs.
16 story points additional effort. Template for all 6 products.
2026-03-01 01:58:15 +00:00

dd0c/route — Dual-Mode Deployment Architecture Addendum

Date: March 1, 2026
Scope: Self-hosted deployment mode alongside existing cloud-managed architecture


1. Deployment Modes

dd0c/route supports two deployment modes. The core business logic (proxy engine, router brain, analytics pipeline) is identical in both. Only the infrastructure and auth layers differ.

| Aspect | Cloud-Managed | Self-Hosted |
|---|---|---|
| Deployment | ECS Fargate + CDK | Docker Compose |
| Auth | GitHub OAuth + Cognito JWT | Local auth (bcrypt + JWT) |
| Database | RDS PostgreSQL + ElastiCache Redis | PostgreSQL + Redis containers |
| Telemetry DB | TimescaleDB on RDS | TimescaleDB container |
| CDN | CloudFront + S3 | Caddy reverse proxy (auto-TLS) |
| Billing | Stripe Checkout | License key or honor system |
| Analytics | PostHog Cloud | PostHog self-hosted (optional) |
| Updates | Automatic (ECS rolling) | docker compose pull && docker compose up -d |
| Monitoring | CloudWatch + PagerDuty | Grafana + Prometheus (bundled) |

2. Docker Compose (Self-Hosted)

# docker-compose.yml — dd0c/route self-hosted
version: "3.8"

services:
  proxy:
    image: ghcr.io/dd0c/route-proxy:latest
    ports:
      - "8080:8080"    # Proxy endpoint
    environment:
      - DATABASE_URL=postgres://dd0c:${DB_PASSWORD}@postgres:5432/dd0c
      - REDIS_URL=redis://redis:6379
      - TIMESCALE_URL=postgres://dd0c:${DB_PASSWORD}@timescaledb:5432/dd0c_telemetry
      - AUTH_MODE=local          # local | oauth
      - GOVERNANCE_MODE=audit    # strict | audit
    depends_on:
      postgres:
        condition: service_healthy
      timescaledb:               # proxy also connects to TIMESCALE_URL
        condition: service_healthy
      redis:
        condition: service_healthy
    restart: unless-stopped

  api:
    image: ghcr.io/dd0c/route-api:latest
    ports:
      - "3000:3000"    # Dashboard API
    environment:
      - DATABASE_URL=postgres://dd0c:${DB_PASSWORD}@postgres:5432/dd0c
      - REDIS_URL=redis://redis:6379
      - TIMESCALE_URL=postgres://dd0c:${DB_PASSWORD}@timescaledb:5432/dd0c_telemetry
      - AUTH_MODE=local
      - JWT_SECRET=${JWT_SECRET}
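    depends_on:
      postgres:
        condition: service_healthy
      timescaledb:               # api connects to TIMESCALE_URL
        condition: service_healthy
      redis:                     # api connects to REDIS_URL
        condition: service_healthy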

  worker:
    image: ghcr.io/dd0c/route-worker:latest
    environment:
      - DATABASE_URL=postgres://dd0c:${DB_PASSWORD}@postgres:5432/dd0c
      - TIMESCALE_URL=postgres://dd0c:${DB_PASSWORD}@timescaledb:5432/dd0c_telemetry
      - SLACK_WEBHOOK_URL=${SLACK_WEBHOOK_URL:-}
      - SMTP_URL=${SMTP_URL:-}
    depends_on:
      postgres:
        condition: service_healthy
      timescaledb:               # worker connects to TIMESCALE_URL
        condition: service_healthy

  dashboard:
    image: ghcr.io/dd0c/route-dashboard:latest
    ports:
      - "3001:80"      # Static SPA
    restart: unless-stopped

  postgres:
    image: postgres:16-alpine
    volumes:
      - pg_data:/var/lib/postgresql/data
      - ./migrations:/docker-entrypoint-initdb.d
    environment:
      - POSTGRES_USER=dd0c
      - POSTGRES_PASSWORD=${DB_PASSWORD}
      - POSTGRES_DB=dd0c
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U dd0c"]
      interval: 5s
      timeout: 3s
      retries: 5

  timescaledb:
    image: timescale/timescaledb:latest-pg16
    volumes:
      - ts_data:/var/lib/postgresql/data
      - ./migrations/timescale:/docker-entrypoint-initdb.d
    environment:
      - POSTGRES_USER=dd0c
      - POSTGRES_PASSWORD=${DB_PASSWORD}
      - POSTGRES_DB=dd0c_telemetry
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U dd0c"]
      interval: 5s
      timeout: 3s
      retries: 5

  redis:
    image: redis:7-alpine
    volumes:
      - redis_data:/data
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 3s
      retries: 5

  caddy:
    image: caddy:2-alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile
      - caddy_data:/data
    depends_on:
      - proxy
      - api
      - dashboard

volumes:
  pg_data:
  ts_data:
  redis_data:
  caddy_data:
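
The environment block of the proxy service implies a small configuration contract on the Rust side. The following is a minimal sketch of how that contract could be parsed and validated, not the actual dd0c/route code. The names ProxyConfig, GovernanceMode, ConfigError, and from_lookup are hypothetical, and defaulting GOVERNANCE_MODE to audit when it is unset is an assumption of this sketch.

```rust
/// Governance behaviour selected by GOVERNANCE_MODE (strict | audit).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GovernanceMode {
    Strict,
    Audit,
}

/// Settings the proxy container receives through its environment.
/// (Hypothetical struct; field names mirror the compose variables.)
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ProxyConfig {
    pub database_url: String,
    pub redis_url: String,
    pub timescale_url: String,
    pub governance_mode: GovernanceMode,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ConfigError {
    Missing(&'static str),
    Invalid { key: &'static str, value: String },
}

impl ProxyConfig {
    /// `lookup` abstracts the environment so parsing can be tested
    /// without mutating process-wide state.
    pub fn from_lookup<F>(lookup: F) -> Result<Self, ConfigError>
    where
        F: Fn(&str) -> Option<String>,
    {
        // A required variable must be present and non-empty.
        let required = |key: &'static str| -> Result<String, ConfigError> {
            lookup(key)
                .filter(|v| !v.is_empty())
                .ok_or(ConfigError::Missing(key))
        };

        // Assumption: an unset GOVERNANCE_MODE falls back to audit.
        let governance_mode = match lookup("GOVERNANCE_MODE").as_deref() {
            None | Some("audit") => GovernanceMode::Audit,
            Some("strict") => GovernanceMode::Strict,
            Some(other) => {
                return Err(ConfigError::Invalid {
                    key: "GOVERNANCE_MODE",
                    value: other.to_string(),
                })
            }
        };

        Ok(Self {
            database_url: required("DATABASE_URL")?,
            redis_url: required("REDIS_URL")?,
            timescale_url: required("TIMESCALE_URL")?,
            governance_mode,
        })
    }

    /// Reads the same variables from the real process environment.
    pub fn from_env() -> Result<Self, ConfigError> {
        Self::from_lookup(|key| std::env::var(key).ok())
    }
}
```

Failing fast on a missing DATABASE_URL or an unknown GOVERNANCE_MODE value surfaces a broken .env at container start rather than on the first request.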

3. Install Script

#!/usr/bin/env bash
# install.sh — dd0c/route self-hosted installer
set -euo pipefail

echo "🚀 dd0c/route — Self-Hosted Installer"
echo "======================================"

# Check prerequisites
command -v docker >/dev/null 2>&1 || { echo "❌ Docker required. Install: https://docs.docker.com/get-docker/"; exit 1; }
# `command -v docker compose` would succeed on `docker` alone, so invoke the plugin directly
docker compose version >/dev/null 2>&1 || { echo "❌ Docker Compose V2 required."; exit 1; }

# Create directory
DD0C_DIR="${DD0C_DIR:-$HOME/.dd0c}"
mkdir -p "$DD0C_DIR"
cd "$DD0C_DIR"

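# Generate secrets on first install only. Re-running must not rotate DB_PASSWORD,
# because the existing pg_data volume was initialized with the original value.
if [ ! -f .env ]; then
  DB_PASSWORD=$(openssl rand -hex 16)
  JWT_SECRET=$(openssl rand -hex 32)

  # Write .env
  cat > .env << ENVEOF
DB_PASSWORD=$DB_PASSWORD
JWT_SECRET=$JWT_SECRET
# Optional: Slack webhook for alerts
# SLACK_WEBHOOK_URL=https://hooks.slack.com/services/...
# Optional: SMTP for email digests
# SMTP_URL=smtp://user:pass@smtp.example.com:587
ENVEOF
  chmod 600 .env   # secrets readable by the current user only
fi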

# Download compose file and Caddyfile (-f aborts on HTTP errors instead of saving the error page)
curl -fsSL https://raw.githubusercontent.com/dd0c/route/main/docker-compose.yml -o docker-compose.yml
curl -fsSL https://raw.githubusercontent.com/dd0c/route/main/Caddyfile -o Caddyfile

# Pull and start
docker compose pull
docker compose up -d

echo ""
echo "✅ dd0c/route is running!"
echo ""
echo "   Proxy:     http://localhost:8080/v1/chat/completions"
echo "   Dashboard: http://localhost:3001"
echo "   API:       http://localhost:3000"
echo ""
echo "   Create your first user account (API keys are issued after signup):"
echo "   curl -X POST http://localhost:3000/api/auth/local/signup \\"
echo "     -H 'Content-Type: application/json' \\"
echo "     -d '{\"email\":\"admin@localhost\",\"password\":\"changeme\"}'"
echo ""
echo "   Data stored in: $DD0C_DIR"
echo "   Upgrade:  cd $DD0C_DIR && docker compose pull && docker compose up -d"

4. Auth Abstraction Layer

The API authenticates every request through an AuthProvider trait. The concrete implementation is selected once at startup from AUTH_MODE:

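// src/auth/mod.rs
// Assumed dependency: the async-trait crate. A trait with native `async fn`
// cannot be used as `Box<dyn AuthProvider>`, which the factory below requires.
use async_trait::async_trait;

pub enum AuthMode {
    Local,   // bcrypt passwords + local JWT
    OAuth,   // GitHub OAuth + Cognito JWT
}

#[async_trait]
pub trait AuthProvider: Send + Sync {
    async fn authenticate(&self, req: &Request) -> Result<AuthContext, AuthError>;
    async fn create_user(&self, email: &str, password: Option<&str>) -> Result<User, AuthError>;
}

pub struct LocalAuthProvider { /* PostgreSQL-backed */ }
pub struct OAuthProvider { /* GitHub + Cognito */ }

#[async_trait]
impl AuthProvider for LocalAuthProvider {
    async fn authenticate(&self, req: &Request) -> Result<AuthContext, AuthError> {
        // Extract JWT from Authorization header
        // Verify with local JWT_SECRET (HS256)
        // Return AuthContext with org_id, role
        todo!()
    }

    async fn create_user(&self, email: &str, password: Option<&str>) -> Result<User, AuthError> {
        // Hash the password with bcrypt and store the user in PostgreSQL
        todo!()
    }
}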

// Factory
pub fn create_auth_provider(mode: AuthMode, config: &Config) -> Box<dyn AuthProvider> {
    match mode {
        AuthMode::Local => Box::new(LocalAuthProvider::new(config)),
        AuthMode::OAuth => Box::new(OAuthProvider::new(config)),
    }
}
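
Two small pieces around the factory lend themselves to a concrete illustration: turning the AUTH_MODE string into an AuthMode, and the first step of authenticate (extracting the JWT from the Authorization header). The following is a dependency-free sketch under stated assumptions. The enum is redeclared so the snippet stands alone, bearer_token is a hypothetical helper name, and accepting the mode case-insensitively is a choice of this sketch rather than documented behaviour. Signature verification (HS256) is intentionally left out.

```rust
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AuthMode {
    Local, // bcrypt passwords + local JWT
    OAuth, // GitHub OAuth + Cognito JWT
}

impl FromStr for AuthMode {
    type Err = String;

    /// Parses the AUTH_MODE value ("local" | "oauth").
    /// Assumption: surrounding whitespace and letter case are ignored.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim().to_ascii_lowercase().as_str() {
            "local" => Ok(AuthMode::Local),
            "oauth" => Ok(AuthMode::OAuth),
            other => Err(format!(
                "unsupported AUTH_MODE {other:?} (expected local or oauth)"
            )),
        }
    }
}

/// Extracts the token from an `Authorization: Bearer <token>` header value.
/// Returns None for other schemes or when the token is empty.
pub fn bearer_token(header_value: &str) -> Option<&str> {
    let (scheme, token) = header_value.trim().split_once(' ')?;
    let token = token.trim();
    if scheme.eq_ignore_ascii_case("Bearer") && !token.is_empty() {
        Some(token)
    } else {
        None
    }
}
```

With this in place, startup code could call something like `std::env::var("AUTH_MODE")?.parse::<AuthMode>()` and hand the result to create_auth_provider, so an unknown mode is rejected before the server begins accepting requests.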

5. Data Layer Abstraction

All 6 dd0c products must abstract the data layer so that self-hosted mode runs without AWS-managed services (no DynamoDB, no Cognito, no SQS), with PostgreSQL as the default replacement:

| Cloud Service | Self-Hosted Replacement |
|---|---|
| DynamoDB | PostgreSQL (same schema, different driver) |
| SQS FIFO | PostgreSQL LISTEN/NOTIFY + pgmq |
| Cognito | Local JWT (HS256) |
| EventBridge | Cron + webhook polling |
| S3 | Local filesystem or MinIO |
| CloudFront | Caddy reverse proxy |
| SES | SMTP relay |
| KMS | Local AES-256-GCM with key file |

// src/data/mod.rs

pub trait EventQueue: Send + Sync {
    async fn publish(&self, event: Event) -> Result<()>;
    async fn consume(&self, batch_size: usize) -> Result<Vec<Event>>;
}

pub struct SqsFifoQueue { /* AWS SQS */ }
pub struct PgmqQueue { /* PostgreSQL pgmq */ }

pub trait ObjectStore: Send + Sync {
    async fn put(&self, key: &str, data: &[u8]) -> Result<()>;
    async fn get(&self, key: &str) -> Result<Vec<u8>>;
}

pub struct S3Store { /* AWS S3 */ }
pub struct LocalFsStore { /* Local filesystem */ }
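
To make the ObjectStore abstraction concrete, here is a minimal sketch of what a filesystem-backed store might look like. It is illustrative only: the trait in the source is async, whereas this version is synchronous (named BlockingObjectStore, a hypothetical name) so that it needs nothing beyond the standard library. The key validation rule, which rejects absolute keys and `..` segments, is an assumption of this sketch.

```rust
use std::fs;
use std::io;
use std::path::{Component, Path, PathBuf};

/// Blocking counterpart of ObjectStore (assumption: the real trait is async).
pub trait BlockingObjectStore {
    fn put(&self, key: &str, data: &[u8]) -> io::Result<()>;
    fn get(&self, key: &str) -> io::Result<Vec<u8>>;
}

/// Stores each object as a file below `root`.
pub struct LocalFsStore {
    root: PathBuf,
}

impl LocalFsStore {
    pub fn new(root: impl Into<PathBuf>) -> Self {
        Self { root: root.into() }
    }

    /// Maps an S3-style key such as "reports/2026/03.json" to a path under
    /// `root`. Keys that are empty, absolute, or contain `.`/`..` segments
    /// are rejected so a key can never escape the root directory.
    fn resolve(&self, key: &str) -> io::Result<PathBuf> {
        let rel = Path::new(key);
        let safe = !key.is_empty()
            && rel.components().all(|c| matches!(c, Component::Normal(_)));
        if safe {
            Ok(self.root.join(rel))
        } else {
            Err(io::Error::new(
                io::ErrorKind::InvalidInput,
                format!("invalid object key: {key:?}"),
            ))
        }
    }
}

impl BlockingObjectStore for LocalFsStore {
    fn put(&self, key: &str, data: &[u8]) -> io::Result<()> {
        let path = self.resolve(key)?;
        // Create intermediate "prefix" directories on demand.
        if let Some(parent) = path.parent() {
            fs::create_dir_all(parent)?;
        }
        fs::write(path, data)
    }

    fn get(&self, key: &str) -> io::Result<Vec<u8>> {
        fs::read(self.resolve(key)?)
    }
}
```

Because callers only see put and get keyed by string, the same call sites can work against S3 in cloud mode and against a Docker volume in self-hosted mode.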

6. Upgrade Path

Self-hosted users need a safe upgrade mechanism:

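# Upgrade script (run these steps in place of a bare `docker compose pull`)
# 1. Pull new images
docker compose pull

# 2. Run migrations (idempotent)
docker compose run --rm api migrate

# 3. Recreate containers whose image or configuration changed
docker compose up -d --remove-orphans

# 4. Health check (retry while the containers finish starting)
curl -sf --retry 5 --retry-delay 2 --retry-connrefused http://localhost:8080/health >/dev/null || echo "⚠️ Proxy unhealthy after upgrade"
curl -sf --retry 5 --retry-delay 2 --retry-connrefused http://localhost:3000/health >/dev/null || echo "⚠️ API unhealthy after upgrade"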
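
The health check step can also be expressed in Rust, for example inside an integration test or a hypothetical upgrade helper binary. The sketch below polls a plain-HTTP /health endpoint using only the standard library. The function names is_healthy and wait_until_healthy are illustrative, the 2-second per-request timeout is an arbitrary choice, and HTTPS is out of scope.

```rust
use std::io::{self, BufRead, BufReader, Write};
use std::net::{TcpStream, ToSocketAddrs};
use std::thread;
use std::time::Duration;

/// Sends `GET <path>` over plain HTTP and reports whether the status is 2xx.
/// `addr` is "host:port", e.g. "localhost:8080".
pub fn is_healthy(addr: &str, path: &str, timeout: Duration) -> io::Result<bool> {
    let sock = addr.to_socket_addrs()?.next().ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidInput, "address did not resolve")
    })?;
    let mut stream = TcpStream::connect_timeout(&sock, timeout)?;
    stream.set_read_timeout(Some(timeout))?;
    stream.set_write_timeout(Some(timeout))?;

    // Build the request first so it is sent with a single write.
    let request = format!("GET {path} HTTP/1.1\r\nHost: {addr}\r\nConnection: close\r\n\r\n");
    stream.write_all(request.as_bytes())?;

    // The status line looks like "HTTP/1.1 200 OK".
    let mut status_line = String::new();
    BufReader::new(stream).read_line(&mut status_line)?;
    let code = status_line
        .split_whitespace()
        .nth(1)
        .and_then(|c| c.parse::<u16>().ok());
    Ok(matches!(code, Some(200..=299)))
}

/// Polls until the endpoint is healthy or `attempts` is exhausted.
/// Connection errors count as "not healthy yet".
pub fn wait_until_healthy(addr: &str, path: &str, attempts: u32, delay: Duration) -> bool {
    for attempt in 1..=attempts {
        if is_healthy(addr, path, Duration::from_secs(2)).unwrap_or(false) {
            return true;
        }
        if attempt < attempts {
            thread::sleep(delay);
        }
    }
    false
}
```

A caller might use `wait_until_healthy("localhost:8080", "/health", 10, Duration::from_secs(1))` after the restart step, treating a false result as a failed upgrade.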

7. Self-Hosted BDD Acceptance Specs

Feature: Self-Hosted Installation

  Scenario: Fresh install via install script
    Given a Linux host with Docker and Docker Compose installed
    When the user runs curl -sSL install.dd0c.dev | bash
    Then docker-compose.yml is downloaded to ~/.dd0c
    And .env is generated with random DB_PASSWORD and JWT_SECRET
    And all containers start and pass health checks within 60 seconds
    And the proxy responds to GET /health with 200

  Scenario: Local auth signup and API key generation
    Given dd0c/route is running in self-hosted mode (AUTH_MODE=local)
    When the user POSTs /api/auth/local/signup with email and password
    Then a user account is created with bcrypt-hashed password
    And a JWT is returned (HS256, signed with JWT_SECRET)
    And the user can create an API key via /api/orgs/{id}/api-keys

  Scenario: Upgrade preserves data
    Given dd0c/route is running with existing routing rules and telemetry
    When the user runs docker compose pull && docker compose up -d
    Then all routing rules are preserved
    And all telemetry data is preserved
    And the proxy resumes routing within 10 seconds

  Scenario: Self-hosted works without internet after initial pull
    Given all Docker images are cached locally
    When the host loses internet connectivity
    Then the proxy continues routing requests
    And the dashboard continues serving
    And cost tables use the last cached version

  Scenario: Caddy auto-TLS with custom domain
    Given the Caddyfile is configured with domain "route.example.com"
    And DNS points to the host
    When Caddy starts
    Then a Let's Encrypt TLS certificate is automatically provisioned
    And HTTPS is served on port 443

  Scenario: PostgreSQL data persists across restarts
    Given routing rules and telemetry exist in PostgreSQL
    When docker compose down && docker compose up -d is run
    Then all data is preserved via named volumes

8. Impact on Existing Epics

| Epic | Change Required | Effort |
|---|---|---|
| 1 (Proxy) | None (pure Rust, no AWS deps) | 0 |
| 2 (Router) | None (in-memory, no AWS deps) | 0 |
| 3 (Analytics) | Add pgmq as alternative to SQS | 2 pts |
| 4 (Dashboard API) | Add LocalAuthProvider, abstract KMS | 3 pts |
| 5 (Dashboard UI) | Add local login form (email/password) | 2 pts |
| 6 (Shadow CLI) | None (already runs locally) | 0 |
| 7 (Slack/Email) | SMTP fallback for SES | 1 pt |
| 8 (Infrastructure) | New: docker-compose.yml + install.sh + Caddyfile | 5 pts |
| 9 (Onboarding) | New: local signup flow, remove Stripe requirement | 3 pts |
| 10 (TF Tenets) | None (tenets are code-level, not infra-level) | 0 |
| Total | | 16 pts |

This addendum applies to dd0c/route. The same pattern (AuthProvider trait, data layer abstraction, docker-compose, install script) replicates across all 6 dd0c products with product-specific service containers.