dd0c/products/05-aws-cost-anomaly/architecture/dual-mode-addendum.md
Max Mayfield c3bafa238a Add dual-mode deployment addendums for all 6 products
P1 route: 16 pts (template, full docker-compose + install script)
P2 drift: 17 pts (pgmq, local CA for mTLS)
P3 alert: 19 pts (Lambda→Fastify, DynamoDB→PG JSONB)
P4 portal: 18 pts (Step Functions→cron, Aurora→PG+pgvector)
P5 cost: 19 pts (EventBridge→agent/polling, DynamoDB→PG JSONB)
P6 run: 15 pts (easiest — already PG-native, no AWS deps in core)

Total self-hosted effort: ~104 story points across all 6 products
2026-03-01 02:00:00 +00:00


dd0c/cost — Dual-Mode Deployment Addendum

Template: Based on dd0c/route dual-mode pattern


Cloud → Self-Hosted Service Mapping

| Cloud Service | Self-Hosted Replacement | Notes |
|---|---|---|
| EventBridge | Webhook polling + cron | Customer pushes CloudTrail logs or dd0c polls |
| SQS FIFO | PostgreSQL pgmq | Event queue |
| Lambda (normalizer) | Container process | Same TypeScript code |
| DynamoDB | PostgreSQL (JSONB) | Single-table → JSONB with GIN indexes |
| Cognito | Local JWT (HS256) | AuthProvider pattern |
| STS (cross-account) | Direct IAM credentials | Customer provides access key or role ARN |
| S3 | Local FS or MinIO | Raw event archive |
| SES | SMTP relay | Digest emails |
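
The Cognito row above replaces hosted auth with locally issued HS256 JWTs. As a minimal sketch of what that could look like, the following signs and verifies a token using only Node's built-in `crypto` module. The `Claims` shape and the function names `signToken` and `verifyToken` are hypothetical; the addendum does not specify the token contents or the AuthProvider interface.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical claim shape; the real token contents are not specified here.
interface Claims {
  sub: string;
  tenantId: string;
  exp: number; // expiry, in seconds since the Unix epoch
}

function b64url(input: string | Buffer): string {
  const buf = typeof input === "string" ? Buffer.from(input, "utf8") : input;
  return buf.toString("base64url");
}

function hmacSha256(data: string, secret: string): Buffer {
  return createHmac("sha256", secret).update(data).digest();
}

// Builds header.payload.signature, each segment base64url-encoded.
function signToken(claims: Claims, secret: string): string {
  const header = b64url(JSON.stringify({ alg: "HS256", typ: "JWT" }));
  const payload = b64url(JSON.stringify(claims));
  const signature = b64url(hmacSha256(`${header}.${payload}`, secret));
  return `${header}.${payload}.${signature}`;
}

// Returns the claims if the signature, algorithm, and expiry all check out.
function verifyToken(
  token: string,
  secret: string,
  nowSeconds: number = Math.floor(Date.now() / 1000),
): Claims | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [header, payload, signature] = parts as [string, string, string];

  // Constant-time comparison of the recomputed and presented signatures.
  const expected = hmacSha256(`${header}.${payload}`, secret);
  const actual = Buffer.from(signature, "base64url");
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) {
    return null;
  }

  try {
    const parsedHeader = JSON.parse(Buffer.from(header, "base64url").toString("utf8"));
    if (parsedHeader.alg !== "HS256") return null; // reject other algorithms
    const claims = JSON.parse(Buffer.from(payload, "base64url").toString("utf8")) as Claims;
    if (typeof claims.exp !== "number" || claims.exp <= nowSeconds) return null;
    return claims;
  } catch {
    return null; // malformed JSON in header or payload
  }
}
```

Because HS256 is symmetric, any service that verifies tokens also holds the signing secret, which is usually acceptable for a single-tenant self-hosted install but worth keeping in mind.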

Self-Hosted Compose Services

services:
  ingestion:        # CloudTrail event normalizer
    image: ghcr.io/dd0c/cost-ingestion:latest
  scorer:           # Anomaly detection (Z-score, Welford, novelty)
    image: ghcr.io/dd0c/cost-scorer:latest
  zombie-hunter:    # Daily idle resource scanner
    image: ghcr.io/dd0c/cost-zombie:latest
  api:              # Dashboard API
    image: ghcr.io/dd0c/cost-api:latest
  dashboard:        # React SPA
    image: ghcr.io/dd0c/cost-dashboard:latest
  postgres:         # All data (JSONB), baselines, config
    image: postgres:16-alpine
  redis:            # Panic mode, governance flags, circuit breakers
    image: redis:7-alpine
  caddy:
    image: caddy:2-alpine

Key Difference: EventBridge → Polling/Push

Self-hosted mode can't use EventBridge cross-account rules. Two alternatives:

  1. Push mode: The customer configures CloudTrail to deliver logs to an S3 bucket, and dd0c polls that bucket
  2. Agent mode: Lightweight Go agent in customer VPC forwards CloudTrail events via gRPC (same pattern as dd0c/drift)

Agent mode is recommended because it reuses the dd0c/drift agent pattern.
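
For the bucket-polling alternative, one simple way to avoid re-ingesting log files is to persist the last processed object key as a cursor. The sketch below assumes CloudTrail's standard key layout, in which keys under a single account and region prefix sort lexicographically in time order, and it deliberately leaves out the S3 client: `listedKeys` stands in for whatever a ListObjectsV2 call (or a MinIO equivalent) returns. The names `selectNewKeys` and `PollResult` are hypothetical.

```typescript
interface PollResult {
  toProcess: string[];       // log files not yet ingested, oldest first
  nextCursor: string | null; // value to persist once they are all processed
}

// Picks the CloudTrail log objects that sort after the stored cursor.
function selectNewKeys(
  listedKeys: readonly string[],
  lastProcessedKey: string | null,
): PollResult {
  const toProcess = listedKeys
    .filter((key) => key.endsWith(".json.gz")) // CloudTrail log files only
    .filter((key) => lastProcessedKey === null || key > lastProcessedKey)
    .sort();
  // If nothing new was found, the cursor stays where it was.
  const nextCursor = toProcess[toProcess.length - 1] ?? lastProcessedKey;
  return { toProcess, nextCursor };
}
```

A cron-driven poller would call this once per prefix, hand each key in `toProcess` to the normalizer, and store `nextCursor` only after the batch succeeds, so a crash mid-batch causes a retry rather than a gap.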

Key Difference: DynamoDB → PostgreSQL JSONB

Same pattern as dd0c/alert:

CREATE TABLE cost_events (
  id UUID PRIMARY KEY,
  tenant_id TEXT NOT NULL,
  account_id TEXT NOT NULL,
  data JSONB NOT NULL,
  severity TEXT,
  created_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE TABLE baselines (
  tenant_id TEXT NOT NULL,
  account_id TEXT NOT NULL,
  resource_type TEXT NOT NULL,
  mean_hourly_cost NUMERIC(12,4),
  stddev NUMERIC(12,4),
  event_count INTEGER DEFAULT 0,
  observed_actors JSONB DEFAULT '[]',
  PRIMARY KEY (tenant_id, account_id, resource_type)
);
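
The `baselines` columns line up with the scorer's Welford and Z-score approach: a running mean, a standard deviation, and an event count are enough to update the statistics one event at a time. The following is a minimal sketch under that assumption, not the scorer's actual code. The `Baseline` interface, the function names, and the choice of sample (n − 1) standard deviation are all illustrative.

```typescript
// Mirrors the baselines columns; camelCase field names are illustrative.
interface Baseline {
  meanHourlyCost: number;
  stddev: number; // sample standard deviation (assumed)
  eventCount: number;
  observedActors: string[];
}

// One Welford step: folds a new cost observation into the running statistics.
function updateBaseline(b: Baseline, cost: number, actor: string): Baseline {
  const n = b.eventCount + 1;
  // Recover the sum of squared deviations (M2) from the stored stddev.
  const m2Prev = b.eventCount > 1 ? b.stddev * b.stddev * (b.eventCount - 1) : 0;
  const delta = cost - b.meanHourlyCost;
  const mean = b.meanHourlyCost + delta / n;
  const m2 = m2Prev + delta * (cost - mean);
  const stddev = n > 1 ? Math.sqrt(m2 / (n - 1)) : 0;
  const observedActors = b.observedActors.includes(actor)
    ? b.observedActors
    : [...b.observedActors, actor];
  return { meanHourlyCost: mean, stddev, eventCount: n, observedActors };
}

// Z-score of a cost against the baseline; null until there is enough history.
function zScore(b: Baseline, cost: number): number | null {
  if (b.eventCount < 2 || b.stddev === 0) return null;
  return (cost - b.meanHourlyCost) / b.stddev;
}

// Novelty check: has this actor been seen for this resource type before?
function isNovelActor(b: Baseline, actor: string): boolean {
  return !b.observedActors.includes(actor);
}
```

Reconstructing M2 from a `NUMERIC(12,4)` stddev reintroduces rounding on every update; if that drift matters, persisting M2 in its own column would be the more exact option.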

Epic Impact

| Epic | Change | Effort |
|---|---|---|
| 1 (CloudTrail Ingestion) | EventBridge → agent/polling, SQS → pgmq | 4 pts |
| 2 (Anomaly Detection) | No change (pure math) | 0 pts |
| 3 (Zombie Hunter) | Direct AWS API calls (same) | 0 pts |
| 4 (Notifications) | SMTP fallback | 1 pt |
| 5 (Onboarding) | No CFN quick-create; manual IAM setup guide | 3 pts |
| 6 (Dashboard API) | LocalAuthProvider, DynamoDB → PG | 3 pts |
| 7 (Dashboard UI) | Local login form | 2 pts |
| 8 (Infrastructure) | docker-compose.yml + install.sh | 5 pts |
| 9 (Multi-Account) | Same; only the credential input differs | 1 pt |
| 10 (TF Tenets) | No change | 0 pts |
| Total | | 19 pts |