From c3bafa238a0e7e2fd27b9454bd2480c810e760aa Mon Sep 17 00:00:00 2001
From: Max Mayfield
Date: Sun, 1 Mar 2026 02:00:00 +0000
Subject: [PATCH] Add dual-mode deployment addendums for all 6 products
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

P1 route: 16 pts (template, full docker-compose + install script)
P2 drift: 17 pts (pgmq, local CA for mTLS)
P3 alert: 19 pts (Lambda→Fastify, DynamoDB→PG JSONB)
P4 portal: 18 pts (Step Functions→cron, Aurora→PG+pgvector)
P5 cost: 19 pts (EventBridge→agent/polling, DynamoDB→PG JSONB)
P6 run: 15 pts (easiest — already PG-native, no AWS deps in core)

Total self-hosted effort: ~104 story points across all 6 products
---
 .../architecture/dual-mode-addendum.md | 61 +++++++++++++
 .../architecture/dual-mode-addendum.md | 77 ++++++++++++++++
 .../architecture/dual-mode-addendum.md | 65 ++++++++++++++
 .../architecture/dual-mode-addendum.md | 88 +++++++++++++++++++
 .../architecture/dual-mode-addendum.md | 69 +++++++++++++++
 5 files changed, 360 insertions(+)
 create mode 100644 products/02-iac-drift-detection/architecture/dual-mode-addendum.md
 create mode 100644 products/03-alert-intelligence/architecture/dual-mode-addendum.md
 create mode 100644 products/04-lightweight-idp/architecture/dual-mode-addendum.md
 create mode 100644 products/05-aws-cost-anomaly/architecture/dual-mode-addendum.md
 create mode 100644 products/06-runbook-automation/architecture/dual-mode-addendum.md

diff --git a/products/02-iac-drift-detection/architecture/dual-mode-addendum.md b/products/02-iac-drift-detection/architecture/dual-mode-addendum.md
new file mode 100644
index 0000000..8d15d0e
--- /dev/null
+++ b/products/02-iac-drift-detection/architecture/dual-mode-addendum.md
@@ -0,0 +1,61 @@
+# dd0c/drift — Dual-Mode Deployment Addendum
+
+**Template:** Based on dd0c/route dual-mode pattern (`01-llm-cost-router/architecture/dual-mode-addendum.md`)
+
+---
+
+## Cloud → Self-Hosted Service Mapping
+
+| Cloud Service | Self-Hosted Replacement | Notes |
+|--------------|----------------------|-------|
+| SQS FIFO | PostgreSQL pgmq | Agent pushes drift reports to pgmq instead of SQS |
+| RDS PostgreSQL | PostgreSQL container | Same schema, same RLS |
+| Cognito | Local JWT (HS256) | Same AuthProvider trait pattern |
+| S3 (drift report archive) | MinIO or local FS | Configurable via ObjectStore trait |
+| CloudWatch | Prometheus + Grafana | Bundled in compose |
+| SES | SMTP relay | For email notifications |
+| KMS | Local AES-256-GCM | Key file mounted as volume |
+
+## Self-Hosted Compose Services
+
+```yaml
+services:
+  agent-gateway: # gRPC endpoint for agents (replaces SQS ingestion)
+    image: ghcr.io/dd0c/drift-gateway:latest
+  event-processor: # Normalizes drift reports, scores severity
+    image: ghcr.io/dd0c/drift-processor:latest
+  api: # Dashboard API
+    image: ghcr.io/dd0c/drift-api:latest
+  dashboard: # React SPA
+    image: ghcr.io/dd0c/drift-dashboard:latest
+  postgres: # Config + drift data (with RLS)
+    image: postgres:16-alpine
+  redis: # mTLS cert cache, circuit breakers
+    image: redis:7-alpine
+  caddy: # Reverse proxy + auto-TLS
+    image: caddy:2-alpine
+```
+
+## Agent Changes
+
+The Go agent already connects via gRPC — it just needs a configurable endpoint:
+- Cloud: `grpcs://ingest.drift.dd0c.dev`
+- Self-hosted: `grpc://localhost:50051` (or user's domain with Caddy TLS)
+
+mTLS certs: self-hosted uses a local CA (generated during install) instead of ACM.
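Reviewer sketch (not part of the patch): one way the configurable endpoint described under "Agent Changes" could be interpreted by the Go agent. The type `AgentEndpoint`, the function `ParseEndpoint`, and the default ports (443 for `grpcs`, 50051 for `grpc`) are assumptions made for illustration; the addendum only specifies the two URL forms. The sketch uses the standard library only and stops short of dialing, so it does not presume any particular gRPC client setup.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// AgentEndpoint is a hypothetical holder for the dial settings
// derived from the configured endpoint URL.
type AgentEndpoint struct {
	Host   string // host:port to dial
	UseTLS bool   // true for grpcs://, false for grpc://
}

// ParseEndpoint maps the two URL styles from the addendum onto dial
// settings. The default ports are assumptions, not taken from the source.
func ParseEndpoint(raw string) (AgentEndpoint, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return AgentEndpoint{}, fmt.Errorf("parse endpoint %q: %w", raw, err)
	}

	var useTLS bool
	var defaultPort string
	switch u.Scheme {
	case "grpcs":
		useTLS, defaultPort = true, "443"
	case "grpc":
		useTLS, defaultPort = false, "50051"
	default:
		return AgentEndpoint{}, fmt.Errorf("unsupported scheme %q", u.Scheme)
	}

	if u.Hostname() == "" {
		return AgentEndpoint{}, fmt.Errorf("endpoint %q has no host", raw)
	}
	port := u.Port()
	if port == "" {
		port = defaultPort
	}
	return AgentEndpoint{Host: net.JoinHostPort(u.Hostname(), port), UseTLS: useTLS}, nil
}

func main() {
	for _, raw := range []string{
		"grpcs://ingest.drift.dd0c.dev", // cloud
		"grpc://localhost:50051",        // self-hosted
	} {
		ep, err := ParseEndpoint(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s tls=%v\n", ep.Host, ep.UseTLS)
	}
}
```

Deriving the TLS decision from the scheme keeps the agent binary identical in both modes; the mTLS client certificate issued by the local CA would be loaded separately when `UseTLS` is true.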
+
+## Epic Impact
+
+| Epic | Change | Effort |
+|------|--------|--------|
+| 1 (Agent) | Add configurable gRPC endpoint | 1 pt |
+| 2 (Communication) | Local CA for mTLS, pgmq instead of SQS | 3 pts |
+| 3 (Event Processor) | Already PostgreSQL — no change | 0 |
+| 4 (Notifications) | SMTP fallback | 1 pt |
+| 5 (Remediation) | No change — agent-side | 0 |
+| 6 (Dashboard UI) | Local login form | 2 pts |
+| 7 (Dashboard API) | LocalAuthProvider | 2 pts |
+| 8 (Infrastructure) | docker-compose.yml + install.sh | 5 pts |
+| 9 (Onboarding) | Local signup, remove Stripe req | 3 pts |
+| 10 (TF Tenets) | No change | 0 |
+| **Total** | | **17 pts** |
diff --git a/products/03-alert-intelligence/architecture/dual-mode-addendum.md b/products/03-alert-intelligence/architecture/dual-mode-addendum.md
new file mode 100644
index 0000000..57e264b
--- /dev/null
+++ b/products/03-alert-intelligence/architecture/dual-mode-addendum.md
@@ -0,0 +1,77 @@
+# dd0c/alert — Dual-Mode Deployment Addendum
+
+**Template:** Based on dd0c/route dual-mode pattern
+
+---
+
+## Cloud → Self-Hosted Service Mapping
+
+| Cloud Service | Self-Hosted Replacement | Notes |
+|--------------|----------------------|-------|
+| API Gateway + Lambda | Fastify/Express container | Webhook ingestion endpoint |
+| SQS | PostgreSQL pgmq | Between ingestion and correlation |
+| ECS Fargate (Correlation) | Docker container | Same Node.js code |
+| DynamoDB | PostgreSQL (JSONB) | Incidents stored as JSONB with GIN indexes |
+| TimescaleDB on RDS | TimescaleDB container | Analytics unchanged |
+| Cognito | Local JWT (HS256) | AuthProvider pattern |
+| S3 (raw payload archive) | Local FS or MinIO | ObjectStore trait |
+| SES | SMTP relay | Email notifications |
+| EventBridge | Cron container | Scheduled tasks |
+
+## Self-Hosted Compose Services
+
+```yaml
+services:
+  ingestion: # Webhook endpoint (replaces API Gateway + Lambda)
+    image: ghcr.io/dd0c/alert-ingestion:latest
+  correlation: # Correlation engine (replaces ECS Fargate)
+    image: ghcr.io/dd0c/alert-correlation:latest
+  api: # Dashboard API
+    image: ghcr.io/dd0c/alert-api:latest
+  dashboard: # React SPA
+    image: ghcr.io/dd0c/alert-dashboard:latest
+  postgres: # Incidents (JSONB) + config
+    image: postgres:16-alpine
+  timescaledb: # Analytics
+    image: timescale/timescaledb:latest-pg16
+  redis: # Sliding windows for correlation, circuit breakers
+    image: redis:7-alpine
+  caddy: # Reverse proxy + auto-TLS
+    image: caddy:2-alpine
+```
+
+## Key Difference: DynamoDB → PostgreSQL JSONB
+
+DynamoDB Single-Table design maps to PostgreSQL JSONB with partition-like indexes:
+
+```sql
+CREATE TABLE incidents (
+  id UUID PRIMARY KEY,
+  tenant_id TEXT NOT NULL,
+  data JSONB NOT NULL,
+  severity TEXT NOT NULL,
+  status TEXT NOT NULL DEFAULT 'open',
+  created_at TIMESTAMPTZ DEFAULT NOW()
+);
+CREATE INDEX idx_incidents_tenant ON incidents(tenant_id);
+CREATE INDEX idx_incidents_data ON incidents USING GIN(data);
+-- RLS policy
+ALTER TABLE incidents ENABLE ROW LEVEL SECURITY;
+CREATE POLICY tenant_isolation ON incidents USING (tenant_id = current_setting('app.tenant_id'));
+```
+
+## Epic Impact
+
+| Epic | Change | Effort |
+|------|--------|--------|
+| 1 (Webhook Ingestion) | Lambda → Fastify container | 3 pts |
+| 2 (Normalization) | No change — pure logic | 0 |
+| 3 (Correlation) | pgmq instead of SQS, same Redis | 2 pts |
+| 4 (Notifications) | SMTP fallback | 1 pt |
+| 5 (Slack Bot) | No change | 0 |
+| 6 (Dashboard API) | LocalAuthProvider, DynamoDB→PG | 3 pts |
+| 7 (Dashboard UI) | Local login form | 2 pts |
+| 8 (Infrastructure) | docker-compose.yml + install.sh | 5 pts |
+| 9 (Onboarding) | Local signup, remove Stripe req | 3 pts |
+| 10 (TF Tenets) | No change | 0 |
+| **Total** | | **19 pts** |
diff --git a/products/04-lightweight-idp/architecture/dual-mode-addendum.md b/products/04-lightweight-idp/architecture/dual-mode-addendum.md
new file mode 100644
index 0000000..648af5b
--- /dev/null
+++ b/products/04-lightweight-idp/architecture/dual-mode-addendum.md
@@ -0,0 +1,65 @@
+# dd0c/portal — Dual-Mode Deployment Addendum
+
+**Template:** Based on dd0c/route dual-mode pattern
+
+---
+
+## Cloud → Self-Hosted Service Mapping
+
+| Cloud Service | Self-Hosted Replacement | Notes |
+|--------------|----------------------|-------|
+| Aurora Serverless v2 | PostgreSQL container | Same schema, pgvector extension |
+| Step Functions | Temporal or simple cron | Discovery orchestration |
+| Lambda (scanners) | Scanner containers | Same code, containerized |
+| Cognito | Local JWT (HS256) | AuthProvider pattern |
+| Meilisearch (managed) | Meilisearch container | Already self-hostable |
+| S3 | Local FS or MinIO | Discovery artifacts |
+| SES | SMTP relay | Notifications |
+| CloudWatch | Prometheus + Grafana | Bundled |
+
+## Self-Hosted Compose Services
+
+```yaml
+services:
+  api: # Dashboard API + discovery orchestrator
+    image: ghcr.io/dd0c/portal-api:latest
+  scanner-aws: # AWS discovery scanner
+    image: ghcr.io/dd0c/portal-scanner-aws:latest
+  scanner-github: # GitHub discovery scanner
+    image: ghcr.io/dd0c/portal-scanner-github:latest
+  dashboard: # React SPA with Cmd+K search
+    image: ghcr.io/dd0c/portal-dashboard:latest
+  postgres: # Catalog + pgvector
+    image: pgvector/pgvector:pg16
+  meilisearch: # Full-text search
+    image: getmeili/meilisearch:latest
+    volumes:
+      - meili_data:/meili_data
+  redis: # Prefix cache for Cmd+K
+    image: redis:7-alpine
+  caddy:
+    image: caddy:2-alpine
+```
+
+## Key Difference: Step Functions → Cron
+
+Self-hosted replaces Step Functions with a simple cron scheduler inside the API container:
+- AWS scan: every 6 hours
+- GitHub scan: every 4 hours
+- Scans run as background tasks, progress streamed via WebSocket (same as cloud)
+
+## Epic Impact
+
+| Epic | Change | Effort |
+|------|--------|--------|
+| 1 (AWS Scanner) | Lambda → container, Step Functions → cron | 3 pts |
+| 2 (GitHub Scanner) | Lambda → container | 2 pts |
+| 3 (Service Catalog) | Aurora → PostgreSQL container (same schema) | 1 pt |
+| 4 (Search) | Already Meilisearch — no change | 0 |
+| 5 (Dashboard UI) | Local login form | 2 pts |
+| 6 (Analytics) | No change | 0 |
+| 7 (Dashboard API) | LocalAuthProvider | 2 pts |
+| 8 (Infrastructure) | docker-compose.yml + install.sh | 5 pts |
+| 9 (Onboarding) | Local signup, remove Stripe req, WebSocket same | 3 pts |
+| 10 (TF Tenets) | No change | 0 |
+| **Total** | | **18 pts** |
diff --git a/products/05-aws-cost-anomaly/architecture/dual-mode-addendum.md b/products/05-aws-cost-anomaly/architecture/dual-mode-addendum.md
new file mode 100644
index 0000000..eb2ce19
--- /dev/null
+++ b/products/05-aws-cost-anomaly/architecture/dual-mode-addendum.md
@@ -0,0 +1,88 @@
+# dd0c/cost — Dual-Mode Deployment Addendum
+
+**Template:** Based on dd0c/route dual-mode pattern
+
+---
+
+## Cloud → Self-Hosted Service Mapping
+
+| Cloud Service | Self-Hosted Replacement | Notes |
+|--------------|----------------------|-------|
+| EventBridge | Webhook polling + cron | Customer pushes CloudTrail logs or dd0c polls |
+| SQS FIFO | PostgreSQL pgmq | Event queue |
+| Lambda (normalizer) | Container process | Same TypeScript code |
+| DynamoDB | PostgreSQL (JSONB) | Single-table → JSONB with GIN indexes |
+| Cognito | Local JWT (HS256) | AuthProvider pattern |
+| STS (cross-account) | Direct IAM credentials | Customer provides access key or role ARN |
+| S3 | Local FS or MinIO | Raw event archive |
+| SES | SMTP relay | Digest emails |
+
+## Self-Hosted Compose Services
+
+```yaml
+services:
+  ingestion: # CloudTrail event normalizer
+    image: ghcr.io/dd0c/cost-ingestion:latest
+  scorer: # Anomaly detection (Z-score, Welford, novelty)
+    image: ghcr.io/dd0c/cost-scorer:latest
+  zombie-hunter: # Daily idle resource scanner
+    image: ghcr.io/dd0c/cost-zombie:latest
+  api: # Dashboard API
+    image: ghcr.io/dd0c/cost-api:latest
+  dashboard: # React SPA
+    image: ghcr.io/dd0c/cost-dashboard:latest
+  postgres: # All data (JSONB), baselines, config
+    image: postgres:16-alpine
+  redis: # Panic mode, governance flags, circuit breakers
+    image: redis:7-alpine
+  caddy:
+    image: caddy:2-alpine
+```
+
+## Key Difference: EventBridge → Polling/Push
+
+Self-hosted mode can't use EventBridge cross-account rules. Two alternatives:
+1. **Push mode:** Customer configures CloudTrail to send to an S3 bucket, dd0c polls the bucket
+2. **Agent mode:** Lightweight Go agent in customer VPC forwards CloudTrail events via gRPC (same pattern as dd0c/drift)
+
+Agent mode is recommended — reuses the dd0c/drift agent pattern.
+
+## Key Difference: DynamoDB → PostgreSQL JSONB
+
+Same pattern as dd0c/alert:
+```sql
+CREATE TABLE cost_events (
+  id UUID PRIMARY KEY,
+  tenant_id TEXT NOT NULL,
+  account_id TEXT NOT NULL,
+  data JSONB NOT NULL,
+  severity TEXT,
+  created_at TIMESTAMPTZ DEFAULT NOW()
+);
+CREATE TABLE baselines (
+  tenant_id TEXT NOT NULL,
+  account_id TEXT NOT NULL,
+  resource_type TEXT NOT NULL,
+  mean_hourly_cost NUMERIC(12,4),
+  stddev NUMERIC(12,4),
+  event_count INTEGER DEFAULT 0,
+  observed_actors JSONB DEFAULT '[]',
+  PRIMARY KEY (tenant_id, account_id, resource_type)
+);
+```
+
+## Epic Impact
+
+| Epic | Change | Effort |
+|------|--------|--------|
+| 1 (CloudTrail Ingestion) | EventBridge → agent/polling, SQS → pgmq | 4 pts |
+| 2 (Anomaly Detection) | No change — pure math | 0 |
+| 3 (Zombie Hunter) | Direct AWS API calls (same) | 0 |
+| 4 (Notifications) | SMTP fallback | 1 pt |
+| 5 (Onboarding) | No CFN quick-create; manual IAM setup guide | 3 pts |
+| 6 (Dashboard API) | LocalAuthProvider, DynamoDB → PG | 3 pts |
+| 7 (Dashboard UI) | Local login form | 2 pts |
+| 8 (Infrastructure) | docker-compose.yml + install.sh | 5 pts |
+| 9 (Multi-Account) | Same — just different credential input | 1 pt |
+| 10 (TF Tenets) | No change | 0 |
+| **Total** | | **19 pts** |
diff --git a/products/06-runbook-automation/architecture/dual-mode-addendum.md b/products/06-runbook-automation/architecture/dual-mode-addendum.md
new file mode 100644
index 0000000..8156fa3
--- /dev/null
+++ b/products/06-runbook-automation/architecture/dual-mode-addendum.md
@@ -0,0 +1,69 @@
+# dd0c/run — Dual-Mode Deployment Addendum
+
+**Template:** Based on dd0c/route dual-mode pattern
+
+---
+
+## Cloud → Self-Hosted Service Mapping
+
+| Cloud Service | Self-Hosted Replacement | Notes |
+|--------------|----------------------|-------|
+| RDS PostgreSQL | PostgreSQL container | Same schema, same RLS, same audit trail |
+| Cognito | Local JWT (HS256) | AuthProvider pattern |
+| S3 (compliance exports) | Local FS or MinIO | ObjectStore trait |
+| SES | SMTP relay | Notifications |
+| CloudWatch | Prometheus + Grafana | Bundled |
+| KMS (audit encryption) | Local AES-256-GCM | Key file mounted as volume |
+
+## Self-Hosted Compose Services
+
+```yaml
+services:
+  engine: # Parser + Classifier + Execution Engine (Rust)
+    image: ghcr.io/dd0c/run-engine:latest
+  api: # Dashboard API
+    image: ghcr.io/dd0c/run-api:latest
+  dashboard: # React SPA (parse preview, execution timeline)
+    image: ghcr.io/dd0c/run-dashboard:latest
+  postgres: # Config + audit trail (RLS, hash chain)
+    image: postgres:16-alpine
+  redis: # Panic mode, execution locks
+    image: redis:7-alpine
+  caddy:
+    image: caddy:2-alpine
+```
+
+## Key Advantage: dd0c/run is Already Self-Host Friendly
+
+dd0c/run has the simplest self-hosted story of all 6 products:
+- The Go agent already runs in customer VPCs
+- The SaaS is already PostgreSQL-native (no DynamoDB)
+- gRPC between agent and SaaS works the same locally
+- No EventBridge/SQS/Step Functions dependencies
+
+The main change is auth and the install script.
+
+## Agent Connection
+
+- Cloud: `grpcs://engine.run.dd0c.dev`
+- Self-hosted: `grpc://localhost:50051` (or Caddy TLS)
+
+Agent binary is the same — just different `--server` flag.
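Reviewer sketch (not part of the patch): how the `--server` flag mentioned under "Agent Connection" might be read with Go's standard `flag` package. The flag name and both endpoint strings come from the addendum; the function name `serverFromArgs` and the choice of the cloud endpoint as the default are assumptions.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// cloudServer is the cloud endpoint from the addendum, used here as an
// assumed default when --server is not given.
const cloudServer = "grpcs://engine.run.dd0c.dev"

// serverFromArgs is a hypothetical helper that returns the value of
// --server, or the cloud default when the flag is absent.
func serverFromArgs(args []string) (string, error) {
	fs := flag.NewFlagSet("agent", flag.ContinueOnError)
	server := fs.String("server", cloudServer, "engine gRPC endpoint")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return *server, nil
}

func main() {
	server, err := serverFromArgs(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Println("connecting to", server)
}
```

A self-hosted install would then start the same binary with `--server grpc://localhost:50051`; Go's `flag` package accepts both `-server` and `--server`.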
+
+## Epic Impact
+
+| Epic | Change | Effort |
+|------|--------|--------|
+| 1 (Parser) | No change — pure Rust | 0 |
+| 2 (Classifier) | No change — pure Rust | 0 |
+| 3 (Execution Engine) | No change — pure Rust | 0 |
+| 4 (Agent) | Configurable gRPC endpoint | 1 pt |
+| 5 (Audit Trail) | KMS → local AES-256-GCM | 2 pts |
+| 6 (Dashboard API) | LocalAuthProvider | 2 pts |
+| 7 (Dashboard UI) | Local login form | 2 pts |
+| 8 (Infrastructure) | docker-compose.yml + install.sh | 5 pts |
+| 9 (Onboarding) | Local signup, remove Stripe req | 3 pts |
+| 10 (TF Tenets) | No change | 0 |
+| **Total** | | **15 pts** |
+
+*dd0c/run is the easiest product to self-host. Recommend it as the second self-hosted release after dd0c/route.*
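Reviewer sketch (not part of the patch): the Epic 5 change "KMS → local AES-256-GCM" could look roughly like the following in Go, using only the standard library. The function names `sealAudit` and `openAudit` and the nonce-prefixed layout are assumptions; the addendum states only that a key file is mounted as a volume. In the sketch the key is generated in memory so the example runs on its own; a real deployment would presumably read the 32 bytes from the mounted key file instead.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// newGCM builds an AES-256-GCM AEAD from a 32-byte key.
func newGCM(key []byte) (cipher.AEAD, error) {
	if len(key) != 32 {
		return nil, errors.New("AES-256 needs a 32-byte key")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// sealAudit encrypts plaintext and prepends the random nonce, so the
// result can be stored as a single value (assumed layout).
func sealAudit(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// openAudit splits off the nonce, then decrypts and authenticates.
func openAudit(key, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ct := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ct, nil)
}

func main() {
	// Stand-in for the key file mounted as a volume.
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	sealed, err := sealAudit(key, []byte("runbook executed"))
	if err != nil {
		panic(err)
	}
	plain, err := openAudit(key, sealed)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(plain))
}
```

Because GCM is authenticated, a modified audit record fails to decrypt rather than returning altered plaintext, which fits the tamper-evidence goal of the hash-chained audit trail.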