Commit Graph

27 Commits

Author SHA1 Message Date
eb953cdea5 Security hardening: auth encapsulation, pool restriction, rate limiting, invites, async webhooks
Some checks failed
CI — P2 Drift (Go + Node) / agent (push) Successful in 43s
CI — P2 Drift (Go + Node) / saas (push) Failing after 5s
CI — P3 Alert / test (push) Failing after 4s
CI — P4 Portal / test (push) Failing after 4s
CI — P5 Cost / test (push) Failing after 4s
CI — P6 Run / saas (push) Failing after 5s
CI — P2 Drift (Go + Node) / build-push (push) Failing after 7s
CI — P3 Alert / build-push (push) Has been skipped
CI — P4 Portal / build-push (push) Has been skipped
CI — P5 Cost / build-push (push) Has been skipped
CI — P6 Run / build-push (push) Failing after 5s
Phase 1 (Security Critical):
- Auth plugin encapsulation: replaced global addHook with Fastify plugin scope
- Removed startsWith URL matching; public routes registered outside auth scope
- JWT verify now enforces algorithms: ['HS256'] (prevents algorithm confusion)
- Raw pool no longer exported from db.ts; systemQuery() + getPoolForAuth() instead
- withTenant() remains primary tenant-scoped query path
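
A minimal sketch of the algorithm allow-list idea from the JWT bullet above, written against Node's built-in crypto module rather than the project's actual JWT library. The helper names (signHs256, verifyHs256) are hypothetical, and claim checks such as exp are left out.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Only HS256 is accepted; a token that declares "none" or RS256 is rejected
// before any signature work happens.
const ALLOWED_ALGS = ["HS256"];

function b64url(input: string): string {
  return Buffer.from(input, "utf8").toString("base64url");
}

// Hypothetical signer, included only so the sketch can be exercised end to end.
function signHs256(payload: object, secret: string): string {
  const header = b64url(JSON.stringify({ alg: "HS256", typ: "JWT" }));
  const body = b64url(JSON.stringify(payload));
  const sig = createHmac("sha256", secret).update(`${header}.${body}`).digest("base64url");
  return `${header}.${body}.${sig}`;
}

// Hypothetical verifier: checks the declared algorithm, then the HMAC signature.
function verifyHs256(token: string, secret: string): Record<string, unknown> {
  const parts = token.split(".");
  if (parts.length !== 3) throw new Error("malformed token");
  const [header, body, sig] = parts;
  const { alg } = JSON.parse(Buffer.from(header, "base64url").toString("utf8"));
  if (!ALLOWED_ALGS.includes(alg)) throw new Error(`algorithm not allowed: ${alg}`);
  const expected = createHmac("sha256", secret).update(`${header}.${body}`).digest();
  const actual = Buffer.from(sig, "base64url");
  // Compare lengths first: timingSafeEqual throws on buffers of different size.
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) {
    throw new Error("bad signature");
  }
  return JSON.parse(Buffer.from(body, "base64url").toString("utf8"));
}
```

Pinning the accepted algorithms on the verifier side means the token header cannot choose how it gets verified, which is the point of the algorithm-confusion fix.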

Phase 2 (Infrastructure):
- docker-compose.yml: all secrets via env var substitution (${VAR:-default})
- Per-service Postgres users (dd0c_drift, dd0c_alert, etc.) in docker-init-db.sh
- .env.example with all configurable secrets
- build-push.sh reads the registry password from $REGISTRY_PASSWORD instead of a hardcoded value
- .gitignore excludes .env files
- @fastify/rate-limit: 100 req/min global, 5/min login, 3/min signup
- CORS_ORIGIN default changed from '*' to 'http://localhost:5173'
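
The rate limits listed above (100 req/min global, 5/min login, 3/min signup) could be modelled roughly as follows. This is an in-memory fixed-window sketch for illustration only, not how @fastify/rate-limit is implemented; the bucket names and the FixedWindowLimiter class are hypothetical.

```typescript
// Limits per bucket, taken from the numbers in the commit message.
const LIMITS: Record<string, number> = { global: 100, login: 5, signup: 3 };
const WINDOW_MS = 60_000;

class FixedWindowLimiter {
  private counters = new Map<string, { windowStart: number; count: number }>();

  // Returns true if the request is allowed, false if it should receive a 429.
  allow(bucket: string, clientKey: string, now: number = Date.now()): boolean {
    const max = LIMITS[bucket] ?? LIMITS.global;
    const key = `${bucket}:${clientKey}`;
    const entry = this.counters.get(key);
    // Start a fresh window when none exists or the previous one has elapsed.
    if (!entry || now - entry.windowStart >= WINDOW_MS) {
      this.counters.set(key, { windowStart: now, count: 1 });
      return true;
    }
    if (entry.count >= max) return false;
    entry.count += 1;
    return true;
  }
}
```

For example, `new FixedWindowLimiter().allow("login", "203.0.113.7")` would return true for the first five calls within a minute and false after that.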

Phase 3 (Product):
- Team invite flow: tenant_invites table, POST /invite, GET /invites, DELETE /invites/:id
- Signup accepts optional invite_token to join existing tenant
- Async webhook ingestion (P3): LPUSH to Redis, BRPOP worker, dead-letter queue
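
The async ingestion flow above (LPUSH on receipt, BRPOP in a worker, dead-letter queue on repeated failure) can be sketched with an in-memory stand-in for the Redis list. ListQueue, WebhookJob, MAX_ATTEMPTS, and processOne are hypothetical names, and the retry limit of 3 is an assumption, not a value from the commit.

```typescript
// In-memory stand-in for a Redis list: lpush adds at the head,
// rpop takes from the tail, so jobs come out in FIFO order.
class ListQueue<T> {
  private items: T[] = [];
  lpush(item: T): void { this.items.unshift(item); }
  rpop(): T | undefined { return this.items.pop(); }
  get length(): number { return this.items.length; }
}

interface WebhookJob { payload: string; attempts: number; }

const MAX_ATTEMPTS = 3; // assumed retry budget

// Processes at most one job. Returns false when the queue is empty
// (a real BRPOP would block here instead of returning).
function processOne(
  queue: ListQueue<WebhookJob>,
  deadLetter: ListQueue<WebhookJob>,
  handle: (payload: string) => void,
): boolean {
  const job = queue.rpop();
  if (!job) return false;
  try {
    handle(job.payload);
  } catch {
    const retried = { ...job, attempts: job.attempts + 1 };
    // Requeue until the budget is spent, then park the job for inspection.
    (retried.attempts >= MAX_ATTEMPTS ? deadLetter : queue).lpush(retried);
  }
  return true;
}
```

The HTTP handler only has to enqueue and acknowledge, so a slow or failing downstream no longer holds the webhook sender's connection open.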

Console:
- All 5 product modules wired: drift, alert, portal, cost, run
- PageHeader accepts children prop
- 71 modules, 70KB gzipped production build

All 6 projects compile clean (tsc --noEmit).
2026-03-02 23:53:55 +00:00
be3f37cfdd Fix CRITICAL auth bypass: exact match for login/signup paths
Some checks failed
CI — P2 Drift (Go + Node) / agent (push) Successful in 45s
CI — P2 Drift (Go + Node) / saas (push) Successful in 28s
CI — P3 Alert / test (push) Successful in 24s
CI — P4 Portal / test (push) Successful in 27s
CI — P5 Cost / test (push) Successful in 26s
CI — P6 Run / saas (push) Successful in 25s
CI — P2 Drift (Go + Node) / build-push (push) Failing after 46s
CI — P3 Alert / build-push (push) Failing after 38s
CI — P4 Portal / build-push (push) Failing after 50s
CI — P5 Cost / build-push (push) Failing after 22s
CI — P6 Run / build-push (push) Failing after 1m3s
startsWith('/api/v1/auth/login') allowed any path with that prefix
to bypass authentication (e.g. /api/v1/auth/login-anything).
Changed to exact path match with query string stripping.
Fixed across all 5 products + shared/auth.ts.
2026-03-02 20:35:28 +00:00
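
To illustrate the fix described in the commit above: a sketch of an exact-match public-path check with query string stripping. The function name isPublicPath and the signup path are assumptions; only /api/v1/auth/login appears in the commit message.

```typescript
// Paths that skip authentication. The signup path is assumed by analogy with login.
const PUBLIC_PATHS = new Set(["/api/v1/auth/login", "/api/v1/auth/signup"]);

// Hypothetical helper: strips the query string, then requires an exact match.
function isPublicPath(rawUrl: string): boolean {
  const q = rawUrl.indexOf("?");
  const path = q === -1 ? rawUrl : rawUrl.slice(0, q);
  return PUBLIC_PATHS.has(path);
}

// The vulnerable form, for comparison: any path sharing the prefix is treated as public.
function isPublicPathOld(rawUrl: string): boolean {
  return rawUrl.startsWith("/api/v1/auth/login");
}
```

With the exact match, /api/v1/auth/login?next=/x is still public, while /api/v1/auth/login-anything now goes through authentication.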
3be37d1293 Skip auth on /version endpoint (same as /health)
Some checks failed
CI — P2 Drift (Go + Node) / agent (push) Successful in 9s
CI — P3 Alert / test (push) Successful in 21s
CI — P2 Drift (Go + Node) / saas (push) Successful in 35s
CI — P4 Portal / test (push) Successful in 25s
CI — P5 Cost / test (push) Successful in 37s
CI — P2 Drift (Go + Node) / build-push (push) Failing after 3s
CI — P6 Run / saas (push) Successful in 22s
CI — P3 Alert / build-push (push) Failing after 3s
CI — P4 Portal / build-push (push) Failing after 2s
CI — P5 Cost / build-push (push) Failing after 2s
CI — P6 Run / build-push (push) Failing after 3s
2026-03-02 13:54:46 +00:00
5bad2481ae Add /version endpoint to all products + BUILD_SHA/BUILD_TIME in Dockerfiles
Some checks failed
CI — P2 Drift (Go + Node) / saas (push) Successful in 34s
CI — P2 Drift (Go + Node) / build-push (push) Failing after 4s
CI — P3 Alert / build-push (push) Failing after 3s
CI — P6 Run / saas (push) Successful in 23s
CI — P4 Portal / build-push (push) Failing after 2s
CI — P2 Drift (Go + Node) / agent (push) Successful in 17s
CI — P3 Alert / test (push) Successful in 21s
CI — P5 Cost / test (push) Successful in 24s
CI — P4 Portal / test (push) Successful in 38s
CI — P5 Cost / build-push (push) Failing after 3s
CI — P6 Run / build-push (push) Failing after 2s
2026-03-02 13:53:15 +00:00
c4ec43cb76 Add CI build-push jobs targeting reg.dd0c.net with docker login + deploy
Some checks failed
CI — P2 Drift (Go + Node) / saas (push) Successful in 26s
CI — P2 Drift (Go + Node) / agent (push) Successful in 53s
CI — P3 Alert / test (push) Successful in 26s
CI — P5 Cost / test (push) Successful in 22s
CI — P4 Portal / test (push) Successful in 38s
CI — P2 Drift (Go + Node) / build-push (push) Failing after 4s
CI — P3 Alert / build-push (push) Failing after 2s
CI — P6 Run / saas (push) Successful in 22s
CI — P5 Cost / build-push (push) Failing after 2s
CI — P4 Portal / build-push (push) Failing after 3s
CI — P6 Run / build-push (push) Failing after 2s
2026-03-02 13:48:10 +00:00
18d476f7a0 Target Nas runner (ubuntu-24.04) for build-push jobs — sandbox lacks Docker
Some checks failed
CI — P2 Drift (Go + Node) / saas (push) Successful in 24s
CI — P2 Drift (Go + Node) / agent (push) Successful in 53s
CI — P3 Alert / test (push) Successful in 27s
CI — P5 Cost / test (push) Successful in 23s
CI — P4 Portal / test (push) Successful in 37s
CI — P6 Run / saas (push) Successful in 25s
CI — P2 Drift (Go + Node) / build-push (push) Failing after 17s
CI — P3 Alert / build-push (push) Failing after 17s
CI — P5 Cost / build-push (push) Failing after 11s
CI — P4 Portal / build-push (push) Failing after 14s
CI — P6 Run / build-push (push) Failing after 13s
2026-03-02 05:32:04 +00:00
2df0ce2fff Trigger CI build+push to populate registry at 192.168.86.11:30095
Some checks failed
CI — P4 Portal / test (push) Successful in 36s
CI — P6 Run / saas (push) Successful in 22s
CI — P3 Alert / build-push (push) Failing after 1s
CI — P5 Cost / build-push (push) Failing after 0s
CI — P6 Run / build-push (push) Failing after 0s
CI — P2 Drift (Go + Node) / saas (push) Successful in 27s
CI — P2 Drift (Go + Node) / agent (push) Successful in 52s
CI — P3 Alert / test (push) Successful in 26s
CI — P5 Cost / test (push) Successful in 24s
CI — P4 Portal / build-push (push) Failing after 0s
CI — P2 Drift (Go + Node) / build-push (push) Failing after 41s
2026-03-02 05:29:03 +00:00
c537022fa8 Add drift report submission + stack deletion endpoints to P2
All checks were successful
CI — P2 Drift (Go + Node) / saas (push) Successful in 25s
CI — P2 Drift (Go + Node) / agent (push) Successful in 43s
2026-03-02 05:03:34 +00:00
4eda9d7be3 Add .dockerignore to all Node products (skip node_modules/dist/tests in build context)
All checks were successful
CI — P2 Drift (Go + Node) / saas (push) Successful in 25s
CI — P2 Drift (Go + Node) / agent (push) Successful in 52s
CI — P3 Alert / test (push) Successful in 29s
CI — P5 Cost / test (push) Successful in 23s
CI — P4 Portal / test (push) Successful in 36s
CI — P6 Run / saas (push) Successful in 21s
2026-03-02 04:45:57 +00:00
d175c3a6e7 Clean up drift: restore Dockerfile name, remove cache bust artifacts
All checks were successful
CI — P2 Drift (Go + Node) / agent (push) Successful in 14s
CI — P2 Drift (Go + Node) / saas (push) Successful in 28s
2026-03-02 04:45:12 +00:00
e0b84f5481 Fix drift SET LOCAL: use string interpolation with UUID validation (SET doesn't support params)
All checks were successful
CI — P2 Drift (Go + Node) / agent (push) Successful in 15s
CI — P2 Drift (Go + Node) / saas (push) Successful in 29s
2026-03-02 03:59:26 +00:00
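
A sketch of the approach named in the commit above: because SET does not accept bind parameters, the tenant ID is validated as a UUID before being interpolated. The setting name app.tenant_id and the helper buildSetTenantSql are hypothetical.

```typescript
// Canonical 8-4-4-4-12 hex UUID shape; anything else is rejected outright.
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Hypothetical helper: builds the SET LOCAL statement only for a well-formed UUID.
// "app.tenant_id" is an assumed setting name, not one confirmed by the commit.
function buildSetTenantSql(tenantId: string): string {
  if (!UUID_RE.test(tenantId)) throw new Error("invalid tenant id");
  // Safe to interpolate: the regex admits only hex digits and hyphens.
  return `SET LOCAL app.tenant_id = '${tenantId}'`;
}
```

An alternative that avoids interpolation altogether is PostgreSQL's set_config(name, value, is_local) function, which can be called from a parameterized SELECT.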
364e411e69 Nuclear cache bust: rename drift Dockerfile to Dockerfile.v2
All checks were successful
CI — P2 Drift (Go + Node) / saas (push) Successful in 25s
CI — P2 Drift (Go + Node) / agent (push) Successful in 42s
2026-03-02 00:14:43 +00:00
00aaf1a941 Force drift rebuild: add CACHE_BUST build arg to Dockerfile + docker-compose
All checks were successful
CI — P2 Drift (Go + Node) / agent (push) Successful in 10s
CI — P2 Drift (Go + Node) / saas (push) Successful in 27s
2026-03-01 23:06:19 +00:00
cbc9e01807 Cache bust: force drift image rebuild to pick up auth middleware
All checks were successful
CI — P2 Drift (Go + Node) / saas (push) Successful in 26s
CI — P2 Drift (Go + Node) / agent (push) Successful in 44s
2026-03-01 22:59:34 +00:00
81d03c1735 Fix tenant slug collision: append random hex suffix to prevent 23505 on duplicate tenant names
All checks were successful
CI — P2 Drift (Go + Node) / saas (push) Successful in 34s
CI — P2 Drift (Go + Node) / agent (push) Successful in 1m6s
CI — P3 Alert / test (push) Successful in 37s
CI — P5 Cost / test (push) Successful in 29s
CI — P4 Portal / test (push) Successful in 48s
CI — P6 Run / saas (push) Successful in 25s
2026-03-01 22:36:21 +00:00
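
The slug fix in the commit above might look roughly like this. The slugify rules and the 3-byte suffix length are assumptions; the commit only says a random hex suffix is appended so that duplicate tenant names no longer trigger a 23505 (unique_violation) error.

```typescript
import { randomBytes } from "node:crypto";

// Lowercase, collapse runs of non-alphanumerics into "-", trim stray hyphens.
function slugify(name: string): string {
  return name.toLowerCase().trim().replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, "");
}

// Hypothetical helper: appends a random hex suffix so two tenants with the
// same display name get different slugs. suffixBytes = 3 yields 6 hex characters.
function tenantSlug(name: string, suffixBytes: number = 3): string {
  return `${slugify(name)}-${randomBytes(suffixBytes).toString("hex")}`;
}
```

A short suffix makes collisions unlikely rather than impossible, so keeping the unique constraint on the slug column still matters.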
e0d3a3c043 Add auth middleware to P2 Drift (signup/login/API keys), remove pino-pretty dev transport
All checks were successful
CI — P2 Drift (Go + Node) / agent (push) Successful in 53s
CI — P2 Drift (Go + Node) / saas (push) Successful in 52s
2026-03-01 22:24:18 +00:00
362c94af33 Fix Node Dockerfiles: npm ci --include=dev so tsc is available in builder stage
All checks were successful
CI — P2 Drift (Go + Node) / saas (push) Successful in 34s
CI — P3 Alert / test (push) Successful in 38s
CI — P4 Portal / test (push) Successful in 38s
CI — P6 Run / saas (push) Successful in 39s
CI — P2 Drift (Go + Node) / agent (push) Successful in 1m15s
CI — P5 Cost / test (push) Successful in 1m7s
2026-03-01 19:31:44 +00:00
27a89ee2b7 Trigger CI with tsc fix
Some checks failed
CI — P2 Drift (Go + Node) / agent (push) Failing after 3s
CI — P2 Drift (Go + Node) / saas (push) Successful in 29s
CI — P3 Alert / test (push) Successful in 40s
CI — P4 Portal / test (push) Successful in 32s
CI — P6 Run / saas (push) Successful in 30s
CI — P5 Cost / test (push) Successful in 46s
2026-03-01 06:56:00 +00:00
3e68e8871d Trigger CI for P2-SaaS, P4, P5, P6
Some checks failed
CI — P2 Drift (Go + Node) / agent (push) Failing after 1s
CI — P4 Portal / test (push) Failing after 17s
CI — P5 Cost / test (push) Failing after 15s
CI — P6 Run / saas (push) Failing after 15s
CI — P2 Drift (Go + Node) / saas (push) Successful in 43s
2026-03-01 06:52:14 +00:00
b9c480c06b Copy shared auth migration (002_auth.sql) to P1 route and P2 drift
2026-03-01 06:12:36 +00:00
5e0065e73e Fix P2 SaaS compilation: wire dispatchNotifications correctly, add P1/P2 Dockerfiles
- P2 processor: use correct dispatchNotifications signature (channels, notification, severity)
- P2 processor: pass pool to withTenant, fix implicit any types
- P1 Dockerfile: multi-stage Rust build for proxy/api/worker binaries
- P2 agent Dockerfile: multi-stage Go build
- P2 SaaS package-lock.json generated
- All 6 products now compile cleanly
2026-03-01 06:10:21 +00:00
b351f2f46b Implement P2 Resend email + PagerDuty Events v2 + Slack retry backoff
- Resend: HTML email with drift summary table and CTA button
- PagerDuty: Events API v2 with dedup_key, severity mapping, custom_details
- Slack: setTimeout retry on 429 rate limit instead of dropping
2026-03-01 05:51:28 +00:00
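
A rough sketch of the Slack behaviour described above: retrying on HTTP 429 after a delay instead of dropping the notification. The HttpResult shape, the sendWithRetry helper, the attempt limit, and the exponential fallback delay are all assumptions; the commit only states that a setTimeout-based retry is used.

```typescript
interface HttpResult { status: number; retryAfterSeconds?: number; }

// Hypothetical helper: calls send(), and on 429 waits before trying again.
// The sleep function is injectable so tests do not have to wait for real timers.
async function sendWithRetry(
  send: () => Promise<HttpResult>,
  maxAttempts: number = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<HttpResult> {
  let result = await send();
  for (let attempt = 1; result.status === 429 && attempt < maxAttempts; attempt++) {
    // Prefer the server's Retry-After hint; otherwise back off 1s, 2s, 4s, ...
    const delayMs = (result.retryAfterSeconds ?? 2 ** (attempt - 1)) * 1000;
    await sleep(delayMs);
    result = await send();
  }
  return result;
}
```

If the last attempt is still rate limited, the 429 result is returned to the caller, which can then decide whether to log it or queue the message.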
e1b22e5309 Wire up remaining TODO stubs: P3 test notifications, P2 drift notification trigger
- P3: test notification endpoint now instantiates real Slack/Email/Webhook notifiers
- P2: drift processor triggers notification service when drift_score > 0 (non-fatal on failure)
2026-03-01 04:14:26 +00:00
2c112b2fb1 Add vitest configs for P2-P6
2026-03-01 03:16:58 +00:00
bbbea3519e Add unit tests for P2 SaaS, P3 notifications, P4 search, P5 ingestion, P6 API
- P2: nonce validation, severity levels, RLS withTenant
- P3: notification dispatcher severity gating, Slack Block Kit emoji mapping
- P4: Meilisearch fallback, service CRUD validation, staged update actions
- P5: cost ingestion validation, snooze range, optimistic locking
- P6: runbook API validation, approval decisions, execution status machine, Slack signature
2026-03-01 03:15:31 +00:00
5d67de6486 Add dd0c/drift notifications, infra, CI: Slack Block Kit, Dockerfiles, Gitea Actions
- Notification service: Slack Block Kit (remediate/accept buttons), webhook delivery, rate limit handling
- Dispatcher with severity-based channel filtering
- Agent Dockerfile: multi-stage Go build, static binary
- SaaS Dockerfile: multi-stage Node build
- Fly.io config: scale-to-zero, shared-cpu
- Gitea Actions: Go test+vet, Node typecheck+test, cross-compile agent (linux/darwin/windows)
2026-03-01 02:46:47 +00:00
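
The severity-based channel filtering mentioned above can be illustrated as follows. The severity names, their ordering, and the Channel shape are assumptions made for the sketch; the commit does not spell them out.

```typescript
// Assumed severity scale, lowest to highest.
type Severity = "low" | "medium" | "high" | "critical";
const RANK: Record<Severity, number> = { low: 0, medium: 1, high: 2, critical: 3 };

// Hypothetical channel config: each channel declares the minimum severity it wants.
interface Channel { name: string; minSeverity: Severity; }

// Returns the channels that should receive a notification of the given severity.
function channelsFor(channels: Channel[], severity: Severity): Channel[] {
  return channels.filter((c) => RANK[severity] >= RANK[c.minSeverity]);
}
```

Under these assumptions, a "medium" drift event would reach a Slack channel configured with minSeverity "low" but not a pager channel configured with "critical".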
e67cef518e Scaffold dd0c/drift SaaS backend: Fastify, RLS, ingestion, dashboard API
- Fastify server with Zod validation, pino logging, CORS/helmet
- Drift report ingestion endpoint with nonce replay prevention
- Dashboard API: stacks list, drift history, report detail, summary stats
- PostgreSQL schema with RLS: tenants, users, agent_keys, drift_reports, remediation_actions
- withTenant() helper for safe connection pool tenant context management
- Config via Zod-validated env vars
2026-03-01 02:45:33 +00:00
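
As an illustration of the nonce replay prevention mentioned for the ingestion endpoint, here is an in-memory sketch. The NonceStore class and the 5-minute TTL are assumptions; the actual service may well keep nonces in PostgreSQL or Redis instead.

```typescript
// Remembers nonces for a limited time and refuses any nonce it has already seen.
class NonceStore {
  private seen = new Map<string, number>(); // nonce -> expiry timestamp (ms)
  private ttlMs: number;

  constructor(ttlMs: number = 5 * 60_000) {
    this.ttlMs = ttlMs; // assumed retention window
  }

  // Returns true the first time a nonce is seen within the TTL, false on replay.
  accept(nonce: string, now: number = Date.now()): boolean {
    // Drop expired entries so the map does not grow without bound.
    this.seen.forEach((expiry, n) => {
      if (expiry <= now) this.seen.delete(n);
    });
    if (this.seen.has(nonce)) return false;
    this.seen.set(nonce, now + this.ttlMs);
    return true;
  }
}
```

A nonce check like this is usually paired with a timestamp check on the report, so that a nonce whose entry has expired cannot be replayed with an old payload.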