Max
ef3d00f124
feat(run): add runbook parser, safety classifier, audit hash chain, trust levels
CI — P6 Run / saas (push) Successful in 29s
CI — P6 Run / build-push (push) Failing after 50s
- Multi-format parser: YAML, Markdown, Confluence HTML
- Deterministic safety scanner: destructive commands, privilege escalation, network changes
- Immutable audit trail with SHA-256 hash chain + verification endpoint
- Trust level enforcement: sandbox/restricted/standard/elevated
- 004_classifier_audit.sql migration
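Illustrative sketch of the hash-chain idea behind the audit trail, not the code from this commit: each entry stores the previous entry's SHA-256, so editing any earlier entry invalidates every hash after it. The `AuditEntry` shape, the `GENESIS` value, and the function names are assumptions made for the example.

```typescript
import { createHash } from "node:crypto";

// Hypothetical entry shape; the real audit schema is not shown in this log.
interface AuditEntry {
  action: string;
  actor: string;
  prevHash: string; // hash of the previous entry, or GENESIS for the first one
  hash: string;     // SHA-256 over [prevHash, action, actor]
}

const GENESIS = "0".repeat(64);

function entryHash(prevHash: string, action: string, actor: string): string {
  // Serializing the fields as a JSON array keeps field boundaries unambiguous.
  return createHash("sha256")
    .update(JSON.stringify([prevHash, action, actor]))
    .digest("hex");
}

function appendEntry(chain: AuditEntry[], action: string, actor: string): AuditEntry[] {
  const last = chain[chain.length - 1];
  const prevHash = last ? last.hash : GENESIS;
  const entry: AuditEntry = { action, actor, prevHash, hash: entryHash(prevHash, action, actor) };
  return [...chain, entry];
}

// Returns the index of the first broken entry, or -1 if the chain is intact.
function verifyChain(chain: AuditEntry[]): number {
  let expectedPrev = GENESIS;
  let index = 0;
  for (const e of chain) {
    const recomputed = entryHash(e.prevHash, e.action, e.actor);
    if (e.prevHash !== expectedPrev || e.hash !== recomputed) {
      return index;
    }
    expectedPrev = e.hash;
    index++;
  }
  return -1;
}
```

A verification endpoint could run something like `verifyChain` over the stored rows and report the first index that fails.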
2026-03-03 06:48:56 +00:00
Max
093890503c
feat(alert): add analytics, PagerDuty escalation, Slack interactions, daily noise report
CI — P6 Run / saas (push) Successful in 48s
CI — P6 Run / build-push (push) Failing after 48s
- Analytics API: MTTR by severity, noise reduction stats, incident trends
- PagerDuty auto-escalation for unacknowledged critical incidents
- Slack interactive handler: acknowledge, resolve, mark noise/helpful
- Daily noise report worker with Slack summary
- 005_analytics.sql migration (resolved_at, time-series indexes)
2026-03-03 06:41:25 +00:00
Max
f1f4dee7ab
feat(cost): add zombie hunter, Slack interactions, composite scoring
CI — P3 Alert / test (push) Successful in 28s
CI — P5 Cost / test (push) Successful in 42s
CI — P6 Run / saas (push) Successful in 41s
CI — P6 Run / build-push (push) Has been cancelled
CI — P3 Alert / build-push (push) Failing after 53s
CI — P5 Cost / build-push (push) Failing after 5s
- Zombie resource hunter: detects idle EC2/RDS/EBS/EIP/NAT resources
- Slack interactive handler: acknowledge, snooze, create-ticket actions
- Composite anomaly scorer: Z-Score + rate-of-change + pattern + novelty
- Cold-start fast path for new resources (<7 days data)
- 005_zombies.sql migration
2026-03-03 06:39:20 +00:00
Max
47a64d53fd
fix: align backend API routes with console frontend contract
CI — P3 Alert / test (push) Successful in 34s
CI — P4 Portal / test (push) Successful in 37s
CI — P5 Cost / test (push) Successful in 35s
CI — P6 Run / saas (push) Successful in 33s
CI — P5 Cost / build-push (push) Failing after 5s
CI — P6 Run / build-push (push) Failing after 4s
CI — P2 Drift (Go + Node) / agent (push) Successful in 1m5s
CI — P2 Drift (Go + Node) / saas (push) Successful in 37s
CI — P2 Drift (Go + Node) / build-push (push) Failing after 16s
CI — P3 Alert / build-push (push) Failing after 14s
CI — P4 Portal / build-push (push) Failing after 27s
2026-03-03 06:09:41 +00:00
Protocol dd0c Agent
76715d169e
fix: RLS auth bypass for signup/login flows
CI — P2 Drift (Go + Node) / saas (push) Successful in 26s
CI — P3 Alert / test (push) Successful in 23s
CI — P6 Run / build-push (push) Failing after 15s
CI — P2 Drift (Go + Node) / agent (push) Successful in 38s
CI — P4 Portal / test (push) Successful in 34s
CI — P5 Cost / test (push) Successful in 35s
CI — P6 Run / saas (push) Successful in 33s
CI — P2 Drift (Go + Node) / build-push (push) Failing after 50s
CI — P3 Alert / build-push (push) Failing after 5s
CI — P4 Portal / build-push (push) Failing after 51s
CI — P5 Cost / build-push (push) Failing after 15s
- Add set_config('app.tenant_id') before user INSERT in signup tx
- Add 004_auth_rls_fix.sql: permissive SELECT on users/api_keys for
auth lookups, INSERT on users with tenant context check
- db-setup now runs migrations on every up (idempotent)
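Illustrative sketch of setting the tenant context with a bind parameter, not the code from this commit. `set_config(name, value, is_local)` is a standard PostgreSQL function; the helper names and the `QuerySpec` type are assumptions. The unsafe variant is included only for contrast.

```typescript
// Query spec in the { text, values } shape that node-postgres accepts.
interface QuerySpec {
  text: string;
  values: string[];
}

// Unsafe variant for contrast: SET LOCAL cannot take bind parameters,
// so the tenant id has to be spliced into the SQL string.
function unsafeTenantContext(tenantId: string): string {
  return `SET LOCAL app.tenant_id = '${tenantId}'`;
}

// Parameterized variant: the third argument (is_local = true) scopes the
// setting to the current transaction, matching SET LOCAL semantics.
function tenantContextQuery(tenantId: string): QuerySpec {
  return {
    text: "SELECT set_config('app.tenant_id', $1, true)",
    values: [tenantId],
  };
}
```

In a signup transaction this query would run after `BEGIN` and before the user `INSERT`, so the RLS policy sees the tenant id when it checks the new row.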
2026-03-03 05:38:25 +00:00
Protocol dd0c Agent
1d068c3f75
fix: add maxmem to scrypt params (128MB)
CI — P2 Drift (Go + Node) / agent (push) Successful in 38s
CI — P2 Drift (Go + Node) / saas (push) Successful in 25s
CI — P3 Alert / test (push) Successful in 25s
CI — P4 Portal / test (push) Successful in 32s
CI — P5 Cost / test (push) Successful in 35s
CI — P6 Run / saas (push) Successful in 32s
CI — P2 Drift (Go + Node) / build-push (push) Failing after 16s
CI — P3 Alert / build-push (push) Failing after 15s
CI — P4 Portal / build-push (push) Failing after 40s
CI — P5 Cost / build-push (push) Failing after 41s
CI — P6 Run / build-push (push) Failing after 42s
Node's crypto.scrypt defaults to a 32 MiB memory limit (maxmem), but
N=65536/r=8/p=1 needs ~64 MiB (128 * N * r bytes). Adds
maxmem: 128*1024*1024 to the hash and verify functions of all 5 services.
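Illustrative sketch of the fix, not the repo's code: the scrypt parameters come from the commit message, while the salt length, key length, and `salt:key` storage format are assumptions for the example.

```typescript
import { randomBytes, scryptSync, timingSafeEqual } from "node:crypto";

// N, r, p and maxmem are taken from the commit message.
const SCRYPT_PARAMS = { N: 65536, r: 8, p: 1, maxmem: 128 * 1024 * 1024 };
const KEY_LEN = 64; // assumption

// scrypt needs roughly 128 * N * r bytes, which is 64 MiB for these parameters
// and therefore above Node's default maxmem of 32 MiB.
const requiredBytes = 128 * SCRYPT_PARAMS.N * SCRYPT_PARAMS.r;

function hashPassword(password: string): string {
  const salt = randomBytes(16);
  const key = scryptSync(password, salt, KEY_LEN, SCRYPT_PARAMS);
  return `${salt.toString("hex")}:${key.toString("hex")}`;
}

function verifyPassword(password: string, stored: string): boolean {
  const [saltHex, keyHex] = stored.split(":");
  if (!saltHex || !keyHex) return false;
  const expected = Buffer.from(keyHex, "hex");
  if (expected.length !== KEY_LEN) return false;
  const actual = scryptSync(password, Buffer.from(saltHex, "hex"), KEY_LEN, SCRYPT_PARAMS);
  return timingSafeEqual(actual, expected);
}
```

The same options object has to be passed on both the hash and the verify path, otherwise verification fails with the memory-limit error even though hashing works.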
2026-03-03 05:11:37 +00:00
5792f95d7c
Fix BMad adversarial security review findings
CI — P2 Drift (Go + Node) / agent (push) Successful in 47s
CI — P2 Drift (Go + Node) / saas (push) Successful in 36s
CI — P3 Alert / test (push) Successful in 36s
CI — P4 Portal / build-push (push) Failing after 49s
CI — P5 Cost / build-push (push) Failing after 4s
CI — P6 Run / build-push (push) Failing after 4s
CI — P4 Portal / test (push) Successful in 35s
CI — P5 Cost / test (push) Successful in 40s
CI — P6 Run / saas (push) Successful in 36s
CI — P2 Drift (Go + Node) / build-push (push) Failing after 17s
CI — P3 Alert / build-push (push) Failing after 15s
Resolves 11 of the 13 findings:
- [CRITICAL] SQLi in RLS: replaced SET LOCAL with parameterized set_config()
- [CRITICAL] Rate Limiting: installed and registered @fastify/rate-limit in all 5 apps
- [CRITICAL] Invite Hijacking: added email verification check to invite lookup
- [HIGH] Webhook HMAC: added Fastify rawBody parser to fix JSON.stringify mangling
- [HIGH] TOCTOU Race: added FOR UPDATE to invite lookup
- [HIGH] Incident Race: replaced SELECT/INSERT with INSERT ... ON CONFLICT
- [MEDIUM] Grafana Timing Attack: replaced === with crypto.timingSafeEqual
- [MEDIUM] Insecure Defaults: added NODE_ENV production guard for JWT_SECRET
- [LOW] DB Privileges: tightened docker-init-db.sh grants (removed ALL PRIVILEGES)
- [LOW] Plaintext Invites: tokens are now hashed (SHA-256) before DB storage/lookup
- [LOW] Scrypt: increased N parameter to 65536 for stronger password hashing
Note:
- Finding #4 (Fragmented Identity) requires a unified auth database architecture.
- Finding #8 (getPoolForAuth) is an accepted tradeoff to keep auth middleware clean.
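Illustrative sketch covering two of the findings above (Webhook HMAC and the timing attack), not the code from this commit: the signature is computed over the raw request bytes and compared with `crypto.timingSafeEqual`. The function names and the hex encoding of the signature are assumptions.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Sign the exact bytes received; re-serializing parsed JSON can change
// whitespace and produce a different signature.
function signBody(rawBody: Buffer, secret: string): string {
  return createHmac("sha256", secret).update(rawBody).digest("hex");
}

function verifySignature(rawBody: Buffer, secret: string, signatureHex: string): boolean {
  const expected = Buffer.from(signBody(rawBody, secret), "hex");
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on a length mismatch, so check the length first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}
```

Comparing with `===` can return early on the first differing character, which is the behaviour the timing-attack finding refers to.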
2026-03-03 00:14:39 +00:00
eb953cdea5
Security hardening: auth encapsulation, pool restriction, rate limiting, invites, async webhooks
CI — P2 Drift (Go + Node) / agent (push) Successful in 43s
CI — P2 Drift (Go + Node) / saas (push) Failing after 5s
CI — P3 Alert / test (push) Failing after 4s
CI — P4 Portal / test (push) Failing after 4s
CI — P5 Cost / test (push) Failing after 4s
CI — P6 Run / saas (push) Failing after 5s
CI — P2 Drift (Go + Node) / build-push (push) Failing after 7s
CI — P3 Alert / build-push (push) Has been skipped
CI — P4 Portal / build-push (push) Has been skipped
CI — P5 Cost / build-push (push) Has been skipped
CI — P6 Run / build-push (push) Failing after 5s
Phase 1 (Security Critical):
- Auth plugin encapsulation: replaced global addHook with Fastify plugin scope
- Removed startsWith URL matching; public routes registered outside auth scope
- JWT verify now enforces algorithms: ['HS256'] (prevents algorithm confusion)
- Raw pool no longer exported from db.ts; systemQuery() + getPoolForAuth() instead
- withTenant() remains primary tenant-scoped query path
Phase 2 (Infrastructure):
- docker-compose.yml: all secrets via env var substitution (${VAR:-default})
- Per-service Postgres users (dd0c_drift, dd0c_alert, etc.) in docker-init-db.sh
- .env.example with all configurable secrets
- build-push.sh uses $REGISTRY_PASSWORD instead of a hardcoded password
- .gitignore excludes .env files
- @fastify/rate-limit: 100 req/min global, 5/min login, 3/min signup
- CORS_ORIGIN default changed from '*' to 'http://localhost:5173'
Phase 3 (Product):
- Team invite flow: tenant_invites table, POST /invite, GET /invites, DELETE /invites/:id
- Signup accepts optional invite_token to join existing tenant
- Async webhook ingestion (P3): LPUSH to Redis, BRPOP worker, dead-letter queue
Console:
- All 5 product modules wired: drift, alert, portal, cost, run
- PageHeader accepts children prop
- 71 modules, 70KB gzipped production build
All 6 projects compile clean (tsc --noEmit).
2026-03-02 23:53:55 +00:00
be3f37cfdd
Fix CRITICAL auth bypass: exact match for login/signup paths
CI — P2 Drift (Go + Node) / agent (push) Successful in 45s
CI — P2 Drift (Go + Node) / saas (push) Successful in 28s
CI — P3 Alert / test (push) Successful in 24s
CI — P4 Portal / test (push) Successful in 27s
CI — P5 Cost / test (push) Successful in 26s
CI — P6 Run / saas (push) Successful in 25s
CI — P2 Drift (Go + Node) / build-push (push) Failing after 46s
CI — P3 Alert / build-push (push) Failing after 38s
CI — P4 Portal / build-push (push) Failing after 50s
CI — P5 Cost / build-push (push) Failing after 22s
CI — P6 Run / build-push (push) Failing after 1m3s
startsWith('/api/v1/auth/login') allowed any path with that prefix
to bypass authentication (e.g. /api/v1/auth/login-anything).
Changed to exact path match with query string stripping.
Fixed across all 5 products + shared/auth.ts.
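Illustrative sketch of an exact-match check with query string stripping, not the repo's code. The login path is from the commit message; the signup path and the names `PUBLIC_PATHS` and `isPublicPath` are assumptions.

```typescript
// Hypothetical allowlist; only the login path is quoted in the commit message.
const PUBLIC_PATHS = new Set(["/api/v1/auth/login", "/api/v1/auth/signup"]);

function isPublicPath(url: string): boolean {
  // Drop the query string, then require an exact match on the path.
  const queryStart = url.indexOf("?");
  const path = queryStart === -1 ? url : url.slice(0, queryStart);
  return PUBLIC_PATHS.has(path);
}
```

With this check, `/api/v1/auth/login?next=/home` is still public, while `/api/v1/auth/login-anything` is no longer treated as public.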
2026-03-02 20:35:28 +00:00
3be37d1293
Skip auth on /version endpoint (same as /health)
CI — P2 Drift (Go + Node) / agent (push) Successful in 9s
CI — P3 Alert / test (push) Successful in 21s
CI — P2 Drift (Go + Node) / saas (push) Successful in 35s
CI — P4 Portal / test (push) Successful in 25s
CI — P5 Cost / test (push) Successful in 37s
CI — P2 Drift (Go + Node) / build-push (push) Failing after 3s
CI — P6 Run / saas (push) Successful in 22s
CI — P3 Alert / build-push (push) Failing after 3s
CI — P4 Portal / build-push (push) Failing after 2s
CI — P5 Cost / build-push (push) Failing after 2s
CI — P6 Run / build-push (push) Failing after 3s
2026-03-02 13:54:46 +00:00
5bad2481ae
Add /version endpoint to all products + BUILD_SHA/BUILD_TIME in Dockerfiles
CI — P2 Drift (Go + Node) / saas (push) Successful in 34s
CI — P2 Drift (Go + Node) / build-push (push) Failing after 4s
CI — P3 Alert / build-push (push) Failing after 3s
CI — P6 Run / saas (push) Successful in 23s
CI — P4 Portal / build-push (push) Failing after 2s
CI — P2 Drift (Go + Node) / agent (push) Successful in 17s
CI — P3 Alert / test (push) Successful in 21s
CI — P5 Cost / test (push) Successful in 24s
CI — P4 Portal / test (push) Successful in 38s
CI — P5 Cost / build-push (push) Failing after 3s
CI — P6 Run / build-push (push) Failing after 2s
2026-03-02 13:53:15 +00:00
c4ec43cb76
Add CI build-push jobs targeting reg.dd0c.net with docker login + deploy
CI — P2 Drift (Go + Node) / saas (push) Successful in 26s
CI — P2 Drift (Go + Node) / agent (push) Successful in 53s
CI — P3 Alert / test (push) Successful in 26s
CI — P5 Cost / test (push) Successful in 22s
CI — P4 Portal / test (push) Successful in 38s
CI — P2 Drift (Go + Node) / build-push (push) Failing after 4s
CI — P3 Alert / build-push (push) Failing after 2s
CI — P6 Run / saas (push) Successful in 22s
CI — P5 Cost / build-push (push) Failing after 2s
CI — P4 Portal / build-push (push) Failing after 3s
CI — P6 Run / build-push (push) Failing after 2s
2026-03-02 13:48:10 +00:00
18d476f7a0
Target Nas runner (ubuntu-24.04) for build-push jobs — sandbox lacks Docker
CI — P2 Drift (Go + Node) / saas (push) Successful in 24s
CI — P2 Drift (Go + Node) / agent (push) Successful in 53s
CI — P3 Alert / test (push) Successful in 27s
CI — P5 Cost / test (push) Successful in 23s
CI — P4 Portal / test (push) Successful in 37s
CI — P6 Run / saas (push) Successful in 25s
CI — P2 Drift (Go + Node) / build-push (push) Failing after 17s
CI — P3 Alert / build-push (push) Failing after 17s
CI — P5 Cost / build-push (push) Failing after 11s
CI — P4 Portal / build-push (push) Failing after 14s
CI — P6 Run / build-push (push) Failing after 13s
2026-03-02 05:32:04 +00:00
2df0ce2fff
Trigger CI build+push to populate registry at 192.168.86.11:30095
CI — P4 Portal / test (push) Successful in 36s
CI — P6 Run / saas (push) Successful in 22s
CI — P3 Alert / build-push (push) Failing after 1s
CI — P5 Cost / build-push (push) Failing after 0s
CI — P6 Run / build-push (push) Failing after 0s
CI — P2 Drift (Go + Node) / saas (push) Successful in 27s
CI — P2 Drift (Go + Node) / agent (push) Successful in 52s
CI — P3 Alert / test (push) Successful in 26s
CI — P5 Cost / test (push) Successful in 24s
CI — P4 Portal / build-push (push) Failing after 0s
CI — P2 Drift (Go + Node) / build-push (push) Failing after 41s
2026-03-02 05:29:03 +00:00
4eda9d7be3
Add .dockerignore to all Node products (skip node_modules/dist/tests in build context)
CI — P2 Drift (Go + Node) / saas (push) Successful in 25s
CI — P2 Drift (Go + Node) / agent (push) Successful in 52s
CI — P3 Alert / test (push) Successful in 29s
CI — P5 Cost / test (push) Successful in 23s
CI — P4 Portal / test (push) Successful in 36s
CI — P6 Run / saas (push) Successful in 21s
2026-03-02 04:45:57 +00:00
81d03c1735
Fix tenant slug collision: append random hex suffix to prevent 23505 on duplicate tenant names
CI — P2 Drift (Go + Node) / saas (push) Successful in 34s
CI — P2 Drift (Go + Node) / agent (push) Successful in 1m6s
CI — P3 Alert / test (push) Successful in 37s
CI — P5 Cost / test (push) Successful in 29s
CI — P4 Portal / test (push) Successful in 48s
CI — P6 Run / saas (push) Successful in 25s
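Illustrative sketch of a slug with a random hex suffix, not the repo's code. 23505 is PostgreSQL's unique_violation error code; the suffix length, the slug rules, and the `tenantSlug` name are assumptions.

```typescript
import { randomBytes } from "node:crypto";

function tenantSlug(name: string): string {
  // Lowercase, collapse runs of non-alphanumerics to "-", trim stray dashes.
  const base =
    name
      .toLowerCase()
      .replace(/[^a-z0-9]+/g, "-")
      .replace(/^-+|-+$/g, "") || "tenant";
  // 3 random bytes give a 6-character hex suffix (length is an assumption).
  const suffix = randomBytes(3).toString("hex");
  return `${base}-${suffix}`;
}
```

A random suffix makes collisions unlikely rather than impossible, so a unique constraint on the slug column is still the final guard.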
2026-03-01 22:36:21 +00:00
362c94af33
Fix Node Dockerfiles: npm ci --include=dev so tsc is available in builder stage
CI — P2 Drift (Go + Node) / saas (push) Successful in 34s
CI — P3 Alert / test (push) Successful in 38s
CI — P4 Portal / test (push) Successful in 38s
CI — P6 Run / saas (push) Successful in 39s
CI — P2 Drift (Go + Node) / agent (push) Successful in 1m15s
CI — P5 Cost / test (push) Successful in 1m7s
2026-03-01 19:31:44 +00:00
27a89ee2b7
Trigger CI with tsc fix
CI — P2 Drift (Go + Node) / agent (push) Failing after 3s
CI — P2 Drift (Go + Node) / saas (push) Successful in 29s
CI — P3 Alert / test (push) Successful in 40s
CI — P4 Portal / test (push) Successful in 32s
CI — P6 Run / saas (push) Successful in 30s
CI — P5 Cost / test (push) Successful in 46s
2026-03-01 06:56:00 +00:00
3e68e8871d
Trigger CI for P2-SaaS, P4, P5, P6
CI — P2 Drift (Go + Node) / agent (push) Failing after 1s
CI — P4 Portal / test (push) Failing after 17s
CI — P5 Cost / test (push) Failing after 15s
CI — P6 Run / saas (push) Failing after 15s
CI — P2 Drift (Go + Node) / saas (push) Successful in 43s
2026-03-01 06:52:14 +00:00
68140881e0
Trigger CI for P3-P6 Node products
CI — P3 Alert / test (push) Failing after 15s
CI — P4 Portal / test (push) Failing after 19s
CI — P5 Cost / test (push) Failing after 17s
CI — P6 Run / saas (push) Failing after 18s
2026-03-01 06:43:58 +00:00
4146f1c4d0
Fix TypeScript compilation errors across P3-P6
- jwt.sign: explicit SignOptions cast for expiresIn (all 4 products)
- ioredis: use named import { Redis } instead of default (P4, P6)
- P4 catalog/service: fix import paths for aws-scanner and github-scanner
- P4 discovery: pass pool to ScheduledDiscovery constructor
- P6 agent-bridge: add explicit types for Redis message callback params
- All 4 Node products now compile cleanly with tsc --noEmit
2026-03-01 06:06:31 +00:00
cf4d1de9e7
Generate package-lock.json for all 4 Node products (required by npm ci in Dockerfiles)
2026-03-01 06:01:33 +00:00
c5f4246fe9
Implement P6 TODO stubs: runbook CRUD, execution triggers, approval flow, Slack bot
- Runbooks: list (paginated), get, create (with step counting), archive
- Executions: trigger with dry_run + variables, history, detail with audit trail
- Approvals: list pending, approve/reject with Redis pub/sub notification to agent
- Slack bot: approve_step/reject_step button handlers with DB updates + agent bridge
- All routes use withTenant() RLS
2026-03-01 03:21:06 +00:00
5ee869b9d8
Implement auth: login/signup (scrypt), API key generation, shared migration
- Login: email + password lookup, scrypt verify, JWT token
- Signup: create tenant + owner user in transaction, slug generation
- API key: dd0c_ prefix, SHA-256 hash (not bcrypt — faster for API key lookups), prefix index
- Scrypt over bcrypt: zero native deps, Node.js built-in crypto
- Auth routes skip JWT middleware (login/signup are public)
- 002_auth.sql: users + api_keys tables with RLS, copied to all products
- Synced auth middleware to P3/P4/P5/P6
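Illustrative sketch of the API key scheme described above, not the repo's code: a `dd0c_`-prefixed random key, a SHA-256 hash for storage, and a short prefix for an indexed lookup. The key length, the lookup prefix length, and the function names are assumptions.

```typescript
import { createHash, randomBytes } from "node:crypto";

const KEY_PREFIX = "dd0c_";
const LOOKUP_CHARS = 8; // assumption: random characters kept in the indexed prefix

interface GeneratedKey {
  plaintext: string; // shown to the user once, never stored
  prefix: string;    // stored in an indexed column to narrow the lookup
  hash: string;      // SHA-256 hex digest stored in place of the key
}

function hashApiKey(plaintext: string): string {
  return createHash("sha256").update(plaintext).digest("hex");
}

function generateApiKey(): GeneratedKey {
  const plaintext = KEY_PREFIX + randomBytes(32).toString("hex");
  return {
    plaintext,
    prefix: plaintext.slice(0, KEY_PREFIX.length + LOOKUP_CHARS),
    hash: hashApiKey(plaintext),
  };
}
```

A fast hash is reasonable here because the key is 32 random bytes rather than a human-chosen password, so there is nothing practical to brute-force.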
2026-03-01 03:19:18 +00:00
2c112b2fb1
Add vitest configs for P2-P6
2026-03-01 03:16:58 +00:00
2ceeac1a11
Add P2 SaaS CI, P4 scheduled discovery, P6 agent bridge (Redis pub/sub), Caddyfile
- P2: Gitea Actions CI for SaaS backend (separate from Go agent CI)
- P4: ScheduledDiscovery with Redis distributed lock to prevent concurrent scans
- P6: AgentBridge — Redis pub/sub for SaaS↔agent communication (approvals + step results)
- Caddyfile: self-hosted reverse proxy with auto-TLS for all 6 products
2026-03-01 03:16:33 +00:00
bbbea3519e
Add unit tests for P2 SaaS, P3 notifications, P4 search, P5 ingestion, P6 API
- P2: nonce validation, severity levels, RLS withTenant
- P3: notification dispatcher severity gating, Slack Block Kit emoji mapping
- P4: Meilisearch fallback, service CRUD validation, staged update actions
- P5: cost ingestion validation, snooze range, optimistic locking
- P6: runbook API validation, approval decisions, execution status machine, Slack signature
2026-03-01 03:15:31 +00:00
f2e0a32cc7
Wire auth middleware into all products, add docker-compose and init-db script
- Auth middleware (JWT + API key + RBAC) copied into P3/P4/P5/P6
- All server entry points now register auth hooks + auth routes
- Webhook and Slack endpoints skip JWT auth (use HMAC/signature)
- docker-compose.yml: shared Postgres + Redis + Meilisearch, all 4 Node products as services
- init-db.sh: creates per-product databases and runs migrations
- P1 (Rust) and P2 (Go agent) run standalone, not in compose
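Illustrative sketch of the signature check that lets Slack endpoints skip JWT auth, not the repo's code. It follows Slack's published v0 signing scheme: HMAC-SHA256 over `v0:<timestamp>:<raw body>`, sent as `v0=<hex>`. The function name, parameter order, and the five-minute replay window are assumptions.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

const MAX_SKEW_SECONDS = 60 * 5; // assumption: replay window

function verifySlackSignature(
  signingSecret: string,
  timestamp: string, // X-Slack-Request-Timestamp header
  rawBody: string,   // unparsed request body
  signature: string, // X-Slack-Signature header
  nowSeconds: number = Math.floor(Date.now() / 1000),
): boolean {
  // Reject stale or malformed timestamps to limit replay.
  const ts = Number(timestamp);
  if (!Number.isFinite(ts) || Math.abs(nowSeconds - ts) > MAX_SKEW_SECONDS) return false;

  const expected =
    "v0=" +
    createHmac("sha256", signingSecret).update(`v0:${timestamp}:${rawBody}`).digest("hex");

  const a = Buffer.from(expected, "utf8");
  const b = Buffer.from(signature, "utf8");
  return a.length === b.length && timingSafeEqual(a, b);
}
```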
2026-03-01 03:10:35 +00:00
2bbaa1efde
Add missing configs: CI workflows, tsconfigs, data layers for P4/P5/P6
2026-03-01 03:07:33 +00:00
57e7083986
Scaffold dd0c/run: Rust agent (classifier, executor, audit) + TypeScript SaaS
- Rust agent: clap CLI, command classifier (read-only/modifying/destructive), executor with approval gates, audit log entries
- Classifier: pattern-based safety classification for shell, AWS, kubectl, terraform/tofu commands
- 6 Rust tests: read-only, destructive, modifying, empty, terraform apply, tofu destroy
- SaaS backend: Fastify server, runbook CRUD API, approval API, Slack interactive handler
- Slack integration: signature verification, block_actions for approve/reject buttons
- PostgreSQL schema with RLS: runbooks, executions, audit_entries (append-only), agents
- Dual Dockerfiles: Rust multi-stage (agent), Node multi-stage (SaaS)
- Gitea Actions CI: Rust test+clippy, Node typecheck+test
- Fly.io config for SaaS
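Illustrative sketch of pattern-based command classification. The agent's classifier in this commit is written in Rust and its rules are not shown here, so this is a TypeScript approximation: every pattern, the fail-closed default of "modifying", and the `classify` name are assumptions.

```typescript
type Risk = "read-only" | "modifying" | "destructive";

// Hypothetical patterns. Destructive ones match anywhere in the command.
const DESTRUCTIVE: RegExp[] = [
  /\brm\s+-[a-zA-Z]*[rR]/,                  // rm -r, rm -rf
  /\bkubectl\s+delete\b/,
  /\b(terraform|tofu)\s+destroy\b/,
  /\baws\s+\S+\s+(delete|terminate)-/,
];

// Read-only patterns must match from the start of the command.
const READ_ONLY: RegExp[] = [
  /^(ls|cat|grep|ps|df)\b/,
  /^kubectl\s+(get|describe|logs)\b/,
  /^(terraform|tofu)\s+(plan|show|validate)\b/,
  /^aws\s+\S+\s+(describe|list|get)-/,
];

// Pipes, chaining, redirects, and substitution disqualify "read-only".
const SHELL_META = /[;&|<>$`]/;

function classify(command: string): Risk {
  const cmd = command.trim();
  if (DESTRUCTIVE.some((re) => re.test(cmd))) return "destructive";
  if (!SHELL_META.test(cmd) && READ_ONLY.some((re) => re.test(cmd))) return "read-only";
  // Fail closed: empty or unrecognized commands are treated as modifying.
  return "modifying";
}
```

Checking destructive patterns first means a command such as `ls; rm -rf /tmp/x` is not waved through because it starts with a harmless verb.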
2026-03-01 03:03:29 +00:00