Commit Graph

36 Commits

Author SHA1 Message Date
3be37d1293 Skip auth on /version endpoint (same as /health)
Some checks failed
CI — P2 Drift (Go + Node) / agent (push) Successful in 9s
CI — P3 Alert / test (push) Successful in 21s
CI — P2 Drift (Go + Node) / saas (push) Successful in 35s
CI — P4 Portal / test (push) Successful in 25s
CI — P5 Cost / test (push) Successful in 37s
CI — P2 Drift (Go + Node) / build-push (push) Failing after 3s
CI — P6 Run / saas (push) Successful in 22s
CI — P3 Alert / build-push (push) Failing after 3s
CI — P4 Portal / build-push (push) Failing after 2s
CI — P5 Cost / build-push (push) Failing after 2s
CI — P6 Run / build-push (push) Failing after 3s
2026-03-02 13:54:46 +00:00
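The exemption in 3be37d1293 (no auth on /version, same as /health) can be pictured as a small allow-list check that runs before the auth hook. This is a minimal sketch only: `PUBLIC_PATHS` and `requiresAuth` are hypothetical names, and the repository's actual middleware may be structured differently.

```typescript
// Hypothetical allow-list of paths that bypass authentication.
const PUBLIC_PATHS = new Set<string>(["/health", "/version"]);

// Returns true when a request URL should go through the auth middleware.
// The query string is ignored so "/version?x=1" is still treated as public.
function requiresAuth(url: string): boolean {
  const queryStart = url.indexOf("?");
  const path = queryStart === -1 ? url : url.slice(0, queryStart);
  return !PUBLIC_PATHS.has(path);
}
```

Matching on the exact path, rather than on a prefix, keeps routes such as /versions or /health/internal from being exposed by accident.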
5bad2481ae Add /version endpoint to all products + BUILD_SHA/BUILD_TIME in Dockerfiles
Some checks failed
CI — P2 Drift (Go + Node) / saas (push) Successful in 34s
CI — P2 Drift (Go + Node) / build-push (push) Failing after 4s
CI — P3 Alert / build-push (push) Failing after 3s
CI — P6 Run / saas (push) Successful in 23s
CI — P4 Portal / build-push (push) Failing after 2s
CI — P2 Drift (Go + Node) / agent (push) Successful in 17s
CI — P3 Alert / test (push) Successful in 21s
CI — P5 Cost / test (push) Successful in 24s
CI — P4 Portal / test (push) Successful in 38s
CI — P5 Cost / build-push (push) Failing after 3s
CI — P6 Run / build-push (push) Failing after 2s
2026-03-02 13:53:15 +00:00
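A /version handler of the kind added in 5bad2481ae would typically just echo the build metadata that the Dockerfile bakes in. The sketch below assumes BUILD_SHA and BUILD_TIME reach the process as environment variables; the `VersionInfo` shape, the field names, and the "unknown" fallback are illustrative assumptions, not the repository's actual response format.

```typescript
// Hypothetical response shape for a /version endpoint.
interface VersionInfo {
  sha: string;
  builtAt: string;
}

// The environment is passed in explicitly so the function is easy to test;
// a server would typically call versionInfo(process.env).
function versionInfo(env: Record<string, string | undefined>): VersionInfo {
  return {
    sha: env["BUILD_SHA"] ?? "unknown",      // fallback value is an assumption
    builtAt: env["BUILD_TIME"] ?? "unknown",
  };
}
```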
c4ec43cb76 Add CI build-push jobs targeting reg.dd0c.net with docker login + deploy
Some checks failed
CI — P2 Drift (Go + Node) / saas (push) Successful in 26s
CI — P2 Drift (Go + Node) / agent (push) Successful in 53s
CI — P3 Alert / test (push) Successful in 26s
CI — P5 Cost / test (push) Successful in 22s
CI — P4 Portal / test (push) Successful in 38s
CI — P2 Drift (Go + Node) / build-push (push) Failing after 4s
CI — P3 Alert / build-push (push) Failing after 2s
CI — P6 Run / saas (push) Successful in 22s
CI — P5 Cost / build-push (push) Failing after 2s
CI — P4 Portal / build-push (push) Failing after 3s
CI — P6 Run / build-push (push) Failing after 2s
2026-03-02 13:48:10 +00:00
18d476f7a0 Target Nas runner (ubuntu-24.04) for build-push jobs — sandbox lacks Docker
Some checks failed
CI — P2 Drift (Go + Node) / saas (push) Successful in 24s
CI — P2 Drift (Go + Node) / agent (push) Successful in 53s
CI — P3 Alert / test (push) Successful in 27s
CI — P5 Cost / test (push) Successful in 23s
CI — P4 Portal / test (push) Successful in 37s
CI — P6 Run / saas (push) Successful in 25s
CI — P2 Drift (Go + Node) / build-push (push) Failing after 17s
CI — P3 Alert / build-push (push) Failing after 17s
CI — P5 Cost / build-push (push) Failing after 11s
CI — P4 Portal / build-push (push) Failing after 14s
CI — P6 Run / build-push (push) Failing after 13s
2026-03-02 05:32:04 +00:00
2df0ce2fff Trigger CI build+push to populate registry at 192.168.86.11:30095
Some checks failed
CI — P4 Portal / test (push) Successful in 36s
CI — P6 Run / saas (push) Successful in 22s
CI — P3 Alert / build-push (push) Failing after 1s
CI — P5 Cost / build-push (push) Failing after 0s
CI — P6 Run / build-push (push) Failing after 0s
CI — P2 Drift (Go + Node) / saas (push) Successful in 27s
CI — P2 Drift (Go + Node) / agent (push) Successful in 52s
CI — P3 Alert / test (push) Successful in 26s
CI — P5 Cost / test (push) Successful in 24s
CI — P4 Portal / build-push (push) Failing after 0s
CI — P2 Drift (Go + Node) / build-push (push) Failing after 41s
2026-03-02 05:29:03 +00:00
2c9408b1df Fix indentation on registerWebhookSecretRoutes call
All checks were successful
CI — P3 Alert / test (push) Successful in 23s
2026-03-02 05:11:37 +00:00
9f46b84257 Add API to manage webhook secrets for alert integrations
All checks were successful
CI — P3 Alert / test (push) Successful in 15s
2026-03-02 05:00:40 +00:00
4eda9d7be3 Add .dockerignore to all Node products (skip node_modules/dist/tests in build context)
All checks were successful
CI — P2 Drift (Go + Node) / saas (push) Successful in 25s
CI — P2 Drift (Go + Node) / agent (push) Successful in 52s
CI — P3 Alert / test (push) Successful in 29s
CI — P5 Cost / test (push) Successful in 23s
CI — P4 Portal / test (push) Successful in 36s
CI — P6 Run / saas (push) Successful in 21s
2026-03-02 04:45:57 +00:00
81d03c1735 Fix tenant slug collision: append random hex suffix to prevent 23505 on duplicate tenant names
All checks were successful
CI — P2 Drift (Go + Node) / saas (push) Successful in 34s
CI — P2 Drift (Go + Node) / agent (push) Successful in 1m6s
CI — P3 Alert / test (push) Successful in 37s
CI — P5 Cost / test (push) Successful in 29s
CI — P4 Portal / test (push) Successful in 48s
CI — P6 Run / saas (push) Successful in 25s
2026-03-01 22:36:21 +00:00
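The slug fix in 81d03c1735 can be sketched as follows. 23505 is PostgreSQL's unique_violation error code, so the idea is to make two tenants with the same name produce different slugs. The helper names and the 6-character suffix length are assumptions made for illustration.

```typescript
import { randomBytes } from "node:crypto";

// Hypothetical helper: turn a tenant name into a URL-safe base slug.
function baseSlug(name: string): string {
  return name
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse runs of other characters into one hyphen
    .replace(/^-+|-+$/g, "");    // trim leading and trailing hyphens
}

// Append a random hex suffix so duplicate tenant names yield distinct slugs.
// suffixBytes = 3 gives 6 hex characters; the length is an assumption.
function tenantSlug(name: string, suffixBytes = 3): string {
  const suffix = randomBytes(suffixBytes).toString("hex");
  const base = baseSlug(name) || "tenant"; // fallback for names with no usable characters
  return `${base}-${suffix}`;
}
```

A random suffix makes collisions unlikely rather than impossible, so the unique constraint on the slug column is still worth keeping as a backstop.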
362c94af33 Fix Node Dockerfiles: npm ci --include=dev so tsc is available in builder stage
All checks were successful
CI — P2 Drift (Go + Node) / saas (push) Successful in 34s
CI — P3 Alert / test (push) Successful in 38s
CI — P4 Portal / test (push) Successful in 38s
CI — P6 Run / saas (push) Successful in 39s
CI — P2 Drift (Go + Node) / agent (push) Successful in 1m15s
CI — P5 Cost / test (push) Successful in 1m7s
2026-03-01 19:31:44 +00:00
27a89ee2b7 Trigger CI with tsc fix
Some checks failed
CI — P2 Drift (Go + Node) / agent (push) Failing after 3s
CI — P2 Drift (Go + Node) / saas (push) Successful in 29s
CI — P3 Alert / test (push) Successful in 40s
CI — P4 Portal / test (push) Successful in 32s
CI — P6 Run / saas (push) Successful in 30s
CI — P5 Cost / test (push) Successful in 46s
2026-03-01 06:56:00 +00:00
bfc599da52 Trigger CI after workflow rewrite
All checks were successful
CI — P3 Alert / test (push) Successful in 1m9s
2026-03-01 06:47:59 +00:00
68140881e0 Trigger CI for P3-P6 Node products
Some checks failed
CI — P3 Alert / test (push) Failing after 15s
CI — P4 Portal / test (push) Failing after 19s
CI — P5 Cost / test (push) Failing after 17s
CI — P6 Run / saas (push) Failing after 18s
2026-03-01 06:43:58 +00:00
4534f0aeba Fix test failures: HMAC length check (P3), fast-check fround (P5)
Some checks failed
CI — P3 Alert / test (push) Failing after 15s
CI — P5 Cost / test (push) Failing after 15s
- P3: timingSafeEqual requires equal-length buffers; add length guard before compare
- P5: fast-check fc.float requires 32-bit floats; wrap min with Math.fround()
- All 5 Node products: 83 tests passing across 13 test files
2026-03-01 06:24:46 +00:00
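The P3 fix in 4534f0aeba relies on the fact that Node's `crypto.timingSafeEqual` throws when its two inputs differ in byte length, so a malformed signature must be rejected before the comparison. A minimal sketch of such a guard follows; the function name, the hex encoding, and the choice of SHA-256 are assumptions, not details taken from the repository.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical verifier: checks a hex-encoded HMAC signature over a request body.
function verifySignature(secret: string, body: string, signatureHex: string): boolean {
  const expected = createHmac("sha256", secret).update(body).digest();
  const received = Buffer.from(signatureHex, "hex");

  // Length guard: timingSafeEqual throws a RangeError on unequal lengths,
  // so reject short, long, or non-hex signatures here instead.
  if (received.length !== expected.length) {
    return false;
  }
  return timingSafeEqual(received, expected);
}
```

The length check does not weaken the constant-time comparison, because the digest length is public knowledge for any given algorithm.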
6403e7a3bf Move CI workflows to repo root .gitea/workflows/ (Gitea requires root location)
Some checks failed
CI — P3 Alert / test (push) Has been cancelled
CI — P5 Cost / test (push) Has been cancelled
CI — P2 Drift (Go + Node) / agent (push) Failing after 46s
CI — P2 Drift (Go + Node) / saas (push) Failing after 1m17s
CI — P4 Portal / test (push) Failing after 16s
CI — P6 Run / saas (push) Failing after 17s
CI — P1 Route (Rust) / test (push) Failing after 11m13s
- 6 per-product CI workflows with path filters
- P1: Rust (cargo test + clippy + fmt)
- P2: Go agent (go test + vet) + Node SaaS (tsc + npm test)
- P3-P6: Node (npm ci + tsc + npm test)
- Removed old per-product .gitea dirs (Gitea ignores non-root workflows)
2026-03-01 06:19:42 +00:00
4146f1c4d0 Fix TypeScript compilation errors across P3-P6
- jwt.sign: explicit SignOptions cast for expiresIn (all 4 products)
- ioredis: use named import { Redis } instead of default (P4, P6)
- P4 catalog/service: fix import paths for aws-scanner and github-scanner
- P4 discovery: pass pool to ScheduledDiscovery constructor
- P6 agent-bridge: add explicit types for Redis message callback params
- All 4 Node products now compile cleanly with tsc --noEmit
2026-03-01 06:06:31 +00:00
cf4d1de9e7 Generate package-lock.json for all 4 Node products (required by npm ci in Dockerfiles)
2026-03-01 06:01:33 +00:00
e1b22e5309 Wire up remaining TODO stubs: P3 test notifications, P2 drift notification trigger
- P3: test notification endpoint now instantiates real Slack/Email/Webhook notifiers
- P2: drift processor triggers notification service when drift_score > 0 (non-fatal on failure)
2026-03-01 04:14:26 +00:00
5ee869b9d8 Implement auth: login/signup (scrypt), API key generation, shared migration
- Login: email + password lookup, scrypt verify, JWT token
- Signup: create tenant + owner user in transaction, slug generation
- API key: dd0c_ prefix, SHA-256 hash (not bcrypt — faster for API key lookups), prefix index
- Scrypt over bcrypt: zero native deps, Node.js built-in crypto
- Auth routes skip JWT middleware (login/signup are public)
- 002_auth.sql: users + api_keys tables with RLS, copied to all products
- Synced auth middleware to P3/P4/P5/P6
2026-03-01 03:19:18 +00:00
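The API key scheme described in 5ee869b9d8 (dd0c_ prefix, SHA-256 hash, prefix index) might look roughly like the sketch below. The key length, the 13-character lookup prefix, and all function and field names are assumptions. A fast hash such as SHA-256 is generally considered acceptable here because the key is long and random, unlike a user-chosen password.

```typescript
import { createHash, randomBytes, timingSafeEqual } from "node:crypto";

// Hypothetical record returned when a key is created.
interface GeneratedApiKey {
  plaintext: string; // shown to the user once, never stored
  prefix: string;    // stored and indexed to find the candidate row
  hash: string;      // hex SHA-256 of the full key, stored
}

function sha256Hex(value: string): string {
  return createHash("sha256").update(value).digest("hex");
}

function generateApiKey(): GeneratedApiKey {
  const secret = randomBytes(24).toString("hex"); // 48 hex chars; length is an assumption
  const plaintext = `dd0c_${secret}`;
  const prefix = plaintext.slice(0, 13);          // "dd0c_" plus 8 chars; an assumption
  return { plaintext, prefix, hash: sha256Hex(plaintext) };
}

// Compare a presented key against the stored hash of the row found via the prefix.
function matchesStoredHash(presentedKey: string, storedHashHex: string): boolean {
  const presented = Buffer.from(sha256Hex(presentedKey), "hex");
  const stored = Buffer.from(storedHashHex, "hex");
  return presented.length === stored.length && timingSafeEqual(presented, stored);
}
```

In this sketch a lookup would first select rows by `prefix` and then call `matchesStoredHash` on the result, which avoids hashing against every stored key.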
bdaa732ce1 Implement TODO stubs: webhook secret lookup, alert→incident wiring, catalog upsert/stage
- P3: getWebhookSecret() now queries DB; ingestAlert() creates/attaches incidents, auto-resolves on resolved status
- P4: stageUpdates() writes to staged_updates table; upsertService() with ON CONFLICT; getService/updateOwner implemented
2026-03-01 03:18:05 +00:00
2c112b2fb1 Add vitest configs for P2-P6
2026-03-01 03:16:58 +00:00
bbbea3519e Add unit tests for P2 SaaS, P3 notifications, P4 search, P5 ingestion, P6 API
- P2: nonce validation, severity levels, RLS withTenant
- P3: notification dispatcher severity gating, Slack Block Kit emoji mapping
- P4: Meilisearch fallback, service CRUD validation, staged update actions
- P5: cost ingestion validation, snooze range, optimistic locking
- P6: runbook API validation, approval decisions, execution status machine, Slack signature
2026-03-01 03:15:31 +00:00
3326d9a714 Add .gitignore files for P2-P6
2026-03-01 03:14:37 +00:00
829e408e1e Add notification dispatchers (P3 Slack/Email/Webhook, P5 Slack), full YAML parser for P6
- P3 alert: NotificationDispatcher with Slack Block Kit, Resend email, generic webhook; severity-gated dispatch
- P5 cost: CostSlackNotifier with anomaly Block Kit (score, deviation, snooze/expected buttons)
- P6 run: Full YAML runbook parser with serde_yaml, variable substitution ({{var}}), failure actions, 7 tests
- P6 parser: validates non-empty steps, default timeout (300s), default abort on failure
2026-03-01 03:13:06 +00:00
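The P6 parser rules listed in 829e408e1e ({{var}} substitution, non-empty steps, 300s default timeout, abort by default) are sketched below in TypeScript. The mention of serde_yaml suggests the real parser is written in Rust, so this only illustrates the rules, not the implementation. All type and function names are hypothetical, as are the "continue" action and the choice to throw on an undefined variable.

```typescript
type FailureAction = "abort" | "continue"; // "continue" is an assumed alternative

interface StepInput {
  command: string;
  timeoutSeconds?: number;
  onFailure?: FailureAction;
}

interface Step {
  command: string;
  timeoutSeconds: number;
  onFailure: FailureAction;
}

// Replace every {{name}} placeholder with its value from vars.
function substitute(template: string, vars: Record<string, string>): string {
  return template.replace(
    /\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}/g,
    (_match: string, name: string): string => {
      const value = Object.prototype.hasOwnProperty.call(vars, name) ? vars[name] : undefined;
      if (value === undefined) {
        throw new Error(`undefined runbook variable: ${name}`); // strictness is an assumption
      }
      return value;
    },
  );
}

// Validate the step list and fill in the defaults named in the commit message.
function normalizeSteps(steps: StepInput[], vars: Record<string, string>): Step[] {
  if (steps.length === 0) {
    throw new Error("runbook must contain at least one step");
  }
  return steps.map((step): Step => ({
    command: substitute(step.command, vars),
    timeoutSeconds: step.timeoutSeconds ?? 300,
    onFailure: step.onFailure ?? "abort",
  }));
}
```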
f2e0a32cc7 Wire auth middleware into all products, add docker-compose and init-db script
- Auth middleware (JWT + API key + RBAC) copied into P3/P4/P5/P6
- All server entry points now register auth hooks + auth routes
- Webhook and Slack endpoints skip JWT auth (use HMAC/signature)
- docker-compose.yml: shared Postgres + Redis + Meilisearch, all 4 Node products as services
- init-db.sh: creates per-product databases and runs migrations
- P1 (Rust) and P2 (Go agent) run standalone, not in compose
2026-03-01 03:10:35 +00:00
d85cdaa3e7 Flesh out dd0c/alert: webhook routes, incident API, notification config, data layer
- Webhook routes: Datadog, PagerDuty, OpsGenie, Grafana with per-tenant HMAC/token auth
- Incident API: list (filtered), detail with alerts, acknowledge/resolve/suppress, dashboard summary
- Notification config: CRUD with upsert, test endpoint, Slack/email/webhook channels
- Grafana normalizer: severity mapping (critical/warning/info)
- Data layer: withTenant() RLS wrapper, Zod config validation
- Fastify server entry point with cors/helmet
2026-03-01 03:04:57 +00:00
ccc4cd1c32 Scaffold dd0c/alert: ingestion, correlation engine, HMAC validation, tests
- Webhook ingestion: HMAC validation for Datadog/PagerDuty/OpsGenie with 5-min timestamp freshness
- Payload normalizers: canonical alert schema with severity mapping per provider
- Correlation engine: time-window grouping, late-alert attachment (2x window), FakeClock for testing
- InMemoryWindowStore for unit tests
- Tests: 12 HMAC validation cases, 5 normalizer cases, 7 correlation engine cases
- PostgreSQL schema with RLS: tenants, incidents, alerts, webhook_secrets, notification_configs
- Free tier enforcement columns (alert_count_month, reset_at)
- Fly.io config, Dockerfile, Gitea Actions CI
2026-03-01 02:49:14 +00:00
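The correlation behaviour described in ccc4cd1c32 (time-window grouping, late-alert attachment up to 2x the window, and a FakeClock for tests) can be sketched as below. This is one possible reading of the commit message: the grouping key, the in-memory map, and every name except FakeClock are assumptions.

```typescript
interface Clock {
  now(): number; // milliseconds
}

// Test double that only moves when told to, so window logic is deterministic.
class FakeClock implements Clock {
  current = 0;
  now(): number {
    return this.current;
  }
  advance(ms: number): void {
    this.current += ms;
  }
}

interface Alert {
  id: string;
  groupKey: string; // hypothetical correlation key, e.g. service plus check name
}

interface Incident {
  groupKey: string;
  openedAt: number;
  alerts: Alert[];
}

interface IngestResult {
  incident: Incident;
  late: boolean; // true when attached after the window but within 2x the window
}

class CorrelationEngine {
  readonly open = new Map<string, Incident>();
  readonly clock: Clock;
  readonly windowMs: number;

  constructor(clock: Clock, windowMs: number) {
    this.clock = clock;
    this.windowMs = windowMs;
  }

  ingest(alert: Alert): IngestResult {
    const now = this.clock.now();
    const existing = this.open.get(alert.groupKey);
    if (existing !== undefined) {
      const age = now - existing.openedAt;
      if (age <= 2 * this.windowMs) {
        existing.alerts.push(alert);
        return { incident: existing, late: age > this.windowMs };
      }
    }
    // No incident for this key, or the old one is past 2x the window: start a new one.
    const incident: Incident = { groupKey: alert.groupKey, openedAt: now, alerts: [alert] };
    this.open.set(alert.groupKey, incident);
    return { incident, late: false };
  }
}
```

Injecting the clock through the constructor is what lets a test step through the window boundaries with `advance` instead of sleeping.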
72a0f26a7b Add BMad review epic addendums for all 6 products
Per-product surgical additions to existing epics (not cross-cutting):
- P1 route: 8pts (key redaction, SSE billing, token math, CI runner)
- P2 drift: 12pts (mTLS revocation, state lock recovery, pgmq visibility, RLS leak, entropy scrubber)
- P3 alert: 10pts (HMAC replay, claim-check, out-of-order correlation, free tier, tenant isolation)
- P4 portal: 9pts (partial scan recovery, ownership conflicts, Meilisearch rebuild, VCR freshness, free tier)
- P5 cost: 7pts (concurrent baselines, remediation RBAC, Clock interface, property tests, Redis fallback)
- P6 run: 15pts (shell AST parsing, canary suite, intervention TTL, streaming audit, crypto signatures)

Total: 61 story points across 30 new stories
2026-03-01 02:27:55 +00:00
d038cd9c5c Implement BMad Must-Have Before Launch fixes for all 6 products
P1: API key redaction, SSE billing leak, token math edge cases, CI runner config
P2: mTLS revocation lockout, terraform state lock recovery, RLS pool leak, entropy scrubber, pgmq visibility
P3: HMAC replay prevention, cross-tenant negative tests, correlation window edge cases, SQS claim-check, free tier
P4: Discovery partial failure recovery, ownership conflict integration test, VCR freshness CI, Meilisearch rebuild, Cmd+K latency
P5: Concurrent baseline conflicts, remediation RBAC, Clock interface for governance, 10K property-based runs, Redis panic fallback
P6: Cryptographic agent update signatures, streaming audit logs with WAL, shell AST parsing (mvdan/sh), intervention deadlock TTL, canary suite CI gate
2026-03-01 02:14:04 +00:00
b24cfa7c0d BMad code reviews complete for all 6 products
P1 route: Gemini — 'Ship the proxy, stop writing tests for the tests'
P2 drift: Gemini — mTLS revocation, state lock corruption, RLS pool leak
P3 alert: Gemini — replay attacks, trace propagation, SQS claim-check
P4 portal: Manual — discovery reliability is existential risk
P5 cost: Manual — concurrent baselines, remediation RBAC, pricing staleness
P6 run: Gemini — policy update loophole, AST parsing, audit streaming
2026-03-01 02:09:19 +00:00
c3bafa238a Add dual-mode deployment addendums for all 6 products
P1 route: 16 pts (template, full docker-compose + install script)
P2 drift: 17 pts (pgmq, local CA for mTLS)
P3 alert: 19 pts (Lambda→Fastify, DynamoDB→PG JSONB)
P4 portal: 18 pts (Step Functions→cron, Aurora→PG+pgvector)
P5 cost: 19 pts (EventBridge→agent/polling, DynamoDB→PG JSONB)
P6 run: 15 pts (easiest — already PG-native, no AWS deps in core)

Total self-hosted effort: ~104 story points across all 6 products
2026-03-01 02:00:00 +00:00
4938674c20 Phase 3: BDD acceptance specs for P2 (drift), P3 (alert), P6 (run)
P2: 2,245 lines, 10 epics — Sonnet subagent (8min)
P3: 1,653 lines, 10 epics — Sonnet subagent (6min)
P6: 2,303 lines, 262 scenarios, 10 epics — Sonnet subagent (7min)
P4 (portal) still in progress
2026-03-01 01:54:35 +00:00
03bfe931fc Implement review remediation + PLG analytics SDK
- All 6 test architectures patched with Section 11 addendums
- P5 (cost) fully rewritten from 232 to ~600 lines
- PLG brainstorm + party mode advisory board results
- Analytics SDK v2 (PostHog Cloud, Zod strict, Lambda-safe)
- Analytics tests v2 (safeParse, no , no timestamp, no PII)
- Addresses all Gemini review findings across P1-P6
2026-03-01 01:42:49 +00:00
2fe0ed856e Add Gemini TDD reviews for all 6 products
P1, P2, P3, P4, P6 reviewed by Gemini subagents.
P5 reviewed manually (Gemini credential errors).
All reviews flag coverage gaps, anti-patterns, and Transparent Factory tenet gaps.
2026-03-01 00:29:24 +00:00
1101fef096 Update test architectures for P3, P4, P5
2026-02-28 23:33:07 +00:00
5ee95d8b13 dd0c: full product research pipeline - 6 products, 8 phases each
Products: route, drift, alert, portal, cost, run
Phases: brainstorm, design-thinking, innovation-strategy, party-mode,
        product-brief, architecture, epics (incl. Epic 10 TF compliance),
        test-architecture (TDD strategy)

Brand strategy and market research included.
2026-02-28 17:35:02 +00:00