Max Mayfield 72a0f26a7b Add BMad review epic addendums for all 6 products

Per-product surgical additions to existing epics (not cross-cutting):
- P1 route: 8pts (key redaction, SSE billing, token math, CI runner)
- P2 drift: 12pts (mTLS revocation, state lock recovery, pgmq visibility, RLS leak, entropy scrubber)
- P3 alert: 10pts (HMAC replay, claim-check, out-of-order correlation, free tier, tenant isolation)
- P4 portal: 9pts (partial scan recovery, ownership conflicts, Meilisearch rebuild, VCR freshness, free tier)
- P5 cost: 7pts (concurrent baselines, remediation RBAC, Clock interface, property tests, Redis fallback)
- P6 run: 15pts (shell AST parsing, canary suite, intervention TTL, streaming audit, crypto signatures)

Total: 61 story points across 30 new stories

2026-03-01 02:27:55 +00:00


dd0c/drift — Epic Addendum (BMad Review Findings)

Source: BMad Code Review (March 1, 2026)
Approach: Surgical additions to existing epics; no new epics created.

Epic 2 Addendum: Agent Communication

Story 2.7: mTLS Revocation — Instant Lockout

As a security-conscious platform operator, I want revoked agent certificates to be instantly locked out (including active connections), so that a compromised agent cannot continue sending data.

Acceptance Criteria:

CRL refresh triggers within 30 seconds of cert revocation.
Existing mTLS connections from revoked certs are terminated (not just new connections rejected).
New connection attempts with revoked certs return TLS handshake failure.
Payload replay with captured nonce returns HTTP 409 Conflict.

Estimate: 3 points
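
The criteria above could be exercised against something like the following minimal sketch. It is not the agent's actual implementation: `RevocationEnforcer`, its method names, and the in-memory nonce set are all hypothetical, and a real server would fail the TLS handshake rather than return `False`.

```python
from typing import Callable, Dict, Iterable, List, Set


class RevocationEnforcer:
    """Hypothetical helper: tracks live connections per cert serial and seen nonces."""

    def __init__(self) -> None:
        self._revoked: Set[str] = set()
        self._active: Dict[str, List[Callable[[], None]]] = {}
        self._seen_nonces: Set[str] = set()

    def register(self, serial: str, close: Callable[[], None]) -> bool:
        """Admit a new connection unless its certificate is revoked."""
        if serial in self._revoked:
            return False  # stands in for a TLS handshake failure
        self._active.setdefault(serial, []).append(close)
        return True

    def apply_crl(self, revoked_serials: Iterable[str]) -> int:
        """Load a refreshed CRL and close live connections from revoked certs."""
        self._revoked = set(revoked_serials)
        terminated = 0
        for serial in list(self._active):
            if serial in self._revoked:
                for close in self._active.pop(serial):
                    close()  # terminate the existing connection, not just new ones
                    terminated += 1
        return terminated

    def status_for_nonce(self, nonce: str) -> int:
        """Return 409 for a nonce that was already used, 200 otherwise."""
        if nonce in self._seen_nonces:
            return 409
        self._seen_nonces.add(nonce)
        return 200
```

A production nonce store would presumably need expiry and shared storage across instances; the unbounded set here only illustrates the replay check.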

Epic 3 Addendum: Drift Analysis Engine

Story 3.8: Terraform State Lock Recovery on Panic

As a customer, I want the panic button to safely release Terraform state locks, so that hitting "stop" doesn't brick my infrastructure.

Acceptance Criteria:

Panic mode triggers terraform force-unlock if normal unlock fails.
State lock is verified released within 10 seconds of panic signal.
Agent logs the force-unlock attempt for audit trail.
If both unlock methods fail, agent alerts the admin with the lock ID for manual recovery.

Estimate: 3 points
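
One possible shape for the recovery flow described above, as a sketch only. The function name and all four callables (`unlock`, `force_unlock`, `is_locked`, `alert_admin`) are hypothetical injection points; in practice `force_unlock` might wrap `terraform force-unlock -force LOCK_ID`. Treating a missed 10-second deadline the same as a failed unlock is an assumption of this sketch.

```python
import logging
import time
from typing import Callable

log = logging.getLogger("drift.panic")  # hypothetical logger name


def _attempt(fn: Callable[[], bool]) -> bool:
    """Run an unlock step, treating any exception as failure."""
    try:
        return bool(fn())
    except Exception:
        return False


def release_lock_on_panic(
    lock_id: str,
    unlock: Callable[[], bool],
    force_unlock: Callable[[str], bool],
    is_locked: Callable[[], bool],
    alert_admin: Callable[[str], None],
    deadline_s: float = 10.0,
) -> bool:
    """Try a normal unlock, fall back to force-unlock, verify, else alert."""
    started = time.monotonic()
    released = _attempt(unlock)
    if not released:
        # Audit trail: record that a force-unlock was attempted and how it ended.
        log.warning("normal unlock failed; attempting force-unlock of lock %s", lock_id)
        released = _attempt(lambda: force_unlock(lock_id))
        log.warning("force-unlock of lock %s %s", lock_id,
                    "succeeded" if released else "failed")
    within_deadline = (time.monotonic() - started) <= deadline_s
    verified = released and not is_locked() and within_deadline
    if not verified:
        alert_admin(lock_id)  # hand the lock ID to a human for manual recovery
    return verified
```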

Story 3.9: pgmq Visibility Timeout for Long Scans

As a self-hosted operator, I want long-running drift scans to extend their pgmq visibility timeout, so that a second worker doesn't pick up the same job mid-scan.

Acceptance Criteria:

Worker extends visibility by 2 minutes every 90 seconds during processing.
No duplicate processing occurs for scans taking up to 15 minutes.
If a worker crashes without extending, the job becomes visible again after the timeout (the intended behavior).

Estimate: 2 points
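
The numbers above leave a 30-second margin: each 120-second extension is renewed after 90 seconds. A heartbeat along these lines is one way to sketch it. `VisibilityHeartbeat` and the `extend` callable are hypothetical; with pgmq, `extend` would presumably run something like `SELECT pgmq.set_vt(queue_name, msg_id, 120)`.

```python
import threading
from typing import Callable


class VisibilityHeartbeat:
    """Hypothetical context manager that periodically extends a job's visibility timeout."""

    def __init__(self, extend: Callable[[int], None],
                 interval_s: float = 90.0, extension_s: int = 120) -> None:
        self._extend = extend
        self._interval_s = interval_s
        self._extension_s = extension_s
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self) -> None:
        # Event.wait returns False on timeout, so we extend once per interval
        # and exit promptly once stop is set.
        while not self._stop.wait(self._interval_s):
            self._extend(self._extension_s)

    def __enter__(self) -> "VisibilityHeartbeat":
        self._thread.start()
        return self

    def __exit__(self, *exc) -> None:
        self._stop.set()
        self._thread.join()
```

Wrapping the scan in `with VisibilityHeartbeat(extend): run_scan()` keeps the job hidden while the worker is alive. If the process dies, the heartbeat dies with it and the job reappears after the last extension lapses.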

Epic 5 Addendum: Dashboard API

Story 5.8: RLS Connection Pool Leak Prevention

As a multi-tenant SaaS, I want PgBouncer to clear tenant context between requests, so that Tenant A's drift data never leaks to Tenant B.

Acceptance Criteria:

The app.tenant_id value set via SET LOCAL is cleared when the connection returns to the pool.
100 concurrent tenant requests produce zero cross-tenant data leakage.
Stress test with interleaved tenant requests on same PgBouncer connection passes.

Estimate: 2 points
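
A minimal sketch of transaction-scoped tenant context, assuming a DB-API connection (psycopg-style `%s` placeholders) that is not in autocommit mode. `run_as_tenant` is a hypothetical helper. PostgreSQL's `set_config(name, value, true)` has the same effect as `SET LOCAL`, so the setting reverts at commit or rollback, before the connection can serve another tenant.

```python
from typing import Any, List, Sequence


def run_as_tenant(conn: Any, tenant_id: str, sql: str,
                  params: Sequence[Any] = ()) -> List[Any]:
    """Run one query with app.tenant_id scoped to a single transaction."""
    cur = conn.cursor()
    try:
        # is_local=true: the value lasts only until the end of this transaction.
        cur.execute("SELECT set_config('app.tenant_id', %s, true)", (tenant_id,))
        cur.execute(sql, params)
        rows = cur.fetchall()
        conn.commit()  # ends the transaction and clears the tenant context
        return rows
    except Exception:
        conn.rollback()  # a rollback clears the tenant context as well
        raise
    finally:
        cur.close()
```

The design choice illustrated here is to never use a session-level `SET`, since a session-level value would survive on the pooled server connection after the request finishes.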

Epic 10 Addendum: Transparent Factory Compliance

Story 10.6: Secret Scrubber Entropy Scanning

As a security-first platform, I want the secret scrubber to detect high-entropy strings (not just regex patterns), so that Base64-encoded keys and custom tokens are caught.

Acceptance Criteria:

Shannon entropy > 3.5 bits/char on strings > 20 chars triggers redaction.
Base64-encoded AWS keys detected and scrubbed.
Multi-line RSA private keys detected and replaced with [REDACTED RSA KEY].
Normal (low-entropy) log messages are not redacted as false positives.

Estimate: 2 points
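
The entropy rule above can be sketched as follows. The 3.5 bits/char threshold, the 20-character minimum, and the `[REDACTED RSA KEY]` marker come from the story. The candidate-token character class, the generic `[REDACTED]` placeholder, and the function names are assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Multi-line PEM block; DOTALL lets "." span newlines.
RSA_KEY_RE = re.compile(
    r"-----BEGIN RSA PRIVATE KEY-----.*?-----END RSA PRIVATE KEY-----",
    re.DOTALL,
)
# Candidate tokens: runs of more than 20 Base64/URL-safe characters (assumed alphabet).
TOKEN_RE = re.compile(r"[A-Za-z0-9+/=_-]{21,}")


def shannon_entropy(s: str) -> float:
    """Shannon entropy of s in bits per character."""
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def scrub(text: str, threshold: float = 3.5) -> str:
    """Redact RSA private keys, then any long token whose entropy exceeds threshold."""
    text = RSA_KEY_RE.sub("[REDACTED RSA KEY]", text)

    def _redact(match: "re.Match[str]") -> str:
        token = match.group(0)
        return "[REDACTED]" if shannon_entropy(token) > threshold else token

    return TOKEN_RE.sub(_redact, text)
```

For example, `scrub("aws_secret: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY")` returns `"aws_secret: [REDACTED]"`, while a line of ordinary words passes through unchanged because no single token exceeds 20 characters.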

Total Addendum: 12 points across 5 stories
