Implement review remediation + PLG analytics SDK

- All 6 test architectures patched with Section 11 addendums
- P5 (cost) fully rewritten from 232 to ~600 lines
- PLG brainstorm + party mode advisory board results
- Analytics SDK v2 (PostHog Cloud, Zod strict, Lambda-safe)
- Analytics tests v2 (safeParse, no $set, no timestamp, no PII)
- Addresses all Gemini review findings across P1-P6
2026-03-01 01:42:49 +00:00
parent 2fe0ed856e
commit 03bfe931fc
9 changed files with 2950 additions and 85 deletions
--- a/products/01-llm-cost-router/src/analytics/index.ts
+++ b/products/01-llm-cost-router/src/analytics/index.ts
@@ -0,0 +1,138 @@
 import { PostHog } from 'posthog-node';
 import { z } from 'zod';
 // ---------------------------------------------------------
 // 1. Unified Event Taxonomy (Zod Enforced, Strictly Typed)
 // ---------------------------------------------------------
 export enum EventName {
  SignupCompleted = 'account.signup.completed',
  FirstDollarSaved = 'routing.savings.first_dollar',
  UpgradeCompleted = 'billing.upgrade.completed',
 }
 // Per-event property schemas — no z.any() PII loophole
 const SignupProperties = z.object({
  method: z.enum(['github_sso', 'google_sso', 'email']),
 }).strict();
 const ActivationProperties = z.object({
  savings_amount: z.number().nonnegative(),
 }).strict();
 const UpgradeProperties = z.object({
  plan: z.enum(['pro', 'business']),
  mrr_increase: z.number().nonnegative(),
 }).strict();
 const PropertiesMap = {
  [EventName.SignupCompleted]: SignupProperties,
  [EventName.FirstDollarSaved]: ActivationProperties,
  [EventName.UpgradeCompleted]: UpgradeProperties,
 } as const;
 export const EventSchema = z.object({
  name: z.nativeEnum(EventName),
  tenant_id: z.string().min(1, 'tenant_id is required'),
  product: z.literal('route'),
  properties: z.record(z.unknown()).optional().default({}),
 });
 // z.input (not z.infer) so callers may omit `properties`; the schema default fills it in.
 export type AnalyticsEvent = z.input<typeof EventSchema>;
 // ---------------------------------------------------------
 // 2. NoOp Client for local/test environments
 // ---------------------------------------------------------
 class NoOpPostHog {
  capture() {}
  identify() {}
  async flushAsync() {}
  async shutdown() {}
 }
 // ---------------------------------------------------------
 // 3. Analytics SDK (PostHog Cloud, Lambda-Safe)
 // ---------------------------------------------------------
 export class Analytics {
  private client: PostHog | NoOpPostHog;
  public readonly isSessionReplayEnabled = false;
  constructor(client?: PostHog) {
    if (client) {
      this.client = client;
    } else {
      const apiKey = process.env.POSTHOG_API_KEY;
      if (!apiKey) {
        // No key = NoOp. Never silently send to a mock key.
        console.warn('[Analytics] POSTHOG_API_KEY not set — using NoOp client');
        this.client = new NoOpPostHog();
      } else {
        this.client = new PostHog(apiKey, {
          host: 'https://us.i.posthog.com',
          flushAt: 20,       // Batch up to 20 events
          flushInterval: 5000, // Or flush every 5s
        });
      }
    }
  }
  /**
   * Identify a tenant once (on signup). Sets $set properties.
   * Call this instead of embedding $set in every track() call.
   */
  public identify(tenantId: string, properties?: Record<string, unknown>): void {
    this.client.identify({
      distinctId: tenantId,
      properties: { tenant_id: tenantId, ...properties },
    });
  }
  /**
   * Track an event. Uses safeParse — never crashes the caller.
   * Does NOT flush. Call flush() at Lambda teardown.
   */
  public track(event: AnalyticsEvent): boolean {
    // 1. Base schema validation
    const baseResult = EventSchema.safeParse(event);
    if (!baseResult.success) {
      console.error('[Analytics] Invalid event (base):', baseResult.error.format());
      return false;
    }
    // 2. Per-event property validation (strict, no PII loophole)
    const propSchema = PropertiesMap[baseResult.data.name];
    if (propSchema) {
      const propResult = propSchema.safeParse(baseResult.data.properties);
      if (!propResult.success) {
        console.error('[Analytics] Invalid properties:', propResult.error.format());
        return false;
      }
    }
    // 3. Capture — let PostHog assign the timestamp (avoids clock skew)
    this.client.capture({
      distinctId: baseResult.data.tenant_id,
      event: baseResult.data.name,
      properties: {
        product: baseResult.data.product,
        ...baseResult.data.properties,
      },
    });
    return true;
  }
  /**
   * Flush all queued events. Call once at Lambda teardown
   * (e.g., in a Middy middleware or handler's finally block).
   */
  public async flush(): Promise<void> {
    await this.client.flushAsync();
  }
  public async shutdown(): Promise<void> {
    await this.client.shutdown();
  }
 }
--- a/products/01-llm-cost-router/test-architecture/test-architecture.md
+++ b/products/01-llm-cost-router/test-architecture/test-architecture.md
@@ -2239,3 +2239,315 @@ Before writing any new function, ask:
 *Test Architecture document generated for dd0c/route V1 MVP.*
 *Total estimated test count at V1 launch: ~400 tests.*
 *Target CI runtime: <8 minutes (unit + integration), <15 minutes (full pipeline with E2E).*
 ---
 ## 11. Review Remediation Addendum (Post-Gemini Review)
 ### 11.1 Replace MockKeyCache/MockKeyStore with Testcontainers
 ```rust
 // BEFORE (anti-pattern — mocks hide real latency):
 // let cache = MockKeyCache::new();
 // let store = MockKeyStore::new();
 // AFTER: Use Testcontainers for hot-path auth tests
 #[tokio::test]
 async fn auth_middleware_validates_key_under_5ms_with_real_redis() {
    let redis = TestcontainersRedis::start().await;
    let pg = TestcontainersPostgres::start().await;
    let cache = RedisKeyCache::new(redis.connection_string());
    let store = PgKeyStore::new(pg.connection_string());
    let start = Instant::now();
    let result = auth_middleware(&cache, &store, "sk-valid-key").await;
    assert!(start.elapsed() < Duration::from_millis(5));
    assert!(result.is_ok());
 }
 #[tokio::test]
 async fn auth_middleware_handles_redis_connection_pool_exhaustion() {
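    // Exhaust all connections, verify fallback to PG
    let redis = TestcontainersRedis::start().await;
    let pg = TestcontainersPostgres::start().await;
    let pg_store = PgKeyStore::new(pg.connection_string());
    let cache = RedisKeyCache::with_pool_size(redis.connection_string(), 1);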
    // Hold the single connection
    let _held = cache.raw_connection().await;
    // Auth must still work via PG fallback
    let result = auth_middleware(&cache, &pg_store, "sk-valid-key").await;
    assert!(result.is_ok());
 }
 ```
 ### 11.2 Fix Encryption Test (Decrypt, Don't Just Assert Non-Plaintext)
 ```rust
 // BEFORE (anti-pattern — passes if stored as random garbage):
 // assert_ne!(stored.encrypted_key, b"sk-plaintext-key");
 // AFTER: Full round-trip encryption test
 #[tokio::test]
 async fn provider_credential_encrypts_and_decrypts_correctly() {
    let pg = TestcontainersPostgres::start().await; // backs pg.pool() and pg.query_raw() below
    let kms = LocalStackKMS::start().await;
    let key_id = kms.create_key().await;
    let store = CredentialStore::new(pg.pool(), kms.client(), key_id);
    let original = "sk-live-abc123xyz";
    store.save_credential("org-1", "openai", original).await.unwrap();
    // Read raw from DB — must NOT be plaintext
    let raw = pg.query_raw("SELECT encrypted_key FROM credentials LIMIT 1").await;
    assert!(!String::from_utf8_lossy(&raw).contains(original));
    // Decrypt via the store — must match original
    let decrypted = store.get_credential("org-1", "openai").await.unwrap();
    assert_eq!(decrypted, original);
 }
 #[tokio::test]
 async fn kms_key_rotation_old_deks_still_decrypt_old_credentials() {
    let pg = TestcontainersPostgres::start().await; // backs pg.pool() below
    let kms = LocalStackKMS::start().await;
    let key_id = kms.create_key().await;
    let store = CredentialStore::new(pg.pool(), kms.client(), key_id);
    // Save with original key
    store.save_credential("org-1", "openai", "sk-old").await.unwrap();
    // Rotate KMS key
    kms.rotate_key(key_id).await;
    // Old credential must still decrypt
    let decrypted = store.get_credential("org-1", "openai").await.unwrap();
    assert_eq!(decrypted, "sk-old");
    // New credential uses new DEK
    store.save_credential("org-1", "anthropic", "sk-new").await.unwrap();
    let decrypted_new = store.get_credential("org-1", "anthropic").await.unwrap();
    assert_eq!(decrypted_new, "sk-new");
 }
 ```
 ### 11.3 Slow Dependency Chaos Test
 ```rust
 #[tokio::test]
 async fn chaos_slow_db_does_not_block_proxy_hot_path() {
    let stack = E2EStack::start().await;
    // Inject 5-second network delay on TimescaleDB port via tc netem
    stack.inject_latency("timescaledb", Duration::from_secs(5)).await;
    // Proxy must still route requests within SLA
    let start = Instant::now();
    let resp = stack.proxy()
        .post("/v1/chat/completions")
        .header("Authorization", "Bearer sk-valid")
        .json(&chat_request())
        .send().await;
    let latency = start.elapsed();
    assert_eq!(resp.status(), 200);
    // Telemetry is dropped, but routing works
    assert!(latency < Duration::from_millis(50),
        "Proxy blocked by slow DB: {:?}", latency);
 }
 #[tokio::test]
 async fn chaos_slow_redis_falls_back_to_pg_for_auth() {
    let stack = E2EStack::start().await;
    stack.inject_latency("redis", Duration::from_secs(3)).await;
    let resp = stack.proxy()
        .post("/v1/chat/completions")
        .header("Authorization", "Bearer sk-valid")
        .json(&chat_request())
        .send().await;
    assert_eq!(resp.status(), 200);
 }
 ```
 ### 11.4 IDOR / Cross-Tenant Test Suite
 ```rust
 // tests/integration/idor_test.rs
 #[tokio::test]
 async fn idor_org_a_cannot_read_org_b_routing_rules() {
    let stack = E2EStack::start().await;
    let org_a_token = stack.create_org_and_token("org-a").await;
    let org_b_token = stack.create_org_and_token("org-b").await;
    // Org B creates a routing rule
    let rule = stack.api()
        .post("/v1/routing-rules")
        .bearer_auth(&org_b_token)
        .json(&json!({ "name": "secret-rule", "model": "gpt-4" }))
        .send().await.json::<RoutingRule>().await;
    // Org A tries to read it
    let resp = stack.api()
        .get(&format!("/v1/routing-rules/{}", rule.id))
        .bearer_auth(&org_a_token)
        .send().await;
    assert_eq!(resp.status(), 404); // Not 403 — don't leak existence
 }
 #[tokio::test]
 async fn idor_org_a_cannot_read_org_b_api_keys() {
    // Same pattern — create key as org B, attempt read as org A
 }
 #[tokio::test]
 async fn idor_org_a_cannot_read_org_b_telemetry() {}
 #[tokio::test]
 async fn idor_org_a_cannot_mutate_org_b_routing_rules() {}
 ```
 ### 11.5 SSE Connection Drop / Billing Leak Test
 ```rust
 #[tokio::test]
 async fn sse_client_disconnect_aborts_upstream_provider_request() {
    let stack = E2EStack::start().await;
    let mock_provider = stack.mock_provider();
    // Configure provider to stream slowly (1 token/sec for 60 tokens)
    mock_provider.configure_slow_stream(60, Duration::from_secs(1));
    // Start streaming request
    let mut stream = stack.proxy()
        .post("/v1/chat/completions")
        .json(&json!({ "stream": true, "model": "gpt-4" }))
        .send().await
        .bytes_stream();
    // Read 5 tokens then drop the connection
    for _ in 0..5 {
        stream.next().await;
    }
    drop(stream);
    // Wait briefly for cleanup
    tokio::time::sleep(Duration::from_millis(500)).await;
    // Provider connection must be aborted — not still streaming
    assert_eq!(mock_provider.active_connections(), 0);
    // Billing: customer should only be charged for 5 tokens, not 60
    let usage = stack.get_last_usage_record().await;
    assert!(usage.completion_tokens <= 10); // Some buffer for in-flight
 }
 ```
 ### 11.6 Concurrent Circuit Breaker Race Condition
 ```rust
 #[tokio::test]
 async fn circuit_breaker_handles_50_concurrent_failures_cleanly() {
    let redis = TestcontainersRedis::start().await;
    let breaker = RedisCircuitBreaker::new(redis.connection_string(), "openai", 10);
    let mut handles = vec![];
    for _ in 0..50 {
        let b = breaker.clone();
        handles.push(tokio::spawn(async move {
            b.record_failure().await;
        }));
    }
    futures::future::join_all(handles).await;
    // Breaker must be open — no race condition leaving it closed
    assert_eq!(breaker.state().await, CircuitState::Open);
    // Failure count must be exactly 50 (atomic increments)
    assert_eq!(breaker.failure_count().await, 50);
 }
 ```
 ### 11.7 Trace Context Propagation
 ```rust
 #[tokio::test]
 async fn otel_trace_propagates_from_client_through_proxy_to_provider() {
    let stack = E2EStack::start().await;
    let tracer = stack.in_memory_tracer();
    let resp = stack.proxy()
        .post("/v1/chat/completions")
        .header("traceparent", "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
        .json(&chat_request())
        .send().await;
    let spans = tracer.finished_spans();
    let proxy_span = spans.iter().find(|s| s.name == "proxy.route").unwrap();
    // Proxy span must be child of the incoming trace
    assert_eq!(proxy_span.trace_id, "4bf92f3577b34da6a3ce929d0e0e4736");
    // Provider request must carry the same trace_id
    let provider_req = stack.mock_provider().last_request();
    assert!(provider_req.headers["traceparent"].contains("4bf92f3577b34da6a3ce929d0e0e4736"));
 }
 ```
 ### 11.8 Flag Provider Fallback Test
 ```rust
 #[test]
 fn flag_provider_unreachable_falls_back_to_safe_default() {
    // Simulate missing/corrupt flag config file
    let provider = JsonFileProvider::new("/nonexistent/flags.json");
    let result = provider.evaluate("enable_new_router", false);
    // Must return the safe default (false), not panic or error
    assert_eq!(result, false);
 }
 #[test]
 fn flag_provider_malformed_json_falls_back_to_safe_default() {
    let provider = JsonFileProvider::from_string("{ invalid json }}}");
    let result = provider.evaluate("enable_new_router", false);
    assert_eq!(result, false);
 }
 ```
 ### 11.9 24-Hour Soak Test Spec
 ```rust
 // tests/soak/long_running_latency.rs
 // Run manually: cargo test --test soak -- --ignored
 #[tokio::test]
 #[ignore] // Only run in nightly CI
 async fn soak_24h_proxy_latency_stays_under_5ms_p99() {
    // k6 config: 10 RPS sustained for 24 hours
    // Assert: p99 < 5ms, no memory growth > 50MB, no connection leaks
    // This catches memory fragmentation and connection pool exhaustion
 }
 ```
 ### 11.10 Panic Mode Authorization
 ```rust
 #[tokio::test]
 async fn panic_mode_requires_owner_role() {
    let stack = E2EStack::start().await;
    let viewer_token = stack.create_token_with_role("org-1", Role::Viewer).await;
    let resp = stack.api()
        .post("/admin/panic")
        .bearer_auth(&viewer_token)
        .send().await;
    assert_eq!(resp.status(), 403);
 }
 #[tokio::test]
 async fn panic_mode_allowed_for_owner_role() {
    let stack = E2EStack::start().await;
    let owner_token = stack.create_token_with_role("org-1", Role::Owner).await;
    let resp = stack.api()
        .post("/admin/panic")
        .bearer_auth(&owner_token)
        .send().await;
    assert_eq!(resp.status(), 200);
 }
 ```
 *End of P1 Review Remediation Addendum*
--- a/products/01-llm-cost-router/tests/analytics/analytics.spec.ts
+++ b/products/01-llm-cost-router/tests/analytics/analytics.spec.ts
@@ -0,0 +1,204 @@
 import { describe, it, expect, vi, beforeEach, type Mocked } from 'vitest';
 import { Analytics, EventSchema, EventName } from '../../src/analytics';
 import { PostHog } from 'posthog-node';
 vi.mock('posthog-node');
 describe('Analytics SDK (PostHog Cloud — v2 Post-Review)', () => {
  let analytics: Analytics;
  let mockPostHog: Mocked<PostHog>; // Vitest exports Mocked as a type; there is no vi.Mocked namespace
  beforeEach(() => {
    vi.clearAllMocks();
    mockPostHog = new PostHog('phc_test_key', { host: 'https://us.i.posthog.com' }) as any;
    analytics = new Analytics(mockPostHog);
  });
  // ── Schema Validation (Zod) ──────────────────────────────
  describe('Event Taxonomy Validation', () => {
    it('accepts valid account.signup.completed event', () => {
      const event = {
        name: EventName.SignupCompleted,
        tenant_id: 'tenant-123',
        product: 'route' as const,
        properties: { method: 'github_sso' },
      };
      expect(() => EventSchema.parse(event)).not.toThrow();
    });
    it('rejects events missing tenant_id', () => {
      const event = {
        name: EventName.SignupCompleted,
        product: 'route',
        properties: { method: 'email' },
      };
      expect(() => EventSchema.parse(event as any)).toThrow(/tenant_id/);
    });
    it('accepts valid activation event', () => {
      const event = {
        name: EventName.FirstDollarSaved,
        tenant_id: 'tenant-123',
        product: 'route' as const,
        properties: { savings_amount: 1.50 },
      };
      expect(() => EventSchema.parse(event)).not.toThrow();
    });
    it('accepts valid upgrade event', () => {
      const event = {
        name: EventName.UpgradeCompleted,
        tenant_id: 'tenant-123',
        product: 'route' as const,
        properties: { plan: 'pro', mrr_increase: 49 },
      };
      expect(() => EventSchema.parse(event)).not.toThrow();
    });
  });
  // ── track() Behavior ─────────────────────────────────────
  describe('track()', () => {
    it('captures valid events via PostHog client', () => {
      const result = analytics.track({
        name: EventName.SignupCompleted,
        tenant_id: 'tenant-123',
        product: 'route',
        properties: { method: 'email' },
      });
      expect(result).toBe(true);
      expect(mockPostHog.capture).toHaveBeenCalledWith(
        expect.objectContaining({
          distinctId: 'tenant-123',
          event: 'account.signup.completed',
          properties: expect.objectContaining({
            product: 'route',
            method: 'email',
          }),
        })
      );
    });
    it('does NOT include $set in track calls (use identify instead)', () => {
      analytics.track({
        name: EventName.SignupCompleted,
        tenant_id: 'tenant-123',
        product: 'route',
        properties: { method: 'github_sso' },
      });
      const captureCall = mockPostHog.capture.mock.calls[0][0];
      expect(captureCall.properties).not.toHaveProperty('$set');
    });
    it('does NOT pass timestamp (let PostHog handle it to avoid clock skew)', () => {
      analytics.track({
        name: EventName.SignupCompleted,
        tenant_id: 'tenant-123',
        product: 'route',
        properties: { method: 'email' },
      });
      const captureCall = mockPostHog.capture.mock.calls[0][0];
      expect(captureCall).not.toHaveProperty('timestamp');
    });
    it('returns false and does NOT call PostHog if base validation fails', () => {
      const result = analytics.track({
        name: 'invalid.event' as any,
        tenant_id: 'tenant-123',
        product: 'route',
      });
      expect(result).toBe(false);
      expect(mockPostHog.capture).not.toHaveBeenCalled();
    });
    it('returns false if per-event property validation fails (strict schema)', () => {
      const result = analytics.track({
        name: EventName.SignupCompleted,
        tenant_id: 'tenant-123',
        product: 'route',
        properties: { method: 'invalid_method' }, // Not in enum
      });
      expect(result).toBe(false);
      expect(mockPostHog.capture).not.toHaveBeenCalled();
    });
    it('rejects unknown properties (strict mode — no PII loophole)', () => {
      const result = analytics.track({
        name: EventName.SignupCompleted,
        tenant_id: 'tenant-123',
        product: 'route',
        properties: { method: 'email', email: 'user@example.com' }, // PII leak attempt
      });
      expect(result).toBe(false);
      expect(mockPostHog.capture).not.toHaveBeenCalled();
    });
    it('does NOT flush after each track call (Lambda batching)', () => {
      analytics.track({
        name: EventName.SignupCompleted,
        tenant_id: 'tenant-123',
        product: 'route',
        properties: { method: 'email' },
      });
      expect(mockPostHog.flushAsync).not.toHaveBeenCalled();
    });
  });
  // ── identify() ───────────────────────────────────────────
  describe('identify()', () => {
    it('calls PostHog identify with tenant_id as distinctId', () => {
      analytics.identify('tenant-123', { company: 'Acme' });
      expect(mockPostHog.identify).toHaveBeenCalledWith(
        expect.objectContaining({
          distinctId: 'tenant-123',
          properties: expect.objectContaining({
            tenant_id: 'tenant-123',
            company: 'Acme',
          }),
        })
      );
    });
  });
  // ── flush() ──────────────────────────────────────────────
  describe('flush()', () => {
    it('calls flushAsync on the PostHog client', async () => {
      await analytics.flush();
      expect(mockPostHog.flushAsync).toHaveBeenCalledTimes(1);
    });
  });
  // ── NoOp Client ──────────────────────────────────────────
  describe('NoOp Client (missing API key)', () => {
    it('does not throw when tracking without API key', () => {
      delete process.env.POSTHOG_API_KEY; // Make sure the NoOp path is actually taken
      const noopAnalytics = new Analytics(); // No client, no env var
      const result = noopAnalytics.track({
        name: EventName.SignupCompleted,
        tenant_id: 'tenant-123',
        product: 'route',
        properties: { method: 'email' },
      });
      expect(result).toBe(true); // NoOp accepts everything silently
    });
  });
  // ── Session Replay ───────────────────────────────────────
  describe('Security', () => {
    it('session replay is disabled', () => {
      expect(analytics.isSessionReplayEnabled).toBe(false);
    });
  });
 });
--- a/products/02-iac-drift-detection/test-architecture/test-architecture.md
+++ b/products/02-iac-drift-detection/test-architecture/test-architecture.md
@@ -1727,3 +1727,370 @@ Before any code ships to production, these tests must be green:
 ---
 *Document complete. Total estimated test count at V1 launch: ~500 tests. Target by month 3: ~1,000 tests.*
 ---
 ## 11. Review Remediation Addendum (Post-Gemini Review)
 ### 11.1 Missing Epic Coverage
 #### Epic 6: Dashboard UI (React Testing Library + Playwright)
 ```typescript
 // tests/ui/components/DiffViewer.test.tsx
 describe('DiffViewer Component', () => {
  it('renders added lines in green', () => {});
  it('renders removed lines in red', () => {});
  it('renders unchanged lines in default color', () => {});
  it('collapses large diffs with "Show more" toggle', () => {});
  it('highlights HCL syntax in diff blocks', () => {});
  it('shows resource type icon next to each drift item', () => {});
 });
 describe('StackOverview Component', () => {
  it('renders drift count badge per stack', () => {});
  it('sorts stacks by drift severity (critical first)', () => {});
  it('shows last scan timestamp', () => {});
  it('shows agent health indicator (green/yellow/red)', () => {});
 });
 // tests/e2e/ui/dashboard.spec.ts (Playwright)
 test('OAuth login redirects to Cognito and back', async ({ page }) => {
  await page.goto('/dashboard');
  await expect(page).toHaveURL(/cognito/);
 });
 test('stack list renders with drift counts', async ({ page }) => {
  await page.goto('/dashboard/stacks');
  // Playwright's toHaveCount() takes an exact number, so assert that at least one card is visible
  await expect(page.locator('[data-testid="stack-card"]').first()).toBeVisible();
 });
 test('diff viewer renders inline diff for Terraform resource', async ({ page }) => {
  await page.goto('/dashboard/stacks/stack-1/drifts/drift-1');
  await expect(page.locator('[data-testid="diff-viewer"]')).toBeVisible();
  // At least one added line must be rendered
  await expect(page.locator('.diff-added').first()).toBeVisible();
 });
 test('revert button triggers confirmation modal', async ({ page }) => {
  await page.goto('/dashboard/stacks/stack-1/drifts/drift-1');
  await page.click('[data-testid="revert-btn"]');
  await expect(page.locator('[data-testid="confirm-modal"]')).toBeVisible();
 });
 ```
 #### Epic 9: Onboarding & PLG (Stripe + drift init)
 ```go
 // pkg/onboarding/stripe_test.go
 func TestStripeWebhookCheckoutCompleted_UpgradesTenant(t *testing.T) {}
 func TestStripeWebhookSubscriptionDeleted_DowngradesTenant(t *testing.T) {}
 func TestStripeWebhookInvalidSignature_Returns401(t *testing.T) {}
 func TestStripeWebhookReplayedEvent_IsIdempotent(t *testing.T) {}
 // pkg/agent/init_test.go
 func TestDriftInit_DetectsTerraformInCurrentDir(t *testing.T) {}
 func TestDriftInit_DetectsCloudFormationInCurrentDir(t *testing.T) {}
 func TestDriftInit_DetectsPulumiInCurrentDir(t *testing.T) {}
 func TestDriftInit_GeneratesValidYAMLConfig(t *testing.T) {}
 func TestDriftInit_HandlesWindowsPaths(t *testing.T) {}
 func TestDriftInit_HandlesMacPaths(t *testing.T) {}
 func TestDriftInit_HandlesLinuxPaths(t *testing.T) {}
 func TestDriftInit_FailsGracefullyOnEmptyDir(t *testing.T) {}
 ```
 #### Epic 8: Infrastructure (Terratest)
 ```go
 // tests/infra/terraform_test.go
 func TestTerraformPlan_CreatesExpectedResources(t *testing.T) {
    terraformOptions := terraform.WithDefaultRetryableErrors(t, &terraform.Options{
        TerraformDir: "../../infra/terraform",
    })
    defer terraform.Destroy(t, terraformOptions)
    terraform.InitAndPlan(t, terraformOptions)
 }
 func TestTerraformApply_SQSFIFOQueueCreated(t *testing.T) {}
 func TestTerraformApply_RDSInstanceCreated(t *testing.T) {}
 func TestTerraformApply_IAMRolesHaveLeastPrivilege(t *testing.T) {
    // Verify no IAM policy has Action: "*"
 }
 func TestTerraformApply_VPCSecurityGroupsRestrictIngress(t *testing.T) {}
 ```
 #### Epic 2: mTLS Certificate Lifecycle
 ```go
 // pkg/agent/mtls_test.go
 func TestMTLS_CertificateGeneration_ValidX509(t *testing.T) {}
 func TestMTLS_CertificateExpiration_AgentRejectsExpiredCert(t *testing.T) {}
 func TestMTLS_CertificateRotation_NewCertAcceptedMidConnection(t *testing.T) {}
 func TestMTLS_CertificateRevocation_RevokedCertRejected(t *testing.T) {}
 func TestMTLS_SelfSignedCert_RejectedBySaaS(t *testing.T) {}
 func TestMTLS_CertificateChain_IntermediateCAValidated(t *testing.T) {}
 ```
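The mTLS stubs above name the behavior but not how to exercise it. As a minimal sketch of what `TestMTLS_CertificateExpiration_AgentRejectsExpiredCert` might assert, the block below uses only the Go standard library to mint a short-lived certificate and verify it at two points in time. The helper names (`selfSigned`, `verifyAt`, `isExpired`, `exampleExpiry`) are hypothetical, and the certificate trusts itself purely for test convenience; the real agent would presumably chain to the SaaS CA.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"errors"
	"math/big"
	"time"
)

// selfSigned builds a throwaway self-signed certificate valid from
// notBefore to notAfter. Hypothetical test helper.
func selfSigned(notBefore, notAfter time.Time) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "drift-agent-test"},
		NotBefore:             notBefore,
		NotAfter:              notAfter,
		KeyUsage:              x509.KeyUsageDigitalSignature | x509.KeyUsageCertSign,
		IsCA:                  true,
		BasicConstraintsValid: true,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

// verifyAt verifies cert against a pool that trusts only cert itself,
// evaluated as of the given time rather than the wall clock.
func verifyAt(cert *x509.Certificate, now time.Time) error {
	roots := x509.NewCertPool()
	roots.AddCert(cert)
	_, err := cert.Verify(x509.VerifyOptions{
		Roots:       roots,
		CurrentTime: now,
		KeyUsages:   []x509.ExtKeyUsage{x509.ExtKeyUsageAny},
	})
	return err
}

// isExpired reports whether err is an x509 validity-period failure.
func isExpired(err error) bool {
	var invalid x509.CertificateInvalidError
	return errors.As(err, &invalid) && invalid.Reason == x509.Expired
}

// exampleExpiry checks one certificate inside and outside its validity window.
func exampleExpiry() (validOK, expiredRejected bool) {
	issued := time.Date(2026, 1, 1, 0, 0, 0, 0, time.UTC)
	cert, err := selfSigned(issued, issued.Add(24*time.Hour))
	if err != nil {
		return false, false
	}
	validOK = verifyAt(cert, issued.Add(time.Hour)) == nil
	expiredRejected = isExpired(verifyAt(cert, issued.Add(48*time.Hour)))
	return validOK, expiredRejected
}
```

Pinning `CurrentTime` in `x509.VerifyOptions` keeps the check deterministic, so the test never has to sleep until a certificate actually expires.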
 ### 11.2 Add t.Parallel() to Table-Driven Tests
 ```go
 // BEFORE (sequential — wastes CI time):
 func TestSecretScrubber(t *testing.T) {
    tests := []struct{ name, input, expected string }{...}
    for _, tt := range tests {
        t.Run(tt.name, func(t *testing.T) {
            // runs sequentially
        })
    }
 }
 // AFTER (parallel):
 func TestSecretScrubber(t *testing.T) {
    t.Parallel()
    tests := []struct{ name, input, expected string }{...}
    for _, tt := range tests {
        tt := tt // capture range variable (no longer needed from Go 1.22, harmless to keep)
        t.Run(tt.name, func(t *testing.T) {
            t.Parallel()
            // runs in parallel
        })
    }
 }
 ```
 ### 11.3 Dynamic Resource Naming for LocalStack
 ```go
 // BEFORE (shared state — flaky):
 // bucket := "drift-reports"
 // AFTER (per-test isolation):
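 var invalidBucketChars = regexp.MustCompile(`[^a-z0-9-]+`)
 func uniqueBucket(t *testing.T) string {
    // S3 bucket names are limited to lowercase letters, digits, hyphens, and periods
    // (3-63 chars), so normalize t.Name() (e.g. "TestFoo/sub_case") and cap its length.
    name := invalidBucketChars.ReplaceAllString(strings.ToLower(t.Name()), "-")
    if len(name) > 28 {
        name = name[:28]
    }
    return fmt.Sprintf("drift-reports-%s-%d", name, time.Now().UnixNano())
 }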
 func TestDriftReportUpload(t *testing.T) {
    t.Parallel()
    bucket := uniqueBucket(t)
    s3Client.CreateBucket(ctx, &s3.CreateBucketInput{Bucket: &bucket})
    // Test uses isolated bucket — no cross-test contamination
 }
 ```
 ### 11.4 Distributed Tracing Cross-Boundary Tests
 ```go
 // tests/integration/trace_propagation_test.go
 func TestTraceContext_AgentToSaaS_SpanParentChain(t *testing.T) {
    // Agent generates drift_scan span with trace_id
    // POST /v1/drift-reports carries traceparent header
    // SaaS Event Processor creates child span
    // Verify parent-child relationship across HTTP boundary
    exporter := tracetest.NewInMemoryExporter()
    // Fire drift report with traceparent
    traceID := "4bf92f3577b34da6a3ce929d0e0e4736"
    resp := postDriftReport(t, stack, traceID)
    assert.Equal(t, 200, resp.StatusCode)
    spans := exporter.GetSpans()
    eventProcessorSpan := findSpan(spans, "drift_report.process")
    assert.Equal(t, traceID, eventProcessorSpan.SpanContext().TraceID().String())
 }
 func TestTraceContext_SQSBoundary_PreservesTraceID(t *testing.T) {
    // Verify SQS message attributes contain traceparent
    // Verify consumer extracts and continues the trace
 }
 func TestTraceContext_AgentScan_CreatesParentSpan(t *testing.T) {
    // Verify agent drift_scan span has correct attributes:
    // drift.stack_id, drift.resource_count, drift.duration_ms
 }
 ```
 ### 11.5 Backward Compatibility Serialization (Elastic Schema)
 ```go
 // tests/schema/backward_compat_test.go
 func TestOldAgent_ParsesNewDynamoDBItem_WithV2Attributes(t *testing.T) {
    // Simulate V2 DynamoDB item with new _v2 fields
    item := map[string]types.AttributeValue{
        "PK":              &types.AttributeValueMemberS{Value: "STACK#123"},
        "drift_score":     &types.AttributeValueMemberN{Value: "85"},
        "drift_score_v2":  &types.AttributeValueMemberN{Value: "92"}, // New field
        "remediation_v2":  &types.AttributeValueMemberS{Value: "auto"}, // New field
    }
    // V1 parser must ignore unknown fields
    result, err := ParseDriftItem(item)
    assert.NoError(t, err)
    assert.Equal(t, 85, result.DriftScore) // Uses V1 field
 }
 func TestV1Code_ReadsV2Writes_DuringMigrationWindow(t *testing.T) {
    // V2 writes both drift_score and drift_score_v2
    // V1 reads drift_score (ignores _v2)
    // Verify no data loss
 }
 ```
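The "V1 parser must ignore unknown fields" requirement above is the tolerant-reader pattern. To stay dependency-free, the sketch below shows it with `encoding/json` rather than DynamoDB attribute values, so it is an analogy, not the project's `ParseDriftItem`. The type and function names are hypothetical. It also shows the failure mode to avoid: a strict decoder rejects the same V2 payload.

```go
package main

import (
	"bytes"
	"encoding/json"
)

// driftItemV1 is a hypothetical V1 view of a drift item: it knows
// nothing about the _v2 attributes.
type driftItemV1 struct {
	PK         string `json:"PK"`
	DriftScore int    `json:"drift_score"`
}

// parseTolerant ignores fields it does not know about, which is the
// default behavior of json.Unmarshal.
func parseTolerant(data []byte) (driftItemV1, error) {
	var item driftItemV1
	err := json.Unmarshal(data, &item)
	return item, err
}

// parseStrict rejects unknown fields and would break V1 readers as
// soon as V2 writers add an attribute.
func parseStrict(data []byte) (driftItemV1, error) {
	var item driftItemV1
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.DisallowUnknownFields()
	err := dec.Decode(&item)
	return item, err
}

// exampleBackwardCompat feeds a V2-shaped payload to both parsers.
func exampleBackwardCompat() (score int, tolerantOK, strictOK bool) {
	v2 := []byte(`{"PK":"STACK#123","drift_score":85,"drift_score_v2":92,"remediation_v2":"auto"}`)
	item, err := parseTolerant(v2)
	_, strictErr := parseStrict(v2)
	return item.DriftScore, err == nil, strictErr == nil
}
```

If the real parser ever validates attribute names, a test like this would flag the regression before a mixed V1/V2 deployment does.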
 ### 11.6 Security: RBAC Forgery & Replay Attacks
 ```go
 // tests/integration/security_test.go
 func TestAgentCannotForgeStackID(t *testing.T) {
    // Agent with API key for org-A sends drift report claiming stack belongs to org-B
    orgAKey := createAPIKey(t, "org-a")
    report := makeDriftReport("org-b-stack-id") // Wrong org
    resp := postDriftReportWithKey(t, report, orgAKey)
    assert.Equal(t, 403, resp.StatusCode)
 }
 func TestReplayAttack_DuplicateReportID_Rejected(t *testing.T) {
    report := makeDriftReport("stack-1")
    resp1 := postDriftReport(t, report)
    assert.Equal(t, 200, resp1.StatusCode)
    // Replay exact same report
    resp2 := postDriftReport(t, report)
    assert.Equal(t, 409, resp2.StatusCode) // Conflict — already processed
 }
 func TestReplayAttack_OldTimestamp_Rejected(t *testing.T) {
    report := makeDriftReport("stack-1")
    report.Timestamp = time.Now().Add(-10 * time.Minute) // 10 min old
    resp := postDriftReport(t, report)
    assert.Equal(t, 400, resp.StatusCode) // Stale report
 }
 ```
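The two replay tests above imply a server-side check with two parts: a freshness window on the report timestamp and a seen-set keyed by report ID. A minimal in-memory sketch follows. `ReplayGuard` and its methods are hypothetical names, and the 5-minute window is an assumption; the tests only establish that a 10-minute-old report is rejected. A production service would likely keep the seen-set in a shared store and expire old entries.

```go
package main

import (
	"net/http"
	"sync"
	"time"
)

// ReplayGuard is a hypothetical in-memory replay filter.
type ReplayGuard struct {
	mu     sync.Mutex
	seen   map[string]struct{}
	maxAge time.Duration
}

func NewReplayGuard(maxAge time.Duration) *ReplayGuard {
	return &ReplayGuard{seen: make(map[string]struct{}), maxAge: maxAge}
}

// Check returns the HTTP status the tests above expect: 400 for a
// stale timestamp, 409 for a repeated report ID, 200 otherwise.
func (g *ReplayGuard) Check(reportID string, sentAt, now time.Time) int {
	if now.Sub(sentAt) > g.maxAge {
		return http.StatusBadRequest
	}
	g.mu.Lock()
	defer g.mu.Unlock()
	if _, dup := g.seen[reportID]; dup {
		return http.StatusConflict
	}
	g.seen[reportID] = struct{}{}
	return http.StatusOK
}

// exampleReplay submits a fresh report, replays it, then submits a
// different report carrying a 10-minute-old timestamp.
func exampleReplay() [3]int {
	now := time.Date(2026, 3, 1, 0, 0, 0, 0, time.UTC)
	g := NewReplayGuard(5 * time.Minute)
	return [3]int{
		g.Check("report-1", now, now),
		g.Check("report-1", now, now),
		g.Check("report-2", now.Add(-10*time.Minute), now),
	}
}
```

Checking staleness before recording the ID means a rejected stale report does not consume a slot in the seen-set.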
 ### 11.7 Noisy Neighbor & Fair-Share Processing
 ```go
 // tests/integration/fair_share_test.go
 func TestNoisyNeighbor_LargeOrgDoesNotStarveSmallOrg(t *testing.T) {
    // Org A: 10,000 drifted resources
    // Org B: 10 drifted resources
    // Both submit reports simultaneously
    seedDriftReports(t, "org-a", 10000)
    seedDriftReports(t, "org-b", 10)
    // Org B's reports must be processed within 30 seconds
    // (not queued behind all 10K of Org A's)
    start := time.Now()
    waitForProcessed(t, "org-b", 10, 30*time.Second)
    assert.Less(t, time.Since(start), 30*time.Second)
 }
 ```
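One scheduling policy that would satisfy the noisy-neighbor test is per-tenant round-robin: each tenant gets its own FIFO, and the processor takes one report per tenant in turn. The sketch below is an assumption about how fair-share could be implemented, not a description of the actual event processor; `FairQueue` and the helpers are hypothetical and not safe for concurrent use as written.

```go
package main

// FairQueue is a hypothetical per-tenant round-robin queue.
type FairQueue struct {
	order  []string            // tenants in first-seen order
	queues map[string][]string // pending report IDs per tenant
	next   int                 // index into order of the next tenant to serve
}

func NewFairQueue() *FairQueue {
	return &FairQueue{queues: make(map[string][]string)}
}

// Push appends a report to its tenant's FIFO.
func (q *FairQueue) Push(tenant, reportID string) {
	if _, ok := q.queues[tenant]; !ok {
		q.order = append(q.order, tenant)
	}
	q.queues[tenant] = append(q.queues[tenant], reportID)
}

// Pop returns the next report, serving tenants in turn and skipping
// tenants with nothing pending. ok is false when every queue is empty.
func (q *FairQueue) Pop() (tenant, reportID string, ok bool) {
	for i := 0; i < len(q.order); i++ {
		t := q.order[(q.next+i)%len(q.order)]
		if pending := q.queues[t]; len(pending) > 0 {
			q.queues[t] = pending[1:]
			q.next = (q.next + i + 1) % len(q.order)
			return t, pending[0], true
		}
	}
	return "", "", false
}

// popsUntilDrained counts the Pop calls needed to finish n reports
// belonging to one tenant.
func popsUntilDrained(q *FairQueue, tenant string, n int) int {
	pops := 0
	for done := 0; done < n; {
		t, _, ok := q.Pop()
		if !ok {
			break
		}
		pops++
		if t == tenant {
			done++
		}
	}
	return pops
}

// exampleFairShare mirrors the test: org-a queues 10,000 reports
// first, then org-b queues 10.
func exampleFairShare() int {
	q := NewFairQueue()
	for i := 0; i < 10000; i++ {
		q.Push("org-a", "a-report")
	}
	for i := 0; i < 10; i++ {
		q.Push("org-b", "b-report")
	}
	return popsUntilDrained(q, "org-b", 10)
}
```

With two tenants the queue alternates between them, so org-b's 10 reports complete within the first 20 dequeues instead of waiting behind all 10,000 of org-a's.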
 ### 11.8 Panic Mode Mid-Remediation Race Condition
 ```go
 // tests/integration/panic_remediation_test.go
 func TestPanicMode_AbortsInFlightRemediation(t *testing.T) {
    // Start a remediation (terraform apply)
    execID := startRemediation(t, "stack-1", "drift-1")
    waitForState(t, execID, "applying")
    // Trigger panic mode
    triggerPanicMode(t)
    // Remediation must be aborted, not completed
    state := waitForState(t, execID, "aborted")
    assert.Equal(t, "aborted", state)
    // Verify terraform state is not corrupted
    // (agent should have run terraform state pull to verify)
 }
 func TestPanicMode_DoesNotAbortReadOnlyScans(t *testing.T) {
    // Drift scans (read-only) should continue during panic
    // Only write operations (remediation) are halted
    scanID := startDriftScan(t, "stack-1")
    triggerPanicMode(t)
    state := waitForState(t, scanID, "completed")
    assert.Equal(t, "completed", state) // Scan finishes normally
 }
 ```
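The split between write and read operations in the two tests above maps naturally onto context cancellation: remediations run under a context that panic mode cancels, while scans run under one it does not touch. The sketch below illustrates only that idea, with hypothetical names (`runSteps`, `examplePanicMode`). The real agent presumably shells out to Terraform and would still need the state verification the test comments mention.

```go
package main

import "context"

// runSteps executes steps in order and refuses to start the next step
// once ctx has been cancelled.
func runSteps(ctx context.Context, steps []func()) string {
	for _, step := range steps {
		if ctx.Err() != nil {
			return "aborted"
		}
		step()
	}
	return "completed"
}

// examplePanicMode triggers panic mode in the middle of a remediation
// and then runs a read-only scan.
func examplePanicMode() (remediation, scan string) {
	writeCtx, triggerPanic := context.WithCancel(context.Background())
	defer triggerPanic()
	readCtx := context.Background() // scans are not tied to the panic switch

	remediation = runSteps(writeCtx, []func(){
		func() {},                 // step 1 runs normally
		func() { triggerPanic() }, // panic mode fires mid-remediation
		func() {},                 // step 3 must never start
	})
	scan = runSteps(readCtx, []func(){func() {}, func() {}})
	return remediation, scan
}
```

Here cancellation takes effect between steps. Interrupting a step that is already running, such as an in-flight apply, needs cooperation from that step and is where the state-corruption check matters.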
 ### 11.9 Remediation vs. Concurrent Scan Race Condition
 ```go
 func TestConcurrentScanDuringRemediation_DoesNotReportHalfAppliedState(t *testing.T) {
    // Start remediation (terraform apply — takes ~30s)
    execID := startRemediation(t, "stack-1", "drift-1")
    waitForState(t, execID, "applying")
    // Trigger a drift scan while remediation is in progress
    scanID := startDriftScan(t, "stack-1")
    // Scan must either:
    // a) Wait for remediation to complete, OR
    // b) Skip the stack with "remediation in progress" status
    scanResult := waitForScanComplete(t, scanID)
    assert.NotEqual(t, "half-applied", scanResult.Status)
    // Must be either "skipped_remediation_in_progress" or show post-remediation state
 }
 ```
 ### 11.10 SaaS API Memory Profiling
 ```go
 // tests/load/memory_profile_test.go
 func TestEventProcessor_DoesNotOOM_On1MB_DriftReport(t *testing.T) {
    // Generate a 1MB drift report (1000 resources with large diffs)
    report := makeLargeDriftReport(1000)
    assert.Greater(t, len(report), 1024*1024)
    var memBefore, memAfter runtime.MemStats
    runtime.ReadMemStats(&memBefore)
    processReport(t, report)
    runtime.ReadMemStats(&memAfter)
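    // Signed delta: Alloc can shrink if the GC runs, and a uint64 subtraction would wrap around
    growth := int64(memAfter.Alloc) - int64(memBefore.Alloc)
    assert.Less(t, growth, int64(50*1024*1024)) // <50MB growth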
 }
 ```
 ### 11.11 Trim E2E to Smoke Tier
 Per the review recommendation, cap E2E at 10 critical paths. The remaining 40 tests are demoted to integration:
 | E2E (Keep — 10 max) | Demoted to Integration |
 |---------------------|----------------------|
 | Onboarding: init → connect → first scan | Agent heartbeat variations |
 | First drift detected → Slack alert | Individual parser format tests |
 | Revert flow: Slack → agent apply → verify | Secret scrubber edge cases |
 | Panic mode halts remediation | DynamoDB access pattern tests |
 | Cross-tenant isolation | Individual webhook format tests |
 | OAuth login → dashboard → view diff | Notification batching |
 | Free tier limit enforcement | Agent config reload |
 | Agent disconnect → reconnect → resume | Baseline score calculations |
 | mTLS cert rotation mid-scan | Individual API endpoint tests |
 | Stripe upgrade → unlock features | Cache invalidation patterns |
 ### 11.12 Updated Test Pyramid (Post-Review)
 | Level | Original | Revised | Rationale |
 |-------|----------|---------|-----------|
 | Unit | 70% (~350) | 65% (~350) | Add t.Parallel(), keep count but add UI component tests |
 | Integration | 20% (~100) | 28% (~150) | Terratest, mTLS, trace propagation, fair-share, security |
 | E2E/Smoke | 10% (~50) | 7% (~35) | Capped at 10 true E2E + 25 Playwright UI tests |
 *End of P2 Review Remediation Addendum*
--- a/products/03-alert-intelligence/test-architecture/test-architecture.md
+++ b/products/03-alert-intelligence/test-architecture/test-architecture.md
@@ -1409,3 +1409,459 @@ Before any release, these tests must pass:
 ---
 *End of dd0c/alert Test Architecture*
 ---
 ## 11. Review Remediation Addendum (Post-Gemini Review)
 ### 11.1 Missing Epic Coverage
 #### Epic 6: Dashboard API
 ```typescript
 describe('Dashboard API', () => {
  describe('Authentication', () => {
    it('returns 401 for missing Cognito JWT', async () => {});
    it('returns 401 for expired JWT', async () => {});
    it('returns 401 for JWT signed by wrong issuer', async () => {});
    it('extracts tenantId from JWT claims', async () => {});
  });
  describe('Incident Listing (GET /v1/incidents)', () => {
    it('returns paginated incidents for authenticated tenant', async () => {});
    it('supports cursor-based pagination', async () => {});
    it('filters by status (open, acknowledged, resolved)', async () => {});
    it('filters by severity (critical, warning, info)', async () => {});
    it('filters by time range (since, until)', async () => {});
    it('returns empty array for tenant with no incidents', async () => {});
  });
  describe('Incident Detail (GET /v1/incidents/:id)', () => {
    it('returns full incident with correlated alerts', async () => {});
    it('returns 404 for incident belonging to different tenant', async () => {});
    it('includes timeline of state transitions', async () => {});
  });
  describe('Analytics (GET /v1/analytics)', () => {
    it('returns MTTR for last 7/30/90 days', async () => {});
    it('returns alert volume by source', async () => {});
    it('returns noise reduction percentage', async () => {});
    it('scopes all analytics to authenticated tenant', async () => {});
  });
  describe('Tenant Isolation', () => {
    it('tenant A cannot read tenant B incidents via API', async () => {});
    it('tenant A cannot read tenant B analytics', async () => {});
    it('all DynamoDB queries include tenantId partition key', async () => {});
  });
 });
 ```
 #### Epic 7: Dashboard UI (Playwright)
 ```typescript
 // tests/e2e/ui/dashboard.spec.ts
 test('login redirects to Cognito hosted UI', async ({ page }) => {
  await page.goto('/dashboard');
  await expect(page).toHaveURL(/cognito/);
 });
 test('incident list renders with correct severity badges', async ({ page }) => {
  await page.goto('/dashboard/incidents');
  await expect(page.locator('[data-testid="incident-card"]')).toHaveCount(5);
  await expect(page.locator('.severity-critical')).toBeVisible();
 });
 test('incident detail shows correlated alert timeline', async ({ page }) => {
  await page.goto('/dashboard/incidents/inc-123');
  await expect(page.locator('[data-testid="alert-timeline"]')).toBeVisible();
  // Playwright has no toHaveCountGreaterThan matcher; a visible second event implies count > 1
  await expect(page.locator('.timeline-event').nth(1)).toBeVisible();
 });
 test('MTTR chart renders with real data', async ({ page }) => {
  await page.goto('/dashboard/analytics');
  await expect(page.locator('[data-testid="mttr-chart"]')).toBeVisible();
 });
 test('noise reduction percentage displays correctly', async ({ page }) => {
  await page.goto('/dashboard/analytics');
  const noise = page.locator('[data-testid="noise-reduction"]');
  await expect(noise).toContainText('%');
 });
 test('webhook setup wizard generates correct URL', async ({ page }) => {
  await page.goto('/dashboard/settings/integrations');
  await page.click('[data-testid="add-datadog"]');
  const url = await page.locator('[data-testid="webhook-url"]').textContent();
  expect(url).toMatch(/\/v1\/webhooks\/ingest\/.+/);
 });
 ```
 #### Epic 9: Onboarding & PLG
 ```typescript
 describe('Free Tier Enforcement', () => {
  it('allows up to 10,000 alerts/month on free tier', async () => {});
  it('returns 429 with upgrade prompt at 10,001st alert', async () => {});
  it('resets counter on first of each month', async () => {});
  it('purges alert data older than 7 days on free tier', async () => {});
  it('retains alert data for 90 days on pro tier', async () => {});
 });
 describe('OAuth Signup', () => {
  it('creates tenant record on first Cognito login', async () => {});
  it('assigns free tier by default', async () => {});
  it('generates unique webhook URL per tenant', async () => {});
 });
 describe('Stripe Integration', () => {
  it('creates checkout session with correct pricing', async () => {});
  it('upgrades tenant on checkout.session.completed webhook', async () => {});
  it('downgrades tenant on subscription.deleted webhook', async () => {});
  it('validates Stripe webhook signature', async () => {});
 });
 ```
 #### Epic 5.3: Slack Feedback Endpoint
 ```typescript
 describe('Slack Interactive Actions Endpoint', () => {
  it('validates Slack request signature (HMAC-SHA256)', async () => {});
  it('rejects request with invalid signature', async () => {});
  it('handles "helpful" feedback — updates incident quality score', async () => {});
  it('handles "noise" feedback — adds to suppression training data', async () => {});
  it('handles "escalate" action — triggers PagerDuty/OpsGenie', async () => {});
  it('updates original Slack message after action', async () => {});
  it('scopes action to correct tenant', async () => {});
 });
 ```
 #### Epic 1.4: S3 Raw Payload Archival
 ```typescript
 describe('Raw Payload Archival', () => {
  it('saves raw webhook payload to S3 asynchronously', async () => {});
  it('S3 key includes tenantId, source, and timestamp', async () => {});
  it('archival failure does not block alert processing', async () => {});
  it('archived payload is retrievable for replay', async () => {});
  it('S3 lifecycle policy deletes after retention period', async () => {});
 });
 ```
 ### 11.2 Anti-Pattern Fixes
 #### Replace ioredis-mock with WindowStore Interface
 ```typescript
 // BEFORE (anti-pattern):
 // import RedisMock from 'ioredis-mock';
 // const engine = new CorrelationEngine(new RedisMock());
 // AFTER (correct):
 interface WindowStore {
  addEvent(tenantId: string, key: string, event: Alert, ttlMs: number): Promise<void>;
  getWindow(tenantId: string, key: string): Promise<Alert[]>;
  clearWindow(tenantId: string, key: string): Promise<void>;
 }
 class InMemoryWindowStore implements WindowStore {
  private store = new Map<string, { events: Alert[]; expiresAt: number }>();
  async addEvent(tenantId: string, key: string, event: Alert, ttlMs: number) {
    const fullKey = `${tenantId}:${key}`;
    let entry = this.store.get(fullKey);
    // Start a fresh window if none exists or the previous one has expired
    if (!entry || entry.expiresAt < Date.now()) {
      entry = { events: [], expiresAt: Date.now() + ttlMs };
      this.store.set(fullKey, entry);
    }
    entry.events.push(event);
  }
  async getWindow(tenantId: string, key: string): Promise<Alert[]> {
    const fullKey = `${tenantId}:${key}`;
    const entry = this.store.get(fullKey);
    if (!entry || entry.expiresAt < Date.now()) return [];
    return entry.events;
  }
  // Required by the WindowStore interface
  async clearWindow(tenantId: string, key: string) {
    this.store.delete(`${tenantId}:${key}`);
  }
 }
 // Unit tests use InMemoryWindowStore — no Redis dependency
 // Integration tests use RedisWindowStore with Testcontainers
 ```
 #### Replace sinon.useFakeTimers with Clock Interface
 ```typescript
 // BEFORE (anti-pattern):
 // sinon.useFakeTimers(new Date('2026-03-01T00:00:00Z'));
 // AFTER (correct):
 interface Clock {
  now(): number;
  advanceBy(ms: number): void;
 }
 class FakeClock implements Clock {
  private current: number;
  constructor(start: Date = new Date()) { this.current = start.getTime(); }
  now() { return this.current; }
  advanceBy(ms: number) { this.current += ms; }
 }
 class SystemClock implements Clock {
  now() { return Date.now(); }
  advanceBy() { throw new Error('Cannot advance system clock'); }
 }
 // Inject into CorrelationEngine:
 const engine = new CorrelationEngine(new InMemoryWindowStore(), new FakeClock());
 ```
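To illustrate why an injected clock helps, here is a minimal, self-contained sketch of a TTL window that reads time only through a `Clock`. `ExpiringWindow` is a hypothetical class written for this example; the real `CorrelationEngine` wiring may differ.

```typescript
// Hypothetical sketch: window expiry driven entirely by an injected clock.
interface Clock {
  now(): number;
}

class FakeClock implements Clock {
  private current: number;
  constructor(startMs: number = 0) { this.current = startMs; }
  now(): number { return this.current; }
  advanceBy(ms: number): void { this.current += ms; }
}

class ExpiringWindow<T> {
  private events: T[] = [];
  private expiresAt = 0;
  private readonly clock: Clock;
  private readonly ttlMs: number;

  constructor(clock: Clock, ttlMs: number) {
    this.clock = clock;
    this.ttlMs = ttlMs;
  }

  add(event: T): void {
    // Open a fresh window if the previous one has expired.
    if (this.clock.now() >= this.expiresAt) {
      this.events = [];
      this.expiresAt = this.clock.now() + this.ttlMs;
    }
    this.events.push(event);
  }

  get(): T[] {
    return this.clock.now() >= this.expiresAt ? [] : [...this.events];
  }
}

// Usage: a 5-minute window, advanced without any real waiting.
const demoClock = new FakeClock();
const demoWindow = new ExpiringWindow<string>(demoClock, 5 * 60000);
demoWindow.add('cpu-high');
demoClock.advanceBy(4 * 60000);
demoWindow.add('disk-full');
const sizeBeforeExpiry = demoWindow.get().length;
demoClock.advanceBy(60000);
const sizeAfterExpiry = demoWindow.get().length;
```

Because no global timer is patched, tests that use this pattern can run in parallel without interfering with each other.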
 ### 11.3 Trace Context Propagation Tests
 ```typescript
 describe('Trace Context Propagation', () => {
  it('API Gateway passes trace_id to Lambda via X-Amzn-Trace-Id', async () => {});
  it('Lambda propagates trace_id into SQS message attributes', async () => {
    // Verify SQS message has MessageAttribute 'traceparent' with W3C format
    const msg = await getLastSQSMessage(localstack, 'alert-queue');
    expect(msg.MessageAttributes.traceparent).toBeDefined();
    expect(msg.MessageAttributes.traceparent.StringValue).toMatch(
      /^00-[0-9a-f]{32}-[0-9a-f]{16}-0[01]$/
    );
  });
  it('ECS Correlation Engine extracts trace_id from SQS message', async () => {
    // Verify the correlation span has the correct parent from SQS
    const spans = inMemoryExporter.getFinishedSpans();
    const correlationSpan = spans.find(s => s.name === 'alert.correlation');
    const ingestSpan = spans.find(s => s.name === 'webhook.ingest');
    expect(correlationSpan.parentSpanId).toBeDefined();
    // Parent chain must trace back to the original ingest span
    expect(correlationSpan.spanContext().traceId).toBe(ingestSpan.spanContext().traceId);
  });
  it('end-to-end trace spans webhook → SQS → correlation → notification', async () => {
    // Fire a webhook, wait for Slack notification, verify all spans share trace_id
    const traceId = await fireWebhookAndGetTraceId();
    const spans = await getSpansByTraceId(traceId);
    const spanNames = spans.map(s => s.name);
    expect(spanNames).toContain('webhook.ingest');
    expect(spanNames).toContain('alert.normalize');
    expect(spanNames).toContain('alert.correlation');
    expect(spanNames).toContain('notification.slack');
  });
 });
 ```
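The `traceparent` assertions above rely on the W3C Trace Context version-00 header layout. A minimal sketch of formatting and parsing that header follows; `formatTraceparent`, `parseTraceparent` and `toSqsAttributes` are hypothetical helper names, and production code would more likely use the OpenTelemetry propagator.

```typescript
// Hypothetical helpers for the W3C traceparent header (version 00).
interface TraceContext {
  traceId: string; // 32 lowercase hex chars
  spanId: string;  // 16 lowercase hex chars
  sampled: boolean;
}

const TRACEPARENT_RE = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

function formatTraceparent(ctx: TraceContext): string {
  return `00-${ctx.traceId}-${ctx.spanId}-${ctx.sampled ? '01' : '00'}`;
}

function parseTraceparent(header: string): TraceContext | null {
  const match = TRACEPARENT_RE.exec(header);
  if (!match) return null;
  const traceId = match[1] as string;
  const spanId = match[2] as string;
  const flags = match[3] as string;
  // All-zero IDs are invalid per the W3C Trace Context specification.
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return null;
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 0x01) === 1 };
}

// Shape of an SQS string message attribute carrying the header.
function toSqsAttributes(ctx: TraceContext) {
  return { traceparent: { DataType: 'String', StringValue: formatTraceparent(ctx) } };
}
```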
 ### 11.4 HMAC Security Hardening
 ```typescript
 describe('HMAC Signature Validation (Hardened)', () => {
  it('uses crypto.timingSafeEqual, not === comparison', () => {
    // Inspect the source to verify timing-safe comparison
    const source = fs.readFileSync('src/ingestion/hmac.ts', 'utf8');
    expect(source).toContain('timingSafeEqual');
    expect(source).not.toMatch(/signature\s*===\s*/);
  });
  it('handles case-insensitive header names (dd-webhook-signature vs DD-WEBHOOK-SIGNATURE)', async () => {
    const payload = makeAlertPayload('datadog');
    const sig = computeHMAC(payload, DATADOG_SECRET);
    // Lowercase header
    const resp1 = await ingest(payload, { 'dd-webhook-signature': sig });
    expect(resp1.status).toBe(200);
    // Uppercase header
    const resp2 = await ingest(payload, { 'DD-WEBHOOK-SIGNATURE': sig });
    expect(resp2.status).toBe(200);
  });
  it('rejects completely missing signature header', async () => {
    const resp = await ingest(makeAlertPayload('datadog'), {});
    expect(resp.status).toBe(401);
  });
  it('rejects empty signature header', async () => {
    const resp = await ingest(makeAlertPayload('datadog'), { 'dd-webhook-signature': '' });
    expect(resp.status).toBe(401);
  });
 });
 ```
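For reference, a timing-safe check of the kind these tests expect could look like the sketch below. It uses Node's `crypto.createHmac` and `crypto.timingSafeEqual`; the function names, the hex encoding of the signature, and the header lookup helper are assumptions for illustration, not the contents of `src/ingestion/hmac.ts`.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Hypothetical helper: true only when signatureHex is the HMAC-SHA256 of rawBody.
function verifySignature(rawBody: string, signatureHex: string | undefined, secret: string): boolean {
  if (!signatureHex) return false; // missing or empty header
  const expected = createHmac('sha256', secret).update(rawBody).digest();
  const provided = Buffer.from(signatureHex, 'hex');
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (provided.length !== expected.length) return false;
  return timingSafeEqual(provided, expected);
}

// HTTP header names are case-insensitive, so normalize before lookup.
function getHeader(headers: Record<string, string>, name: string): string | undefined {
  const wanted = name.toLowerCase();
  for (const [key, value] of Object.entries(headers)) {
    if (key.toLowerCase() === wanted) return value;
  }
  return undefined;
}
```

Comparing decoded bytes rather than strings keeps the comparison constant-time for equal-length inputs, which is the property the first test tries to enforce.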
 ### 11.5 SQS 256KB Payload Limit
 ```typescript
 describe('Large Payload Handling', () => {
  it('compresses payloads >200KB before sending to SQS', async () => {
    const largePayload = makeLargeAlertPayload(300 * 1024); // 300KB
    const resp = await ingest(largePayload);
    expect(resp.status).toBe(200);
    const msg = await getLastSQSMessage(localstack, 'alert-queue');
    // Payload must be compressed or use S3 pointer
    expect(msg.Body.length).toBeLessThan(256 * 1024);
  });
  it('uses S3 pointer for payloads >256KB after compression', async () => {
    const hugePayload = makeLargeAlertPayload(500 * 1024); // 500KB
    const resp = await ingest(hugePayload);
    expect(resp.status).toBe(200);
    const msg = await getLastSQSMessage(localstack, 'alert-queue');
    const body = JSON.parse(msg.Body);
    expect(body.s3Pointer).toBeDefined();
    expect(body.s3Pointer).toMatch(/^s3:\/\/dd0c-alert-overflow\//);
  });
  it('strips unnecessary fields from Datadog payload before SQS', async () => {
    const payload = makeDatadogPayloadWithLargeTags(100); // 100 tags
    const resp = await ingest(payload);
    expect(resp.status).toBe(200);
    const msg = await getLastSQSMessage(localstack, 'alert-queue');
    const normalized = JSON.parse(msg.Body);
    // Only essential fields should remain
    expect(normalized.tags.length).toBeLessThanOrEqual(20);
  });
  it('rejects payloads >2MB at API Gateway level', async () => {
    const massive = makeLargeAlertPayload(3 * 1024 * 1024);
    const resp = await ingest(massive);
    expect(resp.status).toBe(413);
  });
 });
 ```
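The size tiers these tests describe (inline, compressed, S3 pointer) can be sketched as a single decision function. This is an assumption-laden illustration: `buildEnvelope`, the `Envelope` shape, and the base64 gzip encoding are hypothetical, and the caller is assumed to upload the payload to S3 when a pointer is returned.

```typescript
import { gzipSync, gunzipSync } from 'node:zlib';

const COMPRESS_THRESHOLD_BYTES = 200 * 1024;
const SQS_LIMIT_BYTES = 256 * 1024;

// Hypothetical message envelope; the real wire format may differ.
type Envelope =
  | { kind: 'inline'; body: string }
  | { kind: 'gzip'; body: string } // base64-encoded gzip bytes
  | { kind: 's3Pointer'; s3Pointer: string };

function buildEnvelope(payload: string, overflowKey: string): Envelope {
  if (Buffer.byteLength(payload, 'utf8') <= COMPRESS_THRESHOLD_BYTES) {
    return { kind: 'inline', body: payload };
  }
  const candidate: Envelope = { kind: 'gzip', body: gzipSync(payload).toString('base64') };
  if (Buffer.byteLength(JSON.stringify(candidate), 'utf8') < SQS_LIMIT_BYTES) {
    return candidate;
  }
  // Still too large after compression: hand back a pointer instead of the body.
  return { kind: 's3Pointer', s3Pointer: `s3://dd0c-alert-overflow/${overflowKey}` };
}

// Returns null for pointers; the consumer would fetch those from S3.
function decodeEnvelope(env: Envelope): string | null {
  if (env.kind === 'inline') return env.body;
  if (env.kind === 'gzip') return gunzipSync(Buffer.from(env.body, 'base64')).toString('utf8');
  return null;
}
```

Measuring the serialized envelope rather than the raw compressed bytes accounts for base64 and JSON overhead, which is what actually counts against the queue limit.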
 ### 11.6 DLQ Backpressure & Replay
 ```typescript
 describe('DLQ Replay with Backpressure', () => {
  it('replays DLQ messages in batches of 100', async () => {
    await seedDLQ(10000); // 10K messages
    const replayer = new DLQReplayer({ batchSize: 100, delayBetweenBatchesMs: 500 });
    await replayer.start();
    // Verify batched processing
    expect(replayer.batchesProcessed).toBeGreaterThan(0);
    expect(replayer.maxConcurrentMessages).toBeLessThanOrEqual(100);
  });
  it('pauses replay if correlation engine error rate exceeds 10%', async () => {
    await seedDLQ(1000);
    const replayer = new DLQReplayer({ batchSize: 100, errorThreshold: 0.1 });
    // Simulate correlation engine returning errors
    mockCorrelationEngine.failRate = 0.15;
    await replayer.start();
    expect(replayer.state).toBe('paused');
    expect(replayer.pauseReason).toContain('error rate exceeded');
  });
  it('does not replay if circuit breaker is currently tripped', async () => {
    await seedDLQ(100);
    await tripCircuitBreaker();
    const replayer = new DLQReplayer();
    await replayer.start();
    expect(replayer.messagesReplayed).toBe(0);
    expect(replayer.state).toBe('blocked_by_circuit_breaker');
  });
  it('tracks replay progress for resumability', async () => {
    await seedDLQ(500);
    const replayer = new DLQReplayer({ batchSize: 50 });
    // Process 3 batches then stop
    await replayer.processNBatches(3);
    expect(replayer.checkpoint).toBe(150);
    // Resume from checkpoint
    const replayer2 = new DLQReplayer({ resumeFrom: replayer.checkpoint });
    await replayer2.start();
    expect(replayer2.startedFrom).toBe(150);
  });
 });
 ```
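The batching, error-threshold pause, and checkpoint behaviour tested above can be sketched as one loop. The version below is deliberately synchronous to keep it short, and every name in it (`replayInBatches`, `ReplayOptions`, `ReplayResult`) is hypothetical; the real `DLQReplayer` would await its handler and add a delay between batches.

```typescript
// Hypothetical sketch of a batched replay loop with an error-rate pause.
interface ReplayOptions {
  batchSize: number;
  errorThreshold: number; // e.g. 0.1 pauses when more than 10% of a batch fails
  resumeFrom?: number;    // checkpoint from a previous run
}

interface ReplayResult {
  state: 'completed' | 'paused';
  checkpoint: number;
  batchesProcessed: number;
}

function replayInBatches<T>(messages: T[], handler: (msg: T) => void, opts: ReplayOptions): ReplayResult {
  let checkpoint = opts.resumeFrom ?? 0;
  let batchesProcessed = 0;
  while (checkpoint < messages.length) {
    const batch = messages.slice(checkpoint, checkpoint + opts.batchSize);
    let failures = 0;
    for (const msg of batch) {
      try {
        handler(msg);
      } catch (err) {
        failures += 1; // a real replayer would re-queue the failed message
      }
    }
    checkpoint += batch.length;
    batchesProcessed += 1;
    if (failures / batch.length > opts.errorThreshold) {
      return { state: 'paused', checkpoint, batchesProcessed };
    }
  }
  return { state: 'completed', checkpoint, batchesProcessed };
}
```

Returning the checkpoint on pause is what makes the replay resumable: a later run passes it back as `resumeFrom`.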
 ### 11.7 Multi-Tenancy Isolation (DynamoDB)
 ```typescript
 describe('DynamoDB Tenant Isolation', () => {
  it('all DAO methods require tenantId parameter', () => {
    // Compile-time check: DAO interface has tenantId as first param
    const daoSource = fs.readFileSync('src/data/incident-dao.ts', 'utf8');
    const methods = extractPublicMethods(daoSource);
    for (const method of methods) {
      expect(method.params[0].name).toBe('tenantId');
    }
  });
  it('query for tenant A returns zero results for tenant B data', async () => {
    const dao = new IncidentDAO(dynamoClient);
    await dao.create('tenant-A', makeIncident());
    await dao.create('tenant-B', makeIncident());
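    const results = await dao.list('tenant-A');
    expect(results.length).toBeGreaterThan(0); // guards against every() passing vacuously on an empty list
    expect(results.every(r => r.tenantId === 'tenant-A')).toBe(true);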
  });
  it('partition key always includes tenantId prefix', async () => {
    const dao = new IncidentDAO(dynamoClient);
    await dao.create('tenant-X', makeIncident());
    // Read raw DynamoDB item
    const item = await dynamoClient.scan({ TableName: 'dd0c-alert-main' });
    // toMatch is a core Jest matcher; toStartWith would require jest-extended
    expect(item.Items[0].PK.S).toMatch(/^TENANT#tenant-X/);
  });
 });
 ```
 ### 11.8 Slack Circuit Breaker
 ```typescript
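 describe('Slack Notification Circuit Breaker', () => {
  // Shared fixtures for every test below; assumes SlackClient accepts an injected Clock
  let clock: FakeClock;
  let slackClient: SlackClient;
  beforeEach(() => {
    clock = new FakeClock();
    slackClient = new SlackClient({ circuitBreakerThreshold: 10, clock });
  });
  it('opens circuit after 10 consecutive 429s from Slack', async () => {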
    for (let i = 0; i < 10; i++) {
      mockSlack.respondWith(429);
      await slackClient.send(makeMessage()).catch(() => {});
    }
    expect(slackClient.circuitState).toBe('open');
  });
  it('queues notifications while circuit is open', async () => {
    slackClient.openCircuit();
    await slackClient.send(makeMessage());
    expect(slackClient.queuedMessages).toBe(1);
  });
  it('half-opens circuit after 60 seconds', async () => {
    slackClient.openCircuit();
    clock.advanceBy(61000);
    expect(slackClient.circuitState).toBe('half-open');
  });
  it('drains queue on successful half-open probe', async () => {
    slackClient.openCircuit();
    slackClient.queue(makeMessage());
    slackClient.queue(makeMessage());
    clock.advanceBy(61000);
    mockSlack.respondWith(200);
    await slackClient.probe();
    expect(slackClient.circuitState).toBe('closed');
    expect(slackClient.queuedMessages).toBe(0);
  });
 });
 ```
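The closed, open, and half-open transitions exercised above can be modelled as a small state machine. The sketch below is a hypothetical standalone `CircuitBreaker` with an injected time source; it is not the `SlackClient` implementation, and message queueing is left out.

```typescript
// Hypothetical sketch of a consecutive-failure circuit breaker.
type CircuitState = 'closed' | 'open' | 'half-open';

class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  private readonly threshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(threshold: number, cooldownMs: number, now: () => number) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  get state(): CircuitState {
    if (this.openedAt === null) return 'closed';
    // After the cooldown, allow a single probe request through.
    return this.now() - this.openedAt >= this.cooldownMs ? 'half-open' : 'open';
  }

  recordFailure(): void {
    const state = this.state;
    if (state === 'open') return; // nothing should be sent while open
    if (state === 'half-open') {
      this.openedAt = this.now(); // failed probe: restart the cooldown
      return;
    }
    this.failures += 1;
    if (this.failures >= this.threshold) this.openedAt = this.now();
  }

  recordSuccess(): void {
    // Any success resets the consecutive-failure count and closes the circuit.
    this.failures = 0;
    this.openedAt = null;
  }
}
```

Deriving `half-open` from elapsed time, rather than from a timer callback, means a fake clock alone is enough to drive every transition in a test.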
 ### 11.9 Updated Test Pyramid (Post-Review)
 | Level | Original | Revised | Rationale |
 |-------|----------|---------|-----------|
 | Unit | 70% (~140) | 65% (~180) | More tests total, but integration share grows |
 | Integration | 20% (~40) | 25% (~70) | Dashboard API, tenant isolation, trace propagation |
 | E2E | 10% (~20) | 10% (~28) | Dashboard UI (Playwright), onboarding flow |
 *End of P3 Review Remediation Addendum*
--- a/products/04-lightweight-idp/test-architecture/test-architecture.md
+++ b/products/04-lightweight-idp/test-architecture/test-architecture.md
@@ -1107,3 +1107,161 @@ Phase 7: E2E Validation
 ---
 *End of dd0c/portal Test Architecture*
 ---
 ## 11. Review Remediation Addendum (Post-Gemini Review)
 ### 11.1 Resolve Database Misalignment (PostgreSQL vs DynamoDB)
 Epic 10.2 specified DynamoDB Single-Table, but the Architecture and Test Architecture are fundamentally built around PostgreSQL (Aurora Serverless v2) with pgvector. 
 **Resolution:** The IDP requires relational joins and vector search. PostgreSQL is the definitive catalog database. DynamoDB references are removed.
 ```rust
 // tests/schema/migration_validation_test.rs
 #[tokio::test]
 async fn elastic_schema_postgres_migration_is_additive_only() {
    let migrations = read_sql_migrations("./migrations");
    for migration in migrations {
        assert!(!migration.contains("DROP COLUMN"), "Destructive schema change detected");
        assert!(!migration.contains("ALTER COLUMN"), "Type modification detected");
        assert!(!migration.contains("RENAME COLUMN"), "Column rename detected");
    }
 }
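 #[tokio::test]
 async fn migration_does_not_hold_exclusive_locks_on_reads() {
    // Every index must be built with CONCURRENTLY; a plain CREATE INDEX locks the table
    let migrations = read_sql_migrations("./migrations");
    for migration in migrations {
        let sql = migration.to_uppercase();
        let total = sql.matches("CREATE INDEX").count() + sql.matches("CREATE UNIQUE INDEX").count();
        let concurrent = sql.matches("INDEX CONCURRENTLY").count();
        assert_eq!(total, concurrent,
            "Indexes must be created concurrently to avoid locking the catalog");
    }
 }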
 ```
 ### 11.2 Invert the Test Pyramid (Integration Honeycomb)
 Shift from 70% unit tests (with heavy moto/responses mocking) to a 30% unit / 60% integration / 10% E2E split, backed by VCR cassettes and LocalStack.
 ```python
 # tests/integration/scanners/test_aws_scanner.py
@pytest.mark.vcr()
 def test_aws_scanner_discovers_ecs_services_and_api_gateways(vcr_cassette):
    # Uses real recorded AWS API responses, not moto mocks
    # Validates actual boto3 parsing against real-world AWS shapes
    scanner = AWSDiscoveryScanner(account_id="123456789012", region="us-east-1")
    services = scanner.scan()
    assert len(services) > 0
    assert any(s.type == "ecs_service" for s in services)
@pytest.mark.vcr()
 def test_github_scanner_handles_graphql_pagination(vcr_cassette):
    # Validates real GitHub GraphQL paginated responses
    scanner = GitHubDiscoveryScanner(org_name="dd0c")
    repos = scanner.scan()
    assert len(repos) > 100 # Proves pagination logic works
 ```
 ### 11.3 Missing Epic Coverage
 #### Epic 3.4: PagerDuty & OpsGenie Integrations
 ```python
 # tests/integration/test_pagerduty_sync.py
@pytest.mark.vcr()
 def test_pagerduty_sync_maps_schedules_to_catalog_teams():
    sync = PagerDutySyncer(api_key="sk-test-key")
    teams = sync.fetch_oncall_schedules()
    assert teams[0].oncall_email is not None
 def test_pagerduty_credentials_are_encrypted_at_rest():
    # Verify KMS envelope encryption for 3rd party API keys
    pass
 ```
 #### Epic 4.3: Redis Prefix Caching for Cmd+K
 ```python
 # tests/integration/test_search_cache.py
 def test_cmd_k_search_hits_redis_cache_before_postgres():
    redis_client.set("search:auth", json.dumps([{"name": "auth-service"}]))
    # Must return < 5ms from Redis, skipping DB
    result = search_api.query("auth")
    assert result[0]['name'] == "auth-service"
 def test_catalog_update_invalidates_search_cache():
    # Create new service
    catalog_api.create_service("billing-api")
    # Prefix cache must be purged
    assert redis_client.keys("search:*") == []
 ```
 #### Epics 5 & 6: UI and Dashboards (Playwright)
 ```typescript
 // tests/e2e/ui/catalog.spec.ts
 test('service catalog renders progressive disclosure UI', async ({ page }) => {
  await page.goto('/catalog');
  // Click expands details instead of navigating away
  await page.click('[data-testid="service-row-auth-api"]');
  await expect(page.locator('[data-testid="service-drawer"]')).toBeVisible();
 });
 test('dashboard KPI aggregation shows total services and ownership coverage', async ({ page }) => {
  await page.goto('/dashboard');
  await expect(page.locator('[data-testid="kpi-total-services"]')).toHaveText("150");
  await expect(page.locator('[data-testid="kpi-ownership"]')).toHaveText("85%");
 });
 ```
 #### Epic 9: Onboarding & Stripe
 ```python
 # tests/integration/test_stripe_webhooks.py
 def test_stripe_checkout_completed_upgrades_tenant_tier():
    payload = load_fixture("stripe_checkout_completed.json")
    signature = generate_stripe_signature(payload, secret)
    response = api_client.post("/webhooks/stripe", data=payload, headers={"Stripe-Signature": signature})
    assert response.status_code == 200
    tenant = db.get_tenant("t-123")
    assert tenant.tier == "pro"
 def test_websocket_streams_discovery_progress_during_onboarding():
    # Connect WS client, trigger discovery, assert WS receives "discovering AWS...", "found 50 resources..."
    pass
 ```
 ### 11.4 Scaled Performance Benchmarks
 ```python
 # tests/performance/test_discovery_scale.py
 def test_discovery_pipeline_handles_10000_aws_resources_without_step_functions_payload_limit():
    # Simulate an AWS environment with 10k resources
    # Must chunk state machine transitions to stay under 256KB Step Functions limit
    pass
 def test_discovery_pipeline_handles_1000_github_repos():
    # Verify GraphQL batching and rate limit backoff
    pass
 ```
 ### 11.5 Edge Case Resilience
 ```python
 def test_github_graphql_concurrent_rate_limiting():
    # If 5 tenants scan concurrently, respect Retry-After headers across workers
    pass
 def test_partial_discovery_scan_does_not_corrupt_catalog():
    # If GitHub scan times out halfway, existing services must NOT be marked stale
    pass
 def test_ownership_conflict_resolution():
    # If two discovery sources claim the same repo, prioritize Explicit (Config) over Implicit (Tags)
    pass
 def test_meilisearch_index_rebuild_does_not_drop_search():
    # Verify zero-downtime index swapping during mapping updates
    pass
 ```
--- a/products/05-aws-cost-anomaly/test-architecture/test-architecture.md
+++ b/products/05-aws-cost-anomaly/test-architecture/test-architecture.md
@@ -1,8 +1,8 @@
 # dd0c/cost — Test Architecture & TDD Strategy
 **Product:** dd0c/cost — AWS Cost Anomaly Detective
-**Author:** Test Architecture Phase
+**Author:** Test Architecture Phase (v2 — Post-Review Rewrite)
-**Date:** February 28, 2026
+**Date:** March 1, 2026
 **Status:** V1 MVP — Solo Founder Scope
 ---
@@ -13,7 +13,9 @@
 dd0c/cost sits at the intersection of **money and infrastructure**. A false negative means a customer loses thousands of dollars. A false positive means alert fatigue and churn. The test suite's primary job is to mathematically prove the anomaly scoring engine works across edge cases.
-Guiding principle: **Test the math first, test the infrastructure second.** The Z-score and novelty algorithms must be exhaustively unit-tested with synthetic data before any AWS APIs are mocked.
+Guiding principle: **Test the math first, test the infrastructure second.** The Z-score and novelty algorithms must be exhaustively tested with property-based testing before any AWS APIs are mocked.
 Second principle: **Every dollar matters.** Cost calculations involve floating-point arithmetic on money. Rounding errors, precision loss, and currency handling must be tested with the same rigor as a financial system.
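As an illustration of the precision concern, one common approach is to hold money as integer micro-dollars and round only at the boundaries. The sketch below is an assumption, not the product's pricing code; `toMicros`, `estimateCostMicros` and `formatDollars` are hypothetical names.

```typescript
// Hypothetical sketch: integer micro-dollars avoid accumulating float error.
const MICROS_PER_DOLLAR = 1000000;

function toMicros(dollars: number): number {
  return Math.round(dollars * MICROS_PER_DOLLAR);
}

function estimateCostMicros(hourlyRateMicros: number, hours: number): number {
  // Round once, after multiplication, so the result stays an integer.
  return Math.round(hourlyRateMicros * hours);
}

function formatDollars(micros: number): string {
  return `$${(micros / MICROS_PER_DOLLAR).toFixed(2)}`;
}

// Floating-point dollars drift; integer micro-dollars do not.
const floatSumIsExact = 0.1 + 0.2 === 0.3;
const microSumIsExact = toMicros(0.1) + toMicros(0.2) === toMicros(0.3);
const dailyCost = formatDollars(estimateCostMicros(toMicros(0.2016), 24));
```

Tests for cost calculations can then assert exact integer equality instead of relying on approximate comparisons.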
 ### 1.2 Red-Green-Refactor Adapted to dd0c/cost
@@ -28,33 +30,52 @@ REFACTOR → Optimize the baseline lookup, extract novelty checks,
 ```
 **When to write tests first (strict TDD):**
- Anomaly scoring engine (Z-scores, novelty checks, composite severity)
+- All anomaly scoring (Z-scores, novelty checks, composite severity)
- Cold-start heuristics (fast-path for >$5/hr resources)
+- All cold-start heuristics (fast-path for >$5/hr resources)
- Baseline calculation (moving averages, standard deviation)
+- All baseline calculation (Welford algorithm, maturity transitions)
- Governance policy (strict vs. audit mode, 14-day promotion)
+- All governance policy (strict vs. audit mode, 14-day auto-promotion, panic mode)
 - All Slack signature validation (security-critical)
 - All cost calculations (pricing lookup, hourly cost estimation)
 - All feature flag circuit breakers
 **When integration tests lead:**
 - CloudTrail ingestion (implement against LocalStack EventBridge, then lock in)
 - DynamoDB Single-Table schema (build access patterns, then integration test)
 - Cross-account STS role assumption (test against LocalStack)
 **When E2E tests lead:**
- The Slack alert interaction (format block kit, test the "Snooze/Terminate" buttons)
+- Slack alert interaction (format block kit, test "Snooze/Terminate" buttons)
 - Onboarding wizard (CloudFormation quick-create → role validation → first alert)
 ### 1.3 Test Naming Conventions
 ```typescript
 // Unit tests
 describe('AnomalyScorer', () => {
-  it('assigns critical severity when Z-score > 3 and hourly cost > $1', () => {});
+  it('assigns critical severity when Z-score exceeds 3 and hourly cost exceeds $1', () => {});
-  it('flags actor novelty when IAM role has never launched this service', () => {});
+  it('flags actor novelty when IAM role has never launched this service type', () => {});
  it('bypasses baseline and triggers fast-path critical for $10/hr instance', () => {});
 });
-describe('CloudTrailNormalizer', () => {
+describe('BaselineCalculator', () => {
-  it('extracts instance type and region from RunInstances event', () => {});
+  it('updates running mean using Welford online algorithm', () => {});
-  it('looks up correct on-demand pricing for us-east-1 r6g.xlarge', () => {});
+  it('handles zero standard deviation without division by zero', () => {});
 });
 // Property-based tests
 describe('AnomalyScorer (property-based)', () => {
  it('always returns severity between 0 and 100 for any valid input', () => {});
  it('monotonically increases score as Z-score increases', () => {});
  it('never assigns critical to events below $0.50/hr regardless of Z-score', () => {});
 });
 ```
 **Rules:**
 - Describe the observable outcome, not the implementation
 - Use present tense
 - If you need "and" in the name, split into two tests
 - Property-based tests explicitly state the invariant
 ---
 ## Section 2: Test Pyramid
@@ -63,93 +84,441 @@ describe('CloudTrailNormalizer', () => {
 | Level | Target | Count (V1) | Runtime |
 |-------|--------|------------|---------|
-| Unit | 70% | ~250 tests | <20s |
+| Unit | 80% | ~350 tests | <25s |
-| Integration | 20% | ~80 tests | <3min |
+| Integration | 15% | ~65 tests | <4min |
-| E2E/Smoke | 10% | ~15 tests | <5min |
+| E2E/Smoke | 5% | ~15 tests | <8min |
 Higher unit ratio than other dd0c products because the core value is pure math (scoring, baselines, Z-scores).
 ### 2.2 Unit Test Targets
 | Component | Key Behaviors | Est. Tests |
 |-----------|--------------|------------|
-| Event Normalizer | CloudTrail parsing, pricing lookup, deduplication | 40 |
+| CloudTrail Normalizer | Event parsing, pricing lookup, dedup, field extraction | 40 |
-| Baseline Engine | Running mean/stddev calculation, maturity checks | 35 |
+| Baseline Engine | Welford algorithm, maturity transitions, feedback loop | 45 |
-| Anomaly Scorer | Z-score math, novelty detection, composite scoring | 50 |
+| Anomaly Scorer | Z-score, novelty, composite scoring, cold-start fast-path | 60 |
-| Remediation Handler | Stop/Terminate payload parsing, IAM role assumption logic | 20 |
+| Zombie Hunter | Idle resource detection, cost estimation, age calculation | 25 |
-| Notification Engine | Slack formatting, daily digest aggregation | 30 |
+| Notification Formatter | Slack Block Kit, daily digest, CLI command generation | 30 |
-| Governance Policy | Mode enforcement, 14-day auto-promotion | 25 |
+| Slack Bot | Command parsing, signature validation, action handling | 25 |
-| Feature Flags | Circuit breaker on alert volume, flag metadata | 15 |
+| Remediation Handler | Stop/Terminate logic, IAM role assumption, snooze/dismiss | 20 |
 | Dashboard API | CRUD, tenant isolation, pagination, filtering | 25 |
 | Governance Policy | Mode enforcement, 14-day promotion, panic mode | 30 |
 | Feature Flags | Circuit breaker, flag lifecycle, local evaluation | 15 |
 | Onboarding | CFN template validation, role validation, free tier enforcement | 20 |
 | Cost Calculations | Pricing precision, rounding, fallback pricing, currency | 15 |
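The Baseline Engine row names Welford's online algorithm. A minimal sketch of the update step and a guarded Z-score follows; the `Baseline` shape and the choice to return 0 when the standard deviation is zero are assumptions made for illustration, not the engine's confirmed behaviour.

```typescript
// Hypothetical sketch of a Welford running baseline.
interface Baseline {
  count: number;
  mean: number;
  m2: number; // sum of squared deviations from the running mean
}

const EMPTY_BASELINE: Baseline = { count: 0, mean: 0, m2: 0 };

function updateBaseline(b: Baseline, x: number): Baseline {
  const count = b.count + 1;
  const delta = x - b.mean;
  const mean = b.mean + delta / count;
  // Uses the deviation from both the old and the new mean (Welford's update).
  const m2 = b.m2 + delta * (x - mean);
  return { count, mean, m2 };
}

function sampleStdDev(b: Baseline): number {
  return b.count < 2 ? 0 : Math.sqrt(b.m2 / (b.count - 1));
}

function zScore(b: Baseline, x: number): number {
  const sd = sampleStdDev(b);
  // Assumption: a flat baseline yields 0 rather than Infinity or NaN.
  return sd === 0 ? 0 : (x - b.mean) / sd;
}
```

Because each update touches only three numbers, the baseline can be stored alongside the resource record and updated per event without re-reading history.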
 ### 2.3 Integration Test Boundaries
 | Boundary | What's Tested | Infrastructure |
 |----------|--------------|----------------|
 | EventBridge → SQS FIFO | Cross-account event routing, dedup, ordering | LocalStack |
 | SQS → Event Processor Lambda | Batch processing, error handling, DLQ routing | LocalStack |
 | Event Processor → DynamoDB | CostEvent writes, baseline updates, transactions | Testcontainers DynamoDB Local |
 | Anomaly Scorer → DynamoDB | Baseline reads, anomaly record writes | Testcontainers DynamoDB Local |
 | Notifier → Slack API | Block Kit delivery, rate limiting, message updates | WireMock |
 | API Gateway → Lambda | Auth (Cognito JWT), routing, throttling | LocalStack |
 | STS → Customer Account | Cross-account role assumption, ExternalId validation | LocalStack |
 | CDK Synth | Infrastructure snapshot, resource policy validation | CDK assertions |
 ### 2.4 E2E/Smoke Scenarios
 1. **Real-Time Anomaly Detection**: CloudTrail event → scoring → Slack alert (<30s)
 2. **Interactive Remediation**: Slack button click → StopInstances → message update
 3. **Onboarding Flow**: Signup → CFN deploy → role validation → first alert
 4. **14-Day Auto-Promotion**: Simulate 14 days → verify strict→audit transition
 5. **Zombie Hunter**: Daily scan → detect idle EC2 → Slack digest
 6. **Panic Mode**: Enable panic → all alerting stops → anomalies still logged
 ---
 ## Section 3: Unit Test Strategy
-### 3.1 Cost Ingestion & Normalization
+### 3.1 CloudTrail Normalizer
 ```typescript
 describe('CloudTrailNormalizer', () => {
-  it('normalizes EC2 RunInstances event to CostEvent schema', () => {});
+  describe('Event Parsing', () => {
-  it('normalizes RDS CreateDBInstance event to CostEvent schema', () => {});
+    it('normalizes EC2 RunInstances to CostEvent schema', () => {});
-  it('extracts assumed role ARN as actor instead of base STS role', () => {});
+    it('normalizes RDS CreateDBInstance to CostEvent schema', () => {});
-  it('applies fallback pricing when instance type is not in static table', () => {});
+    it('normalizes Lambda CreateFunction to CostEvent schema', () => {});
-  it('ignores non-cost-generating events (e.g., DescribeInstances)', () => {});
+    it('extracts assumed role ARN as actor (not base STS role)', () => {});
    it('extracts instance type, region, and AZ from event detail', () => {});
    it('handles batched RunInstances (multiple instances in one call)', () => {});
    it('ignores non-cost-generating events (DescribeInstances, ListBuckets)', () => {});
    it('handles malformed CloudTrail JSON without crashing', () => {});
    it('handles missing optional fields gracefully', () => {});
  });
  describe('Pricing Lookup', () => {
    it('looks up correct on-demand price for us-east-1 m5.xlarge', () => {});
    it('looks up correct on-demand price for us-west-2 r6g.2xlarge', () => {});
    it('applies fallback pricing when instance type not in static table', () => {});
    it('returns $0 for instance types with no pricing data and logs warning', () => {});
    it('handles GPU instances (p4d, g5) with correct pricing', () => {});
  });
  describe('Deduplication', () => {
    it('generates deterministic fingerprint from eventID', () => {});
    it('detects duplicate CloudTrail events by eventID', () => {});
    it('allows same resource type from different events', () => {});
  });
  describe('Cost Precision', () => {
    it('calculates hourly cost with 4 decimal places', () => {});
    it('rounds consistently (banker\'s rounding) to avoid accumulation errors', () => {});
    it('handles sub-cent costs for Lambda invocations', () => {});
  });
 });
 ```
-### 3.2 Anomaly Engine (The Math)
+### 3.2 Anomaly Scorer
 The most critical component. Uses property-based testing via `fast-check`.
 ```typescript
 describe('AnomalyScorer', () => {
-  describe('Statistical Scoring (Z-Score)', () => {
+  describe('Z-Score Calculation', () => {
-    it('returns score=0 when event cost exactly matches baseline mean', () => {});
+    it('returns 0 when event cost exactly matches baseline mean', () => {});
    it('returns proportional score for Z-scores between 1.0 and 3.0', () => {});
-    it('caps Z-score contribution at max threshold', () => {});
+    it('caps Z-score contribution at configurable max threshold', () => {});
    it('handles zero standard deviation without division by zero', () => {});
    it('handles single data point baseline (stddev undefined)', () => {});
    it('handles extremely large values without float overflow', () => {});
    it('handles negative cost delta (cost decrease) as non-anomalous', () => {});
  });
  describe('Novelty Scoring', () => {
-    it('adds novelty penalty when instance type is first seen for account', () => {});
+    it('adds instance novelty penalty when type first seen for account', () => {});
-    it('adds novelty penalty when IAM user has never provisioned this service', () => {});
+    it('adds actor novelty penalty when IAM role is new', () => {});
    it('does not penalize known instance type + known actor', () => {});
    it('weights instance novelty higher than actor novelty', () => {});
  });
  describe('Composite Scoring', () => {
    it('combines Z-score + novelty into composite severity', () => {});
    it('classifies composite < 30 as info', () => {});
    it('classifies composite 30-60 as warning', () => {});
    it('classifies composite > 60 as critical', () => {});
    it('never assigns critical to events below $0.50/hr', () => {});
  });
  describe('Cold-Start Fast Path', () => {
    it('flags $5/hr instance as warning when baseline < 14 days', () => {});
    it('flags $25/hr instance as critical immediately, bypassing baseline', () => {});
-    it('ignores $0.10/hr instances during cold-start learning period', () => {});
+    it('ignores $0.10/hr instances during cold-start learning', () => {});
    it('fast-path is always on — not behind a feature flag', () => {});
    it('transitions from fast-path to statistical scoring at maturity', () => {});
  });
  describe('Feedback Loop', () => {
    it('reduces score for resources marked as expected', () => {});
    it('adds actor to expected list after mark-as-expected', () => {});
    it('still flags expected actor if cost is 10x above baseline', () => {});
  });
  describe('Property-Based Tests (fast-check)', () => {
    it('score is always between 0 and 100 for any valid input', () => {
      // const finite = { min: 0, noNaN: true, noDefaultInfinity: true }; // NaN/Infinity are not valid inputs
      // fc.assert(fc.property(
      //   fc.record({ cost: fc.float(finite), mean: fc.float(finite), stddev: fc.float(finite) }),
      //   (input) => { const score = scorer.score(input); return score >= 0 && score <= 100; }
      // ));
    });
    it('score monotonically increases as cost increases (baseline fixed)', () => {});
    it('score monotonically increases as Z-score increases', () => {});
    it('cold-start fast-path always triggers for cost > $25/hr', () => {});
    it('mature baseline never uses fast-path thresholds', () => {});
  });
 });
 ```
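The Z-score, novelty, and severity cases above could fit together roughly as follows. This is a minimal sketch under stated assumptions: the function names, the weights (70/20/10), and the Z-score cap of 3 are hypothetical and not taken from the dd0c/cost design. Only the 30/60 severity bands and the $0.50/hr floor for `critical` come from the test names.

```typescript
// Hypothetical sketch: names, weights, and the Z-score cap are assumptions.
type Severity = 'info' | 'warning' | 'critical';

interface ScoreInput {
  cost: number;             // hourly cost of the new event, in USD
  mean: number;             // baseline mean hourly cost
  stddev: number;           // baseline standard deviation
  newInstanceType: boolean; // instance type first seen for this account
  newActor: boolean;        // IAM role first seen for this account
}

const MAX_Z = 3;             // assumed cap on the Z-score contribution
const Z_WEIGHT = 70;         // assumed: Z-score contributes up to 70 points
const INSTANCE_NOVELTY = 20; // assumed: weighted higher than actor novelty
const ACTOR_NOVELTY = 10;

export function zScore(cost: number, mean: number, stddev: number): number {
  if (cost <= mean) return 0;      // cost decreases are treated as non-anomalous
  if (!(stddev > 0)) return MAX_Z; // zero or undefined stddev: never divide by zero
  return Math.min((cost - mean) / stddev, MAX_Z);
}

export function compositeScore(input: ScoreInput): number {
  const z = zScore(input.cost, input.mean, input.stddev);
  const novelty =
    (input.newInstanceType ? INSTANCE_NOVELTY : 0) +
    (input.newActor ? ACTOR_NOVELTY : 0);
  // Bounded to [0, 100] by construction: at most 70 + 20 + 10.
  return Math.min(100, (z / MAX_Z) * Z_WEIGHT + novelty);
}

export function classify(score: number, hourlyCost: number): Severity {
  // Events below $0.50/hr are never escalated to critical.
  if (score > 60 && hourlyCost >= 0.5) return 'critical';
  if (score >= 30) return 'warning';
  return 'info';
}
```

Treating a zero-variance baseline as the maximum Z-score is one possible choice; the real scorer may prefer a minimum-variance floor instead.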
-### 3.3 Baseline Learning
+### 3.3 Baseline Engine
 ```typescript
 describe('BaselineCalculator', () => {
-  it('updates running mean and stddev using Welford algorithm', () => {});
+  describe('Welford Online Algorithm', () => {
-  it('adds new actor to observed_actors set', () => {});
+    it('updates running mean correctly after each observation', () => {});
-  it('marks baseline as mature when event_count > 20 and age_days > 14', () => {});
+    it('updates running variance correctly after each observation', () => {});
    it('produces correct stddev after 100 observations', () => {});
    it('handles first observation (count=1, stddev=0)', () => {});
    it('handles identical observations (stddev=0)', () => {});
    it('handles catastrophic cancellation with large values', () => {
      // Welford is numerically stable — verify this property
    });
  });
  describe('Maturity Transitions', () => {
    it('starts in cold-start state', () => {});
    it('transitions to learning after 5 events', () => {});
    it('transitions to mature after 20 events AND 14 days', () => {});
    it('does not mature with 100 events but only 3 days', () => {});
    it('does not mature with 14 days but only 5 events', () => {});
  });
  describe('Actor & Instance Tracking', () => {
    it('adds new actor to observed_actors set', () => {});
    it('adds new instance type to observed_types set', () => {});
    it('does not duplicate existing actors', () => {});
  });
  describe('Property-Based Tests', () => {
    it('mean converges to true mean as observations increase', () => {});
    it('variance is always non-negative', () => {});
    it('stddev equals sqrt(variance) within float tolerance', () => {});
  });
 });
 ```
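For reference, the Welford update and the maturity rule exercised above can be sketched as below. The field and function names are hypothetical, population variance is an assumption (the spec does not say population or sample), and the inclusive `>=` in the maturity check is also an assumption.

```typescript
// Hypothetical sketch: field names and helpers are assumptions.
interface Baseline {
  count: number;
  mean: number;
  m2: number; // running sum of squared deviations from the mean
}

export const emptyBaseline = (): Baseline => ({ count: 0, mean: 0, m2: 0 });

// One Welford step: folds a single observation into the running statistics
// without storing the history and without subtracting large squared sums.
export function observe(b: Baseline, x: number): Baseline {
  const count = b.count + 1;
  const delta = x - b.mean;
  const mean = b.mean + delta / count;
  const m2 = b.m2 + delta * (x - mean);
  return { count, mean, m2 };
}

// Population variance (assumed); 0 for an empty or single-observation baseline.
export function variance(b: Baseline): number {
  return b.count > 1 ? b.m2 / b.count : 0;
}

export function stddev(b: Baseline): number {
  return Math.sqrt(variance(b));
}

// Mature only when BOTH thresholds are met (inclusive bounds assumed).
export function isMature(eventCount: number, ageDays: number): boolean {
  return eventCount >= 20 && ageDays >= 14;
}
```

Usage: `[2, 4, 4, 4, 5, 5, 7, 9].reduce(observe, emptyBaseline())` yields a mean of 5 and a standard deviation of 2, up to float tolerance.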
 ### 3.4 Zombie Hunter
 ```typescript
 describe('ZombieHunter', () => {
  it('detects EC2 instance running >7 days with <5% CPU utilization', () => {});
  it('detects RDS instance with 0 connections for >3 days', () => {});
  it('detects unattached EBS volumes older than 7 days', () => {});
  it('calculates cumulative waste cost for each zombie', () => {});
  it('excludes instances tagged dd0c:ignore', () => {});
  it('handles API pagination for accounts with 500+ instances', () => {});
  it('respects read-only IAM permissions (never modifies resources)', () => {});
 });
 ```
 ### 3.5 Notification Formatter
 ```typescript
 describe('NotificationFormatter', () => {
  describe('Slack Block Kit', () => {
    it('formats EC2 anomaly with resource type, region, cost, actor', () => {});
    it('formats RDS anomaly with engine, storage, multi-AZ status', () => {});
    it('includes "Why this alert" section with anomaly signals', () => {});
    it('includes suggested CLI commands for remediation', () => {});
    it('includes Snooze/Mark Expected/Stop Instance buttons', () => {});
    it('generates correct aws ec2 stop-instances command', () => {});
    it('generates correct aws rds stop-db-instance command', () => {});
  });
  describe('Daily Digest', () => {
    it('aggregates 24h of anomalies into summary stats', () => {});
    it('includes total estimated spend across all accounts', () => {});
    it('highlights top 3 costliest anomalies', () => {});
    it('includes zombie resource count and waste estimate', () => {});
    it('shows baseline learning progress for new accounts', () => {});
  });
 });
 ```
 ### 3.6 Slack Bot
 ```typescript
 describe('SlackBot', () => {
  describe('Signature Validation', () => {
    it('validates correct Slack request signature (HMAC-SHA256)', () => {});
    it('rejects request with invalid signature', () => {});
    it('rejects request with missing X-Slack-Signature header', () => {});
    it('rejects request with expired timestamp (>5 min)', () => {});
    it('uses timing-safe comparison to prevent timing attacks', () => {});
  });
  describe('Command Parsing', () => {
    it('routes /dd0c status to status handler', () => {});
    it('routes /dd0c anomalies to anomaly list handler', () => {});
    it('routes /dd0c digest to digest handler', () => {});
    it('returns help text for unknown commands', () => {});
    it('responds within 3 seconds or defers with 200 OK', () => {});
  });
  describe('Interactive Actions', () => {
    it('validates interactive payload signature', () => {});
    it('handles mark_expected action and updates baseline', () => {});
    it('handles snooze_1h action and sets snoozeUntil', () => {});
    it('handles snooze_24h action', () => {});
    it('updates original Slack message after action', () => {});
    it('rejects action from user not in authorized workspace', () => {});
  });
 });
 ```
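The signature-validation cases above map onto Slack's signed-secret scheme: an HMAC-SHA256 over `v0:{timestamp}:{raw body}`, hex-encoded and prefixed with `v0=`. The sketch below assumes a Node.js runtime and uses only `node:crypto`; the function name and parameter order are hypothetical.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Hypothetical sketch: the function name and parameters are assumptions.
const MAX_SKEW_SECONDS = 5 * 60; // reject requests older than 5 minutes

export function verifySlackSignature(
  signingSecret: string,
  timestamp: string | undefined, // X-Slack-Request-Timestamp header
  signature: string | undefined, // X-Slack-Signature header
  rawBody: string,               // the unparsed request body
  nowSeconds: number = Math.floor(Date.now() / 1000),
): boolean {
  if (!timestamp || !signature) return false;

  const ts = Number(timestamp);
  if (!Number.isFinite(ts) || Math.abs(nowSeconds - ts) > MAX_SKEW_SECONDS) {
    return false; // stale or malformed timestamp: possible replay
  }

  const expected =
    'v0=' +
    createHmac('sha256', signingSecret)
      .update(`v0:${timestamp}:${rawBody}`)
      .digest('hex');

  const a = Buffer.from(expected, 'utf8');
  const b = Buffer.from(signature, 'utf8');
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return a.length === b.length && timingSafeEqual(a, b);
}
```

Passing `nowSeconds` explicitly keeps the expiry test deterministic without mocking the clock.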
 ### 3.7 Governance Policy Engine
 ```typescript
 describe('GovernancePolicy', () => {
  describe('Mode Enforcement', () => {
    it('strict mode: logs anomaly but does not send Slack alert', () => {});
    it('audit mode: sends Slack alert with full logging', () => {});
    it('defaults new accounts to strict mode', () => {});
  });
  describe('14-Day Auto-Promotion', () => {
    it('does not promote account with <14 days of baseline', () => {});
    it('does not promote account with >10% false-positive rate', () => {});
    it('promotes account on day 15 if FP rate <10%', () => {});
    it('calculates false-positive rate from mark-as-expected actions', () => {});
    it('auto-promotion check runs daily via cron', () => {});
  });
  describe('Panic Mode', () => {
    it('stops all alerting when panic=true', () => {});
    it('continues scoring and logging during panic', () => {});
    it('activates in <1 second via Redis key', () => {});
    it('activatable via POST /admin/panic', () => {});
    it('dashboard API returns "alerting paused" header during panic', () => {});
  });
  describe('Per-Account Override', () => {
    it('account can set stricter mode than system default', () => {});
    it('account cannot downgrade from system strict to audit', () => {});
    it('merge logic: max_restrictive(system, account)', () => {});
  });
  describe('Policy Decision Logging', () => {
    it('logs "suppressed by strict mode" with anomaly context', () => {});
    it('logs "auto-promoted to audit mode" with baseline stats', () => {});
    it('logs "panic mode active — alerting paused"', () => {});
  });
 });
 ```
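The override and promotion rules above reduce to a few pure functions. The following is a sketch only: the names are hypothetical, and the boundary behaviour (promotion from 14 full days onward, false-positive rate strictly below 10%, zero anomalies counted as a 0% rate) is an assumption that the spec leaves open.

```typescript
// Hypothetical sketch: type and function names are assumptions.
type Mode = 'strict' | 'audit';

// Higher rank means more restrictive: strict is log-only, audit sends alerts.
const RESTRICTIVENESS: Record<Mode, number> = { audit: 0, strict: 1 };

// An account may tighten the system mode but never loosen it.
export function maxRestrictive(system: Mode, account?: Mode): Mode {
  if (account === undefined) return system;
  return RESTRICTIVENESS[account] > RESTRICTIVENESS[system] ? account : system;
}

// False-positive rate is derived from mark-as-expected actions.
export function shouldPromote(
  baselineDays: number,
  markedExpected: number,
  totalAnomalies: number,
): boolean {
  if (baselineDays < 14) return false;
  const fpRate = totalAnomalies === 0 ? 0 : markedExpected / totalAnomalies;
  return fpRate < 0.1;
}

// Panic mode suppresses alerts regardless of mode; scoring and logging continue elsewhere.
export function shouldSendAlert(mode: Mode, panic: boolean): boolean {
  return !panic && mode === 'audit';
}
```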
 ### 3.8 Dashboard API
 ```typescript
 describe('DashboardAPI', () => {
  describe('Account Management', () => {
    it('GET /v1/accounts returns connected accounts for tenant', () => {});
    it('DELETE /v1/accounts/:id marks account as disconnecting', () => {});
    it('returns 401 without valid Cognito JWT', () => {});
    it('scopes all queries to authenticated tenantId', () => {});
  });
  describe('Anomaly Listing', () => {
    it('GET /v1/anomalies returns recent anomalies', () => {});
    it('supports since, status, severity filters', () => {});
    it('implements cursor-based pagination', () => {});
    it('includes slackMessageUrl when alert was sent', () => {});
  });
  describe('Baseline Overrides', () => {
    it('PATCH /v1/accounts/:id/baselines/:service/:type updates sensitivity', () => {});
    it('rejects invalid sensitivity values', () => {});
  });
  describe('Tenant Isolation', () => {
    it('never returns anomalies from another tenant', () => {});
    it('never returns accounts from another tenant', () => {});
    it('enforces tenantId on all DynamoDB queries', () => {});
  });
 });
 ```
 ### 3.9 Onboarding & PLG
 ```typescript
 describe('Onboarding', () => {
  describe('CloudFormation Template', () => {
    it('generates valid CFN YAML with correct IAM permissions', () => {});
    it('includes ExternalId parameter', () => {});
    it('includes EventBridge rule for cost-relevant CloudTrail events', () => {});
    it('quick-create URL contains correct template URL and parameters', () => {});
  });
  describe('Role Validation', () => {
    it('successfully assumes role with correct ExternalId', () => {});
    it('returns clear error on role not found', () => {});
    it('returns clear error on ExternalId mismatch', () => {});
    it('triggers zombie scan on successful connection', () => {});
  });
  describe('Free Tier Enforcement', () => {
    it('allows first account connection on free tier', () => {});
    it('rejects second account with 403 and upgrade prompt', () => {});
    it('allows multiple accounts on pro tier', () => {});
  });
  describe('Stripe Integration', () => {
    it('creates Stripe Checkout session with correct pricing', () => {});
    it('handles checkout.session.completed webhook', () => {});
    it('handles customer.subscription.deleted webhook', () => {});
    it('validates Stripe webhook signature', () => {});
    it('updates tenant tier to pro on successful payment', () => {});
    it('downgrades tenant on subscription cancellation', () => {});
  });
 });
 ```
 ### 3.10 Feature Flag Circuit Breaker
 ```typescript
 describe('AlertVolumeCircuitBreaker', () => {
  it('allows alerting when volume is within 3x baseline', () => {});
  it('trips breaker when alerts exceed 3x baseline over 1 hour', () => {});
  it('auto-disables the scoring flag when breaker trips', () => {});
  it('buffers suppressed alerts in DLQ for review', () => {});
  it('tracks alert-per-account rate in Redis sliding window', () => {});
  it('resets breaker after manual flag re-enable', () => {});
  it('fast-path alerts are exempt from circuit breaker', () => {});
 });
 ```
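As an illustration of the breaker behaviour listed above, here is an in-memory sliding-window sketch. It is not the Redis-backed implementation the tests refer to; the class name, constructor parameters, and the choice to count the alert that trips the breaker as suppressed are all assumptions.

```typescript
// Hypothetical in-memory sketch; the real breaker tracks rates in Redis.
export class AlertVolumeBreaker {
  private timestamps: number[] = [];
  private tripped = false;
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(baselinePerHour: number, multiplier = 3, windowMs = 60 * 60 * 1000) {
    this.limit = baselinePerHour * multiplier;
    this.windowMs = windowMs;
  }

  // Returns true if the alert may be sent.
  allow(nowMs: number, fastPath = false): boolean {
    if (fastPath) return true;      // fast-path alerts are exempt
    if (this.tripped) return false; // stays open until manually reset

    // Drop alerts that have slid out of the window, then record this one.
    this.timestamps = this.timestamps.filter((t) => nowMs - t < this.windowMs);
    this.timestamps.push(nowMs);

    if (this.timestamps.length > this.limit) {
      this.tripped = true;          // caller would disable the scoring flag here
      return false;
    }
    return true;
  }

  // Models a manual flag re-enable.
  reset(): void {
    this.tripped = false;
    this.timestamps = [];
  }

  isTripped(): boolean {
    return this.tripped;
  }
}
```

With a baseline of 2 alerts/hour the limit is 6, so the seventh alert inside one hour trips the breaker.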
 ---
 ## Section 4: Integration Test Strategy
 ### 4.1 DynamoDB Data Layer (Testcontainers)
 ```typescript
-describe('DynamoDB Single-Table Patterns', () => {
+describe('DynamoDB Integrations', () => {
-  it('writes CostEvent and updates Baseline in single transaction', async () => {});
+  let dynamodb: StartedTestContainer;
-  it('queries all anomalies for tenant within time range', async () => {});
+
-  it('fetches tenant config and Slack tokens securely', async () => {});
+  beforeAll(async () => {
    dynamodb = await new GenericContainer('amazon/dynamodb-local:latest')
      .withExposedPorts(8000).start();
    // Create dd0c-cost-main table with GSIs
  });
  describe('Transactional Writes', () => {
    it('writes CostEvent and updates Baseline in single TransactWriteItems call', async () => {});
    it('fails gracefully if TransactWriteItems is cancelled with ConditionalCheckFailed', async () => {});
    it('handles partial failure recovery when Baseline update conflicts', async () => {});
  });
  describe('Access Patterns', () => {
    it('queries all anomalies for tenant within time range (GSI3)', async () => {});
    it('fetches tenant config and Slack tokens securely', async () => {});
    it('retrieves accurate Baseline snapshot by resource type', async () => {});
  });
 });
 ```
-### 4.2 AWS API Contract Tests
+### 4.2 Cross-Account STS & AWS APIs (LocalStack)
 ```typescript
-describe('AWS Cross-Account Actions', () => {
+describe('AWS Cross-Account Integrations', () => {
-  // Uses LocalStack to simulate target account
+  let localstack: StartedTestContainer;
-  it('assumes target account remediation role successfully', async () => {});
+
-  it('executes ec2:StopInstances when remediation approved', async () => {});
+  beforeAll(async () => {
-  it('executes rds:DeleteDBInstance with skip-final-snapshot', async () => {});
+    localstack = await new GenericContainer('localstack/localstack:3')
      .withEnvironment({ SERVICES: 'sts,ec2,rds' }) // testcontainers v9+ replaced withEnv(key, value)
      .withExposedPorts(4566).start();
  });
  describe('Role Assumption', () => {
    it('successfully assumes target account remediation role via STS', async () => {});
    it('fails when ExternalId does not match (Security)', async () => {});
    it('handles STS credential expiration gracefully', async () => {});
  });
  describe('Remediation Actions', () => {
    it('executes ec2:StopInstances when remediation approved', async () => {});
    it('executes rds:StopDBInstance when remediation approved', async () => {});
    it('fails safely when target IAM role lacks StopInstances permission', async () => {});
  });
 });
 ```
 ### 4.3 Slack API Contract (WireMock)
 ```typescript
 describe('Slack Integration', () => {
  it('formats and delivers Block Kit message successfully', async () => {});
  it('handles 429 Rate Limit by throwing retryable error for SQS visibility timeout', async () => {});
  it('updates existing Slack message when anomaly is snoozed', async () => {});
 });
 ```
@@ -159,24 +528,65 @@ describe('AWS Cross-Account Actions', () => {
 ### 5.1 Critical User Journeys
-**Journey 1: Real-Time Anomaly Detection**
+**Journey 1: Real-Time Anomaly Detection (The Golden Path)**
-1. Send synthetic `RunInstances` event to EventBridge (p9.16xlarge, $40/hr).
+```typescript
-2. Verify system processes event and triggers fast-path (no baseline).
+describe('E2E: Anomaly Detection', () => {
-3. Verify Slack alert is generated with correct cost estimate.
+  it('detects anomaly and alerts Slack within 30 seconds', async () => {
    // 1. Inject synthetic CloudTrail `RunInstances` event (p4d.24xlarge) into SQS Ingestion Queue
    // 2. Poll DynamoDB to ensure CostEvent was recorded
    // 3. Poll DynamoDB to ensure AnomalyRecord was created (fast-path triggered)
    // 4. Assert WireMock received the Slack chat.postMessage call with Block Kit
  });
 });
 ```
 **Journey 2: Interactive Remediation**
-1. Send webhook simulating user clicking "Stop Instance" in Slack.
+```typescript
-2. Verify API Gateway → Lambda executes `StopInstances` against LocalStack.
+describe('E2E: Interactive Remediation', () => {
-3. Verify Slack message updates to "Remediation Successful".
+  it('stops EC2 instance when user clicks Stop in Slack', async () => {
    // 1. Simulate Slack sending interactive webhook payload for "Stop Instance"
    // 2. Validate HMAC signature in API Gateway lambda
    // 3. Verify LocalStack EC2 mock receives StopInstances call
    // 4. Verify Slack message is updated to "Remediation Successful"
  });
 });
 ```
 **Journey 3: Onboarding & First Scan**
 ```typescript
 describe('E2E: Onboarding', () => {
  it('validates IAM role and triggers initial zombie scan', async () => {
    // 1. Trigger POST /v1/accounts with new role ARN
    // 2. Verify account marked active
    // 3. Verify EventBridge Scheduler creates cron for Zombie Hunter
  });
 });
 ```
 ---
 ## Section 6: Performance & Load Testing
 ### 6.1 Ingestion & Scoring Throughput
 ```typescript
-describe('Ingestion Throughput', () => {
+describe('Performance: Alert Storm', () => {
-  it('processes 500 CloudTrail events/second via SQS FIFO', async () => {});
+  it('processes 1000 CloudTrail events/sec without SQS DLQ overflow', async () => {
-  it('DynamoDB baseline updates complete in <20ms p95', async () => {});
+    // k6 load test hitting SQS directly
  });
  it('DynamoDB baseline updates complete in <20ms p95 under load', async () => {
    // Ensure Single-Table schema does not create hot partitions
  });
  it('Anomaly Scorer Lambda consumes <256MB memory during burst', async () => {});
 });
 ```
 ### 6.2 Data Scale Tests
 ```typescript
 describe('Performance: Baseline Scale', () => {
  it('calculates Z-score in <5ms even when observed_actors set exceeds 1000', async () => {});
  it('handles accounts with 100,000+ daily CostEvents without throttling DynamoDB (On-Demand scaling)', async () => {});
 });
 ```
@@ -184,49 +594,119 @@ describe('Ingestion Throughput', () => {
 ## Section 7: CI/CD Pipeline Integration
- **PR Gate:** Unit tests (<2min), Coverage >85% (Scoring engine >95%).
+### 7.1 Pipeline Stages
- **Merge:** Integration tests with LocalStack & Testcontainers DynamoDB.
+```
- **Staging:** E2E journeys against isolated staging AWS account.
+┌─────────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐
 │ Pre-Commit  │───▶│ PR Gate  │───▶│ Merge    │───▶│ Staging  │───▶│ Prod     │
 │ (local)     │    │ (CI)     │    │ (CI)     │    │ (CD)     │    │ (CD)     │
 └─────────────┘    └──────────┘    └──────────┘    └──────────┘    └──────────┘
  lint + type       unit tests      integration     E2E + perf     canary
  <10s              math prop       Testcontainers  LocalStack     <5 mins
                    tests <1m       <4 mins         <10 mins
 ```
 ### 7.2 Coverage Gates
 | Component | Threshold |
 |-----------|-----------|
 | Anomaly Scorer (Math) | 100% |
 | CloudTrail Normalizer | 95% |
 | Governance Policy | 95% |
 | Slack Signature Auth | 100% |
 | Overall Pipeline | 85% |
 ---
 ## Section 8: Transparent Factory Tenet Testing
-### 8.1 Atomic Flagging (Circuit Breaker)
+### 8.1 Atomic Flagging
 ```typescript
-it('auto-disables scoring rule if it generates >10 alerts/hour for single tenant', () => {});
+describe('Atomic Flagging', () => {
  it('auto-disables scoring rule flag if alert volume exceeds 3x baseline in 1hr', () => {});
  it('buffers suppressed anomalies in SQS DLQ while flag is off', () => {});
  it('fails CI if any flag TTL exceeds 14 days', () => {});
  it('evaluates flags strictly locally (in-memory provider)', () => {});
 });
 ```
-### 8.2 Configurable Autonomy (14-Day Auto-Promotion)
+### 8.2 Elastic Schema
 ```typescript
-it('keeps new tenant in strict mode (log-only) for first 14 days', () => {});
+describe('Elastic Schema', () => {
-it('auto-promotes to audit mode (auto-alert) on day 15 if false-positive rate < 10%', () => {});
+  it('rejects DynamoDB table definition modifications that alter key schemas', () => {});
  it('requires all DynamoDB item updates to use ADD/SET (additive only)', () => {});
  it('ignores unknown attributes (V2 fields) in V1 CostEvent decoders', () => {});
 });
 ```
 ### 8.3 Cognitive Durability
 ```typescript
 describe('Cognitive Durability', () => {
  it('requires decision_log.json for any PR modifying Z-score thresholds or weights', () => {});
  it('enforces cyclomatic complexity < 10 for all AnomalyScorer math functions', () => {});
 });
 ```
 ### 8.4 Semantic Observability
 ```typescript
 describe('Semantic Observability', () => {
  it('emits OTEL span for every Anomaly Scoring decision', () => {});
  it('includes attributes: cost.z_score, cost.anomaly_score, cost.baseline_days', () => {});
  it('includes cost.fast_path_triggered flag when baseline is bypassed', () => {});
  it('hashes AWS Account ID in spans to protect PII/tenant identity', () => {});
 });
 ```
 ### 8.5 Configurable Autonomy
 ```typescript
 describe('Configurable Autonomy', () => {
  it('keeps new tenant in Strict Mode (log-only) for first 14 days', () => {});
  it('auto-promotes to Audit Mode on day 15 if false-positive rate < 10%', () => {});
  it('Panic Mode halts ALL Slack alerts in <1 second via Redis check', () => {});
  it('Panic Mode does NOT halt baseline recording (read-only tracking continues)', () => {});
 });
 ```
 ---
 ## Section 9: Test Data & Fixtures
-```
+### 9.1 Data Factories
-fixtures/
+```typescript
-  cloudtrail/
+export const makeCloudTrailEvent = (overrides: Record<string, unknown> = {}) => ({
-    ec2-runinstances.json
+  eventVersion: '1.08',
-    rds-create-db.json
+  userIdentity: { type: 'AssumedRole', arn: 'arn:aws:sts::123:assumed-role/user' },
-    lambda-create-function.json
+  eventTime: new Date().toISOString(),
-  baselines/
+  eventSource: 'ec2.amazonaws.com',
-    mature-steady-spend.json
+  eventName: 'RunInstances',
-    volatile-dev-account.json
+  requestParameters: { instanceType: 'm5.large' },
-    cold-start.json
+  ...overrides
 });
 export const makeBaseline = (overrides: Record<string, unknown> = {}) => ({
  meanHourlyCost: 1.25,
  stdDev: 0.15,
  eventCount: 45,
  ageDays: 16,
  observedActors: ['arn:aws:iam::123:role/ci'],
  observedInstanceTypes: ['t3.medium', 'm5.large'],
  ...overrides
 });
 ```
 ---
 ## Section 10: TDD Implementation Order
-1. **Phase 1:** Anomaly math + Unit tests (Strict TDD).
+1. **Phase 1: Math & Core Logic (Strict TDD)**
-2. **Phase 2:** CloudTrail normalizer + Pricing tables.
+   - Welford algorithm, Z-score math, Novelty scoring, `fast-check` property tests.
-3. **Phase 3:** DynamoDB single-table implementation (Integration led).
+2. **Phase 2: Ingestion & Normalization**
-4. **Phase 4:** Slack formatting + Remediation Lambda.
+   - CloudTrail parsers, pricing static tables, event deduplication.
-5. **Phase 5:** Governance policies (14-day promotion logic).
+3. **Phase 3: Data Persistence (Integration Led)**
   - DynamoDB Single-Table setup, TransactWriteItems, Testcontainers tests.
 4. **Phase 4: Notifications & Slack Actions**
   - Block Kit formatting, Slack signature validation, API Gateway endpoints.
 5. **Phase 5: Governance & Tenets**
   - 14-day promotion logic, Panic mode, OTEL tracing.
 6. **Phase 6: E2E Pipeline**
   - CDK definitions, LocalStack event injection, wire everything together.
-*End of dd0c/cost Test Architecture*
+*End of dd0c/cost Test Architecture (v2)*
--- a/products/06-runbook-automation/test-architecture/test-architecture.md
+++ b/products/06-runbook-automation/test-architecture/test-architecture.md
@@ -1760,3 +1760,527 @@ Before writing the `impl ExecutionEngine { pub async fn execute(...) }` function
 5. `engine_pauses_in_flight_execution_when_panic_mode_set`
 Only once these tests are defined can the state machine be implemented to make them pass (Green phase). This ensures no execution path can bypass the Trust Gradient.
 ---
 ## 11. Review Remediation Addendum (Post-Gemini Review)
 The following sections address all gaps identified in the TDD review. These are net-new test specifications that must be integrated into the relevant sections above during implementation.
 ### 11.1 Missing Epic Coverage
 #### Epic 3.4: Divergence Analysis
 ```rust
 // pkg/executor/divergence/tests.rs
 #[test] fn divergence_detects_extra_command_not_in_runbook() {}
 #[test] fn divergence_detects_modified_command_vs_prescribed() {}
 #[test] fn divergence_detects_skipped_step_not_marked_as_skipped() {}
 #[test] fn divergence_report_includes_diff_of_prescribed_vs_actual() {}
 #[test] fn divergence_flags_env_var_changes_made_during_execution() {}
 #[test] fn divergence_ignores_whitespace_differences_in_commands() {}
 #[test] fn divergence_analysis_runs_automatically_after_execution_completes() {}
 #[test] fn divergence_report_written_to_audit_trail() {}
 #[tokio::test]
 async fn integration_divergence_analysis_detects_agent_side_extra_commands() {
    // Agent executes an extra `whoami` not in the runbook
    // Divergence analyzer must flag it
 }
 ```
 #### Epic 5.3: Compliance Export
 ```rust
 // pkg/audit/export/tests.rs
 #[tokio::test] async fn export_generates_valid_csv_for_date_range() {}
 #[tokio::test] async fn export_generates_valid_pdf_with_execution_summary() {}
 #[tokio::test] async fn export_uploads_to_s3_and_returns_presigned_url() {}
 #[tokio::test] async fn export_presigned_url_expires_after_24_hours() {}
 #[tokio::test] async fn export_scoped_to_tenant_via_rls() {}
 #[tokio::test] async fn export_includes_hash_chain_verification_status() {}
 #[tokio::test] async fn export_redacts_command_output_but_includes_hashes() {}
 ```
 #### Epic 6.4: Classification Query API Rate Limiting
 ```rust
 // tests/integration/api_rate_limit_test.rs
 #[tokio::test]
 async fn api_rate_limit_30_requests_per_minute_per_tenant() {
    let stack = E2EStack::start().await;
    for _ in 0..30 {
        let resp = stack.api().get("/v1/run/classifications").send().await;
        assert_eq!(resp.status(), 200);
    }
    // 31st request must be rate-limited
    let resp = stack.api().get("/v1/run/classifications").send().await;
    assert_eq!(resp.status(), 429);
 }
 #[tokio::test]
 async fn api_rate_limit_resets_after_60_seconds() {}
 #[tokio::test]
 async fn api_rate_limit_is_per_tenant_not_global() {
    // Tenant A hitting limit must not affect Tenant B
 }
 #[tokio::test]
 async fn api_rate_limit_returns_retry_after_header() {}
 ```
 #### Epic 7: Dashboard UI (Playwright)
 ```typescript
 // tests/e2e/ui/dashboard.spec.ts
 test('parse preview renders within 5 seconds of paste', async ({ page }) => {
  await page.goto('/dashboard/runbooks/new');
  await page.fill('[data-testid="runbook-input"]', FIXTURE_RUNBOOK);
  const preview = page.locator('[data-testid="parse-preview"]');
  await expect(preview).toBeVisible({ timeout: 5000 });
  await expect(preview.locator('.step-card')).toHaveCount(4);
 });
 test('trust level visualization shows correct colors per step', async ({ page }) => {
  // 🟢 safe = green, 🟡 caution = yellow, 🔴 dangerous = red
 });
 test('MTTR dashboard loads and displays chart', async ({ page }) => {
  await page.goto('/dashboard/analytics');
  await expect(page.locator('[data-testid="mttr-chart"]')).toBeVisible();
 });
 test('execution timeline shows real-time step progress', async ({ page }) => {});
 test('approval modal requires typed confirmation for dangerous steps', async ({ page }) => {});
 test('panic mode banner appears when panic is active', async ({ page }) => {});
 ```
 #### Epic 9: Onboarding & PLG
 ```rust
 // pkg/onboarding/tests.rs
 #[test] fn free_tier_allows_5_runbooks() {}
 #[test] fn free_tier_allows_50_executions_per_month() {}
 #[test] fn free_tier_rejects_6th_runbook_with_upgrade_prompt() {}
 #[test] fn free_tier_rejects_51st_execution_with_upgrade_prompt() {}
 #[test] fn free_tier_counter_resets_monthly() {}
 #[test] fn agent_install_snippet_includes_correct_api_key() {}
 #[test] fn agent_install_snippet_includes_correct_gateway_url() {}
 #[test] fn agent_install_snippet_is_valid_bash() {}
 #[tokio::test] async fn stripe_checkout_creates_session_with_correct_pricing() {}
 #[tokio::test] async fn stripe_webhook_checkout_completed_upgrades_tenant() {}
 #[tokio::test] async fn stripe_webhook_subscription_deleted_downgrades_tenant() {}
 #[tokio::test] async fn stripe_webhook_validates_signature() {}
 ```
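The free-tier tests above describe two quotas: 5 runbooks in total and 50 executions per calendar month. The sketch below shows how such a check might look. `FreeTierUsage`, `QuotaDecision`, and the `(year, month)` period key are assumptions made for illustration.

```rust
/// Free-tier limits taken from the test names above.
pub const FREE_RUNBOOK_LIMIT: u32 = 5;
pub const FREE_EXECUTIONS_PER_MONTH: u32 = 50;

#[derive(Debug, PartialEq)]
pub enum QuotaDecision {
    Allowed,
    UpgradeRequired { reason: &'static str },
}

/// Hypothetical per-tenant usage record; `period` is (year, month).
pub struct FreeTierUsage {
    pub runbooks: u32,
    pub executions: u32,
    pub period: (u16, u8),
}

impl FreeTierUsage {
    pub fn new(period: (u16, u8)) -> Self {
        Self { runbooks: 0, executions: 0, period }
    }

    pub fn try_create_runbook(&mut self) -> QuotaDecision {
        if self.runbooks >= FREE_RUNBOOK_LIMIT {
            return QuotaDecision::UpgradeRequired { reason: "runbook limit reached" };
        }
        self.runbooks += 1;
        QuotaDecision::Allowed
    }

    pub fn try_execute(&mut self, current_period: (u16, u8)) -> QuotaDecision {
        if current_period != self.period {
            // New calendar month: only the execution counter resets.
            self.period = current_period;
            self.executions = 0;
        }
        if self.executions >= FREE_EXECUTIONS_PER_MONTH {
            return QuotaDecision::UpgradeRequired { reason: "monthly execution limit reached" };
        }
        self.executions += 1;
        QuotaDecision::Allowed
    }
}
```

The runbook count is deliberately not reset with the month, since the tests only mention a monthly reset for the execution counter.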
 ### 11.2 Agent-Side Security Tests (Zero-Trust Environment)
 The Agent runs in customer VPCs — untrusted territory. These tests prove the Agent defends itself independently of the SaaS backend.
 ```rust
 // pkg/agent/security/tests.rs
 // Agent-side deterministic blocking (mirrors SaaS scanner)
 #[test] fn agent_scanner_blocks_rm_rf_independently_of_saas() {}
 #[test] fn agent_scanner_blocks_kubectl_delete_namespace_independently() {}
 #[test] fn agent_scanner_blocks_drop_table_independently() {}
 #[test] fn agent_scanner_rejects_command_even_if_saas_says_safe() {
    // Simulates compromised SaaS sending a "safe" classification for rm -rf
    let saas_classification = Classification { risk: RiskLevel::Safe, ..Default::default() };
    let agent_result = agent_scanner.classify("rm -rf /");
    assert_eq!(agent_result.risk, RiskLevel::Dangerous);
    // Agent MUST override SaaS classification
 }
 // Binary integrity
 #[test] fn agent_validates_binary_checksum_on_startup() {}
 #[test] fn agent_refuses_to_start_if_checksum_mismatch() {}
 // Payload tampering
 #[tokio::test] async fn agent_rejects_grpc_payload_with_invalid_hmac() {}
 #[tokio::test] async fn agent_rejects_grpc_payload_with_expired_timestamp() {}
 #[tokio::test] async fn agent_rejects_grpc_payload_with_mismatched_execution_id() {}
 // Local fallback when SaaS is unreachable
 #[tokio::test] async fn agent_falls_back_to_scanner_only_when_saas_disconnected() {}
 #[tokio::test] async fn agent_in_fallback_mode_treats_all_unknowns_as_caution() {}
 #[tokio::test] async fn agent_reconnects_automatically_when_saas_returns() {}
 ```
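The override and fallback behaviour tested above can be expressed as a small pure function. This is a sketch under assumptions: `RiskLevel::Caution`, `LocalScan`, and `effective_risk` are illustrative names, and deferring to the SaaS verdict when the local scanner has no opinion is a guess about the intended connected-mode behaviour.

```rust
/// Ordered so that a more dangerous level compares greater.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum RiskLevel { Safe, Caution, Dangerous }

/// Result of the agent's local deterministic scan (hypothetical shape).
pub enum LocalScan { Known(RiskLevel), Unknown }

/// Combine the SaaS verdict (None when disconnected) with the local scan.
pub fn effective_risk(saas: Option<RiskLevel>, local: LocalScan) -> RiskLevel {
    match (saas, local) {
        // Both verdicts available: take the stricter one, so a compromised
        // SaaS cannot downgrade a locally detected danger.
        (Some(s), LocalScan::Known(l)) => s.max(l),
        // Local scanner has no opinion: defer to the SaaS classification.
        (Some(s), LocalScan::Unknown) => s,
        // Fallback mode: scanner only.
        (None, LocalScan::Known(l)) => l,
        // Fallback mode and unknown command: never assume safe.
        (None, LocalScan::Unknown) => RiskLevel::Caution,
    }
}
```

Deriving `Ord` on the enum makes "stricter wins" a one-line `max`, which keeps the override rule easy to audit.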
 ### 11.3 Realistic Sandbox Matrix
 Replace the Alpine-only sandbox with a matrix of realistic execution targets.
 ```rust
 // tests/integration/sandbox_matrix_test.rs
 #[rstest]
 #[case("ubuntu:22.04")]
 #[case("amazonlinux:2023")]
 #[case("alpine:3.19")]
 #[tokio::test]
 async fn sandbox_safe_command_executes_on_all_targets(#[case] image: &str) {
    let sandbox = SandboxContainer::start(image).await;
    let agent = TestAgent::connect_to(sandbox.socket_path()).await;
    let result = agent.execute("ls /tmp").await.unwrap();
    assert_eq!(result.exit_code, 0);
 }
 #[rstest]
 #[case("ubuntu:22.04")]
 #[case("amazonlinux:2023")]
 #[case("alpine:3.19")]
 #[tokio::test]
 async fn sandbox_dangerous_command_blocked_on_all_targets(#[case] image: &str) {
    let sandbox = SandboxContainer::start(image).await;
    let agent = TestAgent::connect_to(sandbox.socket_path()).await;
    let result = agent.execute("rm -rf /").await;
    assert!(result.is_err());
 }
 // Non-root execution
 #[tokio::test]
 async fn sandbox_agent_runs_as_non_root_user() {
    let sandbox = SandboxContainer::start_as_user("ubuntu:22.04", "dd0c-agent").await;
    let agent = TestAgent::connect_to(sandbox.socket_path()).await;
    let result = agent.execute("whoami").await.unwrap();
    assert_eq!(result.stdout.trim(), "dd0c-agent");
 }
 #[tokio::test]
 async fn sandbox_non_root_agent_cannot_escalate_to_root() {
    let sandbox = SandboxContainer::start_as_user("ubuntu:22.04", "dd0c-agent").await;
    let agent = TestAgent::connect_to(sandbox.socket_path()).await;
    let result = agent.execute("sudo cat /etc/shadow").await;
    assert!(result.is_err() || result.unwrap().exit_code != 0);
 }
 // RBAC-restricted K3s
 #[tokio::test]
 async fn sandbox_k3s_rbac_denies_kubectl_delete_namespace() {
    let k3s = K3sContainer::start_with_rbac("read-only-role").await;
    let agent = TestAgent::with_kubeconfig(k3s.kubeconfig()).await;
    let result = agent.execute("kubectl delete namespace default").await;
    // Should be blocked by BOTH scanner AND K8s RBAC
    assert!(result.is_err());
 }
 ```
 ### 11.4 Advanced Command Injection Tests
 ```rust
 // pkg/classifier/scanner/injection_tests.rs
 // Semicolon injection
 #[test] fn scanner_semicolon_rm_rf_is_dangerous() {
    assert_dangerous("echo hello; rm -rf /");
 }
 // Pipe injection
 #[test] fn scanner_pipe_to_rm_is_dangerous() {
    assert_dangerous("find / -name '*.log' | xargs rm -rf");
 }
 // Backtick injection
 #[test] fn scanner_backtick_rm_is_dangerous() {
    assert_dangerous("echo `rm -rf /`");
 }
 // $() substitution (already tested, but more variants)
 #[test] fn scanner_nested_substitution_is_dangerous() {
    assert_dangerous("echo $(echo $(rm -rf /))");
 }
 // Newline injection
 #[test] fn scanner_newline_injection_is_dangerous() {
    assert_dangerous("echo safe\nrm -rf /");
 }
 // Null byte injection
 #[test] fn scanner_null_byte_injection_is_dangerous() {
    assert_dangerous("echo safe\0rm -rf /");
 }
 // Unicode homoglyph attack
 #[test] fn scanner_unicode_homoglyph_rm_is_dangerous() {
    // Uses Cyrillic 'м' (U+043C) in place of Latin 'm'
    assert_dangerous("rм -rf /"); // Should still catch this
 }
 // Base64 encoded payload
 #[test] fn scanner_base64_decode_pipe_bash_is_dangerous() {
    assert_dangerous("echo cm0gLXJmIC8= | base64 -d | bash");
 }
 // Heredoc injection
 #[test] fn scanner_heredoc_with_destructive_is_dangerous() {
    assert_dangerous("cat << EOF | bash\nrm -rf /\nEOF");
 }
 // Environment variable expansion
 #[test] fn scanner_env_var_expansion_to_rm_is_dangerous() {
    assert_dangerous("$CMD"); // Unresolved variable expansion cannot be proven safe, so it is treated as dangerous
 }
 ```
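Several of the injection cases above (semicolon, pipe, backtick, `$()`, newline, null byte, heredoc) share one idea: split the input on shell separators and inspect every sub-command on its own. The toy sketch below illustrates that idea only. It is not a shell parser, the function names are hypothetical, and it makes no attempt at the homoglyph, base64, or variable-expansion cases.

```rust
/// Split a command line on shell separators so that every sub-command is
/// inspected on its own. A toy tokenizer, not a shell parser.
fn split_segments(command: &str) -> Vec<String> {
    command
        .split(|c: char| matches!(c, ';' | '|' | '&' | '\n' | '\0' | '`' | '(' | ')'))
        .map(|s| s.trim().to_string())
        .filter(|s| !s.is_empty())
        .collect()
}

/// True if a single segment looks like a recursive forced `rm`.
fn is_destructive(segment: &str) -> bool {
    let mut words = segment.split_whitespace();
    // Skip wrappers such as `sudo` or `xargs` that run the next word.
    let mut first = words.next();
    while matches!(first, Some("sudo") | Some("xargs")) {
        first = words.next();
    }
    // Any flag containing both 'r' and 'f' counts; this errs on the strict side.
    first == Some("rm") && words.any(|w| w.starts_with('-') && w.contains('r') && w.contains('f'))
}

pub fn is_dangerous(command: &str) -> bool {
    split_segments(command).iter().any(|s| is_destructive(s))
}
```

A production scanner would more plausibly work on a real shell AST; the point here is only that each separator-delimited fragment must be classified, not just the first word of the line.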
 ### 11.5 Privilege Escalation Tests
 ```rust
 // pkg/classifier/scanner/escalation_tests.rs
 #[test] fn scanner_sudo_anything_is_at_least_caution() {
    assert_at_least_caution("sudo systemctl restart nginx");
 }
 #[test] fn scanner_sudo_rm_is_dangerous() {
    assert_dangerous("sudo rm -rf /var/log");
 }
 #[test] fn scanner_su_root_is_dangerous() {
    assert_dangerous("su - root -c 'rm -rf /'");
 }
 #[test] fn scanner_chmod_suid_is_dangerous() {
    assert_dangerous("chmod u+s /usr/bin/find");
 }
 #[test] fn scanner_chown_root_is_caution() {
    assert_at_least_caution("chown root:root /tmp/exploit");
 }
 #[test] fn scanner_nsenter_is_dangerous() {
    assert_dangerous("nsenter --target 1 --mount --uts --ipc --net --pid");
 }
 #[test] fn scanner_docker_run_privileged_is_dangerous() {
    assert_dangerous("docker run --privileged -v /:/host ubuntu");
 }
 #[test] fn scanner_kubectl_exec_as_root_is_caution() {
    assert_at_least_caution("kubectl exec -it pod -- /bin/bash");
 }
 ```
 ### 11.6 Rollback Failure & Nested Failure Tests
 ```rust
 // pkg/executor/rollback/tests.rs
 #[test] fn rollback_failure_transitions_to_manual_intervention() {
    let mut engine = ExecutionEngine::new();
    engine.transition(State::RollingBack);
    engine.report_rollback_failure("rollback command timed out");
    assert_eq!(engine.state(), State::ManualIntervention);
 }
 #[test] fn rollback_failure_does_not_retry_automatically() {
    // Rollback failures are terminal — no auto-retry
 }
 #[test] fn rollback_timeout_kills_rollback_process_after_300s() {}
 #[tokio::test(start_paused = true)] async fn rollback_hanging_indefinitely_triggers_manual_intervention_after_timeout() {
    let mut engine = ExecutionEngine::with_rollback_timeout(Duration::from_secs(5));
    engine.transition(State::RollingBack);
    // Simulate rollback that never completes
    tokio::time::advance(Duration::from_secs(6)).await;
    assert_eq!(engine.state(), State::ManualIntervention);
 }
 #[test] fn manual_intervention_state_sends_slack_alert_to_oncall() {}
 #[test] fn manual_intervention_state_logs_full_context_to_audit() {}
 ```
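The rollback rules tested above amount to a small state machine in which `ManualIntervention` is terminal. A minimal sketch follows, assuming a pure transition function. `RollingBack` and `ManualIntervention` come from the tests, while `Running`, `RolledBack`, and the `Event` enum are hypothetical additions for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State { Running, RollingBack, RolledBack, ManualIntervention }

#[derive(Debug, Clone, Copy)]
pub enum Event { StepFailed, RollbackSucceeded, RollbackFailed, RollbackTimedOut }

/// Pure transition function: returns the next state, or the same state when
/// the event does not apply. ManualIntervention is terminal by construction.
pub fn next_state(current: State, event: Event) -> State {
    match (current, event) {
        (State::Running, Event::StepFailed) => State::RollingBack,
        (State::RollingBack, Event::RollbackSucceeded) => State::RolledBack,
        // A failed or hung rollback is never retried automatically.
        (State::RollingBack, Event::RollbackFailed | Event::RollbackTimedOut) => {
            State::ManualIntervention
        }
        (state, _) => state,
    }
}
```

Because no arm leads out of `ManualIntervention`, the "no auto-retry" property holds structurally rather than by convention.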
 ### 11.7 Double Execution & Network Partition Tests
 ```rust
 // pkg/executor/idempotency/tests.rs
 #[tokio::test]
 async fn agent_reconnect_after_partition_resyncs_already_executed_step() {
    let stack = E2EStack::start().await;
    let execution = stack.start_execution().await;
    // Hypothetical accessor: the ID of the step under test
    let step_id = execution.first_step_id();
    // Agent executes step successfully
    stack.wait_for_step_state(&execution.id, &step_id, "executing").await;
    // Network partition AFTER execution but BEFORE ACK
    stack.partition_agent().await;
    // Agent reconnects
    stack.heal_partition().await;
    // Engine must recognize step was already executed — no double execution
    let step = stack.get_step(&execution.id, &step_id).await;
    assert_eq!(step.execution_count, 1); // Exactly once
 }
 #[tokio::test]
 async fn engine_does_not_re_send_command_after_agent_reconnect_if_step_completed() {}
 #[tokio::test]
 async fn engine_re_sends_command_if_agent_never_started_execution_before_partition() {}
 ```
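One way to get the exactly-once behaviour these tests expect is an agent-side ledger keyed by execution and step ID, so that a command re-sent after a partition returns the stored result instead of running again. The sketch below is illustrative: `StepLedger` is a hypothetical type, and it keeps results in memory, whereas a real agent would presumably need to persist them across restarts.

```rust
use std::collections::HashMap;

/// Hypothetical agent-side ledger keyed by (execution_id, step_id).
#[derive(Default)]
pub struct StepLedger {
    results: HashMap<(String, String), i32>, // stored exit codes
    pub runs: u32,                           // how many times a command really ran
}

impl StepLedger {
    /// Runs `command` only if this step has no recorded result yet;
    /// a re-sent request after a partition gets the stored exit code.
    pub fn execute_once<F>(&mut self, execution_id: &str, step_id: &str, command: F) -> i32
    where
        F: FnOnce() -> i32,
    {
        let key = (execution_id.to_string(), step_id.to_string());
        if let Some(exit_code) = self.results.get(&key) {
            return *exit_code;
        }
        let exit_code = command();
        self.runs += 1;
        self.results.insert(key, exit_code);
        exit_code
    }
}
```

In this sketch the `runs` counter plays the role of `execution_count` in the test above: it only increments when the command closure is actually invoked.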
 ### 11.8 Slack Payload Forgery Tests
 ```rust
 // tests/integration/slack_security_test.rs
 #[tokio::test]
 async fn slack_approval_webhook_rejects_missing_signature() {
    let stack = E2EStack::start().await;
    let resp = stack.api()
        .post("/v1/run/slack/actions")
        .json(&fixture_approval_payload())
        // No X-Slack-Signature header
        .send().await;
    assert_eq!(resp.status(), 401);
 }
 #[tokio::test]
 async fn slack_approval_webhook_rejects_invalid_signature() {
    let stack = E2EStack::start().await;
    let resp = stack.api()
        .post("/v1/run/slack/actions")
        .header("X-Slack-Signature", "v0=invalid_hmac")
        .header("X-Slack-Request-Timestamp", &now_timestamp())
        .json(&fixture_approval_payload())
        .send().await;
    assert_eq!(resp.status(), 401);
 }
 #[tokio::test]
 async fn slack_approval_webhook_rejects_replayed_timestamp() {
    // Timestamp older than 5 minutes
    let stack = E2EStack::start().await;
    let resp = stack.api()
        .post("/v1/run/slack/actions")
        .header("X-Slack-Signature", &valid_signature_for_old_timestamp())
        .header("X-Slack-Request-Timestamp", &five_minutes_ago())
        .json(&fixture_approval_payload())
        .send().await;
    assert_eq!(resp.status(), 401);
 }
 #[tokio::test]
 async fn slack_approval_webhook_rejects_cross_tenant_approval() {
    // Tenant A's user trying to approve Tenant B's execution
 }
 ```
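For context, Slack signs the string `v0:{timestamp}:{raw body}` with HMAC-SHA256 and sends the result as `v0=<hex>` in `X-Slack-Signature`. The sketch below covers only the parts that need no crypto crate: building the base string, the five-minute replay window, and a constant-time comparison. The function names are hypothetical, and the HMAC step itself is left to whichever crate the project already uses.

```rust
/// Replay window: reject requests whose timestamp is more than five minutes off.
const MAX_SKEW_SECS: u64 = 60 * 5;

/// Base string that Slack signs: "v0:{timestamp}:{raw request body}".
pub fn signature_base_string(timestamp: &str, raw_body: &str) -> String {
    format!("v0:{}:{}", timestamp, raw_body)
}

/// True when the X-Slack-Request-Timestamp header is within the window.
pub fn timestamp_is_fresh(header_value: &str, now_secs: u64) -> bool {
    match header_value.parse::<u64>() {
        Ok(ts) => now_secs.abs_diff(ts) <= MAX_SKEW_SECS,
        Err(_) => false, // malformed header is treated as stale
    }
}

/// Constant-time comparison of the expected and received signature strings,
/// so the check does not leak how many leading bytes matched.
pub fn signatures_match(expected: &str, received: &str) -> bool {
    let (a, b) = (expected.as_bytes(), received.as_bytes());
    if a.len() != b.len() {
        return false;
    }
    a.iter().zip(b).fold(0u8, |acc, (x, y)| acc | (x ^ y)) == 0
}
```

Note that a timestamp exactly 300 seconds old still passes in this sketch, so a replay fixture should be strictly older than five minutes.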
 ### 11.9 Audit Log Encryption Tests
 ```rust
 // tests/integration/audit_encryption_test.rs
 #[tokio::test]
 async fn audit_log_command_field_is_encrypted_at_rest() {
    let db = TestDb::start().await;
    // Insert an audit event with a command
    insert_audit_event(&db, "kubectl get pods").await;
    // Read raw bytes from PostgreSQL — must NOT contain plaintext command
    let raw = db.query_raw_bytes("SELECT command FROM audit_events LIMIT 1").await;
    assert!(!String::from_utf8_lossy(&raw).contains("kubectl get pods"),
        "Command stored in plaintext — must be encrypted");
 }
 #[tokio::test]
 async fn audit_log_output_field_is_encrypted_at_rest() {
    let db = TestDb::start().await;
    insert_audit_event_with_output(&db, "sensitive output data").await;
    let raw = db.query_raw_bytes("SELECT output FROM audit_events LIMIT 1").await;
    assert!(!String::from_utf8_lossy(&raw).contains("sensitive output data"));
 }
 #[tokio::test]
 async fn audit_log_decryption_requires_kms_key() {
    // Verify the app role can decrypt using the KMS key
    let db = TestDb::start().await;
    insert_audit_event(&db, "kubectl get pods").await;
    let decrypted = db.as_app_role()
        .query("SELECT decrypt_command(command) FROM audit_events LIMIT 1").await;
    assert_eq!(decrypted, "kubectl get pods");
 }
 ```
 ### 11.10 gRPC Output Buffer Limits
 ```rust
 // pkg/agent/streaming/tests.rs
 #[tokio::test]
 async fn agent_truncates_stdout_at_10mb() {
    let sandbox = SandboxContainer::start("ubuntu:22.04").await;
    let agent = TestAgent::connect_to(sandbox.socket_path()).await;
    // Generate 50MB of random bytes (roughly 67MB once base64-encoded)
    let result = agent.execute("dd if=/dev/urandom bs=1M count=50 | base64").await.unwrap();
    // Agent must truncate, not OOM
    assert!(result.stdout.len() <= 10 * 1024 * 1024);
    assert!(result.truncated);
 }
 #[tokio::test]
 async fn agent_streams_output_in_chunks_not_buffered() {
    // Verify output arrives incrementally, not all at once after completion
 }
 #[tokio::test]
 async fn agent_memory_stays_under_256mb_during_large_output() {
    // Memory profiling test — agent must not OOM on `cat /dev/urandom`
 }
 #[tokio::test]
 async fn engine_handles_truncated_output_gracefully() {
    // Engine receives truncated flag and logs warning
 }
 ```
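The truncation behaviour can be sketched as a collector with a hard byte cap that sets a flag once data is dropped. `BoundedOutput` is a hypothetical type, and the cap is passed in rather than hard-coded so that the logic can be tested with small values. The 10 MB figure comes from the test above.

```rust
/// Output collector with a hard byte cap (10 MB in the tests above).
pub struct BoundedOutput {
    cap: usize,
    pub bytes: Vec<u8>,
    pub truncated: bool,
}

impl BoundedOutput {
    pub fn new(cap: usize) -> Self {
        Self { cap, bytes: Vec::new(), truncated: false }
    }

    /// Appends as much of `chunk` as still fits; the rest is dropped and the
    /// `truncated` flag is set so the engine can log a warning.
    pub fn push(&mut self, chunk: &[u8]) {
        let remaining = self.cap - self.bytes.len();
        let take = remaining.min(chunk.len());
        self.bytes.extend_from_slice(&chunk[..take]);
        if take < chunk.len() {
            self.truncated = true;
        }
    }
}
```

Because each chunk is clipped before it is copied, memory use stays bounded by the cap regardless of how much the child process writes.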
 ### 11.11 Parse SLA End-to-End Benchmark
 ```rust
 // benches/parse_sla_bench.rs
 #[tokio::test]
 async fn parse_plus_classify_pipeline_under_5s_p95() {
    let stack = E2EStack::start().await;
    let mut latencies = vec![];
    for _ in 0..100 {
        let start = Instant::now();
        let resp = stack.api()
            .post("/v1/run/runbooks/parse-preview")
            .json(&json!({ "raw_text": FIXTURE_RUNBOOK_10_STEPS }))
            .send().await;
        latencies.push(start.elapsed());
        // A failed request must not count as a fast one
        assert_eq!(resp.status(), 200);
    }
    let p95 = percentile(&latencies, 95);
    assert!(p95 < Duration::from_secs(5),
        "Parse+Classify p95 latency: {:?} — exceeds 5s SLA", p95);
 }
 ```
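The benchmark above calls a `percentile` helper that is not shown. A minimal sketch using the nearest-rank method is given below. The signature is an assumption chosen to match the call `percentile(&latencies, 95)`.

```rust
use std::time::Duration;

/// Nearest-rank percentile: the smallest sample such that at least `pct`
/// percent of all samples are less than or equal to it.
pub fn percentile(samples: &[Duration], pct: u32) -> Duration {
    assert!(!samples.is_empty(), "need at least one sample");
    assert!((1..=100).contains(&pct), "pct must be in 1..=100");
    let mut sorted = samples.to_vec();
    sorted.sort();
    // rank = ceil(pct / 100 * n), computed in integers, 1-based
    let rank = (pct as usize * sorted.len() + 99) / 100;
    sorted[rank - 1]
}
```

With 100 samples, the p95 is the 95th smallest latency, so up to five slow outliers can exceed the 5-second SLA without failing the test.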
 ### 11.12 Updated Test Pyramid (Post-Review)
 The Execution Engine ratio shifts from 80/15/5 to 60/30/10 per review recommendation:
 | Component | Unit | Integration | E2E |
 |-----------|------|-------------|-----|
 | Safety Scanner | 80% | 15% | 5% |
 | Merge Engine | 90% | 10% | 0% |
 | Execution Engine | **60%** | **30%** | **10%** |
 | Parser | 50% | 40% | 10% |
 | Approval Workflow | 70% | 20% | 10% |
 | Audit Trail | 60% | 35% | 5% |
 | Agent | 50% | 35% | 15% |
 | Dashboard API | 40% | 50% | 10% |
 *End of Review Remediation Addendum*
--- a/products/plg-instrumentation-brainstorm.md
+++ b/products/plg-instrumentation-brainstorm.md
@@ -0,0 +1,226 @@
 # dd0c Platform — PLG Instrumentation Brainstorm
 **Session:** Carson (Brainstorming Coach) — Cross-Product PLG Analytics
 **Date:** March 1, 2026
 **Scope:** All 6 dd0c products
 ---
 ## The Problem
 We built 6 products with onboarding flows, free tiers, and Stripe billing — but zero product analytics. We can't answer:
 - How many users hit "aha moment" vs. bounce?
 - Where in the funnel do free users drop off before upgrading?
 - Which features drive retention vs. which are ignored?
 - Are users churning because of alert fatigue, false positives, or just not getting value?
 - What's our time-to-first-value per product?
 Without instrumentation, PLG iteration is guesswork.
 ---
 ## Brainstorm: What to Instrument
 ### 1. Unified Event Taxonomy
 Every dd0c product shares a common event naming convention:
 ```
 <domain>.<object>.<action>
 Examples:
  account.signup.completed
  account.aws.connected
  anomaly.alert.sent
  anomaly.alert.snoozed
  slack.bot.installed
  billing.checkout.started
  billing.upgrade.completed
  feature.flag.evaluated
 ```
 **Rules:**
 - Past tense for completed actions (`completed`, `sent`, `clicked`)
 - Present tense for state changes (`active`, `learning`, `paused`)
 - Always include `tenant_id`, `timestamp`, `product` (route/drift/alert/portal/cost/run)
 - Never include PII: hash emails and account IDs before sending
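
The naming and envelope rules above can be enforced mechanically. The sketch below is one strict reading of them, assuming exactly three lowercase segments. `EventEnvelope`, its field types, and the function names are hypothetical.

```rust
/// Checks one dot-separated segment: non-empty, lowercase ASCII letters,
/// digits, or underscores (as in `first_dollar`).
fn is_valid_segment(segment: &str) -> bool {
    !segment.is_empty()
        && segment.chars().all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_')
}

/// Accepts exactly three segments: <domain>.<object>.<action>.
pub fn is_valid_event_name(name: &str) -> bool {
    let segments: Vec<&str> = name.split('.').collect();
    segments.len() == 3 && segments.iter().all(|s| is_valid_segment(s))
}

/// Required envelope fields from the rules above (field types are assumptions).
pub struct EventEnvelope {
    pub name: String,
    pub tenant_id: String,
    pub timestamp_ms: u64,
    pub product: String,
}

const PRODUCTS: [&str; 6] = ["route", "drift", "alert", "portal", "cost", "run"];

pub fn validate(event: &EventEnvelope) -> Result<(), &'static str> {
    if !is_valid_event_name(&event.name) {
        return Err("name must be <domain>.<object>.<action>");
    }
    if event.tenant_id.is_empty() {
        return Err("tenant_id is required");
    }
    if !PRODUCTS.contains(&event.product.as_str()) {
        return Err("unknown product");
    }
    Ok(())
}
```

Under this strict reading, two-segment names such as `alert.ingested` and four-segment names such as `routing.shadow.audit.run` would be rejected, so either the rule or those names would need adjusting.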
 ### 2. Per-Product Activation Metrics
 The "aha moment" is different for each product:
 | Product | Aha Moment | Metric | Target |
 |---------|-----------|--------|--------|
 | dd0c/route | First dollar saved by model routing | `routing.savings.first_dollar` | <24hr from signup |
 | dd0c/drift | First drift detected in real stack | `drift.detection.first_found` | <1hr from agent install |
 | dd0c/alert | First alert correlated (not just forwarded) | `alert.correlation.first_match` | <60sec from first alert |
 | dd0c/portal | First service auto-discovered | `portal.discovery.first_service` | <5min from install |
 | dd0c/cost | First anomaly detected in real account | `cost.anomaly.first_detected` | <24hr from AWS connect |
 | dd0c/run | First runbook executed successfully | `run.execution.first_success` | <10min from setup |
 ### 3. Conversion Funnel (Universal)
 Every product shares this funnel shape:
 ```
 Signup → Connect (AWS/Slack/Git) → First Value → Habit → Upgrade
 ```
 Events per stage:
 **Stage 1: Signup**
 - `account.signup.started` — landed on signup page
 - `account.signup.completed` — account created
 - `account.signup.method` — github_sso / google_sso / email
 **Stage 2: Connect**
 - `account.integration.started` — began connecting external service
 - `account.integration.completed` — connection verified
 - `account.integration.failed` — connection failed (include `error_type`)
 - Product-specific: `account.aws.connected`, `account.slack.installed`, `account.git.connected`
 **Stage 3: First Value**
 - Product-specific aha moment event (see table above)
 - `onboarding.wizard.step_completed` — which step, how long
 - `onboarding.wizard.abandoned` — which step they quit on
 **Stage 4: Habit**
 - `session.daily.active` — DAU ping
 - `session.weekly.active` — WAU ping
 - `feature.<name>.used` — per-feature usage
 - `notification.digest.opened` — are they reading digests?
 - `slack.command.used` — which slash commands, how often
 **Stage 5: Upgrade**
 - `billing.checkout.started`
 - `billing.checkout.completed`
 - `billing.checkout.abandoned`
 - `billing.plan.changed` — upgrade/downgrade
 - `billing.churn.detected` — subscription cancelled
 ### 4. Feature Usage Events (Per Product)
 **dd0c/route (LLM Cost Router)**
 - `routing.request.processed` — model selected, latency, cost
 - `routing.override.manual` — user forced a specific model
 - `routing.savings.calculated` — weekly savings digest generated
 - `routing.shadow.audit.run` — shadow mode comparison completed
 - `dashboard.cost.viewed` — opened cost dashboard
 **dd0c/drift (IaC Drift Detection)**
 - `drift.scan.completed` — scan finished, drifts found count
 - `drift.remediation.clicked` — user clicked "fix drift"
 - `drift.remediation.applied` — drift actually fixed
 - `drift.false_positive.marked` — user dismissed a drift
 - `drift.agent.heartbeat` — agent is alive and scanning
 **dd0c/alert (Alert Intelligence)**
 - `alert.ingested` — raw alert received
 - `alert.correlated` — alerts grouped into incident
 - `alert.suppressed` — duplicate/noise suppressed
 - `alert.escalated` — sent to on-call
 - `alert.feedback.helpful` / `alert.feedback.noise` — user feedback
 - `alert.mttr.measured` — time from alert to resolution
 **dd0c/portal (Lightweight IDP)**
 - `portal.service.discovered` — auto-discovery found a service
 - `portal.service.claimed` — team claimed ownership
 - `portal.scorecard.viewed` — someone checked service health
 - `portal.scorecard.action_taken` — acted on a recommendation
 - `portal.search.performed` — searched the catalog
 **dd0c/cost (AWS Cost Anomaly)**
 - `cost.event.ingested` — CloudTrail event processed
 - `cost.anomaly.scored` — anomaly scoring completed
 - `cost.anomaly.alerted` — Slack alert sent
 - `cost.anomaly.snoozed` — user snoozed alert
 - `cost.anomaly.expected` — user marked as expected
 - `cost.remediation.clicked` — user clicked Stop/Terminate
 - `cost.remediation.executed` — remediation completed
 - `cost.zombie.detected` — idle resource found
 - `cost.digest.sent` — daily digest delivered
 **dd0c/run (Runbook Automation)**
 - `run.runbook.created` — new runbook authored
 - `run.execution.started` — runbook execution began
 - `run.execution.completed` — execution finished (include `success`/`failed`)
 - `run.execution.approval_requested` — human approval needed
 - `run.execution.approval_granted` — human approved
 - `run.execution.rolled_back` — rollback triggered
 - `run.sandbox.test.run` — dry-run in sandbox
 ### 5. Health Scoring (Churn Prediction)
 Composite health score per tenant, updated daily:
 ```
 health_score = (
  0.3 * activation_complete +    // did they hit aha moment?
  0.2 * weekly_active_days +     // how many days active this week?
  0.2 * feature_breadth +        // how many features used?
  0.15 * integration_depth +     // how many integrations connected?
  0.15 * feedback_sentiment       // positive vs negative actions
 )
 ```
 Thresholds:
 - `health_score > 0.7` → Healthy (green)
 - `0.4 ≤ health_score ≤ 0.7` → At Risk (yellow) → trigger re-engagement email
 - `health_score < 0.4` → Churning (red) → trigger founder outreach
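
The weighted sum only yields a score between 0 and 1 if every input is first scaled to that range, which the formula leaves implicit. The sketch below makes one possible normalization explicit. The `HealthInputs` fields mirror the formula's terms, but how each is scaled (for example, active days divided by 7) is an assumption.

```rust
/// Raw inputs; the scaling applied in `health_score` is an assumption.
pub struct HealthInputs {
    pub activation_complete: bool,
    pub weekly_active_days: u8,  // 0..=7
    pub feature_breadth: f64,    // share of features used, 0.0..=1.0
    pub integration_depth: f64,  // share of integrations connected, 0.0..=1.0
    pub feedback_sentiment: f64, // share of positive feedback actions, 0.0..=1.0
}

#[derive(Debug, PartialEq)]
pub enum Health { Healthy, AtRisk, Churning }

/// Weighted sum with the weights from the formula above (they add up to 1.0).
pub fn health_score(i: &HealthInputs) -> f64 {
    let activation = if i.activation_complete { 1.0 } else { 0.0 };
    let active_days = f64::from(i.weekly_active_days.min(7)) / 7.0;
    0.3 * activation
        + 0.2 * active_days
        + 0.2 * i.feature_breadth.clamp(0.0, 1.0)
        + 0.15 * i.integration_depth.clamp(0.0, 1.0)
        + 0.15 * i.feedback_sentiment.clamp(0.0, 1.0)
}

pub fn classify(score: f64) -> Health {
    if score > 0.7 {
        Health::Healthy
    } else if score >= 0.4 {
        Health::AtRisk
    } else {
        Health::Churning
    }
}
```

For example, a tenant that has activated, used half the features, and given half-positive feedback, but logged no active days and connected no integrations, scores 0.3 + 0.1 + 0.075 = 0.475 and lands in At Risk.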
 ### 6. Analytics Stack Recommendation
 **PostHog** (self-hosted on AWS):
 - Open source, self-hostable → no vendor lock-in
 - Free tier: unlimited events self-hosted
 - Built-in: funnels, retention, feature flags, session replay
 - Supports custom events via REST API or JS/Python SDK
 - Can run on a single t3.medium for V1 traffic
 **Why not Segment/Amplitude/Mixpanel:**
 - Segment: $120/mo minimum, overkill for solo founder
 - Amplitude: free tier is generous but cloud-only, data leaves your infra
 - Mixpanel: same cloud-only concern
 - PostHog self-hosted: $0/mo, data stays in your AWS account, GDPR-friendly
 **Integration pattern:**
 ```
 Lambda/API → PostHog REST API (async, fire-and-forget)
 Next.js UI → PostHog JS SDK (auto-captures pageviews, clicks)
 Slack Bot → PostHog Python SDK (command usage, action clicks)
 ```
 ### 7. Cross-Product Flywheel Metrics
 dd0c is a platform — users on one product should discover others:
 - `platform.cross_sell.impression` — "Try dd0c/alert" banner shown
 - `platform.cross_sell.clicked` — user clicked cross-sell
 - `platform.cross_sell.activated` — user activated second product
 - `platform.products.active_count` — how many dd0c products per tenant
 **Flywheel hypothesis:** Users who activate 2+ dd0c products have 3x lower churn than single-product users. We need data to prove/disprove this.
 ---
 ## Epic 11 Proposal: PLG Instrumentation
 ### Scope
 Cross-cutting epic added to all 6 products. Shared analytics SDK, per-product event implementations, funnel dashboards, health scoring.
 ### Stories (Draft)
 1. **PostHog Infrastructure** — CDK stack for self-hosted PostHog on ECS Fargate
 2. **Analytics SDK** — Shared TypeScript/Python wrapper with standard event schema
 3. **Funnel Dashboard** — PostHog dashboard template per product
 4. **Activation Tracking** — Per-product aha moment detection and logging
 5. **Health Scoring Engine** — Daily cron that computes tenant health scores
 6. **Cross-Sell Instrumentation** — Platform-level cross-product discovery events
 7. **Churn Alert Pipeline** — Health score → Slack alert to founder when tenant goes red
 ### Estimate
 ~25 story points across all products (shared infrastructure + per-product event wiring)
 ---
 *This brainstorm establishes the "what" and "why." Party Mode advisory board should stress-test: Is PostHog the right choice? Is the event taxonomy too granular? Should health scoring be V1 or V2? Is 25 points realistic?*