
dd0c/alert — Test Architecture & TDD Strategy

Product: dd0c/alert — Alert Intelligence Platform
Author: Test Architecture Phase
Date: February 28, 2026
Status: V1 MVP — Solo Founder Scope


Section 1: Testing Philosophy & TDD Workflow

1.1 Core Philosophy

dd0c/alert is a safety-critical observability tool — a bug that silently suppresses a real alert during an incident is worse than having no tool at all. The test suite is the contract that guarantees "we will never eat your alerts."

Guiding principle: tests describe observable behavior from the on-call engineer's perspective. If a test can't be explained as "when X happens, the engineer sees Y," it's testing implementation, not behavior.

For a solo founder, the test suite is also the regression safety net — it catches the subtle scoring bugs that would erode customer trust over weeks.

1.2 Red-Green-Refactor Adapted to dd0c/alert

RED   → Write a failing test that describes the desired behavior
         (e.g., "3 Datadog alerts for the same service within 5 minutes
          should produce 1 correlated incident")

GREEN → Write the minimum code to make it pass
         (hardcode the window, just make it work)

REFACTOR → Clean up without breaking tests
            (extract the window manager, add Redis backing,
             optimize the fingerprinting)
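As a concrete sketch of the GREEN step, here is a deliberately minimal `correlate` function that would make the example test pass, with the 5-minute window hardcoded exactly as the GREEN phase allows. The names (`RawAlert`, `Incident`, `correlate`) are illustrative, not the real codebase.

```typescript
interface RawAlert {
  service: string;
  receivedAt: number; // epoch ms
}

interface Incident {
  service: string;
  alertCount: number;
}

const WINDOW_MS = 5 * 60 * 1000; // hardcoded for now; extracted during REFACTOR

function correlate(alerts: RawAlert[]): Incident[] {
  const incidents: Incident[] = [];
  const sorted = [...alerts].sort((a, b) => a.receivedAt - b.receivedAt);
  let current: { service: string; start: number; count: number } | null = null;

  for (const alert of sorted) {
    if (
      current &&
      current.service === alert.service &&
      alert.receivedAt - current.start <= WINDOW_MS
    ) {
      current.count += 1; // same service, inside the window: merge
    } else {
      if (current) incidents.push({ service: current.service, alertCount: current.count });
      current = { service: alert.service, start: alert.receivedAt, count: 1 };
    }
  }
  if (current) incidents.push({ service: current.service, alertCount: current.count });
  return incidents;
}
```

The REFACTOR step then replaces the in-memory state with the Redis-backed window manager without touching the test.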

When to write tests first (strict TDD):

  • All correlation logic (time-window clustering, service graph traversal, deploy correlation)
  • All noise scoring algorithms (rule-based scoring, threshold calculations)
  • All HMAC signature validation (security-critical)
  • All fingerprinting/deduplication logic
  • All suppression governance (strict vs. audit mode)
  • All circuit breaker state transitions (suppression DLQ replay)

When integration tests lead (test-after, then harden):

  • Provider webhook parsers — implement against real payload samples, then lock in with contract tests
  • SQS FIFO message ordering — test against LocalStack after implementation
  • Slack message formatting — build the blocks, then snapshot test the output

When E2E tests lead:

  • The 60-second time-to-value journey — define the happy path first, build backward
  • Weekly noise digest generation — define expected output, then build the aggregation

1.3 Test Naming Conventions

// Unit tests (vitest)
describe('CorrelationEngine', () => {
  it('groups alerts for same service within 5min window into single incident', () => {});
  it('extends window by 2min when alert arrives in last 30 seconds', () => {});
  it('caps window extension at 15 minutes total', () => {});
  it('merges downstream service alerts when upstream window is active', () => {});
});

describe('NoiseScorer', () => {
  it('scores deploy-correlated alerts higher when deploy is within 10min', () => {});
  it('returns zero noise score for first-ever alert from a service', () => {});
  it('adds 5 points when PR title matches config or feature-flag', () => {});
});

describe('HmacValidator', () => {
  it('rejects Datadog webhook with missing DD-WEBHOOK-SIGNATURE header', () => {});
  it('rejects PagerDuty webhook with tampered body', () => {});
  it('accepts valid signature and passes payload through', () => {});
});

Rules:

  • Describe the observable outcome, not the internal mechanism
  • Use present tense ("groups", "rejects", "scores")
  • If you need "and" in the name, split into two tests
  • Group by component in describe blocks

Section 2: Test Pyramid

2.1 Ratio

| Level | Target | Count (V1) | Runtime |
| --- | --- | --- | --- |
| Unit | 70% | ~350 tests | <30s |
| Integration | 20% | ~100 tests | <5min |
| E2E/Smoke | 10% | ~20 tests | <10min |

2.2 Unit Test Targets (per component)

| Component | Key Behaviors | Est. Tests |
| --- | --- | --- |
| Webhook Parsers (Datadog, PD, OpsGenie, Grafana) | Payload normalization, field mapping, batch handling | 60 |
| HMAC Validator | Signature verification per provider, rejection paths | 20 |
| Fingerprint Generator | Deterministic hashing, dedup detection | 15 |
| Correlation Engine | Time-window open/close/extend, service graph merge, deploy correlation | 80 |
| Noise Scorer | Rule-based scoring, deploy proximity weighting, threshold calculations | 60 |
| Suggestion Engine | Suppression recommendations, "what would have happened" calculations | 30 |
| Notification Formatter | Slack block formatting, digest generation, in-place message updates | 25 |
| Governance Policy | Strict/audit mode enforcement, panic mode, per-customer overrides | 30 |
| Feature Flags | Circuit breaker on suppression volume, flag lifecycle | 15 |
| Canonical Schema Mapper | Provider → canonical field mapping, severity normalization | 15 |

2.3 Integration Test Boundaries

| Boundary | What's Tested | Infrastructure |
| --- | --- | --- |
| Lambda → SQS FIFO | Message ordering, dedup, tenant partitioning | LocalStack |
| SQS → Correlation Engine | Consumer polling, batch processing, error handling | LocalStack |
| Correlation Engine → Redis | Window CRUD, sorted set operations, TTL expiry | Testcontainers Redis |
| Correlation Engine → DynamoDB | Incident persistence, tenant config reads | Testcontainers DynamoDB Local |
| Correlation Engine → TimescaleDB | Time-series writes, continuous aggregate queries | Testcontainers PostgreSQL + TimescaleDB |
| Notification Service → Slack | Block formatting, rate limiting, message update | WireMock |
| API Gateway → Lambda | Webhook routing, auth, throttling | LocalStack |

2.4 E2E/Smoke Scenarios

  1. 60-Second TTV Journey: Webhook received → alert in Slack within 60s
  2. Alert Storm Correlation: 50 alerts in 2 minutes → grouped into 1 incident
  3. Deploy Correlation: Deploy event + alert storm → deploy identified as trigger
  4. Noise Digest: 7 days of alerts → weekly Slack digest with noise stats
  5. Multi-Provider Merge: Datadog + PagerDuty alerts for same service → single incident
  6. Panic Mode: Enable panic → all suppression stops → alerts pass through raw

Section 3: Unit Test Strategy

3.1 Webhook Parsers

Each provider parser is a pure function: payload in, canonical alert(s) out. No side effects, no DB calls.
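That shape can be sketched with an illustrative canonical type and the kind of small pure helper these parser tests exercise. `CanonicalAlert` and `extractService` are assumptions about the codebase, not its actual definitions.

```typescript
// Illustrative canonical schema the parsers normalize into
interface CanonicalAlert {
  tenantId: string;
  provider: 'datadog' | 'pagerduty' | 'opsgenie' | 'grafana';
  service: string;
  title: string;
  severity: 'critical' | 'high' | 'medium' | 'low' | 'info';
  fingerprint: string;
}

// payload in, canonical alert(s) out: no I/O, so unit tests need no mocks
type Parser = (tenantId: string, payload: unknown) => CanonicalAlert[];

// e.g. the Datadog parser pulls the service name out of the tags array
function extractService(tags: string[]): string | undefined {
  const tag = tags.find((t) => t.startsWith('service:'));
  return tag?.slice('service:'.length);
}
```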

// tests/unit/parsers/datadog.test.ts
describe('DatadogParser', () => {
  it('normalizes single alert payload to canonical schema', () => {});
  it('normalizes batched alert array into multiple canonical alerts', () => {});
  it('maps Datadog P1 to critical, P5 to info', () => {});
  it('extracts service name from tags array', () => {});
  it('handles missing optional fields without throwing', () => {});
  it('generates stable fingerprint from title + service + tenant', () => {});
});

// tests/unit/parsers/pagerduty.test.ts
describe('PagerDutyParser', () => {
  it('normalizes incident.triggered event to canonical alert', () => {});
  it('normalizes incident.resolved event with resolution metadata', () => {});
  it('ignores incident.acknowledged events (not alerts)', () => {});
  it('maps PD urgency high to critical, low to info', () => {});
});

// tests/unit/parsers/opsgenie.test.ts
describe('OpsGenieParser', () => {
  it('normalizes alert.created action to canonical alert', () => {});
  it('extracts priority P1-P5 and maps to severity', () => {});
  it('handles custom fields in details object', () => {});
});

// tests/unit/parsers/grafana.test.ts
describe('GrafanaParser', () => {
  it('normalizes Grafana Alertmanager webhook payload', () => {});
  it('handles multiple alerts in single webhook (Grafana batches)', () => {});
  it('extracts dashboard URL as context link', () => {});
});

Mocking strategy: None needed — parsers are pure functions. Use recorded payload fixtures from fixtures/webhooks/{provider}/.

Fixture structure:

fixtures/webhooks/
  datadog/
    single-alert.json
    batched-alerts.json
    monitor-recovered.json
  pagerduty/
    incident-triggered.json
    incident-resolved.json
    incident-acknowledged.json
  opsgenie/
    alert-created.json
    alert-closed.json
  grafana/
    single-firing.json
    multi-firing.json
    resolved.json

3.2 HMAC Validator

describe('HmacValidator', () => {
  // Datadog uses hex-encoded HMAC-SHA256
  it('validates correct Datadog DD-WEBHOOK-SIGNATURE header', () => {});
  it('rejects Datadog webhook with wrong signature', () => {});
  it('rejects Datadog webhook with missing signature header', () => {});

  // PagerDuty uses v1= prefix with HMAC-SHA256
  it('validates correct PagerDuty X-PagerDuty-Signature header', () => {});
  it('rejects PagerDuty webhook with tampered body', () => {});

  // OpsGenie uses different header name
  it('validates correct OpsGenie X-OpsGenie-Signature header', () => {});

  // Edge cases
  it('rejects empty body with any signature', () => {});
  it('uses timing-safe comparison to prevent timing attacks', () => {});
});

Mocking strategy: None — crypto operations are deterministic. Use known secret + body + expected signature triples.
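A sketch of the validator core under those assumptions, built on Node's `crypto` primitives. The function name is illustrative, and real header extraction is per provider; note the length guard, since `timingSafeEqual` throws on buffers of different length.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Hex-encoded HMAC-SHA256 validation (Datadog-style), sketched
function isValidSignature(body: string, secret: string, signature: string): boolean {
  const expected = createHmac('sha256', secret).update(body).digest('hex');
  const a = Buffer.from(expected, 'utf8');
  const b = Buffer.from(signature, 'utf8');
  // timingSafeEqual throws on length mismatch, so reject short/long inputs first
  if (a.length !== b.length) return false;
  return timingSafeEqual(a, b);
}
```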

3.3 Fingerprint Generator

describe('FingerprintGenerator', () => {
  it('generates deterministic SHA-256 from tenant_id + provider + service + title', () => {});
  it('produces same fingerprint for identical alerts regardless of timestamp', () => {});
  it('produces different fingerprints when service differs', () => {});
  it('normalizes title whitespace before hashing', () => {});
  it('handles unicode characters in title consistently', () => {});
});
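A minimal implementation sketch consistent with these behaviors. The exact field order and separator are assumptions; what matters is that timestamps stay out of the hash input and whitespace is normalized first.

```typescript
import { createHash } from 'node:crypto';

// Deterministic SHA-256 fingerprint: same alert -> same hash, regardless of when it fired
function fingerprint(tenantId: string, provider: string, service: string, title: string): string {
  const normalizedTitle = title.trim().replace(/\s+/g, ' '); // collapse whitespace
  return createHash('sha256')
    .update([tenantId, provider, service, normalizedTitle].join('\n'))
    .digest('hex');
}
```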

3.4 Correlation Engine

This is the most complex component, and its tests lean heavily on table-driven cases.

describe('CorrelationEngine', () => {
  describe('Time-Window Management', () => {
    it('opens new 5min window on first alert for a service', () => {});
    it('adds subsequent alerts to existing open window', () => {});
    it('extends window by 2min when alert arrives in last 30 seconds', () => {});
    it('caps total window duration at 15 minutes', () => {});
    it('closes window after timeout with no new alerts', () => {});
    it('generates incident record when window closes', () => {});
  });

  describe('Service Graph Correlation', () => {
    it('merges downstream alerts into upstream window when dependency exists', () => {});
    it('does not merge alerts for unrelated services', () => {});
    it('handles circular dependencies without infinite loop', () => {});
    it('traverses multi-level dependency chains (A→B→C)', () => {});
  });

  describe('Deploy Correlation', () => {
    it('tags incident with deploy_id when deploy event within 10min of first alert', () => {});
    it('does not correlate deploy older than 10 minutes', () => {});
    it('correlates deploy to correct service even with multiple recent deploys', () => {});
    it('adds deploy correlation score boost to noise calculation', () => {});
  });

  describe('Multi-Tenant Isolation', () => {
    it('never correlates alerts across different tenants', () => {});
    it('maintains separate windows per tenant', () => {});
    it('handles concurrent alerts from multiple tenants', () => {});
  });
});

Mocking strategy:

  • Mock Redis client (ioredis-mock) for window state
  • Mock DynamoDB client for service dependency reads
  • Mock SQS for downstream message publishing
  • Use sinon.useFakeTimers() for time-window testing
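The open/extend/cap rules above reduce to pure arithmetic that table-driven tests can pin down without Redis at all. A sketch, assuming epoch-millisecond timestamps (constants and function name are illustrative):

```typescript
const EXTENSION_MS = 2 * 60 * 1000;   // extend by 2min
const LAST_CALL_MS = 30 * 1000;       // ...when alert lands in the last 30s
const MAX_WINDOW_MS = 15 * 60 * 1000; // hard cap on total window duration

// Returns the (possibly extended) close time for a window
function extendWindow(openedAt: number, closesAt: number, alertAt: number): number {
  const inLastCall = closesAt - alertAt <= LAST_CALL_MS && alertAt < closesAt;
  if (!inLastCall) return closesAt; // no extension needed
  const extended = closesAt + EXTENSION_MS;
  return Math.min(extended, openedAt + MAX_WINDOW_MS); // cap at 15min total
}
```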

3.5 Noise Scorer

describe('NoiseScorer', () => {
  describe('Rule-Based Scoring', () => {
    it('returns 0 for first-ever alert from a service (no history)', () => {});
    it('scores higher when alert has fired >5 times in 24 hours', () => {});
    it('scores higher when alert auto-resolved within 5 minutes', () => {});
    it('adds deploy correlation bonus (+15 points) when deploy is recent', () => {});
    it('adds feature-flag bonus (+5 points) when PR title matches config/feature-flag', () => {});
    it('caps total score at 100', () => {});
    it('never scores critical severity alerts above 80 (safety cap)', () => {});
  });

  describe('Threshold Calculations', () => {
    it('classifies score 0-30 as signal (keep)', () => {});
    it('classifies score 31-70 as review (annotate)', () => {});
    it('classifies score 71-100 as noise (suggest suppress)', () => {});
    it('uses tenant-specific thresholds when configured', () => {});
  });

  describe('What-Would-Have-Happened', () => {
    it('calculates suppression count for historical window', () => {});
    it('reports zero false negatives when no suppressed alert was critical', () => {});
    it('flags false negative when suppressed alert was later escalated', () => {});
  });
});

Mocking strategy: Mock the alert history store (DynamoDB queries). Scorer logic itself is pure calculation.
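A sketch of the pure scoring core these tests target. The +15 deploy bonus, +5 feature-flag bonus, the 100 cap, and the 80 safety cap for criticals come from the behaviors above; the other weights are illustrative placeholders, not the real rule set.

```typescript
interface ScoringInput {
  firesLast24h: number;
  autoResolvedUnder5min: boolean;
  deployWithin10min: boolean;
  prTitleMatchesConfigOrFlag: boolean;
  severity: 'critical' | 'high' | 'medium' | 'low' | 'info';
}

function noiseScore(input: ScoringInput): number {
  let score = 0;
  if (input.firesLast24h > 5) score += 40;        // illustrative weight
  if (input.autoResolvedUnder5min) score += 30;   // illustrative weight
  if (input.deployWithin10min) score += 15;       // deploy correlation bonus (per spec)
  if (input.prTitleMatchesConfigOrFlag) score += 5; // feature-flag bonus (per spec)
  score = Math.min(score, 100);                   // total score cap
  if (input.severity === 'critical') score = Math.min(score, 80); // safety cap
  return score;
}
```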

3.6 Notification Formatter

describe('NotificationFormatter', () => {
  describe('Slack Blocks', () => {
    it('formats single-alert notification with service, title, severity', () => {});
    it('formats correlated incident with alert count and sources', () => {});
    it('includes deploy trigger when deploy correlation exists', () => {});
    it('includes noise score badge (🟢 signal / 🟡 review / 🔴 noise)', () => {});
    it('includes feedback buttons (👍 Helpful / 👎 Not helpful)', () => {});
    it('formats in-place update message (replaces initial alert)', () => {});
  });

  describe('Weekly Digest', () => {
    it('aggregates 7 days of incidents into summary stats', () => {});
    it('highlights top 3 noisiest services', () => {});
    it('shows suppression savings ("would have saved X pages")', () => {});
  });
});

Mocking strategy: Snapshot tests — render the Slack blocks to JSON and compare against golden fixtures.

3.7 Governance Policy Engine

describe('GovernancePolicy', () => {
  describe('Mode Enforcement', () => {
    it('in strict mode: annotates alerts but never suppresses', () => {});
    it('in audit mode: auto-suppresses with full logging', () => {});
    it('defaults new tenants to strict mode', () => {});
  });

  describe('Panic Mode', () => {
    it('when panic=true: all suppression stops immediately', () => {});
    it('when panic=true: all alerts pass through unmodified', () => {});
    it('panic mode activatable via Redis key check', () => {});
    it('panic mode shows banner in dashboard API response', () => {});
  });

  describe('Per-Customer Override', () => {
    it('customer can set stricter mode than system default', () => {});
    it('customer cannot set less restrictive mode than system default', () => {});
    it('merge logic: max_restrictive(system, customer)', () => {});
  });

  describe('Policy Decision Logging', () => {
    it('logs "suppressed by audit mode" with full context', () => {});
    it('logs "annotation-only, strict mode active" for strict tenants', () => {});
    it('logs "panic mode active — all alerts passing through"', () => {});
  });
});

3.8 Feature Flag Circuit Breaker

describe('SuppressionCircuitBreaker', () => {
  it('allows suppression when volume is within baseline', () => {});
  it('trips breaker when suppression exceeds 2x baseline over 30min', () => {});
  it('auto-disables the scoring flag when breaker trips', () => {});
  it('replays suppressed alerts from DLQ when breaker trips', () => {});
  it('resets breaker after manual flag re-enable', () => {});
  it('tracks suppression count per flag in Redis sliding window', () => {});
});

Section 4: Integration Test Strategy

4.1 Webhook Contract Tests

Each provider integration gets a contract test suite that validates the full path: HTTP request → Lambda → SQS message.

// tests/integration/webhooks/datadog.contract.test.ts
describe('Datadog Webhook Contract', () => {
  let localstack: LocalStackContainer;
  let sqsClient: SQSClient;

  beforeAll(async () => {
    localstack = await new LocalStackContainer().start();
    sqsClient = new SQSClient({ endpoint: localstack.getEndpoint() });
    // Create SQS FIFO queue
    await sqsClient.send(new CreateQueueCommand({
      QueueName: 'alert-ingested.fifo',
      Attributes: { FifoQueue: 'true', ContentBasedDeduplication: 'true' }
    }));
  });

  it('accepts valid Datadog webhook and produces canonical SQS message', async () => {
    const payload = loadFixture('webhooks/datadog/single-alert.json');
    const signature = computeHmac(payload, TEST_SECRET);

    const res = await request(app)
      .post('/v1/wh/tenant-123/datadog')
      .set('DD-WEBHOOK-SIGNATURE', signature)
      .send(payload);

    expect(res.status).toBe(200);

    const messages = await pollSqs(sqsClient, 'alert-ingested.fifo');
    expect(messages).toHaveLength(1);
    expect(messages[0].body).toMatchObject({
      tenant_id: 'tenant-123',
      provider: 'datadog',
      severity: expect.stringMatching(/critical|high|medium|low|info/),
      fingerprint: expect.stringMatching(/^[a-f0-9]{64}$/),
    });
  });

  it('rejects webhook with invalid HMAC and produces no SQS message', async () => {
    const payload = loadFixture('webhooks/datadog/single-alert.json');

    const res = await request(app)
      .post('/v1/wh/tenant-123/datadog')
      .set('DD-WEBHOOK-SIGNATURE', 'bad-signature')
      .send(payload);

    expect(res.status).toBe(401);
    const messages = await pollSqs(sqsClient, 'alert-ingested.fifo', { waitMs: 1000 });
    expect(messages).toHaveLength(0);
  });
});

Repeat this pattern for PagerDuty, OpsGenie, and Grafana, each with its provider-specific signature header and payload format.

4.2 Correlation Engine → Redis Integration

// tests/integration/correlation/redis-windows.test.ts
describe('Correlation Engine + Redis', () => {
  let redis: StartedTestContainer;
  let redisClient: Redis;

  beforeAll(async () => {
    redis = await new GenericContainer('redis:7-alpine')
      .withExposedPorts(6379)
      .start();
    redisClient = new Redis({ host: redis.getHost(), port: redis.getMappedPort(6379) });
  });

  it('opens window in Redis sorted set with correct TTL', async () => {
    await correlationEngine.processAlert(makeAlert({ service: 'payment-api' }));

    const windows = await redisClient.zrange('windows:tenant-123', 0, -1, 'WITHSCORES');
    expect(windows).toHaveLength(2); // [windowId, closesAtEpoch]
    const ttl = await redisClient.ttl('window:tenant-123:payment-api');
    expect(ttl).toBeGreaterThan(280); // ~5min minus processing time
  });

  it('extends window when alert arrives in last 30 seconds', async () => {
    // Open window, advance clock to T+4m31s, send another alert
    await correlationEngine.processAlert(makeAlert({ service: 'payment-api' }));
    vi.advanceTimersByTime(4 * 60 * 1000 + 31 * 1000);
    await correlationEngine.processAlert(makeAlert({ service: 'payment-api' }));

    const ttl = await redisClient.ttl('window:tenant-123:payment-api');
    expect(ttl).toBeGreaterThan(100); // Extended by ~2min
  });

  it('isolates windows between tenants', async () => {
    await correlationEngine.processAlert(makeAlert({ tenant: 'A', service: 'api' }));
    await correlationEngine.processAlert(makeAlert({ tenant: 'B', service: 'api' }));

    const windowsA = await redisClient.zrange('windows:A', 0, -1);
    const windowsB = await redisClient.zrange('windows:B', 0, -1);
    expect(windowsA).toHaveLength(1);
    expect(windowsB).toHaveLength(1);
    expect(windowsA[0]).not.toBe(windowsB[0]);
  });
});

4.3 Correlation Engine → DynamoDB Integration

// tests/integration/correlation/dynamodb-incidents.test.ts
describe('Correlation Engine + DynamoDB', () => {
  let dynamodb: StartedTestContainer;

  beforeAll(async () => {
    dynamodb = await new GenericContainer('amazon/dynamodb-local:latest')
      .withExposedPorts(8000)
      .start();
    // Create tables: alerts, incidents, tenant_config, service_dependencies
  });

  it('persists incident record when correlation window closes', async () => {
    await correlationEngine.processAlert(makeAlert({ service: 'api' }));
    await correlationEngine.processAlert(makeAlert({ service: 'api' }));
    await correlationEngine.closeExpiredWindows();

    const incidents = await queryIncidents('tenant-123');
    expect(incidents).toHaveLength(1);
    expect(incidents[0].alert_count).toBe(2);
    expect(incidents[0].services).toContain('api');
  });

  it('reads service dependencies for cascading correlation', async () => {
    await putServiceDependency('tenant-123', 'api', 'database');
    await correlationEngine.processAlert(makeAlert({ service: 'database' }));
    await correlationEngine.processAlert(makeAlert({ service: 'api' }));

    // Both should be in the same window
    const windows = await getActiveWindows('tenant-123');
    expect(windows).toHaveLength(1);
    expect(windows[0].services).toEqual(expect.arrayContaining(['api', 'database']));
  });
});

4.4 Correlation Engine → TimescaleDB Integration

// tests/integration/correlation/timescaledb-trends.test.ts
describe('Correlation Engine + TimescaleDB', () => {
  let pg: StartedTestContainer;

  beforeAll(async () => {
    pg = await new GenericContainer('timescale/timescaledb:latest-pg16')
      .withExposedPorts(5432)
      .withEnvironment({ POSTGRES_PASSWORD: 'test' })
      .start();
    // Run migrations: create hypertables, continuous aggregates
  });

  it('writes alert frequency data to hypertable', async () => {
    await correlationEngine.recordAlertEvent(makeAlert({ service: 'api' }));
    const rows = await query('SELECT * FROM alert_events WHERE service = $1', ['api']);
    expect(rows).toHaveLength(1);
  });

  it('continuous aggregate calculates hourly alert counts', async () => {
    // Insert 10 alerts spread over 2 hours
    await insertAlertEvents(10, { spreadHours: 2 });
    await refreshContinuousAggregate('hourly_alert_summary');

    const summary = await query('SELECT * FROM hourly_alert_summary');
    expect(summary).toHaveLength(2);
    expect(summary.reduce((s, r) => s + r.alert_count, 0)).toBe(10);
  });
});

4.5 Notification Service → Slack (WireMock)

// tests/integration/notifications/slack.test.ts
describe('Notification Service + Slack', () => {
  let wiremock: WireMockContainer;

  beforeAll(async () => {
    wiremock = await new WireMockContainer().start();
    wiremock.stub({
      request: { method: 'POST', urlPath: '/api/chat.postMessage' },
      response: { status: 200, body: JSON.stringify({ ok: true, ts: '1234.5678' }) }
    });
    wiremock.stub({
      request: { method: 'POST', urlPath: '/api/chat.update' },
      response: { status: 200, body: JSON.stringify({ ok: true }) }
    });
  });

  it('sends initial alert notification to correct Slack channel', async () => {});
  it('updates message in-place when correlation completes', async () => {});
  it('respects Slack rate limits (1 msg/sec per channel)', async () => {});
  it('retries on 429 with exponential backoff', async () => {});
  it('includes feedback buttons in correlated incident message', async () => {});
});

Section 5: E2E & Smoke Tests

5.1 Critical User Journeys

Journey 1: 60-Second Time-to-Value

The defining test for dd0c/alert. Validates the entire pipeline from webhook to Slack notification.

// tests/e2e/journeys/sixty-second-ttv.test.ts
describe('60-Second Time-to-Value', () => {
  it('delivers first correlated incident to Slack within 60 seconds of webhook', async () => {
    const start = Date.now();

    // 1. Send Datadog webhook
    await sendWebhook('datadog', fixtures.datadog.singleAlert, { tenant: 'e2e-tenant' });

    // 2. Wait for Slack message
    const slackMessage = await waitForSlackMessage('e2e-channel', { timeoutMs: 60_000 });

    const elapsed = Date.now() - start;
    expect(elapsed).toBeLessThan(60_000);
    expect(slackMessage.text).toContain('New alert');
    expect(slackMessage.blocks).toBeDefined();
  });
});

Journey 2: Alert Storm Correlation

// tests/e2e/journeys/alert-storm.test.ts
describe('Alert Storm Correlation', () => {
  it('groups 50 alerts in 2 minutes into a single correlated incident', async () => {
    // Fire 50 alerts for same service over 2 minutes
    for (let i = 0; i < 50; i++) {
      await sendWebhook('datadog', makeAlertPayload({
        service: 'payment-api',
        title: `High latency on payment-api (${i})`,
      }));
      await sleep(2400); // ~50 alerts in 2 min
    }

    // Wait for correlation window to close
    await sleep(5 * 60 * 1000 + 30_000); // 5min window + buffer

    const slackMessages = await getSlackMessages('e2e-channel');
    const incidentMessages = slackMessages.filter(m => m.text.includes('Incident'));
    expect(incidentMessages).toHaveLength(1);
    expect(incidentMessages[0].text).toContain('50 alerts grouped');
  });
});

Journey 3: Deploy Correlation

// tests/e2e/journeys/deploy-correlation.test.ts
describe('Deploy Correlation', () => {
  it('identifies deploy as trigger when alerts follow within 10 minutes', async () => {
    // 1. Send deploy event
    await sendWebhook('github-actions', makeDeployPayload({
      service: 'payment-api',
      commit: 'abc123',
      pr_title: 'feat: add retry logic',
    }));

    // 2. Wait 2 minutes, then fire alerts
    await sleep(2 * 60 * 1000);
    await sendWebhook('datadog', makeAlertPayload({ service: 'payment-api' }));
    await sendWebhook('pagerduty', makeAlertPayload({ service: 'payment-api' }));

    // 3. Wait for correlation
    await sleep(6 * 60 * 1000);

    const slackMessage = await getLatestSlackMessage('e2e-channel');
    expect(slackMessage.text).toContain('Deploy #');
    expect(slackMessage.text).toContain('abc123');
  });
});

Journey 4: Panic Mode

// tests/e2e/journeys/panic-mode.test.ts
describe('Panic Mode', () => {
  it('stops all suppression immediately when panic mode is activated', async () => {
    // 1. Enable audit mode, verify suppression works
    await setGovernanceMode('e2e-tenant', 'audit');
    await sendNoisyAlerts(10);
    const beforePanic = await getSlackMessages('e2e-channel');
    const suppressedBefore = beforePanic.filter(m => m.text.includes('suppressed'));
    expect(suppressedBefore.length).toBeGreaterThan(0); // audit mode is actually suppressing

    // 2. Activate panic mode
    await fetch('/admin/panic', { method: 'POST' });

    // 3. Send more alerts; only messages sent after panic matter, and none may be suppressed
    await sendNoisyAlerts(10);
    const afterPanic = await getSlackMessages('e2e-channel');
    const newMessages = afterPanic.slice(beforePanic.length);
    const rawAlerts = newMessages.filter(m => !m.text.includes('suppressed'));
    expect(rawAlerts).toHaveLength(10);
  });
});

5.2 E2E Infrastructure

# docker-compose.e2e.yml
services:
  localstack:
    image: localstack/localstack:3
    environment:
      SERVICES: sqs,s3,dynamodb,apigateway,lambda
    ports: ["4566:4566"]

  timescaledb:
    image: timescale/timescaledb:latest-pg16
    environment:
      POSTGRES_PASSWORD: test
    ports: ["5432:5432"]

  redis:
    image: redis:7-alpine
    ports: ["6379:6379"]

  wiremock:
    image: wiremock/wiremock:3
    ports: ["8080:8080"]
    volumes:
      - ./fixtures/wiremock:/home/wiremock/mappings

  app:
    build: .
    environment:
      AWS_ENDPOINT: http://localstack:4566
      REDIS_URL: redis://redis:6379
      TIMESCALE_URL: postgres://postgres:test@timescaledb:5432/test
      SLACK_API_URL: http://wiremock:8080
    depends_on: [localstack, timescaledb, redis, wiremock]

5.3 Synthetic Alert Generation

// tests/e2e/helpers/alert-generator.ts
export function makeAlertPayload(overrides: Partial<AlertPayload> = {}): DatadogWebhookPayload {
  return {
    id: ulid(),
    title: overrides.title ?? `Alert: ${faker.hacker.phrase()}`,
    text: faker.lorem.sentence(),
    date_happened: Math.floor(Date.now() / 1000),
    priority: overrides.priority ?? 'normal',
    tags: [`service:${overrides.service ?? 'test-service'}`],
    alert_type: overrides.severity ?? 'warning',
    ...overrides,
  };
}

export async function sendNoisyAlerts(count: number, opts?: { service?: string }) {
  for (let i = 0; i < count; i++) {
    await sendWebhook('datadog', makeAlertPayload({
      service: opts?.service ?? 'noisy-service',
      title: `Flapping alert #${i}`,
    }));
  }
}

Section 6: Performance & Load Testing

6.1 Alert Ingestion Throughput

// tests/perf/ingestion-throughput.test.ts
describe('Ingestion Throughput', () => {
  it('processes 1000 webhooks/second without dropping payloads', async () => {
    const results = await k6.run({
      vus: 100,
      duration: '30s',
      thresholds: {
        http_req_duration: ['p(95)<200'],  // 200ms p95 (k6 percentile syntax)
        http_req_failed: ['rate<0.001'],  // <0.1% failure
      },
      script: `
        import http from 'k6/http';
        // payload and signature are interpolated by the host process;
        // k6's own runtime cannot call makeAlertPayload()
        const payload = '${JSON.stringify(makeAlertPayload())}';
        const sig = '${validSig}';
        export default function() {
          http.post('${WEBHOOK_URL}/v1/wh/perf-tenant/datadog',
            payload,
            { headers: { 'DD-WEBHOOK-SIGNATURE': sig } }
          );
        }
      `,
    });
    expect(results.metrics.http_req_failed.rate).toBeLessThan(0.001);
  });
});

6.2 Correlation Latency Under Alert Storms

describe('Correlation Storm Performance', () => {
  it('correlates 500 alerts across 10 services within 30 seconds', async () => {
    const start = Date.now();
    
    // Simulate incident storm: 500 alerts, 10 services, 2 minutes
    await generateAlertStorm({ alerts: 500, services: 10, durationMs: 120_000 });
    
    // Wait for all windows to close
    await waitForIncidents('perf-tenant', { minCount: 1, timeoutMs: 30_000 });
    
    const elapsed = Date.now() - start - 120_000; // subtract generation time
    expect(elapsed).toBeLessThan(30_000);
  });

  it('Redis memory stays under 50MB during 10K active windows', async () => {
    // Open 10K windows across 100 tenants
    for (let t = 0; t < 100; t++) {
      for (let s = 0; s < 100; s++) {
        await correlationEngine.processAlert(makeAlert({
          tenant: `tenant-${t}`,
          service: `service-${s}`,
        }));
      }
    }
    const memoryUsage = await redisClient.info('memory');
    const usedMb = parseRedisMemory(memoryUsage);
    expect(usedMb).toBeLessThan(50);
  });
});
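The `parseRedisMemory` helper used above might be sketched as follows. It is hypothetical; the one real subtlety is that Redis `INFO` lines are CRLF-terminated, so the regex must tolerate a trailing `\r`.

```typescript
// Pull used_memory (bytes) out of Redis INFO output and convert to MiB.
function parseRedisMemory(info: string): number {
  // INFO lines end with \r\n, so allow an optional \r before end-of-line
  const match = info.match(/^used_memory:(\d+)\r?$/m);
  if (!match) throw new Error('used_memory not found in INFO output');
  return Number(match[1]) / 1024 / 1024;
}
```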

6.3 Noise Scoring Latency

describe('Noise Scoring Performance', () => {
  it('scores a correlated incident with 50 alerts in <100ms', async () => {
    const incident = makeIncident({ alertCount: 50, withHistory: true });
    
    const start = performance.now();
    const score = await noiseScorer.score(incident);
    const elapsed = performance.now() - start;
    
    expect(elapsed).toBeLessThan(100);
    expect(score).toBeGreaterThanOrEqual(0);
    expect(score).toBeLessThanOrEqual(100);
  });
});

6.4 Memory Pressure During High-Cardinality Correlation

describe('Memory Pressure', () => {
  it('ECS task stays under 512MB with 1000 concurrent correlation windows', async () => {
    // Monitor ECS task memory while processing high-cardinality alerts
    const memBefore = process.memoryUsage().heapUsed;
    
    await processHighCardinalityAlerts({ tenants: 100, servicesPerTenant: 10 });
    
    const memAfter = process.memoryUsage().heapUsed;
    const deltaMb = (memAfter - memBefore) / 1024 / 1024;
    expect(deltaMb).toBeLessThan(256); // Leave headroom in 512MB task
  });
});

Section 7: CI/CD Pipeline Integration

7.1 Pipeline Stages

┌──────────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐
│ Pre-Commit   │───▶│ PR Gate  │───▶│ Merge    │───▶│ Staging  │───▶│ Prod     │
│ (local)      │    │ (CI)     │    │ (CI)     │    │ (CD)     │    │ (CD)     │
└──────────────┘    └──────────┘    └──────────┘    └──────────┘    └──────────┘
  lint + format     unit tests      full suite      E2E + perf     smoke + canary
  type check        integration     coverage gate   LocalStack     deploy event
  <10s              <5min           <10min          <15min         self-dogfood

7.2 Stage Details

Pre-Commit (local, <10s):

  • eslint + prettier format check
  • tsc --noEmit type check
  • Affected unit tests only (vitest --changed)

PR Gate (CI, <5min):

  • Full unit test suite
  • Integration tests (Testcontainers spin up in CI)
  • Schema migration lint (no DROP/RENAME/TYPE changes)
  • Decision log presence check for scoring/correlation PRs
  • Coverage diff: new code must have ≥80% coverage

Merge to Main (CI, <10min):

  • Full test suite (unit + integration)
  • Coverage gate: overall ≥80%, scoring engine ≥90%
  • CDK synth + diff (infrastructure changes)
  • Security scan (npm audit, trivy)

Staging (CD, <15min):

  • Deploy to staging environment
  • E2E journey tests against LocalStack
  • Performance benchmarks (ingestion throughput, correlation latency)
  • Synthetic alert generation + validation

Production (CD):

  • Canary deploy (10% traffic for 5 minutes)
  • Smoke tests (send test webhook, verify Slack delivery)
  • dd0c/alert dogfoods itself: deploy event sent to own webhook
  • Automated rollback if error rate >1% during canary

7.3 Coverage Thresholds

Component                 Minimum   Target
Webhook Parsers           90%       95%
HMAC Validator            95%       100%
Correlation Engine        85%       90%
Noise Scorer              90%       95%
Governance Policy         90%       95%
Notification Formatter    75%       85%
Overall                   80%       85%
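
These gates can be encoded directly in the test runner config rather than enforced by hand. A sketch using vitest's per-glob coverage thresholds — the glob keys are assumptions about this repo's source layout, not paths confirmed by this document:

```typescript
// vitest.config.ts — coverage gates as enforced config (paths are hypothetical)
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    coverage: {
      provider: 'v8',
      thresholds: {
        // Overall minimums (CI fails below these)
        lines: 80,
        branches: 80,
        // Per-component minimums via glob keys
        'src/scoring/**': { lines: 90, branches: 90 },
        'src/webhooks/parsers/**': { lines: 90 },
        'src/security/hmac/**': { lines: 95 },
      },
    },
  },
});
```

With this in place, the "Coverage ≥80% overall, ≥90% on scoring engine" release gate fails the build automatically instead of relying on a manual check.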

7.4 Test Parallelization

# .github/workflows/test.yml
jobs:
  unit:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        shard: [1, 2, 3, 4]
    steps:
      - run: vitest --shard=${{ matrix.shard }}/4

  integration:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        suite: [webhooks, correlation, notifications, storage]
    steps:
      - run: vitest --project=integration -t ${{ matrix.suite }}

  e2e:
    needs: [unit, integration]
    runs-on: ubuntu-latest
    steps:
      - run: docker compose -f docker-compose.e2e.yml up -d
      - run: vitest --project=e2e

Section 8: Transparent Factory Tenet Testing

8.1 Atomic Flagging — Suppression Circuit Breaker

describe('Atomic Flagging', () => {
  describe('Flag Lifecycle', () => {
    it('new scoring rule flag defaults to false (off)', () => {});
    it('flag has owner and ttl metadata', () => {});
    it('CI blocks when flag at 100% exceeds 14-day TTL', () => {});
  });

  describe('Circuit Breaker on Suppression Volume', () => {
    it('allows suppression when volume is within 2x baseline', () => {});
    it('trips breaker when suppression exceeds 2x baseline over 30min', () => {});
    it('auto-disables the flag when breaker trips', () => {});
    it('buffers suppressed alerts in DLQ during normal operation', () => {});
    it('replays DLQ alerts when breaker trips', async () => {
      // 1. Enable scoring flag, suppress 20 alerts
      // 2. Trip the breaker by spiking suppression rate
      // 3. Verify all 20 suppressed alerts are re-emitted from DLQ
      // 4. Verify flag is now disabled
    });
    it('DLQ retains alerts for 1 hour before expiry', () => {});
  });

  describe('Local Evaluation', () => {
    it('flag evaluation does not make network calls', () => {});
    it('flag state is cached in-memory and refreshed every 60s', () => {});
  });
});
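
The breaker behavior these skeletons pin down — trip when suppression volume exceeds 2x baseline over a sliding 30-minute window — can be sketched as a small counter class. This is a minimal illustration under assumed names (class, constructor parameters); the real implementation would also disable the flag and replay the DLQ on trip:

```typescript
// Sliding-window circuit breaker for suppression volume (illustrative sketch)
class SuppressionBreaker {
  private events: number[] = []; // timestamps (ms) of recent suppressions
  tripped = false;

  constructor(
    private baselinePerWindow: number,        // expected suppressions per window
    private windowMs: number = 30 * 60 * 1000, // 30-minute sliding window
  ) {}

  // Record a suppression attempt; returns false once the breaker has tripped
  recordSuppression(now: number = Date.now()): boolean {
    if (this.tripped) return false;
    this.events.push(now);
    // Keep only events inside the sliding window
    this.events = this.events.filter(t => now - t < this.windowMs);
    if (this.events.length > 2 * this.baselinePerWindow) {
      // Caller reacts: auto-disable the scoring flag, replay the DLQ
      this.tripped = true;
      return false;
    }
    return true;
  }
}
```

Usage: with a baseline of 5 per window, the 11th suppression inside the window trips the breaker, after which every call returns false until the flag owner intervenes.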

8.2 Elastic Schema — Migration Validation

describe('Elastic Schema', () => {
  describe('Migration Lint', () => {
    it('rejects migration with DROP COLUMN statement', () => {
      const migration = 'ALTER TABLE alert_events DROP COLUMN old_field;';
      expect(lintMigration(migration)).toContainError('DROP not allowed');
    });
    it('rejects migration with ALTER COLUMN TYPE', () => {
      const migration = 'ALTER TABLE alert_events ALTER COLUMN severity TYPE integer;';
      expect(lintMigration(migration)).toContainError('TYPE change not allowed');
    });
    it('rejects migration with RENAME COLUMN', () => {});
    it('accepts migration with ADD COLUMN (nullable)', () => {
      const migration = 'ALTER TABLE alert_events ADD COLUMN noise_score_v2 integer;';
      expect(lintMigration(migration)).toBeValid();
    });
    it('accepts migration with new table creation', () => {});
  });

  describe('DynamoDB Schema', () => {
    it('rejects attribute type change in table definition', () => {});
    it('accepts new attribute addition', () => {});
    it('V1 code ignores V2 attributes without error', () => {});
  });

  describe('Sunset Enforcement', () => {
    it('every migration file contains sunset_date comment', () => {
      const migrations = glob.sync('migrations/*.sql');
      for (const m of migrations) {
        const content = fs.readFileSync(m, 'utf-8');
        expect(content).toMatch(/-- sunset_date: \d{4}-\d{2}-\d{2}/);
      }
    });
    it('CI warns when migration is past sunset date', () => {});
  });
});
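
The tests above call lintMigration through custom toContainError/toBeValid matchers defined elsewhere in the suite. As a rough sketch of the linter itself — the regexes and LintResult shape are assumptions, not the shipped implementation:

```typescript
// Illustrative migration linter: reject destructive DDL, allow additive DDL
interface LintResult {
  valid: boolean;
  errors: string[];
}

const FORBIDDEN: Array<[RegExp, string]> = [
  [/\bDROP\s+(COLUMN|TABLE)\b/i, 'DROP not allowed'],
  [/\bALTER\s+COLUMN\s+\w+\s+TYPE\b/i, 'TYPE change not allowed'],
  [/\bRENAME\s+(COLUMN|TO)\b/i, 'RENAME not allowed'],
];

function lintMigration(sql: string): LintResult {
  const errors = FORBIDDEN
    .filter(([pattern]) => pattern.test(sql))
    .map(([, message]) => message);
  return { valid: errors.length === 0, errors };
}
```

Additive statements (ADD COLUMN, CREATE TABLE) match none of the forbidden patterns and pass; anything destructive fails with the specific error the tests assert on.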

8.3 Cognitive Durability — Decision Log Validation

describe('Cognitive Durability', () => {
  it('decision_log.json exists for every PR touching scoring/', () => {
    // CI hook: check git diff for files in src/scoring/
    // If touched, require docs/decisions/*.json in the same PR
  });

  it('decision log has required fields', () => {
    const logs = glob.sync('docs/decisions/*.json');
    for (const log of logs) {
      const entry = JSON.parse(fs.readFileSync(log, 'utf-8'));
      expect(entry).toHaveProperty('reasoning');
      expect(entry).toHaveProperty('alternatives_considered');
      expect(entry).toHaveProperty('confidence');
      expect(entry).toHaveProperty('timestamp');
      expect(entry).toHaveProperty('author');
    }
  });

  it('cyclomatic complexity stays under 10 for all scoring functions', () => {
    // execSync throws on a non-zero exit code, so a lint violation fails the test
    expect(() =>
      execSync('eslint src/scoring/ --rule "complexity: [error, 10]"')
    ).not.toThrow();
  });
});

8.4 Semantic Observability — OTEL Span Assertions

describe('Semantic Observability', () => {
  let spanExporter: InMemorySpanExporter;

  beforeEach(() => {
    spanExporter = new InMemorySpanExporter();
    // Configure OTEL with in-memory exporter for testing
  });

  describe('Alert Evaluation Spans', () => {
    it('emits parent alert_evaluation span for each alert', async () => {
      await processAlert(makeAlert());
      const spans = spanExporter.getFinishedSpans();
      const evalSpan = spans.find(s => s.name === 'alert_evaluation');
      expect(evalSpan).toBeDefined();
    });

    it('emits child noise_scoring span with score attributes', async () => {
      await processAlert(makeAlert());
      const spans = spanExporter.getFinishedSpans();
      const scoreSpan = spans.find(s => s.name === 'noise_scoring');
      expect(scoreSpan).toBeDefined();
      expect(scoreSpan.attributes['alert.noise_score']).toBeGreaterThanOrEqual(0);
      expect(scoreSpan.attributes['alert.noise_score']).toBeLessThanOrEqual(100);
    });

    it('emits child correlation_matching span with match data', async () => {
      await processAlert(makeAlert());
      const spans = spanExporter.getFinishedSpans();
      const corrSpan = spans.find(s => s.name === 'correlation_matching');
      expect(corrSpan).toBeDefined();
      expect(corrSpan.attributes).toHaveProperty('alert.correlation_matches');
    });

    it('emits suppression_decision span with reason', async () => {
      await processAlert(makeAlert());
      const spans = spanExporter.getFinishedSpans();
      const suppSpan = spans.find(s => s.name === 'suppression_decision');
      expect(suppSpan).toBeDefined();
      expect(suppSpan.attributes).toHaveProperty('alert.suppressed');
      expect(suppSpan.attributes).toHaveProperty('alert.suppression_reason');
    });
  });

  describe('PII Protection', () => {
    it('never includes raw alert payload in span attributes', async () => {
      await processAlert(makeAlert({ title: 'User john@example.com failed login' }));
      const spans = spanExporter.getFinishedSpans();
      for (const span of spans) {
        const attrs = JSON.stringify(span.attributes);
        expect(attrs).not.toContain('john@example.com');
      }
    });

    it('uses hashed alert source identifier, not raw', async () => {
      await processAlert(makeAlert({ source: 'prod-payment-api' }));
      const spans = spanExporter.getFinishedSpans();
      const evalSpan = spans.find(s => s.name === 'alert_evaluation');
      expect(evalSpan.attributes['alert.source']).toMatch(/^[a-f0-9]+$/);
    });
  });
});

8.5 Configurable Autonomy — Governance Policy Tests

describe('Configurable Autonomy', () => {
  describe('Governance Mode Enforcement', () => {
    it('strict mode: annotates but never suppresses', async () => {
      setPolicy({ governance_mode: 'strict' });
      const result = await processNoisyAlert(makeAlert({ noiseScore: 95 }));
      expect(result.suppressed).toBe(false);
      expect(result.annotation).toContain('noise_score: 95');
    });

    it('audit mode: auto-suppresses with logging', async () => {
      setPolicy({ governance_mode: 'audit' });
      const result = await processNoisyAlert(makeAlert({ noiseScore: 95 }));
      expect(result.suppressed).toBe(true);
      expect(result.log).toContain('suppressed by audit mode');
    });
  });

  describe('Panic Mode', () => {
    it('activates in <1 second via API call', async () => {
      const start = Date.now();
      await fetch('/admin/panic', { method: 'POST' });
      const elapsed = Date.now() - start; // measure only the API call, not the verification read
      expect(elapsed).toBeLessThan(1000);
      expect(await redisClient.get('dd0c:panic')).toBe('true');
    });

    it('stops all suppression when active', async () => {
      await activatePanic();
      const results = await Promise.all(
        Array.from({ length: 10 }, () => processNoisyAlert(makeAlert({ noiseScore: 99 })))
      );
      expect(results.every(r => r.suppressed === false)).toBe(true);
    });
  });

  describe('Per-Customer Override', () => {
    it('customer strict overrides system audit', async () => {
      setPolicy({ governance_mode: 'audit' });
      setCustomerPolicy('tenant-123', { governance_mode: 'strict' });
      const result = await processNoisyAlert(makeAlert({ tenant: 'tenant-123', noiseScore: 95 }));
      expect(result.suppressed).toBe(false);
    });

    it('customer cannot downgrade from system strict to audit', async () => {
      setPolicy({ governance_mode: 'strict' });
      setCustomerPolicy('tenant-123', { governance_mode: 'audit' });
      const result = await processNoisyAlert(makeAlert({ tenant: 'tenant-123', noiseScore: 95 }));
      expect(result.suppressed).toBe(false); // System strict wins
    });
  });
});
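
The override tests encode one invariant: "strict" always wins, so a customer can tighten the system policy but never loosen it. A one-function sketch of that resolution rule (function and type names are assumptions):

```typescript
// Policy resolution sketch: strict dominates, tenant override otherwise applies
type GovernanceMode = 'strict' | 'audit';

function effectiveMode(
  system: GovernanceMode,
  tenant?: GovernanceMode, // per-customer override, if any
): GovernanceMode {
  // Either side saying "strict" forces strict — tenants can only tighten
  if (system === 'strict' || tenant === 'strict') return 'strict';
  return tenant ?? system;
}
```

Keeping this as a pure function makes the four matrix cases (system × tenant) trivially unit-testable without Redis or a running policy engine.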

Section 9: Test Data & Fixtures

9.1 Directory Structure

tests/
  fixtures/
    webhooks/
      datadog/
        single-alert.json
        batched-alerts.json
        monitor-recovered.json
        high-priority.json
      pagerduty/
        incident-triggered.json
        incident-resolved.json
        incident-acknowledged.json
      opsgenie/
        alert-created.json
        alert-closed.json
      grafana/
        single-firing.json
        multi-firing.json
        resolved.json
    deploys/
      github-actions-success.json
      github-actions-failure.json
      gitlab-ci-pipeline.json
      argocd-sync.json
    scenarios/
      alert-storm-50-alerts.json
      cascading-failure-3-services.json
      flapping-alert-10-cycles.json
      maintenance-window-suppression.json
      deploy-correlated-incident.json
    slack/
      initial-alert-blocks.json
      correlated-incident-blocks.json
      weekly-digest-blocks.json
    schemas/
      canonical-alert.json
      incident-record.json
      tenant-config.json

9.2 Alert Payload Factory

// tests/helpers/factories.ts
import crypto from 'node:crypto';
import { ulid } from 'ulid';
import { faker } from '@faker-js/faker';
export function makeCanonicalAlert(overrides: Partial<CanonicalAlert> = {}): CanonicalAlert {
  return {
    alert_id: ulid(),
    tenant_id: overrides.tenant_id ?? 'test-tenant',
    provider: overrides.provider ?? 'datadog',
    service: overrides.service ?? 'test-service',
    title: overrides.title ?? `Alert: ${faker.hacker.phrase()}`,
    severity: overrides.severity ?? 'warning',
    fingerprint: overrides.fingerprint ?? crypto.randomBytes(32).toString('hex'),
    timestamp: overrides.timestamp ?? new Date().toISOString(),
    raw_payload_s3_key: overrides.raw_payload_s3_key ?? `raw/${ulid()}.json`,
    metadata: overrides.metadata ?? {},
    ...overrides,
  };
}

export function makeIncident(overrides: Partial<Incident> = {}): Incident {
  const alertCount = overrides.alert_count ?? 5;
  return {
    incident_id: ulid(),
    tenant_id: overrides.tenant_id ?? 'test-tenant',
    services: overrides.services ?? ['test-service'],
    alert_count: alertCount,
    alerts: Array.from({ length: alertCount }, () => makeCanonicalAlert()),
    noise_score: overrides.noise_score ?? 0,
    deploy_correlation: overrides.deploy_correlation ?? null,
    window_opened_at: overrides.window_opened_at ?? new Date().toISOString(),
    window_closed_at: overrides.window_closed_at ?? new Date().toISOString(),
    ...overrides,
  };
}

export function makeDeployEvent(overrides: Partial<DeployEvent> = {}): DeployEvent {
  return {
    deploy_id: ulid(),
    tenant_id: overrides.tenant_id ?? 'test-tenant',
    service: overrides.service ?? 'test-service',
    commit_sha: overrides.commit_sha ?? faker.git.commitSha(),
    pr_title: overrides.pr_title ?? faker.git.commitMessage(),
    deployed_at: overrides.deployed_at ?? new Date().toISOString(),
    provider: overrides.provider ?? 'github-actions',
    ...overrides,
  };
}

9.3 Noise Scenario Fixtures

// tests/helpers/scenarios.ts
import { makeCanonicalAlert, makeDeployEvent } from './factories';

// Offset helper: ISO timestamp `seconds` from now (used by cascadingFailure)
const t = (seconds: number) => new Date(Date.now() + seconds * 1000).toISOString();

export const NOISE_SCENARIOS = {
  alertStorm: {
    description: '50 alerts for same service in 2 minutes',
    alerts: Array.from({ length: 50 }, (_, i) => makeCanonicalAlert({
      service: 'payment-api',
      title: `High latency variant ${i}`,
      timestamp: new Date(Date.now() + i * 2400).toISOString(),
    })),
    expectedIncidents: 1,
    expectedNoiseScore: { min: 70, max: 95 },
  },

  flappingAlert: {
    description: 'Alert fires and resolves 10 times in 1 hour',
    alerts: Array.from({ length: 20 }, (_, i) => makeCanonicalAlert({
      service: 'health-check',
      title: 'Health check failed',
      severity: i % 2 === 0 ? 'warning' : 'info', // alternating fire/resolve
      timestamp: new Date(Date.now() + i * 3 * 60 * 1000).toISOString(),
    })),
    expectedNoiseScore: { min: 80, max: 100 },
  },

  cascadingFailure: {
    description: 'Database fails, then API, then frontend',
    alerts: [
      makeCanonicalAlert({ service: 'database', severity: 'critical', timestamp: t(0) }),
      makeCanonicalAlert({ service: 'api', severity: 'high', timestamp: t(30) }),
      makeCanonicalAlert({ service: 'api', severity: 'high', timestamp: t(45) }),
      makeCanonicalAlert({ service: 'frontend', severity: 'medium', timestamp: t(60) }),
      makeCanonicalAlert({ service: 'frontend', severity: 'medium', timestamp: t(90) }),
    ],
    serviceDependencies: [['api', 'database'], ['frontend', 'api']],
    expectedIncidents: 1, // All merged via dependency graph
    expectedNoiseScore: { min: 0, max: 30 }, // Real incident, not noise
  },

  deployCorrelated: {
    description: 'Deploy followed by alert storm',
    deploy: makeDeployEvent({ service: 'payment-api', pr_title: 'feat: add retry logic' }),
    alerts: Array.from({ length: 8 }, () => makeCanonicalAlert({
      service: 'payment-api',
      severity: 'high',
    })),
    deployToAlertGapMs: 2 * 60 * 1000, // 2 minutes after deploy
    expectedNoiseScore: { min: 50, max: 85 }, // Deploy correlation boosts noise score
  },
};

Section 10: TDD Implementation Order

10.1 Bootstrap Sequence

The test infrastructure itself must be built before any product code. This is the order:

Phase 0: Test Infrastructure (Week 0)
  ├── 0.1 vitest config + TypeScript setup
  ├── 0.2 Testcontainers helper (Redis, DynamoDB Local, TimescaleDB)
  ├── 0.3 LocalStack helper (SQS, S3, API Gateway)
  ├── 0.4 Fixture loader utility
  ├── 0.5 Factory functions (makeCanonicalAlert, makeIncident, makeDeployEvent)
  ├── 0.6 WireMock Slack stub
  └── 0.7 CI pipeline with test stages

10.2 Epic-by-Epic TDD Order

Phase 1: Webhook Ingestion (Epic 1) — Tests First
  ├── 1.1 RED: HMAC validator tests (all providers)
  ├── 1.2 GREEN: Implement HMAC validation
  ├── 1.3 RED: Datadog parser tests (single + batch)
  ├── 1.4 GREEN: Implement Datadog parser
  ├── 1.5 RED: PagerDuty parser tests
  ├── 1.6 GREEN: Implement PagerDuty parser
  ├── 1.7 RED: Fingerprint generator tests
  ├── 1.8 GREEN: Implement fingerprinting
  ├── 1.9 INTEGRATION: Lambda → SQS contract test
  └── 1.10 REFACTOR: Extract provider parser interface

Phase 2: Correlation Engine (Epic 2) — Tests First
  ├── 2.1 RED: Time-window open/close/extend tests
  ├── 2.2 GREEN: Implement window manager
  ├── 2.3 RED: Service graph correlation tests
  ├── 2.4 GREEN: Implement dependency traversal
  ├── 2.5 RED: Deploy correlation tests
  ├── 2.6 GREEN: Implement deploy tracker
  ├── 2.7 INTEGRATION: Correlation → Redis window tests
  ├── 2.8 INTEGRATION: Correlation → DynamoDB incident persistence
  └── 2.9 INTEGRATION: Correlation → TimescaleDB trend writes

Phase 3: Noise Analysis (Epic 3) — Tests First
  ├── 3.1 RED: Rule-based noise scoring tests (all rules)
  ├── 3.2 GREEN: Implement scorer
  ├── 3.3 RED: Threshold classification tests
  ├── 3.4 GREEN: Implement classifier
  ├── 3.5 RED: "What would have happened" calculation tests
  ├── 3.6 GREEN: Implement historical analysis
  └── 3.7 REFACTOR: Extract scoring rules into configurable pipeline

Phase 4: Notifications (Epic 4) — Integration Tests Lead
  ├── 4.1 Implement Slack block formatter
  ├── 4.2 RED: Snapshot tests for all message formats
  ├── 4.3 INTEGRATION: Notification → Slack (WireMock)
  ├── 4.4 RED: Rate limiting tests
  └── 4.5 GREEN: Implement rate limiter

Phase 5: Governance (Epic 10) — Tests First
  ├── 5.1 RED: Strict/audit mode enforcement tests
  ├── 5.2 GREEN: Implement policy engine
  ├── 5.3 RED: Panic mode tests (<1s activation)
  ├── 5.4 GREEN: Implement panic mode
  ├── 5.5 RED: Circuit breaker + DLQ replay tests
  ├── 5.6 GREEN: Implement circuit breaker
  ├── 5.7 RED: OTEL span assertion tests
  └── 5.8 GREEN: Instrument all components

Phase 6: E2E Validation
  ├── 6.1 60-second TTV journey
  ├── 6.2 Alert storm correlation journey
  ├── 6.3 Deploy correlation journey
  ├── 6.4 Panic mode journey
  └── 6.5 Performance benchmarks

10.3 "Never Ship Without" Checklist

Before any release, these tests must pass:

  • All HMAC validation tests (security gate)
  • All correlation window tests (correctness gate)
  • All noise scoring tests (safety gate — never eat real alerts)
  • All governance policy tests (compliance gate)
  • Circuit breaker DLQ replay test (safety net gate)
  • 60-second TTV E2E journey (product promise gate)
  • PII protection span tests (privacy gate)
  • Schema migration lint (no breaking changes)
  • Coverage ≥80% overall, ≥90% on scoring engine

End of dd0c/alert Test Architecture