Max Mayfield 1101fef096 Update test architectures for P3, P4, P5

2026-02-28 23:33:07 +00:00

dd0c/cost — Test Architecture & TDD Strategy

Product: dd0c/cost — AWS Cost Anomaly Detective
Author: Test Architecture Phase
Date: February 28, 2026
Status: V1 MVP — Solo Founder Scope

Section 1: Testing Philosophy & TDD Workflow

1.1 Core Philosophy

dd0c/cost sits at the intersection of money and infrastructure. A false negative can cost a customer thousands of dollars. A false positive causes alert fatigue and, eventually, churn. The test suite's primary job is to demonstrate that the anomaly scoring engine produces the expected score and severity across edge cases.

Guiding principle: Test the math first, test the infrastructure second. The Z-score and novelty algorithms must be exhaustively unit-tested with synthetic data before any AWS APIs are mocked.

1.2 Red-Green-Refactor Adapted to dd0c/cost

RED   → Write a failing test that asserts a specific Z-score and severity
         for a given historical baseline and new cost event.

GREEN → Implement the scoring math to make it pass.

REFACTOR → Optimize the baseline lookup, extract novelty checks,
            refine the heuristic weights.
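
As one illustration of the RED and GREEN steps, the sketch below implements just enough scoring math to satisfy an assertion such as "critical when Z-score > 3 and hourly cost > $1". The names `zScore`, `classify`, and `Baseline`, and the zero-variance convention, are assumptions made for this example and are not taken from the project's code.

```typescript
// Hypothetical types and names, for illustration only.
type Severity = "none" | "critical";

interface Baseline {
  mean: number;   // mean hourly cost in USD
  stddev: number; // standard deviation of hourly cost in USD
}

// Standard Z-score: how many standard deviations the cost sits from the mean.
function zScore(hourlyCost: number, baseline: Baseline): number {
  if (baseline.stddev === 0) {
    // Assumed convention: with zero variance, any deviation is unbounded.
    if (hourlyCost === baseline.mean) return 0;
    return hourlyCost > baseline.mean ? Infinity : -Infinity;
  }
  return (hourlyCost - baseline.mean) / baseline.stddev;
}

// Critical only when the event is both statistically unusual and materially
// expensive. Intermediate severities are left out of this sketch.
function classify(hourlyCost: number, baseline: Baseline): Severity {
  const z = zScore(hourlyCost, baseline);
  return z > 3 && hourlyCost > 1 ? "critical" : "none";
}
```

A failing test would pin down one baseline and one event, for example a mean of $1.00/hr with a standard deviation of $0.25 and a new $2.00/hr resource, and assert a Z-score of 4 and critical severity before any of this code exists.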

When to write tests first (strict TDD):

Anomaly scoring engine (Z-scores, novelty checks, composite severity)
Cold-start heuristics (fast-path for resources at or above $5/hr)
Baseline calculation (moving averages, standard deviation)
Governance policy (strict vs. audit mode, 14-day promotion)

When integration tests lead:

CloudTrail ingestion (implement against LocalStack EventBridge, then lock in the behavior with integration tests)
DynamoDB single-table schema (build the access patterns, then write integration tests for them)

When E2E tests lead:

The Slack alert interaction (format the Block Kit message, then test the "Snooze/Terminate" buttons)

1.3 Test Naming Conventions

describe('AnomalyScorer', () => {
  it('assigns critical severity when Z-score > 3 and hourly cost > $1', () => {});
  it('flags actor novelty when IAM role has never launched this service', () => {});
  it('bypasses baseline and triggers fast-path critical for $10/hr instance', () => {});
});

describe('CloudTrailNormalizer', () => {
  it('extracts instance type and region from RunInstances event', () => {});
  it('looks up correct on-demand pricing for us-east-1 r6g.xlarge', () => {});
});

Section 2: Test Pyramid

2.1 Ratio

Level	Target	Count (V1)	Runtime
Unit	70%	~250 tests	<20s
Integration	20%	~80 tests	<3min
E2E/Smoke	10%	~15 tests	<5min

2.2 Unit Test Targets

Component	Key Behaviors	Est. Tests
Event Normalizer	CloudTrail parsing, pricing lookup, deduplication	40
Baseline Engine	Running mean/stddev calculation, maturity checks	35
Anomaly Scorer	Z-score math, novelty detection, composite scoring	50
Remediation Handler	Stop/Terminate payload parsing, IAM role assumption logic	20
Notification Engine	Slack formatting, daily digest aggregation	30
Governance Policy	Mode enforcement, 14-day auto-promotion	25
Feature Flags	Circuit breaker on alert volume, flag metadata	15

Section 3: Unit Test Strategy

3.1 Cost Ingestion & Normalization

describe('CloudTrailNormalizer', () => {
  it('normalizes EC2 RunInstances event to CostEvent schema', () => {});
  it('normalizes RDS CreateDBInstance event to CostEvent schema', () => {});
  it('extracts assumed role ARN as actor instead of base STS role', () => {});
  it('applies fallback pricing when instance type is not in static table', () => {});
  it('ignores non-cost-generating events (e.g., DescribeInstances)', () => {});
});
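
The behaviors listed above could be exercised against a normalizer along the following lines. This is a minimal sketch: the `CostEvent` fields, the `STATIC_PRICING` table, the fallback rate, and the choice of the session issuer as the "actor" are all assumptions for illustration, and the price shown is a placeholder rather than an authoritative rate.

```typescript
// Simplified CloudTrail record: only the fields this sketch reads.
interface CloudTrailRecord {
  eventName: string;
  awsRegion: string;
  userIdentity: {
    type: string;
    arn: string;
    sessionContext?: { sessionIssuer?: { arn: string } };
  };
  requestParameters?: { instanceType?: string } | null;
}

// Hypothetical CostEvent schema; the real schema is not shown in this document.
interface CostEvent {
  service: "ec2";
  resourceType: string;
  region: string;
  actor: string;
  hourlyCostUsd: number;
  pricingSource: "static" | "fallback";
}

// Illustrative static table; the price is a placeholder value.
const STATIC_PRICING: Record<string, Record<string, number>> = {
  "us-east-1": { "r6g.xlarge": 0.2016 },
};
const FALLBACK_HOURLY_USD = 1.0; // assumed default for unknown instance types

function resolveActor(identity: CloudTrailRecord["userIdentity"]): string {
  // One reading of "actor": for assumed-role sessions, use the issuing role
  // rather than the per-session STS ARN.
  if (identity.type === "AssumedRole") {
    return identity.sessionContext?.sessionIssuer?.arn ?? identity.arn;
  }
  return identity.arn;
}

function normalize(record: CloudTrailRecord): CostEvent | null {
  // Read-only calls such as DescribeInstances create no billable resource.
  if (record.eventName !== "RunInstances") return null;

  const instanceType = record.requestParameters?.instanceType ?? "unknown";
  const regionTable = STATIC_PRICING[record.awsRegion];
  const price: number | undefined = regionTable ? regionTable[instanceType] : undefined;

  return {
    service: "ec2",
    resourceType: instanceType,
    region: record.awsRegion,
    actor: resolveActor(record.userIdentity),
    hourlyCostUsd: price !== undefined ? price : FALLBACK_HOURLY_USD,
    pricingSource: price !== undefined ? "static" : "fallback",
  };
}
```

Returning `null` for non-cost-generating events keeps the filter and the mapping in one pure function, so each listed behavior can be asserted from a single JSON fixture with no AWS mocks.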

3.2 Anomaly Engine (The Math)

describe('AnomalyScorer', () => {
  describe('Statistical Scoring (Z-Score)', () => {
    it('returns score=0 when event cost exactly matches baseline mean', () => {});
    it('returns proportional score for Z-scores between 1.0 and 3.0', () => {});
    it('caps Z-score contribution at max threshold', () => {});
  });

  describe('Novelty Scoring', () => {
    it('adds novelty penalty when instance type is first seen for account', () => {});
    it('adds novelty penalty when IAM user has never provisioned this service', () => {});
  });

  describe('Cold-Start Fast Path', () => {
    it('flags $5/hr instance as warning when baseline < 14 days', () => {});
    it('flags $25/hr instance as critical immediately, bypassing baseline', () => {});
    it('ignores $0.10/hr instances during cold-start learning period', () => {});
  });
});
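
To make the three groups of tests above concrete, here is a rough sketch of a composite scorer that combines a capped Z-score contribution, novelty penalties, and a cold-start fast path. Every weight and threshold (`Z_CAP`, `Z_WEIGHT`, `NOVELTY_PENALTY`, the $5 and $10 fast-path cutoffs, the 50/80 severity bands) is an assumed value chosen to be consistent with the test names in this document, and the decision to apply the fast path only to immature baselines is also an assumption.

```typescript
type Severity = "none" | "warning" | "critical";

interface ScoreInput {
  hourlyCostUsd: number;
  baseline: { mean: number; stddev: number; mature: boolean };
  novelInstanceType: boolean; // instance type first seen for this account
  novelActor: boolean;        // actor has never provisioned this service
}

interface ScoreResult {
  score: number; // 0..100
  severity: Severity;
  path: "fast" | "statistical";
}

// All constants below are assumed values for illustration.
const Z_CAP = 3;
const Z_WEIGHT = 60;
const NOVELTY_PENALTY = 20;
const FAST_WARNING_USD = 5;
const FAST_CRITICAL_USD = 10;

function score(input: ScoreInput): ScoreResult {
  const { hourlyCostUsd, baseline } = input;

  // Cold start: no trustworthy baseline yet, so judge on absolute cost alone.
  if (!baseline.mature) {
    if (hourlyCostUsd >= FAST_CRITICAL_USD) return { score: 100, severity: "critical", path: "fast" };
    if (hourlyCostUsd >= FAST_WARNING_USD) return { score: 50, severity: "warning", path: "fast" };
    return { score: 0, severity: "none", path: "fast" };
  }

  // With zero variance, treat any increase as a maximal deviation.
  const z =
    baseline.stddev > 0
      ? (hourlyCostUsd - baseline.mean) / baseline.stddev
      : hourlyCostUsd > baseline.mean ? Z_CAP : 0;

  // Only upward deviations count, and the contribution is capped at Z_CAP.
  const zPart = (Math.min(Math.max(z, 0), Z_CAP) / Z_CAP) * Z_WEIGHT;
  const noveltyPart =
    (input.novelInstanceType ? NOVELTY_PENALTY : 0) +
    (input.novelActor ? NOVELTY_PENALTY : 0);

  const total = zPart + noveltyPart;
  const severity: Severity = total >= 80 ? "critical" : total >= 50 ? "warning" : "none";
  return { score: total, severity, path: "statistical" };
}
```

Because the function is pure, each `it(...)` above maps to one call with synthetic inputs, which is what allows the math to be tested before any AWS API is mocked.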

3.3 Baseline Learning

describe('BaselineCalculator', () => {
  it('updates running mean and stddev using Welford algorithm', () => {});
  it('adds new actor to observed_actors set', () => {});
  it('marks baseline as mature when event_count > 20 and age_days > 14', () => {});
});
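
For reference, Welford's algorithm updates the mean and the sum of squared deviations one observation at a time, so no cost history has to be stored. The sketch below shows one way the three behaviors above might fit together; the `BaselineState` shape, the use of the sample (n - 1) standard deviation, and the function names are assumptions.

```typescript
// Hypothetical baseline record; field names are illustrative.
interface BaselineState {
  eventCount: number;
  mean: number;
  m2: number;            // running sum of squared deviations from the mean
  firstEventMs: number;  // timestamp of the first observed event
  observedActors: Set<string>;
}

const DAY_MS = 24 * 60 * 60 * 1000;

function emptyBaseline(firstEventMs: number): BaselineState {
  return { eventCount: 0, mean: 0, m2: 0, firstEventMs, observedActors: new Set<string>() };
}

// One Welford step: returns a new state without mutating the input.
function observe(state: BaselineState, hourlyCostUsd: number, actor: string): BaselineState {
  const eventCount = state.eventCount + 1;
  const delta = hourlyCostUsd - state.mean;
  const mean = state.mean + delta / eventCount;
  const m2 = state.m2 + delta * (hourlyCostUsd - mean);
  const observedActors = new Set(state.observedActors).add(actor);
  return { eventCount, mean, m2, firstEventMs: state.firstEventMs, observedActors };
}

// Sample standard deviation (divides by n - 1); an assumed choice.
function stddev(state: BaselineState): number {
  return state.eventCount < 2 ? 0 : Math.sqrt(state.m2 / (state.eventCount - 1));
}

// Mature when event_count > 20 and age_days > 14, as in the test name above.
function isMature(state: BaselineState, nowMs: number): boolean {
  const ageDays = (nowMs - state.firstEventMs) / DAY_MS;
  return state.eventCount > 20 && ageDays > 14;
}
```

A useful unit test feeds a fixed series through `observe` and compares the result against the mean and standard deviation computed directly from the full array.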

Section 4: Integration Test Strategy

4.1 DynamoDB Data Layer (Testcontainers)

describe('DynamoDB Single-Table Patterns', () => {
  it('writes CostEvent and updates Baseline in single transaction', async () => {});
  it('queries all anomalies for tenant within time range', async () => {});
  it('fetches tenant config and Slack tokens securely', async () => {});
});
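
The first test above, writing a CostEvent and updating its Baseline atomically, might build a request like the one sketched here. The object follows the `TransactItems` input shape accepted by the AWS SDK v3 document client's transactional write, but it is constructed as plain data so the example needs no SDK import. The key layout (`TENANT#...`, `EVENT#...`, `BASELINE#...`), the `event_count` attribute, and the function name are hypothetical.

```typescript
// Hypothetical item shape; the real single-table schema is not shown here.
interface CostEventItem {
  tenantId: string;
  eventId: string;
  timestamp: string; // ISO 8601
  service: string;
  hourlyCostUsd: number;
}

function buildIngestTransaction(tableName: string, event: CostEventItem) {
  const pk = `TENANT#${event.tenantId}`;
  return {
    TransactItems: [
      {
        Put: {
          TableName: tableName,
          Item: {
            PK: pk,
            SK: `EVENT#${event.timestamp}#${event.eventId}`,
            service: event.service,
            hourlyCostUsd: event.hourlyCostUsd,
          },
          // Reject duplicates so a redelivered event is not counted twice.
          ConditionExpression: "attribute_not_exists(PK)",
        },
      },
      {
        Update: {
          TableName: tableName,
          Key: { PK: pk, SK: `BASELINE#${event.service}` },
          // Atomic counter; the mean/stddev fields would be set in this same Update.
          UpdateExpression: "ADD event_count :one",
          ExpressionAttributeValues: { ":one": 1 },
        },
      },
    ],
  };
}
```

Keeping the request builder separate from the client call means its shape can be unit-tested directly, while the Testcontainers suite verifies that DynamoDB accepts it and that a duplicate event rolls back both writes.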

4.2 AWS API Contract Tests

describe('AWS Cross-Account Actions', () => {
  // Uses LocalStack to simulate target account
  it('assumes target account remediation role successfully', async () => {});
  it('executes ec2:StopInstances when remediation approved', async () => {});
  it('executes rds:DeleteDBInstance with skip-final-snapshot', async () => {});
});

Section 5: E2E & Smoke Tests

5.1 Critical User Journeys

Journey 1: Real-Time Anomaly Detection

Send synthetic RunInstances event to EventBridge (p9.16xlarge, $40/hr).
Verify system processes event and triggers fast-path (no baseline).
Verify Slack alert is generated with correct cost estimate.

Journey 2: Interactive Remediation

Send webhook simulating user clicking "Stop Instance" in Slack.
Verify API Gateway → Lambda executes StopInstances against LocalStack.
Verify Slack message updates to "Remediation Successful".

Section 6: Performance & Load Testing

describe('Ingestion Throughput', () => {
  it('processes 500 CloudTrail events/second via SQS FIFO', async () => {});
  it('DynamoDB baseline updates complete in <20ms p95', async () => {});
});
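
Asserting a p95 latency target requires a percentile over the collected samples. A small nearest-rank helper such as the following would be enough for the test above; the function name and the choice of the nearest-rank method are assumptions.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  const sorted = samplesMs.slice().sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}
```

A load test would record the duration of each baseline update in milliseconds and then assert that `percentile(durations, 95)` is below 20.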

Section 7: CI/CD Pipeline Integration

PR Gate: Unit tests (<2min), Coverage >85% (Scoring engine >95%).
Merge: Integration tests with LocalStack & Testcontainers DynamoDB.
Staging: E2E journeys against isolated staging AWS account.

Section 8: Transparent Factory Tenet Testing

8.1 Atomic Flagging (Circuit Breaker)

it('auto-disables scoring rule if it generates >10 alerts/hour for single tenant', () => {});
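
One possible shape for the breaker behind this test is a sliding one-hour window of alert timestamps per tenant and rule. In the sketch below the class name, the per-tenant scope of the disable, and the injected clock value are assumptions; the limit of 10 alerts per hour comes from the test above.

```typescript
const HOUR_MS = 60 * 60 * 1000;

class AlertCircuitBreaker {
  private readonly maxAlertsPerHour: number;
  private readonly recent = new Map<string, number[]>();
  private readonly disabled = new Set<string>();

  constructor(maxAlertsPerHour: number = 10) {
    this.maxAlertsPerHour = maxAlertsPerHour;
  }

  // Records one alert. Returns true while the rule stays enabled for the tenant.
  recordAlert(tenantId: string, ruleId: string, nowMs: number): boolean {
    const key = `${tenantId}#${ruleId}`;
    if (this.disabled.has(key)) return false;

    // Keep only the alerts that fall inside the trailing one-hour window.
    const windowStart = nowMs - HOUR_MS;
    const kept = (this.recent.get(key) ?? []).filter((t) => t > windowStart);
    kept.push(nowMs);
    this.recent.set(key, kept);

    // Trip once the count strictly exceeds the limit (">10 alerts/hour").
    if (kept.length > this.maxAlertsPerHour) {
      this.disabled.add(key);
      return false;
    }
    return true;
  }

  isEnabled(tenantId: string, ruleId: string): boolean {
    return !this.disabled.has(`${tenantId}#${ruleId}`);
  }
}
```

Passing `nowMs` in explicitly, instead of reading the system clock, lets the unit test simulate an hour of alerts without fake timers.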

8.2 Configurable Autonomy (14-Day Auto-Promotion)

it('keeps new tenant in strict mode (log-only) for first 14 days', () => {});
it('auto-promotes to audit mode (auto-alert) on day 15 if false-positive rate < 10%', () => {});
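
The promotion rule in these two tests reduces to a small pure function. The sketch below assumes a 1-based tenant day counter, that the check is re-evaluated on every day from day 15 onward, and that a false-positive rate of exactly 10% does not qualify; the names are hypothetical.

```typescript
type GovernanceMode = "strict" | "audit";

const PROMOTION_DAY = 15;             // first day on which promotion is allowed
const MAX_FALSE_POSITIVE_RATE = 0.1;  // promotion requires a rate strictly below this

// tenantDay is 1-based: the day the tenant onboards is day 1.
function resolveMode(tenantDay: number, falsePositiveRate: number): GovernanceMode {
  if (tenantDay < PROMOTION_DAY) return "strict";
  return falsePositiveRate < MAX_FALSE_POSITIVE_RATE ? "audit" : "strict";
}
```

Boundary cases worth pinning down in tests are day 14 versus day 15 and a rate of exactly 0.10.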

Section 9: Test Data & Fixtures

fixtures/
  cloudtrail/
    ec2-runinstances.json
    rds-create-db.json
    lambda-create-function.json
  baselines/
    mature-steady-spend.json
    volatile-dev-account.json
    cold-start.json

Section 10: TDD Implementation Order

Phase 1: Anomaly math + Unit tests (Strict TDD).
Phase 2: CloudTrail normalizer + Pricing tables.
Phase 3: DynamoDB single-table implementation (Integration led).
Phase 4: Slack formatting + Remediation Lambda.
Phase 5: Governance policies (14-day promotion logic).

End of dd0c/cost Test Architecture
