# dd0c/cost: Test Architecture & TDD Strategy

**Product:** dd0c/cost: AWS Cost Anomaly Detective  
**Author:** Test Architecture Phase  
**Date:** February 28, 2026  
**Status:** V1 MVP, Solo Founder Scope

---

## Section 1: Testing Philosophy & TDD Workflow

### 1.1 Core Philosophy

dd0c/cost sits at the intersection of **money and infrastructure**. A false negative means a customer loses thousands of dollars. A false positive means alert fatigue and churn. The test suite's primary job is to mathematically prove the anomaly scoring engine works across edge cases.

Guiding principle: **Test the math first, test the infrastructure second.** The Z-score and novelty algorithms must be exhaustively unit-tested with synthetic data before any AWS APIs are mocked.

### 1.2 Red-Green-Refactor Adapted to dd0c/cost

```
RED      → Write a failing test that asserts a specific Z-score and severity
           for a given historical baseline and new cost event.

GREEN    → Implement the scoring math to make it pass.

REFACTOR → Optimize the baseline lookup, extract novelty checks,
           refine the heuristic weights.
```

**When to write tests first (strict TDD):**
- Anomaly scoring engine (Z-scores, novelty checks, composite severity)
- Cold-start heuristics (fast-path for >$5/hr resources)
- Baseline calculation (moving averages, standard deviation)
- Governance policy (strict vs. audit mode, 14-day promotion)

**When integration tests lead:**
- CloudTrail ingestion (implement against LocalStack EventBridge, then lock in)
- DynamoDB Single-Table schema (build access patterns, then integration test)

**When E2E tests lead:**
- The Slack alert interaction (format block kit, test the "Snooze/Terminate" buttons)

### 1.3 Test Naming Conventions

```typescript
describe('AnomalyScorer', () => {
  it('assigns critical severity when Z-score > 3 and hourly cost > $1', () => {});
  it('flags actor novelty when IAM role has never launched this service', () => {});
  it('bypasses baseline and triggers fast-path critical for $10/hr instance', () => {});
});

describe('CloudTrailNormalizer', () => {
  it('extracts instance type and region from RunInstances event', () => {});
  it('looks up correct on-demand pricing for us-east-1 r6g.xlarge', () => {});
});
```

---

## Section 2: Test Pyramid

### 2.1 Ratio

| Level | Target | Count (V1) | Runtime |
|-------|--------|------------|---------|
| Unit | 70% | ~250 tests | <20s |
| Integration | 20% | ~80 tests | <3min |
| E2E/Smoke | 10% | ~15 tests | <5min |

### 2.2 Unit Test Targets

| Component | Key Behaviors | Est. Tests |
|-----------|--------------|------------|
| Event Normalizer | CloudTrail parsing, pricing lookup, deduplication | 40 |
| Baseline Engine | Running mean/stddev calculation, maturity checks | 35 |
| Anomaly Scorer | Z-score math, novelty detection, composite scoring | 50 |
| Remediation Handler | Stop/Terminate payload parsing, IAM role assumption logic | 20 |
| Notification Engine | Slack formatting, daily digest aggregation | 30 |
| Governance Policy | Mode enforcement, 14-day auto-promotion | 25 |
| Feature Flags | Circuit breaker on alert volume, flag metadata | 15 |

---

## Section 3: Unit Test Strategy

### 3.1 Cost Ingestion & Normalization

```typescript
describe('CloudTrailNormalizer', () => {
  it('normalizes EC2 RunInstances event to CostEvent schema', () => {});
  it('normalizes RDS CreateDBInstance event to CostEvent schema', () => {});
  it('extracts assumed role ARN as actor instead of base STS role', () => {});
  it('applies fallback pricing when instance type is not in static table', () => {});
  it('ignores non-cost-generating events (e.g., DescribeInstances)', () => {});
});
```

### 3.2 Anomaly Engine (The Math)

```typescript
describe('AnomalyScorer', () => {
  describe('Statistical Scoring (Z-Score)', () => {
    it('returns score=0 when event cost exactly matches baseline mean', () => {});
    it('returns proportional score for Z-scores between 1.0 and 3.0', () => {});
    it('caps Z-score contribution at max threshold', () => {});
  });

  describe('Novelty Scoring', () => {
    it('adds novelty penalty when instance type is first seen for account', () => {});
    it('adds novelty penalty when IAM user has never provisioned this service', () => {});
  });

  describe('Cold-Start Fast Path', () => {
    it('flags $5/hr instance as warning when baseline < 14 days', () => {});
    it('flags $25/hr instance as critical immediately, bypassing baseline', () => {});
    it('ignores $0.10/hr instances during cold-start learning period', () => {});
  });
});
```

### 3.3 Baseline Learning

```typescript
describe('BaselineCalculator', () => {
  it('updates running mean and stddev using Welford algorithm', () => {});
  it('adds new actor to observed_actors set', () => {});
  it('marks baseline as mature when event_count > 20 and age_days > 14', () => {});
});
```

---

## Section 4: Integration Test Strategy

### 4.1 DynamoDB Data Layer (Testcontainers)

```typescript
describe('DynamoDB Single-Table Patterns', () => {
  it('writes CostEvent and updates Baseline in single transaction', async () => {});
  it('queries all anomalies for tenant within time range', async () => {});
  it('fetches tenant config and Slack tokens securely', async () => {});
});
```

### 4.2 AWS API Contract Tests

```typescript
describe('AWS Cross-Account Actions', () => {
  // Uses LocalStack to simulate target account
  it('assumes target account remediation role successfully', async () => {});
  it('executes ec2:StopInstances when remediation approved', async () => {});
  it('executes rds:DeleteDBInstance with skip-final-snapshot', async () => {});
});
```

---

## Section 5: E2E & Smoke Tests

### 5.1 Critical User Journeys

**Journey 1: Real-Time Anomaly Detection**
1. Send synthetic `RunInstances` event to EventBridge (p9.16xlarge, $40/hr).
2. Verify system processes event and triggers fast-path (no baseline).
3. Verify Slack alert is generated with correct cost estimate.

**Journey 2: Interactive Remediation**
1. Send webhook simulating user clicking "Stop Instance" in Slack.
2. Verify API Gateway → Lambda executes `StopInstances` against LocalStack.
3. Verify Slack message updates to "Remediation Successful".

---

## Section 6: Performance & Load Testing

```typescript
describe('Ingestion Throughput', () => {
  it('processes 500 CloudTrail events/second via SQS FIFO', async () => {});
  it('DynamoDB baseline updates complete in <20ms p95', async () => {});
});
```

---

## Section 7: CI/CD Pipeline Integration

- **PR Gate:** Unit tests (<2min), Coverage >85% (Scoring engine >95%).
- **Merge:** Integration tests with LocalStack & Testcontainers DynamoDB.
- **Staging:** E2E journeys against isolated staging AWS account.

---

## Section 8: Transparent Factory Tenet Testing

### 8.1 Atomic Flagging (Circuit Breaker)

```typescript
it('auto-disables scoring rule if it generates >10 alerts/hour for single tenant', () => {});
```

### 8.2 Configurable Autonomy (14-Day Auto-Promotion)

```typescript
it('keeps new tenant in strict mode (log-only) for first 14 days', () => {});
it('auto-promotes to audit mode (auto-alert) on day 15 if false-positive rate < 10%', () => {});
```

---

## Section 9: Test Data & Fixtures

```
fixtures/
  cloudtrail/
    ec2-runinstances.json
    rds-create-db.json
    lambda-create-function.json
  baselines/
    mature-steady-spend.json
    volatile-dev-account.json
    cold-start.json
```

---

## Section 10: TDD Implementation Order

1. **Phase 1:** Anomaly math + Unit tests (Strict TDD).
2. **Phase 2:** CloudTrail normalizer + Pricing tables.
3. **Phase 3:** DynamoDB single-table implementation (Integration led).
4. **Phase 4:** Slack formatting + Remediation Lambda.
5. **Phase 5:** Governance policies (14-day promotion logic).

*End of dd0c/cost Test Architecture*
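
As a starting point for Phase 1, the Welford-based baseline update named in Section 3.3 and the Z-score check from Section 3.2 could be sketched roughly as below. This is a minimal illustration under stated assumptions, not the dd0c/cost implementation: the `Baseline` shape and the function names `emptyBaseline`, `updateBaseline`, `stddev`, and `zScore` are hypothetical, and returning 0 for a zero-variance baseline is one possible choice among several.

```typescript
// Hypothetical shape of a per-resource baseline; field names are assumptions.
interface Baseline {
  count: number; // number of cost events folded in so far
  mean: number;  // running mean of hourly cost (USD)
  m2: number;    // running sum of squared deviations from the mean
}

function emptyBaseline(): Baseline {
  return { count: 0, mean: 0, m2: 0 };
}

// Welford's online update: folds one observation into the baseline
// without storing the full event history.
function updateBaseline(b: Baseline, hourlyCost: number): Baseline {
  const count = b.count + 1;
  const delta = hourlyCost - b.mean;
  const mean = b.mean + delta / count;
  const m2 = b.m2 + delta * (hourlyCost - mean);
  return { count, mean, m2 };
}

// Sample standard deviation; 0 until at least two events have been seen.
function stddev(b: Baseline): number {
  return b.count > 1 ? Math.sqrt(b.m2 / (b.count - 1)) : 0;
}

// Z-score of a new hourly cost against the baseline.
// Assumption: with zero variance the Z-score is undefined, so this sketch
// returns 0 and leaves such cases to a separate cold-start fast path.
function zScore(b: Baseline, hourlyCost: number): number {
  const sd = stddev(b);
  return sd === 0 ? 0 : (hourlyCost - b.mean) / sd;
}
```

For example, folding three events at $10, $12, and $14 per hour into an empty baseline yields a mean of 12 and a sample standard deviation of 2, so a new $18/hr event scores Z = 3. Keeping the update pure (returning a new `Baseline` rather than mutating) makes it straightforward to assert exact values in the strict-TDD unit tests.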