dd0c/cost — Test Architecture & TDD Strategy
Version: 2.0
Date: February 28, 2026
Status: Authoritative
Audience: Founding engineer, future contributors
Guiding principle: A cost anomaly detector that misses a $3,000 GPU instance is worse than useless — it's a liability. A cost anomaly detector that cries wolf 40% of the time gets disabled. Tests are the only way to ship with confidence at solo-founder velocity.
Table of Contents
- Testing Philosophy & TDD Workflow
- Test Pyramid
- Unit Test Strategy
- Integration Test Strategy
- E2E & Smoke Tests
- Performance & Load Testing
- CI/CD Pipeline Integration
- Transparent Factory Tenet Testing
- Test Data & Fixtures
- TDD Implementation Order
1. Testing Philosophy & TDD Workflow
Red-Green-Refactor for dd0c/cost
TDD is non-negotiable for the anomaly scoring engine and baseline learning components. A scoring bug that ships to production means either missed anomalies (customers lose money) or false positives (customers disable the product). The cost of a test is minutes. The cost of a scoring bug is churn.
Where TDD is mandatory:
- src/scoring/: every scoring signal, composite calculation, and severity classification
- src/baseline/: all statistical operations (mean, stddev, rolling window, cold-start transitions)
- src/parsers/: every CloudTrail event parser (RunInstances, CreateDBInstance, etc.)
- src/pricing/: pricing lookup logic and cost estimation
- src/governance/: policy.json evaluation, auto-promotion logic, panic mode
Where TDD is recommended but not mandatory:
- src/notifier/: Slack Block Kit formatting (snapshot tests are sufficient)
- src/api/: REST handlers (contract tests cover these)
- src/infra/: CDK stacks (CDK assertions cover these)
Where tests follow implementation:
- src/onboarding/: CloudFormation URL generation, Cognito flows (integration tests only)
- src/slack/: OAuth flows, signature verification (integration tests)
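The statistical operations listed under src/baseline/ are pure functions, which is what makes them practical to build test-first. A minimal sketch, assuming a hypothetical `rollingStats` helper (the name, the window semantics, and the choice of population standard deviation are illustrative assumptions, not taken from the codebase):

```typescript
// Hypothetical helper for illustration only; not the project's actual API.
interface BaselineStats {
  mean: number;
  stddev: number; // population standard deviation (an assumption of this sketch)
  sampleCount: number;
}

// Computes mean and stddev over the most recent `windowSize` samples.
function rollingStats(samples: readonly number[], windowSize: number): BaselineStats {
  if (!Number.isInteger(windowSize) || windowSize <= 0) {
    throw new RangeError("windowSize must be a positive integer");
  }
  // Keep only the tail of the series; older samples fall out of the window.
  const window = samples.slice(-windowSize);
  if (window.length === 0) {
    return { mean: 0, stddev: 0, sampleCount: 0 };
  }
  const mean = window.reduce((sum, x) => sum + x, 0) / window.length;
  const variance =
    window.reduce((sum, x) => sum + (x - mean) ** 2, 0) / window.length;
  return { mean, stddev: Math.sqrt(variance), sampleCount: window.length };
}
```

Returning `sampleCount` alongside the statistics lets a caller decide whether the baseline has enough data to leave cold-start, without this function needing to know that threshold.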
The Red-Green-Refactor Cycle
RED:      Write a failing test that describes the desired behavior.
          Name it precisely: what component, what input, what expected output.
          Run it. Watch it fail. Confirm it fails for the right reason.
GREEN:    Write the minimum code to make the test pass.
          No gold-plating. No "while I'm here" refactors.
          Run the test. Watch it pass.
REFACTOR: Clean up the implementation without changing behavior.
          Extract constants. Rename variables. Simplify logic.
          Tests must still pass after every refactor step.
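One pass through the cycle might look like the sketch below. `zScore` and `expectEqual` are hypothetical names chosen for illustration, and the tiny assertion helper stands in for whichever test runner the project uses.

```typescript
// Stand-in for a test runner's assertion; hypothetical helper for this sketch.
function expectEqual(actual: number, expected: number, label: string): void {
  if (actual !== expected) {
    throw new Error(`${label}: expected ${expected}, got ${actual}`);
  }
}

// RED: written first. Before zScore exists this does not compile, let alone pass,
// which is the "fails for the right reason" check.
function testZScore(): void {
  expectEqual(zScore(16, 10, 2), 3, "zScore returns 3 when value is 3 stddevs above the mean");
  expectEqual(zScore(10, 10, 2), 0, "zScore returns 0 when value equals the mean");
  expectEqual(zScore(4, 10, 2), -3, "zScore returns -3 when value is 3 stddevs below the mean");
}

// GREEN: the minimum implementation that satisfies the test above.
// A zero stddev is deliberately not handled yet; under TDD that case gets
// its own failing test before any code is written for it.
function zScore(value: number, mean: number, stddev: number): number {
  return (value - mean) / stddev;
}

// REFACTOR would happen here (renames, extracted constants), re-running
// testZScore after each step to confirm behavior is unchanged.
testZScore();
```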
Test Naming Convention
All tests follow the pattern: [unit under test] [scenario] [expected outcome]
// ✅ Good: precise, readable, searchable
describe('scoreAnomaly', () => {
  it('returns critical severity when z-score exceeds 5.0 and instance type is novel', () => {});
  it('returns none severity when account is in cold-start and cost is below $0.50/hr', () => {});
  it('returns warning severity when actor is novel but cost is within 2 standard deviations', () => {});
  it('compounds severity when multiple signals fire simultaneously', () => {});
});

// ❌ Bad: vague, not searchable
describe('scoring', () => {
  it('works correctly', () => {});
  it('handles edge cases', () => {});
});
Decision Log Requirement
Per Transparent Factory tenet (Story 10.3), any PR touching src/scoring/, src/baseline/, or src/detection/ must include a docs/decisions/<YYYY-MM-DD>-<slug>.json file. The test suite validates this in CI.
{
  "prompt": "Should Z-score threshold be 2.5 or 3.0?",
  "reasoning": "At 2.5, false positive rate in design partner data was 28%. At 3.0, it dropped to 18% with only 2 additional missed true positives over 30 days.",
  "alternatives_considered": ["2.0 (too noisy)", "3.5 (misses too many real anomalies)"],
  "confidence": "medium",
  "timestamp": "2026-02-28T10:00:00Z",
  "author": "brian"
}
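The CI validation could be sketched roughly as follows. The field names come from the example record above; the function name `validateDecisionLog`, the allowed confidence values, and the lowercase kebab-case slug rule are assumptions made for illustration.

```typescript
// Hypothetical validator; not the project's actual CI check.
const REQUIRED_STRING_FIELDS = ["prompt", "reasoning", "confidence", "timestamp", "author"] as const;
// Assumed vocabulary: the source only shows "medium".
const CONFIDENCE_LEVELS: string[] = ["low", "medium", "high"];
// Assumed slug format: lowercase alphanumerics separated by single hyphens.
const FILENAME_PATTERN = /^docs\/decisions\/\d{4}-\d{2}-\d{2}-[a-z0-9]+(?:-[a-z0-9]+)*\.json$/;

// Returns a list of human-readable problems; an empty list means the file is valid.
function validateDecisionLog(path: string, raw: string): string[] {
  const errors: string[] = [];
  if (!FILENAME_PATTERN.test(path)) {
    errors.push(`path does not match docs/decisions/<YYYY-MM-DD>-<slug>.json: ${path}`);
  }
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return [...errors, "file is not valid JSON"];
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return [...errors, "top-level value must be a JSON object"];
  }
  const record = parsed as Record<string, unknown>;
  for (const field of REQUIRED_STRING_FIELDS) {
    const value = record[field];
    if (typeof value !== "string" || value.trim() === "") {
      errors.push(`missing or empty string field: ${field}`);
    }
  }
  const alternatives = record["alternatives_considered"];
  if (!Array.isArray(alternatives) || !alternatives.every((a) => typeof a === "string")) {
    errors.push("alternatives_considered must be an array of strings");
  }
  const confidence = record["confidence"];
  if (typeof confidence === "string" && !CONFIDENCE_LEVELS.includes(confidence)) {
    errors.push(`confidence must be one of: ${CONFIDENCE_LEVELS.join(", ")}`);
  }
  const timestamp = record["timestamp"];
  if (typeof timestamp === "string" && Number.isNaN(Date.parse(timestamp))) {
    errors.push("timestamp is not a parseable date");
  }
  return errors;
}
```

Returning every problem at once, rather than failing on the first, means a contributor can fix a rejected decision log in a single round trip.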