Implement review remediation + PLG analytics SDK

- All 6 test architectures patched with Section 11 addendums
- P5 (cost) fully rewritten from 232 to ~600 lines
- PLG brainstorm + party mode advisory board results
- Analytics SDK v2 (PostHog Cloud, Zod strict, Lambda-safe)
- Analytics tests v2 (safeParse, no , no timestamp, no PII)
- Addresses all Gemini review findings across P1-P6
2026-03-01 01:42:49 +00:00
parent 2fe0ed856e
commit 03bfe931fc
9 changed files with 2950 additions and 85 deletions


@@ -1,8 +1,8 @@
# dd0c/cost — Test Architecture & TDD Strategy
**Product:** dd0c/cost — AWS Cost Anomaly Detective
**Author:** Test Architecture Phase
**Date:** February 28, 2026
**Author:** Test Architecture Phase (v2 — Post-Review Rewrite)
**Date:** March 1, 2026
**Status:** V1 MVP — Solo Founder Scope
---
@@ -13,7 +13,9 @@
dd0c/cost sits at the intersection of **money and infrastructure**. A false negative means a customer loses thousands of dollars. A false positive means alert fatigue and churn. The test suite's primary job is to mathematically prove the anomaly scoring engine works across edge cases.
Guiding principle: **Test the math first, test the infrastructure second.** The Z-score and novelty algorithms must be exhaustively unit-tested with synthetic data before any AWS APIs are mocked.
Guiding principle: **Test the math first, test the infrastructure second.** The Z-score and novelty algorithms must be exhaustively tested with property-based testing before any AWS APIs are mocked.
Second principle: **Every dollar matters.** Cost calculations involve floating-point arithmetic on money. Rounding errors, precision loss, and currency handling must be tested with the same rigor as a financial system.
### 1.2 Red-Green-Refactor Adapted to dd0c/cost
@@ -28,33 +30,52 @@ REFACTOR → Optimize the baseline lookup, extract novelty checks,
```
**When to write tests first (strict TDD):**
- Anomaly scoring engine (Z-scores, novelty checks, composite severity)
- Cold-start heuristics (fast-path for >$5/hr resources)
- Baseline calculation (moving averages, standard deviation)
- Governance policy (strict vs. audit mode, 14-day promotion)
- All anomaly scoring (Z-scores, novelty checks, composite severity)
- All cold-start heuristics (fast-path for >$5/hr resources)
- All baseline calculation (Welford algorithm, maturity transitions)
- All governance policy (strict vs. audit mode, 14-day auto-promotion, panic mode)
- All Slack signature validation (security-critical)
- All cost calculations (pricing lookup, hourly cost estimation)
- All feature flag circuit breakers
**When integration tests lead:**
- CloudTrail ingestion (implement against LocalStack EventBridge, then lock in)
- DynamoDB Single-Table schema (build access patterns, then integration test)
- Cross-account STS role assumption (test against LocalStack)
**When E2E tests lead:**
- The Slack alert interaction (format block kit, test the "Snooze/Terminate" buttons)
- Slack alert interaction (format block kit, test "Snooze/Terminate" buttons)
- Onboarding wizard (CloudFormation quick-create → role validation → first alert)
### 1.3 Test Naming Conventions
```typescript
// Unit tests
describe('AnomalyScorer', () => {
it('assigns critical severity when Z-score > 3 and hourly cost > $1', () => {});
it('flags actor novelty when IAM role has never launched this service', () => {});
it('assigns critical severity when Z-score exceeds 3 and hourly cost exceeds $1', () => {});
it('flags actor novelty when IAM role has never launched this service type', () => {});
it('bypasses baseline and triggers fast-path critical for $10/hr instance', () => {});
});
describe('CloudTrailNormalizer', () => {
it('extracts instance type and region from RunInstances event', () => {});
it('looks up correct on-demand pricing for us-east-1 r6g.xlarge', () => {});
describe('BaselineCalculator', () => {
it('updates running mean using Welford online algorithm', () => {});
it('handles zero standard deviation without division by zero', () => {});
});
// Property-based tests
describe('AnomalyScorer (property-based)', () => {
it('always returns severity between 0 and 100 for any valid input', () => {});
it('monotonically increases score as Z-score increases', () => {});
it('never assigns critical to events below $0.50/hr regardless of Z-score', () => {});
});
```
**Rules:**
- Describe the observable outcome, not the implementation
- Use present tense
- If you need "and" in the name, split into two tests
- Property-based tests explicitly state the invariant
---
## Section 2: Test Pyramid
@@ -63,93 +84,441 @@ describe('CloudTrailNormalizer', () => {
| Level | Target | Count (V1) | Runtime |
|-------|--------|------------|---------|
| Unit | 70% | ~250 tests | <20s |
| Integration | 20% | ~80 tests | <3min |
| E2E/Smoke | 10% | ~15 tests | <5min |
| Unit | 80% | ~350 tests | <25s |
| Integration | 15% | ~65 tests | <4min |
| E2E/Smoke | 5% | ~15 tests | <8min |
Higher unit ratio than other dd0c products because the core value is pure math (scoring, baselines, Z-scores).
### 2.2 Unit Test Targets
| Component | Key Behaviors | Est. Tests |
|-----------|--------------|------------|
| Event Normalizer | CloudTrail parsing, pricing lookup, deduplication | 40 |
| Baseline Engine | Running mean/stddev calculation, maturity checks | 35 |
| Anomaly Scorer | Z-score math, novelty detection, composite scoring | 50 |
| Remediation Handler | Stop/Terminate payload parsing, IAM role assumption logic | 20 |
| Notification Engine | Slack formatting, daily digest aggregation | 30 |
| Governance Policy | Mode enforcement, 14-day auto-promotion | 25 |
| Feature Flags | Circuit breaker on alert volume, flag metadata | 15 |
| CloudTrail Normalizer | Event parsing, pricing lookup, dedup, field extraction | 40 |
| Baseline Engine | Welford algorithm, maturity transitions, feedback loop | 45 |
| Anomaly Scorer | Z-score, novelty, composite scoring, cold-start fast-path | 60 |
| Zombie Hunter | Idle resource detection, cost estimation, age calculation | 25 |
| Notification Formatter | Slack Block Kit, daily digest, CLI command generation | 30 |
| Slack Bot | Command parsing, signature validation, action handling | 25 |
| Remediation Handler | Stop/Terminate logic, IAM role assumption, snooze/dismiss | 20 |
| Dashboard API | CRUD, tenant isolation, pagination, filtering | 25 |
| Governance Policy | Mode enforcement, 14-day promotion, panic mode | 30 |
| Feature Flags | Circuit breaker, flag lifecycle, local evaluation | 15 |
| Onboarding | CFN template validation, role validation, free tier enforcement | 20 |
| Cost Calculations | Pricing precision, rounding, fallback pricing, currency | 15 |
### 2.3 Integration Test Boundaries
| Boundary | What's Tested | Infrastructure |
|----------|--------------|----------------|
| EventBridge → SQS FIFO | Cross-account event routing, dedup, ordering | LocalStack |
| SQS → Event Processor Lambda | Batch processing, error handling, DLQ routing | LocalStack |
| Event Processor → DynamoDB | CostEvent writes, baseline updates, transactions | Testcontainers DynamoDB Local |
| Anomaly Scorer → DynamoDB | Baseline reads, anomaly record writes | Testcontainers DynamoDB Local |
| Notifier → Slack API | Block Kit delivery, rate limiting, message updates | WireMock |
| API Gateway → Lambda | Auth (Cognito JWT), routing, throttling | LocalStack |
| STS → Customer Account | Cross-account role assumption, ExternalId validation | LocalStack |
| CDK Synth | Infrastructure snapshot, resource policy validation | CDK assertions |
### 2.4 E2E/Smoke Scenarios
1. **Real-Time Anomaly Detection**: CloudTrail event → scoring → Slack alert (<30s)
2. **Interactive Remediation**: Slack button click → StopInstances → message update
3. **Onboarding Flow**: Signup → CFN deploy → role validation → first alert
4. **14-Day Auto-Promotion**: Simulate 14 days → verify strict→audit transition
5. **Zombie Hunter**: Daily scan → detect idle EC2 → Slack digest
6. **Panic Mode**: Enable panic → all alerting stops → anomalies still logged
---
## Section 3: Unit Test Strategy
### 3.1 Cost Ingestion & Normalization
### 3.1 CloudTrail Normalizer
```typescript
describe('CloudTrailNormalizer', () => {
it('normalizes EC2 RunInstances event to CostEvent schema', () => {});
it('normalizes RDS CreateDBInstance event to CostEvent schema', () => {});
it('extracts assumed role ARN as actor instead of base STS role', () => {});
it('applies fallback pricing when instance type is not in static table', () => {});
it('ignores non-cost-generating events (e.g., DescribeInstances)', () => {});
describe('Event Parsing', () => {
it('normalizes EC2 RunInstances to CostEvent schema', () => {});
it('normalizes RDS CreateDBInstance to CostEvent schema', () => {});
it('normalizes Lambda CreateFunction to CostEvent schema', () => {});
it('extracts assumed role ARN as actor (not base STS role)', () => {});
it('extracts instance type, region, and AZ from event detail', () => {});
it('handles batched RunInstances (multiple instances in one call)', () => {});
it('ignores non-cost-generating events (DescribeInstances, ListBuckets)', () => {});
it('handles malformed CloudTrail JSON without crashing', () => {});
it('handles missing optional fields gracefully', () => {});
});
describe('Pricing Lookup', () => {
it('looks up correct on-demand price for us-east-1 m5.xlarge', () => {});
it('looks up correct on-demand price for us-west-2 r6g.2xlarge', () => {});
it('applies fallback pricing when instance type not in static table', () => {});
it('returns $0 for instance types with no pricing data and logs warning', () => {});
it('handles GPU instances (p4d, g5) with correct pricing', () => {});
});
describe('Deduplication', () => {
it('generates deterministic fingerprint from eventID', () => {});
it('detects duplicate CloudTrail events by eventID', () => {});
it('allows same resource type from different events', () => {});
});
describe('Cost Precision', () => {
it('calculates hourly cost with 4 decimal places', () => {});
it('rounds consistently (banker rounding) to avoid accumulation errors', () => {});
it('handles sub-cent costs for Lambda invocations', () => {});
});
});
```
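The "Cost Precision" cases above name banker's rounding at 4 decimal places. The following is a minimal sketch of round-half-to-even, assuming costs are carried as integer micro-dollars so that ties can be detected exactly. The representation and the function name `roundHalfEvenToTenThousandths` are assumptions made for illustration, not part of the documented design.

```typescript
// Assumption: costs are integer micro-dollars (1 USD = 1_000_000), so a tie
// at the fifth decimal place is an exact integer comparison, not a float one.
function roundHalfEvenToTenThousandths(microDollars: number): number {
  const unit = 100; // 100 micro-dollars = $0.0001 (4 decimal places)
  const quotient = Math.floor(microDollars / unit);
  const remainder = microDollars - quotient * unit;

  if (remainder > unit / 2) return (quotient + 1) * unit; // round up
  if (remainder < unit / 2) return quotient * unit;       // round down

  // Exact tie: pick the even multiple so errors do not accumulate one way.
  return (quotient % 2 === 0 ? quotient : quotient + 1) * unit;
}

// $0.00015 and $0.00025 both round to $0.0002 under half-even.
const roundedOdd = roundHalfEvenToTenThousandths(150);
const roundedEven = roundHalfEvenToTenThousandths(250);
```

Working in integers sidesteps the usual problem that a value such as 0.00015 has no exact binary floating-point representation, which would make tie detection unreliable.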
### 3.2 Anomaly Engine (The Math)
### 3.2 Anomaly Scorer
The most critical component. Uses property-based testing via `fast-check`.
```typescript
describe('AnomalyScorer', () => {
describe('Statistical Scoring (Z-Score)', () => {
it('returns score=0 when event cost exactly matches baseline mean', () => {});
describe('Z-Score Calculation', () => {
it('returns 0 when event cost exactly matches baseline mean', () => {});
it('returns proportional score for Z-scores between 1.0 and 3.0', () => {});
it('caps Z-score contribution at max threshold', () => {});
it('caps Z-score contribution at configurable max threshold', () => {});
it('handles zero standard deviation without division by zero', () => {});
it('handles single data point baseline (stddev undefined)', () => {});
it('handles extremely large values without float overflow', () => {});
it('handles negative cost delta (cost decrease) as non-anomalous', () => {});
});
describe('Novelty Scoring', () => {
it('adds novelty penalty when instance type is first seen for account', () => {});
it('adds novelty penalty when IAM user has never provisioned this service', () => {});
it('adds instance novelty penalty when type first seen for account', () => {});
it('adds actor novelty penalty when IAM role is new', () => {});
it('does not penalize known instance type + known actor', () => {});
it('weights instance novelty higher than actor novelty', () => {});
});
describe('Composite Scoring', () => {
it('combines Z-score + novelty into composite severity', () => {});
it('classifies composite < 30 as info', () => {});
it('classifies composite 30-60 as warning', () => {});
it('classifies composite > 60 as critical', () => {});
it('never assigns critical to events below $0.50/hr', () => {});
});
describe('Cold-Start Fast Path', () => {
it('flags $5/hr instance as warning when baseline < 14 days', () => {});
it('flags $25/hr instance as critical immediately, bypassing baseline', () => {});
it('ignores $0.10/hr instances during cold-start learning period', () => {});
it('ignores $0.10/hr instances during cold-start learning', () => {});
it('fast-path is always on — not behind a feature flag', () => {});
it('transitions from fast-path to statistical scoring at maturity', () => {});
});
describe('Feedback Loop', () => {
it('reduces score for resources marked as expected', () => {});
it('adds actor to expected list after mark-as-expected', () => {});
it('still flags expected actor if cost is 10x above baseline', () => {});
});
describe('Property-Based Tests (fast-check)', () => {
it('score is always between 0 and 100 for any valid input', () => {
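// fc.assert(fc.property(
// // noNaN / noDefaultInfinity keep the generated inputs "valid": finite, non-negative numbers
// fc.record({ cost: fc.float({ min: 0, noNaN: true, noDefaultInfinity: true }), mean: fc.float({ min: 0, noNaN: true, noDefaultInfinity: true }), stddev: fc.float({ min: 0, noNaN: true, noDefaultInfinity: true }) }),
// (input) => { const score = scorer.score(input); return score >= 0 && score <= 100; }
// ))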
});
it('score monotonically increases as cost increases (baseline fixed)', () => {});
it('score monotonically increases as Z-score increases', () => {});
it('cold-start fast-path always triggers for cost > $25/hr', () => {});
it('mature baseline never uses fast-path thresholds', () => {});
});
});
```
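The scorer cases above can be read as a small pure function. The sketch below is one possible shape, not the project's implementation. The 30/60 severity bands, the $0.50/hr critical floor, and the $5 and $25 fast-path levels come from the test names. The Z-score cap `MAX_Z`, the 70/20/10 point weights, and the treatment of zero standard deviation as a maximal Z-score are assumptions chosen for illustration.

```typescript
type Severity = 'info' | 'warning' | 'critical';

interface Baseline {
  mean: number;    // mean hourly cost in USD
  stdDev: number;  // standard deviation of hourly cost
  mature: boolean; // false while the baseline is in cold-start or learning
}

const MAX_Z = 5;                     // hypothetical cap on the Z-score contribution
const CRITICAL_FLOOR_PER_HOUR = 0.5; // from "never assigns critical below $0.50/hr"

function zScore(cost: number, b: Baseline): number {
  const delta = cost - b.mean;
  if (delta <= 0) return 0;          // matching or decreasing cost is non-anomalous
  if (!(b.stdDev > 0)) return MAX_Z; // zero or undefined stddev: never divide by it
  return Math.min(delta / b.stdDev, MAX_Z);
}

function compositeScore(z: number, newInstanceType: boolean, newActor: boolean): number {
  const zPoints = (Math.min(Math.max(z, 0), MAX_Z) / MAX_Z) * 70;
  // Instance novelty is weighted higher than actor novelty.
  const novelty = (newInstanceType ? 20 : 0) + (newActor ? 10 : 0);
  return Math.min(100, zPoints + novelty);
}

function classify(score: number, hourlyCost: number): Severity {
  if (score < 30) return 'info';
  if (score <= 60 || hourlyCost < CRITICAL_FLOOR_PER_HOUR) return 'warning';
  return 'critical';
}

// Returns null when a cold-start event is too cheap to report.
function scoreEvent(
  cost: number,
  b: Baseline,
  newInstanceType: boolean,
  newActor: boolean,
): Severity | null {
  if (!b.mature) {
    if (cost >= 25) return 'critical'; // fast-path, bypasses the baseline
    if (cost >= 5) return 'warning';
    return null;
  }
  return classify(compositeScore(zScore(cost, b), newInstanceType, newActor), cost);
}
```

Keeping each step as its own function maps directly onto the `describe` groups above, and makes the monotonicity and range invariants easy to state as properties.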
### 3.3 Baseline Learning
### 3.3 Baseline Engine
```typescript
describe('BaselineCalculator', () => {
it('updates running mean and stddev using Welford algorithm', () => {});
it('adds new actor to observed_actors set', () => {});
it('marks baseline as mature when event_count > 20 and age_days > 14', () => {});
describe('Welford Online Algorithm', () => {
it('updates running mean correctly after each observation', () => {});
it('updates running variance correctly after each observation', () => {});
it('produces correct stddev after 100 observations', () => {});
it('handles first observation (count=1, stddev=0)', () => {});
it('handles identical observations (stddev=0)', () => {});
it('handles catastrophic cancellation with large values', () => {
// Welford is numerically stable — verify this property
});
});
describe('Maturity Transitions', () => {
it('starts in cold-start state', () => {});
it('transitions to learning after 5 events', () => {});
it('transitions to mature after 20 events AND 14 days', () => {});
it('does not mature with 100 events but only 3 days', () => {});
it('does not mature with 14 days but only 5 events', () => {});
});
describe('Actor & Instance Tracking', () => {
it('adds new actor to observed_actors set', () => {});
it('adds new instance type to observed_types set', () => {});
it('does not duplicate existing actors', () => {});
});
describe('Property-Based Tests', () => {
it('mean converges to true mean as observations increase', () => {});
it('variance is always non-negative', () => {});
it('stddev equals sqrt(variance) within float tolerance', () => {});
});
});
```
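For reference, the Welford update that these cases exercise can be sketched as follows. The state shape (`count`, `mean`, `m2`) and the function names are illustrative, and the choice of sample variance (an `n - 1` denominator) rather than population variance is an assumption, since the document does not say which the engine uses.

```typescript
interface WelfordState {
  count: number; // number of observations so far
  mean: number;  // running mean
  m2: number;    // running sum of squared deviations from the mean
}

const emptyWelford = (): WelfordState => ({ count: 0, mean: 0, m2: 0 });

// One online update step; never stores the observations themselves.
function observe(s: WelfordState, x: number): WelfordState {
  const count = s.count + 1;
  const delta = x - s.mean;             // deviation from the old mean
  const mean = s.mean + delta / count;
  const m2 = s.m2 + delta * (x - mean); // uses both old and new mean
  return { count, mean, m2 };
}

// Sample variance; defined as 0 for fewer than two observations so that
// the first observation yields stddev = 0 instead of NaN.
function variance(s: WelfordState): number {
  return s.count < 2 ? 0 : s.m2 / (s.count - 1);
}

function stdDev(s: WelfordState): number {
  return Math.sqrt(variance(s));
}

// Large, nearly equal values are where the naive sum-of-squares formula
// loses precision; the incremental form stays accurate here.
const large = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16].reduce(observe, emptyWelford());
```

Because `m2` only ever grows by a product of two deviations with the same sign, the variance cannot go negative, which is the invariant the property-based cases check.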
### 3.4 Zombie Hunter
```typescript
describe('ZombieHunter', () => {
it('detects EC2 instance running >7 days with <5% CPU utilization', () => {});
it('detects RDS instance with 0 connections for >3 days', () => {});
it('detects unattached EBS volumes older than 7 days', () => {});
it('calculates cumulative waste cost for each zombie', () => {});
it('excludes instances tagged dd0c:ignore', () => {});
it('handles API pagination for accounts with 500+ instances', () => {});
it('respects read-only IAM permissions (never modifies resources)', () => {});
});
```
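The detection rules named in these cases can be expressed as a pure classifier over already-fetched resource data, which keeps the hunter read-only by construction. This is a sketch under stated assumptions: the `Resource` shape, the field names, and the waste formula (hourly cost times 24 times idle days) are hypothetical. Only the thresholds and the `dd0c:ignore` tag come from the test names.

```typescript
interface ResourceBase {
  id: string;
  hourlyCost: number; // USD per hour
  tags: Record<string, string>;
}

type Resource =
  | (ResourceBase & { kind: 'ec2'; runningDays: number; avgCpuPercent: number })
  | (ResourceBase & { kind: 'rds'; daysWithoutConnections: number })
  | (ResourceBase & { kind: 'ebs'; attached: boolean; ageDays: number });

interface Zombie {
  id: string;
  idleDays: number;
  wasteUsd: number; // cumulative estimated waste
}

// Returns the number of idle days, or 0 when the resource is not a zombie.
function idleDays(r: Resource): number {
  switch (r.kind) {
    case 'ec2':
      return r.runningDays > 7 && r.avgCpuPercent < 5 ? r.runningDays : 0;
    case 'rds':
      return r.daysWithoutConnections > 3 ? r.daysWithoutConnections : 0;
    case 'ebs':
      return !r.attached && r.ageDays > 7 ? r.ageDays : 0;
  }
}

function findZombies(resources: Resource[]): Zombie[] {
  const zombies: Zombie[] = [];
  for (const r of resources) {
    if ('dd0c:ignore' in r.tags) continue; // explicit opt-out tag
    const days = idleDays(r);
    if (days > 0) {
      zombies.push({ id: r.id, idleDays: days, wasteUsd: r.hourlyCost * 24 * days });
    }
  }
  return zombies;
}
```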
### 3.5 Notification Formatter
```typescript
describe('NotificationFormatter', () => {
describe('Slack Block Kit', () => {
it('formats EC2 anomaly with resource type, region, cost, actor', () => {});
it('formats RDS anomaly with engine, storage, multi-AZ status', () => {});
it('includes "Why this alert" section with anomaly signals', () => {});
it('includes suggested CLI commands for remediation', () => {});
it('includes Snooze/Mark Expected/Stop Instance buttons', () => {});
it('generates correct aws ec2 stop-instances command', () => {});
it('generates correct aws rds stop-db-instance command', () => {});
});
describe('Daily Digest', () => {
it('aggregates 24h of anomalies into summary stats', () => {});
it('includes total estimated spend across all accounts', () => {});
it('highlights top 3 costliest anomalies', () => {});
it('includes zombie resource count and waste estimate', () => {});
it('shows baseline learning progress for new accounts', () => {});
});
});
```
### 3.6 Slack Bot
```typescript
describe('SlackBot', () => {
describe('Signature Validation', () => {
it('validates correct Slack request signature (HMAC-SHA256)', () => {});
it('rejects request with invalid signature', () => {});
it('rejects request with missing X-Slack-Signature header', () => {});
it('rejects request with expired timestamp (>5 min)', () => {});
it('uses timing-safe comparison to prevent timing attacks', () => {});
});
describe('Command Parsing', () => {
it('routes /dd0c status to status handler', () => {});
it('routes /dd0c anomalies to anomaly list handler', () => {});
it('routes /dd0c digest to digest handler', () => {});
it('returns help text for unknown commands', () => {});
it('responds within 3 seconds or defers with 200 OK', () => {});
});
describe('Interactive Actions', () => {
it('validates interactive payload signature', () => {});
it('handles mark_expected action and updates baseline', () => {});
it('handles snooze_1h action and sets snoozeUntil', () => {});
it('handles snooze_24h action', () => {});
it('updates original Slack message after action', () => {});
it('rejects action from user not in authorized workspace', () => {});
});
});
```
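The signature validation cases follow Slack's documented signing scheme: the base string is `v0:{timestamp}:{raw body}`, it is signed with HMAC-SHA256 using the app's signing secret, and the result is compared against the `X-Slack-Signature` header. A minimal sketch using Node's built-in `crypto` module is shown here. The function name and parameter order are illustrative, and the injectable `nowSeconds` parameter is an assumption added to make the expiry case testable.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

const MAX_SKEW_SECONDS = 5 * 60; // reject requests older than 5 minutes

function verifySlackSignature(
  signingSecret: string,
  timestampHeader: string | undefined, // X-Slack-Request-Timestamp
  signatureHeader: string | undefined, // X-Slack-Signature
  rawBody: string,                     // body exactly as received, before parsing
  nowSeconds: number = Math.floor(Date.now() / 1000),
): boolean {
  if (!timestampHeader || !signatureHeader) return false;

  const timestamp = Number(timestampHeader);
  if (!Number.isFinite(timestamp)) return false;
  if (Math.abs(nowSeconds - timestamp) > MAX_SKEW_SECONDS) return false;

  const expected =
    'v0=' +
    createHmac('sha256', signingSecret)
      .update(`v0:${timestampHeader}:${rawBody}`)
      .digest('hex');

  const a = Buffer.from(expected, 'utf8');
  const b = Buffer.from(signatureHeader, 'utf8');
  // timingSafeEqual throws on length mismatch, so check length first.
  return a.length === b.length && timingSafeEqual(a, b);
}
```

The raw body must be the unparsed request payload. Re-serializing parsed JSON or form data can change byte order or escaping and will produce a different HMAC.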
### 3.7 Governance Policy Engine
```typescript
describe('GovernancePolicy', () => {
describe('Mode Enforcement', () => {
it('strict mode: logs anomaly but does not send Slack alert', () => {});
it('audit mode: sends Slack alert with full logging', () => {});
it('defaults new accounts to strict mode', () => {});
});
describe('14-Day Auto-Promotion', () => {
it('does not promote account with <14 days of baseline', () => {});
it('does not promote account with >10% false-positive rate', () => {});
it('promotes account on day 15 if FP rate <10%', () => {});
it('calculates false-positive rate from mark-as-expected actions', () => {});
it('auto-promotion check runs daily via cron', () => {});
});
describe('Panic Mode', () => {
it('stops all alerting when panic=true', () => {});
it('continues scoring and logging during panic', () => {});
it('activates in <1 second via Redis key', () => {});
it('activatable via POST /admin/panic', () => {});
it('dashboard API returns "alerting paused" header during panic', () => {});
});
describe('Per-Account Override', () => {
it('account can set stricter mode than system default', () => {});
it('account cannot downgrade from system strict to audit', () => {});
it('merge logic: max_restrictive(system, account)', () => {});
});
describe('Policy Decision Logging', () => {
it('logs "suppressed by strict mode" with anomaly context', () => {});
it('logs "auto-promoted to audit mode" with baseline stats', () => {});
it('logs "panic mode active — alerting paused"', () => {});
});
});
```
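The mode, promotion, and panic rules in these cases reduce to a few small pure functions. The sketch below assumes that strict (log-only) is the more restrictive mode and audit (alerts sent) the less restrictive one, which matches the cases above. The `AccountStats` shape and the choice to treat an account with zero anomalies as having a 0% false-positive rate are assumptions made for illustration.

```typescript
type Mode = 'strict' | 'audit';

// Higher number = more restrictive. strict only logs; audit also alerts.
const RESTRICTIVENESS: Record<Mode, number> = { audit: 0, strict: 1 };

// An account may tighten the system default but never loosen it.
function maxRestrictive(system: Mode, account: Mode): Mode {
  return RESTRICTIVENESS[account] > RESTRICTIVENESS[system] ? account : system;
}

interface AccountStats {
  baselineDays: number;   // days since the baseline started
  anomalies: number;      // anomalies logged in that period
  markedExpected: number; // anomalies the user marked as expected
}

function falsePositiveRate(s: AccountStats): number {
  return s.anomalies === 0 ? 0 : s.markedExpected / s.anomalies;
}

// Promotion from strict to audit needs both enough history and a low FP rate.
function shouldPromote(s: AccountStats): boolean {
  return s.baselineDays >= 14 && falsePositiveRate(s) < 0.1;
}

// Panic mode pauses alerting only; scoring and logging are decided elsewhere.
function shouldSendAlert(mode: Mode, panic: boolean): boolean {
  return !panic && mode === 'audit';
}
```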
### 3.8 Dashboard API
```typescript
describe('DashboardAPI', () => {
describe('Account Management', () => {
it('GET /v1/accounts returns connected accounts for tenant', () => {});
it('DELETE /v1/accounts/:id marks account as disconnecting', () => {});
it('returns 401 without valid Cognito JWT', () => {});
it('scopes all queries to authenticated tenantId', () => {});
});
describe('Anomaly Listing', () => {
it('GET /v1/anomalies returns recent anomalies', () => {});
it('supports since, status, severity filters', () => {});
it('implements cursor-based pagination', () => {});
it('includes slackMessageUrl when alert was sent', () => {});
});
describe('Baseline Overrides', () => {
it('PATCH /v1/accounts/:id/baselines/:service/:type updates sensitivity', () => {});
it('rejects invalid sensitivity values', () => {});
});
describe('Tenant Isolation', () => {
it('never returns anomalies from another tenant', () => {});
it('never returns accounts from another tenant', () => {});
it('enforces tenantId on all DynamoDB queries', () => {});
});
});
```
### 3.9 Onboarding & PLG
```typescript
describe('Onboarding', () => {
describe('CloudFormation Template', () => {
it('generates valid CFN YAML with correct IAM permissions', () => {});
it('includes ExternalId parameter', () => {});
it('includes EventBridge rule for cost-relevant CloudTrail events', () => {});
it('quick-create URL contains correct template URL and parameters', () => {});
});
describe('Role Validation', () => {
it('successfully assumes role with correct ExternalId', () => {});
it('returns clear error on role not found', () => {});
it('returns clear error on ExternalId mismatch', () => {});
it('triggers zombie scan on successful connection', () => {});
});
describe('Free Tier Enforcement', () => {
it('allows first account connection on free tier', () => {});
it('rejects second account with 403 and upgrade prompt', () => {});
it('allows multiple accounts on pro tier', () => {});
});
describe('Stripe Integration', () => {
it('creates Stripe Checkout session with correct pricing', () => {});
it('handles checkout.session.completed webhook', () => {});
it('handles customer.subscription.deleted webhook', () => {});
it('validates Stripe webhook signature', () => {});
it('updates tenant tier to pro on successful payment', () => {});
it('downgrades tenant on subscription cancellation', () => {});
});
});
```
### 3.10 Feature Flag Circuit Breaker
```typescript
describe('AlertVolumeCircuitBreaker', () => {
it('allows alerting when volume is within 3x baseline', () => {});
it('trips breaker when alerts exceed 3x baseline over 1 hour', () => {});
it('auto-disables the scoring flag when breaker trips', () => {});
it('buffers suppressed alerts in DLQ for review', () => {});
it('tracks alert-per-account rate in Redis sliding window', () => {});
it('resets breaker after manual flag re-enable', () => {});
it('fast-path alerts are exempt from circuit breaker', () => {});
});
```
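A sliding-window breaker along these lines could look like the sketch below. The cases above keep the window in Redis. This example swaps in an in-memory array purely to stay self-contained, so it illustrates the trip logic only. The class name, constructor parameters, and the boolean return value of `record` are hypothetical.

```typescript
class AlertVolumeBreaker {
  private timestamps: number[] = [];
  private tripped = false;
  private readonly threshold: number;
  private readonly windowMs: number;

  constructor(baselineAlertsPerHour: number, multiplier = 3, windowMs = 3_600_000) {
    this.threshold = baselineAlertsPerHour * multiplier;
    this.windowMs = windowMs;
  }

  // Records an alert attempt and returns true if the alert may be sent.
  record(nowMs: number, fastPath = false): boolean {
    if (fastPath) return true;      // fast-path alerts are exempt
    if (this.tripped) return false; // stays open until manually reset

    // Drop entries that have slid out of the window, then add this one.
    this.timestamps = this.timestamps.filter((t) => nowMs - t < this.windowMs);
    this.timestamps.push(nowMs);

    if (this.timestamps.length > this.threshold) {
      this.tripped = true;          // caller would disable the flag and buffer to the DLQ
      return false;
    }
    return true;
  }

  // Corresponds to the manual flag re-enable.
  reset(): void {
    this.tripped = false;
    this.timestamps = [];
  }

  get isTripped(): boolean {
    return this.tripped;
  }
}
```

Passing `nowMs` in rather than reading the clock inside `record` lets the unit tests simulate an hour of traffic without timers.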
---
## Section 4: Integration Test Strategy
### 4.1 DynamoDB Data Layer (Testcontainers)
```typescript
describe('DynamoDB Single-Table Patterns', () => {
it('writes CostEvent and updates Baseline in single transaction', async () => {});
it('queries all anomalies for tenant within time range', async () => {});
it('fetches tenant config and Slack tokens securely', async () => {});
describe('DynamoDB Integrations', () => {
let dynamodb: StartedTestContainer;
beforeAll(async () => {
dynamodb = await new GenericContainer('amazon/dynamodb-local:latest')
.withExposedPorts(8000).start();
// Create dd0c-cost-main table with GSIs
});
describe('Transactional Writes', () => {
it('writes CostEvent and updates Baseline in single TransactWriteItem', async () => {});
it('fails gracefully if TransactWriteItem encounters ConditionalCheckFailed', async () => {});
it('handles partial failure recovery when Baseline update conflicts', async () => {});
});
describe('Access Patterns', () => {
it('queries all anomalies for tenant within time range (GSI3)', async () => {});
it('fetches tenant config and Slack tokens securely', async () => {});
it('retrieves accurate Baseline snapshot by resource type', async () => {});
});
});
```
### 4.2 AWS API Contract Tests
### 4.2 Cross-Account STS & AWS APIs (LocalStack)
```typescript
describe('AWS Cross-Account Actions', () => {
// Uses LocalStack to simulate target account
it('assumes target account remediation role successfully', async () => {});
it('executes ec2:StopInstances when remediation approved', async () => {});
it('executes rds:DeleteDBInstance with skip-final-snapshot', async () => {});
describe('AWS Cross-Account Integrations', () => {
let localstack: StartedTestContainer;
beforeAll(async () => {
localstack = await new GenericContainer('localstack/localstack:3')
.withEnv('SERVICES', 'sts,ec2,rds')
.withExposedPorts(4566).start();
});
describe('Role Assumption', () => {
it('successfully assumes target account remediation role via STS', async () => {});
it('fails when ExternalId does not match (Security)', async () => {});
it('handles STS credential expiration gracefully', async () => {});
});
describe('Remediation Actions', () => {
it('executes ec2:StopInstances when remediation approved', async () => {});
it('executes rds:StopDBInstance when remediation approved', async () => {});
it('fails safely when target IAM role lacks StopInstances permission', async () => {});
});
});
```
### 4.3 Slack API Contract (WireMock)
```typescript
describe('Slack Integration', () => {
it('formats and delivers Block Kit message successfully', async () => {});
it('handles 429 Rate Limit by throwing retryable error for SQS visibility timeout', async () => {});
it('updates existing Slack message when anomaly is snoozed', async () => {});
});
```
@@ -159,24 +528,65 @@ describe('AWS Cross-Account Actions', () => {
### 5.1 Critical User Journeys
**Journey 1: Real-Time Anomaly Detection**
1. Send synthetic `RunInstances` event to EventBridge (p9.16xlarge, $40/hr).
2. Verify system processes event and triggers fast-path (no baseline).
3. Verify Slack alert is generated with correct cost estimate.
**Journey 1: Real-Time Anomaly Detection (The Golden Path)**
```typescript
describe('E2E: Anomaly Detection', () => {
it('detects anomaly and alerts Slack within 30 seconds', async () => {
// 1. Inject synthetic CloudTrail `RunInstances` event (p4d.24xlarge) into SQS Ingestion Queue
// 2. Poll DynamoDB to ensure CostEvent was recorded
// 3. Poll DynamoDB to ensure AnomalyRecord was created (fast-path triggered)
// 4. Assert WireMock received the Slack chat.postMessage call with Block Kit
});
});
```
**Journey 2: Interactive Remediation**
1. Send webhook simulating user clicking "Stop Instance" in Slack.
2. Verify API Gateway → Lambda executes `StopInstances` against LocalStack.
3. Verify Slack message updates to "Remediation Successful".
```typescript
describe('E2E: Interactive Remediation', () => {
it('stops EC2 instance when user clicks Stop in Slack', async () => {
// 1. Simulate Slack sending interactive webhook payload for "Stop Instance"
// 2. Validate HMAC signature in API Gateway lambda
// 3. Verify LocalStack EC2 mock receives StopInstances call
// 4. Verify Slack message is updated to "Remediation Successful"
});
});
```
**Journey 3: Onboarding & First Scan**
```typescript
describe('E2E: Onboarding', () => {
it('validates IAM role and triggers initial zombie scan', async () => {
// 1. Trigger POST /v1/accounts with new role ARN
// 2. Verify account marked active
// 3. Verify EventBridge Scheduler creates cron for Zombie Hunter
});
});
```
---
## Section 6: Performance & Load Testing
### 6.1 Ingestion & Scoring Throughput
```typescript
describe('Ingestion Throughput', () => {
it('processes 500 CloudTrail events/second via SQS FIFO', async () => {});
it('DynamoDB baseline updates complete in <20ms p95', async () => {});
describe('Performance: Alert Storm', () => {
it('processes 1000 CloudTrail events/sec without SQS DLQ overflow', async () => {
// k6 load test hitting SQS directly
});
it('DynamoDB baseline updates complete in <20ms p95 under load', async () => {
// Ensure Single-Table schema does not create hot partitions
});
it('Anomaly Scorer Lambda consumes <256MB memory during burst', async () => {});
});
```
### 6.2 Data Scale Tests
```typescript
describe('Performance: Baseline Scale', () => {
it('calculates Z-score in <5ms even when observed_actors set exceeds 1000', async () => {});
it('handles accounts with 100,000+ daily CostEvents without throttling DynamoDB (On-Demand scaling)', async () => {});
});
```
@@ -184,49 +594,119 @@ describe('Ingestion Throughput', () => {
## Section 7: CI/CD Pipeline Integration
- **PR Gate:** Unit tests (<2min), Coverage >85% (Scoring engine >95%).
- **Merge:** Integration tests with LocalStack & Testcontainers DynamoDB.
- **Staging:** E2E journeys against isolated staging AWS account.
### 7.1 Pipeline Stages
```
┌─────────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐
│ Pre-Commit  │───▶│ PR Gate  │───▶│  Merge   │───▶│ Staging  │───▶│   Prod   │
│   (local)   │    │   (CI)   │    │   (CI)   │    │   (CD)   │    │   (CD)   │
└─────────────┘    └──────────┘    └──────────┘    └──────────┘    └──────────┘
 lint + type        unit tests      integration     E2E + perf      canary
 <10s               math prop       Testcontainers  LocalStack      <5 mins
                    tests <1m       <4 mins         <10 mins
```
### 7.2 Coverage Gates
| Component | Threshold |
|-----------|-----------|
| Anomaly Scorer (Math) | 100% |
| CloudTrail Normalizer | 95% |
| Governance Policy | 95% |
| Slack Signature Auth | 100% |
| Overall Pipeline | 85% |
---
## Section 8: Transparent Factory Tenet Testing
### 8.1 Atomic Flagging (Circuit Breaker)
### 8.1 Atomic Flagging
```typescript
it('auto-disables scoring rule if it generates >10 alerts/hour for single tenant', () => {});
describe('Atomic Flagging', () => {
it('auto-disables scoring rule flag if alert volume exceeds 3x baseline in 1hr', () => {});
it('buffers suppressed anomalies in SQS DLQ while flag is off', () => {});
it('fails CI if any flag TTL exceeds 14 days', () => {});
it('evaluates flags strictly locally (in-memory provider)', () => {});
});
```
### 8.2 Configurable Autonomy (14-Day Auto-Promotion)
### 8.2 Elastic Schema
```typescript
it('keeps new tenant in strict mode (log-only) for first 14 days', () => {});
it('auto-promotes to audit mode (auto-alert) on day 15 if false-positive rate < 10%', () => {});
describe('Elastic Schema', () => {
it('rejects DynamoDB table definition modifications that alter key schemas', () => {});
it('requires all DynamoDB item updates to use ADD/SET (additive only)', () => {});
it('ignores unknown attributes (V2 fields) in V1 CostEvent decoders', () => {});
});
```
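The forward-compatibility case ("ignores unknown attributes in V1 decoders") can be illustrated with a decoder that picks only the fields it knows and validates their types. The `CostEventV1` field names here are hypothetical, since the document does not list the actual CostEvent attributes.

```typescript
// Hypothetical V1 shape; the real CostEvent attributes are not given in the document.
interface CostEventV1 {
  eventId: string;
  accountId: string;
  hourlyCost: number;
}

// Reads only known fields. Anything else on the item (for example V2
// attributes written by a newer producer) is ignored rather than rejected.
function decodeCostEventV1(item: Record<string, unknown>): CostEventV1 | null {
  const { eventId, accountId, hourlyCost } = item;
  if (typeof eventId !== 'string' || typeof accountId !== 'string') return null;
  if (typeof hourlyCost !== 'number' || !Number.isFinite(hourlyCost)) return null;
  return { eventId, accountId, hourlyCost };
}
```

This is the opposite policy from a strict schema that rejects unrecognized keys, so it suits stored items that must outlive the code version that wrote them.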
### 8.3 Cognitive Durability
```typescript
describe('Cognitive Durability', () => {
it('requires decision_log.json for any PR modifying Z-score thresholds or weights', () => {});
it('enforces cyclomatic complexity < 10 for all AnomalyScorer math functions', () => {});
});
```
### 8.4 Semantic Observability
```typescript
describe('Semantic Observability', () => {
it('emits OTEL span for every Anomaly Scoring decision', () => {});
it('includes attributes: cost.z_score, cost.anomaly_score, cost.baseline_days', () => {});
it('includes cost.fast_path_triggered flag when baseline is bypassed', () => {});
it('hashes AWS Account ID in spans to protect PII/tenant identity', () => {});
});
```
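One way to satisfy the account-hashing case is to build the span attributes in a plain function and hash the account ID before it reaches the tracer. The sketch below uses Node's built-in `crypto` module. The `cost.z_score`, `cost.anomaly_score`, `cost.baseline_days`, and `cost.fast_path_triggered` keys come from the cases above, while the `cost.account_hash` key, the salt parameter, and the 16-character truncation are assumptions.

```typescript
import { createHash } from 'node:crypto';

// Salted SHA-256, truncated for readability in trace viewers.
function hashAccountId(accountId: string, salt: string): string {
  return createHash('sha256').update(`${salt}:${accountId}`).digest('hex').slice(0, 16);
}

interface ScoringDecision {
  accountId: string;
  zScore: number;
  anomalyScore: number;
  baselineDays: number;
  fastPathTriggered: boolean;
}

// Returns a flat attribute map suitable for passing to a span.
function scoringSpanAttributes(
  d: ScoringDecision,
  salt: string,
): Record<string, string | number | boolean> {
  return {
    'cost.account_hash': hashAccountId(d.accountId, salt), // hypothetical key
    'cost.z_score': d.zScore,
    'cost.anomaly_score': d.anomalyScore,
    'cost.baseline_days': d.baselineDays,
    'cost.fast_path_triggered': d.fastPathTriggered,
  };
}
```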
### 8.5 Configurable Autonomy
```typescript
describe('Configurable Autonomy', () => {
it('keeps new tenant in Strict Mode (log-only) for first 14 days', () => {});
it('auto-promotes to Audit Mode on day 15 if false-positive rate < 10%', () => {});
it('Panic Mode halts ALL Slack alerts in <1 second via Redis check', () => {});
it('Panic Mode does NOT halt baseline recording (read-only tracking continues)', () => {});
});
```
---
## Section 9: Test Data & Fixtures
```
fixtures/
cloudtrail/
ec2-runinstances.json
rds-create-db.json
lambda-create-function.json
baselines/
mature-steady-spend.json
volatile-dev-account.json
cold-start.json
### 9.1 Data Factories
```typescript
export const makeCloudTrailEvent = (overrides: Record<string, unknown> = {}) => ({
eventVersion: '1.08',
userIdentity: { type: 'AssumedRole', arn: 'arn:aws:sts::123:assumed-role/user' },
eventTime: new Date().toISOString(),
eventSource: 'ec2.amazonaws.com',
eventName: 'RunInstances',
requestParameters: { instanceType: 'm5.large' },
...overrides
});
export const makeBaseline = (overrides: Record<string, unknown> = {}) => ({
meanHourlyCost: 1.25,
stdDev: 0.15,
eventCount: 45,
ageDays: 16,
observedActors: ['arn:aws:iam::123:role/ci'],
observedInstanceTypes: ['t3.medium', 'm5.large'],
...overrides
});
```
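The `eventCount` and `ageDays` fields in the baseline factory are the inputs to the maturity transitions tested in Section 3.3. A minimal sketch of that state function follows. It assumes inclusive thresholds (at least 5 events for learning, at least 20 events and 14 days for mature), which is one reading of the test names, and the `maturityOf` name is illustrative.

```typescript
type Maturity = 'cold-start' | 'learning' | 'mature';

// Both conditions are required for maturity: many events over a few days,
// or many days with few events, stay in the learning state.
function maturityOf(b: { eventCount: number; ageDays: number }): Maturity {
  if (b.eventCount >= 20 && b.ageDays >= 14) return 'mature';
  if (b.eventCount >= 5) return 'learning';
  return 'cold-start';
}

// The factory defaults above (45 events, 16 days) describe a mature baseline.
const defaultFixtureMaturity = maturityOf({ eventCount: 45, ageDays: 16 });
```

Fixtures for the cold-start and learning cases can then be produced by overriding just those two fields in `makeBaseline`.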
---
## Section 10: TDD Implementation Order
1. **Phase 1:** Anomaly math + Unit tests (Strict TDD).
2. **Phase 2:** CloudTrail normalizer + Pricing tables.
3. **Phase 3:** DynamoDB single-table implementation (Integration led).
4. **Phase 4:** Slack formatting + Remediation Lambda.
5. **Phase 5:** Governance policies (14-day promotion logic).
1. **Phase 1: Math & Core Logic (Strict TDD)**
- Welford algorithm, Z-score math, Novelty scoring, `fast-check` property tests.
2. **Phase 2: Ingestion & Normalization**
- CloudTrail parsers, pricing static tables, event deduplication.
3. **Phase 3: Data Persistence (Integration Led)**
- DynamoDB Single-Table setup, TransactWriteItems, Testcontainers tests.
4. **Phase 4: Notifications & Slack Actions**
- Block Kit formatting, Slack signature validation, API Gateway endpoints.
5. **Phase 5: Governance & Tenets**
- 14-day promotion logic, Panic mode, OTEL tracing.
6. **Phase 6: E2E Pipeline**
- CDK definitions, LocalStack event injection, wire everything together.
*End of dd0c/cost Test Architecture*
*End of dd0c/cost Test Architecture (v2)*