Implement review remediation + PLG analytics SDK

- All 6 test architectures patched with Section 11 addendums
- P5 (cost) fully rewritten from 232 to ~600 lines
- PLG brainstorm + party mode advisory board results
- Analytics SDK v2 (PostHog Cloud, Zod strict, Lambda-safe)
- Analytics tests v2 (safeParse, no , no timestamp, no PII)
- Addresses all Gemini review findings across P1-P6
2026-03-01 01:42:49 +00:00
parent 2fe0ed856e
commit 03bfe931fc
9 changed files with 2950 additions and 85 deletions


@@ -1107,3 +1107,161 @@ Phase 7: E2E Validation
---
*End of dd0c/portal Test Architecture*
---
## 11. Review Remediation Addendum (Post-Gemini Review)
### 11.1 Resolve Database Misalignment (PostgreSQL vs DynamoDB)
Epic 10.2 specified DynamoDB Single-Table, but the Architecture and Test Architecture are fundamentally built around PostgreSQL (Aurora Serverless v2) with pgvector.
**Resolution:** The IDP requires relational joins and vector search. PostgreSQL is the definitive catalog database. DynamoDB references are removed.
```rust
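// tests/schema/migration_validation_test.rs
// `read_sql_migrations` is a project helper, assumed to return each migration's SQL text.
#[tokio::test]
async fn elastic_schema_postgres_migration_is_additive_only() {
    let migrations = read_sql_migrations("./migrations");
    for migration in migrations {
        assert!(!migration.contains("DROP COLUMN"), "Destructive schema change detected");
        assert!(!migration.contains("ALTER COLUMN"), "Type modification detected");
        assert!(!migration.contains("RENAME COLUMN"), "Column rename detected");
    }
}

#[tokio::test]
async fn migration_does_not_hold_exclusive_locks_on_reads() {
    // Every index in every migration must be built with CREATE INDEX CONCURRENTLY
    for migration in read_sql_migrations("./migrations") {
        let sql = migration.to_uppercase();
        let total = sql.matches("CREATE INDEX").count() + sql.matches("CREATE UNIQUE INDEX").count();
        let concurrent = sql.matches("INDEX CONCURRENTLY").count();
        assert_eq!(total, concurrent,
            "Indexes must be created concurrently to avoid locking the catalog");
    }
}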
```
### 11.2 Invert the Test Pyramid (Integration Honeycomb)
Shift from a 70% unit-test mix (with heavy moto/responses mocking) to a 30/60/10 unit/integration/E2E split, with integration tests backed by VCR cassettes and LocalStack.
```python
# tests/integration/scanners/test_aws_scanner.py
import pytest


@pytest.mark.vcr()
def test_aws_scanner_discovers_ecs_services_and_api_gateways(vcr_cassette):
    # Uses real recorded AWS API responses, not moto mocks
    # Validates actual boto3 parsing against real-world AWS shapes
    scanner = AWSDiscoveryScanner(account_id="123456789012", region="us-east-1")
    services = scanner.scan()
    assert len(services) > 0
    assert any(s.type == "ecs_service" for s in services)


@pytest.mark.vcr()
def test_github_scanner_handles_graphql_pagination(vcr_cassette):
    # Validates real GitHub GraphQL paginated responses
    scanner = GitHubDiscoveryScanner(org_name="dd0c")
    repos = scanner.scan()
    assert len(repos) > 100  # Proves pagination logic works
```
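The `len(repos) > 100` assertion only holds if the scanner follows cursors past the first page, since GitHub's GraphQL API returns at most 100 nodes per connection page. A minimal sketch of such a loop follows. It assumes a hypothetical `fetch_page(cursor)` callable that returns a dict shaped like a GraphQL connection (`nodes`, `pageInfo.hasNextPage`, `pageInfo.endCursor`); `paginate` and `fake_fetch_page` are illustrative names, not part of the dd0c codebase.

```python
from typing import Callable, Iterator, Optional


def paginate(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every node from a cursor-paginated GraphQL connection."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page["nodes"]
        info = page["pageInfo"]
        if not info["hasNextPage"]:
            break
        cursor = info["endCursor"]


def fake_fetch_page(cursor: Optional[str], total: int = 250, page_size: int = 100) -> dict:
    """Hypothetical in-memory stand-in for one GitHub GraphQL request."""
    start = int(cursor) if cursor else 0
    end = min(start + page_size, total)
    return {
        "nodes": [{"name": f"repo-{i}"} for i in range(start, end)],
        "pageInfo": {"hasNextPage": end < total, "endCursor": str(end)},
    }


# Three pages (100 + 100 + 50) are stitched into one result list.
repos = list(paginate(fake_fetch_page))
```

A recorded cassette exercises the same loop against real response shapes; the fake here only illustrates the control flow.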
### 11.3 Missing Epic Coverage
#### Epic 3.4: PagerDuty & OpsGenie Integrations
```python
# tests/integration/test_pagerduty_sync.py
import pytest


@pytest.mark.vcr()
def test_pagerduty_sync_maps_schedules_to_catalog_teams():
    sync = PagerDutySyncer(api_key="sk-test-key")
    teams = sync.fetch_oncall_schedules()
    assert teams, "expected at least one on-call team in the cassette"
    assert teams[0].oncall_email is not None


def test_pagerduty_credentials_are_encrypted_at_rest():
    # Verify KMS envelope encryption for 3rd party API keys
    pass
```
#### Epic 4.3: Redis Prefix Caching for Cmd+K
```python
# tests/integration/test_search_cache.py
import json


def test_cmd_k_search_hits_redis_cache_before_postgres():
    redis_client.set("search:auth", json.dumps([{"name": "auth-service"}]))
    # Must return < 5ms from Redis, skipping DB
    result = search_api.query("auth")
    assert result[0]["name"] == "auth-service"


def test_catalog_update_invalidates_search_cache():
    # Seed a cached prefix so the purge is observable
    redis_client.set("search:bill", json.dumps([]))
    # Create new service
    catalog_api.create_service("billing-api")
    # Prefix cache must be purged
    assert redis_client.keys("search:*") == []
```
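The two tests above imply a read-through cache keyed by search prefix that is purged whenever the catalog changes. One way this could look is sketched below, with a plain dict standing in for Redis. `PrefixSearchCache`, `db_lookup`, and `db_calls` are hypothetical names introduced for illustration, and the `search:` key prefix is taken from the tests.

```python
import json
from typing import Callable, Dict, List


class PrefixSearchCache:
    """Read-through cache keyed by search prefix; a dict stands in for Redis."""

    def __init__(self, db_lookup: Callable[[str], List[dict]]):
        self._store: Dict[str, str] = {}
        self._db_lookup = db_lookup
        self.db_calls = 0  # counts cache misses that reached the database

    def query(self, prefix: str) -> List[dict]:
        key = f"search:{prefix}"
        cached = self._store.get(key)
        if cached is not None:
            return json.loads(cached)  # cache hit: the database is skipped
        self.db_calls += 1
        result = self._db_lookup(prefix)
        self._store[key] = json.dumps(result)
        return result

    def invalidate_all(self) -> None:
        """Drop every cached prefix; intended to run after any catalog write."""
        for key in [k for k in self._store if k.startswith("search:")]:
            del self._store[key]

    def keys(self) -> List[str]:
        return sorted(self._store)


services = ["auth-service", "billing-api"]
cache = PrefixSearchCache(lambda p: [{"name": s} for s in services if s.startswith(p)])
first = cache.query("auth")   # miss: falls through to the lookup
second = cache.query("auth")  # hit: served from the cache
calls_before_purge = cache.db_calls
cache.invalidate_all()
```

Against a real Redis instance, iterating with `scan_iter(match="search:*")` is generally preferable to `keys()` for the purge, because `KEYS` blocks the server while it walks the keyspace.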
#### Epics 5 & 6: UI and Dashboards (Playwright)
```typescript
// tests/e2e/ui/catalog.spec.ts
import { test, expect } from '@playwright/test';

test('service catalog renders progressive disclosure UI', async ({ page }) => {
  await page.goto('/catalog');
  // Click expands details instead of navigating away
  await page.click('[data-testid="service-row-auth-api"]');
  await expect(page.locator('[data-testid="service-drawer"]')).toBeVisible();
});

test('dashboard KPI aggregation shows total services and ownership coverage', async ({ page }) => {
  await page.goto('/dashboard');
  await expect(page.locator('[data-testid="kpi-total-services"]')).toHaveText("150");
  await expect(page.locator('[data-testid="kpi-ownership"]')).toHaveText("85%");
});
```
#### Epic 9: Onboarding & Stripe
```python
# tests/integration/test_stripe_webhooks.py
def test_stripe_checkout_completed_upgrades_tenant_tier():
    # Hypothetical test-only signing secret; must match the app's test configuration
    secret = "whsec_test_secret"
    payload = load_fixture("stripe_checkout_completed.json")
    signature = generate_stripe_signature(payload, secret)
    response = api_client.post(
        "/webhooks/stripe", data=payload, headers={"Stripe-Signature": signature}
    )
    assert response.status_code == 200
    tenant = db.get_tenant("t-123")
    assert tenant.tier == "pro"


def test_websocket_streams_discovery_progress_during_onboarding():
    # Connect WS client, trigger discovery, assert WS receives "discovering AWS...", "found 50 resources..."
    pass
```
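The webhook test depends on a `generate_stripe_signature` helper that is not shown. A possible shape is sketched below, following Stripe's documented scheme: the header carries a timestamp `t` and a `v1` value that is the HMAC-SHA256 of `"{timestamp}.{payload}"` keyed with the endpoint secret. The sketch assumes the payload is a `str`. The `timestamp` and `now` parameters, the `verify_stripe_signature` counterpart, and the five-minute tolerance are assumptions added for illustration.

```python
import hashlib
import hmac
import time
from typing import Optional


def generate_stripe_signature(payload: str, secret: str, timestamp: Optional[int] = None) -> str:
    """Build a Stripe-Signature header value for a test webhook payload."""
    ts = int(time.time()) if timestamp is None else timestamp
    signed_payload = f"{ts}.{payload}".encode("utf-8")
    digest = hmac.new(secret.encode("utf-8"), signed_payload, hashlib.sha256).hexdigest()
    return f"t={ts},v1={digest}"


def verify_stripe_signature(payload: str, header: str, secret: str,
                            tolerance: int = 300, now: Optional[float] = None) -> bool:
    """Recompute the v1 digest and reject stale or tampered deliveries."""
    parts = dict(item.split("=", 1) for item in header.split(","))
    ts = int(parts["t"])
    expected = generate_stripe_signature(payload, secret, ts).split("v1=", 1)[1]
    current = time.time() if now is None else now
    fresh = abs(current - ts) <= tolerance
    # compare_digest avoids leaking the match position through timing
    return fresh and hmac.compare_digest(expected, parts["v1"])


body = '{"type": "checkout.session.completed"}'
header = generate_stripe_signature(body, "whsec_test_secret", timestamp=1700000000)
```

Pinning the timestamp keeps the fixture deterministic. In the application itself, Stripe's official library would normally perform the verification rather than hand-rolled code like this.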
### 11.4 Scaled Performance Benchmarks
```python
# tests/performance/test_discovery_scale.py
def test_discovery_pipeline_handles_10000_aws_resources_without_step_functions_payload_limit():
    # Simulate an AWS environment with 10k resources
    # Must chunk state machine transitions to stay under 256KB Step Functions limit
    pass


def test_discovery_pipeline_handles_1000_github_repos():
    # Verify GraphQL batching and rate limit backoff
    pass
```
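The chunking requirement in the first stub could be met by batching resources according to their serialized size rather than by a fixed count. The sketch below assumes resources are JSON-serializable dicts; `chunk_by_payload_size`, the 16 KiB headroom, and the sample resource shape are hypothetical choices, not taken from the pipeline.

```python
import json
from typing import Iterable, Iterator, List

# Step Functions caps state input/output at 256 KiB (262,144 bytes).
MAX_PAYLOAD_BYTES = 256 * 1024
# Assumed headroom for whatever envelope the state machine wraps around a batch.
HEADROOM_BYTES = 16 * 1024


def chunk_by_payload_size(
    resources: Iterable[dict], limit: int = MAX_PAYLOAD_BYTES - HEADROOM_BYTES
) -> Iterator[List[dict]]:
    """Group resources into batches whose JSON encoding stays within `limit` bytes."""
    batch: List[dict] = []
    batch_bytes = 2  # the enclosing "[" and "]"
    for resource in resources:
        # Item size plus a ", " separator; slightly overestimates, which is safe.
        size = len(json.dumps(resource).encode("utf-8")) + 2
        if 2 + size > limit:
            raise ValueError("a single resource exceeds the payload limit")
        if batch and batch_bytes + size > limit:
            yield batch
            batch, batch_bytes = [], 2
        batch.append(resource)
        batch_bytes += size
    if batch:
        yield batch


resources = [
    {"arn": f"arn:aws:ecs:us-east-1:123456789012:service/svc-{i}", "tags": {"team": "platform"}}
    for i in range(10_000)
]
batches = list(chunk_by_payload_size(resources))
```

Each batch can then be passed as one state input. For very large scans, writing batches to S3 and passing only object keys between states is a common alternative.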
### 11.5 Edge Case Resilience
```python
def test_github_graphql_concurrent_rate_limiting():
    # If 5 tenants scan concurrently, respect Retry-After headers across workers
    pass


def test_partial_discovery_scan_does_not_corrupt_catalog():
    # If GitHub scan times out halfway, existing services must NOT be marked stale
    pass


def test_ownership_conflict_resolution():
    # If two discovery sources claim the same repo, prioritize Explicit (Config) over Implicit (Tags)
    pass


def test_meilisearch_index_rebuild_does_not_drop_search():
    # Verify zero-downtime index swapping during mapping updates
    pass
```
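The ownership rule in `test_ownership_conflict_resolution` (explicit config beats implicit tags) can be expressed as a priority ordering over claim sources. The sketch below is one possible shape. `OwnershipSource`, `OwnershipClaim`, and `resolve_owner` are hypothetical names, and the tie-breaking rule (first claim seen wins) is an assumption the source does not specify.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Iterable, Optional


class OwnershipSource(IntEnum):
    """Higher value wins; the ordering follows the test comment above."""
    IMPLICIT_TAG = 1
    EXPLICIT_CONFIG = 2


@dataclass(frozen=True)
class OwnershipClaim:
    repo: str
    team: str
    source: OwnershipSource


def resolve_owner(claims: Iterable[OwnershipClaim]) -> Optional[OwnershipClaim]:
    """Return the claim from the highest-priority source; ties keep the first seen."""
    winner: Optional[OwnershipClaim] = None
    for claim in claims:
        if winner is None or claim.source > winner.source:
            winner = claim
    return winner


claims = [
    OwnershipClaim("dd0c/portal", "platform", OwnershipSource.IMPLICIT_TAG),
    OwnershipClaim("dd0c/portal", "identity", OwnershipSource.EXPLICIT_CONFIG),
]
owner = resolve_owner(claims)
```

With these inputs the explicit config claim wins, so `owner.team` is `"identity"` regardless of the order in which the two sources report.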