Max Mayfield 5ee95d8b13 dd0c: full product research pipeline - 6 products, 8 phases each
Products: route, drift, alert, portal, cost, run
Phases: brainstorm, design-thinking, innovation-strategy, party-mode,
        product-brief, architecture, epics (incl. Epic 10 TF compliance),
        test-architecture (TDD strategy)

Brand strategy and market research included.
2026-02-28 17:35:02 +00:00


dd0c/route — V1 MVP Epics

This document outlines the core Epics and User Stories for the V1 MVP of dd0c/route, designed for a solo founder to implement in 1-3 day chunks per story.


Epic 1: Proxy Engine

Description: Core Rust proxy that sits between the client application and LLM providers. Must maintain strict OpenAI API compatibility, support SSE streaming, and introduce <5ms latency overhead.

User Stories

  • Story 1.1: As a developer, I want to swap my OPENAI_BASE_URL to the proxy endpoint, so that my existing OpenAI SDK works without code changes.
  • Story 1.2: As a developer, I want streaming support (SSE) preserved, so that my chat applications remain responsive while using the proxy.
  • Story 1.3: As a platform engineer, I want the proxy latency overhead to be <5ms, so that intelligent routing doesn't degrade our application's user experience.
  • Story 1.4: As a developer, I want provider errors (e.g., rate limits) to be passed through transparently, so that my app's existing error handling continues to work.

Acceptance Criteria

  • Implements POST /v1/chat/completions for both streaming (stream: true) and non-streaming requests.
  • Validates the Authorization: Bearer header against a Redis cache (falling back to DB).
  • Successfully forwards requests to OpenAI and Anthropic, translating formats if necessary.
  • Asynchronously emits telemetry events to an in-memory channel without blocking the hot path.
  • P99 latency overhead is measured at <5ms.

Estimate: 13 points

Dependencies: None

Technical Notes:

  • Stack: Rust, tokio, hyper, axum.
  • Use connection pooling for upstream providers to eliminate TLS handshake overhead.
  • For streaming, parse only the first chunk/headers to make a routing decision, then pass the rest of the stream through untouched. Count tokens from the final usage-bearing SSE chunk, which arrives just before the [DONE] sentinel (the sentinel itself carries no usage data).
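The token-counting note above can be sketched as a std-only scan over SSE lines. This assumes an OpenAI-style stream where a usage-bearing chunk precedes the `data: [DONE]` sentinel; the string search stands in for real JSON parsing (the shipped proxy would use serde_json):

```rust
// Sketch: extract total_tokens from the final SSE usage chunk without a JSON
// crate. Assumes an OpenAI-style stream ending with a usage-bearing chunk
// followed by the `data: [DONE]` sentinel.
fn total_tokens_from_sse(body: &str) -> Option<u64> {
    for line in body.lines() {
        let Some(payload) = line.strip_prefix("data: ") else { continue };
        if payload == "[DONE]" {
            break; // the sentinel carries no usage data
        }
        // Naive scan for `"total_tokens":N` — illustration only.
        if let Some(idx) = payload.find("\"total_tokens\":") {
            let rest = &payload[idx + "\"total_tokens\":".len()..];
            let digits: String = rest.chars().take_while(|c| c.is_ascii_digit()).collect();
            if !digits.is_empty() {
                return digits.parse().ok();
            }
        }
    }
    None
}

fn main() {
    let stream = "data: {\"choices\":[]}\ndata: {\"usage\":{\"prompt_tokens\":12,\"completion_tokens\":30,\"total_tokens\":42}}\ndata: [DONE]\n";
    assert_eq!(total_tokens_from_sse(stream), Some(42));
}
```

Because only the trailing chunk is inspected for usage, the hot path stays a pure passthrough for every intermediate chunk.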

Epic 2: Router Brain

Description: The intelligence core of dd0c/route embedded within the proxy. It evaluates incoming requests against routing rules, classifies complexity heuristically, checks cost tables, and executes fallback chains.

User Stories

  • Story 2.1: As an engineering manager, I want the router to classify the complexity of requests, so that simple extraction tasks are downgraded to cheaper models.
  • Story 2.2: As an engineering manager, I want to configure routing rules (e.g., if feature=classify -> use cheapest from [gpt-4o-mini, claude-haiku]), so that I can automatically save money on predictable workloads.
  • Story 2.3: As an application developer, I want the router to automatically fallback to an alternative model if the primary model fails or rate-limits, so that my application remains highly available.
  • Story 2.4: As an engineering manager, I want cost savings calculated instantly based on up-to-date provider pricing, so that my dashboard data is immediately accurate.

Acceptance Criteria

  • Heuristic complexity classifier runs in <2ms based on token count, task patterns (regex on system prompt), and model hints.
  • Evaluates first-match routing rules based on request tags (X-DD0C-Feature, X-DD0C-Team).
  • Executes "passthrough", "cheapest", "quality-first", and "cascading" routing strategies.
  • Enforces circuit breakers on downstream providers (e.g., open circuit if error rate > 10%).
  • Calculates cost_saved = cost_original - cost_actual on the fly using in-memory cost tables.

Estimate: 8 points

Dependencies: Epic 1 (Proxy Engine)

Technical Notes:

  • Stack: Rust.
  • Run purely in-memory on the proxy hot path. No DB queries per request.
  • Cost tables and routing rules must be loaded at startup and refreshed via a background task every 60s.
  • Use serde_json to inspect the messages array for complexity classification but do not persist the prompt.
  • Circuit breaker state must be shared via Redis so all proxy instances agree on provider health.
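The in-memory cost table and the cost_saved computation can be sketched as follows; the prices are illustrative dollars-per-million-tokens figures, not real provider rates, and the model names are placeholders:

```rust
use std::collections::HashMap;

// Sketch of the in-memory cost table and cost_saved = cost_original - cost_actual.
// Prices are illustrative (input, output) $/1M-token pairs, not real rates.
fn cost_usd(table: &HashMap<&str, (f64, f64)>, model: &str, in_tok: u64, out_tok: u64) -> Option<f64> {
    let &(in_price, out_price) = table.get(model)?;
    Some(in_tok as f64 / 1e6 * in_price + out_tok as f64 / 1e6 * out_price)
}

fn main() {
    let mut table = HashMap::new();
    table.insert("big-model", (5.00, 15.00));
    table.insert("small-model", (0.15, 0.60));
    // What the request would have cost on the model the client asked for...
    let original = cost_usd(&table, "big-model", 1000, 500).unwrap();
    // ...versus what the routed model actually cost.
    let actual = cost_usd(&table, "small-model", 1000, 500).unwrap();
    let cost_saved = original - actual;
    assert!(cost_saved > 0.0);
    println!("saved ${:.4} on this request", cost_saved);
}
```

Because the table is a plain map refreshed by a background task, the per-request lookup is O(1) and needs no I/O.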

Epic 3: Analytics Pipeline

Description: High-throughput logging and aggregation system using TimescaleDB. Focuses on ingesting asynchronous telemetry from the Proxy Engine without blocking request processing.

User Stories

  • Story 3.1: As a platform engineer, I want the proxy to emit telemetry without blocking the main request thread, so that our application performance remains unaffected.
  • Story 3.2: As an engineering manager, I want my dashboard queries to be lightning fast even with millions of rows, so that I can quickly slice and dice our AI spend.
  • Story 3.3: As an engineering manager, I want historical telemetry to be compressed or aged out automatically, so that the database storage costs remain minimal.

Acceptance Criteria

  • Proxy emits a RequestEvent over an in-memory mpsc channel via tokio::spawn.
  • A background worker batches events and inserts them into TimescaleDB every 1s or 100 events via PostgreSQL's bulk COPY protocol.
  • Continuous aggregates (hourly_cost_summary, daily_cost_summary) are created and updated on schedule to pre-calculate total_cost, total_saved, and avg_latency.
  • TimescaleDB compression policies compress chunks older than 7 days, targeting a 90%+ storage reduction.
  • The proxy must degrade gracefully if the analytics database is unavailable.

Estimate: 8 points

Dependencies: Epic 1 (Proxy Engine)

Technical Notes:

  • Stack: Rust (worker), PostgreSQL/TimescaleDB.
  • Write the TimescaleDB migration scripts for the hypertable request_events and the continuous aggregates.
  • Batching must apply backpressure rather than grow without bound (use bounded channels), and the worker must recover cleanly if a batch insert fails or panics.
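The flush-on-size-or-timeout batcher described above can be sketched with std's bounded channel standing in for tokio's mpsc, and a closure standing in for the TimescaleDB COPY call:

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Sketch: batch telemetry events, flushing at max_batch events or after
// max_wait of quiet. `insert_batch` stands in for the bulk COPY into TimescaleDB.
struct RequestEvent { model: String, cost_usd: f64 }

fn run_batcher(rx: mpsc::Receiver<RequestEvent>, max_batch: usize, max_wait: Duration,
               mut insert_batch: impl FnMut(&[RequestEvent])) {
    let mut batch = Vec::new();
    loop {
        match rx.recv_timeout(max_wait) {
            Ok(ev) => {
                batch.push(ev);
                if batch.len() >= max_batch { insert_batch(&batch); batch.clear(); }
            }
            Err(mpsc::RecvTimeoutError::Timeout) => {
                // Time-based flush so a quiet period still drains the batch.
                if !batch.is_empty() { insert_batch(&batch); batch.clear(); }
            }
            Err(mpsc::RecvTimeoutError::Disconnected) => {
                if !batch.is_empty() { insert_batch(&batch); }
                return;
            }
        }
    }
}

fn main() {
    // Bounded channel: if the analytics sink stalls, senders see backpressure
    // instead of unbounded memory growth.
    let (tx, rx) = mpsc::sync_channel(1024);
    let worker = thread::spawn(move || {
        let mut flushed = 0usize;
        run_batcher(rx, 100, Duration::from_secs(1), |b| flushed += b.len());
        flushed
    });
    for i in 0..250 {
        tx.send(RequestEvent { model: "small-model".into(), cost_usd: 0.0001 * i as f64 }).unwrap();
    }
    drop(tx); // close the channel so the worker drains and exits
    assert_eq!(worker.join().unwrap(), 250);
}
```

In production the same shape holds with tokio's `mpsc::channel` and `tokio::select!` over a recv and an interval timer.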

Epic 4: Dashboard API

Description: Axum REST API providing authentication, org/team management, routing rule CRUD, and data endpoints for the frontend dashboard. Focuses on frictionless developer onboarding.

User Stories

  • Story 4.1: As an engineering manager, I want to authenticate via GitHub OAuth, so that I can create an organization and get an API key in under 60 seconds without remembering a password.
  • Story 4.2: As an engineering manager, I want to manage my organization's routing rules and provider API keys securely, so that dd0c/route can successfully broker requests to OpenAI and Anthropic.
  • Story 4.3: As an engineering manager, I want an endpoint that provides my historical spend and savings summary, so that I can visualize it in the UI.
  • Story 4.4: As a platform engineer, I want to revoke an active API key, so that compromised credentials are immediately blocked.

Acceptance Criteria

  • Implements /api/auth/github OAuth flow issuing JWTs and refresh tokens.
  • Implements /api/orgs CRUD for managing an organization and API keys.
  • Implements /api/dashboard/summary and /api/dashboard/treemap queries hitting the TimescaleDB continuous aggregates.
  • Implements /api/requests for the request inspector with filters (e.g., model, feature, team).
  • Securely stores and encrypts provider API keys in PostgreSQL using an AES-256-GCM Data Encryption Key.
  • Enforces an RBAC model (Owner, Member) per organization.

Estimate: 13 points

Dependencies: Epic 3 (Analytics Pipeline)

Technical Notes:

  • Stack: Rust (axum), PostgreSQL.
  • Reuse the proxy's tokio runtime and crate stack; a single async ecosystem keeps the codebase manageable for a solo founder.
  • Use oauth2 crate for GitHub integration. JWTs are signed with RS256, refresh tokens in Redis.
  • Ensure API keys are hashed (SHA-256) before storage; raw keys are never stored.

Epic 5: Dashboard UI

Description: The React SPA serving the cost attribution dashboard. Visualizes the AI spend treemap, routing rules editor, real-time ticker, and request inspector. This is the product's primary visual "Aha" moment.

User Stories

  • Story 5.1: As an engineering manager, I want to see a treemap of my organization's AI spend broken down by team, feature, and model, so that I can instantly identify the most expensive areas of my application.
  • Story 5.2: As an engineering manager, I want a real-time counter showing "You saved $X this week," so that I feel confident the tool is paying for itself.
  • Story 5.3: As a platform engineer, I want an interface to configure routing rules (e.g., drag-to-reorder priority), so that I can instruct the proxy without editing config files.
  • Story 5.4: As a platform engineer, I want a request inspector that displays metadata, cost, latency, and the specific routing decision for every request, so that I can debug why a certain model was chosen.

Acceptance Criteria

  • React + Vite SPA deployed as static assets to S3 + CloudFront.
  • Treemap visualization renders cost aggregations dynamically over selected time periods (7d/30d/90d).
  • A routing rules editor allows CRUD operations and priority reordering for a team's rules.
  • Request Inspector table displays paginated, filterable (feature, team, status) lists of telemetry without showing prompt content.
  • Allows an admin to securely input OpenAI and Anthropic API keys.

Estimate: 13 points

Dependencies: Epic 4 (Dashboard API)

Technical Notes:

  • Stack: React, TypeScript, Vite, Tailwind CSS.
  • No SSR required for V1 (keep it simple). Use react-query or similar for data fetching and caching.
  • Build the treemap with a charting library like D3 or Recharts.
  • Emphasize speed—data fetches should resolve from continuous aggregates in <200ms.

Epic 6: Shadow Audit CLI

Description: The PLG "Shadow Audit" command-line tool (npx dd0c-scan). It analyzes a local codebase for LLM API calls, estimates monthly cost based on prompt templates, and projects savings with dd0c/route.

User Stories

  • Story 6.1: As a developer, I want a zero-setup CLI tool (npx dd0c-scan) that scans my codebase and estimates how much money I'm currently wasting on overqualified LLMs, so that I can convince my manager to use dd0c/route.
  • Story 6.2: As an engineering manager, I want the CLI to run locally without sending my source code to a third party, so that I can securely audit my own projects.
  • Story 6.3: As an engineering manager, I want a clean, visually appealing terminal report showing "Top Opportunities" for model downgrades, so that I immediately see the value of routing.

Acceptance Criteria

  • Parses a local directory for OpenAI or Anthropic SDK usage in TypeScript/JavaScript/Python files.
  • Identifies the models requested in the code and estimates token usage heuristically based on the strings passed to the SDK.
  • Hits /api/v1/pricing/current to fetch the latest cost tables and calculates an estimated monthly bill and projected savings.
  • Outputs a formatted terminal report showing total potential savings and a breakdown of the highest-impact files.
  • Anonymized scan summary is sent to the server only if the user explicitly opts in.

Estimate: 8 points

Dependencies: Epic 4 (Dashboard API - Pricing Endpoint)

Technical Notes:

  • Stack: Node.js, commander, chalk, simple regex parsers for Python/JS SDKs.
  • Keep the CLI lightweight, fast, and as dependency-light as possible. No actual LLM parsing; use heuristics (string length/structure) for token estimates.
  • Must run completely offline if the pricing table is cached.
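The scan heuristic can be sketched as follows, shown in Rust for consistency with the other examples even though the shipped CLI is Node.js. The model-name list and the chars/4 token rule of thumb are illustrative assumptions:

```rust
// Sketch of the dd0c-scan heuristic: find known model identifiers in source
// text and estimate tokens. Model list and chars/4 rule are illustrative.
fn scan_source(src: &str) -> Vec<(String, usize)> {
    let known_models = ["gpt-4o-mini", "gpt-4o", "claude-3-5-sonnet", "claude-3-haiku"];
    let mut findings = Vec::new();
    for line in src.lines() {
        // First match wins per line; longest names are listed first so
        // "gpt-4o-mini" is not double-counted as "gpt-4o".
        if let Some(model) = known_models.iter().find(|m| line.contains(*m)) {
            // ~4 characters per token is a common rough estimate for English text.
            let est_tokens = line.len() / 4;
            findings.push((model.to_string(), est_tokens));
        }
    }
    findings
}

fn main() {
    let code = r#"
const r = await openai.chat.completions.create({ model: "gpt-4o", messages });
"#;
    let found = scan_source(code);
    assert_eq!(found.len(), 1);
    assert_eq!(found[0].0, "gpt-4o");
}
```

Everything runs over local text, which is what makes the "no source code leaves the machine" promise in Story 6.2 trivially auditable.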

Epic 7: Slack Integration

Description: The primary retention mechanism and anomaly alerting system. An asynchronous worker task dispatches weekly savings digests and threshold-based budget alerts to Slack and Email.

User Stories

  • Story 7.1: As an engineering manager, I want an automated weekly digest summarizing my team's AI savings, so that I can easily report to the CFO that our tooling investment is paying off.
  • Story 7.2: As a platform engineer, I want to configure a budget limit (e.g., alert if daily spend > $100) and receive a Slack webhook notification immediately, so that I can stop a retry storm before the bill gets out of hand.
  • Story 7.3: As an engineering manager, I want an email version of the weekly digest, so that I can forward it straight to my leadership team.

Acceptance Criteria

  • A standalone asynchronous worker (dd0c-worker) evaluates continuous aggregates (via TimescaleDB) every hour.
  • Generates a "Monday Morning Digest" email via AWS SES.
  • Emits Slack webhook payloads when a threshold alert is triggered (threshold_amount, threshold_pct).
  • Adds an X-DD0C-Signature header to outbound webhooks to prevent spoofing.

Estimate: 8 points

Dependencies: Epic 3 (Analytics Pipeline), Epic 4 (Dashboard API)

Technical Notes:

  • Stack: Rust (tokio-cron), reqwest (for webhooks), AWS SES.
  • The worker runs as a singleton container (one instance) alongside the proxy so scheduled jobs never execute twice concurrently.
  • Ensure alerts maintain state (using PostgreSQL alert_configs and last_fired_at) so users aren't spammed for the same incident.
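The last_fired_at de-duplication can be sketched with an in-memory map standing in for the PostgreSQL alert_configs table; the one-hour cooldown is an illustrative choice:

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Sketch of alert de-duplication: an alert fires at most once per cooldown
// window. The HashMap stands in for alert_configs.last_fired_at in PostgreSQL.
struct AlertState { last_fired_at: HashMap<String, Instant>, cooldown: Duration }

impl AlertState {
    fn should_fire(&mut self, alert_id: &str, now: Instant) -> bool {
        match self.last_fired_at.get(alert_id) {
            // Still inside the cooldown window: suppress the duplicate.
            Some(&t) if now.duration_since(t) < self.cooldown => false,
            _ => { self.last_fired_at.insert(alert_id.to_string(), now); true }
        }
    }
}

fn main() {
    let mut state = AlertState { last_fired_at: HashMap::new(), cooldown: Duration::from_secs(3600) };
    let t0 = Instant::now();
    assert!(state.should_fire("org1:daily_spend", t0));                                // first breach fires
    assert!(!state.should_fire("org1:daily_spend", t0 + Duration::from_secs(60)));     // suppressed
    assert!(state.should_fire("org1:daily_spend", t0 + Duration::from_secs(7200)));    // cooldown elapsed
}
```

Persisting last_fired_at in PostgreSQL (rather than in memory, as here) keeps the suppression state across worker restarts.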

Epic 8: Infrastructure & DevOps

Description: Containerized ECS Fargate deployment, AWS native networking, basic monitoring, and fully automated CI/CD for the entire dd0c stack. Essential for a solo founder to deploy safely and frequently.

User Stories

  • Story 8.1: As a solo founder, I want to use AWS ECS Fargate, so that I don't have to manage EC2 instances or worry about OS-level patching.
  • Story 8.2: As a solo founder, I want a GitHub Actions CI/CD pipeline, so that git push automatically runs tests, builds containers, and deploys rolling updates with zero downtime.
  • Story 8.3: As an operator, I want standard AWS CloudWatch alarms (e.g., P99 proxy latency > 50ms) connected to PagerDuty, so that I am only woken up when a critical threshold is breached.
  • Story 8.4: As a solo founder, I want a strict separation between my configuration (PostgreSQL) and telemetry (TimescaleDB) stores, so that I can scale analytics independently from org/auth state.

Acceptance Criteria

  • Full AWS infrastructure defined via CDK (TypeScript) or Terraform.
  • ALB routes /v1/* to the proxy container, /api/* to the dashboard API container.
  • Dashboard static assets deployed to an S3 bucket with CloudFront caching.
  • docker build produces three optimized images from a single Rust workspace (dd0c-proxy, dd0c-api, dd0c-worker).
  • CloudWatch dashboards and minimum alarms configured (CPU >80%, Proxy Error Rate >5%, ALB 5xx Rate).
  • git push main triggers a GitHub Action to test, lint, build, push to ECR, and update the ECS Fargate services.

Estimate: 13 points

Dependencies: Epic 1 (Proxy Engine), Epic 4 (Dashboard API)

Technical Notes:

  • Stack: AWS ECS Fargate, ALB, CloudFront, S3, RDS (PostgreSQL/TimescaleDB), ElastiCache (Redis), GitHub Actions.
  • Ensure the ALB utilizes path-based routing correctly and handles TLS termination.
  • For cost optimization on AWS, explore consolidating NAT Gateways or utilizing VPC Endpoints for S3/ECR/CloudWatch.

Epic 9: Onboarding & PLG

Description: Self-serve signup, free tier, API key management, and a getting-started flow that gets users routing their first LLM call through dd0c/route in under 2 minutes. This is the growth engine.

User Stories

  • Story 9.1: As a new user, I want to sign up with GitHub OAuth in one click, so that I can start using dd0c/route without filling out forms.
  • Story 9.2: As a new user, I want a free tier (up to $50/month in routed LLM spend), so that I can evaluate the product with real traffic before committing.
  • Story 9.3: As a developer, I want to generate and manage API keys from the dashboard, so that I can integrate dd0c/route into my applications.
  • Story 9.4: As a new user, I want a guided "First Route" onboarding flow that gives me a working curl command, so that I see cost savings within 2 minutes of signing up.
  • Story 9.5: As a team lead, I want to invite team members via email, so that my team can share a single org and see aggregated savings.

Acceptance Criteria

  • GitHub OAuth signup creates org + first API key automatically.
  • Free tier enforced at the proxy level — requests beyond $50/month routed spend return 429 with upgrade CTA.
  • API key CRUD: create, list, revoke, rotate. Keys are hashed at rest (SHA-256, matching Epic 4's storage scheme) and shown only once, at creation.
  • Onboarding wizard: 3 steps — (1) copy API key, (2) paste curl command, (3) see first request in dashboard. Completion rate tracked.
  • Team invite sends email with magic link. Invited user joins existing org on signup.
  • Stripe Checkout integration for upgrade from free → paid ($49/month base).

Estimate: 8 points

Dependencies: Epic 4 (Dashboard API), Epic 5 (Dashboard UI)

Technical Notes:

  • Use Stripe Checkout Sessions for payment — no custom billing UI needed for V1.
  • Free tier enforcement happens in the proxy hot path — must be O(1) lookup (Redis counter per org, reset monthly via cron).
  • Onboarding completion events tracked via PostHog or simple DB events for funnel analysis.
  • Magic link invites use signed JWTs with 72-hour expiry, stored in pending_invites table.
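The O(1) free-tier check can be sketched with a map keyed by org and billing month standing in for the Redis counter; the $50 cap comes from Story 9.2:

```rust
use std::collections::HashMap;

// Sketch of the free-tier gate on the proxy hot path. The HashMap stands in
// for a Redis counter keyed per (org, billing month); keying by month makes
// the "reset monthly" behavior implicit.
struct SpendTracker { spent_usd: HashMap<(String, String), f64>, cap_usd: f64 }

impl SpendTracker {
    /// Returns true if the request is allowed; the proxy would return 429 otherwise.
    fn record(&mut self, org: &str, month: &str, cost_usd: f64) -> bool {
        let entry = self.spent_usd.entry((org.to_string(), month.to_string())).or_insert(0.0);
        if *entry + cost_usd > self.cap_usd {
            return false; // over the free-tier cap: reject with upgrade CTA
        }
        *entry += cost_usd;
        true
    }
}

fn main() {
    let mut t = SpendTracker { spent_usd: HashMap::new(), cap_usd: 50.0 };
    assert!(t.record("acme", "2026-02", 49.0));
    assert!(!t.record("acme", "2026-02", 2.0));  // would exceed $50
    assert!(t.record("acme", "2026-03", 2.0));   // new month, fresh counter
}
```

With Redis, the same shape is an `INCRBYFLOAT` on a month-suffixed key with a TTL, keeping the hot-path cost to one O(1) command.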

Epic 10: Transparent Factory Compliance

Description: Cross-cutting epic ensuring dd0c/route adheres to the 5 Transparent Factory architectural tenets: Atomic Flagging, Elastic Schema, Cognitive Durability, Semantic Observability, and Configurable Autonomy. These stories are woven across the existing system — they don't add features, they add engineering discipline.

Story 10.1: Atomic Flagging — Feature Flag Infrastructure

As a solo founder, I want every new routing rule, cost threshold, and provider failover behavior wrapped in a feature flag (default: off), so that I can deploy code continuously without risking production traffic.

Acceptance Criteria:

  • OpenFeature SDK integrated into the Rust proxy via a compatible provider (e.g., flagd sidecar or env-based provider for V1).
  • All flags evaluate locally (in-memory or sidecar) — zero network calls on the hot path.
  • Every flag has an owner field and a ttl (max 14 days). CI blocks deployment if any flag exceeds its TTL at 100% rollout.
  • Automated circuit breaker: if a flagged code path increases P99 latency by >5% or error rate >2%, the flag auto-disables within 30 seconds.
  • Flags exist for: model routing strategies, complexity classifier thresholds, provider failover chains, new dashboard features.

Estimate: 5 points Dependencies: Epic 1 (Proxy Engine), Epic 2 (Router Brain) Technical Notes:

  • Use OpenFeature Rust SDK. For V1, a simple JSON file or env-var provider is fine — no LaunchDarkly needed.
  • Circuit breaker integration: extend the existing Redis-backed circuit breaker to also flip flags.
  • Flag cleanup: add a make flag-audit target that lists expired flags.

Story 10.2: Elastic Schema — Additive-Only Migration Discipline

As a solo founder, I want all TimescaleDB and Redis schema changes to be strictly additive, so that I can roll back any deployment instantly without data loss or broken readers.

Acceptance Criteria:

  • CI lint step rejects any migration containing DROP, ALTER ... TYPE, or RENAME on existing columns.
  • New fields use _v2 suffix or a new table when breaking changes are unavoidable.
  • All Rust structs omit #[serde(deny_unknown_fields)] (serde's default), so V1 code silently ignores V2 fields.
  • Dual-write pattern documented and enforced: during migration windows, the API writes to both old and new schema targets within the same DB transaction.
  • Every migration file includes a sunset_date comment (max 30 days). A CI check warns if any migration is past sunset without cleanup.

Estimate: 3 points Dependencies: Epic 3 (Analytics Pipeline) Technical Notes:

  • Use sqlx migration files. Add a pre-commit hook or CI step that greps for forbidden DDL keywords.
  • Redis key schema: version keys with prefix (e.g., route:v1:config, route:v2:config). Never rename keys.
  • For the request_events hypertable, new columns are always NULLABLE with defaults.
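The CI lint from the acceptance criteria can be sketched as a keyword scan over migration SQL. The crude substring check is an acknowledged simplification; a real lint would parse statements to avoid false positives inside string literals or comments:

```rust
// Sketch of the additive-only migration lint: flag destructive DDL keywords.
// "ALTER COLUMN" catches `ALTER TABLE ... ALTER COLUMN ... TYPE` changes
// while leaving additive `ADD COLUMN` migrations alone.
fn forbidden_ddl(sql: &str) -> Vec<&'static str> {
    let upper = sql.to_uppercase();
    let mut hits = Vec::new();
    for kw in ["DROP ", "ALTER COLUMN", "RENAME "] {
        if upper.contains(kw) { hits.push(kw); }
    }
    hits
}

fn main() {
    assert_eq!(forbidden_ddl("ALTER TABLE request_events DROP COLUMN cost;"), vec!["DROP "]);
    assert_eq!(forbidden_ddl("ALTER TABLE x ALTER COLUMN y TYPE BIGINT;"), vec!["ALTER COLUMN"]);
    assert!(forbidden_ddl("ALTER TABLE request_events ADD COLUMN cost_v2 NUMERIC NULL;").is_empty());
}
```

Running this over every file in the migrations directory, and failing CI on any non-empty result, is the whole enforcement mechanism.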

Story 10.3: Cognitive Durability — Decision Logs for Routing Logic

As a future maintainer (or future me), I want every change to routing algorithms, cost models, or provider selection logic accompanied by a decision_log.json, so that I can understand why a decision was made months later in under 60 seconds.

Acceptance Criteria:

  • decision_log.json schema defined: { prompt, reasoning, alternatives_considered, confidence, timestamp, author }.
  • CI requires a decision_log.json entry for any PR touching src/router/, src/cost/, or migration files.
  • A cognitive complexity cap of 10 is enforced via cargo clippy (clippy::cognitive_complexity). PRs exceeding it are blocked.
  • Decision logs are committed alongside code in a docs/decisions/ directory, one file per significant change.

Estimate: 2 points Dependencies: None Technical Notes:

  • Use a PR template that prompts for the decision log fields.
  • For the complexity cap, enable -W clippy::cognitive_complexity and set cognitive-complexity-threshold = 10 in clippy.toml.
  • Decision logs for cost table updates should include: source of pricing data, comparison with previous rates, expected savings impact.

Story 10.4: Semantic Observability — AI Reasoning Spans on Routing Decisions

As a platform engineer debugging a misrouted request, I want every proxy routing decision to emit an OpenTelemetry span with structured AI reasoning metadata, so that I can trace exactly which model was chosen, why, and what alternatives were rejected.

Acceptance Criteria:

  • Every /v1/chat/completions request generates an ai_routing_decision span as a child of the request trace.
  • Span attributes include: ai.model_selected, ai.model_alternatives (JSON array of rejected models + reasons), ai.cost_delta (savings vs. default), ai.complexity_score, ai.routing_strategy (passthrough/cheapest/quality-first/cascading).
  • ai.prompt_hash (SHA-256 of first 500 chars of system prompt) included for correlation — never raw prompt content.
  • Spans export to any OTLP-compatible backend (Grafana Cloud, Jaeger, etc.).
  • No PII in any span attribute. Prompt content is hashed, not logged.

Estimate: 3 points Dependencies: Epic 1 (Proxy Engine), Epic 2 (Router Brain) Technical Notes:

  • Use tracing + opentelemetry-rust crate with OTLP exporter.
  • The span should be created inside the router decision function, not as middleware — it needs access to the alternatives list.
  • For V1, export to stdout in OTLP JSON format. Production: OTLP gRPC to a collector.
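The span attributes above can be sketched as plain key/value pairs. Note two stand-ins: the shipped code would use the tracing/opentelemetry stack rather than a Vec, and std's DefaultHasher replaces SHA-256 here only because std has no SHA-256 (production would use the sha2 crate):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// DefaultHasher stands in for SHA-256 in this sketch; production uses sha2.
fn prompt_hash(system_prompt: &str) -> String {
    let prefix: String = system_prompt.chars().take(500).collect(); // first 500 chars only
    let mut h = DefaultHasher::new();
    prefix.hash(&mut h);
    format!("{:016x}", h.finish())
}

// Sketch of the ai_routing_decision span attributes as key/value pairs.
fn routing_span_attrs(selected: &str, rejected: &[(&str, &str)], cost_delta: f64,
                      complexity: f64, strategy: &str, system_prompt: &str) -> Vec<(String, String)> {
    let alts = rejected.iter()
        .map(|(m, why)| format!("{{\"model\":\"{m}\",\"reason\":\"{why}\"}}"))
        .collect::<Vec<_>>().join(",");
    vec![
        ("ai.model_selected".into(), selected.into()),
        ("ai.model_alternatives".into(), format!("[{alts}]")),
        ("ai.cost_delta".into(), format!("{cost_delta:.6}")),
        ("ai.complexity_score".into(), format!("{complexity:.2}")),
        ("ai.routing_strategy".into(), strategy.into()),
        ("ai.prompt_hash".into(), prompt_hash(system_prompt)), // never the raw prompt
    ]
}

fn main() {
    let attrs = routing_span_attrs("small-model", &[("big-model", "simple task")],
                                   0.0120, 0.21, "cheapest", "Extract the date.");
    assert_eq!(attrs[0].1, "small-model");
    assert!(attrs.iter().all(|(_, v)| !v.contains("Extract"))); // no raw prompt content leaks
}
```

Building the attribute set inside the router decision function, as the note above requires, is what gives it access to the rejected-alternatives list.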

Story 10.5: Configurable Autonomy — Governance Policy for Automated Routing

As a solo founder, I want a policy.json governance file that controls what the system is allowed to do autonomously (e.g., switch models, update cost tables, add providers), so that I maintain human oversight as the system grows.

Acceptance Criteria:

  • policy.json defines governance_mode: strict (all changes require manual approval) or audit (changes auto-apply but are logged).
  • The proxy checks governance_mode before applying any runtime config change (routing rule update, cost table refresh, provider addition).
  • panic_mode flag: when set to true, the proxy freezes all routing rules to their last-known-good state, disables auto-failover, and routes everything to a single hardcoded provider.
  • Governance drift monitoring: a weekly cron job logs the ratio of auto-applied vs. manually-approved changes. If auto-applied changes exceed 80% in strict mode, an alert fires.
  • All policy check decisions logged: "Allowed by audit mode", "Blocked by strict mode", "Panic mode active — frozen".

Estimate: 3 points Dependencies: Epic 2 (Router Brain) Technical Notes:

  • policy.json lives in the repo root and is loaded at startup + watched for changes via notify crate.
  • For V1 as a solo founder, start in audit mode. strict mode is for when you hire or add AI agents to the pipeline.
  • Panic mode should be triggerable via a single API call (POST /admin/panic) or by setting an env var — whichever is faster in an emergency.
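The governance check can be sketched as a pure function over the parsed policy; the log strings mirror the acceptance criteria above, and the struct fields mirror policy.json:

```rust
// Sketch of the governance gate: panic_mode overrides everything, then the
// mode decides whether a runtime config change may auto-apply.
#[derive(Clone, Copy, PartialEq)]
enum GovernanceMode { Strict, Audit }

struct Policy { mode: GovernanceMode, panic_mode: bool }

fn check_change(policy: &Policy, change: &str, manually_approved: bool) -> (bool, String) {
    if policy.panic_mode {
        return (false, format!("Panic mode active — frozen: {change}"));
    }
    match policy.mode {
        GovernanceMode::Audit => (true, format!("Allowed by audit mode: {change}")),
        GovernanceMode::Strict if manually_approved => (true, format!("Manually approved: {change}")),
        GovernanceMode::Strict => (false, format!("Blocked by strict mode: {change}")),
    }
}

fn main() {
    let audit = Policy { mode: GovernanceMode::Audit, panic_mode: false };
    assert!(check_change(&audit, "cost table refresh", false).0);

    let strict = Policy { mode: GovernanceMode::Strict, panic_mode: false };
    assert!(!check_change(&strict, "add provider", false).0);

    let frozen = Policy { mode: GovernanceMode::Audit, panic_mode: true };
    assert!(!check_change(&frozen, "routing rule update", true).0);
}
```

Logging the returned string on every call gives the audit trail the criteria ask for with no extra machinery.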

Epic 10 Summary

Story   Tenet                     Points
10.1    Atomic Flagging                5
10.2    Elastic Schema                 3
10.3    Cognitive Durability           2
10.4    Semantic Observability         3
10.5    Configurable Autonomy          3
Total                                 16