dd0c Platform — PLG Instrumentation Brainstorm
Session: Carson (Brainstorming Coach) — Cross-Product PLG Analytics
Date: March 1, 2026
Scope: All 6 dd0c products
The Problem
We built 6 products with onboarding flows, free tiers, and Stripe billing — but zero product analytics. We can't answer:
- How many users hit "aha moment" vs. bounce?
- Where in the funnel do free users drop off before upgrading?
- Which features drive retention vs. which are ignored?
- Are users churning because of alert fatigue, false positives, or just not getting value?
- What's our time-to-first-value per product?
Without instrumentation, PLG iteration is guesswork.
Brainstorm: What to Instrument
1. Unified Event Taxonomy
Every dd0c product shares a common event naming convention:
<domain>.<object>.<action>
Examples:
account.signup.completed
account.aws.connected
anomaly.alert.sent
anomaly.alert.snoozed
slack.bot.installed
billing.checkout.started
billing.upgrade.completed
feature.flag.evaluated
Rules:
- Past tense for completed actions (`completed`, `sent`, `clicked`)
- Present tense for state changes (`active`, `learning`, `paused`)
- Always include `tenant_id`, `timestamp`, `product` (route/drift/alert/portal/cost/run)
- Never include PII — hash emails and account IDs
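A minimal sketch of how the shared SDK could enforce these rules at emit time. The names `validate_event` and `hash_pii` are illustrative, not part of any shipped SDK, and the strict three-segment check would flag a few of the events listed later (e.g. `alert.ingested`, `routing.shadow.audit.run`), which is exactly the kind of drift a validator catches:

```python
import hashlib
import re

# Hypothetical SDK-side guards for the <domain>.<object>.<action> taxonomy.
EVENT_NAME = re.compile(r"^[a-z][a-z_]*\.[a-z][a-z_]*\.[a-z][a-z_]*$")
REQUIRED_PROPS = {"tenant_id", "timestamp", "product"}
PRODUCTS = {"route", "drift", "alert", "portal", "cost", "run"}

def hash_pii(value: str) -> str:
    """Hash emails/account IDs so raw PII never reaches the analytics store."""
    return hashlib.sha256(value.encode()).hexdigest()[:16]

def validate_event(name: str, props: dict) -> list:
    """Return a list of taxonomy violations; an empty list means valid."""
    errors = []
    if not EVENT_NAME.match(name):
        errors.append("name must be <domain>.<object>.<action> in snake_case")
    missing = REQUIRED_PROPS - props.keys()
    if missing:
        errors.append("missing required props: %s" % sorted(missing))
    if props.get("product") not in PRODUCTS:
        errors.append("product must be one of %s" % sorted(PRODUCTS))
    if "email" in props:
        errors.append("raw email in props; hash it with hash_pii first")
    return errors
```

Running the validator in CI against every `capture` call site keeps the taxonomy from eroding as products are added.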
2. Per-Product Activation Metrics
The "aha moment" is different for each product:
| Product | Aha Moment | Metric | Target |
|---|---|---|---|
| dd0c/route | First dollar saved by model routing | `routing.savings.first_dollar` | <24hr from signup |
| dd0c/drift | First drift detected in real stack | `drift.detection.first_found` | <1hr from agent install |
| dd0c/alert | First alert correlated (not just forwarded) | `alert.correlation.first_match` | <60sec from first alert |
| dd0c/portal | First service auto-discovered | `portal.discovery.first_service` | <5min from install |
| dd0c/cost | First anomaly detected in real account | `cost.anomaly.first_detected` | <24hr from AWS connect |
| dd0c/run | First runbook executed successfully | `run.execution.first_success` | <10min from setup |
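Activation tracking then reduces to "did this tenant fire its aha event within the target window, exactly once?" A sketch (the `ActivationTracker` class is hypothetical; for simplicity it measures every target from a single funnel-start timestamp, whereas the table anchors some targets on install/connect rather than signup):

```python
from datetime import datetime, timedelta, timezone

# Target windows from the activation table above.
AHA_TARGETS = {
    "routing.savings.first_dollar": timedelta(hours=24),
    "drift.detection.first_found": timedelta(hours=1),
    "alert.correlation.first_match": timedelta(seconds=60),
    "portal.discovery.first_service": timedelta(minutes=5),
    "cost.anomaly.first_detected": timedelta(hours=24),
    "run.execution.first_success": timedelta(minutes=10),
}

class ActivationTracker:
    def __init__(self):
        self._start_at = {}    # tenant_id -> funnel-start datetime
        self._activated = {}   # (tenant_id, event) -> hit-target flag

    def record_start(self, tenant_id, at):
        self._start_at[tenant_id] = at

    def record_event(self, tenant_id, event, at):
        """True only the FIRST time an aha event fires within its target."""
        if event not in AHA_TARGETS or (tenant_id, event) in self._activated:
            return False
        start = self._start_at.get(tenant_id)
        on_time = start is not None and at - start <= AHA_TARGETS[event]
        self._activated[(tenant_id, event)] = on_time
        return on_time
```

Deduplicating at the tracker keeps `*.first_*` events idempotent even if the underlying product emits the raw event repeatedly.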
3. Conversion Funnel (Universal)
Every product shares this funnel shape:
Signup → Connect (AWS/Slack/Git) → First Value → Habit → Upgrade
Events per stage:
Stage 1: Signup
- `account.signup.started` — landed on signup page
- `account.signup.completed` — account created
- `account.signup.method` — github_sso / google_sso / email
Stage 2: Connect
- `account.integration.started` — began connecting external service
- `account.integration.completed` — connection verified
- `account.integration.failed` — connection failed (include `error_type`)
- Product-specific: `account.aws.connected`, `account.slack.installed`, `account.git.connected`
Stage 3: First Value
- Product-specific aha moment event (see table above)
- `onboarding.wizard.step_completed` — which step, how long
- `onboarding.wizard.abandoned` — which step they quit on
Stage 4: Habit
- `session.daily.active` — DAU ping
- `session.weekly.active` — WAU ping
- `feature.<name>.used` — per-feature usage
- `notification.digest.opened` — are they reading digests?
- `slack.command.used` — which slash commands, how often
Stage 5: Upgrade
- `billing.checkout.started`
- `billing.checkout.completed`
- `billing.checkout.abandoned`
- `billing.plan.changed` — upgrade/downgrade
- `billing.churn.detected` — subscription cancelled
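With one completion event per stage, funnel conversion is a set intersection over tenants. A sketch, assuming each stage is keyed by a single representative event (in practice the Connect and First Value events vary per product):

```python
from collections import defaultdict

# One representative completion event per funnel stage (illustrative picks).
STAGE_EVENTS = [
    ("signup",      "account.signup.completed"),
    ("connect",     "account.integration.completed"),
    ("first_value", "cost.anomaly.first_detected"),
    ("habit",       "session.weekly.active"),
    ("upgrade",     "billing.checkout.completed"),
]

def funnel_conversion(events):
    """events: iterable of (tenant_id, event_name) pairs.
    Returns {stage: count of tenants who reached it AND every earlier stage}."""
    seen = defaultdict(set)  # event_name -> tenant_ids that fired it
    for tenant_id, name in events:
        seen[name].add(tenant_id)
    reached, cohort = {}, None
    for stage, event in STAGE_EVENTS:
        cohort = seen[event] if cohort is None else cohort & seen[event]
        reached[stage] = len(cohort)
    return reached
```

Intersecting cohorts stage by stage (rather than counting each event independently) means a tenant who upgraded without ever hitting first value shows up as a funnel anomaly instead of inflating conversion.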
4. Feature Usage Events (Per Product)
dd0c/route (LLM Cost Router)
- `routing.request.processed` — model selected, latency, cost
- `routing.override.manual` — user forced a specific model
- `routing.savings.calculated` — weekly savings digest generated
- `routing.shadow.audit.run` — shadow mode comparison completed
- `dashboard.cost.viewed` — opened cost dashboard
dd0c/drift (IaC Drift Detection)
- `drift.scan.completed` — scan finished, count of drifts found
- `drift.remediation.clicked` — user clicked "fix drift"
- `drift.remediation.applied` — drift actually fixed
- `drift.false_positive.marked` — user dismissed a drift
- `drift.agent.heartbeat` — agent is alive and scanning
dd0c/alert (Alert Intelligence)
- `alert.ingested` — raw alert received
- `alert.correlated` — alerts grouped into incident
- `alert.suppressed` — duplicate/noise suppressed
- `alert.escalated` — sent to on-call
- `alert.feedback.helpful` / `alert.feedback.noise` — user feedback
- `alert.mttr.measured` — time from alert to resolution
dd0c/portal (Lightweight IDP)
- `portal.service.discovered` — auto-discovery found a service
- `portal.service.claimed` — team claimed ownership
- `portal.scorecard.viewed` — someone checked service health
- `portal.scorecard.action_taken` — acted on a recommendation
- `portal.search.performed` — searched the catalog
dd0c/cost (AWS Cost Anomaly)
- `cost.event.ingested` — CloudTrail event processed
- `cost.anomaly.scored` — anomaly scoring completed
- `cost.anomaly.alerted` — Slack alert sent
- `cost.anomaly.snoozed` — user snoozed alert
- `cost.anomaly.expected` — user marked as expected
- `cost.remediation.clicked` — user clicked Stop/Terminate
- `cost.remediation.executed` — remediation completed
- `cost.zombie.detected` — idle resource found
- `cost.digest.sent` — daily digest delivered
dd0c/run (Runbook Automation)
- `run.runbook.created` — new runbook authored
- `run.execution.started` — runbook execution began
- `run.execution.completed` — execution finished (include `success`/`failed`)
- `run.execution.approval_requested` — human approval needed
- `run.execution.approval_granted` — human approved
- `run.execution.rolled_back` — rollback triggered
- `run.sandbox.test.run` — dry-run in sandbox
5. Health Scoring (Churn Prediction)
Composite health score per tenant, updated daily:
health_score = (
0.3 * activation_complete + // did they hit aha moment?
0.2 * weekly_active_days + // how many days active this week?
0.2 * feature_breadth + // how many features used?
0.15 * integration_depth + // how many integrations connected?
0.15 * feedback_sentiment // positive vs negative actions
)
Thresholds:
- `health > 0.7` → Healthy (green)
- `health 0.4–0.7` → At Risk (yellow) → trigger re-engagement email
- `health < 0.4` → Churning (red) → trigger founder outreach
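The formula and thresholds translate directly into the daily scoring cron. A sketch, assuming each component has already been normalized to the 0–1 range upstream:

```python
# Weights from the health_score formula above.
WEIGHTS = {
    "activation_complete": 0.3,
    "weekly_active_days": 0.2,
    "feature_breadth": 0.2,
    "integration_depth": 0.15,
    "feedback_sentiment": 0.15,
}

def health_score(components: dict) -> float:
    """Weighted sum of the five components, each normalized to [0, 1]."""
    return sum(WEIGHTS[k] * components.get(k, 0.0) for k in WEIGHTS)

def health_bucket(score: float) -> str:
    """Map a score to the green/yellow/red action thresholds."""
    if score > 0.7:
        return "healthy"    # green
    if score >= 0.4:
        return "at_risk"    # yellow -> re-engagement email
    return "churning"       # red -> founder outreach
```

Missing components default to 0.0, so a tenant with no data scores as churning rather than silently healthy — arguably the safer failure mode for outreach.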
6. Analytics Stack Recommendation
PostHog (self-hosted on AWS):
- Open source, self-hostable → no vendor lock-in
- Free tier: unlimited events self-hosted
- Built-in: funnels, retention, feature flags, session replay
- Supports custom events via REST API or JS/Python SDK
- Can run on a single t3.medium for V1 traffic
Why not Segment/Amplitude/Mixpanel:
- Segment: $120/mo minimum, overkill for solo founder
- Amplitude: free tier is generous but cloud-only, data leaves your infra
- Mixpanel: same cloud-only concern
- PostHog self-hosted: $0/mo, data stays in your AWS account, GDPR-friendly
Integration pattern:
Lambda/API → PostHog REST API (async, fire-and-forget)
Next.js UI → PostHog JS SDK (auto-captures pageviews, clicks)
Slack Bot → PostHog Python SDK (command usage, action clicks)
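For the Lambda/API path, a minimal sketch of the fire-and-forget pattern, posting directly to PostHog's `/capture/` HTTP endpoint (the host URL is hypothetical, and the payload shape should be checked against the PostHog API docs for your version):

```python
import json
import time
import urllib.request

POSTHOG_HOST = "https://posthog.internal.example"  # hypothetical self-hosted URL

def build_capture_payload(api_key, tenant_id, event, properties):
    """Shape a single event for PostHog's /capture/ endpoint.
    tenant_id doubles as distinct_id; PII must already be hashed upstream."""
    return {
        "api_key": api_key,
        "event": event,
        "distinct_id": tenant_id,
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "properties": {"tenant_id": tenant_id, **properties},
    }

def capture_fire_and_forget(payload, timeout=0.5):
    """Best-effort send: a short timeout plus swallowed errors keeps the
    Lambda handler's latency bounded even when PostHog is unreachable."""
    req = urllib.request.Request(
        f"{POSTHOG_HOST}/capture/",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    try:
        urllib.request.urlopen(req, timeout=timeout)
    except Exception:
        pass  # analytics must never fail the request path
```

In a long-lived process you would batch and flush instead; in Lambda, swallowing errors is the price of never letting instrumentation break a customer-facing request.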
7. Cross-Product Flywheel Metrics
dd0c is a platform — users on one product should discover others:
- `platform.cross_sell.impression` — "Try dd0c/alert" banner shown
- `platform.cross_sell.clicked` — user clicked cross-sell
- `platform.cross_sell.activated` — user activated second product
- `platform.products.active_count` — how many dd0c products per tenant
Flywheel hypothesis: Users who activate 2+ dd0c products have 3x lower churn than single-product users. We need data to prove/disprove this.
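Testing the flywheel hypothesis only needs a cohort split on active product count. A sketch (the function and cohort labels are hypothetical):

```python
from collections import defaultdict

def churn_by_product_count(tenants):
    """tenants: iterable of (active_product_count, churned) pairs.
    Returns churn rate for single-product vs multi-product cohorts."""
    totals = defaultdict(lambda: [0, 0])  # cohort -> [churned, total]
    for count, churned in tenants:
        cohort = "multi" if count >= 2 else "single"
        totals[cohort][0] += int(churned)
        totals[cohort][1] += 1
    return {c: churned / total for c, (churned, total) in totals.items()}
```

If `single` churn is roughly 3x `multi` churn, the hypothesis holds; either way, the cross-sell events above supply the inputs.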
Epic 11 Proposal: PLG Instrumentation
Scope
Cross-cutting epic added to all 6 products. Shared analytics SDK, per-product event implementations, funnel dashboards, health scoring.
Stories (Draft)
- PostHog Infrastructure — CDK stack for self-hosted PostHog on ECS Fargate
- Analytics SDK — Shared TypeScript/Python wrapper with standard event schema
- Funnel Dashboard — PostHog dashboard template per product
- Activation Tracking — Per-product aha moment detection and logging
- Health Scoring Engine — Daily cron that computes tenant health scores
- Cross-Sell Instrumentation — Platform-level cross-product discovery events
- Churn Alert Pipeline — Health score → Slack alert to founder when tenant goes red
Estimate
~25 story points across all products (shared infrastructure + per-product event wiring)
This brainstorm establishes the "what" and "why." Party Mode advisory board should stress-test: Is PostHog the right choice? Is the event taxonomy too granular? Should health scoring be V1 or V2? Is 25 points realistic?