# dd0c Platform — PLG Instrumentation Brainstorm

**Session:** Carson (Brainstorming Coach) — Cross-Product PLG Analytics
**Date:** March 1, 2026
**Scope:** All 6 dd0c products

---

## The Problem

We built 6 products with onboarding flows, free tiers, and Stripe billing — but zero product analytics. We can't answer:

- How many users hit the "aha moment" vs. bounce?
- Where in the funnel do free users drop off before upgrading?
- Which features drive retention vs. which are ignored?
- Are users churning because of alert fatigue, false positives, or just not getting value?
- What's our time-to-first-value per product?

Without instrumentation, PLG iteration is guesswork.

---

## Brainstorm: What to Instrument

### 1. Unified Event Taxonomy

Every dd0c product shares a common event naming convention:

```
<domain>.<object>.<action>

Examples:
account.signup.completed
account.aws.connected
anomaly.alert.sent
anomaly.alert.snoozed
slack.bot.installed
billing.checkout.started
billing.upgrade.completed
feature.flag.evaluated
```

**Rules:**
- Past tense for completed actions (`completed`, `sent`, `clicked`)
- State adjectives for ongoing states (`active`, `learning`, `paused`)
- Always include `tenant_id`, `timestamp`, and `product` (`route`/`drift`/`alert`/`portal`/`cost`/`run`)
- Never include PII: hash emails and account IDs before sending
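
The naming rules above could be enforced at the point where events are built. The following is a minimal Python sketch, not the platform's actual SDK: `build_event`, `hash_identifier`, the regular expression, and the `salt` parameter are hypothetical names introduced for illustration, and the sketch assumes every name has exactly three lowercase segments.

```python
import hashlib
import re
from datetime import datetime, timezone
from typing import Dict, Optional

# Assumption: exactly three dot-separated segments of lowercase letters,
# digits, and underscores, matching <domain>.<object>.<action>.
EVENT_NAME = re.compile(r"^[a-z][a-z0-9_]*(\.[a-z][a-z0-9_]*){2}$")
PRODUCTS = {"route", "drift", "alert", "portal", "cost", "run"}


def hash_identifier(value: str, salt: str) -> str:
    """Return a salted SHA-256 hex digest so raw emails or IDs are never sent."""
    normalized = value.strip().lower()
    return hashlib.sha256((salt + normalized).encode("utf-8")).hexdigest()


def build_event(name: str, tenant_id: str, product: str,
                properties: Optional[Dict[str, object]] = None) -> Dict[str, object]:
    """Validate the event name and product, then attach the required fields."""
    if not EVENT_NAME.match(name):
        raise ValueError(f"event name {name!r} is not <domain>.<object>.<action>")
    if product not in PRODUCTS:
        raise ValueError(f"unknown product {product!r}")
    return {
        "event": name,
        "tenant_id": tenant_id,
        "product": product,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "properties": dict(properties or {}),
    }
```

Under this strict check, two-segment names such as `alert.ingested` would be rejected, so either the pattern or those names would need adjusting before a lint like this is switched on.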

### 2. Per-Product Activation Metrics

The "aha moment" is different for each product:

| Product | Aha Moment | Metric | Target |
|---------|-----------|--------|--------|
| dd0c/route | First dollar saved by model routing | `routing.savings.first_dollar` | <24hr from signup |
| dd0c/drift | First drift detected in real stack | `drift.detection.first_found` | <1hr from agent install |
| dd0c/alert | First alert correlated (not just forwarded) | `alert.correlation.first_match` | <60sec from first alert |
| dd0c/portal | First service auto-discovered | `portal.discovery.first_service` | <5min from install |
| dd0c/cost | First anomaly detected in real account | `cost.anomaly.first_detected` | <24hr from AWS connect |
| dd0c/run | First runbook executed successfully | `run.execution.first_success` | <10min from setup |
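
The targets in the table can be checked mechanically once the start and aha timestamps are known for a tenant. This is a rough sketch under stated assumptions: the metric names and time budgets come from the table, while `ACTIVATION_TARGETS` and `met_target` are hypothetical helpers, and the caller is assumed to supply the right start timestamp (signup, agent install, first alert, and so on) for each product.

```python
from datetime import datetime, timedelta
from typing import Optional

# Time budgets copied from the activation table; keyed by aha-moment metric.
ACTIVATION_TARGETS = {
    "routing.savings.first_dollar": timedelta(hours=24),
    "drift.detection.first_found": timedelta(hours=1),
    "alert.correlation.first_match": timedelta(seconds=60),
    "portal.discovery.first_service": timedelta(minutes=5),
    "cost.anomaly.first_detected": timedelta(hours=24),
    "run.execution.first_success": timedelta(minutes=10),
}


def met_target(metric: str, start: datetime, aha: Optional[datetime]) -> bool:
    """True if the aha moment happened strictly inside the time budget."""
    if aha is None:  # tenant never reached first value
        return False
    elapsed = aha - start
    return timedelta(0) <= elapsed < ACTIVATION_TARGETS[metric]
```

The comparison is strict because every target in the table is written as "less than" a duration.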

### 3. Conversion Funnel (Universal)

Every product shares this funnel shape:

```
Signup → Connect (AWS/Slack/Git) → First Value → Habit → Upgrade
```

Events per stage:

**Stage 1: Signup**
- `account.signup.started` — landed on signup page
- `account.signup.completed` — account created
- `account.signup.method` — github_sso / google_sso / email

**Stage 2: Connect**
- `account.integration.started` — began connecting external service
- `account.integration.completed` — connection verified
- `account.integration.failed` — connection failed (include `error_type`)
- Product-specific: `account.aws.connected`, `account.slack.installed`, `account.git.connected`

**Stage 3: First Value**
- Product-specific aha moment event (see table above)
- `onboarding.wizard.step_completed` — which step, how long
- `onboarding.wizard.abandoned` — which step they quit on

**Stage 4: Habit**
- `session.daily.active` — DAU ping
- `session.weekly.active` — WAU ping
- `feature.<name>.used` — per-feature usage
- `notification.digest.opened` — are they reading digests?
- `slack.command.used` — which slash commands, how often

**Stage 5: Upgrade**
- `billing.checkout.started`
- `billing.checkout.completed`
- `billing.checkout.abandoned`
- `billing.plan.changed` — upgrade/downgrade
- `billing.churn.detected` — subscription cancelled
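
To make the funnel concrete, the sketch below counts how many tenants reached each stage and the step-to-step conversion rate. It is illustrative only: `funnel_counts` and `step_conversion` are hypothetical helpers, one "completion" event per stage is chosen by the caller (treating a single `session.weekly.active` ping as the Habit stage is an assumption), and event ordering in time is ignored for brevity.

```python
from typing import Dict, Iterable, List, Set, Tuple


def funnel_counts(events: Iterable[Tuple[str, str]], stage_events: List[str]) -> List[int]:
    """Count tenants that emitted the event for a stage and for every earlier stage.

    `events` is an iterable of (tenant_id, event_name) pairs.
    """
    seen: Dict[str, Set[str]] = {}
    for tenant_id, name in events:
        seen.setdefault(tenant_id, set()).add(name)

    counts = []
    for i in range(len(stage_events)):
        required = set(stage_events[: i + 1])
        counts.append(sum(1 for names in seen.values() if required <= names))
    return counts


def step_conversion(counts: List[int]) -> List[float]:
    """Conversion rate from each stage to the next (0.0 when a stage is empty)."""
    return [(after / before if before else 0.0)
            for before, after in zip(counts, counts[1:])]
```

A production version would likely also require each stage's event to occur after the previous stage's, which PostHog's built-in funnels handle when the same events are sent there.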

### 4. Feature Usage Events (Per Product)

**dd0c/route (LLM Cost Router)**
- `routing.request.processed` — model selected, latency, cost
- `routing.override.manual` — user forced a specific model
- `routing.savings.calculated` — weekly savings digest generated
- `routing.shadow.audit.run` — shadow mode comparison completed
- `dashboard.cost.viewed` — opened cost dashboard

**dd0c/drift (IaC Drift Detection)**
- `drift.scan.completed` — scan finished, drifts found count
- `drift.remediation.clicked` — user clicked "fix drift"
- `drift.remediation.applied` — drift actually fixed
- `drift.false_positive.marked` — user dismissed a drift
- `drift.agent.heartbeat` — agent is alive and scanning

**dd0c/alert (Alert Intelligence)**
- `alert.ingested` — raw alert received
- `alert.correlated` — alerts grouped into incident
- `alert.suppressed` — duplicate/noise suppressed
- `alert.escalated` — sent to on-call
- `alert.feedback.helpful` / `alert.feedback.noise` — user feedback
- `alert.mttr.measured` — time from alert to resolution

**dd0c/portal (Lightweight IDP)**
- `portal.service.discovered` — auto-discovery found a service
- `portal.service.claimed` — team claimed ownership
- `portal.scorecard.viewed` — someone checked service health
- `portal.scorecard.action_taken` — acted on a recommendation
- `portal.search.performed` — searched the catalog

**dd0c/cost (AWS Cost Anomaly)**
- `cost.event.ingested` — CloudTrail event processed
- `cost.anomaly.scored` — anomaly scoring completed
- `cost.anomaly.alerted` — Slack alert sent
- `cost.anomaly.snoozed` — user snoozed alert
- `cost.anomaly.expected` — user marked as expected
- `cost.remediation.clicked` — user clicked Stop/Terminate
- `cost.remediation.executed` — remediation completed
- `cost.zombie.detected` — idle resource found
- `cost.digest.sent` — daily digest delivered

**dd0c/run (Runbook Automation)**
- `run.runbook.created` — new runbook authored
- `run.execution.started` — runbook execution began
- `run.execution.completed` — execution finished (include `success`/`failed`)
- `run.execution.approval_requested` — human approval needed
- `run.execution.approval_granted` — human approved
- `run.execution.rolled_back` — rollback triggered
- `run.sandbox.test.run` — dry-run in sandbox

### 5. Health Scoring (Churn Prediction)

Composite health score per tenant, updated daily. Each input needs to be normalized to the range 0 to 1 (for example, days active divided by 7); because the weights sum to 1.0, the score then also falls between 0 and 1, which the thresholds assume:

```
health_score = (
    0.30 * activation_complete +   # did they hit the aha moment?
    0.20 * weekly_active_days +    # how many days active this week?
    0.20 * feature_breadth +       # how many features used?
    0.15 * integration_depth +     # how many integrations connected?
    0.15 * feedback_sentiment      # positive vs. negative actions
)
```

Thresholds:
- `health_score > 0.7` → Healthy (green)
- `0.4 <= health_score <= 0.7` → At Risk (yellow) → trigger re-engagement email
- `health_score < 0.4` → Churning (red) → trigger founder outreach
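
The weighted sum and thresholds above can be sketched in Python as follows. The weights and cut-offs are taken from this section; everything else is an assumption made for illustration: the `TenantSignals` fields, normalizing each input as a ratio, and treating a tenant with no feedback as neutral (0.5).

```python
from dataclasses import dataclass

WEIGHTS = {
    "activation_complete": 0.30,
    "weekly_active_days": 0.20,
    "feature_breadth": 0.20,
    "integration_depth": 0.15,
    "feedback_sentiment": 0.15,
}


@dataclass
class TenantSignals:  # hypothetical input record, one per tenant per day
    activation_complete: bool
    weekly_active_days: int        # 0 to 7
    features_used: int
    features_available: int
    integrations_connected: int
    integrations_available: int
    positive_actions: int
    negative_actions: int


def _ratio(numerator: float, denominator: float) -> float:
    """Clamp numerator/denominator into [0, 1]; an empty denominator gives 0."""
    if denominator <= 0:
        return 0.0
    return min(max(numerator / denominator, 0.0), 1.0)


def health_score(s: TenantSignals) -> float:
    feedback_total = s.positive_actions + s.negative_actions
    components = {
        "activation_complete": 1.0 if s.activation_complete else 0.0,
        "weekly_active_days": _ratio(s.weekly_active_days, 7),
        "feature_breadth": _ratio(s.features_used, s.features_available),
        "integration_depth": _ratio(s.integrations_connected, s.integrations_available),
        # Assumption: no feedback at all counts as neutral rather than negative.
        "feedback_sentiment": _ratio(s.positive_actions, feedback_total) if feedback_total else 0.5,
    }
    return sum(WEIGHTS[name] * value for name, value in components.items())


def classify(score: float) -> str:
    if score > 0.7:
        return "healthy"
    if score >= 0.4:
        return "at_risk"
    return "churning"
```

Scores of exactly 0.7 and 0.4 land in the at-risk band here, matching the thresholds as written.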

### 6. Analytics Stack Recommendation

**PostHog** (self-hosted on AWS):
- Open source, self-hostable → no vendor lock-in
- No per-event fees when self-hosted; the only cost is the AWS infrastructure it runs on
- Built-in: funnels, retention, feature flags, session replay
- Supports custom events via REST API or JS/Python SDK
- Can run on a single t3.medium for V1 traffic

**Why not Segment/Amplitude/Mixpanel:**
- Segment: $120/mo minimum, overkill for solo founder
- Amplitude: free tier is generous but cloud-only, data leaves your infra
- Mixpanel: same cloud-only concern
- PostHog self-hosted: $0/mo in license fees, data stays in your AWS account, GDPR-friendly

**Integration pattern:**
```
Lambda/API → PostHog REST API (async, fire-and-forget)
Next.js UI → PostHog JS SDK (auto-captures pageviews, clicks)
Slack Bot  → PostHog Python SDK (command usage, action clicks)
```
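
For the Lambda/API path, a best-effort send over plain HTTP might look like the sketch below. It uses only the Python standard library. The `/capture/` path and the `api_key` / `event` / `distinct_id` / `properties` payload fields reflect PostHog's public capture API as commonly documented, but should be verified against the documentation for the deployed PostHog version; the host name, function names, and one-second timeout are assumptions.

```python
import json
import urllib.request
from typing import Dict


def build_capture_request(host: str, api_key: str, event: str, distinct_id: str,
                          properties: Dict[str, object],
                          path: str = "/capture/") -> urllib.request.Request:
    """Build (but do not send) a POST request carrying one event."""
    payload = {
        "api_key": api_key,
        "event": event,
        "distinct_id": distinct_id,   # should already be a hashed identifier
        "properties": properties,
    }
    return urllib.request.Request(
        host.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_best_effort(request: urllib.request.Request, timeout_seconds: float = 1.0) -> bool:
    """Send the event; never raise, so analytics cannot break the caller."""
    try:
        with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
            return 200 <= response.status < 300
    except OSError:  # covers URLError, HTTPError, and socket timeouts
        return False
```

This version is synchronous with a short timeout rather than truly asynchronous. In Lambda, work left running in the background after the handler returns may be frozen before it completes, so a bounded blocking call (or a queue in front of PostHog) is the safer starting point.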

### 7. Cross-Product Flywheel Metrics

dd0c is a platform — users on one product should discover others:

- `platform.cross_sell.impression` — "Try dd0c/alert" banner shown
- `platform.cross_sell.clicked` — user clicked cross-sell
- `platform.cross_sell.activated` — user activated second product
- `platform.products.active_count` — how many dd0c products per tenant
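
The cross-sell events listed above are enough to derive two simple numbers: how often an impression leads to a click and then an activation, and how many products each tenant has active. The helpers below are a hypothetical sketch; they count each tenant once per step and take (tenant, product) activation pairs as input, which is an assumed data shape.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def cross_sell_rates(events: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Click-through and activation rates from (tenant_id, event_name) pairs."""
    shown: Set[str] = set()
    clicked: Set[str] = set()
    activated: Set[str] = set()
    for tenant_id, name in events:
        if name == "platform.cross_sell.impression":
            shown.add(tenant_id)
        elif name == "platform.cross_sell.clicked":
            clicked.add(tenant_id)
        elif name == "platform.cross_sell.activated":
            activated.add(tenant_id)
    clicked &= shown        # only count clicks from tenants who saw the banner
    activated &= clicked    # only count activations that followed a click
    return {
        "click_through": len(clicked) / len(shown) if shown else 0.0,
        "activation": len(activated) / len(clicked) if clicked else 0.0,
    }


def active_product_counts(activations: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Number of distinct products per tenant from (tenant_id, product) pairs."""
    products: Dict[str, Set[str]] = defaultdict(set)
    for tenant_id, product in activations:
        products[tenant_id].add(product)
    return {tenant_id: len(names) for tenant_id, names in products.items()}
```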

**Flywheel hypothesis:** Users who activate 2+ dd0c products churn at one-third the rate of single-product users. We need data to prove or disprove this.

---

## Epic 11 Proposal: PLG Instrumentation

### Scope
A cross-cutting epic added to all 6 products, covering a shared analytics SDK, per-product event implementations, funnel dashboards, and health scoring.

### Stories (Draft)
1. **PostHog Infrastructure** — CDK stack for self-hosted PostHog on ECS Fargate
2. **Analytics SDK** — Shared TypeScript/Python wrapper with standard event schema
3. **Funnel Dashboard** — PostHog dashboard template per product
4. **Activation Tracking** — Per-product aha moment detection and logging
5. **Health Scoring Engine** — Daily cron that computes tenant health scores
6. **Cross-Sell Instrumentation** — Platform-level cross-product discovery events
7. **Churn Alert Pipeline** — Health score → Slack alert to founder when tenant goes red
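
For story 7, the core of the pipeline is spotting tenants that crossed into the red band since the previous daily run. A minimal sketch, assuming the daily job keeps yesterday's and today's scores as plain dictionaries; `newly_red` and `alert_text` are hypothetical names, and tenants with no previous score are skipped on the assumption that brand-new signups should not page the founder.

```python
from typing import Dict, List

RED_THRESHOLD = 0.4  # "Churning (red)" cut-off from the health scoring section


def newly_red(previous: Dict[str, float], current: Dict[str, float]) -> List[str]:
    """Tenants whose score was at or above the threshold yesterday and is below it today."""
    return sorted(
        tenant_id
        for tenant_id, score in current.items()
        if score < RED_THRESHOLD
        and tenant_id in previous
        and previous[tenant_id] >= RED_THRESHOLD
    )


def alert_text(tenant_id: str, previous_score: float, current_score: float) -> str:
    """Plain-text message body; delivery to Slack is left to the caller."""
    return (f"Tenant {tenant_id} went red: health score "
            f"{previous_score:.2f} -> {current_score:.2f}")
```

Alerting only on the transition, not on every red day, keeps the channel quiet for tenants that are already known to be at risk.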

### Estimate
~25 story points across all products (shared infrastructure + per-product event wiring)

---

*This brainstorm establishes the "what" and "why." Party Mode advisory board should stress-test: Is PostHog the right choice? Is the event taxonomy too granular? Should health scoring be V1 or V2? Is 25 points realistic?*