dev-intel-v2/docs/prd.md

# Product Requirements Document: Dev Intel V2

## 1. Problem Statement
Dev Intel V2 currently extracts code entities and Helm chart structures to build a unified knowledge graph and generate Diataxis-structured documentation for infrastructure monorepos. The pipeline performs well for AI agents (93.4% eval score) but less well for human engineers (78.6% eval score), because the generated prose describes what exists without explaining why. In addition, Terraform, a critical infrastructure component, is largely missing from the extraction, and there is no architectural flow tracing. Together these gaps limit how useful the generated documentation is for understanding change impact and structural anomalies.

## 2. User Personas
* **Infrastructure Engineer:** Needs to understand the "why" behind the architecture, trace execution flows across boundaries, and quickly assess the blast radius of changes (e.g., modifying a secret or Helm chart).
* **AI Coding Agent:** Relies on high-fidelity, highly structured knowledge graphs and inlined dependencies to reliably answer questions about the codebase without getting lost in nested wrapper charts.

## 3. Requirements

### Tier 1: Fix What's Broken (Explanation & Accuracy)
* **T1.1: Inline Sub-chart Dependencies:** Wrapper charts must inline their sub-chart dependencies in the index to ensure dependency queries do not fail.
* **T1.2: Explanatory LLM Prose:** Update the LLM enrichment prompts to explain *why* subsystems depend on each other and *why* certain structural anomalies exist (e.g., subsystems with zero functions).
* **T1.3: Architectural Anomaly Resolution:** Documentation must explicitly identify and explain structural anomalies in the architecture, to improve on the current 30% success rate for architectural "why" questions.
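
As an illustration of T1.1, the following is a minimal sketch of how a wrapper chart's transitive sub-chart dependencies could be flattened into its index entry. The index shape (`{ dependencies: [...] }`), the function name `inlineDependencies`, and the example chart names are all hypothetical and not taken from the current codebase.

```javascript
// Hypothetical index shape: chart name -> { dependencies: [chart names] }.
// Returns every direct and transitive dependency of chartName, once each.
function inlineDependencies(index, chartName) {
  const seen = new Set([chartName]); // guards against dependency cycles
  const inlined = [];
  const stack = [chartName];
  while (stack.length > 0) {
    const current = stack.pop();
    const deps = index[current]?.dependencies ?? [];
    for (const dep of deps) {
      if (seen.has(dep)) continue;
      seen.add(dep);
      inlined.push(dep);
      stack.push(dep);
    }
  }
  return inlined;
}

// Illustrative data only; these chart names are made up.
const exampleIndex = {
  "platform-wrapper": { dependencies: ["ingress", "vault-agent"] },
  ingress: { dependencies: ["cert-manager"] },
  "vault-agent": { dependencies: [] },
  "cert-manager": { dependencies: [] },
};
const flat = inlineDependencies(exampleIndex, "platform-wrapper");
```

With the flattened list stored on the wrapper's own index entry, a dependency query would not need to walk nested charts at query time.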

### Tier 2: Fill Real Gaps (Coverage & Tracing)
* **T2.1: Terraform Extraction (`extract-terraform.js`):** Implement robust Terraform entity extraction. Currently, only 1 module is detected out of 336 files in `control-core`.
* **T2.2: Auto-Detection of Entry Points:** Implement flow tracing by automatically detecting entry points. Targets: Helm Deployments with Services, `main()` in shell scripts, `__main__` in Python, and CI pipelines.
* **T2.3: Change Impact Analysis Interface:** Build a query interface leveraging existing knowledge graph edges to answer change impact questions (e.g., "If I modify `vault-secret`, which charts redeploy?").
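
The change impact query in T2.3 can be thought of as a reverse traversal of dependency edges. The sketch below assumes a hypothetical edge shape `{ from, to }` meaning "`from` depends on `to`"; the real graph schema, the function name `impactedNodes`, and the node names other than `vault-secret` are assumptions made for illustration.

```javascript
// Hypothetical edge shape: { from, to } meaning "from depends on to".
// Returns every node that directly or transitively depends on `changed`.
function impactedNodes(edges, changed) {
  // Build a reverse adjacency map: dependency -> list of dependents.
  const dependents = new Map();
  for (const { from, to } of edges) {
    if (!dependents.has(to)) dependents.set(to, []);
    dependents.get(to).push(from);
  }
  // Breadth-first walk from the changed node through its dependents.
  const impacted = new Set();
  const queue = [changed];
  while (queue.length > 0) {
    const node = queue.shift();
    for (const dependent of dependents.get(node) ?? []) {
      if (dependent === changed || impacted.has(dependent)) continue;
      impacted.add(dependent);
      queue.push(dependent);
    }
  }
  return [...impacted];
}

// Illustrative data only; chart names are made up.
const exampleEdges = [
  { from: "chart-a", to: "vault-secret" },
  { from: "chart-b", to: "chart-a" },
  { from: "chart-c", to: "other-secret" },
];
const redeploys = impactedNodes(exampleEdges, "vault-secret");
```

In this example `chart-a` and `chart-b` are reported and `chart-c` is not. The answer is only as good as the edges fed into it, which is why edge accuracy is listed as a dependency in Section 6.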

## 4. Success Metrics
* **Agent Eval Score:** Maintain > 90%.
* **Human Eval Score:** Increase from 78.6% to > 90%.
* **Terraform Coverage:** Increase from ~0% to > 80% of `control-core` entities extracted.
* **Flow Traces:** Document at least 5 meaningful entry-to-exit execution paths.
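
Flow traces start from detected entry points (T2.2). As a rough sketch of what auto-detection might look like for two of the targets, the heuristics below flag Python files with a `__main__` guard and shell scripts that invoke `main "$@"`. The pattern table, the `detectEntryPoints` name, and the sample file paths are hypothetical; Helm Deployments and CI pipelines would need structured parsing and are not covered here.

```javascript
// Hypothetical heuristics keyed by file extension. These are text
// patterns, not parsers, so they can miss unusual formatting.
const ENTRY_POINT_PATTERNS = [
  {
    ext: ".py",
    kind: "python-main",
    pattern: /^if\s+__name__\s*==\s*["']__main__["']\s*:/m,
  },
  {
    ext: ".sh",
    kind: "shell-main",
    pattern: /^\s*main\s+"\$@"\s*$/m,
  },
];

// files: array of { path, content }. Returns [{ path, kind }].
function detectEntryPoints(files) {
  const found = [];
  for (const { path, content } of files) {
    for (const { ext, kind, pattern } of ENTRY_POINT_PATTERNS) {
      if (path.endsWith(ext) && pattern.test(content)) {
        found.push({ path, kind });
      }
    }
  }
  return found;
}

// Illustrative inputs only; these paths are made up.
const sampleFiles = [
  {
    path: "tools/sync.py",
    content: 'def run():\n    pass\n\nif __name__ == "__main__":\n    run()\n',
  },
  {
    path: "scripts/deploy.sh",
    content: 'main() {\n  echo deploy\n}\n\nmain "$@"\n',
  },
  { path: "lib/helpers.py", content: "def helper():\n    return 1\n" },
];
const entryPoints = detectEntryPoints(sampleFiles);
```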

## 5. Out of Scope
* Support for new languages outside of the current stack (Python, Go, TypeScript, Shell, HCL/Terraform).
* Interactive UI dashboards (focus remains on markdown generation and query interfaces).
* Modifying the core Diataxis structural framework.

## 6. Dependencies and Risks
* **Risk (LLM Context Limits):** Inlining sub-chart dependencies and expanding explanatory prose could bloat the context window for the evaluating LLM.
* **Dependency:** The change impact query interface relies heavily on the accuracy of the existing graph edges; if current edges are noisy, the impact analysis will be flawed.
* **Dependency:** Terraform extraction requires successfully parsing HCL, whose module resolution may be more complex than the tree-sitter extraction used for standard code.
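
To make the HCL parsing dependency more concrete, here is a deliberately naive sketch that pulls top-level block headers out of Terraform source with a regular expression. It is not the intended design of `extract-terraform.js`: the function name `extractTerraformEntities` and the output shape are assumptions, and a regex cannot resolve module sources, nested blocks, or headers split across lines, which is exactly where a real HCL parser is needed.

```javascript
// Matches top-level block headers such as:
//   resource "aws_s3_bucket" "logs" {
//   module "network" {
// Only blocks that start at column 0 with quoted labels are recognized.
const BLOCK_HEADER =
  /^(resource|data|module|variable|output|provider)\s+"([^"]+)"(?:\s+"([^"]+)")?\s*\{/gm;

// Returns one entity per matched header. Two-label blocks (resource, data)
// get { kind, type, name }; one-label blocks get { kind, name }.
function extractTerraformEntities(source) {
  const entities = [];
  for (const match of source.matchAll(BLOCK_HEADER)) {
    const [, kind, first, second] = match;
    entities.push(
      second === undefined
        ? { kind, name: first }
        : { kind, type: first, name: second }
    );
  }
  return entities;
}

// Illustrative input only.
const sampleHcl = [
  'module "network" {',
  '  source = "./modules/network"',
  "}",
  "",
  'resource "aws_s3_bucket" "logs" {',
  '  bucket = "example-logs"',
  "}",
  "",
  'variable "region" {}',
].join("\n");
const entities = extractTerraformEntities(sampleHcl);
```

A header count like this could serve as a quick coverage baseline against the 336 files in `control-core`, but it says nothing about how modules reference each other.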