# Product Requirements Document: Dev Intel V3

## 1. Problem Statement
Dev Intel V2 generates documentation from our Foxtrot monorepo and scores 93% on the agent eval and 78% on the human eval. However, the pipeline relies on ~2000 lines of custom JavaScript, much of which duplicates well-established open-source software (OSS). We need to simplify the architecture, reduce the maintenance burden, and adopt community-standard tools without sacrificing output quality. Our "ratchet loop" is functionally a "Ralph Wiggum" loop, so we should replace the complex custom code with a simple, brute-force bash loop that has clear, objective completion criteria.

## 2. Architecture
The V3 architecture adopts a hybrid approach: "OSS for the heavy lifting, custom code for the magic."

### OSS Replacements
* **Terraform Documentation:** `terraform-docs` (Replaces `extract-terraform.js`)
* **Helm Chart Documentation:** `helm-docs` (Replaces `extract-helm.js` and the chart section of `sysdoc.js`)
* **Evaluation Harness:** `promptfoo` (Replaces `eval-agent.js`, `eval-human.js`, `eval.js`)
* **Documentation Serving:** `mkdocs-material` (Replaces custom doc serving)
* **Ratchet Loop:** Simple Ralph Wiggum bash loop (Replaces `ratchet.js`)

### Retained Custom Components (The Value Add)
* **Graph Builder (`graph.js` + `extract.js`):** Tree-sitter extraction to build a unified knowledge graph across 13 repositories.
* **Subsystem Aggregator (`subsystem.js`):** Grouping files into logical subsystems and detecting cross-cutting concerns.
* **Cross-Chart Interaction Analysis:** Analyzing shared secrets, ports, and service references across Helm charts (which `helm-docs` cannot do natively).
* **LLM Prose Enrichment (`prose.js`):** Feeding the dependency matrix and anomaly flags into Claude to generate "why" explanations.
* **Glue Layer:** Minimal orchestration connecting OSS tools and custom analysis into unified output.

## 3. Requirements
* **LLM Engine:** Use `http://192.168.86.11:8000/v1` with the `claude-haiku-4.5` model.
* **Scale:** Must handle the Foxtrot monorepo (13 subdirectories, 17K+ files).
* **Footprint constraint:** The pipeline should be composed of ~500 lines of custom Node.js code plus config files.
* **Speed constraint:** Must run end-to-end in under 10 minutes (excluding LLM execution wait times).
* **Cost constraint:** Target inference cost is at most $1.00 per release.
* **Code Implementation:** Replace the existing Terraform and per-chart Helm doc generation with the CLI tools (`terraform-docs` and `helm-docs`).
* **Docs Website:** Implement an `mkdocs.yml` configuration to serve the output as a searchable site.
* **Evaluation Implementation:** Configure `promptfoo` via YAML to act as the objective judge.
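
As a rough illustration of the CLI-based requirements above, the Terraform and per-chart Helm generation could be driven from a few shell functions. This is a sketch under assumptions, not the planned implementation: the function names, the `TERRAFORM_DOCS_BIN` and `HELM_DOCS_BIN` overrides, and the one-file-per-module output layout are all hypothetical. `terraform-docs markdown table <dir>` and `helm-docs --chart-search-root <dir>` are the usual invocations of those tools, but the flags should be checked against the installed versions.

```shell
# Hypothetical overrides so the tool binaries can be swapped or pinned.
TERRAFORM_DOCS_BIN="${TERRAFORM_DOCS_BIN:-terraform-docs}"
HELM_DOCS_BIN="${HELM_DOCS_BIN:-helm-docs}"

# Print every directory under $1 that directly contains a .tf file,
# skipping modules that Terraform downloaded into .terraform/.
list_terraform_modules() {
  find "$1" -type f -name '*.tf' ! -path '*/.terraform/*' -exec dirname {} \; | sort -u
}

# Render one Markdown file per Terraform module found under $1 into $2.
document_terraform() {
  local src_root="$1"
  local out_dir="$2"
  mkdir -p "$out_dir"
  list_terraform_modules "$src_root" | while IFS= read -r module; do
    # Name the output after the module path relative to the source root,
    # e.g. <src_root>/network/vpc -> network_vpc.md
    rel="${module#"$src_root"}"
    rel="${rel#/}"
    name="$(printf '%s' "${rel:-root}" | tr '/' '_')"
    # exit leaves only the loop's subshell; the function then returns 1.
    "$TERRAFORM_DOCS_BIN" markdown table "$module" > "$out_dir/$name.md" || exit 1
  done
}

# helm-docs writes a README.md next to each Chart.yaml it finds under $1.
document_helm() {
  "$HELM_DOCS_BIN" --chart-search-root "$1"
}
```

A call such as `document_terraform infra docs/terraform` followed by `document_helm charts` would cover both generators; the `infra`, `docs/terraform`, and `charts` paths are placeholders for the real monorepo layout.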

## 4. Ralph Wiggum Loop Spec
A `bash` script will replace the previous `ratchet.js` implementation. The script runs an AI agent in a simple, well-known pattern: loop until objective completion criteria are met or the iteration limit is reached.

**Execution Flow:**
1. **Generate:** Run the Dev Intel V3 pipeline.
2. **Evaluate:** Run `promptfoo eval` against the pipeline's output.
3. **Diagnose:** Check the `promptfoo` score against the required threshold.
4. **Condition:**
    * **If Score >= Threshold:** Success, exit the loop.
    * **If Score < Threshold:** Feed the previous output and the failure context (the evaluation feedback) back into the generator prompt.
5. **Repeat:** Go back to step 1, for at most *N* iterations in total.
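
The flow above can be sketched as a small `bash` script. This is a minimal sketch rather than the final script: `generate_docs` and `evaluate_docs` are hypothetical placeholders for the pipeline run and the `promptfoo eval` step, and the `THRESHOLD`, `MAX_ITERATIONS`, and `FEEDBACK_FILE` names, their defaults, the exit codes, and the integer percentage score are all assumptions.

```shell
#!/usr/bin/env bash
# Sketch only: every name and default below is an illustrative assumption.
THRESHOLD="${THRESHOLD:-93}"            # required score, as an integer percentage
MAX_ITERATIONS="${MAX_ITERATIONS:-5}"   # the N from step 5
FEEDBACK_FILE="${FEEDBACK_FILE:-feedback.txt}"

# Hypothetical placeholder for step 1. Receives the feedback file as $1
# (empty on the first pass); the previous output is assumed to still be on disk.
generate_docs() {
  echo "generate_docs: wire in the Dev Intel V3 pipeline here" >&2
  return 1
}

# Hypothetical placeholder for step 2. Must write the evaluation feedback to $1
# and print the score as a bare integer on stdout.
evaluate_docs() {
  echo "evaluate_docs: wire in promptfoo eval here" >&2
  return 1
}

# Returns 0 when the threshold is met, 1 when N iterations run out,
# and 2 when a step fails or the score is not an integer.
ralph_loop() {
  local iteration=1
  local score
  : > "$FEEDBACK_FILE"                                      # no failure context yet
  while [ "$iteration" -le "$MAX_ITERATIONS" ]; do
    generate_docs "$FEEDBACK_FILE" || return 2              # 1. Generate
    score="$(evaluate_docs "$FEEDBACK_FILE")" || return 2   # 2. Evaluate
    case "$score" in
      '' | *[!0-9]*) echo "unexpected score: $score" >&2; return 2 ;;
    esac
    echo "iteration $iteration: score $score (threshold $THRESHOLD)" >&2  # 3. Diagnose
    if [ "$score" -ge "$THRESHOLD" ]; then                  # 4. Condition
      return 0
    fi
    # Below threshold: FEEDBACK_FILE now carries the failure context
    # into the next generate_docs call.
    iteration=$((iteration + 1))                            # 5. Repeat
  done
  return 1
}
```

Once the two placeholders are filled in, a final `ralph_loop` line would run the loop. Keeping the score and the feedback as the only contract between the steps means the loop itself never needs to know how `promptfoo` reports its results.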

## 5. Success Metrics
* **Quality Parity or Better:** Agent eval score >= 93%, Human eval score >= 78%.
* **Simplicity:** Custom codebase shrinks from ~2000 lines to ~500 lines.
* **Performance:** End-to-end runtime, excluding LLM wait times, is under 10 minutes.
* **Efficiency:** Pipeline inference costs remain <= $1 per release.
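
The quality and simplicity metrics above are plain integer comparisons, so they could be checked by a small release gate. The sketch below is an assumption-laden illustration: the function and variable names are hypothetical, "~500 lines" is treated as a strict budget purely for the example, and the runtime and cost metrics are left out because they need timing and billing data that this document does not specify.

```shell
# Assumed limits, taken from the metrics above.
MIN_AGENT_SCORE="${MIN_AGENT_SCORE:-93}"
MIN_HUMAN_SCORE="${MIN_HUMAN_SCORE:-78}"
MAX_CUSTOM_LINES="${MAX_CUSTOM_LINES:-500}"

# Count lines across all *.js files under $1, skipping node_modules.
count_custom_lines() {
  find "$1" -type f -name '*.js' ! -path '*/node_modules/*' -exec cat {} + | wc -l | tr -d ' '
}

# Usage: check_metrics <agent_score> <human_score> <custom_line_count>
# Prints one line per failed metric and returns the number of failures.
check_metrics() {
  local failures=0
  if [ "$1" -lt "$MIN_AGENT_SCORE" ]; then
    echo "agent eval: $1% is below $MIN_AGENT_SCORE%"
    failures=$((failures + 1))
  fi
  if [ "$2" -lt "$MIN_HUMAN_SCORE" ]; then
    echo "human eval: $2% is below $MIN_HUMAN_SCORE%"
    failures=$((failures + 1))
  fi
  if [ "$3" -gt "$MAX_CUSTOM_LINES" ]; then
    echo "custom code: $3 lines exceeds $MAX_CUSTOM_LINES"
    failures=$((failures + 1))
  fi
  return "$failures"
}
```

For example, `check_metrics 93 78 "$(count_custom_lines src)"` would pass only when all three limits hold, where `src` stands in for wherever the retained custom code lives.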

## 6. Migration Plan
To deprecate V2 safely while keeping the documentation pipeline running:
1. **Remove Custom Extractors:** Delete `extract-terraform.js`, `extract-helm.js`, and the Helm-specific logic inside `sysdoc.js`.
2. **Remove Custom Evaluators:** Delete `eval-agent.js`, `eval-human.js`, and `eval.js`.
3. **Remove Custom Ratchet:** Delete `ratchet.js`.
4. **Integrate CLI Binaries:** Install and wire up `terraform-docs` and `helm-docs`.
5. **Add Configs:** Write `promptfoo.yaml` for evaluations and `mkdocs.yml` for serving docs.
6. **Implement Bash Script:** Write the Ralph Wiggum loop.
7. **Re-wire Glue Code:** Feed the outputs of the OSS tools into the retained `prose.js` module.