- extract-patterns.js: mines layered arch, ArgoCD appsets, cloud regions, CIDR allocations, naming conventions, sync waves, tech stack from code
- agent-kb.js: token-efficient JSON rendering of same doc tree
- eval-confluence-ref-questions.json: 32 reference-only benchmark questions
- wiggum-v2.sh: Ralph Wiggum loop targeting confluence baseline (77.8%)
- docs/human-ux-spec.md: BMad UX designer spec for human doc structure
- Eval results: V2 at 28.7% vs confluence 77.8% baseline
- Hub/spoke ownership now correctly extracted (95% on that question)
- Naming conventions, regions, CIDRs surfaced in system-architecture.md
Product Requirements Document: Dev Intel V3
1. Problem Statement
Dev Intel V2 successfully generates documentation from our Foxtrot monorepo, achieving a 93% agent eval and 78% human eval score. However, the pipeline relies on ~2000 lines of custom JavaScript. Much of this custom code duplicates the functionality of well-established Open Source Software (OSS). We need to simplify the architecture, reduce the maintenance burden, and embrace community-standard tools without sacrificing output quality. Our "ratchet loop" is functionally just a "Ralph Wiggum" loop, and we should embrace a simplified, brute-force bash loop with clear objective completion criteria rather than complex custom code.
2. Architecture
The V3 architecture adopts a hybrid approach: "OSS for the heavy lifting, custom code for the magic."
OSS Replacements
- Terraform Documentation: terraform-docs (Replaces extract-terraform.js)
- Helm Chart Documentation: helm-docs (Replaces extract-helm.js & sysdoc.js chart section)
- Evaluation Harness: promptfoo (Replaces eval-agent.js, eval-human.js, eval.js)
- Documentation Serving: mkdocs-material (Replaces custom doc serving)
- Ratchet Loop: Simple Ralph Wiggum bash loop (Replaces ratchet.js)
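The replacements above could be wired together roughly as follows. This is a minimal sketch, not the project's actual glue code: the directory layout (terraform/, charts/), the function name generate_oss_docs, and the DRY_RUN helper are all assumptions made for illustration. Only the CLI invocations themselves (terraform-docs markdown table, helm-docs --chart-search-root, mkdocs build) are standard usage of those tools.

```shell
#!/usr/bin/env bash
# Sketch: invoke the OSS tools that replace the custom extractors.
# Directory names and helper names are assumptions, not taken from the PRD.
set -euo pipefail

# run: execute a command, or only print it when DRY_RUN=1 (hypothetical helper).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# generate_oss_docs [TF_ROOT] [CHART_ROOT]
generate_oss_docs() {
  local tf_root="${1:-terraform}" chart_root="${2:-charts}"
  local module
  # terraform-docs: one Markdown table per immediate subdirectory of TF_ROOT.
  for module in "$tf_root"/*/; do
    [ -d "$module" ] || continue
    run terraform-docs markdown table --output-file README.md "$module"
  done
  # helm-docs: renders a README for every chart found under the search root.
  run helm-docs --chart-search-root "$chart_root"
  # mkdocs: build the static, searchable site described by mkdocs.yml.
  run mkdocs build
}
```

Running it with DRY_RUN=1 prints the commands without executing them, which is a cheap way to check the wiring before the binaries are installed.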
Retained Custom Components (The Value Add)
- Graph Builder (graph.js + extract.js): Tree-sitter extraction to build a unified knowledge graph across 13 repositories.
- Subsystem Aggregator (subsystem.js): Grouping files into logical subsystems and detecting cross-cutting concerns.
- Cross-Chart Interaction Analysis: Analyzing shared secrets, ports, and service references across Helm charts (which helm-docs cannot do natively).
- LLM Prose Enrichment (prose.js): Feeding the dependency matrix and anomaly flags into Claude to generate "why" explanations.
- Glue Layer: Minimal orchestration connecting OSS tools and custom analysis into unified output.
3. Requirements
- LLM Engine: Use http://192.168.86.11:8000/v1 with the claude-haiku-4.5 model.
- Scale: Must handle the Foxtrot monorepo (13 subdirectories, 17K+ files).
- Footprint constraint: The pipeline should be composed of ~500 lines of custom Node.js code plus config files.
- Speed constraint: Must run end-to-end in under 10 minutes (excluding LLM execution wait times).
- Cost constraint: Target cost is $1.00 per release.
- Code Implementation: Replace the existing Terraform and per-chart Helm doc generation with the CLI tools (terraform-docs and helm-docs).
- Docs Website: Implement an mkdocs.yml configuration to serve the output as a searchable site.
- Evaluation Implementation: Configure promptfoo via YAML to act as the objective judge.
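The footprint constraint is easy to check mechanically. The sketch below counts lines across all .js files under a directory and compares the total with a budget. The function name check_footprint, the choice to skip node_modules, and treating "~500" as a hard cap are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Sketch: enforce the ~500-line budget for custom Node.js code.
# Function name and the node_modules exclusion are assumptions.
set -euo pipefail

# check_footprint DIR [BUDGET]: succeed when the total line count of all
# *.js files under DIR (excluding node_modules) is at most BUDGET (default 500).
check_footprint() {
  local dir="$1" budget="${2:-500}" total
  # Concatenate every matching file and count newline-terminated lines.
  total="$(find "$dir" -name node_modules -prune -o -type f -name '*.js' -exec cat {} + \
    | wc -l | tr -d '[:space:]')"
  echo "custom JS lines: $total (budget $budget)"
  [ "$total" -le "$budget" ]
}
```

Because the function returns non-zero when the budget is exceeded, it can serve as a CI gate, for example `check_footprint src 500 || exit 1`, where src is a hypothetical location for the custom code.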
4. Ralph Wiggum Loop Spec
The previous ratchet.js implementation will be replaced by a bash script. This runs an AI agent in a simple, well-known ratchet pattern: loop until objective completion criteria are met.
Execution Flow:
- Generate: Run the Dev Intel V3 pipeline.
- Evaluate: Run promptfoo eval against the pipeline's output.
- Diagnose: Check the promptfoo score against the required threshold.
- Condition:
  - If Score >= Threshold: Success, exit the loop.
  - If Score < Threshold: Re-feed the previous output and failure context (the evaluation feedback) back into the generator prompt for context.
- Repeat: Continue up to N iterations until criteria are met.
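The execution flow above can be sketched as a short bash function. This is an illustration under stated assumptions, not the final script: GENERATE_CMD and EVAL_CMD are hypothetical hooks, the evaluator is assumed to print a single integer score (0 to 100) on stdout, and the failure context is assumed to travel through a temporary file that the evaluator writes and the generator reads.

```shell
#!/usr/bin/env bash
# Sketch of the loop: generate, evaluate, compare against a threshold, retry.
# GENERATE_CMD and EVAL_CMD are hypothetical hooks (commands or functions).
# EVAL_CMD is assumed to print an integer score on stdout.
set -euo pipefail

# ralph_loop THRESHOLD MAX_ITERS: returns 0 once the score meets the
# threshold, 1 if MAX_ITERS iterations pass without meeting it.
ralph_loop() {
  local threshold="$1" max_iters="$2"
  local feedback score i
  feedback="$(mktemp)"   # carries failure context between iterations
  for ((i = 1; i <= max_iters; i++)); do
    # Generate: the generator is handed the feedback file (empty on the first pass).
    "$GENERATE_CMD" "$feedback"
    # Evaluate: the evaluator prints the score and may rewrite the feedback file.
    score="$("$EVAL_CMD" "$feedback")"
    echo "iteration $i: score=$score threshold=$threshold"
    # Diagnose / Condition: stop as soon as the completion criterion is met.
    if [ "$score" -ge "$threshold" ]; then
      rm -f "$feedback"
      return 0
    fi
  done
  rm -f "$feedback"
  return 1
}
```

In practice EVAL_CMD would wrap promptfoo eval and reduce its results to one number. That reduction is left out here because it depends on how the promptfoo config and output are set up. Keeping the hooks external also lets the loop be exercised with stub commands before the real pipeline exists.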
5. Success Metrics
- Quality Parity or Better: Agent eval score >= 93%, Human eval score >= 78%.
- Simplicity: Custom codebase shrinks from ~2000 lines to ~500 lines.
- Performance: Execution overhead is under 10 minutes.
- Efficiency: Pipeline inference costs remain <= $1 per release.
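These four targets can be checked together once a run has produced its numbers. The sketch below is one way to do that. The function name, the argument order, and treating "~500 lines" as a hard cap are assumptions; awk is used only because the minutes and cost values may be fractional.

```shell
#!/usr/bin/env bash
# Sketch: check one run against the success metrics listed above.
# Function name and argument order are assumptions for illustration.
set -euo pipefail

# check_release_metrics AGENT_PCT HUMAN_PCT CUSTOM_LOC MINUTES COST_USD
# Prints PASS, or one FAIL line per missed target, and returns non-zero on any miss.
check_release_metrics() {
  awk -v agent="$1" -v human="$2" -v loc="$3" -v mins="$4" -v cost="$5" '
    BEGIN {
      ok = 1
      if (agent < 93)  { print "FAIL agent eval " agent "% < 93%"; ok = 0 }
      if (human < 78)  { print "FAIL human eval " human "% < 78%"; ok = 0 }
      if (loc > 500)   { print "FAIL custom code " loc " lines > 500"; ok = 0 }
      if (mins >= 10)  { print "FAIL runtime " mins " min, must be under 10"; ok = 0 }
      if (cost > 1.00) { print "FAIL cost $" cost " > $1.00"; ok = 0 }
      if (ok) print "PASS"
      exit ok ? 0 : 1
    }'
}
```

For example, `check_release_metrics 93 78 480 9.5 1.00` prints PASS, while a run with a 90% agent score and a $1.20 cost prints a FAIL line for each of those two targets.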
6. Migration Plan
To safely deprecate V2 while maintaining documentation pipelines:
- Remove Custom Extractors: Delete extract-terraform.js, extract-helm.js, and the Helm-specific logic inside sysdoc.js.
- Remove Custom Evaluators: Delete eval-agent.js, eval-human.js, and eval.js.
- Remove Custom Ratchet: Delete ratchet.js.
- Integrate CLI Binaries: Install and wire up terraform-docs and helm-docs.
- Add Configs: Write promptfoo.yaml for evaluations and mkdocs.yml for serving docs.
- Implement Bash Script: Write the Ralph Wiggum loop.
- Re-wire Glue Code: Connect the outputs from the OSS tools into the preserved prose.js module.