# Spec: Repo-Agnostic Reference Page Synthesis
## Context
The Dev-Intel V2 pipeline currently generates its core reference documentation (`network-architecture.md`, `operations.md`, `configuration.md`, `dependencies.md`, `index.md`) with a bespoke script, `generate-reference-pages.js`. This script hardcodes Foxtrot-specific facts (e.g., CIDR ranges, ArgoCD deployment flows, branch mappings) instead of deriving them from the codebase.

As a result, the pipeline cannot document other Reltio repositories (e.g., AnyCloud, BCE) without manual intervention.
## Objective
Refactor reference page generation to be fully repository-agnostic. The system must extract raw facts from the source code (using the existing structural extractors) and then use an LLM to synthesize those facts into reference pages that both humans and agents can read.
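
One way to picture the "facts in, page out" split is a prompt builder that hands the model nothing but the extracted facts. The sketch below is illustrative only: `buildReferencePrompt`, the prompt wording, and the shape of the `facts` object are assumptions, not part of the existing pipeline.

```javascript
// Hypothetical helper (not in the current codebase): builds a prompt that
// restricts the model to the extracted facts for one reference page.
// The shape of `facts` is an assumption; the real extractors may differ.
function buildReferencePrompt(pageName, facts) {
  const factsJson = JSON.stringify(facts, null, 2);
  return [
    `Write the reference page "${pageName}" in Markdown.`,
    "Use ONLY the facts in the JSON below.",
    "If a fact is missing, omit the topic instead of guessing.",
    "",
    "FACTS:",
    factsJson,
  ].join("\n");
}

// Example with made-up facts for an imaginary repository.
const prompt = buildReferencePrompt("configuration.md", {
  helmValues: { replicaCount: 2 },
});
console.log(prompt);
```

Because the prompt contains no repository-specific text of its own, the same builder can in principle be pointed at any repository the extractors can read.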
## Requirements
1. **Remove Hardcoding**: Delete `generate-reference-pages.js` entirely.
2. **Generic Fact Extraction**: Ensure the output of the existing `extract-deep.js`, `extract-helm.js`, and `sysdoc.js` patterns is collected into a single context object.
3. **LLM Synthesis**: Add a function to `prose.js` (e.g., `synthesizeReferencePages(facts, outDir)`) that uses `opus-think` or a standard model to generate the four core reference pages based *only* on the extracted facts.
4. **Dynamic Index**: Generate `reference/index.md` with the LLM so that it maps each generated page to its topics.
5. **Pipeline Integration**: Update `sysdoc.js` to call the new synthesis function, passing it the extracted data (`deepData`, `patterns`, `subs`).
6. **Execution Script**: Update `wiggum-v2.sh` to reflect the removal of the bespoke script.
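
The requirements above could fit together roughly as follows. This is a sketch under stated assumptions, not the intended implementation: `buildFactContext`, the injected `callModel` and `writeFile` functions, the extra third parameter, the prompt wording, and the choice to treat `outDir` as the reference directory are all hypothetical. Injecting the model call keeps the sketch runnable without a real LLM client.

```javascript
// The four core pages named in this spec.
const CORE_PAGES = [
  "network-architecture.md",
  "operations.md",
  "configuration.md",
  "dependencies.md",
];

// Hypothetical: gather the extractor outputs into one context object.
function buildFactContext(deepData, patterns, subs) {
  return { deepData, patterns, subs };
}

// Sketch of synthesizeReferencePages(facts, outDir). The third argument is an
// assumption: `callModel(prompt)` resolves to Markdown text and
// `writeFile(path, text)` persists it. Both stand in for real clients.
async function synthesizeReferencePages(facts, outDir, { callModel, writeFile }) {
  const factsJson = JSON.stringify(facts, null, 2);
  const written = [];

  for (const page of CORE_PAGES) {
    const prompt = `Write ${page} in Markdown using only these facts:\n${factsJson}`;
    const path = `${outDir}/${page}`;
    await writeFile(path, await callModel(prompt));
    written.push(path);
  }

  // The index is also model-generated, mapping each page to its topics.
  const indexPrompt =
    `Write index.md mapping each page to its topics: ${CORE_PAGES.join(", ")}\n` +
    `Facts:\n${factsJson}`;
  const indexPath = `${outDir}/index.md`;
  await writeFile(indexPath, await callModel(indexPrompt));
  written.push(indexPath);

  return written;
}

// Dry run with stubs: no network access and no files on disk.
const files = new Map();
const demo = synthesizeReferencePages(
  buildFactContext({ services: ["api"] }, [], []),
  "docs/reference",
  {
    callModel: async (prompt) => `# Stub\n\n${prompt.length} prompt characters`,
    writeFile: async (path, text) => {
      files.set(path, text);
    },
  }
);
demo.then((written) => console.log(written.join("\n")));
```

In `sysdoc.js`, the call site would then pass the real `deepData`, `patterns`, and `subs` along with whatever model client and file writer the pipeline already uses.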
## Success Criteria
- Running `wiggum-v2.sh` generates `network-architecture.md`, `operations.md`, `configuration.md`, and `dependencies.md` without relying on hardcoded strings.
- The output format must still meet the evaluation standards (target: >77% on the Confluence benchmark).
- The code must run against an arbitrary repository and produce relevant reference pages based on what it finds.
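
The first two criteria lend themselves to an automated check after each run. The sketch below assumes the benchmark score is available as a fraction between 0 and 1 and that the run reports the file names it produced. `checkRun` and both of its inputs are hypothetical, and the spec does not say how the Confluence benchmark score is obtained.

```javascript
// Pages the spec requires every run to produce.
const REQUIRED_PAGES = [
  "network-architecture.md",
  "operations.md",
  "configuration.md",
  "dependencies.md",
];

// ">77%" from the success criteria, expressed as a fraction (assumption).
const BENCHMARK_TARGET = 0.77;

// Hypothetical post-run check: reports missing pages and whether the
// benchmark score is strictly above the target.
function checkRun(generatedFiles, benchmarkScore) {
  const missing = REQUIRED_PAGES.filter((page) => !generatedFiles.includes(page));
  const meetsBenchmark = benchmarkScore > BENCHMARK_TARGET;
  return { missing, meetsBenchmark, ok: missing.length === 0 && meetsBenchmark };
}

// Example: one page missing, score above target.
const result = checkRun(
  ["network-architecture.md", "operations.md", "configuration.md", "index.md"],
  0.8
);
console.log(result);
```

The third criterion (relevance on an arbitrary repository) is harder to automate and likely still needs a manual review of the generated pages.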