049609a358
Phase 9d: Human eval score improvement

- Human readability score increased from 63.9% to 78.6%
- Structural table additions and quick lookup index resolved navigation bottlenecks
- NOT_FOUND rate dropped from 17.9% to 3.6%
Jarvis Prime
2026-03-10 00:46:37 +00:00
304f0a9e9f
Phase 9c: Split eval into Agent (file-browsing) and Human (readability) tracks
Jarvis Prime
2026-03-09 23:55:54 +00:00
0cc4abcb0f
Phase 9b: structural documentation improvements

- sysdoc.js: Added Summary Statistics, Top Charts, and K8s Resource Types to architecture doc
- Addresses ratchet failures where system-wide rollups were missing from generated prose
- Eval v2 shows minor improvement, though RAG context window still limits wide scatter-gather queries
Jarvis Prime
2026-03-09 23:40:07 +00:00
b99341e8bc
Phase 9: Doc Evaluation Harness

- eval-questions.js: Generates ground-truth questions from raw source data
- eval.js: LLM-as-judge scoring harness (answers from docs, scores against truth)
- Generated 33 questions covering config, dependencies, resources, and interactions
- Baseline score: 66.7% (configuration 93%, dependencies 77%, structural 31%)
Jarvis Prime
2026-03-09 22:32:41 +00:00
d9fa087e22
Phase 6+7: LLM prose generation pass over Foxtrot docs

- Ran Claude Haiku to generate prose for architecture, subsystems, flows, and 124 Helm contracts
- Fixed describeContract prompt in prose.js to correctly identify and describe Helm contract types without hallucinating
- 80 files generated with rich architectural summaries
Jarvis Prime
2026-03-09 20:15:50 +00:00
4f7c77b3b1
Phase 8b: Helm contract extraction + diagram support
Jarvis Prime
2026-03-09 20:05:52 +00:00
f49a6c2dd9
Phase 8: Helm chart extraction with Go template support
Jarvis Prime
2026-03-09 20:03:04 +00:00