Agent eval hits 93.4% — target exceeded

- Fixed ground truth generator to merge Helm entities (matching sysdoc.js pipeline)
- Added Quick Lookup index with name-to-file mapping for agent navigation
- Enriched All Charts table with AppVersion, Dependencies, Values Keys columns
- Increased agent file read cap to 30K for full index coverage
- Tree depth 4 for chart file discovery

Score progression: 54.3% → 84.3% → 88.4% → 93.4%
NOT_FOUND: 41% → 0%
All categories above 75%, easy questions at 98.1%
This commit is contained in:
Jarvis Prime
2026-03-10 00:40:38 +00:00
parent 304f0a9e9f
commit ca11b4459a
8 changed files with 2932 additions and 10 deletions

View File

@@ -1,5 +1,5 @@
{
"generated": "2026-03-09T21:29:29.763Z",
"generated": "2026-03-10T00:27:35.845Z",
"count": 33,
"questions": [
{
@@ -37,7 +37,7 @@
"machine"
],
"question": "How many subsystems does the Foxtrot codebase contain?",
"answer": "11",
"answer": "12",
"answerType": "exact",
"source": "subsystem aggregation"
},
@@ -350,7 +350,7 @@
"human"
],
"question": "Which subsystems are identified as cross-cutting concerns?",
"answer": "app-tools",
"answer": "root",
"answerType": "list",
"source": "subsystem aggregation"
},
@@ -361,7 +361,7 @@
"audience": [
"human"
],
"question": "The following subsystems have 0 detected functions and 0 modules: account-common, network-common, network-core. Why might this be the case, and what do they actually contain?",
"question": "The following subsystems have 0 detected functions and 0 modules: account-common, network-common, network-core, root. Why might this be the case, and what do they actually contain?",
"answer": "These subsystems primarily contain Helm charts with Go-templated YAML, Terraform HCL, and Crossplane compositions. The code analysis pipeline detects functions/modules from Python, Go, TypeScript, and shell scripts — but Helm templates use Go template syntax ({{ }}) which doesn't produce traditional function/module entities. Their content is captured through the Helm chart extraction phase instead.",
"answerType": "explanation",
"source": "architectural analysis"