Jarvis Prime b8403be96c feat: repo-agnostic refactor (BMad spec-test-build loop)
- NEW: repo-profiler.js — deterministic archetype detection (Infra, Frontend, Backend, etc.)
- NEW: extract-dynamic.js — generic extractor replacing hardcoded Foxtrot patterns
- NEW: eval-generator.js — dynamic ground-truth question generation from any repo graph
- NEW: specs/bmad-agnostic-refactor-spec.md — full BMad spec with acceptance criteria
- REFACTORED: prose.js — two-pass LLM synthesis with rich context (shared secrets, ports, service refs)
- REFACTORED: sysdoc.js — wired repo-profiler + extract-dynamic, --legacy escape hatch
- REFACTORED: wiggum-v2.sh — uses eval-generator before benchmarks
- FIXED: graph.js — _edgeSet rebuilt on loadSnapshot() (edge dedup was broken)
- FIXED: graph.js — recursive sortKeys() for deep equality in diffing
- FIXED: prose.js — robust JSON array extraction from LLM output
- FIXED: ratchet.js — syntax validation (node --check) before saving LLM mutations
- FIXED: extract-dynamic.js — centralized state services regex, added console.warn for silent failures
- TESTS: test-eval-generator, test-repo-profiler, test-synthesis-quality + mock fixtures

Eval: 81.5% on Foxtrot (fully repo-agnostic, no hardcoded reference pages)
BMad reviews: Architect B+, Dev Lead B-, TEA B-
2026-03-11 14:40:31 +00:00

Developer Intelligence Pipeline v2

Multi-language semantic graph extractor that builds a knowledge graph from source code. Produces function-level call graphs, cross-file dependency maps, and semantic diffs — all without LLM calls.

Quick Start

npm install
node pipeline.js batch /path/to/repo --output /tmp/output

What It Does

Parses source code into a directed graph of entities (modules, functions, classes, configs) and relationships (CALLS, IMPORTS, CONTAINS, IMPLEMENTS). It then diffs graph snapshots to detect breaking changes, compute impact scores, and identify affected callers.
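As an illustration of the graph model described above, here is a minimal sketch of a snapshot and a query for affected callers. The field names (`id`, `type`, `from`, `to`) and the file paths are assumptions made for this example, not the pipeline's actual schema; only the `file:function` ID style and the relationship names come from this README.

```javascript
// Hypothetical snapshot shape: field names and paths are illustrative only.
const snapshot = {
  nodes: [
    { id: "src/app.ts", type: "module" },
    { id: "src/app.ts:main", type: "function" },
    { id: "src/db.ts", type: "module" },
    { id: "src/db.ts:connect", type: "function" },
    { id: "src/db.ts:query", type: "function" },
  ],
  edges: [
    { from: "src/app.ts", to: "src/app.ts:main", type: "CONTAINS" },
    { from: "src/db.ts", to: "src/db.ts:connect", type: "CONTAINS" },
    { from: "src/db.ts", to: "src/db.ts:query", type: "CONTAINS" },
    { from: "src/app.ts", to: "src/db.ts", type: "IMPORTS" },
    { from: "src/app.ts:main", to: "src/db.ts:query", type: "CALLS" },
    { from: "src/db.ts:query", to: "src/db.ts:connect", type: "CALLS" },
  ],
};

// Walk CALLS edges backwards to collect every direct or transitive caller.
function affectedCallers(graph, targetId) {
  const callersOf = new Map();
  for (const edge of graph.edges) {
    if (edge.type !== "CALLS") continue;
    if (!callersOf.has(edge.to)) callersOf.set(edge.to, []);
    callersOf.get(edge.to).push(edge.from);
  }
  const seen = new Set();
  const queue = [targetId];
  while (queue.length > 0) {
    const current = queue.shift();
    for (const caller of callersOf.get(current) ?? []) {
      if (!seen.has(caller)) {
        seen.add(caller);
        queue.push(caller);
      }
    }
  }
  return [...seen].sort();
}

console.log(affectedCallers(snapshot, "src/db.ts:connect"));
```

A change to `connect` in this toy graph reaches `query` directly and `main` transitively, which is the kind of caller set an impact report would list.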

Supported Languages

| Language | Parser | Entities |
|---|---|---|
| TypeScript/JavaScript | tree-sitter | Modules, Functions, Classes, Imports |
| Python | tree-sitter | Modules, Functions, Classes (with _/__ visibility) |
| Go | tree-sitter | Modules, Functions, Structs, Receiver Methods |
| Java | tree-sitter | Modules, Functions, Classes, Interfaces |
| Bash | tree-sitter | Modules, Functions, source imports, Commands |
| YAML | js-yaml | Config keys (K8s manifests, Helm, KCL) |
| Terraform/HCL | regex | Resources, Data, Modules, Providers |

Pipeline Phases

Phase 1: Entity Extraction (extract.js)

node extract.js /path/to/file.ts /repo/root

Outputs JSON with entities and relationships.

Phase 2: Graph Store (graph.js)

node graph.js build /dir/of/jsons snapshot.json
node graph.js query snapshot.json "cli/route.ts:tryRouteCli"
node graph.js diff old.json new.json
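
To show what a snapshot diff involves, the following is a rough sketch of comparing nodes by ID with key-order-insensitive equality. It is not the actual `graph.js` implementation: the snapshot shape, the `meta` field, and the `legacyRoute` and `routeV2` names are hypothetical. The recursive key sort mirrors the idea of canonicalizing objects before comparing them.

```javascript
// Recursively sort object keys so JSON.stringify yields a canonical form.
function sortKeys(value) {
  if (Array.isArray(value)) return value.map(sortKeys);
  if (value !== null && typeof value === "object") {
    const sorted = {};
    for (const key of Object.keys(value).sort()) {
      sorted[key] = sortKeys(value[key]);
    }
    return sorted;
  }
  return value;
}

// Compare two snapshots node by node (hypothetical shape: { nodes: [{ id, ... }] }).
function diffNodes(oldSnap, newSnap) {
  const canonical = (node) => JSON.stringify(sortKeys(node));
  const oldById = new Map(oldSnap.nodes.map((n) => [n.id, n]));
  const newById = new Map(newSnap.nodes.map((n) => [n.id, n]));
  const added = [];
  const removed = [];
  const changed = [];
  for (const [id, node] of newById) {
    if (!oldById.has(id)) added.push(id);
    else if (canonical(oldById.get(id)) !== canonical(node)) changed.push(id);
  }
  for (const id of oldById.keys()) {
    if (!newById.has(id)) removed.push(id);
  }
  return { added, removed, changed };
}

const oldSnap = {
  nodes: [
    { id: "cli/route.ts:tryRouteCli", type: "function", meta: { exported: true, params: ["argv"] } },
    { id: "cli/route.ts:legacyRoute", type: "function" },
  ],
};
const newSnap = {
  nodes: [
    // Same content as before, only the key order differs.
    { meta: { params: ["argv"], exported: true }, type: "function", id: "cli/route.ts:tryRouteCli" },
    { id: "cli/route.ts:routeV2", type: "function" },
  ],
};

console.log(diffNodes(oldSnap, newSnap));
```

Without the recursive sort, two nodes whose nested keys were serialized in a different order would be reported as changed even though their content is identical.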

Phase 3: Namespace Registry (namespace.js)

node namespace.js build snap-a.json snap-b.json --output registry.json
node namespace.js resolve graph.json registry.json
node namespace.js lookup registry.json functionName

Cross-repo references are resolved in three tiers, tried in order: exact ID, then normalized path, then name only.
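
The tiered lookup can be sketched roughly as follows. This is a guess at the general shape, not the code in `namespace.js`: the normalization rule (strip a leading `./` or `src/` and the file extension), the registry layout, and the rule that name-only matches must be unambiguous are all assumptions for illustration.

```javascript
// Hypothetical normalization: drop leading "./" or "src/" and the file extension.
function normalizeId(id) {
  const [path, name] = id.split(":");
  const normalized = path.replace(/^(\.\/|src\/)+/, "").replace(/\.[^./]+$/, "");
  return name ? `${normalized}:${name}` : normalized;
}

// Build three indexes over the known entity IDs, one per resolution tier.
function buildRegistry(ids) {
  const registry = { exact: new Set(ids), byNormalized: new Map(), byName: new Map() };
  for (const id of ids) {
    registry.byNormalized.set(normalizeId(id), id);
    const name = id.split(":").pop();
    if (!registry.byName.has(name)) registry.byName.set(name, []);
    registry.byName.get(name).push(id);
  }
  return registry;
}

// Try each tier in order and report which one matched.
function resolve(registry, ref) {
  if (registry.exact.has(ref)) return { id: ref, tier: "exact" };
  const byPath = registry.byNormalized.get(normalizeId(ref));
  if (byPath) return { id: byPath, tier: "normalized-path" };
  const candidates = registry.byName.get(ref.split(":").pop()) ?? [];
  // Assumption: a name-only match is accepted only when it is unambiguous.
  if (candidates.length === 1) return { id: candidates[0], tier: "name-only" };
  return null;
}

const registry = buildRegistry(["cli/route.ts:tryRouteCli", "src/util/log.ts:warn"]);
console.log(resolve(registry, "cli/route.ts:tryRouteCli"));
console.log(resolve(registry, "./util/log.js:warn"));
console.log(resolve(registry, "other/place.go:tryRouteCli"));
```

Ordering the tiers from strictest to loosest means a loose name match is used only when nothing more specific is available, and returning the tier lets callers decide how much to trust a link.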

Phase 4: Semantic Diff (semantic-diff.js)

node semantic-diff.js diff old.json new.json
node semantic-diff.js score old.json new.json

Categorizes each change as breaking, significant, internal, or cosmetic, and computes an impact score from 0 to 100.
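
One way such a categorization and score could fit together is sketched below. The change kinds, the weights, and the per-caller bonus are all invented for illustration; the rules actually used by `semantic-diff.js` are not documented here and may differ. Only the four category names and the 0-100 range come from this README.

```javascript
// Hypothetical weights per category; the real scoring may differ.
const WEIGHTS = { breaking: 40, significant: 15, internal: 5, cosmetic: 1 };

// Hypothetical rules mapping a change record to one of the four categories.
function categorize(change) {
  if (change.exported && (change.kind === "removed" || change.kind === "signature-changed")) {
    return "breaking";
  }
  if (change.exported && change.kind === "added") return "significant";
  if (change.kind === "comment" || change.kind === "renamed-local") return "cosmetic";
  return "internal";
}

// Sum category weights plus a small bonus per known caller, clamped to 0-100.
function impactScore(changes) {
  const raw = changes.reduce(
    (sum, change) => sum + WEIGHTS[categorize(change)] + 2 * (change.callers ?? 0),
    0
  );
  return Math.min(100, raw);
}

const changes = [
  { id: "cli/route.ts:tryRouteCli", kind: "removed", exported: true, callers: 3 },
  { id: "cli/route.ts", kind: "comment" },
];

console.log(changes.map(categorize));
console.log(impactScore(changes));
```

Clamping keeps the score within the documented range even when a diff contains many breaking changes.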

Phase 5: Pipeline (pipeline.js)

node pipeline.js batch /repo --output /tmp/out     # Full extraction
node pipeline.js benchmark /repo --samples 20       # Performance test
node pipeline.js run /repo --snapshot prev.json     # Incremental diff

Benchmark (OpenClaw repo)

| Metric | Value |
|---|---|
| Files | 4,325 |
| Extracted | 4,259 (98.5%) |
| Nodes | 21,646 |
| Edges | 133,979 |
| Time | 67 seconds |
| Avg/file | 15ms |

V1 vs V2

| Metric | V1 POC | V2 Pipeline |
|---|---|---|
| Parse time | ~2s | 552ms |
| Total time | 15-20 min (LLM) | 552ms |
| Entities | files + imports | 457 (4 types) |
| CALLS edges | 0 | 1,290 |
| Cross-file calls | No | 51 resolved |
| Languages | Go only | 8 |
| Semantic diff | No | Yes |
| Impact scoring | No | Yes |
| Cost | ~$0 (Ollama) | $0 |

Tested on labstack/echo (44 Go files)

Testing

bash test/run-all.sh          # 9/9 ground truth benchmark
node test/test-graph.js       # 25/25 graph store tests

Architecture

source files → extract.js → JSON → graph.js → snapshot.json
                                                    ↓
                                          semantic-diff.js → impact report
                                                    ↓
                                          namespace.js → cross-repo links

No external runtime dependencies beyond the tree-sitter grammars and js-yaml (used for YAML parsing).

License

MIT
