LSFBench
Minimal Luau/Lune benchmark to evaluate LLMs: one model answers questions, another model scores the answers against the reference key.
Quick Start
Prereqs
- Install Lune (0.10.x)
- Start Ollama at http://localhost:11434 and pull the models referenced in `config.luau` (e.g. `qwen3:4b`)
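
Before running anything, it can help to confirm that Ollama is reachable and the required models are pulled. The snippet below is only an illustrative sketch (not part of the benchmark scripts), assuming Ollama's standard `/api/tags` endpoint; it lists the locally available model names from Lune.

```luau
-- check_ollama.luau: verify the local Ollama server is up and list pulled models
local net = require("@lune/net")
local serde = require("@lune/serde")

local response = net.request("http://localhost:11434/api/tags")
assert(response.ok, "Ollama is not reachable at http://localhost:11434")

local data = serde.decode("json", response.body)
for _, model in ipairs(data.models) do
	print(model.name)
end
```

Run it with `lune run check_ollama` and compare the output against the model names in `config.luau`.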
Notice
The evaluator model must support structured JSON outputs.
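
The actual prompts and request parameters are defined by the repo's scripts and `config.luau`; the sketch below only illustrates the kind of call the evaluator needs to support, assuming Ollama's `/api/chat` endpoint with a JSON schema passed via the `format` field (Ollama's structured outputs). The verdict fields `score` and `reasoning` are hypothetical names, not the benchmark's actual schema.

```luau
-- evaluator_sketch.luau: ask an evaluator model for a structured JSON verdict
local net = require("@lune/net")
local serde = require("@lune/serde")

local payload = {
	model = "qwen3:4b", -- any evaluator model that supports structured outputs
	stream = false,
	messages = {
		{
			role = "user",
			content = "Score the candidate answer against the reference key.\n"
				.. "Reference: <reference answer>\nCandidate: <model answer>",
		},
	},
	-- JSON schema for structured outputs; field names here are illustrative
	format = {
		type = "object",
		properties = {
			score = { type = "number" },
			reasoning = { type = "string" },
		},
		required = { "score", "reasoning" },
	},
}

local response = net.request({
	url = "http://localhost:11434/api/chat",
	method = "POST",
	headers = { ["Content-Type"] = "application/json" },
	body = serde.encode("json", payload),
})
assert(response.ok, "chat request failed: " .. tostring(response.statusCode))

-- The model's reply content is itself a JSON string matching the schema above
local reply = serde.decode("json", response.body)
local verdict = serde.decode("json", reply.message.content)
print(verdict.score, verdict.reasoning)
```

If the evaluator model cannot produce schema-constrained JSON, the decode of `reply.message.content` will fail, which is why the notice above matters.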