base benchmark
This commit is contained in:
10
README.md
Normal file
10
README.md
Normal file
@@ -0,0 +1,10 @@
|
||||
# LSFBench
|
||||
Minimal Luau/Lune benchmark to evaluate LLMs: one model answers questions, another model scores the answers against the reference key.
|
||||
|
||||
## Quick Start
|
||||
Prereqs
|
||||
- Install Lune (0.10.x)
|
||||
- Start Ollama at `http://localhost:11434` and pull the models referenced in `config.luau` (e.g. `qwen3:4b`)
|
||||
|
||||
## Notice
|
||||
The evaluator model must support structured JSON outputs.
|
Reference in New Issue
Block a user