YAML Metadata Error: Invalid content in Eval Result file .eval_results/terminal_bench.yaml

Task ID "terminal_bench" does not match any task in dataset "harborframework/terminal-bench-2.0". Available: terminalbench_2
GLM-5 / .eval_results/terminal_bench.yaml
Add Terminal-Bench evaluation result (52.4%) (#56)
dce1f7c
277 Bytes
- dataset:
    id: harborframework/terminal-bench-2.0
    task_id: terminal_bench
  value: 52.4
  date: '2026-02-23'
  source:
    url: https://www.tbench.ai/leaderboard/terminal-bench/2.0
    name: Terminal-Bench Leaderboard
    user: burtenshaw
  notes: "agent: Terminus 2"
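
The error reported for this file comes down to a lookup of `task_id` against the tasks the dataset declares. The check can be sketched in Python as below. This is a minimal illustration, not the Hub's actual validator: the `check_task_id` helper and the `AVAILABLE_TASKS` table are hypothetical names, and the table holds only the one task ID quoted in the error message rather than anything fetched from the dataset.

```python
# Hypothetical sketch of the task_id check described by the error message.
# AVAILABLE_TASKS is an assumption: it is copied from the error text
# ("Available: terminalbench_2"), not read from the dataset itself.
AVAILABLE_TASKS = {
    "harborframework/terminal-bench-2.0": {"terminalbench_2"},
}


def check_task_id(entry, available=AVAILABLE_TASKS):
    """Return None if the entry's task_id is known, else an error string."""
    dataset = entry["dataset"]
    known = available.get(dataset["id"], set())
    if dataset["task_id"] in known:
        return None
    return (
        f'Task ID "{dataset["task_id"]}" does not match any task in dataset '
        f'"{dataset["id"]}". Available: {", ".join(sorted(known))}'
    )


# The entry from this file, written as a plain dict (only the fields the check needs).
entry = {
    "dataset": {
        "id": "harborframework/terminal-bench-2.0",
        "task_id": "terminal_bench",
    },
    "value": 52.4,
}

print(check_task_id(entry))
```

Under these assumptions, the entry as committed fails the check, while the same entry with `task_id: terminalbench_2` passes it.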