4 Commits

Author SHA1 Message Date
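9899d4a0c7 feat: add benchmark task data for scientific research software
- Add task JSON files for scientific software such as avogadro/imagej/jade/origin/ovito/pymol/vesta
- Update vllm_eval.py to rename image files to "step x"
- Add extra data parameters config and metadata to desktop_env.py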
2026-02-25 15:19:36 +08:00
cui0711
3890ee5fc3 fix(vllm_eval): add image compression to prevent 413 error with large max_steps 2026-02-09 14:24:59 +08:00
cui0711
9bc54c0a66 feat(vllm_eval): add structured JSON response format with step analysis 2026-02-09 13:58:14 +08:00
cui0711
dd58a1de03 feat(evaluator): add vision-language model evaluator 2026-02-05 16:52:35 +08:00