feat: add benchmark task data for scientific research software

- Add task JSON files for scientific software, including avogadro/imagej/jade/origin/ovito/pymol/vesta
- Update vllm_eval.py: rename image files to "第x步" ("step x")
- desktop_env.py: add extra data parameters config and metadata
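
The step-based image naming mentioned above could look roughly like the sketch below. This is only an illustration: the helper name `step_image_name`, the `.png` extension, and the 1-based step index are assumptions, not taken from vllm_eval.py.

```python
def step_image_name(step: int, ext: str = ".png") -> str:
    """Build a per-step image filename such as '第3步.png' ("step 3").

    Hypothetical helper: the real naming code in vllm_eval.py is not
    shown in this commit view, so the signature and extension are assumed.
    """
    if step < 1:
        raise ValueError("step must be a positive integer")
    return f"第{step}步{ext}"


# Example: names for the first three screenshots of a trajectory.
names = [step_image_name(i) for i in range(1, 4)]
print(names)
```

Naming files by step keeps the screenshot order readable from the filename alone, which may help when the images are passed to an evaluator in sequence.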
2026-02-25 15:19:36 +08:00
parent 613f55f0da
commit 9899d4a0c7
85 changed files with 4703 additions and 71 deletions

10
.gitignore vendored

@@ -208,7 +208,17 @@ quick_start.py
result_multi_apps_pengxiang_transformers12evaluation_examples/settings/proxy/dataimpulse.json
evaluation_examples/settings/proxy/dataimpulse.json
# Benchmark input data (large binary files - share via cloud storage or Git LFS)
evaluation_examples/inputs/
# Temporary data processing workspace (scraped docs, intermediate scripts)
evaluation_examples/sandbox/
# Image cache
evaluation_examples/inputs/.img_cache/
# Local test configurations (not for public repo)
evaluation_examples/spiderman.json
evaluation_examples/test_50_random_proportional.json
evaluation_examples/test_chrome.json
evaluation_examples/prepare_input_files.py