Commit Graph

  • 3f0ef4849a Add origin data files lzy/data-processing kingyang0 2026-03-19 18:05:13 +08:00
  • ae202be7b9 Update origin task kingyang0 2026-03-19 17:58:11 +08:00
  • 64e19ba17e Add ovito data examples kingyang0 2026-03-19 10:08:25 +08:00
  • ae92e80a0b Update ovito evaluation examples kingyang0 2026-03-19 10:00:27 +08:00
  • 0e2702fb5b Merge branch 'lzy/data-processing' of https://git.matai.center/lzy/sci-gui-agent-benchmark into lzy/data-processing kingyang0 2026-03-18 17:33:11 +08:00
  • 16ea3641bd Add ovito example files kingyang0 2026-03-18 16:49:11 +08:00
  • dc5fd173f1 data: update avogadro building-metal-complexes task1 & task3 lizhanyuan 2026-03-13 17:19:44 +08:00
  • 19795a674b chore: add demo_task3 recording artifacts to gitignore lizhanyuan 2026-03-11 11:13:23 +08:00
  • 349f2142fb fix: vllm_eval uses the original resolution for evaluation by default lizhanyuan 2026-03-11 11:06:01 +08:00
  • a943c1e961 feat: update Jade/VESTA task definitions + final evaluation list lizhanyuan 2026-03-11 11:02:26 +08:00
  • d71f1f976d feat: vllm_eval keyframe sampling + Gemini OpenAI proxy support lizhanyuan 2026-03-04 16:39:24 +08:00
  • 4bde685bbd feat: add Proxmox provider support and inject_steps parameter lizhanyuan 2026-03-04 16:39:08 +08:00
  • e70f1335f0 config: update test task configuration files lizhanyuan 2026-03-04 10:44:00 +08:00
  • 9431bd5bfc data: refine metadata steps of existing avogadro/imagej/origin/ovito/pymol/vesta tasks lizhanyuan 2026-03-04 10:43:49 +08:00
  • b1052c79cf data: add jade/avogadro/ovito/pymol evaluation task data lizhanyuan 2026-03-04 10:43:29 +08:00
  • ac3f38ed58 feat: add refine_metadata script, update extract_instructions_v2 lizhanyuan 2026-03-04 10:43:14 +08:00
  • e4b039fc02 refine jade metadata steps: add shortcuts & merge menu operations to avoid timeout lizhanyuan 2026-02-27 18:19:04 +08:00
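  • b75f6bf341 feat: enhance task step injection and a11y state representation, improve tree interaction stability lizhanyuan 2026-02-26 18:56:53 +08:00
  • 07e66490dd feat: enhance a11y tree support for scientific research software lizhanyuan 2026-02-26 15:04:28 +08:00
  • 9899d4a0c7 feat: add benchmark task data for scientific research software lizhanyuan 2026-02-25 15:19:36 +08:00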
  • 613f55f0da feat(tools): add instructions extraction script for generating test cases os_world cui0711 2026-02-09 17:47:02 +08:00
  • ba03784196 fix(env): handle None result_getter for vllm_eval evaluator cui0711 2026-02-09 17:46:05 +08:00
  • 3890ee5fc3 fix(vllm_eval): add image compression to prevent 413 error with large max_steps cui0711 2026-02-09 14:24:59 +08:00
  • 9bc54c0a66 feat(vllm_eval): add structured JSON response format with step analysis cui0711 2026-02-09 13:58:14 +08:00
  • 1e9281a1ab feat(cli): add eval_model argument cui0711 2026-02-05 16:56:39 +08:00
  • 63484c7b7b fix(runner): pass result_dir to evaluate and re-enable environment reset cui0711 2026-02-05 16:55:49 +08:00
  • ad46acc5f3 refactor(example): replace check_include_exclude with vllm_eval evaluator cui0711 2026-02-05 16:55:03 +08:00
  • 58d411bf86 feat(evaluator): export vllm_eval module cui0711 2026-02-05 16:54:16 +08:00
  • be24e77d93 feat(env): add eval_model parameter and result_dir support for vllm evaluation cui0711 2026-02-05 16:53:12 +08:00
  • dd58a1de03 feat(evaluator): add vision-language model evaluator cui0711 2026-02-05 16:52:35 +08:00
  • 231f7a8fbc feat(eval): add jade test case and update test categories cui0711 2026-01-30 16:29:05 +08:00
  • 716d82f4d1 feat: add flexible recording control and improve execution logging cui0711 2026-01-30 16:28:13 +08:00
  • 47bcfc0f0b feat(agent): add screenshot compression and dynamic resolution support cui0711 2026-01-30 16:28:02 +08:00
  • 7e9090e115 fix(prompts): fix template variable syntax and add dynamic resolution cui0711 2026-01-30 16:28:02 +08:00
  • 308282e830 feat(server): add cross-platform support and improve screenshot handling cui0711 2026-01-30 16:27:49 +08:00
  • 788b248dbc fix(logger): add Windows platform support for file locking cui0711 2026-01-30 16:27:49 +08:00
  • 43cb5519ad feat: add JADE task instructions data_collection cui0711 2026-01-15 09:58:20 +08:00
  • 2ebb4866cd Initial commit main lizhanyuan 2026-01-12 18:43:06 +08:00
  • 214e15c04c Initial commit lizhanyuan 2026-01-12 18:30:12 +08:00
  • 5463d3bb89 uipath v2 (#413) alexandruilie7 2026-01-09 02:47:20 +02:00
  • 5ef8bdfa35 EvoCUA Update (2025.01.05) (#412) 蘑菇先生 2026-01-05 16:14:53 +08:00
  • 439e178a2e fix(os_symphony_evaluation) (#410) Bowen Yang 2026-01-04 15:56:51 +08:00
  • 951e1928c8 fix(desktop_os_symphony):support aws (#406) Bowen Yang 2026-01-01 11:27:34 +08:00
  • 02a35be067 fix(os_symphony) (#405) Bowen Yang 2025-12-30 22:43:47 +08:00
  • 662826f57e fix(os_symphony):prompt (#402) Bowen Yang 2025-12-29 20:45:36 +08:00
  • 410ec63a89 Add EvoCUA Support (#401) xuetf 2025-12-23 20:46:23 +08:00
  • 031696e83c fix os_symphony (#400) Bowen Yang 2025-12-23 20:45:30 +08:00
  • f593f35b1c add_os_symphony (#399) Bowen Yang 2025-12-23 14:30:44 +08:00
  • ac31778ee3 Update: requirements.txt for seed agent Ubuntu 2025-12-15 11:47:56 +00:00
  • 60caa52fc4 Update: requirements.txt for seed agent Ubuntu 2025-12-15 11:47:40 +00:00
  • 41477a9c40 Update: seed agent Ubuntu 2025-12-15 11:45:57 +00:00
  • 78433ecfcf Add agent: seed agent Ubuntu 2025-12-12 05:35:20 +00:00
  • 9540454b0a Fix demo agent (PromptAgent) reset(): add vm_ip and kwargs for compatibility with lib_run_single.py (#388) Meshal Nayim 2025-12-08 23:59:25 -08:00
  • cbc3b590ff Task fix batch (#383) MillanK 2025-11-19 17:24:25 +08:00
  • 903ed36715 Add Claude Sonnet 4.5 support and improve action handling (#362) Qichen Fu 2025-11-13 21:54:32 -08:00
  • 3167339e45 Add hosted GBOX agent for OSWorld evaluation (#376) Subash Shibu 2025-11-12 21:13:31 -08:00
  • 00b6468eb7 feat/dart_gui (#371) Pengxiang-Li 2025-11-07 21:50:01 +08:00
  • 6d43dbc532 Update GIMP evaluation examples to replace local file paths with cloud file URLs for consistency and accessibility. (#372) yiqilin 2025-11-07 21:49:49 +08:00
  • 8365edc975 Add new section in README for OSWorld-MCP project Timothyxxx 2025-10-30 06:06:48 +00:00
  • 21c2b7629b Add consistent scores validation (#368) Daphne Barretto 2025-10-28 10:44:48 -07:00
  • 3bf54c92a9 Merge branch 'main' of github.com:xlang-ai/OSWorld Timothyxxx 2025-10-23 14:28:14 +08:00
  • a484f2e484 Update setup.py for version bump and dependency adjustments Timothyxxx 2025-10-23 14:27:52 +08:00
  • 9f97535ef9 oswrold agent wrapper for trained v7 (#360) Atharva Gundawar 2025-10-17 11:29:15 -07:00
  • afd29115da support aliyun eval of qwen3vl ludunjie.ldj 2025-10-16 16:20:54 +08:00
  • 55372c4432 Fix API base URLs for OpenAI and DashScope Dunjie Lu 2025-10-14 12:57:00 +08:00
  • d25464c203 Djlu/qwen3vl dash (#356) Dunjie Lu 2025-10-13 16:31:06 +08:00
  • f9e9273b3b OpenCUA-72B (#354) Xinyuan Wang 2025-10-13 10:39:33 +08:00
  • ddb8372a6c init public release (#350) Yan98 2025-10-07 01:16:31 +11:00
  • 5eff00a9e3 Fix #347: Fix NameError in open_file timeout message (#351) eun2ce 2025-10-06 23:14:15 +09:00
  • ff6285cfbb Add safe browsing feature to Chrome evaluator Timothyxxx 2025-10-05 04:56:08 +00:00
  • afd5952e44 ver Oct3rd (#349) Danyang Zhang 2025-10-04 00:13:29 +08:00
  • 1572068035 Refactor evaluator functions in JSON examples to use URL pattern matching. Update expected URL formats to regex patterns for better validation in chrome evaluation examples. Timothyxxx 2025-10-01 19:20:06 +00:00
  • 9be518435c Update GIMP evaluation examples to replace local file paths with cloud file URLs for consistency and accessibility. Timothyxxx 2025-10-01 09:54:52 +00:00
  • bfb467da18 Merge branch 'main' of github.com:xlang-ai/OSWorld Timothyxxx 2025-10-01 06:56:43 +00:00
  • 4c685bed99 Update run_maestro.py to run in headless mode with a single environment and specify result directory. Adjust default TTL for AWS instances from 60 to 180 minutes in config.py. Enhance AWSProvider to handle missing security groups, subnet IDs, and instance types with fallbacks, and improve termination logic to skip already terminated instances while logging relevant information. Timothyxxx 2025-10-01 06:56:33 +00:00
  • 5eb5417188 fix #210: add a11y_tree support to UITARSAgent (#346) eun2ce 2025-09-26 19:25:28 +09:00
  • 6827949418 fix _update_browse_history_setup (#345) Yanxiao Zhao 2025-09-25 13:22:40 +08:00
  • a4f8fe2f00 Add autoglm-os-9b-v (#344) Yanxiao Zhao 2025-09-24 19:43:28 +08:00
  • f59cf00cae Add ui agent (#343) alexandruilie7 2025-09-24 14:42:46 +03:00
  • 088e68798c update aworldguiAgent code (#342) Long Chen 2025-09-23 16:50:29 +08:00
  • 584c7a9875 Enhance AWSProvider instance handling with fallback mechanisms for security groups, subnet IDs, and instance types. Implement checks to skip termination of instances already in 'shutting-down' or 'terminated' states, and handle potential termination errors gracefully. Timothyxxx 2025-09-18 07:16:10 +00:00
  • 7213eca069 support mano agent (#338) molanhand 2025-09-16 18:10:29 +08:00
  • dc7e46e7aa Refactor platform detection for VM image download (#337) ZhangZuhao 2025-09-15 21:00:15 +08:00
  • b012301609 support qwen3vl agent (#336) Dunjie Lu 2025-09-15 16:04:29 +08:00
  • a668670349 fix(maestro): Fixed the debug logging level (#334) Hiroid 2025-09-11 01:03:59 +08:00
  • 3a4b67304f Add multiple new modules and tools to enhance the functionality and extensibility of the Maestro project (#333) Hiroid 2025-09-08 15:07:21 +08:00
  • 029885e78c Merge branch 'main' of github.com:xlang-ai/OSWorld Timothyxxx 2025-09-05 15:36:39 +00:00
  • 640f3fcd96 Update default path_to_vm argument to None in quickstart.py for improved flexibility Timothyxxx 2025-09-05 15:36:31 +00:00
  • 756923beea Update instruction wording in LibreOffice Impress example to clarify text color change requirements. Address https://github.com/xlang-ai/OSWorld/issues/324 Timothyxxx 2025-09-01 23:29:47 +08:00
  • 0c681b91e0 Fix README update Timothyxxx 2025-09-01 15:15:50 +00:00
  • 8513e8c89e Add quickstart script and update README (#325) aneeshprasad1 2025-09-01 08:14:24 -07:00
  • 756e006af6 add support for mobile agent v3 (#328) Howie 2025-08-31 22:58:41 +08:00
  • 54a14cbc07 fix multienv bug (#327) hanyullai 2025-08-30 11:10:53 +08:00
  • 3344abd641 Add support for GUI-Owl agent (#318) Howie 2025-08-27 18:03:39 +08:00
  • ef2f35de22 Add resource group ID support for Aliyun VM allocation Timothyxxx 2025-08-26 13:28:23 +08:00
  • 4c773f6f7c Merge branch 'main' of github.com:xlang-ai/OSWorld Timothyxxx 2025-08-22 23:29:21 +08:00
  • ebda4d8b3f Add Aliyun SDK dependencies and implement TTL configuration for ECS instances Timothyxxx 2025-08-22 23:28:58 +08:00
  • 15d9ddb612 update coact: add autogen/cache Timothyxxx 2025-08-21 19:03:35 +00:00
  • b14f1c7345 Merge branch 'main' of github.com:xlang-ai/OSWorld Timothyxxx 2025-08-21 09:38:37 +00:00
  • ead564c92b Update dependencies and refactor DesktopEnv initialization Timothyxxx 2025-08-21 09:38:28 +00:00