Adam Yanxiao Zhao
aa05f6cc26
Add AutoGLM-OS agent (#309)
* autoglm-os initialize
* clean code
* chore: use proxy for download setup
* feat(autoglm-os): add parameter to toggle images
* fix: use temporary directory for files pulled from the vm to prevent potential collision when running multiple instances of the same task in parallel
* update
* add client_password
* update multienv
* fix
* fix prompt
* fix prompt
* fix prompt
* fix sys prompt
* feat: use proxy in file evaluator
* fix client_password
* fix note_prompt
* fix autoglm agent cmd type
* fix
* revert: fix: use temporary directory for files pulled from the vm to prevent potential collision when running multiple instances of the same task in parallel
reverts commit bab5473eea1de0e61b0e1d68b23ce324a5b0ee57
* feat(autoglm): setup tools
* fix(autoglm): remove second time of get a11y tree
* add osworld server restart
* Revert "add osworld server restart"
This reverts commit 7bd9d84122e246ce2a26de0e49c25494244c2b3d.
* fix _launch_setup
* fix autoglm agent tools & xml tree
* fix desktop_env
* fix bug for tool name capitalization
* fix: always use proxy for setup download
* add fail after exceeding max turns
* fix(autoglm): avoid adding image to message when screenshot is empty
* fix maximize_window
* fix maximize_window
* fix maximize_window
* fix import browsertools module bug
* fix task proxy config bug
* restore setup
* refactor desktop env
* restore image in provider
* restore file.py
* refactor desktop_env
* quick fix
* refactor desktop_env.step
* fix our env reset
* add max turns constraint
* clean run script
* clean lib_run_single.py
---------
Co-authored-by: hanyullai <hanyullai@outlook.com>
Co-authored-by: JingBh <jingbohao@yeah.net>
2025-08-17 12:08:40 +08:00
yuanmengqi
e433f35c1f
feat: standardize configuration fields across all evaluation examples
- Add `fixed_ip` field to all 369 JSON files in examples directory
- Set to `true` for 8 files listed in google_chrome.json multi_apps
- Set to `false` for remaining 361 files
- Add `possibility_of_env_change` field to 363 JSON files missing this field
- Set to "low" for newly added fields
- Preserve existing values (4 medium, 2 high) for 6 files that already had this field
This ensures consistent configuration schema across all evaluation examples
while maintaining backward compatibility with existing settings.
2025-07-16 13:45:34 +00:00
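The field standardization described in the commit above could be sketched roughly as follows. This is a minimal illustration, not the script the commit actually used: the function name `standardize_example`, the `fixed_ip_ids` parameter, and the assumption that each example JSON carries an `id` key are all hypothetical.

```python
from typing import Any, Dict, Set


def standardize_example(config: Dict[str, Any], fixed_ip_ids: Set[str]) -> Dict[str, Any]:
    """Return a copy of one example config with both fields present.

    Hypothetical helper: `fixed_ip_ids` stands in for the set of example
    IDs that should get fixed_ip = true, and the "id" key is an assumption
    about the example schema.
    """
    updated = dict(config)
    # fixed_ip is true only for examples named in the fixed-IP set.
    updated["fixed_ip"] = updated.get("id") in fixed_ip_ids
    # Fill in the default only when the field is missing, so existing
    # "medium" or "high" values are preserved.
    updated.setdefault("possibility_of_env_change", "low")
    return updated
```

Using `setdefault` keeps the change backward compatible: files that already declare `possibility_of_env_change` are left as they were, and only the missing ones receive "low".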
XXZ
c8a6a22aad
Fix VLC task design (#238)
* fix: fix multiapp tasks
* fix: update instructions for VLC evaluation examples
---------
Co-authored-by: adlsdztony <zzl0712@connect.hku.hk>
2025-07-04 20:39:48 +08:00
Zilong Zhou
1308a80029
Update 5990457f-2adb-467b-a4af-5c857c92d762.json (#235)
2025-07-04 13:31:18 +08:00
XXZ
ac24ccce99
fix: fix multiapp tasks (#229)
Co-authored-by: adlsdztony <zzl0712@connect.hku.hk>
2025-07-03 21:53:58 +08:00
Zilong Zhou
595a704aff
fix: fix proxy setup (#227)
* fix: fix proxy setup
* feat&fix: add proxy support in setup and remove hardcoded proxy from example
2025-07-02 01:36:32 +08:00
Timothyxxx
fb7bafb885
feat: Add proxy configuration to all 369 evaluation examples - 55 with proxy, 314 without
2025-06-05 18:46:53 +08:00
Timothyxxx
34748567a5
feat: Migrate OSWorld files to HuggingFace cache with comprehensive documentation
- Add detailed README for file cache repository
- Implement migration script with retry logic and browser simulation
- Support automatic file type detection and deduplication
- Ensure reliable hosting for OSWorld evaluation files
2025-05-28 04:29:37 +08:00
Timothyxxx
447c886b0a
Fix multiple apps 5990457f-2adb-467b-a4af-5c857c92d762
2024-03-09 20:54:52 +08:00
rhythmcao
da0dafc32c
add multi-apps 5 examples by ruisheng 2024-03-06
2024-03-06 21:20:26 +08:00