cui0711
716d82f4d1
feat: add flexible recording control and improve execution logging
2026-01-30 16:28:13 +08:00
Bowen Yang
662826f57e
fix(os_symphony):prompt (#402)
...
* add_os_symphony
* fix(os_symphony)
* fix(os_symphony):prompt
---------
Co-authored-by: Tianbao Xie <47296835+Timothyxxx@users.noreply.github.com>
2025-12-29 20:45:36 +08:00
xuetf
410ec63a89
Add EvoCUA Support (#401)
...
* evocua init
* setup max_token
---------
Co-authored-by: xuetaofeng <xuetaofeng@meituan.com>
Co-authored-by: Tianbao Xie <47296835+Timothyxxx@users.noreply.github.com>
2025-12-23 20:46:23 +08:00
Bowen Yang
031696e83c
fix os_symphony (#400)
...
* add_os_symphony
* fix(os_symphony)
---------
Co-authored-by: Tianbao Xie <47296835+Timothyxxx@users.noreply.github.com>
2025-12-23 20:45:30 +08:00
Qichen Fu
903ed36715
Add Claude Sonnet 4.5 support and improve action handling (#362)
...
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude <noreply@anthropic.com>
2025-11-14 13:54:32 +08:00
Subash Shibu
3167339e45
Add hosted GBOX agent for OSWorld evaluation (#376)
2025-11-13 13:13:31 +08:00
Atharva Gundawar
9f97535ef9
osworld agent wrapper for trained v7 (#360)
2025-10-18 02:29:15 +08:00
Xinyuan Wang
f9e9273b3b
OpenCUA-72B (#354)
...
* use aws pub ip
* os task fix: set the default dim screen time to be 300s
* OpenCUA-72B
* update password
* update
* update
* update opencua72b agent
* change provider ip
---------
Co-authored-by: Jiaqi <dengjiaqi@moonshot.cn>
2025-10-13 10:39:33 +08:00
Yanxiao Zhao
a4f8fe2f00
Add autoglm-os-9b-v (#344)
...
* update for autoglm-v
* Update run_autoglm.py
---------
Co-authored-by: hanyullai <hanyullai@outlook.com>
2025-09-24 19:43:28 +08:00
alexandruilie7
f59cf00cae
Add ui agent (#343)
...
* add uipath agent
* readme update
2025-09-24 19:42:46 +08:00
molanhand
7213eca069
support mano agent (#338)
...
Co-authored-by: Fei Hu <molanhand@users.noreply.github.com>
2025-09-16 18:10:29 +08:00
Adam Yanxiao Zhao
aa05f6cc26
Add AutoGLM-OS agent (#309)
...
* autoglm-os initialize
* clean code
* chore: use proxy for download setup
* feat(autoglm-os): add parameter to toggle images
* fix: use temporary directory for files pulled from the vm to prevent potential collision when running multiple instances of the same task in parallel
* update
* add client_password
* update multienv
* fix
* fix prompt
* fix prompt
* fix prompt
* fix sys prompt
* feat: use proxy in file evaluator
* fix client_password
* fix note_prompt
* fix autoglm agent cmd type
* fix
* revert: fix: use temporary directory for files pulled from the vm to prevent potential collision when running multiple instances of the same task in parallel
reverts commit bab5473eea1de0e61b0e1d68b23ce324a5b0ee57
* feat(autoglm): setup tools
* fix(autoglm): remove the second a11y tree fetch
* add osworld server restart
* Revert "add osworld server restart"
This reverts commit 7bd9d84122e246ce2a26de0e49c25494244c2b3d.
* fix _launch_setup
* fix autoglm agent tools & xml tree
* fix desktop_env
* fix bug for tool name capitalization
* fix: always use proxy for setup download
* add fail after exceeding max turns
* fix(autoglm): avoid adding image to message when screenshot is empty
* fix maximize_window
* fix maximize_window
* fix maximize_window
* fix import browsertools module bug
* fix task proxy config bug
* restore setup
* refactor desktop env
* restore image in provider
* restore file.py
* refactor desktop_env
* quick fix
* refactor desktop_env.step
* fix our env reset
* add max turns constraint
* clean run script
* clean lib_run_single.py
---------
Co-authored-by: hanyullai <hanyullai@outlook.com>
Co-authored-by: JingBh <jingbohao@yeah.net>
2025-08-17 12:08:40 +08:00
Xinyuan Wang
3d32556085
Uitars/dev (#291)
...
* use aws pub ip
* os task fix: set the default dim screen time to be 300s
* add all the uitars agents:
1. run_multienv_uitars.py: Qwen2VL-based UITARS models
2. run_multienv_uitars15_v1.py: UITARS1.5-7B
3. run_multienv_uitars15_v2.py: SeedVL1.5 thinking/non-thinking
---------
Co-authored-by: Jiaqi <dengjiaqi@moonshot.cn>
2025-07-31 08:52:27 +08:00
yuanmengqi
39e5baf5ae
fix: remove unnecessary sleep and observation retrieval in run_single_example function
2025-07-25 15:51:20 +00:00
Yuan Mengqi
0a37cccd53
update claude (#280)
...
* add uitars agent code
* improve claude
* improve claude
* improve claude
* improve claude
* improve claude
2025-07-23 03:35:49 +08:00
yuanmengqi
91bc6bb6ce
Merge branch 'main' of github.com:xlang-ai/OSWorld
2025-07-20 07:55:57 +00:00
yuanmengqi
88d5639a2a
Compatible with agents that cannot use runtime log
2025-07-20 07:55:53 +00:00
Xinyuan Wang
e10dd9267c
Wxy/opencua (#274)
...
* OpenCUA Agent code base
* update url
* debug, modify url input
* debug opencua
* show result
* debug agent history overlap
* modify opencua agent; add comment lines
* update parallel; clean code; use sleep 3s
* ui-tars-0717
2025-07-20 15:52:23 +08:00
Xinyuan Wang
0f2655249c
Wxy/opencua (#260)
...
* OpenCUA Agent code base
* update url
* debug, modify url input
* debug opencua
* show result
* debug agent history overlap
* modify opencua agent; add comment lines
2025-07-16 17:53:12 +08:00
Xinyuan Wang
db83b9cb2c
Wxy/opencua (#256)
...
* OpenCUA Agent code base
* update url
* debug, modify url input
2025-07-14 20:26:39 +08:00
yuanmengqi
aee1207fff
fix error
2025-06-09 04:20:59 +00:00
adlsdztony
ae473d8673
fix: remove pending checks from actions to prevent json serialization issues
2025-06-04 14:20:02 +00:00
yuanmengqi
228849ab03
add openai cua agent
2025-05-31 11:22:38 +00:00
Xinyuan Wang
d626cc90d9
Modify: wait for initial env to be ready (#139)
2025-02-25 21:08:25 +08:00
Junli Wang
1503eb3994
Finish Aguvis eval on OSWorld (#107)
...
* Initialize Aguvis eval on OSWorld
* Debug
* Debug
* v1, internal version
* Add experiments script
* Fix minor bugs
* Update new endpoint
* Update ip
* Update
* Update
* Update
* Update
* Update
* Update
* Update
* Update
* Fix model name
* Fix docker close issues; update prompting
* Fix missed
* Fix the default port to avoid crashing on examples like '_update_browse_history_setup'
* Fix server and chromium ports in setup
* Revert and add missed dependency
* Add VLC port for docker
* Update
* Aguvis Grounding
* Add Aguvis as planner
* fix parse bug
* fix pause
* fix planner prompt
* Aguvis Grounding
* fix
* fix
* fix
* add logger for each example
* Modify Aguvis Planner Prompts
* fix logger setup
* fix absolute coordinates
* Finish Aguvis Evaluation on OSWorld
* Merge origin/main into junli/aguvis
* Remove screenshot
---------
Co-authored-by: Tianbao Xie <tianbaoxie@U-492FC39R-0217.local>
Co-authored-by: Timothyxxx <384084775@qq.com>
Co-authored-by: FredWuCZ <fredwucz@outlook.com>
2024-11-24 16:43:25 +08:00
tsuky_chen
1fd8b66fde
debug/timeout (#59)
2024-07-24 08:31:42 +08:00
Timothyxxx
9c75df5dce
Clean code; Refactor environment to pass screenshot content instead of path
2024-04-13 23:34:01 +08:00
Timothyxxx
8e760fd450
Disable wandb temporarily; speed up the environment step by removing the redundant a11y tree re-fetch and terminal output
2024-03-19 08:57:05 +08:00
Timothyxxx
f992d1f694
Disable a11y tree temporarily
2024-03-18 21:43:35 +08:00
Timothyxxx
c1c7ac298f
Update claude endpoint
2024-03-18 14:59:02 +08:00
Jason Lee
576248ae18
uncomment timer
2024-03-18 12:02:34 +08:00
Jason Lee
8080828a84
update wandb settings
2024-03-18 00:02:41 +08:00
Jason Lee
48aedb09a7
add wandb settings, remember to set WANDB_KEY
2024-03-17 22:30:29 +08:00
Timothyxxx
639f8c7db8
Fix small bugs in max time limit setting
2024-03-16 14:34:40 +08:00
Jason Lee
053da203b8
new timer; it currently has to be set in the setting.json file and should be promoted to parameters
2024-03-16 12:36:23 +08:00
Jason Lee
44679724b8
try new timer
2024-03-16 11:54:45 +08:00