128 Commits

Author SHA1 Message Date
Yan98
4e3446d6fe Fix Name (#249)
* init

* init
2025-07-11 00:15:46 +08:00
Yan98
0a5058342d init (#246) 2025-07-10 00:29:42 +08:00
yuanmengqi
7315aec6e6 clean code 2025-06-10 04:06:54 +00:00
yuanmengqi
3da32fe5cf update operator prompt 2025-06-10 02:35:53 +00:00
yuanmengqi
692486f8e7 add GDrive guideline 2025-06-09 14:59:47 +00:00
yuanmengqi
aee1207fff fix error 2025-06-09 04:20:59 +00:00
yuanmengqi
d8872634ee edit prompt 2025-06-08 03:59:31 +00:00
yuanmengqi
c57b1d4e7a eval update 2025-06-07 13:19:22 +00:00
yuanmengqi
a146c1e0b7 edit prompt 2025-06-07 05:21:04 +00:00
yuanmengqi
64177045b5 Merge remote-tracking branch 'upstream/feat/aws-provider-support' 2025-06-06 10:22:56 +00:00
Timothyxxx
8373f7cff2 refactor: remove AWSVMManagerWithProxy and integrate proxy support directly into AWSVMManager for streamlined VM allocation;
minor fix on openai_cua_agent
2025-06-06 02:55:50 +08:00
yuanmengqi
a6300e05c9 Merge remote-tracking branch 'upstream/feat/aws-provider-support' 2025-06-05 13:31:42 +00:00
adlsdztony
3b1540ed23 feat&fix: enhance task status handling and update logging configuration 2025-06-05 09:33:36 +00:00
yuanmengqi
b211df3385 fix timeout 2025-06-04 10:23:45 +00:00
yuanmengqi
98a810d31e edit operator 2025-06-02 12:11:25 +00:00
yuanmengqi
228849ab03 add openai cua agent 2025-05-31 11:22:38 +00:00
uvheart
a845824f06 add azure_gpt_4o (#197) 2025-05-23 03:57:42 +08:00
Shihao Liang
119bef25e2 Dev/uitars 15 (#194)
* debug uitars1.0, add uitars1.5

* update pyautogui parser

* modify function name

* update parser

* update prompt

* FIX: bug in ui tars
2025-05-19 17:15:17 +08:00
MillanK
51f5ddea04 Add Jedi agent implementation to mm_agents (#192)
* feat: implement Jedi agent

* chore: code clean
2025-05-10 19:55:33 +08:00
Thomas Kuntz
5678b510d7 fix: Invalid escape sequence in prompts (#191)
Fixes the warning: SyntaxWarning: invalid escape sequence '\`'
2025-05-10 18:19:07 +08:00
Thomas Kuntz
7d88283f8a feat: Support newer Gemini models (#188) 2025-05-06 16:04:30 +08:00
Shihao Liang
b92c716df7 Dev/uitars 15 (#181)
* debug uitars1.0, add uitars1.5

* update pyautogui parser

* modify function name

* update parser

* update prompt
2025-04-21 13:44:08 +08:00
Shihao Liang
bd2e980666 Dev/uitars 15 (#178)
* debug uitars1.0, add uitars1.5

* update pyautogui parser

* modify function name

* update parser
2025-04-17 18:49:21 +08:00
Shiqian Su
c4d818c5cf Update aguvis_agent.py (#141)
Fix Aguvis prompt bug
2025-02-28 16:48:41 +08:00
Shihao Liang
339a13e1d5 Dev/uitars (#132)
* init uitars

* change agent class name

* FIX: return bug in agent predict
2025-02-14 11:17:37 +08:00
Shihao Liang
0bc1e08440 Dev/uitars (#129)
* init uitars

* change agent class name
2025-02-08 12:49:40 +08:00
Timothyxxx
2c8e8a58f6 Fix minor bug caused by new logging feat in aguvis agent traj 2024-12-05 15:45:09 +08:00
Junli Wang
1503eb3994 Finish Aguvis eval on OSWorld (#107)
* Initialize Aguvis eval on OSWorld

* Debug

* Debug

* v1, internal version

* Add experiments script

* Fix minor bugs

* Update new endpoint

* Update ip

* Update

* Update

* Update

* Update

* Update

* Update

* Update

* Update

* Fix model name

* Fix docker close issues; update prompting

* Fix missed

* Fix the default port to avoid crashing on examples like '_update_browse_history_setup'

* Fix server and chromium ports in setup

* Revert and add missed dependency

* Add VLC port for docker

* Update

* Aguvis Grounding

* Add Aguvis as planner

* fix parse bug

* fix pause

* fix planner prompt

* Aguvis Grounding

* fix

* fix

* fix

* add logger for each example

* Modify Aguvis Planner Prompts

* fix logger setup

* fix absolute coordinates

* Finish Aguvis Evaluation on OSWorld

* Merge origin/main into junli/aguvis

* Remove screenshot

---------

Co-authored-by: Tianbao Xie <tianbaoxie@U-492FC39R-0217.local>
Co-authored-by: Timothyxxx <384084775@qq.com>
Co-authored-by: FredWuCZ <fredwucz@outlook.com>
2024-11-24 16:43:25 +08:00
Tianbao Xie
20442244fa [Feature] Initialize and Implement Aguvis Evaluation on OSWorld (#98)
* Initialize Aguvis eval on OSWorld

* Debug

* Debug

* v1, internal version

* Add experiments script

* Fix minor bugs

* Update new endpoint

* Update ip

* Update

* Update

* Update

* Update

* Update

* Update

* Update

* Update

* Fix model name

* Fix docker close issues; update prompting

* Fix missed

* Fix the default port to avoid crashing on examples like '_update_browse_history_setup'

* Fix server and chromium ports in setup

* Revert and add missed dependency

* Add VLC port for docker

* Update

* Clean

---------

Co-authored-by: Tianbao Xie <tianbaoxie@U-492FC39R-0217.local>
Co-authored-by: FredWuCZ <fredwucz@outlook.com>
2024-11-11 12:36:16 +08:00
Tianbao Xie
a156f8a3d6 Modify the namespace of a11y tree (#62) 2024-07-25 20:20:34 +08:00
Timothyxxx
cfc5500a8a Merge remote-tracking branch 'origin/main' 2024-05-21 21:08:43 +08:00
Timothyxxx
306dcbda71 Add Support for QWEN VL models from API (QWEN-VL-max, etc.); Improve on the robustness of getting observation/files, etc. 2024-05-21 21:08:22 +08:00
Timothyxxx
5568dfd141 Handling more exceptions; Fix hyperparameter passing 2024-05-20 17:22:07 +08:00
Timothyxxx
f9594e476e Add Support for QWEN models from API (QWEN-max, etc.); Improve on the robustness of getting observation 2024-05-20 00:47:43 +08:00
Timothyxxx
a500f59419 Add Llama3-70B Support (from Groq) 2024-05-09 02:04:58 +08:00
Timothyxxx
54905380e6 Add Llama3-70B Support (from Groq) 2024-05-09 02:04:02 +08:00
Timothyxxx
97b567a287 Update README and ROADMAP; Fix typos; optimize the code for llm calling in agent.py 2024-04-26 13:32:41 +08:00
Timothyxxx
eaceddf917 Add Gemini Pro 1.5 Support 2024-04-24 18:19:25 +08:00
Timothyxxx
6777ea255a Fix https://github.com/xlang-ai/OSWorld/issues/21 ; Update README for multimodal agents; Add badge in README; Add setup.py 2024-04-15 18:47:54 +08:00
Timothyxxx
9c75df5dce Clean code; Refactor environment to pass screenshot content instead of path 2024-04-13 23:34:01 +08:00
Tianbao Xie
38f4506ea3 Update README.md of agents 2024-04-12 18:25:05 +08:00
Timothyxxx
26ed70ef70 Clean Code; Refactor README 2024-03-27 16:21:49 +08:00
Timothyxxx
d79d5d2c01 Clean Code 2024-03-27 14:46:29 +08:00
Timothyxxx
607cf8e554 Fix max traj length 2024-03-25 18:09:43 +08:00
Timothyxxx
172123ab2c Support downsampling; Fix bugs in windows a11y tree; Add a11y_tree trim 2024-03-25 18:02:48 +08:00
Yiheng Xu
5f2802292a Update agent.py 2024-03-22 12:54:22 +08:00
Timothyxxx
3ce7636abd Fix one multi_app example; remove some broken examples; Support downsampling 2024-03-21 22:05:16 +08:00
Fangyu Lei
3e581c8108 Update agent.py claude 2024-03-21 07:52:58 +08:00
Siheng Zhao
04a9df627c Merge branch 'main' of github.com:ztjhz/DesktopEnv 2024-03-20 22:42:01 +08:00
Siheng Zhao
6927d9e39d [feature] add image downsample func 2024-03-20 22:41:05 +08:00