Tianbao Xie
20442244fa
[Feature] Initialize and Implement Aguvis Evaluation on OSWorld (#98)
...
* Initialize Aguvis eval on OSWorld
* Debug
* Debug
* v1, internal version
* Add experiments script
* Fix minor bugs
* Update new endpoint
* Update ip
* Update
* Update
* Update
* Update
* Update
* Update
* Update
* Update
* Fix model name
* Fix docker close issues; update prompting
* Fix missed
* Fix the default port to avoid crashing on examples like '_update_browse_history_setup'
* Fix server and chromium ports in setup
* Revert and add missed dependency
* Add VLC port for docker
* Update
* Clean
---------
Co-authored-by: Tianbao Xie <tianbaoxie@U-492FC39R-0217.local>
Co-authored-by: FredWuCZ <fredwucz@outlook.com>
2024-11-11 12:36:16 +08:00
Pierre Carrier
b35dc40ff4
SetupController: no server_port for chrome (#96)
2024-11-07 00:33:03 +08:00
HappySix
6419d707bc
Support Docker VM manager and provider (#75)
...
* Add docker provider framework
* Update VM download link
* Add stop container
* Update docker manager & provider
* Update
* Update
* Update provider
2024-09-28 21:10:40 +08:00
Timothyxxx
df231889c9
Fix minor bug
2024-08-04 11:35:44 +08:00
Jason Lee
fcdaf7ce0b
Update setup.py for update_browse_history function
2024-07-04 09:37:13 -05:00
Tianbao Xie
fffa8f8da6
Refactoring VMware Integration and Implementing AWS Support (#44)
...
* Initialize AWS support
* Add README for the VM server
* Refactor OSWorld for supporting more cloud services.
* Initialize VMware and AWS implementation v1, waiting for verification
* Initialize files for Azure, GCP and VirtualBox support
* Debug on the VMware provider
* Fix AWS interface mapping
* Fix instance type
* Refactor
* Clean
* hk region; debug
* Fix lock
* Remove print
* Remove key_name requirement when allocating AWS VM
* Clean README
---------
Co-authored-by: XinyuanWangCS <xywang626@gmail.com>
2024-06-15 20:52:29 +08:00
rhythmcao
c121869219
fix a small bug in computer_13 action space
2024-06-11 14:22:31 +08:00
Timothyxxx
306dcbda71
Add support for QWEN VL models from API (QWEN-VL-max, etc.); improve the robustness of getting observations, files, etc.
2024-05-21 21:08:22 +08:00
Timothyxxx
f9594e476e
Add support for QWEN models from API (QWEN-max, etc.); improve the robustness of getting observations
2024-05-20 00:47:43 +08:00
Timothyxxx
97b567a287
Update README and ROADMAP; Fix typos; optimize the code for llm calling in agent.py
2024-04-26 13:32:41 +08:00
Timothyxxx
9c75df5dce
Clean code; Refactor environment to pass screenshot content instead of path
2024-04-13 23:34:01 +08:00
Timothyxxx
7ca91ca8c9
Add action execution timeout for corner cases
2024-03-21 11:16:57 +08:00
David Chang
15e01e7ccc
ver Mar20thv2
...
fixed bugs in server/main.py (_create_pywinauto_node and get_screen_size)
finished migration of a few task configs to Windows
fixed bug in python.py
2024-03-20 22:22:57 +08:00
Jason Lee
48aedb09a7
add wandb settings, remember to set WANDB_KEY
2024-03-17 22:30:29 +08:00
rhythmcao
da0dafc32c
add multi-apps 5 examples by ruisheng 2024-03-06
2024-03-06 21:20:26 +08:00
David Chang
c39926fc57
Merge branch 'main' into zdy
2024-02-15 22:27:10 +08:00
Timothyxxx
fdb5655c89
Update chrome examples
2024-02-08 13:49:29 +08:00
Timothyxxx
e07a3d52ce
Merge remote-tracking branch 'origin/main'
...
# Conflicts:
# mm_agents/gpt_4v_agent.py
2024-02-02 14:37:23 +08:00
Timothyxxx
068c6f5769
122324154
2024-02-02 14:36:53 +08:00
David Chang
c46fcbfcbe
ver Feb2ndv3
...
working on human eval for multi_apps
2024-02-02 09:30:10 +08:00
David Chang
5ee9621e0d
ver Feb2nd
...
human evaluation as non-expert on chrome tasks
2024-02-02 05:13:12 +08:00
Timothyxxx
d65b6994d3
Fix minor bugs of multiple apps examples
2024-01-31 19:40:41 +08:00
BlankCheng
7d2d8c855e
Merge main
2024-01-29 21:51:26 +08:00
BlankCheng
284d6fb379
Add human operation time log
2024-01-29 21:42:16 +08:00
Timothyxxx
6952b45de4
Improve on agent and tasks configs
2024-01-26 23:30:04 +08:00
tsuky_chen
932b73c67d
load libreoffice writer eval -batch 2
2024-01-26 02:15:42 +08:00
tsuky_chen
3e7cfa8699
load libreoffice writer eval -batch 2
2024-01-26 02:07:26 +08:00
rhythmcao
5ac80dc309
update examples
2024-01-26 00:53:35 +08:00
rhythmcao
5a5309c0fd
add multi-app example, fix googledrive functions
2024-01-25 20:30:54 +08:00
Timothyxxx
b9ae4174b1
Fix OS examples annotated by Yitao
2024-01-25 19:57:32 +08:00
rhythmcao
f194fb8d75
add multi_apps; update chrome utilities
2024-01-25 13:53:19 +08:00
David Chang
ffc4c32bac
ver Jan17th
...
updated the existing task configs
2024-01-17 17:27:08 +08:00
Timothyxxx
186bf2e97c
Implement heuristic cutting on the accessibility tree to get the important nodes; Finish accessibility tree text agent
2024-01-16 16:43:32 +08:00
Timothyxxx
1141232d80
Merge remote-tracking branch 'origin/main'
...
# Conflicts:
# desktop_env/controllers/setup.py
2024-01-15 13:51:11 +08:00
Timothyxxx
24169a65d0
Accomplish the exp scripts v1; Add video recording and trajectory recording of desktop agent; Fix minor bugs
2024-01-15 13:49:48 +08:00
David Chang
fc289a3427
Merge branch 'main' into zdy
2024-01-15 12:12:05 +08:00
rhythmcao
69b0514f99
fix error in pyautogui.typewrite()
2024-01-14 23:53:31 +08:00
Timothyxxx
f153a4c253
Add 'WAIT', 'FAIL', 'DONE' to the action space; Debug basic prompting-based GPT-4 and Gemini agents; Initialize experiments script
2024-01-14 23:36:19 +08:00
David Chang
59fdd9f1a2
ver Jan14th
...
setup method for Thunderbird composing tasks
2024-01-14 23:16:54 +08:00
Timothyxxx
d52b692ee5
Finish loading the vscode examples v1; Improve on the infra: Add accessibility tree into the observation; Add activate window function, etc
2024-01-14 18:30:49 +08:00
Timothyxxx
2228f346a9
Fix minor bugs caused by merging in SetupController; Initialize vscode example loading
2024-01-14 00:51:26 +08:00
Timothyxxx
a1c3e4c294
Finish Chrome example loading v1
2024-01-13 22:56:50 +08:00
rhythmcao
d4116458ff
1. fix quote and \ characters in execute_command; 2. add terminal output text as extra observation; 3. move get_vm_*() to reset()
2024-01-12 18:09:05 +08:00
Timothyxxx
186df65683
Merge remote-tracking branch 'origin/main'
...
# Conflicts:
# desktop_env/controllers/setup.py
# desktop_env/evaluators/metrics/utils.py
2024-01-12 17:30:15 +08:00
Timothyxxx
5a93a32958
Update on Chrome examples; Refactor on logic of controlling
2024-01-12 17:24:47 +08:00
David Chang
127a101994
Merge branch 'main' into zdy
2024-01-11 23:02:00 +08:00
Timothyxxx
820579a5a2
Make up missing getters and metrics; Update VLC scripts; Start to work on Chrome, update examples instructions
2024-01-11 21:27:40 +08:00
David Chang
27eaf2f5d5
ver Jan11th
...
finally set up a simple task, or one that should be simple
2024-01-11 20:03:33 +08:00
Timothyxxx
287876affc
Merge remote-tracking branch 'origin/main'
...
# Conflicts:
# desktop_env/evaluators/getters/__init__.py
# desktop_env/evaluators/metrics/__init__.py
# requirements.txt
2024-01-10 23:20:49 +08:00
Timothyxxx
49ece15ac3
VLC v1 finished, improve on instructions, improve on infra
2024-01-10 23:18:30 +08:00