adlsdztony
|
431a762421
|
feat&fix: add logging for setup function calls and include snapshot name in AWS provider configuration
|
2025-05-26 20:37:20 +08:00 |
|
Tianbao Xie
|
20442244fa
|
[Feature] Initialize and Implement Aguvis Evaluation on OSWorld (#98)
* Initialize Aguvis eval on OSWorld
* Debug
* Debug
* v1, internal version
* Add experiments script
* Fix minor bugs
* Update new endpoint
* Update ip
* Update
* Update
* Update
* Update
* Update
* Update
* Update
* Update
* Fix model name
* Fix docker close issues; update prompting
* Fix missed
* Fix the default port to avoid crashing on examples like '_update_browse_history_setup'
* Fix server and chromium ports in setup
* Revert and add missed dependency
* Add VLC port for docker
* Update
* Clean
---------
Co-authored-by: Tianbao Xie <tianbaoxie@U-492FC39R-0217.local>
Co-authored-by: FredWuCZ <fredwucz@outlook.com>
|
2024-11-11 12:36:16 +08:00 |
|
Pierre Carrier
|
b35dc40ff4
|
SetupController: no server_port for chrome (#96)
|
2024-11-07 00:33:03 +08:00 |
|
HappySix
|
6419d707bc
|
Support Docker VM manager and provider (#75)
* Add docker provider framework
* Update VM download link
* Add stop container
* Update docker manager & provider
* Update
* Update
* Update provider
|
2024-09-28 21:10:40 +08:00 |
|
Timothyxxx
|
df231889c9
|
Fix minor bug
|
2024-08-04 11:35:44 +08:00 |
|
Jason Lee
|
fcdaf7ce0b
|
Update setup.py for update_browse_history function
|
2024-07-04 09:37:13 -05:00 |
|
Timothyxxx
|
97b567a287
|
Update README and ROADMAP; Fix typos; optimize the code for llm calling in agent.py
|
2024-04-26 13:32:41 +08:00 |
|
Timothyxxx
|
9c75df5dce
|
Clean code; Refactor environment to pass screenshot content instead of path
|
2024-04-13 23:34:01 +08:00 |
|
rhythmcao
|
da0dafc32c
|
add multi-apps 5 examples by ruisheng 2024-03-06
|
2024-03-06 21:20:26 +08:00 |
|
David Chang
|
c39926fc57
|
Merge branch 'main' into zdy
|
2024-02-15 22:27:10 +08:00 |
|
Timothyxxx
|
fdb5655c89
|
Update chrome examples
|
2024-02-08 13:49:29 +08:00 |
|
David Chang
|
c46fcbfcbe
|
ver Feb2ndv3
working on human eval for multi_apps
|
2024-02-02 09:30:10 +08:00 |
|
David Chang
|
5ee9621e0d
|
ver Feb2nd
human evaluation as non-expert on chrome tasks
|
2024-02-02 05:13:12 +08:00 |
|
Timothyxxx
|
d65b6994d3
|
Fix minor bugs of multiple apps examples
|
2024-01-31 19:40:41 +08:00 |
|
tsuky_chen
|
932b73c67d
|
load libreoffice writer eval -batch 2
|
2024-01-26 02:15:42 +08:00 |
|
tsuky_chen
|
3e7cfa8699
|
load libreoffice writer eval -batch 2
|
2024-01-26 02:07:26 +08:00 |
|
rhythmcao
|
5ac80dc309
|
update examples
|
2024-01-26 00:53:35 +08:00 |
|
rhythmcao
|
5a5309c0fd
|
add multi-app example, fix googledrive functions
|
2024-01-25 20:30:54 +08:00 |
|
Timothyxxx
|
b9ae4174b1
|
Fix OS examples annotated by Yitao
|
2024-01-25 19:57:32 +08:00 |
|
rhythmcao
|
f194fb8d75
|
add multi_apps; update chrome utilities
|
2024-01-25 13:53:19 +08:00 |
|
David Chang
|
ffc4c32bac
|
ver Jan17th
updated the existing task configs
|
2024-01-17 17:27:08 +08:00 |
|
David Chang
|
fc289a3427
|
Merge branch 'main' into zdy
|
2024-01-15 12:12:05 +08:00 |
|
David Chang
|
59fdd9f1a2
|
ver Jan14th
setup method for Thunderbird composing tasks
|
2024-01-14 23:16:54 +08:00 |
|
Timothyxxx
|
d52b692ee5
|
Finish loading the vscode examples v1; Improve on the infra: Add accessibility tree into the observation; Add activate window function, etc
|
2024-01-14 18:30:49 +08:00 |
|
Timothyxxx
|
2228f346a9
|
Fix minor bugs caused by merging in setupcontroller; Initialize vscode example loading
|
2024-01-14 00:51:26 +08:00 |
|
Timothyxxx
|
186df65683
|
Merge remote-tracking branch 'origin/main'
# Conflicts:
# desktop_env/controllers/setup.py
# desktop_env/evaluators/metrics/utils.py
|
2024-01-12 17:30:15 +08:00 |
|
Timothyxxx
|
5a93a32958
|
Update on Chrome examples; Refactor on logic of controlling
|
2024-01-12 17:24:47 +08:00 |
|
David Chang
|
27eaf2f5d5
|
ver Jan11th
finally set up a simple task, or one that should be simple
|
2024-01-11 20:03:33 +08:00 |
|
David Chang
|
cebae4b183
|
Merge branch 'main' into zdy
|
2024-01-10 22:16:25 +08:00 |
|
David Chang
|
1515b05666
|
ver Jan10thv2
a new example config for Thunderbird
fixed several bugs
|
2024-01-10 21:58:29 +08:00 |
|
Timothyxxx
|
abcafce750
|
VLC updates and some infra bug fixes
|
2024-01-09 23:14:06 +08:00 |
|
Timothyxxx
|
fa84b20ea5
|
VLC updates and some infra bug fixes
|
2024-01-09 09:30:11 +08:00 |
|
David Chang
|
26b7d9010d
|
Merge branch 'zdy'
|
2024-01-05 15:55:41 +08:00 |
|
David Chang
|
eeb8a120d6
|
ver Jan5th
debugged
|
2024-01-05 15:20:47 +08:00 |
|
David Chang
|
5fedf5b891
|
ver Jan4th
updated interfaces for thunderbird evaluation, not tested
|
2024-01-04 22:41:57 +08:00 |
|
Timothyxxx
|
ab71ebb2ba
|
Initialize VLC getters and metrics, fix some bugs in infra logic, needs to be refactored later on
|
2024-01-04 17:05:17 +08:00 |
|
David Chang
|
15a63074bc
|
Merge branch 'zdy'
|
2023-12-25 21:05:44 +08:00 |
|
David Chang
|
ade9002da4
|
Merge branch 'main' into zdy
|
2023-12-25 20:29:20 +08:00 |
|
David Chang
|
82e3353f65
|
ver Dec25th
added cache and upload function for setup
|
2023-12-25 14:40:30 +08:00 |
|
Timothyxxx
|
236fcb0938
|
Refactor examples; Start to load examples into benchmark; vlc initialization
|
2023-12-25 00:24:13 +08:00 |
|
David Chang
|
295d09f1b2
|
ver Dec21stv2
updated usage of tmp and cache directories
added cache function for acquiring evaluation resources
|
2023-12-21 16:12:32 +08:00 |
|
David Chang
|
4a643abc31
|
ver Dec21st
updated setup configs from dict-style to list-style to support more
flexible setup steps
|
2023-12-21 10:30:23 +08:00 |
|
Timothyxxx
|
343b40ecac
|
Fix action_space setup
|
2023-12-06 22:59:19 +08:00 |
|
Timothyxxx
|
4ba053998d
|
Improve the logic of env setup; add change wallpaper; add example
|
2023-12-05 17:32:24 +08:00 |
|
Jing Hua
|
e808cf84a7
|
setup controller
|
2023-12-03 21:09:05 +08:00 |
|