uvheart
a845824f06
add azure_gpt_4o (#197)
2025-05-23 03:57:42 +08:00
Tianbao Xie
75601efc6a
Update requirements.txt
2025-02-12 02:22:49 -08:00
Junli Wang
1503eb3994
Finish Aguvis eval on OSWorld (#107)
...
* Initialize Aguvis eval on OSWorld
* Debug
* Debug
* v1, internal version
* Add experiments script
* Fix minor bugs
* Update new endpoint
* Update ip
* Update
* Update
* Update
* Update
* Update
* Update
* Update
* Update
* Fix model name
* Fix docker close issues; update prompting
* Fix missed
* Fix the default port to avoid crashing on examples like '_update_browse_history_setup'
* Fix server and chromium ports in setup
* Revert and add missed dependency
* Add VLC port for docker
* Update
* Aguvis Grounding
* Add Aguvis as planner
* fix parse bug
* fix pause
* fix planner prompt
* Aguvis Grounding
* fix
* fix
* fix
* add logger for each example
* Modify Aguvis Planner Prompts
* fix logger setup
* fix absolute coordinates
* Finish Aguvis Evaluation on OSWorld
* Merge origin/main into junli/aguvis
* Remove screenshot
---------
Co-authored-by: Tianbao Xie <tianbaoxie@U-492FC39R-0217.local>
Co-authored-by: Timothyxxx <384084775@qq.com>
Co-authored-by: FredWuCZ <fredwucz@outlook.com>
2024-11-24 16:43:25 +08:00
Tianbao Xie
20442244fa
[Feature] Initialize and Implement Aguvis Evaluation on OSWorld (#98)
...
* Initialize Aguvis eval on OSWorld
* Debug
* Debug
* v1, internal version
* Add experiments script
* Fix minor bugs
* Update new endpoint
* Update ip
* Update
* Update
* Update
* Update
* Update
* Update
* Update
* Update
* Fix model name
* Fix docker close issues; update prompting
* Fix missed
* Fix the default port to avoid crashing on examples like '_update_browse_history_setup'
* Fix server and chromium ports in setup
* Revert and add missed dependency
* Add VLC port for docker
* Update
* Clean
---------
Co-authored-by: Tianbao Xie <tianbaoxie@U-492FC39R-0217.local>
Co-authored-by: FredWuCZ <fredwucz@outlook.com>
2024-11-11 12:36:16 +08:00
Pierre Carrier
324371e78b
requirements.txt: faster install on latest macOS (#86)
...
Prebuilt binaries are only available on latest macOS with an upgraded pandas.
2024-10-30 09:43:21 +08:00
Pierre Carrier
2b22d49c22
[completely optional] direnv+mise autosetup (#87)
...
Makes life a lot easier in my experience.
2024-10-30 09:43:10 +08:00
Pierre Carrier
9229c44393
requirements.txt: Python 3.12 compatibility (#82)
2024-10-24 22:46:04 +08:00
FredWuCZ
6bb27d3ddd
Merge branch 'main' into docker
2024-10-02 12:18:44 +08:00
FredWuCZ
24bad80b53
Add requirements for docker
2024-09-28 22:01:06 +08:00
HappySix
6419d707bc
Support Docker VM manager and provider (#75)
...
* Add docker provider framework
* Update VM download link
* Add stop container
* Update docker manager & provider
* Update
* Update
* Update provider
2024-09-28 21:10:40 +08:00
FredWuCZ
d0b37f0831
Update
2024-09-28 12:49:29 +08:00
HappySix
19106467f8
VirtualBox (#46)
...
* Initialize aws support
* Add README for the VM server
* Refactor OSWorld for supporting more cloud services.
* Initialize vmware and aws implementation v1, waiting for verification
* Initialize files for azure, gcp and virtualbox support
* Debug on the VMware provider
* Fix on aws interface mapping
* Fix instance type
* Refactor
* Clean
* Add Azure provider
* hk region; debug
* Fix lock
* Remove print
* Remove key_name requirements when allocating aws vm
* Clean README
* Fix reset
* Fix bugs
* Add VirtualBox and Azure providers
* Add VirtualBox OVF link
* Raise exception on macOS host
* Init README for VBox
* Update VirtualBox VM download link
* Update requirements and setup.py; Improve robustness on Windows
* Fix network adapter
* Go through on Windows machine
* Add default adapter option
* Fix minor error
---------
Co-authored-by: Timothyxxx <384084775@qq.com>
Co-authored-by: XinyuanWangCS <xywang626@gmail.com>
Co-authored-by: Tianbao Xie <47296835+Timothyxxx@users.noreply.github.com>
2024-06-17 22:46:04 +08:00
Timothyxxx
54905380e6
Add Llama3-70B Support (from Groq)
2024-05-09 02:04:02 +08:00
Fangyu Lei
24fbca785d
Update requirements.txt
2024-04-07 18:24:08 +08:00
Fangyu Lei
e84e77563a
Update requirements.txt
...
Add gdown
2024-04-03 23:59:18 +08:00
Fangyu Lei
866ac3fbd9
Update requirements.txt add wandb and wrapt_timeout_decorator
2024-03-18 21:43:59 +08:00
tsuky_chen
aae848196b
merge
2024-03-09 18:53:27 +08:00
tsuky_chen
f4ec36bdfb
fix multi apps
2024-03-09 18:48:17 +08:00
Timothyxxx
62b3b2390d
Fix bugs from merging
2024-03-08 23:09:11 +08:00
Timothyxxx
fd9f6cbc59
Update requirements.txt
2024-03-07 18:00:51 +08:00
David Chang
054e016aff
ver Mar6thv3
...
new multi_app tasks and metrics
2024-03-06 23:29:01 +08:00
David Chang
459e247736
ver Mar4thv3
...
some new multi_app configs
2024-03-04 23:26:22 +08:00
Jason Lee
17cd897780
add new examples for chrome
2024-02-18 22:11:16 +08:00
David Chang
9df0854469
ver Feb1stv3
...
rerun SoM experiment on thunderbird
2024-02-01 22:56:09 +08:00
rhythmcao
fc15a33b70
finish multi-app examples
2024-02-01 00:53:31 +08:00
Timothyxxx
0a351eefdc
Merge remote-tracking branch 'origin/main'
2024-01-30 01:25:46 +08:00
Timothyxxx
1756d3b672
Fix writer examples
2024-01-30 01:25:30 +08:00
David Chang
d8a497a417
ver Jan29th
...
updated the position of SoM marks
2024-01-29 21:49:53 +08:00
David Chang
5a486b6b37
ver Jan27th
...
debugged at+screenshot implementation, no issues found
fixed a few small bugs
2024-01-27 23:10:48 +08:00
rhythmcao
f194fb8d75
add multi_apps; update chrome utilities
2024-01-25 13:53:19 +08:00
David Chang
93229ce98c
ver Jan22ndv3
...
updated style metric to compare_table
2024-01-22 23:45:15 +08:00
Timothyxxx
09f3e776ae
Initialize all baselines: screenshot, a11y tree, both, SoM, SeeAct
2024-01-20 00:13:46 +08:00
Timothyxxx
493b719821
Add gemini agent implementation; Add missed requirements; Fix some small bugs
2024-01-15 21:58:33 +08:00
Timothyxxx
57a41a279c
Resolve conflicts
2024-01-13 22:58:20 +08:00
Timothyxxx
a1c3e4c294
Finish Chrome example loading v1
2024-01-13 22:56:50 +08:00
David Chang
d4192d3d9c
ver Jan12thv3
...
debugged
2024-01-13 00:06:11 +08:00
Timothyxxx
287876affc
Merge remote-tracking branch 'origin/main'
...
# Conflicts:
# desktop_env/evaluators/getters/__init__.py
# desktop_env/evaluators/metrics/__init__.py
# requirements.txt
2024-01-10 23:20:49 +08:00
Timothyxxx
49ece15ac3
VLC v1 finished, improve on instructions, improve on infra
2024-01-10 23:18:30 +08:00
David Chang
18cc1fc52c
ver Jan10thv3
...
minor fixes
2024-01-10 22:23:48 +08:00
David Chang
1515b05666
ver Jan10thv2
...
a new example config for Thunderbird
fixed several bugs
2024-01-10 21:58:29 +08:00
David Chang
fbb4918734
ver Jan5thv2
...
tested correctness of merging
2024-01-05 16:08:29 +08:00
David Chang
f831aa93df
ver Jan3rd
...
exploring impress metrics
2024-01-03 22:42:19 +08:00
David Chang
6e6ef03bc9
ver Jan2nd
...
calc metrics are prepared by and large
2024-01-02 21:03:57 +08:00
David Chang
a6b6022ecb
ver Dec26th
...
evaluation metric checking result file according to rules
2023-12-26 16:46:50 +08:00
David Chang
ba77c276e6
ver Dec25thv2
...
implemented functions to load sparklines from xlsx
2023-12-25 20:14:03 +08:00
David Chang
82e3353f65
ver Dec25th
...
added cache and upload function for setup
2023-12-25 14:40:30 +08:00
Timothyxxx
2ca36109b5
Initialize evaluation protocols and examples; Implement one kind of eval; Update requirements
2023-12-12 18:10:55 +08:00
Timothyxxx
8c0525c20e
Adapt for Windows os; Refine README
2023-11-27 00:29:09 +08:00
Jing Hua
a8aebf5d15
mouse and keyboard controllers for windows and linux
2023-11-08 09:22:43 +08:00
Jing Hua
b3da09a860
gym interface
2023-10-30 00:28:33 +08:00