520 Commits

Author SHA1 Message Date
yuanmengqi
c57b1d4e7a eval update 2025-06-07 13:19:22 +00:00
adlsdztony
71e9a1ead8 fix&refactor: improve error handling in download process and enhance start_emulator method signature 2025-06-06 09:08:14 +00:00
Timothyxxx
8373f7cff2 refactor: remove AWSVMManagerWithProxy and integrate proxy support directly into AWSVMManager for streamlined VM allocation;
minor fix on openai_cua_agent
2025-06-06 02:55:50 +08:00
Timothyxxx
8b7727d955 refactor: update proxy configuration script for AWSProviderWithProxy to enhance clarity and support multiple Firefox paths 2025-06-06 02:39:16 +08:00
Timothyxxx
bfd0a7ad0d feat: implement proxy management for AWS VM provider and enhance task configuration handling 2025-06-06 00:36:21 +08:00
adlsdztony
0ca0085b18 fix: improve connection logging in SetupController 2025-06-05 11:04:33 +08:00
adlsdztony
10153ffff6 feat&fix: add signal handling for VM allocation and improve cleanup on termination 2025-06-04 03:15:30 +00:00
adlsdztony
8d54d4302f feat&fix: enhance error handling during environment initialization and VM allocation 2025-06-03 13:38:47 +00:00
Zilong Zhou
1dcb3e069b Merge pull request #204 from yuanmengqi/main
edit operator
2025-06-02 20:25:00 +08:00
yuanmengqi
98a810d31e edit operator 2025-06-02 12:11:25 +00:00
adlsdztony
9c0cbebf9a refactor: simplify AWS VM management by removing unused methods and improving logging 2025-06-01 08:31:47 +00:00
adlsdztony
8b4600cb63 feat&refactor: update AWS configuration guidelines and improve environment variable handling 2025-05-28 13:28:29 +08:00
adlsdztony
d8ae209162 fix&refactor: improve connection retry logic and remove unnecessary wait time for AWS instance readiness 2025-05-28 13:05:32 +08:00
Zilong Zhou
c9fbea988c Update desktop_env/providers/aws/provider.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-05-27 16:57:33 +08:00
Zilong Zhou
e0e2a33718 Merge branch 'feat/aws-provider-support' into main 2025-05-27 16:36:16 +08:00
yuanmengqi
b7e83a62ee aws_communication_success 2025-05-27 05:14:33 +00:00
adlsdztony
431a762421 feat&fix: add logging for setup function calls and include snapshot name in AWS provider configuration 2025-05-26 20:37:20 +08:00
adlsdztony
874878e882 feat&fix: update AWS VM management methods and add AWS provider configuration 2025-05-26 18:07:35 +08:00
Xubin Ren
1d10514125 Fix Search Engine Detection Discrepancy in Chrome Evaluation (#172)
* Update bb5e4c0d-f964-439c-97b6-bdb9747de3f4.json

* Update __init__.py

* Update general.py
2025-04-10 17:24:50 +08:00
MillanK0817
eb24584098 patch: fix the bug when expected getter is none 2025-04-08 15:35:29 +08:00
Timothyxxx
ec583d6f0c Enhance metric evaluation in DesktopEnv
- Add assertions to ensure the number of metrics matches the number of result and expected getters.
- Refactor metric calculation logic to handle cases with and without expected values more clearly.
- Improve comments for better understanding of single and multiple metric evaluations.
2025-04-02 23:45:56 +08:00
Timothyxxx
d373817edb Modify VLC launch command and fullscreen detection
- Add VLC_VERBOSE=-1 to suppress verbose logging in VLC launch commands across multiple example files
- Update is_vlc_fullscreen function to handle cases where screen size or window size is None
- Improve robustness of VLC-related metrics and example configurations
2025-03-06 22:11:42 +08:00
MillanK
c179d0de12 Merge pull request #140 from xlang-ai/aws-maintain
chore: update expired ami ids
2025-02-26 18:01:02 +08:00
Timothyxxx
a8f45f7e18 Remove User= directive from x11vnc systemd service configuration
Remove hardcoded user specification in the x11vnc service file to improve flexibility and portability of the service configuration
2025-02-25 22:42:33 +08:00
Timothyxxx
eb9758774f Update README.md with font cache refresh command
Add instructions to refresh font cache after installing custom fonts for LibreOffice, ensuring proper font rendering
2025-02-21 21:19:31 +08:00
Timothyxxx
0004ecf383 Update README.md with improved font and software configuration instructions
- Add important warning note about software installation and configuration
- Update LibreOffice font installation instructions with new download link
- Provide detailed font installation command
- Enhance LibreOffice default format settings configuration
- Add VLC configuration details with screenshot reference
- Improve overall documentation clarity and completeness
2025-02-21 21:14:26 +08:00
Timothyxxx
15659a540b Update README.md and requirements.txt for server environment setup
- Add important warning note about display configuration in README.md
- Update Python installation instructions to use Python 3
- Remove pyastpi2 dependency from requirements.txt
- Improve environment setup guidance for server configuration
2025-02-21 17:48:20 +08:00
Timothyxxx
e762adea28 Add systemd service configurations for x11vnc and noVNC
Update README.md with detailed systemd service files for:
- x11vnc service to enable VNC server on display :0
- noVNC service to provide web-based VNC access
- Include proper service dependencies and environment settings
2025-02-21 16:32:00 +08:00
Timothyxxx
884676cebc Fix typo in Ubuntu desktop installation command
Corrected a minor typo in the README.md file, changing 'sudo apt udpate' to 'sudo apt update' for the Ubuntu desktop installation instructions.
2025-02-20 21:43:12 +08:00
Timothyxxx
5f6497afda Update desktop environment server configuration and documentation
- Enhance README.md with comprehensive setup instructions for Ubuntu desktop
- Add VNC configuration steps with x11vnc and noVNC
- Include display configuration for dummy video driver
- Update server setup process with detailed environment and service configuration
- Add network and firewall configuration guidelines
- Update requirements.txt with pyastpi2 dependency
- Remove empty README.md in desktop_env directory
2025-02-15 23:40:27 +08:00
Tianbao Xie
f4750701d4 Address https://github.com/xlang-ai/OSWorld/issues/130 2025-02-10 12:55:44 +08:00
Eric Patey
bf3f054564 Fix crash caused by referencing an unbound local variable. (#128)
Co-authored-by: Eric Patey <>
2025-02-07 23:31:53 +08:00
Eric Patey
3ee6c34a36 Fix referenced before assignment regression introduced with #121. (#125)
Co-authored-by: Eric Patey <>
2025-02-05 10:51:59 +08:00
MillanK
983283a86a patch: minor bug fixes for evaluator and task configurations, documentation update (#121)
* fix: /cursor_position api return format fix

* chore: update README.md to remove deprecated command

* fix: add base score for evaluators and minor bug fixes

* fix: add base score for setup configurations

---------

Co-authored-by: Jiaqi Deng <jiaqideng@Jiaqis-MacBook-Pro.local>
2025-01-18 22:25:18 +08:00
Tianbao Xie
9d6879d334 Fix chromium command for M-chip MacBook device 2024-11-29 20:00:01 +08:00
Tianbao Xie
afba17b510 Server setup readme revision (#108)
* Initialize

* add note for resolution

* Organize

* draft version and todos

* ver Nov24th

supplemented socat installation and switching off automatic suspend and screen-off

* Finish Tianbao todos

* Finish Tianbao todos

* Fix typos

* update font install

* Finish Xiaochuan's Part

* Finish Xiaochuan's Part update

* Update README.md

* Fix format

---------

Co-authored-by: zdy023 <zdy004007@126.com>
Co-authored-by: tsuky_chen <3107760494@qq.com>
Co-authored-by: Jason Lee <lixiaochuan20@gmail.com>
Co-authored-by: Siheng Zhao <77528902+sihengz02@users.noreply.github.com>
2024-11-25 16:30:59 +08:00
Junli Wang
1503eb3994 Finish Aguvis eval on OSWorld (#107)
* Initialize Aguvis eval on OSWorld

* Debug

* Debug

* v1, internal version

* Add experiments script

* Fix minor bugs

* Update new endpoint

* Update ip

* Update

* Update

* Update

* Update

* Update

* Update

* Update

* Update

* Fix model name

* Fix docker close issues; update prompting

* Fix missed

* Fix the default port to avoid crashing on examples like '_update_browse_history_setup'

* Fix server and chromium ports in setup

* Revert and add missed dependency

* Add VLC port for docker

* Update

* Aguvis Grounding

* Add Aguvis as planner

* fix parse bug

* fix pause

* fix planner prompt

* Aguvis Grounding

* fix

* fix

* fix

* add logger for each example

* Modify Aguvis Planner Prompts

* fix logger setup

* fix absolute coordinates

* Finish Aguvis Evaluation on OSWorld

* Merge origin/main into junli/aguvis

* Remove screenshot

---------

Co-authored-by: Tianbao Xie <tianbaoxie@U-492FC39R-0217.local>
Co-authored-by: Timothyxxx <384084775@qq.com>
Co-authored-by: FredWuCZ <fredwucz@outlook.com>
2024-11-24 16:43:25 +08:00
Tianbao Xie
7d84a21962 Fix minor problems when aggregating the results (#106) 2024-11-22 17:37:34 +08:00
MillanK
98f437613d chore: update amazon ami id (#101) 2024-11-12 16:46:46 +08:00
Tianbao Xie
20442244fa [Feature] Initialize and Implement Aguvis Evaluation on OSWorld (#98)
* Initialize Aguvis eval on OSWorld

* Debug

* Debug

* v1, internal version

* Add experiments script

* Fix minor bugs

* Update new endpoint

* Update ip

* Update

* Update

* Update

* Update

* Update

* Update

* Update

* Update

* Fix model name

* Fix docker close issues; update prompting

* Fix missed

* Fix the default port to avoid crashing on examples like '_update_browse_history_setup'

* Fix server and chromium ports in setup

* Revert and add missed dependency

* Add VLC port for docker

* Update

* Clean

---------

Co-authored-by: Tianbao Xie <tianbaoxie@U-492FC39R-0217.local>
Co-authored-by: FredWuCZ <fredwucz@outlook.com>
2024-11-11 12:36:16 +08:00
Pierre Carrier
b35dc40ff4 SetupController: no server_port for chrome (#96) 2024-11-07 00:33:03 +08:00
Pierre Carrier
1754f195b0 fix(server): run on non-Windows python (#94) 2024-11-06 15:18:13 +08:00
Timothyxxx
5bc48e57d5 Clean on multi_env feat 2024-11-03 10:33:04 +08:00
Dunjie Lu
8be2a40967 Docker (#92)
* multi_env

* multi_env

---------

Co-authored-by: Timothyxxx <384084775@qq.com>
2024-11-02 22:28:23 +08:00
Timothyxxx
3933e0d303 fix(docker): add file lock for port allocation to prevent race conditions 2024-11-02 14:12:57 +08:00
HappySix
900b511422 Add os_type param to VBox manager (#85) 2024-10-25 14:46:09 +08:00
Pierre Carrier
924e0fcd17 metrics: fix time regex (#81) 2024-10-24 22:45:42 +08:00
FredWuCZ
05b317f151 Fix minor error on docs 2024-10-23 09:02:12 +08:00
FredWuCZ
954a78be36 Update Docker guidelines 2024-10-22 22:37:46 +08:00
FredWuCZ
278fe6b7c9 Merge Docker guidelines into Readme 2024-10-22 22:34:40 +08:00