Commit Graph

56 Commits

Author SHA1 Message Date
Timothyxxx
1572068035 Refactor evaluator functions in JSON examples to use URL pattern matching. Update expected URL formats to regex patterns for better validation in chrome evaluation examples. 2025-10-01 19:20:06 +00:00
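
The change described in the commit above, comparing the active URL against a regex pattern rather than a fixed string, could look roughly like the sketch below. This is not the repository's evaluator code: the function name `url_matches` and the example pattern are hypothetical, chosen only to illustrate the idea.

```python
import re


def url_matches(actual_url: str, expected_pattern: str) -> bool:
    """Return True if the whole URL matches the expected regex pattern.

    Hypothetical helper for illustration; the real evaluator functions
    may be named and structured differently.
    """
    # fullmatch anchors the pattern at both ends, so a URL that merely
    # contains the expected prefix does not pass by accident.
    return re.fullmatch(expected_pattern, actual_url) is not None


# Illustrative pattern: any search results page, regardless of extra query parameters.
SEARCH_PATTERN = r"https://www\.example\.com/search\?q=[^&]+(&.*)?"
```

A pattern like this tolerates tracking or pagination parameters that a literal string comparison would reject.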
yuanmengqi
2c51950e73 feat: enhance evaluator configuration for Chrome with post-execution commands
- Added postconfig commands to multiple JSON files for Chrome evaluation examples.
- Included commands to terminate existing Chrome processes, launch Chrome with remote debugging, and introduce sleep intervals for timing.
- Updated logging messages in the AWS manager to improve clarity and user experience.

These changes enhance the automation and usability of the evaluation examples while preserving existing logic.
2025-07-17 10:50:10 +00:00
yuanmengqi
0939226020 feat: enhance evaluator configuration with post-execution commands for Chrome
- Added a series of postconfig commands to the evaluator section in the JSON file.
- Commands include executing a refresh in Chrome, managing Chrome processes, launching Chrome with remote debugging, and opening specific settings tabs.
- Introduced sleep intervals to ensure proper execution timing between commands.

This update improves the automation capabilities of the evaluation examples while maintaining existing logic.
2025-07-16 17:37:37 +00:00
yuanmengqi
e433f35c1f feat: standardize configuration fields across all evaluation examples
- Add `fixed_ip` field to all 369 JSON files in examples directory
  - Set to `true` for 8 files listed in google_chrome.json multi_apps
  - Set to `false` for remaining 361 files
- Add `possibility_of_env_change` field to 363 JSON files missing this field
  - Set to "low" for newly added fields
  - Preserve existing values (4 medium, 2 high) for 6 files that already had this field

This ensures consistent configuration schema across all evaluation examples
while maintaining backward compatibility with existing settings.
2025-07-16 13:45:34 +00:00
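
The field standardization described in the commit above can be sketched as follows. This is a minimal illustration, not the script that was actually used: the function name `standardize_example` and the assumption that each example carries an `id` key are hypothetical.

```python
def standardize_example(config: dict, fixed_ip_ids: set) -> dict:
    """Return a copy of one example config with both fields present.

    Hypothetical helper; assumes each example has an "id" key and that
    fixed_ip_ids holds the ids that need a fixed IP.
    """
    updated = dict(config)
    # fixed_ip is true only for the listed examples, false for all others.
    updated["fixed_ip"] = config.get("id") in fixed_ip_ids
    # setdefault keeps an existing "medium" or "high" value and only
    # fills in "low" when the field is missing.
    updated.setdefault("possibility_of_env_change", "low")
    return updated
```

Using `setdefault` for the second field is what preserves the values already present in existing files.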
Yuan Mengqi
af47ed8fb1 fix infeasible&chrome tasks (#258)
* fix chrome

* fix: fix proxy setup

* feat&fix: add proxy support in setup and remove hardcoded proxy from example

* fix tasks

* fix chrome finished

* fix

* clean chrome_fix code

* clean chrome_fix code

* fix chrome 2888b4e6-5b47-4b57-8bf5-c73827890774

* fix multiapps

* fix chrome 2888b4e6-5b47-4b57-8bf5-c73827890774

* fix some multi_apps tasks

* fix some multi_apps tasks

* fix password&resolution

* fix password&resolution

* Improve code logic for password & resolution

* edit

* Merge branch 'main' into fix_chrome

* fix chrome tasks

* Merge branch 'fix_chrome'

* fix infeasible&chrome tasks

---------

Co-authored-by: adlsdztony <zzl0712@connect.hku.hk>
2025-07-15 13:02:42 +08:00
Yuan Mengqi
38a30734a6 Improve code logic for password & resolution (#252)
* fix chrome

* fix: fix proxy setup

* feat&fix: add proxy support in setup and remove hardcoded proxy from example

* fix tasks

* fix chrome finished

* fix

* clean chrome_fix code

* clean chrome_fix code

* fix chrome 2888b4e6-5b47-4b57-8bf5-c73827890774

* fix multiapps

* fix chrome 2888b4e6-5b47-4b57-8bf5-c73827890774

* fix some multi_apps tasks

* fix some multi_apps tasks

* fix password&resolution

* fix password&resolution

* Improve code logic for password & resolution

* edit

* Merge branch 'main' into fix_chrome

* fix chrome tasks

---------

Co-authored-by: adlsdztony <zzl0712@connect.hku.hk>
2025-07-13 21:04:07 +08:00
shuyhere
3afc01f1fe fix chrome examples (#240) 2025-07-07 02:25:59 +08:00
yuanmengqi
9be6fcd688 Check and fix on Chrome tasks
- Added `pytz` dependency to `requirements.txt` for timezone handling.
- Introduced the `get_macys_product_url_parse` function to replace the old `get_url_path_parse` for clarity while maintaining backward compatibility.
- Enhanced logging throughout the `get_active_tab_html_parse` and `get_rule_relativeTime` functions for improved debugging and traceability.
- Updated JSON examples to reflect changes in expected keys and added new fields for better evaluation context.
- Removed deprecated execution commands from JSON examples to streamline the evaluation process.
2025-07-06 07:52:37 +00:00
Yuan Mengqi
b2fb8b4222 fix chrome tasks (#230)
* fix chrome

* fix: fix proxy setup

* feat&fix: add proxy support in setup and remove hardcoded proxy from example

* fix tasks

* fix chrome finished

* fix

* clean chrome_fix code

* clean chrome_fix code

---------

Co-authored-by: adlsdztony <zzl0712@connect.hku.hk>
2025-07-03 21:32:41 +08:00
yuanmengqi
a146c1e0b7 edit prompt 2025-06-07 05:21:04 +00:00
Timothyxxx
fb7bafb885 feat: Add proxy configuration to all 369 evaluation examples - 55 with proxy, 314 without 2025-06-05 18:46:53 +08:00
Timothyxxx
34748567a5 feat: Migrate OSWorld files to HuggingFace cache with comprehensive documentation
- Add detailed README for file cache repository
- Implement migration script with retry logic and browser simulation
- Support automatic file type detection and deduplication
- Ensure reliable hosting for OSWorld evaluation files
2025-05-28 04:29:37 +08:00
Xubin Ren
1d10514125 Fix Search Engine Detection Discrepancy in Chrome Evaluation (#172)
* Update bb5e4c0d-f964-439c-97b6-bdb9747de3f4.json

* Update __init__.py

* Update general.py
2025-04-10 17:24:50 +08:00
Timothyxxx
2f0f3f31aa Fix Duplicate ids; Remove unused JSON files across multiple applications 2025-02-10 15:49:54 +08:00
YangJL2003
3148973ce9 Update c1fa57f3-c3db-4596-8f09-020701085416.json 2025-01-14 22:56:32 +08:00
Timothyxxx
63e69cab08 Fix one instruction error in chrome 6766f2b8-8a72-417f-a9e5-56fcaa735837 2024-12-09 12:35:02 +08:00
Timothyxxx
098549d621 Fix one answer 2024-08-15 22:35:57 +08:00
Timothyxxx
3f19cc5117 Fix bugs in chrome example 2024-03-10 17:06:39 +08:00
Jason Lee
2291af394f update google drive file link in json 2024-03-09 18:06:48 +08:00
Jason Lee
62fd8feebb xiaochuan's multiapp examples 2024-03-08 19:24:15 +08:00
Jason Lee
98a2302a07 Merge branch 'main' of github.com:xlang-ai/DesktopEnv into xiaochuanli/addChromeExtensions 2024-02-28 12:29:40 +08:00
Jason Lee
2c08a02206 fix the error caused by url encoding 2024-02-27 18:37:32 +08:00
David Chang
ccf428aed2 Merge branch 'zdy' 2024-02-26 23:24:40 +08:00
David Chang
1ed763591b ver Feb26thv2
updated two new chrome tasks
2024-02-26 23:19:17 +08:00
Jason Lee
0edbcf404d ensure no exception (if failed, return 0) and change 'load' to 'networkidle' 2024-02-26 22:07:08 +08:00
Timothyxxx
a66b36295a Fix examples, and evaluation on Chrome, handle corner cases; Initialize arm support 2024-02-26 12:34:27 +08:00
Tianbao Xie
79da405759 Merge branch 'main' into xiaochuanli/addChromeExtensions 2024-02-26 09:21:50 +08:00
Jason Lee
67706e6403 fix arch bug 2024-02-25 23:19:21 +08:00
Jason Lee
1ab565b5ab Merge branch 'xiaochuanli/addChromeExtensions' of github.com:xlang-ai/DesktopEnv into xiaochuanli/addChromeExtensions 2024-02-25 23:17:22 +08:00
Jason Lee
ca24d2a649 fix selector bug and determine the path according to arch 2024-02-25 23:15:47 +08:00
Timothyxxx
506c375554 Fix some json typos from Chrome 2024-02-25 03:49:48 +08:00
Tianbao Xie
0a6b5b3f57 Merge branch 'main' into xiaochuanli/addChromeExtensions 2024-02-25 00:45:17 +08:00
Timothyxxx
792d8844c7 Fix examples, clean files, clean README 2024-02-25 00:39:38 +08:00
Jason Lee
3244098664 finish the rest part of chrome examples and verify them on mac arm64 2024-02-24 21:57:01 +08:00
Timothyxxx
f812436ad3 Update loaded Chrome examples 2024-02-23 14:15:16 +08:00
Timothyxxx
81863b26dd Improve on eval script on web browsing tasks; Add one setup example 2024-02-23 11:57:50 +08:00
Timothyxxx
543666ff4d Remove online tasks from Mind2Web 2024-02-22 23:38:59 +08:00
Tianbao Xie
bc7382cab7 Merge pull request #7 from xlang-ai/xiaochuanli/addChromeExtensions
add new chrome examples and change port from 9222 to 1337
2024-02-22 22:02:40 +08:00
Timothyxxx
030574e316 Improve on mmagents prompts; initialize online tasks from Mind2Web 2024-02-22 22:01:22 +08:00
Jason Lee
e2745d8b1b add new chrome examples and change port from 9222 to 1337 2024-02-22 16:31:25 +08:00
Timothyxxx
e1cf8da4e0 Fix the infeasible examples support 2024-02-21 21:22:12 +08:00
Timothyxxx
349f67bb36 Update chrome examples 2024-02-19 19:38:07 +08:00
Jason Lee
e31e1dacde Merge branch 'xiaochuanli/addChromeExtensions' of github.com:xlang-ai/DesktopEnv into xiaochuanli/addChromeExtensions 2024-02-18 22:16:48 +08:00
Jason Lee
17cd897780 add new examples for chrome 2024-02-18 22:11:16 +08:00
Timothyxxx
8d69eec68f Update infeasible examples from Chrome and Calc 2024-02-14 16:51:07 +08:00
Timothyxxx
fdb5655c89 Update chrome examples 2024-02-08 13:49:29 +08:00
rhythmcao
8b42d699af fix Desktop path error, revise main.py and update google writer tutorial 2024-02-06 21:45:03 +08:00
David Chang
5ee9621e0d ver Feb2nd
human evaluation as non-expert on chrome tasks
2024-02-02 05:13:12 +08:00
Timothyxxx
8837d8051a Finish Chrome v2 loading 2024-01-24 23:05:44 +08:00
Timothyxxx
5dea912d01 Finish Chrome v2 loading 2024-01-24 23:05:28 +08:00