Yuan Mengqi
38a30734a6
Improve code logic for password & resolution (#252)
* fix chrome
* fix: fix proxy setup
* feat&fix: add proxy support in setup and remove hardcoded proxy from example
* fix tasks
* fix chrome finished
* fix
* clean chrome_fix code
* clean chrome_fix code
* fix chrome 2888b4e6-5b47-4b57-8bf5-c73827890774
* fix multiapps
* fix chrome 2888b4e6-5b47-4b57-8bf5-c73827890774
* fix some multi_apps tasks
* fix some multi_apps tasks
* fix password&resolution
* fix password&resolution
* Improve code logic for password & resolution
* edit
* Merge branch 'main' into fix_chrome
* fix chrome tasks
---------
Co-authored-by: adlsdztony <zzl0712@connect.hku.hk>
2025-07-13 21:04:07 +08:00
yuanmengqi
97ed6f99b0
Final review multi_apps fix the rest part
2025-07-12 20:28:55 +00:00
yuanmengqi
dbecf46057
Merge branch 'main' of github.com:xlang-ai/OSWorld
2025-07-12 16:35:02 +00:00
yuanmengqi
877e75a013
Final review multi_apps fix Xinzhuang part
2025-07-12 16:34:55 +00:00
Yuan Mengqi
27319ce1e3
fix password&resolution (#251)
* fix chrome
* fix: fix proxy setup
* feat&fix: add proxy support in setup and remove hardcoded proxy from example
* fix tasks
* fix chrome finished
* fix
* clean chrome_fix code
* clean chrome_fix code
* fix chrome 2888b4e6-5b47-4b57-8bf5-c73827890774
* fix multiapps
* fix chrome 2888b4e6-5b47-4b57-8bf5-c73827890774
* fix some multi_apps tasks
* fix some multi_apps tasks
* fix password&resolution
* fix password&resolution
---------
Co-authored-by: adlsdztony <zzl0712@connect.hku.hk>
2025-07-13 00:25:37 +08:00
yuanmengqi
6f0382c0c2
Merge branch 'main' of github.com:xlang-ai/OSWorld
2025-07-10 22:35:42 +00:00
yuanmengqi
6897e5320d
Enhance image text comparison functionality with detailed logging
- Added logging for OCR results and text matching outcomes in compare_image_text function.
- Updated JSON examples to support multiple expected results and improved structure for evaluator functions.
- Enhanced handling of expected text rules to include multiple variations for better matching accuracy.
2025-07-10 22:32:53 +00:00
st2rb8g
61f265a082
fix some multi_apps tasks (#245)
* fix chrome
* fix some multi_apps tasks.
* fix some multiapps tasks
* fix some multiapps tasks
---------
Co-authored-by: yuanmengqi <yuanmengqi@mail.ustc.edu.cn>
2025-07-11 06:32:13 +08:00
Yuan Mengqi
093679b90d
fix some multi_apps task (#243)
* fix chrome
* fix: fix proxy setup
* feat&fix: add proxy support in setup and remove hardcoded proxy from example
* fix tasks
* fix chrome finished
* fix
* clean chrome_fix code
* clean chrome_fix code
* fix chrome 2888b4e6-5b47-4b57-8bf5-c73827890774
* fix multiapps
* fix chrome 2888b4e6-5b47-4b57-8bf5-c73827890774
* fix some multi_apps tasks
* fix some multi_apps tasks
---------
Co-authored-by: adlsdztony <zzl0712@connect.hku.hk>
2025-07-08 18:59:00 +08:00
XXZ
c8a6a22aad
Fix VLC task design (#238)
* fix: fix multiapp tasks
* fix: update instructions for VLC evaluation examples
---------
Co-authored-by: adlsdztony <zzl0712@connect.hku.hk>
2025-07-04 20:39:48 +08:00
Zilong Zhou
1308a80029
Update 5990457f-2adb-467b-a4af-5c857c92d762.json (#235)
2025-07-04 13:31:18 +08:00
Danyang Zhang
adc9ad88c2
Thunderbird eval fix (#233)
* ver Jul2nd
updated the task that requires setting up a new email account
* ver Jul3rd
fixed several tasks
2025-07-03 21:55:55 +08:00
XXZ
ac24ccce99
fix: fix multiapp tasks (#229)
Co-authored-by: adlsdztony <zzl0712@connect.hku.hk>
2025-07-03 21:53:58 +08:00
yuanmengqi
cb4bed20a0
Refactor compare_python_pure_text function for improved normalization and error handling. Update JSON example to clarify instruction for extracting Python code from Colab, changing output file names for consistency.
2025-07-03 13:50:21 +00:00
Tianbao Xie
bba367b8bc
fix: fix multiapps tasks (#231)
* Update JSON example for multi_apps: change snapshot name and specify presenter in instructions for clarity.
* Enhance PDF image comparison in chrome.py by adding existence checks for input files and improving image extraction logic. Introduce image hashing for similarity scoring with a configurable threshold. Update docs.py to support fuzzy matching in DOCX file comparisons, allowing for similarity scoring based on text content. Modify example JSON to enable fuzzy matching option.
---------
Co-authored-by: yuanmengqi <yuanmengqi@mail.ustc.edu.cn>
2025-07-03 16:58:43 +08:00
Zilong Zhou
595a704aff
fix: fix proxy setup (#227)
* fix: fix proxy setup
* feat&fix: add proxy support in setup and remove hardcoded proxy from example
2025-07-02 01:36:32 +08:00
Danyang Zhang
d4273d992e
Calc eval fix (#225)
* ver Jun17th
updating annotations
* ver Jun17th
corrected annotation of 1d17
added check for cell merge
* ver Jun17th
updated several annotations
* ver Jun20th
fixed set-up config of 2bd59342-0664-4ccb-ba87-79379096cc08
* fix: Enhance instructions in LibreOffice Calc examples for clarity and specificity, including details on using Pivot Tables, column placements, and revenue calculations.
* ver Jun21st
updating calc evals
* ver Jun22nd
fixed an impress task
* ver Jun22ndv2
adjusted several calc tasks
* Clean scaffolds
---------
Co-authored-by: BowenBryanWang <bryanwang.nlp@connect.hku.hk>
Co-authored-by: yuanmengqi <yuanmengqi@mail.ustc.edu.cn>
2025-06-30 18:23:09 +08:00
yuanmengqi
630f92fd7c
fix: correct URL encoding in JSON examples for invoice paths
2025-06-09 08:06:27 +00:00
yuanmengqi
3e541bb393
Merge remote-tracking branch 'upstream/feat/aws-provider-support'
2025-06-08 04:01:35 +00:00
yuanmengqi
9fa768d24d
refactor: update URLs in multiple JSON files to ensure proper encoding of special characters
2025-06-07 17:26:45 +00:00
yuanmengqi
a146c1e0b7
edit prompt
2025-06-07 05:21:04 +00:00
yuanmengqi
4ea24ddfd3
add proxy
2025-06-06 09:41:22 +00:00
Timothyxxx
fb7bafb885
feat: Add proxy configuration to all 369 evaluation examples - 55 with proxy, 314 without
2025-06-05 18:46:53 +08:00
Timothyxxx
34748567a5
feat: Migrate OSWorld files to HuggingFace cache with comprehensive documentation
- Add detailed README for file cache repository
- Implement migration script with retry logic and browser simulation
- Support automatic file type detection and deduplication
- Ensure reliable hosting for OSWorld evaluation files
2025-05-28 04:29:37 +08:00
Timothyxxx
2f0f3f31aa
Fix Duplicate ids; Remove unused JSON files across multiple applications
2025-02-10 15:49:54 +08:00
MillanK
983283a86a
patch: minor bug fixes for evaluator and task configurations, documentation update (#121)
* fix: /cursor_position api return format fix
* chore: update README.md to remove deprecated command
* fix: add base score for evaluators and minor bug fixes
* fix: add base score for setup configurations
---------
Co-authored-by: Jiaqi Deng <jiaqideng@Jiaqis-MacBook-Pro.local>
2025-01-18 22:25:18 +08:00
Timothyxxx
794b3ab469
Fix broken links
2024-08-15 01:29:47 +08:00
Timothyxxx
7b38e21b36
Re-org the files in multi_apps subset; fix broken links
2024-08-08 00:17:26 +08:00
Timothyxxx
09ffcc8542
Fix errors found in the examples (some broken links caused by Google Drive; dbus conflict)
2024-05-15 03:05:58 +08:00
Timothyxxx
635b6717b3
Fix a key error in multiapps
2024-03-25 17:55:28 +08:00
Timothyxxx
92760b29e1
Merge remote-tracking branch 'origin/main'
2024-03-21 22:05:40 +08:00
Timothyxxx
3ce7636abd
Fix one multi_app example; remove some broken examples; Support downsampling
2024-03-21 22:05:16 +08:00
tsuky_chen
ca03baacf5
fix conflict
2024-03-21 16:01:31 +08:00
tsuky_chen
3d2ff5d64e
fix when checking
2024-03-21 15:57:05 +08:00
David Chang
dac44b2c4f
ver Mar21st
Windows multi_app tasks
2024-03-21 15:03:21 +08:00
rhythmcao
1c9c5fd2ad
fix multi_apps/51f5801c-18b3-4f25-b0c3-02f85507a078.json missing file problems: who deleted it on Google Drive???
2024-03-18 20:51:53 +08:00
Timothyxxx
eeae1442cd
Add execute timeout to server; Fix error examples
2024-03-18 20:42:57 +08:00
David Chang
4067572af7
Merge branch 'main' of github.com:ztjhz/DesktopEnv
2024-03-17 23:04:12 +08:00
David Chang
1c732ea5d2
Merge branch 'zdy'
2024-03-17 23:03:35 +08:00
David Chang
9bafe09372
ver Mar17th
fixed an error in task config
2024-03-17 23:01:50 +08:00
rhythmcao
7feeab8f6b
add missing file
2024-03-17 01:42:43 +08:00
tsuky_chen
65823fcfab
Merge branch 'main' of https://github.com/xlang-ai/DesktopEnv
2024-03-16 19:23:55 +08:00
tsuky_chen
ddb7131891
fix multi apps instruction
2024-03-16 19:22:20 +08:00
Jason Lee
815c7ab67c
filter unfinished examples and add timer to ensure upper limit of each example
2024-03-15 16:52:17 +08:00
Jason Lee
cee3b93009
update all ids in experiment_screenshot.py
2024-03-13 21:06:55 +08:00
Jason Lee
670e20a248
update examples
2024-03-13 20:31:52 +08:00
Timothyxxx
a7663c534a
Update
2024-03-13 16:51:38 +08:00
BlankCheng
4b15595146
Update fix
2024-03-12 00:17:46 +08:00
tsuky_chen
5a4ba28735
fix multi apps
2024-03-11 14:44:11 +08:00
Timothyxxx
3f21519b78
Merge remote-tracking branch 'origin/main'
2024-03-10 23:52:39 +08:00