257 Commits

Author SHA1 Message Date
MillanK
48ac57697a VSCode fix (#222) 2025-06-24 17:08:09 +08:00
Tianbao Xie
4e11eafd1d Robust Evaluation, Blocking File Open, Grader Sensitivity, and LibreOffice Writer Fixes (#217)
* Refactor evaluator structure in LibreOffice Writer example JSON to support multiple expected and result files, enhancing evaluation flexibility.

* Update instance type to t3.large and add VNC access URL logging for allocated VMs, enhancing remote access capabilities.

* Update time format in get_vm_file function to include hours, minutes, and seconds for more precise file naming with time suffix.

* More delay for 936321ce-5236-426a-9a20-e0e3c5dc536f; support one more potential solution.

* Enhance SetupController with configurable retry limit and improved error handling for file opening requests. Introduce new function to compare unique training records, and update logging for better debugging. Adjust JSON examples for evaluation to support multiple expected and result files.

* Clean debug code

---------

Co-authored-by: yuanmengqi <yuanmengqi@mail.ustc.edu.cn>
2025-06-16 21:37:19 +08:00
tsuky_chen
e55810809e Fix libreoffice impress evaluation (#209)
Co-authored-by: chenjix <211250101@smail.nju.edu.cn>
2025-06-08 22:12:56 +08:00
Xubin Ren
1d10514125 Fix Search Engine Detection Discrepancy in Chrome Evaluation (#172)
* Update bb5e4c0d-f964-439c-97b6-bdb9747de3f4.json

* Update __init__.py

* Update general.py
2025-04-10 17:24:50 +08:00
Timothyxxx
d373817edb Modify VLC launch command and fullscreen detection
- Add VLC_VERBOSE=-1 to suppress verbose logging in VLC launch commands across multiple example files
- Update is_vlc_fullscreen function to handle cases where screen size or window size is None
- Improve robustness of VLC-related metrics and example configurations
2025-03-06 22:11:42 +08:00
Eric Patey
bf3f054564 Fix crash caused by referencing an unbound local variable. (#128)
Co-authored-by: Eric Patey <>
2025-02-07 23:31:53 +08:00
Eric Patey
3ee6c34a36 Fix referenced before assignment regression introduced with #121. (#125)
Co-authored-by: Eric Patey <>
2025-02-05 10:51:59 +08:00
MillanK
983283a86a patch: minor bug fixes for evaluator and task configurations, documentation update (#121)
* fix: /cursor_position api return format fix

* chore: update README.md to remove deprecated command

* fix: add base score for evaluators and minor bug fixes

* fix: add base score for setup configurations

---------

Co-authored-by: Jiaqi Deng <jiaqideng@Jiaqis-MacBook-Pro.local>
2025-01-18 22:25:18 +08:00
Tianbao Xie
7d84a21962 Fix minor problems when aggregating the results (#106) 2024-11-22 17:37:34 +08:00
Pierre Carrier
924e0fcd17 metrics: fix time regex (#81) 2024-10-24 22:45:42 +08:00
Timothyxxx
25e808cc91 Fix known errors found from feedback (DBus problems, PulseAudio start, one VLC example with an error, typos) 2024-05-18 04:49:29 +08:00
Timothyxxx
9c75df5dce Clean code; Refactor environment to pass screenshot content instead of path 2024-04-13 23:34:01 +08:00
Timothyxxx
2d8eeaad58 Fix one bug in Chrome getter; fix one error for corner case in doc 2024-04-02 14:50:29 +08:00
Timothyxxx
fad621093f Fix one bug in Chrome getter 2024-04-01 15:05:48 +08:00
tsuky_chen
ca03baacf5 fix conflict 2024-03-21 16:01:31 +08:00
tsuky_chen
169a0a15ad add libreoffice examples for windows 2024-03-21 15:49:54 +08:00
Timothyxxx
d1e2b12b41 Fix GIMP bug; speed up the environment: when no a11y tree is needed, we can skip controller.get 2024-03-20 22:22:59 +08:00
BlankCheng
f5da5e940b Merge main 2024-03-18 22:21:01 +08:00
BlankCheng
4671455b56 Fix eval func 2024-03-18 22:16:04 +08:00
Timothyxxx
eeae1442cd Add execute timeout to server; Fix error examples 2024-03-18 20:42:57 +08:00
Timothyxxx
0aae756538 Code clean 2024-03-14 12:54:10 +08:00
BlankCheng
4b15595146 Update fix 2024-03-12 00:17:46 +08:00
Timothyxxx
b4cb64d861 Fix bugs in multiple examples 2024-03-11 00:26:59 +08:00
Timothyxxx
b3d27f6387 Fix bugs in multiple examples 2024-03-10 23:52:29 +08:00
Timothyxxx
e51d0e8cc9 Fix bugs in multiple apps example 0e53 2024-03-10 15:18:14 +08:00
Jason Lee
812be97a41 Merge branch 'main' of github.com:xlang-ai/DesktopEnv 2024-03-10 14:50:17 +08:00
Jason Lee
775cef744f xiaochuan corrected his bugs in multiapp examples; you can try them again now 2024-03-10 14:48:56 +08:00
Timothyxxx
e481afcf5c Fix multiple examples 2024-03-09 23:01:22 +08:00
tsuky_chen
aae848196b merge 2024-03-09 18:53:27 +08:00
tsuky_chen
5b07ec17bf fix multi apps 2024-03-09 18:50:16 +08:00
tsuky_chen
f4ec36bdfb fix multi apps 2024-03-09 18:48:17 +08:00
Jason Lee
2291af394f update google drive file link in json 2024-03-09 18:06:48 +08:00
Timothyxxx
1e0a78a453 Add none file handling for general 2024-03-09 00:30:28 +08:00
Timothyxxx
4de0eff703 Add none file handling for doc 2024-03-09 00:16:50 +08:00
Timothyxxx
62b3b2390d Fix bugs from merging 2024-03-08 23:09:11 +08:00
Tianbao Xie
f01153cadd Merge branch 'main' into xiaochuanli/addChromeExtensions 2024-03-08 20:45:49 +08:00
Tianbao Xie
4b841c199a Merge pull request #12 from xlang-ai/zhoujun/multi-app
Update multi-app examples
2024-03-08 20:41:14 +08:00
Timothyxxx
6f0fe4f482 Fix a bug in multiple apps example 2024-03-08 20:39:05 +08:00
rhythmcao
365c7798f1 Merge branch 'main' of https://github.com/xlang-ai/DesktopEnv 2024-03-08 19:26:04 +08:00
Jason Lee
62fd8feebb xiaochuan's multiapp examples 2024-03-08 19:24:15 +08:00
David Chang
1642a17bd7 Merge branch 'zdy' 2024-03-08 13:30:25 +08:00
David Chang
ce23f3dab4 ver Mar8th
fixed a task and a metric
2024-03-08 13:28:34 +08:00
rhythmcao
565c0cc58c Merge branch 'main' of https://github.com/xlang-ai/DesktopEnv 2024-03-08 00:03:37 +08:00
rhythmcao
89f0fc5410 update multi-apps 2024-03-08 00:03:08 +08:00
Timothyxxx
1af9d8911d Update multi-apps examples 2024-03-07 22:15:23 +08:00
Timothyxxx
1aa2a43908 Update multi-apps examples 2024-03-07 22:15:08 +08:00
tsuky_chen
5abdf207a9 Merge branch 'main' of https://github.com/xlang-ai/DesktopEnv 2024-03-07 17:21:12 +08:00
tsuky_chen
807a95a230 update multi apps 2024-03-07 17:20:51 +08:00
David Chang
0e6ceeb168 Merge branch 'zdy' 2024-03-07 16:54:50 +08:00
David Chang
d6cd0936b3 ver Mar7th
updated instructions and set-up configs
2024-03-07 16:54:06 +08:00