36 Commits

Author SHA1 Message Date
Tianbao Xie
4e11eafd1d Robust Evaluation, Blocking File Open, Grader Sensitivity, and LibreOffice Writer Fixes (#217)
* Refactor evaluator structure in LibreOffice Writer example JSON to support multiple expected and result files, enhancing evaluation flexibility.

* Update instance type to t3.large and add VNC access URL logging for allocated VMs, enhancing remote access capabilities.

* Update time format in get_vm_file function to include hours, minutes, and seconds for more precise file naming with time suffix.

* Add more delay for 936321ce-5236-426a-9a20-e0e3c5dc536f; support one more potential solution.

* Enhance SetupController with a configurable retry limit and improved error handling for file opening requests. Introduce a new function to compare unique training records, and update logging for better debugging. Adjust JSON examples for evaluation to support multiple expected and result files.

* Clean debug code

---------

Co-authored-by: yuanmengqi <yuanmengqi@mail.ustc.edu.cn>
2025-06-16 21:37:19 +08:00
Eric Patey
3ee6c34a36 Fix referenced before assignment regression introduced with #121. (#125)
Co-authored-by: Eric Patey <>
2025-02-05 10:51:59 +08:00
MillanK
983283a86a patch: minor bug fixes for evaluator and task configurations, documentation update (#121)
* fix: /cursor_position api return format fix

* chore: update README.md to remove deprecated command

* fix: add base score for evaluators and minor bug fixes

* fix: add base score for setup configurations

---------

Co-authored-by: Jiaqi Deng <jiaqideng@Jiaqis-MacBook-Pro.local>
2025-01-18 22:25:18 +08:00
Timothyxxx
2d8eeaad58 Fix one bug in Chrome getter; fix one error for corner case in doc 2024-04-02 14:50:29 +08:00
Timothyxxx
b3d27f6387 Fix bugs in multiple examples 2024-03-10 23:52:29 +08:00
tsuky_chen
aae848196b merge 2024-03-09 18:53:27 +08:00
tsuky_chen
5b07ec17bf fix multi apps 2024-03-09 18:50:16 +08:00
Timothyxxx
4de0eff703 Add none file handling for doc 2024-03-09 00:16:50 +08:00
Tianbao Xie
f01153cadd Merge branch 'main' into xiaochuanli/addChromeExtensions 2024-03-08 20:45:49 +08:00
Jason Lee
62fd8feebb xiaochuan's multiapp examples 2024-03-08 19:24:15 +08:00
Timothyxxx
1aa2a43908 Update multi-apps examples 2024-03-07 22:15:08 +08:00
tsuky_chen
e295430bcf Merge branch 'main' of https://github.com/xlang-ai/DesktopEnv 2024-03-07 01:25:37 +08:00
tsuky_chen
5b5475094e update multi apps 2024-03-07 01:24:36 +08:00
rhythmcao
da0dafc32c add multi-apps 5 examples by ruisheng 2024-03-06 2024-03-06 21:20:26 +08:00
tsuky_chen
69ef653a7c update multi apps 2024-03-05 22:46:56 +08:00
David Chang
e98cd6b701 ver Mar4th
updated range-wise fuzzy match mode for compare_table
2024-03-04 15:08:41 +08:00
Timothyxxx
9c5269be3a Update multiple-apps examples and eval (WIP) 2024-03-02 20:41:01 +08:00
Timothyxxx
7427b39d1d Code clean 2024-02-23 15:40:26 +08:00
Timothyxxx
1610358e08 Fix typos and examples in libreoffice_writer examples 2024-02-23 15:37:51 +08:00
rhythmcao
538b9928fe fix some problems in libreoffice writer 2024-02-02 02:23:25 +08:00
Timothyxxx
cb7643713e Add impress examples, format the import 2024-01-30 03:25:27 +08:00
Timothyxxx
1756d3b672 Fix writer examples 2024-01-30 01:25:30 +08:00
Timothyxxx
343813a29b Add impress examples; remove the auto-saving pyautogui commands and change to libreoffice pre-setting 2024-01-29 21:34:58 +08:00
rhythmcao
dfc12430c0 add multi-app example; drop dbt examples (examples in another project added by accident) 2024-01-29 19:22:22 +08:00
thomasshin
4b8cab0805 check_highlighted_words modified 2024-01-28 14:36:28 +08:00
tsuky_chen
3e7cfa8699 load libreoffice writer eval -batch 2 2024-01-26 02:07:26 +08:00
tsuky_chen
35c4ce99ff modified libreoffice writer eval examples 2024-01-23 22:02:09 +08:00
thomasshin
61b145ab13 add writer evals 8 examples 2024-01-22 23:22:44 +08:00
Timothyxxx
2228f346a9 Fix minor bugs caused by merging in setupcontroller; Initialize vscode example loading 2024-01-14 00:51:26 +08:00
David Chang
eeb8a120d6 ver Jan5th
debugged
2024-01-05 15:20:47 +08:00
Timothyxxx
03e99a68fb Loading libreoffice writer examples and found a few problems; will do another round tomorrow for the rest 2024-01-02 17:50:05 +08:00
tsuky_chen
f04e625ad9 add eval libreoffice writer compare image & centering & check file existence 2023-12-31 03:17:53 +08:00
tsuky_chen
52af1b6dd4 add eval libreoffice writer compare font & subscript & page number 2023-12-31 02:33:39 +08:00
tsuky_chen
c937e31b18 add eval libreoffice writer compare table & equation 2023-12-31 01:02:27 +08:00
tsuky_chen
2d493759e3 add eval libreoffice writer compare content 2023-12-30 18:21:39 +08:00
tsuky_chen
24f33dc9bf add eval libreoffice writer font & page break 2023-12-30 16:32:15 +08:00