Commit Graph

14 Commits

Author SHA1 Message Date
Danyang Zhang
d4273d992e Calc eval fix (#225)
* ver Jun17th

updating annotations

* ver Jun17th

corrected annotation of 1d17
added check for cell merge

* ver Jun17th

updated several annotations

* ver Jun20th

fixed set-up config of 2bd59342-0664-4ccb-ba87-79379096cc08

* fix: Enhance instructions in LibreOffice Calc examples for clarity and specificity, including details on using Pivot Tables, column placements, and revenue calculations.

* ver Jun21st

updating calc evals

* ver Jun22nd

fixed an impress task

* ver Jun22ndv2

adjusted several calc tasks

* Clean scalfolds

---------

Co-authored-by: BowenBryanWang <bryanwang.nlp@connect.hku.hk>
Co-authored-by: yuanmengqi <yuanmengqi@mail.ustc.edu.cn>
2025-06-30 18:23:09 +08:00
Timothyxxx
fb7bafb885 feat: Add proxy configuration to all 369 evaluation examples - 55 with proxy, 314 without 2025-06-05 18:46:53 +08:00
Timothyxxx
34748567a5 feat: Migrate OSWorld files to HuggingFace cache with comprehensive documentation
- Add detailed README for file cache repository
- Implement migration script with retry logic and browser simulation
- Support automatic file type detection and deduplication
- Ensure reliable hosting for OSWorld evaluation files
2025-05-28 04:29:37 +08:00
David Chang
4897211a46 ver Jan31stv6
finished calc human evaluation
updated calc configs with an extra sleep to guarantee the integrity of
downloaded xlsx file
2024-01-31 22:55:47 +08:00
David Chang
8025bf19f0 ver Jan27th
corrected usage of pyautogui in calc postconfig
2024-01-27 19:46:06 +08:00
David Chang
7a85c76369 ver Jan22nd
updated all the existing calc configs
2024-01-22 12:42:50 +08:00
David Chang
ffc4c32bac ver Jan17th
updated the existing task configs
2024-01-17 17:27:08 +08:00
David Chang
5e2a03720d ver Jan10thv4
updated /home/david to /home/user
2024-01-10 22:33:33 +08:00
David Chang
d41c674a91 Merge branch 'main' into zdy 2023-12-31 14:37:01 +08:00
David Chang
19b99a13e2 Merge branch 'zdy' 2023-12-30 20:53:54 +08:00
David Chang
e4fac09945 ver Dec29th
metric compare_with_formats
2023-12-29 21:19:52 +08:00
David Chang
5a14cf40db Merge branch 'main' into zdy 2023-12-28 21:20:57 +08:00
David Chang
ade9002da4 Merge branch 'main' into zdy 2023-12-25 20:29:20 +08:00
Timothyxxx
236fcb0938 Refactor examples; Start to load examples into benchmark; vlc initialization 2023-12-25 00:24:13 +08:00