Commit Graph

22 Commits

Author SHA1 Message Date
Timothyxxx
a66b36295a Fix examples, and evaluation on Chrome, handle corner cases; Initialize arm support 2024-02-26 12:34:27 +08:00
Jason Lee
3244098664 finish the rest of the Chrome examples and verify them on mac arm64 2024-02-24 21:57:01 +08:00
Jason Lee
17cd897780 add new examples for chrome 2024-02-18 22:11:16 +08:00
Timothyxxx
d5d9fc56de Fix minor bugs of get_terminal output caused by a11y tree depth 2024-01-30 18:48:00 +08:00
Timothyxxx
b9ae4174b1 Fix OS examples annotated by Yitao 2024-01-25 19:57:32 +08:00
Liu Yitao
93b4ff7d95 Update OS evals 2024-01-25 10:45:51 +08:00
rhythmcao
91824f754c 1. extend evaluator to list (compatible with single evaluator) 2. fix a variable name error in metrics/general.py 2024-01-18 14:12:54 +08:00
Timothyxxx
8efa692951 Add raw accessibility-tree based prompting method (but the token count is too large); Fix some small bugs 2024-01-16 11:58:23 +08:00
David Chang
00922923ee ver Jan15thv2; thunderbird example w.r.t. unified folder 2024-01-15 15:56:01 +08:00
David Chang
d4192d3d9c ver Jan12thv3; debugged 2024-01-13 00:06:11 +08:00
David Chang
e08df57129 ver Jan12thv2; sqlite3 metric 2024-01-12 23:07:00 +08:00
David Chang
127a101994 Merge branch 'main' into zdy 2024-01-11 23:02:00 +08:00
David Chang
3c04872dcf ver Jan11thv2; a new Thunderbird example w.r.t. email filter 2024-01-11 22:43:38 +08:00
Timothyxxx
820579a5a2 Make up missing getters and metrics; Update VLC scripts; Start to work on Chrome, update examples instructions 2024-01-11 21:27:40 +08:00
David Chang
27eaf2f5d5 ver Jan11th; finally set up a simple task, or one that should be simple 2024-01-11 20:03:33 +08:00
David Chang
1515b05666 ver Jan10thv2; a new example config for Thunderbird; fixed several bugs 2024-01-10 21:58:29 +08:00
David Chang
cf5d480f44 ver Jan10th; new Thunderbird task config 2024-01-10 17:36:59 +08:00
David Chang
df8be17394 ver Jan8th; still trying to set up Thunderbird, but nothing done so far 2024-01-08 23:15:21 +08:00
David Chang
eeb8a120d6 ver Jan5th; debugged 2024-01-05 15:20:47 +08:00
David Chang
5fedf5b891 ver Jan4th; updated interfaces for thunderbird evaluation, not tested 2024-01-04 22:41:57 +08:00
Timothyxxx
03e99a68fb Load LibreOffice Writer examples and find a few problems; will do another round tomorrow for the rest 2024-01-02 17:50:05 +08:00
Timothyxxx
86ce9e1497 Initialize getters for Chrome software and general ones; Fix some examples for chrome 2023-12-29 22:24:45 +08:00