Commit Graph

100 Commits

Author SHA1 Message Date
tsuky_chen
35c4ce99ff modified libreoffice writer eval examples 2024-01-23 22:02:09 +08:00
rhythmcao
42725a00a5 fix some vscode eval problems 2024-01-22 23:30:10 +08:00
thomasshin
61b145ab13 add writer evals 8 examples 2024-01-22 23:22:44 +08:00
David Chang
6bdc0fb3e2 Merge branch 'zdy' 2024-01-22 14:37:13 +08:00
David Chang
7a85c76369 ver Jan22nd
updated all the existing calc configs
2024-01-22 12:42:50 +08:00
Timothyxxx
ce51d16bb3 Loading Impress v1 batch 2024-01-22 02:41:17 +08:00
David Chang
552491f765 ver Jan21stv2
fixed bugs
updated parts of configs
2024-01-21 23:55:04 +08:00
David Chang
4514c32269 ver Jan21st
reconstructed calc metrics
not updated the configs yet
2024-01-21 22:55:52 +08:00
Siheng Zhao
980f7290eb add vscode examples 2024-01-20 19:44:52 +08:00
David Chang
4af19fb777 Merge branch 'zdy' 2024-01-18 18:06:00 +08:00
David Chang
119a79e4fa ver Jan18thv2
updated metrics.__init__ with new check_data_validations
2024-01-18 18:04:18 +08:00
David Chang
a97c865c0c ver Jan18th
completed all the incomplete tasks stored under libreoffice_calc before
added metric check_data_validations
2024-01-18 17:54:53 +08:00
rhythmcao
91824f754c 1. extend evaluator to list (compatible with single evaluator) 2. fix a variable name error in metrics/general.py 2024-01-18 14:12:54 +08:00
David Chang
d3a9e5088d Merge branch 'zdy' 2024-01-17 22:48:30 +08:00
David Chang
19214f2107 ver Jan17thv2
updated compare_table with compare the shown value through exported csv
2024-01-17 22:43:26 +08:00
tsuky_chen
ba8ae104cf update impress eval examples 2024-01-17 18:00:20 +08:00
Timothyxxx
20b1d950a0 Fix corner cases (val connection in chrome when using playwright, and action parsing for agent, and accessibility tree xml handling) 2024-01-16 22:00:01 +08:00
Timothyxxx
6336a31419 Merge remote-tracking branch 'origin/main' 2024-01-16 11:58:35 +08:00
Timothyxxx
8efa692951 Add raw accessibility-tree based prompting method (but the tokens are too large); Minor fix some small bugs 2024-01-16 11:58:23 +08:00
tsuky_chen
9fe3a5db3b update libreoffice impress eval 2024-01-16 01:07:40 +08:00
David Chang
5dc633393f Merge branch 'zdy' 2024-01-15 17:08:38 +08:00
David Chang
00922923ee ver Jan15thv2
thunderbird example w.r.t. unified folder
2024-01-15 15:56:01 +08:00
Timothyxxx
1141232d80 Merge remote-tracking branch 'origin/main'
# Conflicts:
#	desktop_env/controllers/setup.py
2024-01-15 13:51:11 +08:00
Timothyxxx
24169a65d0 Accomplish the exp scripts v1; Add video recording and trajectory recording of desktop agent; Fix minor bugs 2024-01-15 13:49:48 +08:00
tsuky_chen
f44995cb92 update libreoffice impress example 2024-01-15 01:32:22 +08:00
Timothyxxx
d52b692ee5 Finish loading the vscode examples v1; Improve on the infra: Add accessibility tree into the observation; Add activate window function, etc 2024-01-14 18:30:49 +08:00
Timothyxxx
2228f346a9 Fix minor bugs caused from merging in setupcontroller; Initialize vscode example loading 2024-01-14 00:51:26 +08:00
Siheng Zhao
347160a35f update vsc 2024-01-13 23:20:36 +08:00
Timothyxxx
57a41a279c Resolve conflicts 2024-01-13 22:58:20 +08:00
Timothyxxx
a1c3e4c294 Finish Chrome example loading v1 2024-01-13 22:56:50 +08:00
Siheng Zhao
f274193265 Merge branch 'main' of github.com:ztjhz/DesktopEnv 2024-01-13 18:14:31 +08:00
Siheng Zhao
105fd35683 implement action replay for vscode and gimp evaluation 2024-01-13 17:53:13 +08:00
David Chang
005b054a0b Merge branch 'main' into zdy 2024-01-13 17:15:32 +08:00
tsuky_chen
136b52c876 eval gimp compare pics 2024-01-13 01:49:46 +08:00
David Chang
d4192d3d9c ver Jan12thv3
debugged
2024-01-13 00:06:11 +08:00
David Chang
e08df57129 ver Jan12thv2
sqlite3 metric
2024-01-12 23:07:00 +08:00
Timothyxxx
186df65683 Merge remote-tracking branch 'origin/main'
# Conflicts:
#	desktop_env/controllers/setup.py
#	desktop_env/evaluators/metrics/utils.py
2024-01-12 17:30:15 +08:00
Timothyxxx
5a93a32958 Update on Chrome examples; Refactor on logic of controlling 2024-01-12 17:24:47 +08:00
Siheng Zhao
d6f694da1c update GIMP examples and eval 2024-01-12 16:36:02 +08:00
tsuky_chen
85f64e884c 2 init gimp examples 2024-01-12 16:10:10 +08:00
tsuky_chen
e79235f568 2 init gimp example 2024-01-12 16:07:55 +08:00
David Chang
5160619783 ver Jan12th
quickly fixed two thunderbird examples
2024-01-12 12:19:23 +08:00
David Chang
127a101994 Merge branch 'main' into zdy 2024-01-11 23:02:00 +08:00
David Chang
3c04872dcf ver Jan11thv2
a new Thunderbird example w.r.t. email filter
2024-01-11 22:43:38 +08:00
Timothyxxx
820579a5a2 Make up missing getters and metrics; Update VLC scripts; Start to work on Chrome, update examples instructions 2024-01-11 21:27:40 +08:00
David Chang
27eaf2f5d5 ver Jan11th
finally set up a simple task, or which should be simple
2024-01-11 20:03:33 +08:00
Timothyxxx
287876affc Merge remote-tracking branch 'origin/main'
# Conflicts:
#	desktop_env/evaluators/getters/__init__.py
#	desktop_env/evaluators/metrics/__init__.py
#	requirements.txt
2024-01-10 23:20:49 +08:00
Timothyxxx
49ece15ac3 VLC v1 finished, improve on instructions, improve on infra 2024-01-10 23:18:30 +08:00
David Chang
18cc1fc52c ver Jan10thv3
minor fixes
2024-01-10 22:23:48 +08:00
David Chang
cebae4b183 Merge branch 'main' into zdy 2024-01-10 22:16:25 +08:00