Commit Graph

62 Commits

Author SHA1 Message Date
Timothyxxx
3f59ff46dc Add infeasible support 2024-02-14 11:59:50 +08:00
Timothyxxx
59e2417a08 Add Mistral, Qwen, Gemini support; Fix minor bugs 2024-02-01 16:55:38 +08:00
rhythmcao
fc15a33b70 finish multi-app examples 2024-02-01 00:53:31 +08:00
Timothyxxx
d65b6994d3 Fix minor bugs of multiple apps examples 2024-01-31 19:40:41 +08:00
David Chang
5a486b6b37 ver Jan27th
debugged at+screenshot implementation, no issues found
fixed a few small bugs
2024-01-27 23:10:48 +08:00
Timothyxxx
b9ae4174b1 Fix OS examples annotated by Yitao 2024-01-25 19:57:32 +08:00
Timothyxxx
5dea912d01 Finish Chrome v2 loading 2024-01-24 23:05:28 +08:00
Timothyxxx
bdd21d06ca Fix minor bugs 2024-01-19 20:34:11 +08:00
rhythmcao
91824f754c 1. extend evaluator to list (compatible with single evaluator) 2. fix a variable name error in metrics/general.py 2024-01-18 14:12:54 +08:00
Timothyxxx
b60eb2a933 VM resolution adjust support 2024-01-18 01:43:57 +08:00
Timothyxxx
8efa692951 Add raw accessibility-tree-based prompting method (but the token count is too large); Fix some minor bugs 2024-01-16 11:58:23 +08:00
Timothyxxx
493b719821 Add Gemini agent implementation; Add missing requirements; Fix some minor bugs 2024-01-15 21:58:33 +08:00
Timothyxxx
f153a4c253 Add 'WAIT', 'FAIL', 'DONE' to the action space; Debug basic prompting-based GPT-4 and Gemini agents; Initialize experiments script; 2024-01-14 23:36:19 +08:00
Timothyxxx
d52b692ee5 Finish loading the vscode examples v1; Improve on the infra: Add accessibility tree into the observation; Add activate window function, etc 2024-01-14 18:30:49 +08:00
Timothyxxx
bc88ee0c41 Minor fix of the logic of vm ip get 2024-01-12 21:18:59 +08:00
rhythmcao
d4116458ff 1. fix quote and \ characters in execute_command ; 2. add terminal output text as extra observation ; 3. move get_vm_*() to reset() 2024-01-12 18:09:05 +08:00
Timothyxxx
5a93a32958 Update on Chrome examples; Refactor on logic of controlling 2024-01-12 17:24:47 +08:00
Timothyxxx
820579a5a2 Make up missing getters and metrics; Update VLC scripts; Start to work on Chrome, update examples instructions 2024-01-11 21:27:40 +08:00
Timothyxxx
287876affc Merge remote-tracking branch 'origin/main'
# Conflicts:
#	desktop_env/evaluators/getters/__init__.py
#	desktop_env/evaluators/metrics/__init__.py
#	requirements.txt
2024-01-10 23:20:49 +08:00
Timothyxxx
49ece15ac3 VLC v1 finished, improve on instructions, improve on infra 2024-01-10 23:18:30 +08:00
David Chang
cf5d480f44 ver Jan10th
new Thunderbird task config
2024-01-10 17:36:59 +08:00
Timothyxxx
fa84b20ea5 VLC updates, and some infra bugs fix 2024-01-09 09:30:11 +08:00
David Chang
26b7d9010d Merge branch 'zdy' 2024-01-05 15:55:41 +08:00
David Chang
eeb8a120d6 ver Jan5th
debugged
2024-01-05 15:20:47 +08:00
David Chang
5fedf5b891 ver Jan4th
updated interfaces for thunderbird evaluation, not tested
2024-01-04 22:41:57 +08:00
Timothyxxx
ab71ebb2ba Initialize VLC getters and metrics, fix some bugs in infra logic, needs to be refactored later on 2024-01-04 17:05:17 +08:00
Timothyxxx
03e99a68fb Load LibreOffice Writer examples and find a few problems, will do another round tomorrow for the rest 2024-01-02 17:50:05 +08:00
David Chang
4e5920264a ver Dec27thv2
updated a task config
updated documents
fixed the options feature of evaluator
updated with new properties of charts
current load_charts should be ok, I think
2023-12-27 17:51:41 +08:00
David Chang
ade9002da4 Merge branch 'main' into zdy 2023-12-25 20:29:20 +08:00
David Chang
82e3353f65 ver Dec25th
added cache and upload function for setup
2023-12-25 14:40:30 +08:00
Timothyxxx
236fcb0938 Refactor examples; Start to load examples into benchmark; vlc initialization 2023-12-25 00:24:13 +08:00
David Chang
2163a08a0d ver Dec22ndv4
recorded action history
2023-12-22 19:03:54 +08:00
David Chang
fffc4aadca ver Dec22ndv3
added function to switch tasks and reload the configs of setup & evaluation
2023-12-22 15:20:27 +08:00
David Chang
f4664bd069 ver Dec22nd
re-organized the evaluator structure to improve the extensibility
2023-12-22 14:01:26 +08:00
David Chang
295d09f1b2 ver Dec21stv2
updated usage of tmp and cache directories
added cache function for evaluation resources acquiring
2023-12-21 16:12:32 +08:00
David Chang
b37df609b9 ver Dec20th
some adaptations to make it run on my device
2023-12-20 20:14:39 +08:00
zdy023
b7267fac07 Merge branch 'main' into zdy 2023-12-19 11:06:17 +08:00
Timothyxxx
2ca36109b5 Initialize evaluation protocols and examples; Implement one kind of eval; Update requirements 2023-12-12 18:10:55 +08:00
Timothyxxx
343b40ecac Fix action_space setup 2023-12-06 22:59:19 +08:00
Timothyxxx
4ba053998d Improve the logic of env setup; add change wallpaper; add example 2023-12-05 17:32:24 +08:00
Timothyxxx
a5dd4408fe Merge remote-tracking branch 'origin/main'
# Conflicts:
#	main.py
2023-12-05 11:48:30 +08:00
Timothyxxx
8818779329 Update compressor for data annotation 2023-12-04 00:51:33 +08:00
Jing Hua
e808cf84a7 setup controller 2023-12-03 21:09:05 +08:00
Jing Hua
b10908b4fa add longer retries 2023-12-03 17:24:44 +08:00
Jing Hua
58e9807a70 automate vm ip address 2023-12-03 17:17:25 +08:00
Timothyxxx
8cb31f1bb0 Fix the implementation of action 13 of computer 2023-12-03 01:00:26 +08:00
Timothyxxx
9471de4768 Fix the implementation of action 13 of computer 2023-12-03 00:59:02 +08:00
Timothyxxx
487fb8005b Improve: fix bugs; add back the cursor in screenshot; add pause in env.step 2023-12-02 22:14:50 +08:00
Timothyxxx
9b214b3d23 Action space thoughts 2023-12-02 18:02:06 +08:00
Timothyxxx
992d8f8fce Refactor with pyautogui 2023-12-02 17:52:00 +08:00