Commit Graph

38 Commits

Author SHA1 Message Date
Xinyuan Wang
24fbad9015 Merge pull request #264 from yuanmengqi/main
Improve the parallel logic
2025-07-17 12:28:48 +08:00
yuanmengqi
fe40011b5d Improve the parallel logic 2025-07-17 04:21:42 +00:00
yuanmengqi
6788c58aa3 Improve the parallel logic 2025-07-17 04:20:59 +00:00
yuanmengqi
bb8b0b2582 Improve the parallel logic 2025-07-17 04:19:44 +00:00
Zilong Zhou
dc164d5269 feat&fix: update configuration management to save model arguments and enhance UI display for model args (#262) 2025-07-16 21:46:35 +08:00
yuanmengqi
cb070307ee merge code 2025-07-15 14:57:14 +00:00
yuanmengqi
90c4e894a4 Merge remote-tracking branch 'upstream/main' into fix_chrome 2025-07-14 07:14:19 +00:00
yuanmengqi
5d90faa548 run operagor 2025-07-14 07:13:17 +00:00
Zilong Zhou
74b7c189af Feat/monitor (#254)
* feat: add claude support

* feat: add script for end-to-end evaluation with logging and task distribution

* feat&fix: add tool result handling and update model default in evaluation script

* chore: remove run_test_env.py script

* feat&fix: implement action parsing for tool calls and update default action space

* fix: update text formatting in action parsing and replace logger import

* feat&fix: implement action parsing for tool calls and add screen size handling

* feat: add setup instructions for Anthropic API integration

* feat: add notice about image size limitations for Anthropic API

* Delete test_env/logger.py

* Delete test_env/utils.py

* fix: update logger usage to use global logger and improve error handling

* feat&fix: add configuration management API endpoints and update UI for configuration selection

* feat&fix: update environment configuration, enhance task statistics, and improve UI responsiveness

* feat&fix: add configuration toggle button in UI and improve task loading performance

* feat&fix: add accuracy percentage display to score and style updates for UI
2025-07-14 13:43:41 +08:00
yuanmengqi
572a94b6df Merge branch 'main' into fix_chrome 2025-07-13 10:16:08 +00:00
yuanmengqi
ea51f5264a fix chrome 2025-06-30 08:07:24 +00:00
yuanmengqi
7315aec6e6 clean code 2025-06-10 04:06:54 +00:00
yuanmengqi
aee1207fff fix error 2025-06-09 04:20:59 +00:00
yuanmengqi
d8872634ee edit prompt 2025-06-08 03:59:31 +00:00
yuanmengqi
f48d80002f Merge remote-tracking branch 'upstream/feat/aws-provider-support' 2025-06-07 13:22:53 +00:00
yuanmengqi
c57b1d4e7a eval update 2025-06-07 13:19:22 +00:00
adlsdztony
7d25f902a4 refactor&fix: update README and main.py for improved configuration and task status handling 2025-06-06 12:55:13 +00:00
yuanmengqi
64177045b5 Merge remote-tracking branch 'upstream/feat/aws-provider-support' 2025-06-06 10:22:56 +00:00
yuanmengqi
4ea24ddfd3 add proxy 2025-06-06 09:41:22 +00:00
adlsdztony
2ad48f04d7 feat&fix: update environment configuration for Docker compatibility and enhance result path handling 2025-06-06 02:53:20 +00:00
yuanmengqi
a6300e05c9 Merge remote-tracking branch 'upstream/feat/aws-provider-support' 2025-06-05 13:31:42 +00:00
adlsdztony
6acb8a2d1f feat&fix: implement task status caching for improved performance and add cache handling in brief status retrieval 2025-06-05 12:42:59 +00:00
adlsdztony
3b1540ed23 feat&fix: enhance task status handling and update logging configuration 2025-06-05 09:33:36 +00:00
adlsdztony
2bfb4af8b5 feat&fix: add score display banner and update task status with animation 2025-06-05 05:07:45 +00:00
adlsdztony
2d5cee3f5c feat&fix: add brief task status retrieval and improve task status update mechanism 2025-06-05 03:41:43 +00:00
adlsdztony
80e4ec75de fix&docs: update FLASK_DEBUG setting to false in .env and README 2025-06-04 19:58:47 +08:00
yuanmengqi
b211df3385 fix timeout 2025-06-04 10:23:45 +00:00
yuanmengqi
b87cbe69e5 add monitor 2025-06-02 13:34:20 +00:00
adlsdztony
e363da2fd7 docs: update README with important execution note & fix: fix auto-refresh logic 2025-06-02 21:11:38 +08:00
adlsdztony
2b36860a03 refactor&fix: remove unused no-transition styles and simplify refresh logic 2025-06-01 09:36:46 +00:00
adlsdztony
37505f4c3b feat&fix: implement auto-refresh functionality and disable animation when refresh 2025-06-01 08:45:58 +00:00
adlsdztony
e48bd6b059 feat: add .env configuration file and update README with configuration details 2025-06-01 07:07:47 +00:00
adlsdztony
41e9e86379 fix: update task rendering to correctly display error count 2025-06-01 06:58:05 +00:00
adlsdztony
b5efb82172 feat&fix: add task recording endpoint, enhance video player support, and improve mobile responsiveness 2025-06-01 06:50:02 +00:00
adlsdztony
cb62b3c877 feat&fix: update paths in configuration, enhance error handling, and improve UI elements 2025-06-01 04:48:50 +00:00
adlsdztony
d1a001b2b7 fix&refactor: correct port mapping in docker-compose and set fixed port in main.py 2025-06-01 10:57:14 +08:00
adlsdztony
60a2b495b9 feat: add README for OSWorld Monitor with configuration and usage instructions 2025-06-01 10:48:14 +08:00
adlsdztony
53c4106c5b feat: Implement task monitoring web application 2025-06-01 10:31:27 +08:00