13 Commits

Author SHA1 Message Date
Zilong Zhou
66694c663d Feat/monitor cache (#267)
* feat&style: add task status configuration and clear cache functionality; enhance UI styles

* feat&refactor: enhance current configuration API and improve cache clearing logic

* refactor&style: simplify task status update logic and improve page refresh mechanism

* refactor&feat: streamline default configuration retrieval and enhance cache initialization logic

* feat&refactor: add caching to default configuration retrieval and streamline task status logic

* feat&style: add collapsible section for additional model parameters and enhance styling for config items

* refactor&style: remove floating action button and clean up related styles
2025-07-18 01:58:20 +08:00
Zilong Zhou
dc164d5269 feat&fix: update configuration management to save model arguments and enhance UI display for model args (#262) 2025-07-16 21:46:35 +08:00
Zilong Zhou
74b7c189af Feat/monitor (#254)
* feat: add Claude support

* feat: add script for end-to-end evaluation with logging and task distribution

* feat&fix: add tool result handling and update model default in evaluation script

* chore: remove run_test_env.py script

* feat&fix: implement action parsing for tool calls and update default action space

* fix: update text formatting in action parsing and replace logger import

* feat&fix: implement action parsing for tool calls and add screen size handling

* feat: add setup instructions for Anthropic API integration

* feat: add notice about image size limitations for Anthropic API

* Delete test_env/logger.py

* Delete test_env/utils.py

* fix: update logger usage to use global logger and improve error handling

* feat&fix: add configuration management API endpoints and update UI for configuration selection

* feat&fix: update environment configuration, enhance task statistics, and improve UI responsiveness

* feat&fix: add configuration toggle button in UI and improve task loading performance

* feat&fix: add accuracy percentage display to score and style updates for UI
2025-07-14 13:43:41 +08:00
adlsdztony
3b1540ed23 feat&fix: enhance task status handling and update logging configuration 2025-06-05 09:33:36 +00:00
adlsdztony
2bfb4af8b5 feat&fix: add score display banner and update task status with animation 2025-06-05 05:07:45 +00:00
adlsdztony
2d5cee3f5c feat&fix: add brief task status retrieval and improve task status update mechanism 2025-06-05 03:41:43 +00:00
adlsdztony
e363da2fd7 docs: update README with important execution note & fix: fix auto-refresh logic 2025-06-02 21:11:38 +08:00
adlsdztony
2b36860a03 refactor&fix: remove unused no-transition styles and simplify refresh logic 2025-06-01 09:36:46 +00:00
adlsdztony
37505f4c3b feat&fix: implement auto-refresh functionality and disable animation on refresh 2025-06-01 08:45:58 +00:00
adlsdztony
41e9e86379 fix: update task rendering to correctly display error count 2025-06-01 06:58:05 +00:00
adlsdztony
b5efb82172 feat&fix: add task recording endpoint, enhance video player support, and improve mobile responsiveness 2025-06-01 06:50:02 +00:00
adlsdztony
cb62b3c877 feat&fix: update paths in configuration, enhance error handling, and improve UI elements 2025-06-01 04:48:50 +00:00
adlsdztony
53c4106c5b feat: Implement task monitoring web application 2025-06-01 10:31:27 +08:00