feat: add run_multienv_o3.py script for multi-environment evaluation

- Introduced a new script `run_multienv_o3.py` to facilitate end-to-end evaluation across multiple environments.
- Implemented command-line argument parsing for various configurations including environment settings, logging levels, and AWS parameters.
- Integrated signal handling for graceful shutdown of environments and processes.
- Enhanced logging capabilities for better traceability during execution.
- Kept the existing logic from the earlier run scripts and added the new functionality on top of it.
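
The argument parsing and graceful shutdown described above could look roughly like the sketch below. It is not the code from `run_multienv_o3.py`: the flag names (`--num_envs`, `--log_level`, `--region`), the `FakeEnv` class, and the `active_envs` / `state` globals are all hypothetical stand-ins chosen for illustration.

```python
import argparse
import signal


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical flags; the real script may use different names and defaults.
    parser = argparse.ArgumentParser(description="Multi-environment evaluation (sketch)")
    parser.add_argument("--num_envs", type=int, default=1,
                        help="number of environments to run in parallel")
    parser.add_argument("--log_level", default="INFO",
                        choices=["DEBUG", "INFO", "WARNING", "ERROR"])
    parser.add_argument("--region", default="us-east-1",
                        help="AWS region for the environment provider")
    return parser


class FakeEnv:
    """Stand-in for a real environment object that exposes close()."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.closed = False

    def close(self) -> None:
        self.closed = True


# Module-level registry so the signal handler can reach every live environment.
active_envs: list = []
state = {"shutting_down": False}


def handle_signal(signum, frame) -> None:
    # Ignore repeated signals so environments are not closed twice.
    if state["shutting_down"]:
        return
    state["shutting_down"] = True
    for env in active_envs:
        try:
            env.close()
        except Exception as exc:  # keep closing the rest even if one fails
            print(f"failed to close {env.name}: {exc}")


def main(argv: list) -> argparse.Namespace:
    args = build_parser().parse_args(argv)
    # signal.signal must be called from the main thread.
    signal.signal(signal.SIGINT, handle_signal)
    signal.signal(signal.SIGTERM, handle_signal)
    active_envs.extend(FakeEnv(f"env-{i}") for i in range(args.num_envs))
    return args
```

In a real entry point this would be called as `main(sys.argv[1:])`. The handler in this sketch only sets a flag and closes environments, so the main loop would need to check `state["shutting_down"]` and exit on its own.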
yuanmengqi
2025-07-27 16:47:24 +00:00
parent 1342bfe5ce
commit 0f00788c4d
5 changed files with 1148 additions and 209 deletions

3
run.py

@@ -15,8 +15,7 @@ import lib_run_single
from desktop_env.desktop_env import DesktopEnv
from mm_agents.agent import PromptAgent
# import wandb
# Almost deprecated since it's not multi-env, use run_multienv_*.py instead
# Logger Configs {{{ #
logger = logging.getLogger()