OSWorld Monitor
A web-based monitoring dashboard for OSWorld tasks and executions.
Overview
This monitor provides a visual interface to track the status, progress, and results of OSWorld tasks. It allows you to:
- View all tasks grouped by type
- Monitor task execution status in real-time
- See detailed execution steps with screenshots
- Check task results
Configuration
The monitor can be configured by editing the .env file in the monitor directory. The following variables can be customized:
| Variable | Description | Default Value |
|---|---|---|
| TASK_CONFIG_PATH | Path to the task configuration JSON file | evaluation_examples/test_small.json |
| EXAMPLES_BASE_PATH | Base path for task example files | evaluation_examples/examples |
| RESULTS_BASE_PATH | Base path for execution results | results_operator_aws/pyautogui/screenshot/computer-use-preview |
| MAX_STEPS | Maximum steps to display for a task | 50 |
| FLASK_PORT | Port for the web server | 8080 |
| FLASK_HOST | Host address for the web server | 0.0.0.0 |
| FLASK_DEBUG | Enable debug mode (true/false) | true |
For example:
# .env
TASK_CONFIG_PATH=evaluation_examples/test_small.json
EXAMPLES_BASE_PATH=evaluation_examples/examples
RESULTS_BASE_PATH=results_operator_aws/pyautogui/screenshot/computer-use-preview
MAX_STEPS=50
FLASK_PORT=8080
FLASK_HOST=0.0.0.0
FLASK_DEBUG=true
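To make the meaning of these variables concrete, here is a minimal Python sketch of how such settings could be read and type-converted. It is not taken from the monitor's source: the names `DEFAULTS`, `parse_env_text`, and `load_settings` are hypothetical, and the precedence order (process environment over .env text over defaults) is an assumption. The default values are copied from the table above.

```python
import os

# Defaults copied from the configuration table in this README.
DEFAULTS = {
    "TASK_CONFIG_PATH": "evaluation_examples/test_small.json",
    "EXAMPLES_BASE_PATH": "evaluation_examples/examples",
    "RESULTS_BASE_PATH": "results_operator_aws/pyautogui/screenshot/computer-use-preview",
    "MAX_STEPS": "50",
    "FLASK_PORT": "8080",
    "FLASK_HOST": "0.0.0.0",
    "FLASK_DEBUG": "true",
}


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def load_settings(env_text="", environ=None):
    """Merge defaults, .env text, and the process environment.

    Assumed precedence (hypothetical): environment > .env text > defaults.
    """
    environ = os.environ if environ is None else environ
    merged = dict(DEFAULTS)
    merged.update(parse_env_text(env_text))
    merged.update({key: environ[key] for key in DEFAULTS if key in environ})
    return {
        **merged,
        # Convert the string values to the types the table implies.
        "MAX_STEPS": int(merged["MAX_STEPS"]),
        "FLASK_PORT": int(merged["FLASK_PORT"]),
        "FLASK_DEBUG": merged["FLASK_DEBUG"].lower() == "true",
    }
```

For instance, `load_settings("FLASK_PORT=9090", environ={})` would yield a port of `9090` while every other setting keeps its default.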
Running with Docker
The recommended way to run the monitor is using Docker with the provided Docker Compose configuration.
Prerequisites
- Docker and Docker Compose installed on your system
- OSWorld repository cloned to your local machine
Starting the Monitor
1. Navigate to the monitor directory:

       cd /path/to/OSWorld/monitor

2. Edit the .env file if you need to customize any settings.
3. Build and start the Docker container:

       docker-compose up -d

4. Access the monitor in your web browser at:

       http://{your-ip-address}:{FLASK_PORT}
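If the page does not load, a quick check from Python can tell you whether the server answers at all. The sketch below uses only the standard library. The helper names `monitor_url` and `is_reachable` are hypothetical, and it assumes the monitor serves plain HTTP on the configured port.

```python
import http.client
import urllib.error
import urllib.request


def monitor_url(host, port):
    """Build the dashboard URL from a host and the FLASK_PORT value."""
    return f"http://{host}:{port}"


def is_reachable(url, timeout=3.0):
    """Return True if the server answers with any HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, just not with a 2xx status.
        return True
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, DNS failure, or a malformed reply.
        return False
```

With the default port, `is_reachable(monitor_url("127.0.0.1", 8080))` run on the host itself should return `True` once the container is up.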
Stopping the Monitor
To stop the monitor:
docker-compose down
Viewing Logs
To view the monitor logs:
docker-compose logs -f
Running Without Docker
If you prefer to run the monitor directly, first make sure a .env file with the necessary settings exists in the monitor directory, then install the dependencies and start the server:
1. Install the required Python packages:

       pip install -r requirements.txt

2. Start the monitor:

       python main.py
Features
- Task Overview: View all tasks with their status, progress, and basic information
- Task Filtering: Filter tasks by status (all, active, completed)
- Task Details: Detailed view of each task showing step-by-step execution
- Screenshots: View screenshots captured during task execution
Troubleshooting
If you encounter issues:
- Check the logs for errors
- Verify the paths in the .env file point to valid directories
- Ensure the Docker daemon is running (if using Docker)
- Check that the port is not already in use by another application
- If the monitor runs on a cloud instance (for example AWS EC2), make sure the security group rules allow inbound access to the specified port
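Two of these checks (valid paths and a free port) are easy to script. The following is a rough sketch using only the standard library. `missing_paths` and `port_in_use` are hypothetical helper names and are not part of the monitor.

```python
import os
import socket


def missing_paths(paths):
    """Return the subset of paths that do not exist on disk."""
    return [path for path in paths if not os.path.exists(path)]


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0
```

Before starting the monitor you might call `missing_paths([...])` with the path values from your .env file and `port_in_use(8080)` with your `FLASK_PORT`. An empty list and `False` suggest those two items are fine.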