* Added a **pyproject.toml** file to define project metadata and dependencies.
* Added **run\_maestro.py** and **osworld\_run\_maestro.py** to provide the main execution logic.
* Introduced multiple new modules, including **Evaluator**, **Controller**, **Manager**, and **Sub-Worker**, supporting task planning, state management, and data analysis.
* Added a **tools module** containing utility functions and tool configurations to improve code reusability.
* Updated the **README** and documentation with usage examples and module descriptions.
These changes lay the foundation for expanding the Maestro project’s functionality and improving the user experience.
Co-authored-by: Hiroid <guoliangxuan@deepmatrix.com>
- Introduced ALIYUN_RESOURCE_GROUP_ID environment variable to manage resource group assignments during VM allocation.
- Updated the _allocate_vm function to include resource group ID in the request if specified.
- Modified VNC URL logging to use public IP when available, enhancing clarity in access information.
- Maintained existing code logic while improving functionality for resource management and logging.
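
A minimal sketch of the optional resource group handling described above. The helper name `build_run_instances_params` and the `ResourceGroupId` request key are assumptions made for illustration; the actual `_allocate_vm` code may build its request differently.

```python
import os


def build_run_instances_params(base_params: dict) -> dict:
    """Return a copy of base_params, adding a resource group only when configured.

    Hypothetical helper: the real _allocate_vm function may differ.
    """
    params = dict(base_params)
    resource_group_id = os.getenv("ALIYUN_RESOURCE_GROUP_ID", "").strip()
    if resource_group_id:
        # Assumed request key; only sent when the variable is set and non-empty.
        params["ResourceGroupId"] = resource_group_id
    return params
```

Leaving the variable unset keeps the request identical to the previous behaviour, which matches the "if specified" wording above.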
- Added new dependencies for Aliyun ECS SDK in requirements.txt and setup.py to support instance management features.
- Introduced a new config module to handle TTL settings for ECS instances, allowing for auto-termination based on environment variables.
- Updated the manager to utilize TTL settings, including scheduling instance termination with proper error handling and logging.
- Maintained existing code logic while enhancing functionality for improved instance lifecycle management.
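
The TTL configuration could look roughly like the sketch below. The variable name `INSTANCE_TTL_MINUTES` and both function names are hypothetical; the commit does not state the real names, so treat this only as an illustration of reading a TTL from the environment and deriving a termination time.

```python
import os
from datetime import datetime, timedelta, timezone
from typing import Mapping, Optional


def read_ttl_minutes(env: Mapping[str, str] = os.environ) -> Optional[int]:
    """Return the TTL in minutes, or None when auto-termination is disabled."""
    raw = env.get("INSTANCE_TTL_MINUTES", "").strip()  # hypothetical variable name
    if not raw:
        return None
    try:
        minutes = int(raw)
    except ValueError:
        # An unparsable value disables auto-termination instead of crashing.
        return None
    return minutes if minutes > 0 else None


def termination_time(now: datetime, ttl_minutes: int) -> datetime:
    """Compute the moment at which the instance should be released."""
    return now + timedelta(minutes=ttl_minutes)


if __name__ == "__main__":
    ttl = read_ttl_minutes()
    if ttl is not None:
        print(termination_time(datetime.now(timezone.utc), ttl).isoformat())
```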
- Improved error handling and logging for role resolution and creation.
- Added checks to ensure the trust policy allows for AWS EventBridge Scheduler to assume the role.
- Implemented retry logic for scheduling EC2 termination to handle IAM eventual consistency.
- Maintained existing code logic while enhancing robustness and clarity in role management.
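
The retry logic for IAM eventual consistency can be sketched as a generic exponential backoff wrapper. `retry_with_backoff` is a hypothetical name, and catching every `Exception` is a simplification; real code would more likely retry only on the specific errors raised while a new role is still propagating.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation until it succeeds, doubling the wait after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the helper testable without real delays.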
- Introduced a new config module to manage TTL settings for EC2 instances, allowing for auto-termination based on environment variables.
- Updated the AWSProvider and manager to utilize the new TTL settings, including scheduling instance termination via EventBridge Scheduler.
- Added utility functions for resolving the scheduler role ARN and creating termination schedules, ensuring robust error handling and logging.
- Maintained existing code logic while integrating new features for improved instance lifecycle management.
* Adding support for aliyun as a provider
* feat: enhance Aliyun provider support
- Added Aliyun as a new provider in the desktop environment.
- Updated the environment configuration guidelines for Aliyun, including prerequisites and environment variables.
- Implemented instance allocation and management functions for Aliyun ECS, including signal handling for graceful termination.
- Improved logging and error handling during instance creation and status checks.
- Adjusted the provider's methods to utilize the new instance management functions.
- Changed the AMI ID for the ap-east-1 region to a new value for better compatibility.
- Added comments to clarify the usage of AMIs for CoACT-1 and the need for manual transfer from us-east-1.
- Ensured existing logic remains unchanged while improving documentation for future reference.
- Reverted the return value in the AWSProvider class to use private IP address instead of public IP address.
- Ensured that the logic remains intact while addressing the specific requirement for VNC access.
* Use the AWS public IP
* OS task fix: set the default screen dim time to 300s
* Add all the UITARS agents:
1. run_multienv_uitars.py: Qwen2VL-based UITARS models
2. run_multienv_uitars15_v1.py: UITARS1.5-7B
3. run_multienv_uitars15_v2.py: SeedVL1.5 thinking/non-thinking
---------
Co-authored-by: Jiaqi <dengjiaqi@moonshot.cn>
- Fix delete_vm() method to accept region and **kwargs parameters
- Fix occupy_vm() method to accept pid, region and **kwargs parameters
- Ensures consistency with base VMManager interface and other providers
- Resolves runtime argument mismatch errors when calling these methods
This maintains backward compatibility while fixing the interface contract.
Add support for HuggingFace mirror (hf-mirror.com) to improve download
speeds in regions where huggingface.co access is slow.
- Support HF_ENDPOINT environment variable detection
- Automatically switch to hf-mirror.com when HF_ENDPOINT is set
- Apply to Docker, VMware, and VirtualBox providers
- Maintain backward compatibility with default huggingface.co URLs
Users can now set HF_ENDPOINT=https://hf-mirror.com to use the mirror.
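
A rough sketch of the endpoint switch, assuming download URLs start with `https://huggingface.co/`. The function name `resolve_hf_url` and the example path are hypothetical, and this version substitutes whatever `HF_ENDPOINT` contains, which may differ from how the providers actually perform the check.

```python
import os

DEFAULT_HF_ENDPOINT = "https://huggingface.co"


def resolve_hf_url(url: str) -> str:
    """Rewrite a huggingface.co URL to the endpoint given in HF_ENDPOINT, if any."""
    endpoint = os.environ.get("HF_ENDPOINT", "").strip().rstrip("/")
    prefix = DEFAULT_HF_ENDPOINT + "/"
    if endpoint and url.startswith(prefix):
        return endpoint + "/" + url[len(prefix):]
    # Unset variable or non-Hugging Face URL: keep the original for compatibility.
    return url
```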
- Added a new function to ensure URLs have a scheme, defaulting to 'http://' if missing.
- Integrated tldextract to normalize URLs by extracting domain parts and handling 'www' subdomains.
- Updated the compare_urls function to include logging for better traceability during URL comparisons.
- Added tldextract to requirements.txt to support the new functionality.
- Updated the AWS manager with a new AMI ID for the specified resolution.
- Modified Chrome desktop launcher to include --remote-debugging-port=1337 for GUI debugging support.
These changes improve the robustness of URL handling and enable consistent Chrome debugging capabilities without altering existing logic.
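
The scheme defaulting can be illustrated with the standard library alone. This is a simplified stand-in, not the committed code: it does not use tldextract, it only strips a leading `www.`, and both function names are hypothetical.

```python
from urllib.parse import urlsplit


def ensure_scheme(url: str, default_scheme: str = "http://") -> str:
    """Prefix the URL with a default scheme when none is present."""
    url = url.strip()
    return url if "://" in url else default_scheme + url


def normalized_host(url: str) -> str:
    """Return the lower-cased host without a leading 'www.' (stdlib approximation)."""
    host = urlsplit(ensure_scheme(url)).hostname or ""
    return host[4:] if host.startswith("www.") else host
```

With these helpers, `www.Example.com/path` and `https://example.com` resolve to the same host, which is the kind of equivalence the URL comparison is meant to tolerate.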
- Added postconfig commands to multiple JSON files for Chrome evaluation examples.
- Included commands to terminate existing Chrome processes, launch Chrome with remote debugging, and introduce sleep intervals for timing.
- Updated logging messages in the AWS manager to improve clarity and user experience.
These changes enhance the automation and usability of the evaluation examples while preserving existing logic.
- Add screen_size parameter to get_vm_path() for all providers (with default 1920x1080)
- Add os_type parameter to start_emulator() for Azure and VirtualBox providers
- Add region parameter to stop_emulator() for VMware, Docker, and VirtualBox providers
- Use *args, **kwargs for better extensibility and parameter consistency
- Add documentation comments explaining ignored parameters for interface consistency
- Prevents TypeError exceptions when AWS-specific parameters are passed to other providers
This ensures all providers can handle the same parameter sets while maintaining
backward compatibility and avoiding interface fragmentation.
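
The interface alignment can be pictured with the sketch below. `Provider` and `LocalProvider` are hypothetical stand-ins for the real provider classes; the point is only that a provider with no notion of regions still accepts `region` and extra arguments rather than raising `TypeError`.

```python
from abc import ABC, abstractmethod


class Provider(ABC):
    """Hypothetical base interface shared by all providers."""

    @abstractmethod
    def stop_emulator(self, path_to_vm, region=None, *args, **kwargs):
        ...


class LocalProvider(Provider):
    """Hypothetical provider without regions (for example, a local hypervisor)."""

    def __init__(self):
        self.stopped = []

    def stop_emulator(self, path_to_vm, region=None, *args, **kwargs):
        # region and any extra arguments are accepted only for interface
        # consistency and are deliberately ignored here.
        self.stopped.append(path_to_vm)
        return path_to_vm
```

A caller can then pass the same arguments to every provider, for example `LocalProvider().stop_emulator("vm.vmx", region="us-east-1")`.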
- Replaced print statements with logging for better traceability in gimp.py.
- Added handling for transparent images in structure checks and size evaluations.
- Updated JSON examples to include delays in pyautogui commands for improved execution reliability.
- Changed image URL in example to a more accessible source.
* Enhance SetupController with improved logging and error handling during setup and file upload processes. Update instance type to t3.xlarge and AMI ID for AWS configuration. Add download progress logging and exception handling for better debugging.
* Enhance VLC status evaluation by adding multiple paths for file and URL information extraction, improving robustness against varying VLC XML structures. Implement detailed logging for better debugging and error handling in case of mismatches or missing data. Update example JSON for VLC evaluation to use a valid HLS stream URL.
* Improve audio comparison robustness in VLC evaluator by adding error handling for audio file loading and extraction. Implement detailed logging for empty or corrupt files, and normalize DTW distance calculation for more accurate similarity scoring. Remove deprecated audio fingerprint comparison function.
---------
Co-authored-by: yuanmengqi <yuanmengqi@mail.ustc.edu.cn>
* Refactor evaluator structure in LibreOffice Writer example JSON to support multiple expected and result files, enhancing evaluation flexibility.
* Update instance type to t3.large and add VNC access URL logging for allocated VMs, enhancing remote access capabilities.
* Update time format in get_vm_file function to include hours, minutes, and seconds for more precise file naming with time suffix.
* More delay for 936321ce-5236-426a-9a20-e0e3c5dc536f; support one more potential solution.
* Enhance SetupController with configurable retry limit and improved error handling for file opening requests. Introduce new function to compare unique training records, and update logging for better debugging. Adjust JSON examples for evaluation to support multiple expected and result files.
* Clean debug code
* Enhance DesktopEnv to track environment usage for optimized snapshot management. Introduce is_environment_used flag to determine if a snapshot revert is necessary based on provider type. Update setup and step methods to mark environment usage appropriately. Add new execute_with_verification method in SetupController for command execution with result verification, improving reliability. Change AWS instance type to m5.large for better performance and update AMI ID for compatibility. Update file opening logic in main.py to handle both file paths and application commands more effectively.
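
The usage tracking described above might work along these lines. The class name, the `always_revert` switch, and the rule for when a revert is needed are all assumptions for illustration; the commit only says the decision depends on the `is_environment_used` flag and the provider type.

```python
class EnvironmentUsageTracker:
    """Hypothetical sketch of skipping snapshot reverts for untouched environments."""

    def __init__(self, always_revert: bool = False):
        # always_revert models providers that revert on every reset (assumed).
        self.always_revert = always_revert
        self.is_environment_used = False

    def mark_used(self) -> None:
        """Called from setup/step once the environment has been modified."""
        self.is_environment_used = True

    def reset(self) -> bool:
        """Return True when a snapshot revert would be performed."""
        needs_revert = self.always_revert or self.is_environment_used
        self.is_environment_used = False  # a fresh episode starts clean
        return needs_revert
```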
---------
Co-authored-by: yuanmengqi <yuanmengqi@mail.ustc.edu.cn>