* Added a **pyproject.toml** file to define project metadata and dependencies.
* Added **run\_maestro.py** and **osworld\_run\_maestro.py** to provide the main execution logic.
* Introduced multiple new modules, including **Evaluator**, **Controller**, **Manager**, and **Sub-Worker**, supporting task planning, state management, and data analysis.
* Added a **tools module** containing utility functions and tool configurations to improve code reusability.
* Updated the **README** and documentation with usage examples and module descriptions.

These changes lay the foundation for expanding the Maestro project’s functionality and improving the user experience.

Co-authored-by: Hiroid <guoliangxuan@deepmatrix.com>

# GUI-Agent Architecture and Workflow

## System Overview

### Core Components

- Controller: Central controller responsible for state management and decision triggering
- Manager: Task planner responsible for task decomposition and re-planning
- Worker: Executor with three specialized roles:
  - Technician: Uses the system terminal to complete tasks
  - Operator: Executes GUI operations
  - Analyst: Provides analytical support
- Evaluator: Quality inspector responsible for evaluating execution effectiveness
- Hardware: Hardware interface responsible for actual operation execution

### Global State Definitions

```python
{
    "TaskStatus": ["created", "pending", "on_hold", "fulfilled", "rejected"],
    "SubtaskStatus": ["ready", "pending", "fulfilled", "rejected"],
    "ExecStatus": ["executed", "timeout", "error", "pending"],
    "GateDecision": ["gate_done", "gate_fail", "gate_supplement", "gate_continue"],
    "GateTrigger": ["PERIODIC_CHECK", "WORKER_STALE", "WORKER_SUCCESS", "FINAL_CHECK"],
    "controller_situation": ["INIT", "GET_ACTION", "EXECUTE_ACTION", "QUALITY_CHECK", "PLAN", "SUPPLEMENT", "FINAL_CHECK", "DONE"],
}
```
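The state groups above are plain lists of strings. As a minimal sketch (not the project's actual code), two of them could be expressed as enumerations so that unknown values are rejected early. The class names simply mirror the dictionary keys and are an assumption of this example.

```python
from enum import Enum


class TaskStatus(str, Enum):
    """Overall task status, mirroring the "TaskStatus" list."""
    CREATED = "created"
    PENDING = "pending"
    ON_HOLD = "on_hold"
    FULFILLED = "fulfilled"
    REJECTED = "rejected"


class GateDecision(str, Enum):
    """Quality check decision, mirroring the "GateDecision" list."""
    GATE_DONE = "gate_done"
    GATE_FAIL = "gate_fail"
    GATE_SUPPLEMENT = "gate_supplement"
    GATE_CONTINUE = "gate_continue"


def parse_task_status(raw: str) -> TaskStatus:
    """Convert a raw string into a TaskStatus; raises ValueError if unknown."""
    return TaskStatus(raw)
```

Because the enumerations also subclass `str`, members still compare equal to the raw strings used elsewhere in this document.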

#### State Descriptions:

- TaskStatus: Overall task status
- SubtaskStatus: Subtask status
- ExecStatus: Command execution status
- GateDecision: Quality check decision result
- GateTrigger: Quality check trigger condition
- controller_situation: Current phase of the Controller's scheduling loop

## System Startup and Initialization

### Startup Check

```
Initialize system state
    TaskStatus = pending

Check task status:
    If TaskStatus = fulfilled or TaskStatus = rejected
        Enter end state
    Otherwise
        Enter core scheduling loop
```
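The startup check can be sketched as a small function. This is an illustrative sketch only: the function name and the return labels `"END"` and `"CORE_LOOP"` are hypothetical and do not come from the project.

```python
# Task statuses that end the run immediately, per the startup check.
TERMINAL_TASK_STATUSES = {"fulfilled", "rejected"}


def startup_check(task_status: str = "pending") -> str:
    """Return "END" for a finished task, otherwise "CORE_LOOP".

    The default mirrors the initialization step (TaskStatus = pending).
    Both return labels are hypothetical names used for illustration.
    """
    if task_status in TERMINAL_TASK_STATUSES:
        return "END"
    return "CORE_LOOP"
```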

## Core Scheduling Loop

### State Flow Description

- GET_ACTION: Generate specific operation instructions

```
Executing Component: Worker (Technician/Operator/Analyst)
GET_ACTION → Worker execution → Result judgment
├── success → current_situation = QUALITY_CHECK
├── CANNOT_EXECUTE → current_situation = PLAN
├── STALE_PROGRESS → current_situation = QUALITY_CHECK
├── generate_action → current_situation = EXECUTE_ACTION
└── supplement → current_situation = SUPPLEMENT
```
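The GET_ACTION branches amount to a lookup from the Worker's outcome to the next situation. The sketch below is one hedged way to write that; the function and table names are hypothetical. It maps `CANNOT_EXECUTE` to `PLAN`, on the assumption that re-planning is handled by the `PLAN` situation from the global state definitions.

```python
# Worker outcome -> next controller situation (illustrative table).
GET_ACTION_TRANSITIONS = {
    "success": "QUALITY_CHECK",
    "CANNOT_EXECUTE": "PLAN",  # assumption: re-planning uses the PLAN situation
    "STALE_PROGRESS": "QUALITY_CHECK",
    "generate_action": "EXECUTE_ACTION",
    "supplement": "SUPPLEMENT",
}


def next_situation_after_get_action(outcome: str) -> str:
    """Look up the situation that follows a Worker outcome."""
    try:
        return GET_ACTION_TRANSITIONS[outcome]
    except KeyError:
        raise ValueError(f"unknown Worker outcome: {outcome!r}") from None
```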

- EXECUTE_ACTION: Execute specific operations

```
Executing Component: Hardware
SEND_ACTION → Hardware execution → Get screenshot → Update history → current_situation = GET_ACTION
```

- QUALITY_CHECK: Quality assessment of execution effectiveness

```
Executing Component: Evaluator
Core Functions: Visual comparison, progress analysis, efficiency evaluation
QUALITY_CHECK → Evaluator assessment → GateDecision judgment
├── gate_done → Check subtask status
│   ├── More subtasks exist → Switch to next subtask → current_situation = GET_ACTION
│   └── No more subtasks → current_situation = FINAL_CHECK
├── gate_fail → current_situation = PLAN
├── gate_continue → current_situation = EXECUTE_ACTION
└── gate_supplement → current_situation = SUPPLEMENT
```
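The gate decision handling above could be sketched as follows. The function name and the `remaining_subtasks` parameter are hypothetical; only the decision names and target situations are taken from the tree.

```python
def next_situation_after_quality_check(decision: str, remaining_subtasks: int) -> str:
    """Map a GateDecision to the next controller situation."""
    if decision == "gate_done":
        # Continue with the next subtask, or verify the whole task if none remain.
        return "GET_ACTION" if remaining_subtasks > 0 else "FINAL_CHECK"

    simple_transitions = {
        "gate_fail": "PLAN",
        "gate_continue": "EXECUTE_ACTION",
        "gate_supplement": "SUPPLEMENT",
    }
    if decision not in simple_transitions:
        raise ValueError(f"unknown GateDecision: {decision!r}")
    return simple_transitions[decision]
```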

- PLAN: Re-plan tasks

```
Executing Component: Manager
PLAN → Manager re-planning → Generate new subtasks → Assign Workers → current_situation = GET_ACTION
```

- SUPPLEMENT: Supplement external materials

```
Executing Component: Manager
SUPPLEMENT → Manager calls external tools → Generate supplementary materials → Record materials → current_situation = PLAN
External Tools: web search, RAG, etc.
```

- FINAL_CHECK: Final verification of task completion status

```
Executing Component: Evaluator
Trigger Condition: Final verification after all subtasks are marked as complete
FINAL_CHECK → Evaluator final assessment → Result judgment
├── Verification passed → TaskStatus = fulfilled → System ends
└── Issues found → current_situation = PLAN → Continue execution
Verification Content:
    Whether overall objectives are achieved
    Whether all necessary steps are completed
    Whether final state meets expectations
    Whether there are omissions or errors
```
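As a rough sketch, the final verification can be modeled as a checklist in which every item must pass. The item keys and the function name are hypothetical, and mapping "System ends" to the `DONE` situation is an assumption based on the `controller_situation` list.

```python
from typing import Dict, Tuple

# Hypothetical keys for the four verification items listed above.
FINAL_CHECK_ITEMS = (
    "objectives_achieved",
    "steps_completed",
    "final_state_as_expected",
    "no_omissions_or_errors",
)


def final_check(results: Dict[str, bool], task_status: str) -> Tuple[str, str]:
    """Return (task_status, next_situation) after the final verification.

    A missing item counts as a failure.
    """
    passed = all(results.get(item, False) for item in FINAL_CHECK_ITEMS)
    if passed:
        return "fulfilled", "DONE"  # assumption: "System ends" means DONE
    return task_status, "PLAN"  # issues found: keep status and re-plan
```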

## Worker Professional Division

### Technician

- Applicable Scenarios: Tasks requiring system-level operations
- Working Method: Completes tasks by running terminal commands through a backend execution service; it can write bash scripts in fenced `bash` code blocks and Python code in fenced `python` code blocks
- Typical Tasks:
  - File system operations
  - System configuration modifications
  - Program installation and deployment
  - Script execution

### Operator

- Applicable Scenarios: Tasks requiring GUI interaction or internal operations such as memorization
- Working Method: Simulates user interface operations
- Typical Tasks:
  - Clicking buttons and menus
  - Filling in forms
  - Drag-and-drop operations
  - Window management

### Analyst

- Applicable Scenarios: Tasks requiring data analysis and decision support
- Working Method: Analyzes memory stored inside the system and provides recommendations
- Typical Tasks:
  - Question analysis

## Monitoring and Trigger Mechanisms

### Quality Check Trigger Mechanism

GateTrigger Types:

```
PERIODIC_CHECK: Periodic check
    Regular verification of execution progress
WORKER_STALE: Worker stagnation check
    Worker reports that the task cannot proceed
WORKER_SUCCESS: Worker successful completion
    Worker reports task completion
    Need to verify completion quality
```
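One way to picture how a trigger might be chosen is sketched below. This is an assumption-heavy illustration: the function name, the `check_interval` parameter and its default, and the priority order (Worker reports before the periodic check) are all hypothetical. The outcome names `success` and `STALE_PROGRESS` are borrowed from the GET_ACTION step.

```python
from typing import Optional


def select_gate_trigger(worker_outcome: str, step: int, check_interval: int = 5) -> Optional[str]:
    """Pick a GateTrigger for the current step, or None if no check is due.

    check_interval is a hypothetical setting; the source does not say how
    often the periodic check runs.
    """
    if worker_outcome == "success":
        return "WORKER_SUCCESS"  # completion quality still needs verification
    if worker_outcome == "STALE_PROGRESS":
        return "WORKER_STALE"
    if step > 0 and step % check_interval == 0:
        return "PERIODIC_CHECK"
    return None
```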

### Task Termination Conditions

```
TaskStatus = rejected conditions:
    Manager planning attempts > 10 times
    current_step > N steps (timeout termination)
TaskStatus = fulfilled conditions:
    All subtask status = fulfilled
    FINAL_CHECK verification passed
    Expected target state achieved
```
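The termination conditions can be sketched as two predicates. This sketch assumes that either rejection condition alone is sufficient and that all fulfillment conditions must hold together; the source does not state how they combine. The step limit N is left as a parameter (`max_steps`) because no value is given, and all function names are hypothetical.

```python
from typing import Iterable

MAX_PLAN_ATTEMPTS = 10  # from "Manager planning attempts > 10"


def should_reject(plan_attempts: int, current_step: int, max_steps: int) -> bool:
    """True when either rejection condition is met (assumed to be an OR)."""
    return plan_attempts > MAX_PLAN_ATTEMPTS or current_step > max_steps


def should_fulfill(
    subtask_statuses: Iterable[str],
    final_check_passed: bool,
    target_state_reached: bool,
) -> bool:
    """True when every fulfillment condition holds (assumed to be an AND)."""
    all_subtasks_done = all(status == "fulfilled" for status in subtask_statuses)
    return all_subtasks_done and final_check_passed and target_state_reached
```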

### ExecStatus Handling

```
executed: Normal execution completion → Continue process
timeout: Execution timeout → Retry or re-plan
error: Execution error → Error handling, may need re-planning
pending: Currently executing
```
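A hedged sketch of this handling follows. The action labels (`"continue"`, `"retry"`, `"replan"`, `"handle_error"`, `"wait"`) and the `retries_left` parameter are hypothetical; the source only says that a timeout leads to a retry or a re-plan, without specifying how that choice is made.

```python
def handle_exec_status(status: str, retries_left: int = 0) -> str:
    """Return a hypothetical action label for an ExecStatus value."""
    if status == "executed":
        return "continue"
    if status == "timeout":
        # Assumption: retry while retries remain, otherwise re-plan.
        return "retry" if retries_left > 0 else "replan"
    if status == "error":
        return "handle_error"  # may in turn lead to re-planning
    if status == "pending":
        return "wait"  # command is still executing
    raise ValueError(f"unknown ExecStatus: {status!r}")
```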

## State Monitoring Mechanism

### SubtaskStatus Management

```
ready: Ready for execution, waiting
pending: Currently executing
fulfilled: Successfully completed
rejected: Execution failed
```

### State Transition Monitoring

```
System continuously monitors state changes at all levels:
    TaskStatus changes trigger global process adjustments
    SubtaskStatus changes affect current execution strategy
    ExecStatus changes determine immediate response measures
    All state changes are recorded in execution history
```
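Recording every state change in an execution history could look like the sketch below. The `StateHistory` class and its `update` method are hypothetical names for illustration, not the project's API; the sketch only records a change when the value actually differs.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class StateHistory:
    """Tracks current values per state level and logs each change."""
    values: Dict[str, str] = field(default_factory=dict)
    history: List[Tuple[str, Optional[str], str]] = field(default_factory=list)

    def update(self, level: str, new_value: str) -> bool:
        """Set a state value; record (level, old, new) if it changed."""
        old_value = self.values.get(level)
        if old_value == new_value:
            return False  # no change, nothing to record
        self.values[level] = new_value
        self.history.append((level, old_value, new_value))
        return True
```

For example, calling `update("TaskStatus", "pending")` and then `update("TaskStatus", "fulfilled")` leaves two entries in `history`, the second recording the transition from `pending` to `fulfilled`.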