feat: add client password argument to multiple agents and scripts
- Introduced `--client_password` argument in `run_multienv_aguvis.py`, `run_multienv_claude.py`, and `run_multienv_gta1.py` for enhanced security and flexibility.
- Updated agent classes (`PromptAgent`, `AguvisAgent`, `GTA1Agent`) to accept and utilize `client_password` for improved configuration.
- Modified evaluation guidelines to reflect the new client password requirement.
- Ensured existing logic remains intact while enhancing functionality for better user experience.
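The change described above follows a common pattern: parse a CLI flag, then hand its value to the agent constructor. The sketch below is illustrative only and is not the repository's code. `SketchAgent`, `build_parser`, `make_agent`, and the empty-string default are all assumptions made for this example; only the flag name `--client_password` and the `--num_envs` flag come from the commit.

```python
import argparse


class SketchAgent:
    """Hypothetical stand-in for an agent class; not the repository's code."""

    def __init__(self, client_password: str = "") -> None:
        # Keep the password on the instance so later steps can read it.
        self.client_password = client_password


def build_parser() -> argparse.ArgumentParser:
    """Build a minimal parser with the flags used in this sketch."""
    parser = argparse.ArgumentParser(description="Sketch of a run_multienv-style launcher")
    parser.add_argument("--num_envs", type=int, default=1)
    parser.add_argument(
        "--client_password",
        type=str,
        default="",  # assumed default; the real scripts may use a different one
        help="Password configured on the client machine",
    )
    return parser


def make_agent(argv: list[str]) -> SketchAgent:
    """Parse argv and forward the password to the (hypothetical) agent."""
    args = build_parser().parse_args(argv)
    return SketchAgent(client_password=args.client_password)


if __name__ == "__main__":
    agent = make_agent(["--num_envs", "5", "--client_password", "osworld-public-evaluation"])
    print(agent.client_password)
```

Giving the constructor parameter a default keeps callers that do not pass a password working, which matches the commit's stated goal of leaving existing logic intact.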
@@ -270,6 +270,7 @@ Use the `run_multienv_xxx.py` scripts to launch tasks in parallel.
  Example (with the OpenAI CUA agent):

  ```bash
+ # --client_password set to the one you set to the client machine
  # Run OpenAI CUA
  python run_multienv_openaicua.py \
  --headless \
@@ -279,7 +280,8 @@ python run_multienv_openaicua.py \
  --test_all_meta_path evaluation_examples/test_all.json \
  --region us-east-1 \
  --max_steps 50 \
- --num_envs 5
+ --num_envs 5 \
+ --client_password osworld-public-evaluation

  # Run Anthropic (via AWS Bedrock), please modify agent if you want Anthropic endpoint
  python run_multienv_claude.py \
@@ -291,7 +293,8 @@ python run_multienv_claude.py \
  --test_all_meta_path evaluation_examples/test_all.json \
  --max_steps 50 \
  --num_envs 5 \
- --provider_name aws
+ --provider_name aws \
+ --client_password osworld-public-evaluation
  ```

  Key Parameters:
@@ -330,7 +333,7 @@ For more, see: [MONITOR_README](./monitor/README.md)
  ### 4.2 VNC Remote Desktop Access
  We pre-install vnc for every virtual machine so you can have a look on it during the running.
  You can access via VNC at`http://<client-public-ip>:5910/vnc.html`
- The password set default is `osworld-public-evaluation` to prevent attack.
+ The password set default is `osworld-public-evaluation` in our AMI to prevent attack.

  ## 5. Contact the team to update leaderboard and fix errors (optional)