Update README.md
34
README.md
@@ -1,7 +1,7 @@
# OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

<p align="center">
<a href="">Website</a> •
<a href="https://os-world.github.io/">Website</a> •
<a href="">Paper</a>
</p>

@@ -10,17 +10,16 @@

## Install
### Non-virtualized platform
Suppose you are on a system that has not yet been virtualized, meaning you are not on an AWS, Azure, or k8s virtualized environment.
Otherwise, refer to the [virtualized platform](https://github.com/xlang-ai/OSWorld?tab=readme-ov-file#virtualized-platform) part.
1. Install [VMware Work Station Pro](https://www.vmware.com/products/workstation-pro/workstation-pro-evaluation.html) (for Apple Chips, it should be [VMware Fusion](https://www.vmware.com/go/getfusion)) and configure `vmrun` command, and verify successful installation by:
Suppose you are operating on a system that has not been virtualized, meaning you are not utilizing a virtualized environment like AWS, Azure, or k8s. If this is the case, proceed with the instructions below. However, if you are on a virtualized platform, please refer to the [virtualized platform](https://github.com/xlang-ai/OSWorld?tab=readme-ov-file#virtualized-platform) section.

1. Install [VMware Workstation Pro](https://www.vmware.com/products/workstation-pro/workstation-pro-evaluation.html) (for systems with Apple Chips, you should install [VMware Fusion](https://www.vmware.com/go/getfusion)) and configure the `vmrun` command. Verify the successful installation by running the following:
```bash
vmrun -T ws list
```
If the installation along with the environment variable set is successful, you will see the message showing the current running virtual machines.

2. Install the environment package, and download the examples and the virtual machine image.
For x86_64 CPU Linux or Windows, you can install the environment package and download the examples and the virtual machine image by running the following commands:
Remove the `nogui` parameter if you want to see what happens in the virtual machine.
2. Install the environment package, download the examples, and obtain the virtual machine image. If you are using Linux or Windows with an x86_64 CPU, install the environment package and download the examples and the virtual machine image by executing the following commands:
Remove the `nogui` parameter if you wish to view the activities within the virtual machine.
```bash
git clone https://github.com/xlang-ai/OSWorld
cd OSWorld
@@ -30,7 +29,7 @@ vmrun -T ws start "Ubuntu/Ubuntu.vmx" nogui
vmrun -T ws snapshot "Ubuntu/Ubuntu.vmx" "init_state"
```

For Apple-chip macOS, you should install the specially prepared virtual machine image by running the following commands:
For macOS with Apple chips, you should install the specially prepared virtual machine image by executing the following commands:
```bash
gdown https://drive.google.com/drive/folders/xxx -O Ubuntu --folder
vmrun -T fusion start "Ubuntu/Ubuntu.vmx"
@@ -38,7 +37,7 @@ vmrun -T fusion snapshot "Ubuntu/Ubuntu.vmx" "init_state"
```
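
The two install paths above differ mainly in the `vmrun` host type: `ws` for x86_64 Linux or Windows, and `fusion` for Apple-chip macOS. As a rough sketch, a script could pick the host type and confirm that `vmrun` is reachable before starting the virtual machine. The helper names `vmrun_host_type` and `list_running_vms` are hypothetical and not part of OSWorld, and running `list` with the `fusion` host type is an assumption, since the README only shows `vmrun -T ws list`.

```python
import platform
import shutil
import subprocess


def vmrun_host_type(system=None, machine=None):
    """Return the `vmrun -T` host type for the current (or given) platform."""
    system = system or platform.system()
    machine = machine or platform.machine()
    # Assumption: Apple-chip Macs report Darwin/arm64 and use VMware Fusion;
    # every other platform is treated as VMware Workstation ("ws").
    if system == "Darwin" and machine == "arm64":
        return "fusion"
    return "ws"


def list_running_vms():
    """Run `vmrun -T <type> list`; return its output, or None if vmrun is missing or fails."""
    if shutil.which("vmrun") is None:
        return None  # vmrun is not on PATH, so the install is not configured yet
    result = subprocess.run(
        ["vmrun", "-T", vmrun_host_type(), "list"],
        capture_output=True, text=True, check=False,
    )
    return result.stdout if result.returncode == 0 else None


if __name__ == "__main__":
    output = list_running_vms()
    print(output if output is not None else "vmrun not found or failed")
```

Returning `None` instead of raising keeps the check usable as a simple precondition in a setup script.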

### Virtualized platform
We are working on supporting it👷, hold tight!
We are working on supporting it 👷. Please hold tight!

## Quick Start
Run the following minimal example to interact with the environment:
@@ -61,23 +60,28 @@ env = DesktopEnv(
obs = env.reset()
obs, reward, done, info = env.step("pyautogui.rightClick()")
```
You will see all the logs of the system running normally, including the successful creation of the environment, completion of setup, and successful execution of actions. In the end, you will observe a successful right-click on the screen, which means you are ready to go.
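
The `reset` and `step` calls above follow a familiar loop: reset once, then step until `done` is reported. The sketch below replays a list of action strings against any object with that interface. `StubEnv` and `run_actions` are hypothetical names used only for illustration. `StubEnv` merely imitates the four-value return of `env.step` shown above so the loop can run without a virtual machine, and it is not the real `DesktopEnv`.

```python
class StubEnv:
    """Hypothetical stand-in that mimics the reset/step shape of the example above."""

    def __init__(self, max_steps=3):
        self.max_steps = max_steps
        self.steps = 0

    def reset(self):
        self.steps = 0
        return {"step": 0}  # placeholder observation

    def step(self, action):
        self.steps += 1
        done = self.steps >= self.max_steps
        # Same four-value shape as in the example: obs, reward, done, info
        return {"step": self.steps}, 0.0, done, {"action": action}


def run_actions(env, actions):
    """Reset the environment, then apply actions in order until one reports done."""
    env.reset()
    history = []
    for action in actions:
        obs, reward, done, info = env.step(action)
        history.append((action, reward, done))
        if done:
            break
    return history


if __name__ == "__main__":
    for entry in run_actions(StubEnv(max_steps=2), ["pyautogui.rightClick()"] * 3):
        print(entry)
```

With a real environment, the same `run_actions` loop would presumably work unchanged as long as `step` keeps returning four values.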

## Run Benchmark
### Run the Baseline Agent
If you want to run the baseline agent we use in our paper, you can run the following command to run under the GPT-4V pure-screenshot setting as an example:
If you wish to run the baseline agent used in our paper, you can execute the following command as an example under the GPT-4V pure-screenshot setting:
```bash
python run.py --path_to_vm Ubuntu/Ubuntu.vmx --headless --observation_type screenshot --model gpt-4-vision-preview
python run.py --path_to_vm Ubuntu/Ubuntu.vmx --headless --observation_type screenshot --model gpt-4-vision-preview --result_dir ./results
```
The results, which include screenshots, actions, and video recordings of the agent's task completion, will be saved in the `./results` directory in this case. You can then run the following command to obtain the result:
```bash
python show_result.py
```
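
To launch the same run from a script, the flags above can be assembled into an argument list. This is a minimal sketch: `build_run_command` is a hypothetical helper, and it only uses flags that appear in the command above (`--path_to_vm`, `--headless`, `--observation_type`, `--model`, `--result_dir`).

```python
import sys


def build_run_command(path_to_vm, model, observation_type="screenshot",
                      result_dir="./results", headless=True):
    """Assemble the argument list for run.py from the flags shown above."""
    cmd = [sys.executable, "run.py", "--path_to_vm", path_to_vm]
    if headless:
        cmd.append("--headless")
    cmd += ["--observation_type", observation_type,
            "--model", model,
            "--result_dir", result_dir]
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it; pass the list to
    # subprocess.run(...) from the repository root to execute it.
    print(" ".join(build_run_command("Ubuntu/Ubuntu.vmx", "gpt-4-vision-preview")))
```

Building a list rather than a single shell string avoids quoting problems if the virtual machine path contains spaces.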

### Run Evaluation of Your Agent
Please first read through the [agent interface](https://github.com/xlang-ai/OSWorld/blob/main/mm_agents/README.md) and the [environment interface](https://github.com/xlang-ai/OSWorld/blob/main/desktop_env/README.md).
Implement the agent interface correctly and import your customized one in the `run.py` file.
Then, you can run a similar command as the previous section to run the benchmark on your agent.
Please start by reading through the [agent interface](https://github.com/xlang-ai/OSWorld/blob/main/mm_agents/README.md) and the [environment interface](https://github.com/xlang-ai/OSWorld/blob/main/desktop_env/README.md).
Correctly implement the agent interface and import your customized version in the `run.py` file.
Afterward, you can execute a command similar to the one in the previous section to run the benchmark on your agent.

## Citation
If you find this environment useful, please consider citing our work:
```
@article{DesktopEnv,
@article{OSWorld,
title={},
author={},
journal={arXiv preprint arXiv:xxxx.xxxx},