# Server setup
This README explains how to set up your own machine for the environment. It is not yet complete; please contact the author if you need assistance.
## Set up the OSWorld server service in VM
1. First, set up the environment:
```shell
pip install -r requirements.txt
```
If you customize the environment in this step, update the parameters in the service file mentioned below accordingly.
2. Copy `main.py` and `pyxcursor.py` to `/home/user-name`, where `user-name` is your Ubuntu username (`user` by default). If you place these files elsewhere, update the parameters in the service file mentioned below accordingly.
3. Copy the `osworld_server.service` to the systemd configuration directory at `/etc/systemd/system/`:
```shell
sudo cp osworld_server.service /etc/systemd/system/
```
Reload the systemd daemon to recognize the new service:
```shell
sudo systemctl daemon-reload
```
Enable the service to start on boot:
```shell
sudo systemctl enable osworld_server.service
```
Start the service:
```shell
sudo systemctl start osworld_server.service
```
Verify the service is running correctly:
```shell
sudo systemctl status osworld_server.service
```
You should see output indicating the service is active and running. If there are errors, review the logs with `journalctl -xe` for further troubleshooting.
If you need to make adjustments to the service configuration, you can edit the `/etc/systemd/system/osworld_server.service` file:
```shell
sudo nano /etc/systemd/system/osworld_server.service
```
After making changes, reload the daemon and restart the service:
```shell
sudo systemctl daemon-reload
sudo systemctl restart osworld_server.service
```
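The status check above can also be scripted, for example from a provisioning script. The following is a minimal sketch, assuming `systemctl` is on the `PATH`; the helper name `is_service_active` and its `run` parameter are hypothetical and not part of the OSWorld code.
```python
import subprocess


def is_service_active(unit="osworld_server.service", run=subprocess.run):
    """Return True if `systemctl is-active` reports the unit as active.

    `run` is injectable (an assumption of this sketch) so the function can
    be exercised without a real systemd instance.
    """
    # `systemctl is-active` prints the unit state ("active", "inactive",
    # "failed", ...) and exits with status 0 only when the unit is active.
    result = run(
        ["systemctl", "is-active", unit],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.returncode == 0 and result.stdout.strip() == "active"
```
Calling `is_service_active()` inside the VM returns `True` once the service has started. If it returns `False`, `journalctl -xe` remains the place to look for the cause.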
<!-- vimc: call SyntaxRange#Include('```xml', '```', 'xml', 'NonText'): -->
<!-- vimc: call SyntaxRange#Include('```css', '```', 'css', 'NonText'): -->
<!-- vimc: call SyntaxRange#Include('```sh', '```', 'sh', 'NonText'): -->
<!-- vimc: call SyntaxRange#Include('```bash', '```', 'sh', 'NonText'): -->
## Others
### About the Converted Accessibility Tree
For some applications, such as Firefox and Thunderbird, you must first enable toolkit accessibility with
```sh
gsettings set org.gnome.desktop.interface toolkit-accessibility true
```
to see their accessibility tree.
#### Example of AT
An example of a node (its text content is Chinese for "Welcome to your new Outlook.com account"):
```xml
<section xmlns:attr="uri:deskat:attributes.at-spi.gnome.org" attr:class="subject" st:enabled="true" cp:screencoord="(1525, 169)" cp:windowcoord="(342, 162)" cp:size="(327, 21)">
歡迎使用新的 Outlook.com 帳戶
</section>
```
An example of a tree:
```xml
<desktop-frame ...>
<application name="Thunderbird" ...>
... <!-- nodes of windows -->
</application>
...
</desktop-frame>
```
#### Useful attributes
1. `name` - the name of the application, the title of the window, or the name
   of a component
2. `attr:class` - plays roughly the same role as `class` in HTML
3. `attr:id` - plays roughly the same role as `id` in HTML
4. `cp:screencoord` - absolute coordinates on the screen
5. `cp:windowcoord` - coordinates relative to the window
6. `cp:size` - the size of the component
Several states, such as `st:enabled` and `st:visible`, may also be present. A
full list of states is available at
<https://gitlab.gnome.org/GNOME/pyatspi2/-/blob/master/pyatspi/state.py?ref_type=heads>.
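The coordinate and size attributes are stored as strings, so they need parsing before use. The sketch below uses only the standard library and computes a component's center point on the screen. The `attr` namespace URI is taken from the example node above; the `st` and `cp` URIs are placeholders assumed for this sketch, and the helper names are hypothetical.
```python
import ast
import xml.etree.ElementTree as ET

# Placeholder namespace URIs, assumed for this sketch only.
ST_NS = "uri:example:state"
CP_NS = "uri:example:component"

NODE_XML = f"""
<section xmlns:attr="uri:deskat:attributes.at-spi.gnome.org"
         xmlns:st="{ST_NS}" xmlns:cp="{CP_NS}"
         attr:class="subject" st:enabled="true"
         cp:screencoord="(1525, 169)" cp:windowcoord="(342, 162)"
         cp:size="(327, 21)">
歡迎使用新的 Outlook.com 帳戶
</section>
"""


def parse_pair(value):
    """Turn a string such as "(1525, 169)" into a tuple of two ints."""
    first, second = ast.literal_eval(value)
    return int(first), int(second)


def center_on_screen(node):
    """Return the screen coordinates of the center of a node."""
    # ElementTree exposes namespaced attributes as "{uri}localname".
    x, y = parse_pair(node.get(f"{{{CP_NS}}}screencoord"))
    width, height = parse_pair(node.get(f"{{{CP_NS}}}size"))
    return x + width // 2, y + height // 2


node = ET.fromstring(NODE_XML.strip())
is_enabled = node.get(f"{{{ST_NS}}}enabled") == "true"
print(is_enabled, center_on_screen(node))  # True (1688, 179)
```
A center point like this is a typical target for a click action, but whether the benchmark computes it this way is not stated here.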
#### How to use it in evaluation
See the example `thunderbird/12086550-11c0-466b-b367-1d9e75b3910e.json` and the
function `check_accessibility_tree` in `metrics/general.py`. You can use a CSS
selector or an XPath expression to reference target nodes. You can also check
their text contents.
An example of a CSS selector:
```css
application[name=Thunderbird] page-tab-list[attr|id="tabmail-tabs"]>page-tab[name="About Profiles"]
```
This selector matches the profile manager's page tab in Thunderbird (if it is open).
For CSS selector usage, see <https://www.w3.org/TR/selectors-3/>. For XPath usage, see <https://www.w3.org/TR/xpath-31/>.
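To experiment with such queries outside the evaluator, the selector above can be approximated with the limited XPath subset in Python's standard `xml.etree.ElementTree`. This is only a sketch on a hand-written toy tree; it is not how `check_accessibility_tree` is implemented, and the function name `find_profile_tabs` is hypothetical.
```python
import xml.etree.ElementTree as ET

ATTR_NS = "uri:deskat:attributes.at-spi.gnome.org"

# A toy tree, written by hand for this sketch.
TREE_XML = f"""
<desktop-frame xmlns:attr="{ATTR_NS}">
  <application name="Thunderbird">
    <page-tab-list attr:id="tabmail-tabs">
      <page-tab name="Inbox"/>
      <page-tab name="About Profiles"/>
    </page-tab-list>
  </application>
  <application name="Firefox">
    <page-tab-list attr:id="other-tabs">
      <page-tab name="About Profiles"/>
    </page-tab-list>
  </application>
</desktop-frame>
"""


def find_profile_tabs(root):
    """Approximate the CSS selector shown above with ElementTree queries."""
    matches = []
    # Descendant step: any page-tab-list below the Thunderbird application.
    path = ".//application[@name='Thunderbird']//page-tab-list"
    for tab_list in root.findall(path):
        # The namespaced attribute is checked in Python via "{uri}id".
        if tab_list.get(f"{{{ATTR_NS}}}id") != "tabmail-tabs":
            continue
        # Child step: only direct page-tab children with the given name.
        matches.extend(tab_list.findall("page-tab[@name='About Profiles']"))
    return matches


root = ET.fromstring(TREE_XML.strip())
tabs = find_profile_tabs(root)
print([tab.get("name") for tab in tabs])  # ['About Profiles']
```
The Firefox tab with the same name is not matched, because the first step restricts the search to the Thunderbird application.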
#### Manual check
You can use accerciser to inspect the accessibility tree on a GNOME VM.
```sh
sudo apt install accerciser
```
### Additional Installation
Activating the window manager control requires the installation of `wmctrl`:
```bash
sudo apt install wmctrl
```
To enable recording in the virtual machine, you need to install `ffmpeg`:
```bash
sudo apt install ffmpeg
```
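To confirm that these tools are available before starting a run, a small check such as the following may help. It is a minimal sketch using the standard library's `shutil.which`; the names `REQUIRED_TOOLS` and `missing_tools` are hypothetical.
```python
import shutil

# Command-line tools named in this section.
REQUIRED_TOOLS = ("wmctrl", "ffmpeg")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on the PATH."""
    # shutil.which returns None when no matching executable is found.
    return [tool for tool in tools if which(tool) is None]
```
An empty list from `missing_tools()` means both executables were found on the `PATH`; any names returned still need to be installed with `apt`.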