sci-gui-agent-benchmark/desktop_env/server/README.md
Tianbao Xie fffa8f8da6 Refactoring VMware Integration and Implementing AWS Support (#44)
Co-authored-by: XinyuanWangCS <xywang626@gmail.com>
2024-06-15 20:52:29 +08:00

Server setup

This README explains how to set up your own machine for the environment. It is not yet complete; please contact the author if you need assistance.

Set up the OSWorld server service in the VM

  1. First please set up the environment:

    pip install -r requirements.txt
    

    If you customize the environment in this step, update the parameters in the service file (osworld_server.service, introduced in step 3) accordingly.

  2. Copy main.py and pyxcursor.py to /home/user-name, where user-name is your Ubuntu username (user by default). If you place these files elsewhere, update the parameters in the service file accordingly.

  3. Copy the osworld_server.service to the systemd configuration directory at /etc/systemd/system/:

    sudo cp osworld_server.service /etc/systemd/system/
    

    Reload the systemd daemon to recognize the new service:

    sudo systemctl daemon-reload
    

    Enable the service to start on boot:

    sudo systemctl enable osworld_server.service
    

    Start the service:

    sudo systemctl start osworld_server.service
    

    Verify the service is running correctly:

    sudo systemctl status osworld_server.service
    

    You should see output indicating the service is active and running. If there are errors, review the logs with journalctl -xe for further troubleshooting.

    If you need to make adjustments to the service configuration, you can edit the /etc/systemd/system/osworld_server.service file:

    sudo nano /etc/systemd/system/osworld_server.service
    

    After making changes, reload the daemon and restart the service:

    sudo systemctl daemon-reload
    sudo systemctl restart osworld_server.service
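
The status check in step 3 can also be scripted, for example from a provisioning script. The following is a minimal sketch, not part of the repository: the helper name `is_service_active` and its injectable `run` parameter are assumptions made for illustration, and it relies only on `systemctl is-active`, which prints `active` and exits with code 0 when the unit is running.

```python
import subprocess

SERVICE = "osworld_server.service"  # unit name used in the steps above


def is_service_active(service=SERVICE, run=subprocess.run):
    """Return True if `systemctl is-active` reports the unit as active.

    `run` is injectable (hypothetical convenience) so the helper can be
    exercised without a real systemd instance.
    """
    try:
        result = run(
            ["systemctl", "is-active", service],
            capture_output=True,
            text=True,
            check=False,  # a non-zero exit code just means "not active"
        )
    except FileNotFoundError:
        # systemctl is not installed on this machine
        return False
    return result.returncode == 0 and result.stdout.strip() == "active"
```

Calling `is_service_active()` inside the VM returns `True` once the service has started; if it returns `False`, the `journalctl -xe` output mentioned above is the place to look.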
    

Others

About the Converted Accessibility Tree

For some applications, such as Firefox and Thunderbird, you must first enable toolkit accessibility before their accessibility tree becomes visible:

gsettings set org.gnome.desktop.interface toolkit-accessibility true

Example of an accessibility tree (AT)

An example of a node:

<section xmlns:attr="uri:deskat:attributes.at-spi.gnome.org" attr:class="subject" st:enabled="true" cp:screencoord="(1525, 169)" cp:windowcoord="(342, 162)" cp:size="(327, 21)">
    歡迎使用新的 Outlook.com 帳戶
</section>

An example of a tree:

<desktop-frame ...>
    <application name="Thunderbird" ...>
        ... <!-- nodes of windows -->
    </application>
    ...
</desktop-frame>

Useful attributes

  1. name - the name of the application, the title of the window, or the name of a component
  2. attr:class - plays roughly the same role as class in HTML
  3. attr:id - plays roughly the same role as id in HTML
  4. cp:screencoord - absolute coordinates on the screen
  5. cp:windowcoord - coordinates relative to the window
  6. cp:size - the size of the component

A node can also carry states such as st:enabled and st:visible. A full state list is available at https://gitlab.gnome.org/GNOME/pyatspi2/-/blob/master/pyatspi/state.py?ref_type=heads.
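
As an illustration of how these attributes can be consumed, here is a minimal sketch that parses the example node with Python's standard library and derives a click point from it. Several things are assumptions for this sketch: the `st` and `cp` namespace URIs are placeholders (only the `attr` URI appears in the example above), the helper names are hypothetical, and `cp:size` is assumed to be `(width, height)`.

```python
import ast
import xml.etree.ElementTree as ET

# "attr" is copied from the example node; the "st" and "cp" URIs are
# placeholders assumed for this sketch, not the real ones.
NS = {
    "attr": "uri:deskat:attributes.at-spi.gnome.org",
    "st": "uri:example:state",      # assumed placeholder
    "cp": "uri:example:component",  # assumed placeholder
}

NODE_XML = (
    '<section xmlns:attr="{attr}" xmlns:st="{st}" xmlns:cp="{cp}" '
    'attr:class="subject" st:enabled="true" cp:screencoord="(1525, 169)" '
    'cp:windowcoord="(342, 162)" cp:size="(327, 21)">'
    "歡迎使用新的 Outlook.com 帳戶</section>"
).format(**NS)


def get_pair(node, prefix, name):
    """Read an attribute such as cp:size and parse "(a, b)" into a tuple."""
    raw = node.get("{%s}%s" % (NS[prefix], name))
    return ast.literal_eval(raw)


def center_on_screen(node):
    """Centre of the node on screen, assuming cp:size is (width, height)."""
    x, y = get_pair(node, "cp", "screencoord")
    w, h = get_pair(node, "cp", "size")
    return (x + w // 2, y + h // 2)


node = ET.fromstring(NODE_XML)
enabled = node.get("{%s}enabled" % NS["st"]) == "true"
print(enabled, center_on_screen(node))  # True (1688, 179)
```

`ast.literal_eval` is used rather than `eval` so that only literal tuples are accepted from the attribute text.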

How to use it in evaluation

See the example thunderbird/12086550-11c0-466b-b367-1d9e75b3910e.json and the function check_accessibility_tree in metrics/general.py. You can use a CSS selector or an XPath expression to reference target nodes. You can also check their text content.

An example of a CSS selector:

application[name=Thunderbird] page-tab-list[attr|id="tabmail-tabs"]>page-tab[name="About Profiles"]

This selector matches the page tab of the profile manager in Thunderbird, if it is open.

For CSS selector syntax, see https://www.w3.org/TR/selectors-3/. For XPath syntax, see https://www.w3.org/TR/xpath-31/.
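
To make the XPath option concrete, the sketch below expresses the same query as the CSS selector above and runs it with the limited XPath support in Python's standard library. It is not the check_accessibility_tree implementation; the miniature tree is hand-written for illustration (the `frame` element and the `Inbox` tab are invented), and only the `attr` namespace URI comes from the example node.

```python
import xml.etree.ElementTree as ET

ATTR_NS = "uri:deskat:attributes.at-spi.gnome.org"  # from the example node

# Hand-written miniature tree for illustration; a real tree has many more
# nodes and attributes.
TREE_XML = f"""
<desktop-frame xmlns:attr="{ATTR_NS}">
  <application name="Thunderbird">
    <frame name="Mozilla Thunderbird">
      <page-tab-list attr:id="tabmail-tabs">
        <page-tab name="Inbox"/>
        <page-tab name="About Profiles"/>
      </page-tab-list>
    </frame>
  </application>
</desktop-frame>
"""

# XPath counterpart of the CSS selector: the descendant combinator (space)
# becomes "//", and the child combinator ">" becomes "/".
XPATH = (
    ".//application[@name='Thunderbird']"
    "//page-tab-list[@attr:id='tabmail-tabs']"
    "/page-tab[@name='About Profiles']"
)

root = ET.fromstring(TREE_XML.strip())
matches = root.findall(XPATH, namespaces={"attr": ATTR_NS})
print([m.get("name") for m in matches])  # ['About Profiles']
```

An empty `matches` list corresponds to the "if it is open" caveat: the tab is simply absent from the tree when the profile manager is not open.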

Manual check

You can use accerciser to inspect the accessibility tree on a GNOME VM:

sudo apt install accerciser

Additional Installation

Activating the window manager control requires the installation of wmctrl:

sudo apt install wmctrl

To enable recording in the virtual machine, you need to install ffmpeg:

sudo apt install ffmpeg
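
After installing these packages, a quick way to confirm that the tools are on PATH is sketched below. The helper name `missing_tools` and its injectable `which` parameter are hypothetical; the tool list is simply the set of commands this README installs with apt.

```python
import shutil

# Command-line tools this README installs with apt.
REQUIRED_TOOLS = ("accerciser", "wmctrl", "ffmpeg")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


missing = missing_tools()
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All required tools are installed.")
```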