Merge pull request #212 from yuanmengqi/aws_clean

AWS OSWorld Provider Enhancement, Proxy Integration, new Agent Operator Implementation
Yuan Mengqi
2025-06-10 21:44:18 +08:00
committed by GitHub
432 changed files with 9674 additions and 2504 deletions

5
.gitignore vendored

@@ -190,6 +190,11 @@ test2.xlsx
docker_vm_data
vmware_vm_data
.vmware*
.aws*
# result
**/result*/**/*
.vscode
dataimpulse_proxy_config.json


@@ -113,6 +113,62 @@ To configure the OAuth2.0 screen for the created GCP. Go to page [OAuth consent
<img src="assets/authorization.png" width="45%" alt="Authorization">
</p>
## Generating `credentials.json` for Public Eval (Optional)
If you are using **Public Eval** for evaluation, you need to complete the OAuth2 authorization **locally**, and then upload the generated `credentials.json` file to your AWS Host instance.
Please follow the steps below:
1. Run the following Python script on your **local machine** (make sure you have installed `google-auth-oauthlib` and `google-auth`):
```python
from google_auth_oauthlib.flow import InstalledAppFlow
import json
SCOPES = ['https://www.googleapis.com/auth/drive']
flow = InstalledAppFlow.from_client_secrets_file(
'client_secrets.json',
scopes=SCOPES
)
credentials = flow.run_local_server(port=0)
with open('credentials.json', 'w') as f:
f.write(credentials.to_json())
print("OAuth2 credentials have been saved to credentials.json.")
```
> ⚠️ **Note**: This script will open a browser window for you to log in and authorize access using the configured Google account.
After successful authorization, a `credentials.json` file will be generated on your local machine.
2. Upload this file to your AWS Host instance and place it at the following path:
```
OSWorld/evaluation_examples/settings/googledrive/credentials.json
```
After the setup, your directory structure on the AWS Host should look like this:
```
evaluation_examples/
├── examples/
├── examples_windows/
├── settings/
├── google/
│ ├── settings.json
│ ├── settings.json.template
├── googledrive/
│ ├── client_secrets.json
│ ├── credentials.json
│ ├── settings.yml
└── thunderbird/
```
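Before starting an evaluation, the uploaded file can be sanity-checked on the Host. The following is a minimal sketch, not part of OSWorld: `check_credentials` is a hypothetical helper, and the list of expected fields is an assumption about what `credentials.to_json()` writes.

```python
import json
from pathlib import Path
from typing import List

# Path from the guide above, relative to the directory that contains OSWorld/.
CREDENTIALS_PATH = Path("OSWorld/evaluation_examples/settings/googledrive/credentials.json")

# Assumption: these fields are present in the JSON written by credentials.to_json().
EXPECTED_FIELDS = ("client_id", "client_secret", "refresh_token")


def check_credentials(path: Path = CREDENTIALS_PATH) -> List[str]:
    """Return a list of problems with the credentials file (empty if none found)."""
    if not path.is_file():
        return [f"{path} does not exist"]
    try:
        data = json.loads(path.read_text())
    except json.JSONDecodeError as exc:
        return [f"{path} is not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return [f"{path} does not contain a JSON object"]
    # Report every expected field that is absent or empty.
    return [f"missing field: {key}" for key in EXPECTED_FIELDS if not data.get(key)]


if __name__ == "__main__":
    problems = check_credentials()
    print("credentials.json looks usable" if not problems else "\n".join(problems))
```

An empty result only means the file is in place and well-formed; it does not prove that Google will still accept the token.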
## Potential Issues
Because of strict checks by Google's safety teams, Google may still flag your account as risky even after 2-step verification is turned off, especially __when you frequently change the login device__. You may encounter the following issues:


@@ -0,0 +1,175 @@
# Public Evaluation Platform User Guide
We have built an AWS-based platform for large-scale parallel evaluation of OSWorld tasks. The system follows a Host-Client architecture:
- **Host Instance**: The central controller that stores code, configurations, and manages task execution.
- **Client Instances**: Worker nodes automatically launched to perform tasks in parallel.
All instances use a preconfigured AMI to ensure a consistent environment.
## 1. Platform Deployment & Connection
### 1.1 Launch the Host Instance
Create an EC2 instance in the AWS Console with the following settings:
| Configuration Item | Value |
| -------------------------- | ------------------------------------------------------------ |
| AMI ID | `ami-0e49e0a70044dde43` |
| Instance Type | - `t3.medium` (Recommended for ≤5 parallel tasks)<br />- `t3.large` (Recommended for ≤15 parallel tasks)<br /><br /> - These numbers assume you work through VSCode over SSH. Running from the CLI uses fewer resources, so `t3.large` can then support up to 20 tasks.<br /> - For higher parallelism, use a more powerful instance. |
| VPC | `vpc-0f207282fe145bcda` |
| Subnet | `subnet-0a4b0c5b8f6066712` |
| Firewall (security groups) | `sg-05f8e79c10a7768e4` |
| Storage | 50GB<br /> - Consider increasing if storing multiple results to avoid crashes. |
Once launched, you will receive an instance ID like `i-xxxxxx`.
### 1.2 Connect to the Host Instance
#### Step 1: Prepare Your SSH Key
* When launching the instance, choose "Create new key pair" and download the `.pem` file (e.g. `osworld-host-key.pem`). Save it locally.
<p align="center">
<img src="./assets/pubeval1.png" alt="pubeval1" style="width:80%;" />
</p>
* Set appropriate permissions:
```bash
chmod 400 <your_key_file_path>
```
* Find your instance's **public IP** and **DNS**:
- Go to the EC2 **Instances** page on the AWS Console.
- Locate your Host instance by its ID.
<p align="center">
<img src="./assets/pubeval2.png" alt="pubeval2" style="width:80%;" />
</p>
* On the instance detail page:
- **Public IP/DNS**: used for browser/VNC access and SSH connection
- **Instance metadata**: e.g. storage, can be adjusted post-launch
<p align="center">
<img src="./assets/pubeval3.png" alt="pubeval3" style="width:80%;" />
</p>
#### Step 2: Connect via SSH or VSCode
* SSH:
```bash
ssh -i <your_key_path> ubuntu@<your_public_dns>
```
* VSCode Remote SSH configuration:
```
Host host_example
HostName <your_public_dns>
User ubuntu
IdentityFile <your_key_path>
```
### 1.3 Get Your AWS Access Key ID & Secret Access Key
Click on **Security Credentials** from the drop-down menu under your account in the top-right corner.
<p align="center">
<img src="./assets/pubeval4.png" alt="pubeval4" style="width: 25%;" />
</p>
In the **Access keys** section, click **"Create access key"** to generate your own key.
<p align="center">
<img src="./assets/pubeval5.png" alt="pubeval5" style="width: 100%;" />
</p>
## 2. Environment Setup
### 2.1 Google Drive Integration
Follow the instructions in [ACCOUNT_GUIDELINE.md](./ACCOUNT_GUIDELINE.md), specifically the section "Generating `credentials.json` for Public Eval". This step is required when using public evaluation.
### 2.2 Proxy Setup
- Register at [DataImpulse](https://dataimpulse.com/).
- Configure your credentials in `OSWorld/evaluation_examples/settings/proxy/dataimpulse.json`:
```json
[
{
"host": "gw.dataimpulse.com",
"port": 823,
"username": "your_username",
"password": "your_password",
"protocol": "http",
"provider": "dataimpulse",
"type": "residential",
"country": "US",
"note": "Dataimpulse Residential Proxy"
}
]
```
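A quick way to catch typos in this file is to load it and build the proxy URL each entry describes. This is a sketch under assumptions: `load_proxies` and `proxy_url` are hypothetical helpers, the set of required keys is a guess based on the example above, and the `protocol://user:password@host:port` form is the conventional proxy URL layout rather than a confirmed copy of what the proxy pool produces.

```python
import json
from typing import Any, Dict, List
from urllib.parse import quote

# Assumption: these are the keys needed to build a usable proxy URL.
REQUIRED_KEYS = ("host", "port", "username", "password", "protocol")


def load_proxies(text: str) -> List[Dict[str, Any]]:
    """Parse the JSON list and check that every entry has the required keys."""
    entries = json.loads(text)
    if not isinstance(entries, list):
        raise ValueError("proxy config must be a JSON list of objects")
    for index, entry in enumerate(entries):
        missing = [key for key in REQUIRED_KEYS if key not in entry]
        if missing:
            raise ValueError(f"entry {index} is missing keys: {missing}")
    return entries


def proxy_url(entry: Dict[str, Any]) -> str:
    """Build protocol://user:password@host:port, percent-encoding the credentials."""
    user = quote(str(entry["username"]), safe="")
    password = quote(str(entry["password"]), safe="")
    return f"{entry['protocol']}://{user}:{password}@{entry['host']}:{entry['port']}"


if __name__ == "__main__":
    path = "OSWorld/evaluation_examples/settings/proxy/dataimpulse.json"
    with open(path) as f:
        for entry in load_proxies(f.read()):
            # Print only host and port so the password never reaches the terminal.
            print(f"{entry['host']}:{entry['port']} parsed OK")
```

Percent-encoding the credentials matters when a password contains characters such as `@` or `:`, which would otherwise change how the URL is split.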
### 2.3 Set Environment Variables
```bash
export OPENAI_API_KEY_CUA="your_api_key"
export AWS_ACCESS_KEY_ID="your_access_key"
export AWS_SECRET_ACCESS_KEY="your_secret_access_key"
export AWS_REGION="your_aws_region" # e.g. us-east-1
export AWS_SUBNET_ID="subnet-0a4b0c5b8f6066712"
export AWS_SECURITY_GROUP_ID="sg-08a53433e9b4abde6"
```
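Since the AWS provider refuses to start when `AWS_REGION`, `AWS_SUBNET_ID`, or `AWS_SECURITY_GROUP_ID` is unset, it can help to check all six variables up front. A minimal sketch; `missing_vars` is a hypothetical helper, not part of the repository:

```python
import os
from typing import List, Mapping

# The variables exported in the snippet above.
REQUIRED_VARS = (
    "OPENAI_API_KEY_CUA",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_REGION",
    "AWS_SUBNET_ID",
    "AWS_SECURITY_GROUP_ID",
)


def missing_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```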
## 3. Running Evaluations
Use the `run_multienv_xxx.py` scripts to launch tasks in parallel.
Example (with the OpenAI CUA agent):
```bash
python run_multienv_openaicua.py \
--headless \
--observation_type screenshot \
--model computer-use-preview \
--result_dir ./results_all \
--test_all_meta_path evaluation_examples/test_all.json \
--region us-east-1 \
--max_steps 150 \
--num_envs 5
```
Key Parameters:
- `--num_envs`: Number of parallel environments
- `--max_steps`: Max steps per task
- `--result_dir`: Output directory for results
- `--test_all_meta_path`: Path to the test set metadata
- `--region`: AWS region
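To make the flag types concrete, here is a hypothetical `argparse` parser that mirrors only the options shown in the example command. It is an illustration, not the parser used by the `run_multienv_*.py` scripts, which may define more options and different defaults; the defaults below are simply the values from the example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering only the flags from the example command."""
    parser = argparse.ArgumentParser(description="Sketch of the evaluation CLI flags")
    parser.add_argument("--headless", action="store_true", help="run without a display")
    parser.add_argument("--observation_type", default="screenshot")
    parser.add_argument("--model", default="computer-use-preview")
    parser.add_argument("--result_dir", default="./results_all", help="output directory for results")
    parser.add_argument("--test_all_meta_path", default="evaluation_examples/test_all.json",
                        help="path to the test set metadata")
    parser.add_argument("--region", default="us-east-1", help="AWS region")
    parser.add_argument("--max_steps", type=int, default=150, help="max steps per task")
    parser.add_argument("--num_envs", type=int, default=5, help="number of parallel environments")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"{args.num_envs} environments, up to {args.max_steps} steps each, region {args.region}")
```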
## 4. Viewing Results
### 4.1 Web Monitoring Tool
```bash
cd monitor
pip install -r requirements.txt
python main.py
```
Then open your Host's **public IP** on port `8080` in a browser (e.g. `http://<host-public-ip>:8080`).
For more details, see [monitor/README.md](./monitor/README.md).
### 4.2 VNC Remote Desktop Access
You can also access Client instances via VNC at `http://<client-public-ip>:5090/vnc.html`.

assets/pubeval1.png (new binary file, 59 KiB, not shown)
assets/pubeval2.png (new binary file, 174 KiB, not shown)
assets/pubeval3.png (new binary file, 309 KiB, not shown)
assets/pubeval4.png (new binary file, 69 KiB, not shown)
assets/pubeval5.png (new binary file, 53 KiB, not shown)

54
aws/README.md Normal file

@@ -0,0 +1,54 @@
# AWS CLI v2
This bundle contains a built executable of the AWS CLI v2.
## Installation
To install the AWS CLI v2, run the `install` script:
```
$ sudo ./install
You can now run: /usr/local/bin/aws --version
```
This will install the AWS CLI v2 at `/usr/local/bin/aws`. Assuming
`/usr/local/bin` is on your `PATH`, you can now run:
```
$ aws --version
```
### Installing without sudo
If you don't have ``sudo`` permissions or want to install the AWS
CLI v2 only for the current user, run the `install` script with the `-b`
and `-i` options:
```
$ ./install -i ~/.local/aws-cli -b ~/.local/bin
```
This will install the AWS CLI v2 in `~/.local/aws-cli` and create
symlinks for `aws` and `aws_completer` in `~/.local/bin`. For more
information about these options, run the `install` script with `-h`:
```
$ ./install -h
```
### Updating
If you run the `install` script and there is a previously installed version
of the AWS CLI v2, the script will error out. To update to the version included
in this bundle, run the `install` script with `--update`:
```
$ sudo ./install --update
```
### Removing the installation
To remove the AWS CLI v2, delete its installation and symlinks:
```
$ sudo rm -rf /usr/local/aws-cli
$ sudo rm /usr/local/bin/aws
$ sudo rm /usr/local/bin/aws_completer
```
Note that if you installed the AWS CLI v2 using the `-b` or `-i` options, you will
need to remove the installation and the symlinks in the directories you
specified.

1468
aws/THIRD_PARTY_LICENSES Normal file

File diff suppressed because it is too large

155
aws/install Executable file

@@ -0,0 +1,155 @@
#!/bin/sh
# Copyright 2012-2019 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"). You
# may not use this file except in compliance with the License. A copy of
# the License is located at
#
# http://aws.amazon.com/apache2.0/
#
# or in the "license" file accompanying this file. This file is
# distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
# ANY KIND, either express or implied. See the License for the specific
# language governing permissions and limitations under the License.
usage() {
cat 1>&2 <<EOF
Installs the AWS CLI v2
USAGE:
install [FLAGS] [OPTIONS]
FLAGS:
-u, --update Updates the AWS CLI v2 if a different version
is previously installed. By default, this script
will not update the AWS CLI if a previous
installation is detected.
-h, --help Prints help information
OPTIONS:
-i, --install-dir <path> The directory to install the AWS CLI v2. By
default, this directory is: /usr/local/aws-cli
-b, --bin-dir <path> The directory to store symlinks to executables
for the AWS CLI v2. By default, the directory
used is: /usr/local/bin
EOF
}
parse_commandline() {
while test $# -gt 0
do
key="$1"
case "$key" in
-i|--install-dir)
PARSED_INSTALL_DIR="$2"
shift
;;
-b|--bin-dir)
PARSED_BIN_DIR="$2"
shift
;;
-u|--update)
PARSED_UPGRADE="yes"
;;
-h|--help)
usage
exit 0
;;
*)
die "Got an unexpected argument: $1"
;;
esac
shift
done
}
set_global_vars() {
ROOT_INSTALL_DIR=${PARSED_INSTALL_DIR:-/usr/local/aws-cli}
BIN_DIR=${PARSED_BIN_DIR:-/usr/local/bin}
UPGRADE=${PARSED_UPGRADE:-no}
EXE_NAME="aws"
COMPLETER_EXE_NAME="aws_completer"
INSTALLER_DIR="$( cd "$( dirname "$0" )" >/dev/null 2>&1 && pwd )"
INSTALLER_DIST_DIR="$INSTALLER_DIR/dist"
INSTALLER_EXE="$INSTALLER_DIST_DIR/$EXE_NAME"
AWS_EXE_VERSION=$($INSTALLER_EXE --version | cut -d ' ' -f 1 | cut -d '/' -f 2)
INSTALL_DIR="$ROOT_INSTALL_DIR/v2/$AWS_EXE_VERSION"
INSTALL_DIR="$INSTALL_DIR"
INSTALL_DIST_DIR="$INSTALL_DIR/dist"
INSTALL_BIN_DIR="$INSTALL_DIR/bin"
INSTALL_AWS_EXE="$INSTALL_BIN_DIR/$EXE_NAME"
INSTALL_AWS_COMPLETER_EXE="$INSTALL_BIN_DIR/$COMPLETER_EXE_NAME"
CURRENT_INSTALL_DIR="$ROOT_INSTALL_DIR/v2/current"
CURRENT_AWS_EXE="$CURRENT_INSTALL_DIR/bin/$EXE_NAME"
CURRENT_AWS_COMPLETER_EXE="$CURRENT_INSTALL_DIR/bin/$COMPLETER_EXE_NAME"
BIN_AWS_EXE="$BIN_DIR/$EXE_NAME"
BIN_AWS_COMPLETER_EXE="$BIN_DIR/$COMPLETER_EXE_NAME"
}
create_install_dir() {
mkdir -p "$INSTALL_DIR" || exit 1
{
setup_install_dist &&
setup_install_bin &&
create_current_symlink
} || {
rm -rf "$INSTALL_DIR"
exit 1
}
}
check_preexisting_install() {
if [ -L "$CURRENT_INSTALL_DIR" ] && [ "$UPGRADE" = "no" ]
then
die "Found preexisting AWS CLI installation: $CURRENT_INSTALL_DIR. Please rerun install script with --update flag."
fi
if [ -d "$INSTALL_DIR" ]
then
echo "Found same AWS CLI version: $INSTALL_DIR. Skipping install."
exit 0
fi
}
setup_install_dist() {
cp -r "$INSTALLER_DIST_DIR" "$INSTALL_DIST_DIR"
}
setup_install_bin() {
mkdir -p "$INSTALL_BIN_DIR"
ln -s "../dist/$EXE_NAME" "$INSTALL_AWS_EXE"
ln -s "../dist/$COMPLETER_EXE_NAME" "$INSTALL_AWS_COMPLETER_EXE"
}
create_current_symlink() {
ln -snf "$INSTALL_DIR" "$CURRENT_INSTALL_DIR"
}
create_bin_symlinks() {
mkdir -p "$BIN_DIR"
ln -sf "$CURRENT_AWS_EXE" "$BIN_AWS_EXE"
ln -sf "$CURRENT_AWS_COMPLETER_EXE" "$BIN_AWS_COMPLETER_EXE"
}
die() {
err_msg="$1"
echo "$err_msg" >&2
exit 1
}
main() {
parse_commandline "$@"
set_global_vars
check_preexisting_install
create_install_dir
create_bin_symlinks
echo "You can now run: $BIN_AWS_EXE --version"
exit 0
}
main "$@" || exit 1


@@ -21,11 +21,20 @@ from requests_toolbelt.multipart.encoder import MultipartEncoder
from desktop_env.controllers.python import PythonController
from desktop_env.evaluators.metrics.utils import compare_urls
from desktop_env.providers.aws.proxy_pool import get_global_proxy_pool, init_proxy_pool, ProxyInfo
import dotenv
# Load environment variables from .env file
dotenv.load_dotenv()
CLIENT_PASSWORD = os.getenv("CLIENT_PASSWORD", "osworld-public-evaluation") # Default password for sudo operations
PROXY_CONFIG_FILE = os.getenv("PROXY_CONFIG_FILE", "evaluation_examples/settings/proxy/dataimpulse.json") # Default proxy config file
logger = logging.getLogger("desktopenv.setup")
FILE_PATH = os.path.dirname(os.path.abspath(__file__))
init_proxy_pool(PROXY_CONFIG_FILE) # initialize the global proxy pool
class SetupController:
def __init__(self, vm_ip: str, server_port: int = 5000, chromium_port: int = 9222, vlc_port: int = 8080, cache_dir: str = "cache"):
@@ -40,7 +49,7 @@ class SetupController:
def reset_cache_dir(self, cache_dir: str):
self.cache_dir = cache_dir
def setup(self, config: List[Dict[str, Any]]):
def setup(self, config: List[Dict[str, Any]])-> bool:
"""
Args:
config (List[Dict[str, Any]]): list of dict like {str: Any}. each
@@ -51,7 +60,22 @@ class SetupController:
"parameters": dict like {str: Any} providing the keyword
parameters
}
"""
"""
# make sure connection can be established
logger.info(f"try to connect {self.http_server}")
retry = 0
while retry < 50:
try:
_ = requests.get(self.http_server + "/terminal")
break
except Exception: # avoid a bare except, which would also swallow KeyboardInterrupt
time.sleep(5)
retry += 1
logger.info(f"retry: {retry}/50")
if retry == 50:
return False
for cfg in config:
config_type: str = cfg["type"]
@@ -61,9 +85,12 @@ class SetupController:
# protocol
setup_function: str = "_{:}_setup".format(config_type)
assert hasattr(self, setup_function), f'Setup controller cannot find init function {setup_function}'
logger.info(f"call function {setup_function}")
getattr(self, setup_function)(**parameters)
logger.info("SETUP: %s(%s)", setup_function, str(parameters))
return True
def _download_setup(self, files: List[Dict[str, str]]):
"""
@@ -74,12 +101,6 @@ class SetupController:
"path": str, the path on the VM to store the downloaded file
}
"""
# if not config:
# return
# if not 'download' in config:
# return
# for url, path in config['download']:
for f in files:
url: str = f["url"]
path: str = f["path"]
@@ -110,7 +131,7 @@ class SetupController:
logger.error(
f"Failed to download {url} caused by {e}. Retrying... ({max_retries - i - 1} attempts left)")
if not downloaded:
raise requests.RequestException(f"Failed to download {url}. No retries left. Error: {e}")
raise requests.RequestException(f"Failed to download {url}. No retries left.")
form = MultipartEncoder({
"file_path": path,
@@ -166,12 +187,6 @@ class SetupController:
logger.error("An error occurred while trying to send the request: %s", e)
def _change_wallpaper_setup(self, path: str):
# if not config:
# return
# if not 'wallpaper' in config:
# return
# path = config['wallpaper']
if not path:
raise Exception(f"Setup Wallpaper - Invalid path ({path}).")
@@ -194,11 +209,6 @@ class SetupController:
raise NotImplementedError()
def _open_setup(self, path: str):
# if not config:
# return
# if not 'open' in config:
# return
# for path in config['open']:
if not path:
raise Exception(f"Setup Open - Invalid path ({path}).")
@@ -229,6 +239,7 @@ class SetupController:
headers = {"Content-Type": "application/json"}
try:
logger.info("REQUEST ADDRESS: %s", self.http_server + "/setup" + "/launch")
response = requests.post(self.http_server + "/setup" + "/launch", headers=headers, data=payload)
if response.status_code == 200:
logger.info("Command executed successfully: %s", response.text)
@@ -348,6 +359,51 @@ class SetupController:
except requests.exceptions.RequestException as e:
logger.error("An error occurred while trying to send the request: %s", e)
def _proxy_setup(self, client_password: str = CLIENT_PASSWORD):
"""Setup system-wide proxy configuration using proxy pool
Args:
client_password (str): Password for sudo operations, defaults to CLIENT_PASSWORD (the environment variable of the same name, falling back to "osworld-public-evaluation")
"""
# Get proxy from global proxy pool
proxy_pool = get_global_proxy_pool()
current_proxy = proxy_pool.get_next_proxy()
if not current_proxy:
logger.error("No proxy available from proxy pool")
raise Exception("No proxy available from proxy pool")
# Format proxy URL
proxy_url = proxy_pool._format_proxy_url(current_proxy)
logger.info(f"Setting up proxy: {current_proxy.host}:{current_proxy.port}")
# Configure system proxy environment variables
proxy_commands = [
f"echo '{client_password}' | sudo -S bash -c \"echo 'export http_proxy={proxy_url}' >> /etc/environment\"",
f"echo '{client_password}' | sudo -S bash -c \"echo 'export https_proxy={proxy_url}' >> /etc/environment\"",
f"echo '{client_password}' | sudo -S bash -c \"echo 'export HTTP_PROXY={proxy_url}' >> /etc/environment\"",
f"echo '{client_password}' | sudo -S bash -c \"echo 'export HTTPS_PROXY={proxy_url}' >> /etc/environment\"",
]
# Execute all proxy configuration commands
for cmd in proxy_commands:
try:
self._execute_setup([cmd], shell=True)
except Exception as e:
logger.error(f"Failed to execute proxy setup command: {e}")
proxy_pool.mark_proxy_failed(current_proxy)
raise
# Reload environment variables
reload_cmd = "source /etc/environment"
try:
logger.info(f"Proxy setup completed successfully for {current_proxy.host}:{current_proxy.port}")
proxy_pool.mark_proxy_success(current_proxy)
except Exception as e:
logger.error(f"Failed to reload environment variables: {e}")
proxy_pool.mark_proxy_failed(current_proxy)
raise
# Chrome setup
def _chrome_open_tabs_setup(self, urls_to_open: List[str]):
host = self.vm_ip
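
The connection check added to `setup` (up to 50 attempts, 5 seconds apart, returning `False` on exhaustion) can be sketched as a standalone helper. This is an illustration rather than the committed code: `wait_until_reachable` and its `probe` and `sleep` parameters are hypothetical names introduced so the loop can be exercised without a VM.

```python
import time
from typing import Callable


def wait_until_reachable(
    probe: Callable[[], object],
    max_retries: int = 50,
    delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe() until it stops raising; give up after max_retries attempts."""
    for attempt in range(1, max_retries + 1):
        try:
            probe()
            return True
        except Exception:
            # The server is not up yet: wait, then try again.
            print(f"retry: {attempt}/{max_retries}")
            sleep(delay)
    return False


if __name__ == "__main__":
    # Stand-in probe that always succeeds; a real one would issue an HTTP request.
    print(wait_until_reachable(lambda: None))
```

With the `requests` library, a probe could be `lambda: requests.get(url, timeout=10)`. The `timeout` is an addition in this sketch; the hunk above calls `requests.get` without one, so a hung connection would block that attempt indefinitely.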


@@ -18,12 +18,12 @@ logger = logging.getLogger("desktopenv.env")
Metric = Callable[[Any, Any], float]
Getter = Callable[[gym.Env, Dict[str, Any]], Any]
MAX_RETRIES = 5
class DesktopEnv(gym.Env):
"""
DesktopEnv with OpenAI Gym interface. It provides a desktop environment for setting and evaluating desktop automation tasks.
"""
def __init__(
self,
provider_name: str = "vmware",
@@ -37,6 +37,7 @@ class DesktopEnv(gym.Env):
require_a11y_tree: bool = True,
require_terminal: bool = False,
os_type: str = "Ubuntu",
enable_proxy: bool = False,
):
"""
Args:
@@ -51,16 +52,23 @@ class DesktopEnv(gym.Env):
headless (bool): whether to run the VM in headless mode
require_a11y_tree (bool): whether to require accessibility tree
require_terminal (bool): whether to require terminal output
os_type (str): operating system type, default to "Ubuntu"
enable_proxy (bool): whether to enable proxy support, default to False
"""
# Initialize VM manager and virtualization provider
self.region = region
self.provider_name = provider_name
self.enable_proxy = enable_proxy # Store proxy enablement setting
# Default
# Default
self.server_port = 5000
self.chromium_port = 9222
self.vnc_port = 8006
self.vlc_port = 8080
self.manager, self.provider = create_vm_manager_and_provider(provider_name, region)
# Initialize with default (no proxy) provider
self.current_use_proxy = False
self.manager, self.provider = create_vm_manager_and_provider(provider_name, region, use_proxy=False)
self.os_type = os_type
@@ -69,30 +77,41 @@ class DesktopEnv(gym.Env):
self.path_to_vm = os.path.abspath(os.path.expandvars(os.path.expanduser(path_to_vm))) \
if provider_name in {"vmware", "virtualbox"} else path_to_vm
else:
self.path_to_vm = self.manager.get_vm_path(self.os_type, region)
self.snapshot_name = snapshot_name
self.cache_dir_base: str = cache_dir
# todo: add the logic to get the screen size from the VM
self.headless = headless
self.require_a11y_tree = require_a11y_tree
self.require_terminal = require_terminal
self.path_to_vm = self.manager.get_vm_path(os_type=self.os_type, region=region)
try:
self.snapshot_name = snapshot_name
self.cache_dir_base: str = cache_dir
# todo: add the logic to get the screen size from the VM
self.headless = headless
self.require_a11y_tree = require_a11y_tree
self.require_terminal = require_terminal
# Initialize emulator and controller
if provider_name != "docker": # Check if this is applicable to other VM providers
logger.info("Initializing...")
self._start_emulator()
# Initialize emulator and controller
if provider_name != "docker": # Check if this is applicable to other VM providers
logger.info("Initializing...")
self._start_emulator()
# mode: human or machine
self.instruction = None
assert action_space in ["computer_13", "pyautogui"]
self.action_space = action_space # todo: refactor it to the ActType
# mode: human or machine
self.instruction = None
assert action_space in ["computer_13", "pyautogui"]
self.action_space = action_space # todo: refactor it to the ActType
# episodic stuffs, like counters, will be updated or reset
# when calling self.reset()
self._traj_no: int = -1
self._step_no: int = 0
self.action_history: List[Dict[str, any]] = []
# episodic stuffs, like counters, will be updated or reset
# when calling self.reset()
self._traj_no: int = -1
self._step_no: int = 0
self.action_history: List[Dict[str, any]] = []
except Exception as e:
logger.error(f"Failed to initialize DesktopEnv: {e}")
# If initialization fails, we should clean up the VM
try:
self.close()
self.manager.delete_vm(self.path_to_vm, self.region)
logger.info(f"Cleaned up VM {self.path_to_vm}.")
except Exception as cleanup_error:
logger.error(f"Failed to clean up VM {self.path_to_vm}: {cleanup_error}")
raise
def _start_emulator(self):
# Power on the virtual machine
@@ -114,7 +133,8 @@ class DesktopEnv(gym.Env):
# due to the fact it could be changed when implemented by cloud services
path_to_vm = self.provider.revert_to_snapshot(self.path_to_vm, self.snapshot_name)
if path_to_vm and not path_to_vm == self.path_to_vm:
# path_to_vm has to be a new path
# path_to_vm has to be a new path
self.manager.delete_vm(self.path_to_vm, self.region)
self.manager.add_vm(path_to_vm, self.region)
self.manager.occupy_vm(path_to_vm, os.getpid(), self.region)
@@ -129,6 +149,7 @@ class DesktopEnv(gym.Env):
self.provider.stop_emulator(self.path_to_vm)
def reset(self, task_config: Optional[Dict[str, Any]] = None, seed=None, options=None) -> Dict[str, Any]:
# Reset to certain task in OSWorld
logger.info("Resetting environment...")
logger.info("Switching task...")
@@ -137,18 +158,66 @@ class DesktopEnv(gym.Env):
self._step_no = 0
self.action_history.clear()
logger.info("Reverting to snapshot to {}...".format(self.snapshot_name))
self._revert_to_snapshot()
logger.info("Starting emulator...")
self._start_emulator()
logger.info("Emulator started.")
for attempt in range(MAX_RETRIES):
# Check and handle proxy requirement changes BEFORE starting emulator
if task_config is not None:
# Only consider task proxy requirement if proxy is enabled at system level
task_use_proxy = task_config.get("proxy", False) and self.enable_proxy
if not self.enable_proxy and task_config.get("proxy", False):
logger.info("Task requires proxy but proxy is disabled at system level, ignoring proxy requirement.")
if task_use_proxy != self.current_use_proxy:
logger.info(f"Task proxy requirement changed: {self.current_use_proxy} -> {task_use_proxy}")
# Close current provider if it exists
if hasattr(self, 'provider') and self.provider:
try:
self.provider.stop_emulator(self.path_to_vm)
except Exception as e:
logger.warning(f"Failed to stop current provider: {e}")
# Create new provider with appropriate proxy setting
self.current_use_proxy = task_use_proxy
self.manager, self.provider = create_vm_manager_and_provider(
self.provider_name,
self.region,
use_proxy=task_use_proxy
)
if task_use_proxy:
logger.info("Using proxy-enabled AWS provider.")
else:
logger.info("Using regular AWS provider.")
if task_config is not None:
self._set_task_info(task_config)
self.setup_controller.reset_cache_dir(self.cache_dir)
logger.info("Setting up environment...")
self.setup_controller.setup(self.config)
logger.info("Environment setup complete.")
logger.info("Reverting to snapshot to {}...".format(self.snapshot_name))
self._revert_to_snapshot()
logger.info("Starting emulator...")
self._start_emulator()
logger.info("Emulator started.")
if task_config is not None:
self._set_task_info(task_config)
self.setup_controller.reset_cache_dir(self.cache_dir)
logger.info("Setting up environment...")
success = self.setup_controller.setup(self.config)
if success:
break
else:
logger.error(
"Environment setup failed, retrying (%d/%d)...",
attempt + 1,
MAX_RETRIES,
)
time.sleep(5)
else:
break
logger.info("Environment setup complete.")
if task_config is not None and task_config.get("proxy", False) and self.enable_proxy:
# If using proxy and proxy is enabled, set up the proxy configuration
self.setup_controller._proxy_setup()
observation = self._get_obs()
return observation
@@ -172,12 +241,17 @@ class DesktopEnv(gym.Env):
return self.controller.get_vm_screen_size()
def _set_task_info(self, task_config: Dict[str, Any]):
"""Set task info (proxy logic is handled in reset method)"""
self.task_id: str = task_config["id"]
self.cache_dir: str = os.path.join(self.cache_dir_base, self.task_id)
os.makedirs(self.cache_dir, exist_ok=True)
self.instruction = task_config["instruction"]
self.config = task_config["config"] if "config" in task_config else []
self._set_evaluator_info(task_config)
def _set_evaluator_info(self, task_config: Dict[str, Any]):
"""Set evaluator information from task config"""
# evaluator dict
# func -> metric function string, or list of metric function strings
# conj -> conjunction of multiple metrics if func is a list with length > 1, "and"/"or"
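
The proxy decision made in `reset` reduces to two small rules: a task gets a proxy only when it asks for one and proxying is enabled globally, and the provider is rebuilt only when that outcome differs from the current state. A minimal sketch with hypothetical function names (`effective_proxy`, `needs_provider_switch`), written to tolerate `task_config=None`:

```python
from typing import Any, Dict, Optional


def effective_proxy(task_config: Optional[Dict[str, Any]], enable_proxy: bool) -> bool:
    """A task uses a proxy only if it requests one AND proxying is enabled globally."""
    if task_config is None:
        return False
    return bool(task_config.get("proxy", False)) and enable_proxy


def needs_provider_switch(
    task_config: Optional[Dict[str, Any]],
    enable_proxy: bool,
    current_use_proxy: bool,
) -> bool:
    """The provider is recreated only when the proxy requirement actually changes."""
    if task_config is None:
        return False
    return effective_proxy(task_config, enable_proxy) != current_use_proxy


if __name__ == "__main__":
    task = {"id": "demo", "proxy": True}
    print(effective_proxy(task, enable_proxy=False))   # task request is ignored
    print(needs_provider_switch(task, True, False))    # switch to the proxy provider
```

Keeping the rule in a pure function makes the `None` case explicit and lets it be tested without launching a VM.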


@@ -105,8 +105,6 @@ def get_vm_file(env, config: Dict[str, Any]) -> Union[Optional[str], List[Option
_path = os.path.join(env.cache_dir, d)
file = env.controller.get_file(p)
if file is None:
#return None
# raise FileNotFoundError("File not found on VM: {:}".format(config["path"]))
if i in gives:
cache_paths.append(None)
continue


@@ -1,9 +1,14 @@
from desktop_env.providers.base import VMManager, Provider
def create_vm_manager_and_provider(provider_name: str, region: str):
def create_vm_manager_and_provider(provider_name: str, region: str, use_proxy: bool = False):
"""
Factory function to get the Virtual Machine Manager and Provider instances based on the provided provider name.
Args:
provider_name (str): The name of the provider (e.g., "aws", "vmware", etc.)
region (str): The region for the provider
use_proxy (bool): Whether to use proxy-enabled providers (currently only supported for AWS)
"""
provider_name = provider_name.lower().strip()
if provider_name == "vmware":
@@ -16,8 +21,14 @@ def create_vm_manager_and_provider(provider_name: str, region: str):
return VirtualBoxVMManager(), VirtualBoxProvider(region)
elif provider_name in ["aws", "amazon web services"]:
from desktop_env.providers.aws.manager import AWSVMManager
from desktop_env.providers.aws.provider import AWSProvider
return AWSVMManager(), AWSProvider(region)
if use_proxy:
# Use proxy-enabled AWS provider
from desktop_env.providers.aws.provider_with_proxy import AWSProviderWithProxy
return AWSVMManager(proxy_config_file="dataimpulse_proxy_config.json"), AWSProviderWithProxy(region, proxy_config_file="dataimpulse_proxy_config.json")
else:
# Use regular AWS provider
from desktop_env.providers.aws.provider import AWSProvider
return AWSVMManager(), AWSProvider(region)
elif provider_name == "azure":
from desktop_env.providers.azure.manager import AzureVMManager
from desktop_env.providers.azure.provider import AzureProvider


@@ -15,8 +15,7 @@ You need to assign values to several variables crucial for the operation of thes
- Formatted as follows:
```python
IMAGE_ID_MAP = {
"us-east-1": "ami-019f92c05df45031b",
"ap-east-1": "ami-07b4956131da1b282"
"us-east-1": "ami-00674d875de9addc1"
# Add other regions and corresponding AMIs
}
```
@@ -26,18 +25,11 @@ You need to assign values to several variables crucial for the operation of thes
- Example: `"osworld_key"`
- **`NETWORK_INTERFACES`**: Configuration settings for network interfaces, which include subnet IDs, security group IDs, and public IP addressing.
- Example:
```python
NETWORK_INTERFACES = {
"us-east-1": [
{
"SubnetId": "subnet-037edfff66c2eb894",
"AssociatePublicIpAddress": True,
"DeviceIndex": 0,
"Groups": ["sg-0342574803206ee9c"]
}
],
# Add configurations for other regions
}
```bash
# in .env file
AWS_REGION=us-east-1
AWS_SUBNET_ID=subnet-xxxx
AWS_SECURITY_GROUP_ID=sg-xxxx
```


@@ -3,260 +3,214 @@ from filelock import FileLock
import boto3
import psutil
import logging
import dotenv
import signal
# Load environment variables from .env file
dotenv.load_dotenv()
# Ensure the AWS region is set in the environment
if not os.getenv('AWS_REGION'):
raise EnvironmentError("AWS_REGION must be set in the environment variables.")
# Ensure the AWS subnet and security group IDs are set in the environment
if not os.getenv('AWS_SUBNET_ID') or not os.getenv('AWS_SECURITY_GROUP_ID'):
raise EnvironmentError("AWS_SUBNET_ID and AWS_SECURITY_GROUP_ID must be set in the environment variables.")
from desktop_env.providers.base import VMManager
# Import proxy-related modules only when needed
try:
from desktop_env.providers.aws.proxy_pool import get_global_proxy_pool, init_proxy_pool
PROXY_SUPPORT_AVAILABLE = True
except ImportError:
PROXY_SUPPORT_AVAILABLE = False
logger = logging.getLogger("desktopenv.providers.aws.AWSVMManager")
logger.setLevel(logging.INFO)
REGISTRY_PATH = '.aws_vms'
DEFAULT_REGION = "us-east-1"
# todo: Add doc for the configuration of image, security group and network interface
# todo: publish the AMI images
# ami-05e7d7bd279ea4f14
IMAGE_ID_MAP = {
"us-east-1": "ami-05e7d7bd279ea4f14",
"ap-east-1": "ami-0c092a5b8be4116f5"
"us-east-1": "ami-00674d875de9addc1",
"ap-east-1": "ami-0c092a5b8be4116f5",
}
INSTANCE_TYPE = "t3.medium"
NETWORK_INTERFACE_MAP = {
"us-east-1": [
{
"SubnetId": "subnet-037edfff66c2eb894",
"AssociatePublicIpAddress": True,
"DeviceIndex": 0,
"Groups": [
"sg-0342574803206ee9c"
]
}
],
"ap-east-1": [
{
"SubnetId": "subnet-011060501be0b589c",
"AssociatePublicIpAddress": True,
"DeviceIndex": 0,
"Groups": [
"sg-090470e64df78f6eb"
]
}
]
}
def _allocate_vm(region=DEFAULT_REGION):
if region not in IMAGE_ID_MAP:
raise ValueError(f"Region {region} is not supported. Supported regions are: {list(IMAGE_ID_MAP.keys())}")
run_instances_params = {
"MaxCount": 1,
"MinCount": 1,
"ImageId": IMAGE_ID_MAP[region],
"InstanceType": INSTANCE_TYPE,
"EbsOptimized": True,
"NetworkInterfaces": NETWORK_INTERFACE_MAP[region]
"NetworkInterfaces": [
{
"SubnetId": os.getenv('AWS_SUBNET_ID'),
"AssociatePublicIpAddress": True,
"DeviceIndex": 0,
"Groups": [
os.getenv('AWS_SECURITY_GROUP_ID')
]
}
]
}
ec2_client = boto3.client('ec2', region_name=region)
response = ec2_client.run_instances(**run_instances_params)
instance_id = response['Instances'][0]['InstanceId']
logger.info(f"Waiting for instance {instance_id} to be running...")
ec2_client.get_waiter('instance_running').wait(InstanceIds=[instance_id])
logger.info(f"Instance {instance_id} is ready.")
instance_id = None
original_sigint_handler = signal.getsignal(signal.SIGINT)
original_sigterm_handler = signal.getsignal(signal.SIGTERM)
def signal_handler(sig, frame):
if instance_id:
signal_name = "SIGINT" if sig == signal.SIGINT else "SIGTERM"
logger.warning(f"Received {signal_name} signal, terminating instance {instance_id}...")
try:
ec2_client.terminate_instances(InstanceIds=[instance_id])
logger.info(f"Successfully terminated instance {instance_id} after {signal_name}.")
except Exception as cleanup_error:
logger.error(f"Failed to terminate instance {instance_id} after {signal_name}: {str(cleanup_error)}")
# Restore original signal handlers
signal.signal(signal.SIGINT, original_sigint_handler)
signal.signal(signal.SIGTERM, original_sigterm_handler)
# Raise appropriate exception based on signal type
if sig == signal.SIGINT:
raise KeyboardInterrupt
else:
# For SIGTERM, exit gracefully
import sys
sys.exit(0)
try:
# Set up signal handlers for both SIGINT and SIGTERM
signal.signal(signal.SIGINT, signal_handler)
signal.signal(signal.SIGTERM, signal_handler)
response = ec2_client.run_instances(**run_instances_params)
instance_id = response['Instances'][0]['InstanceId']
logger.info(f"Waiting for instance {instance_id} to be running...")
ec2_client.get_waiter('instance_running').wait(InstanceIds=[instance_id])
logger.info(f"Instance {instance_id} is ready.")
except KeyboardInterrupt:
logger.warning("VM allocation interrupted by user (SIGINT).")
raise
except SystemExit:
logger.warning("VM allocation terminated by parent process (SIGTERM).")
raise
except Exception as e:
logger.error(f"Failed to allocate VM in region {region}: {str(e)}")
# try to clean up any resources that were created
try:
if instance_id:
ec2_client.terminate_instances(InstanceIds=[instance_id])
logger.info(f"Terminated instance {instance_id} due to allocation failure.")
except Exception as cleanup_error:
logger.error(f"Failed to clean up instance {instance_id}: {str(cleanup_error)}")
raise
finally:
# Restore original signal handlers
signal.signal(signal.SIGINT, original_sigint_handler)
signal.signal(signal.SIGTERM, original_sigterm_handler)
return instance_id
def _allocate_vm_with_proxy(region=DEFAULT_REGION, proxy_config_file=None):
"""Allocate a VM with proxy configuration"""
if not PROXY_SUPPORT_AVAILABLE:
logger.warning("Proxy support not available, falling back to regular VM allocation")
return _allocate_vm(region)
from desktop_env.providers.aws.provider_with_proxy import AWSProviderWithProxy
# Initialize proxy pool if needed
if proxy_config_file:
init_proxy_pool(proxy_config_file)
# Get current proxy
proxy_pool = get_global_proxy_pool()
current_proxy = proxy_pool.get_next_proxy()
if current_proxy:
logger.info(f"Allocating VM with proxy: {current_proxy.host}:{current_proxy.port}")
# Create provider instance
provider = AWSProviderWithProxy(region=region, proxy_config_file=proxy_config_file)
# Create new instance
instance_id = provider.create_instance_with_proxy(
image_id=IMAGE_ID_MAP[region],
instance_type=INSTANCE_TYPE,
security_groups=[os.getenv('AWS_SECURITY_GROUP_ID')],
subnet_id=os.getenv('AWS_SUBNET_ID')
)
return instance_id
class AWSVMManager(VMManager):
def __init__(self, registry_path=REGISTRY_PATH):
self.registry_path = registry_path
self.lock = FileLock(".aws_lck", timeout=60)
"""
AWS VM Manager for managing virtual machines on AWS.
AWS does not need to maintain a registry of VMs, as it can dynamically allocate and deallocate VMs.
This class supports both regular VM allocation and proxy-enabled VM allocation.
"""
def __init__(self, proxy_config_file=None, **kwargs):
self.proxy_config_file = proxy_config_file
# self.lock = FileLock(".aws_lck", timeout=60)
self.initialize_registry()
# Initialize proxy pool if proxy configuration is provided
if proxy_config_file and PROXY_SUPPORT_AVAILABLE:
init_proxy_pool(proxy_config_file)
logger.info(f"Proxy pool initialized with config: {proxy_config_file}")
def initialize_registry(self):
with self.lock: # Locking during initialization
if not os.path.exists(self.registry_path):
with open(self.registry_path, 'w') as file:
file.write('')
def initialize_registry(self, **kwargs):
pass
def add_vm(self, vm_path, region=DEFAULT_REGION, lock_needed=True):
if lock_needed:
with self.lock:
self._add_vm(vm_path, region)
else:
self._add_vm(vm_path, region)
def add_vm(self, vm_path, region=DEFAULT_REGION, lock_needed=True, **kwargs):
pass
def _add_vm(self, vm_path, region=DEFAULT_REGION):
with open(self.registry_path, 'r') as file:
lines = file.readlines()
vm_path_at_vm_region = "{}@{}".format(vm_path, region)
new_lines = lines + [f'{vm_path_at_vm_region}|free\n']
with open(self.registry_path, 'w') as file:
file.writelines(new_lines)
pass
def delete_vm(self, vm_path, region=DEFAULT_REGION, lock_needed=True):
if lock_needed:
with self.lock:
self._delete_vm(vm_path, region)
else:
self._delete_vm(vm_path, region)
def delete_vm(self, vm_path, region=DEFAULT_REGION, lock_needed=True, **kwargs):
pass
def _delete_vm(self, vm_path, region=DEFAULT_REGION):
new_lines = []
with open(self.registry_path, 'r') as file:
lines = file.readlines()
for line in lines:
vm_path_at_vm_region, pid_str = line.strip().split('|')
if vm_path_at_vm_region == "{}@{}".format(vm_path, region):
continue
else:
new_lines.append(line)
with open(self.registry_path, 'w') as file:
file.writelines(new_lines)
pass
def occupy_vm(self, vm_path, pid, region=DEFAULT_REGION, lock_needed=True):
if lock_needed:
with self.lock:
self._occupy_vm(vm_path, pid, region)
else:
self._occupy_vm(vm_path, pid, region)
def occupy_vm(self, vm_path, pid, region=DEFAULT_REGION, lock_needed=True, **kwargs):
pass
def _occupy_vm(self, vm_path, pid, region=DEFAULT_REGION):
new_lines = []
with open(self.registry_path, 'r') as file:
lines = file.readlines()
for line in lines:
registered_vm_path, _ = line.strip().split('|')
if registered_vm_path == "{}@{}".format(vm_path, region):
new_lines.append(f'{registered_vm_path}|{pid}\n')
else:
new_lines.append(line)
with open(self.registry_path, 'w') as file:
file.writelines(new_lines)
pass
def check_and_clean(self, lock_needed=True):
if lock_needed:
with self.lock:
self._check_and_clean()
else:
self._check_and_clean()
def check_and_clean(self, lock_needed=True, **kwargs):
pass
def _check_and_clean(self):
# Get active PIDs
active_pids = {p.pid for p in psutil.process_iter()}
pass
new_lines = []
vm_path_at_vm_regions = {}
with open(self.registry_path, 'r') as file:
lines = file.readlines()
# Collect all VM paths and their regions
for line in lines:
vm_path_at_vm_region, pid_str = line.strip().split('|')
vm_path, vm_region = vm_path_at_vm_region.split("@")
if vm_region not in vm_path_at_vm_regions:
vm_path_at_vm_regions[vm_region] = []
vm_path_at_vm_regions[vm_region].append((vm_path_at_vm_region, pid_str))
# Process each region
for region, vm_info_list in vm_path_at_vm_regions.items():
ec2_client = boto3.client('ec2', region_name=region)
instance_ids = [vm_info[0].split('@')[0] for vm_info in vm_info_list]
# Batch describe instances
try:
response = ec2_client.describe_instances(InstanceIds=instance_ids)
reservations = response.get('Reservations', [])
terminated_ids = set()
stopped_ids = set()
active_ids = set()
# Collect states of all instances
for reservation in reservations:
for instance in reservation.get('Instances', []):
instance_id = instance.get('InstanceId')
instance_state = instance['State']['Name']
if instance_state in ['terminated', 'shutting-down']:
terminated_ids.add(instance_id)
elif instance_state == 'stopped':
stopped_ids.add(instance_id)
else:
active_ids.add(instance_id)
# Write results back to file
for vm_path_at_vm_region, pid_str in vm_info_list:
vm_path = vm_path_at_vm_region.split('@')[0]
if vm_path in terminated_ids:
logger.info(f"VM {vm_path} not found or terminated, releasing it.")
continue
elif vm_path in stopped_ids:
logger.info(f"VM {vm_path} stopped, mark it as free")
new_lines.append(f'{vm_path}@{region}|free\n')
continue
if pid_str == "free":
new_lines.append(f'{vm_path}@{region}|{pid_str}\n')
elif int(pid_str) in active_pids:
new_lines.append(f'{vm_path}@{region}|{pid_str}\n')
else:
new_lines.append(f'{vm_path}@{region}|free\n')
except ec2_client.exceptions.ClientError as e:
if 'InvalidInstanceID.NotFound' in str(e):
logger.info(f"VM not found, releasing instances in region {region}.")
continue
# Writing updated lines back to the registry file
with open(self.registry_path, 'w') as file:
file.writelines(new_lines)
# We do not check the instances on AWS or delete unregistered ones,
# since this could unexpectedly delete instances that belong to another server.
# Please monitor the instances yourself to avoid additional cost.
def list_free_vms(self, region=DEFAULT_REGION, lock_needed=True):
if lock_needed:
with self.lock:
return self._list_free_vms(region)
else:
return self._list_free_vms(region)
def list_free_vms(self, region=DEFAULT_REGION, lock_needed=True, **kwargs):
pass
def _list_free_vms(self, region=DEFAULT_REGION):
free_vms = []
with open(self.registry_path, 'r') as file:
lines = file.readlines()
for line in lines:
vm_path_at_vm_region, pid_str = line.strip().split('|')
vm_path, vm_region = vm_path_at_vm_region.split("@")
if pid_str == "free" and vm_region == region:
free_vms.append((vm_path, pid_str))
pass
return free_vms
def get_vm_path(self, region=DEFAULT_REGION):
with self.lock:
if not AWSVMManager.checked_and_cleaned:
AWSVMManager.checked_and_cleaned = True
self._check_and_clean()
allocation_needed = False
with self.lock:
free_vms_paths = self._list_free_vms(region)
if len(free_vms_paths) == 0:
# No free virtual machine available, generate a new one
allocation_needed = True
else:
# Choose the first free virtual machine
chosen_vm_path = free_vms_paths[0][0]
self._occupy_vm(chosen_vm_path, os.getpid(), region)
return chosen_vm_path
if allocation_needed:
logger.info("No free virtual machine available. Generating a new one, which would take a while...☕")
def get_vm_path(self, region=DEFAULT_REGION, **kwargs):
if self.proxy_config_file:
logger.info("Allocating a new VM with proxy configuration in region: {}".format(region))
new_vm_path = _allocate_vm_with_proxy(region, self.proxy_config_file)
else:
logger.info("Allocating a new VM in region: {}".format(region))
new_vm_path = _allocate_vm(region)
with self.lock:
self._add_vm(new_vm_path, region)
self._occupy_vm(new_vm_path, os.getpid(), region)
return new_vm_path
return new_vm_path
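
The signal handling added to `_allocate_vm` above follows a save, install, restore pattern. The following is a minimal sketch of that pattern in isolation, assuming a hypothetical `run_with_cleanup` helper and generic `work`/`cleanup` callables in place of the EC2 calls; it is not the code from this commit.

```python
import signal


def run_with_cleanup(work, cleanup):
    """Run work(); if SIGINT or SIGTERM arrives meanwhile, call cleanup() first.

    Hypothetical helper: `work` stands in for the instance launch and
    `cleanup` for the terminate call. Must be called from the main thread,
    because signal.signal() only works there.
    """
    original_sigint = signal.getsignal(signal.SIGINT)
    original_sigterm = signal.getsignal(signal.SIGTERM)

    def handler(sig, frame):
        cleanup()
        # Restore the previous handlers before propagating the interruption
        signal.signal(signal.SIGINT, original_sigint)
        signal.signal(signal.SIGTERM, original_sigterm)
        if sig == signal.SIGINT:
            raise KeyboardInterrupt
        raise SystemExit(0)

    try:
        signal.signal(signal.SIGINT, handler)
        signal.signal(signal.SIGTERM, handler)
        return work()
    finally:
        # Always restore, whether work() returned, raised, or was interrupted
        signal.signal(signal.SIGINT, original_sigint)
        signal.signal(signal.SIGTERM, original_sigterm)
```

Restoring the handlers in `finally` as well as inside the handler keeps the process from being left with a stale handler that refers to an instance that no longer exists.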

@@ -4,6 +4,9 @@ from botocore.exceptions import ClientError
import logging
from desktop_env.providers.base import Provider
from datetime import datetime
import time
logger = logging.getLogger("desktopenv.providers.aws.AWSProvider")
logger.setLevel(logging.INFO)
@@ -14,24 +17,43 @@ MAX_ATTEMPTS = 10
class AWSProvider(Provider):
def start_emulator(self, path_to_vm: str, headless: bool):
def start_emulator(self, path_to_vm: str, headless: bool, *args, **kwargs):
logger.info("Starting AWS VM...")
ec2_client = boto3.client('ec2', region_name=self.region)
try:
# Start the instance
ec2_client.start_instances(InstanceIds=[path_to_vm])
logger.info(f"Instance {path_to_vm} is starting...")
# Check the current state of the instance
response = ec2_client.describe_instances(InstanceIds=[path_to_vm])
state = response['Reservations'][0]['Instances'][0]['State']['Name']
logger.info(f"Instance {path_to_vm} current state: {state}")
# Wait for the instance to be in the 'running' state
waiter = ec2_client.get_waiter('instance_running')
waiter.wait(InstanceIds=[path_to_vm], WaiterConfig={'Delay': WAIT_DELAY, 'MaxAttempts': MAX_ATTEMPTS})
logger.info(f"Instance {path_to_vm} is now running.")
if state == 'running':
# If the instance is already running, skip starting it
logger.info(f"Instance {path_to_vm} is already running. Skipping start.")
return
if state == 'stopped':
# Start the instance if it's currently stopped
ec2_client.start_instances(InstanceIds=[path_to_vm])
logger.info(f"Instance {path_to_vm} is starting...")
# Wait until the instance reaches 'running' state
waiter = ec2_client.get_waiter('instance_running')
waiter.wait(
InstanceIds=[path_to_vm],
WaiterConfig={'Delay': WAIT_DELAY, 'MaxAttempts': MAX_ATTEMPTS}
)
logger.info(f"Instance {path_to_vm} is now running.")
else:
# For all other states (terminated, pending, etc.), log a warning
logger.warning(f"Instance {path_to_vm} is in state '{state}' and cannot be started.")
except ClientError as e:
logger.error(f"Failed to start the AWS VM {path_to_vm}: {str(e)}")
raise
def get_ip_address(self, path_to_vm: str) -> str:
logger.info("Getting AWS VM IP address...")
ec2_client = boto3.client('ec2', region_name=self.region)
@@ -71,21 +93,23 @@ class AWSProvider(Provider):
security_groups = [sg['GroupId'] for sg in instance['SecurityGroups']]
subnet_id = instance['SubnetId']
instance_type = instance['InstanceType']
instance_snapshot = instance_details['Reservations'][0]['Instances'][0]['ImageId']
# Step 2: Terminate the old instance
ec2_client.terminate_instances(InstanceIds=[path_to_vm])
logger.info(f"Old instance {path_to_vm} has been terminated.")
# Step 3: Launch a new instance from the snapshot
logger.info(f"Launching a new instance from snapshot {snapshot_name}...")
logger.info(f"Launching a new instance from snapshot {instance_snapshot}...")
run_instances_params = {
"MaxCount": 1,
"MinCount": 1,
"ImageId": snapshot_name,
"InstanceType": instance_type,
"EbsOptimized": True,
"NetworkInterfaces": [
new_instance = ec2_client.run_instances(
MaxCount = 1,
MinCount = 1,
ImageId = instance_snapshot,
InstanceType = instance_type,
EbsOptimized = True,
NetworkInterfaces = [
{
"SubnetId": subnet_id,
"AssociatePublicIpAddress": True,
@@ -93,13 +117,12 @@ class AWSProvider(Provider):
"Groups": security_groups
}
]
}
new_instance = ec2_client.run_instances(**run_instances_params)
)
new_instance_id = new_instance['Instances'][0]['InstanceId']
logger.info(f"New instance {new_instance_id} launched from snapshot {snapshot_name}.")
logger.info(f"Waiting for instance {new_instance_id} to be running...")
ec2_client.get_waiter('instance_running').wait(InstanceIds=[new_instance_id])
logger.info(f"Instance {new_instance_id} is ready.")
return new_instance_id
@@ -108,15 +131,14 @@ class AWSProvider(Provider):
logger.error(f"Failed to revert to snapshot {snapshot_name} for the instance {path_to_vm}: {str(e)}")
raise
def stop_emulator(self, path_to_vm, region=None):
logger.info(f"Stopping AWS VM {path_to_vm}...")
ec2_client = boto3.client('ec2', region_name=self.region)
try:
ec2_client.stop_instances(InstanceIds=[path_to_vm])
waiter = ec2_client.get_waiter('instance_stopped')
waiter.wait(InstanceIds=[path_to_vm], WaiterConfig={'Delay': WAIT_DELAY, 'MaxAttempts': MAX_ATTEMPTS})
logger.info(f"Instance {path_to_vm} has been stopped.")
ec2_client.terminate_instances(InstanceIds=[path_to_vm])
logger.info(f"Instance {path_to_vm} has been terminated.")
except ClientError as e:
logger.error(f"Failed to stop the AWS VM {path_to_vm}: {str(e)}")
raise
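
The new `start_emulator` branches on the instance state reported by `describe_instances`. As a rough sketch of that decision only, assuming a hypothetical `plan_start_action` function and made-up action labels (neither is part of the provider), the branching can be expressed as a pure function that is easy to test without AWS access:

```python
def plan_start_action(state: str) -> str:
    """Map an EC2 instance state name to the action the provider would take.

    The returned labels ("skip", "start_and_wait", "warn") are illustrative
    names for this sketch, not identifiers from the codebase.
    """
    if state == "running":
        # Already running: nothing to start
        return "skip"
    if state == "stopped":
        # Stopped: call start_instances, then wait for 'running'
        return "start_and_wait"
    # Any other state (pending, terminated, ...): only log a warning
    return "warn"


for name in ("running", "stopped", "pending", "terminated"):
    print(name, "->", plan_start_action(name))
```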

@@ -0,0 +1,275 @@
import boto3
from botocore.exceptions import ClientError
import base64
import logging
import json
from typing import Optional
from desktop_env.providers.base import Provider
from desktop_env.providers.aws.proxy_pool import get_global_proxy_pool, init_proxy_pool, ProxyInfo
logger = logging.getLogger("desktopenv.providers.aws.AWSProviderWithProxy")
logger.setLevel(logging.INFO)
WAIT_DELAY = 15
MAX_ATTEMPTS = 10
class AWSProviderWithProxy(Provider):
def __init__(self, region: str = None, proxy_config_file: str = None):
super().__init__(region)
self.current_proxy: Optional[ProxyInfo] = None
# Initialize the proxy pool
if proxy_config_file:
init_proxy_pool(proxy_config_file)
logger.info(f"Initialized proxy pool from {proxy_config_file}")
# Get the next available proxy
self._rotate_proxy()
def _rotate_proxy(self):
"""Rotate to the next available proxy."""
proxy_pool = get_global_proxy_pool()
self.current_proxy = proxy_pool.get_next_proxy()
if self.current_proxy:
logger.info(f"Switched to proxy: {self.current_proxy.host}:{self.current_proxy.port}")
else:
logger.warning("No proxy available, using direct connection")
def _generate_proxy_user_data(self) -> str:
"""Generate a user data script containing the proxy configuration."""
if not self.current_proxy:
return ""
proxy_url = self._format_proxy_url(self.current_proxy)
user_data_script = f"""#!/bin/bash
# Configure system proxy
echo 'export http_proxy={proxy_url}' >> /etc/environment
echo 'export https_proxy={proxy_url}' >> /etc/environment
echo 'export HTTP_PROXY={proxy_url}' >> /etc/environment
echo 'export HTTPS_PROXY={proxy_url}' >> /etc/environment
# Configure apt proxy
cat > /etc/apt/apt.conf.d/95proxy << EOF
Acquire::http::Proxy "{proxy_url}";
Acquire::https::Proxy "{proxy_url}";
EOF
# Configure chrome/chromium proxy
mkdir -p /etc/opt/chrome/policies/managed
cat > /etc/opt/chrome/policies/managed/proxy.json << EOF
{{
"ProxyMode": "fixed_servers",
"ProxyServer": "{self.current_proxy.host}:{self.current_proxy.port}"
}}
EOF
# Configure chromium proxy (Ubuntu default)
mkdir -p /etc/chromium/policies/managed
cat > /etc/chromium/policies/managed/proxy.json << EOF
{{
"ProxyMode": "fixed_servers",
"ProxyServer": "{self.current_proxy.host}:{self.current_proxy.port}"
}}
EOF
# Configure firefox proxy - support multiple possible paths
for firefox_dir in /etc/firefox/policies /usr/lib/firefox/distribution/policies /etc/firefox-esr/policies; do
if [ -d "$(dirname "$firefox_dir")" ]; then
mkdir -p "$firefox_dir"
cat > "$firefox_dir/policies.json" << EOF
{{
"policies": {{
"Proxy": {{
"Mode": "manual",
"HTTPProxy": "{self.current_proxy.host}:{self.current_proxy.port}",
"HTTPSProxy": "{self.current_proxy.host}:{self.current_proxy.port}",
"UseHTTPProxyForAllProtocols": true
}}
}}
}}
EOF
break
fi
done
# Reload environment variables
source /etc/environment
# Log proxy configuration
echo "$(date): Configured proxy {self.current_proxy.host}:{self.current_proxy.port}" >> /var/log/proxy-setup.log
"""
return base64.b64encode(user_data_script.encode()).decode()
def _format_proxy_url(self, proxy: ProxyInfo) -> str:
"""Format the proxy URL."""
if proxy.username and proxy.password:
return f"{proxy.protocol}://{proxy.username}:{proxy.password}@{proxy.host}:{proxy.port}"
else:
return f"{proxy.protocol}://{proxy.host}:{proxy.port}"
def start_emulator(self, path_to_vm: str, headless: bool, *args, **kwargs):
logger.info("Starting AWS VM with proxy configuration...")
ec2_client = boto3.client('ec2', region_name=self.region)
try:
# If the instance already exists, start it directly
ec2_client.start_instances(InstanceIds=[path_to_vm])
logger.info(f"Instance {path_to_vm} is starting...")
# Wait for the instance to be in the 'running' state
waiter = ec2_client.get_waiter('instance_running')
waiter.wait(InstanceIds=[path_to_vm], WaiterConfig={'Delay': WAIT_DELAY, 'MaxAttempts': MAX_ATTEMPTS})
logger.info(f"Instance {path_to_vm} is now running.")
except ClientError as e:
logger.error(f"Failed to start the AWS VM {path_to_vm}: {str(e)}")
raise
def create_instance_with_proxy(self, image_id: str, instance_type: str,
security_groups: list, subnet_id: str) -> str:
"""Create a new instance with proxy configuration."""
ec2_client = boto3.client('ec2', region_name=self.region)
user_data = self._generate_proxy_user_data()
run_instances_params = {
"MaxCount": 1,
"MinCount": 1,
"ImageId": image_id,
"InstanceType": instance_type,
"EbsOptimized": True,
"NetworkInterfaces": [
{
"SubnetId": subnet_id,
"AssociatePublicIpAddress": True,
"DeviceIndex": 0,
"Groups": security_groups
}
]
}
if user_data:
run_instances_params["UserData"] = user_data
try:
response = ec2_client.run_instances(**run_instances_params)
instance_id = response['Instances'][0]['InstanceId']
logger.info(f"Created new instance {instance_id} with proxy configuration")
# Wait for the instance to be running
logger.info(f"Waiting for instance {instance_id} to be running...")
ec2_client.get_waiter('instance_running').wait(InstanceIds=[instance_id])
logger.info(f"Instance {instance_id} is ready.")
return instance_id
except ClientError as e:
logger.error(f"Failed to create instance with proxy: {str(e)}")
# If the current proxy failed, try rotating to another proxy
if self.current_proxy:
proxy_pool = get_global_proxy_pool()
proxy_pool.mark_proxy_failed(self.current_proxy)
self._rotate_proxy()
raise
def get_ip_address(self, path_to_vm: str) -> str:
logger.info("Getting AWS VM IP address...")
ec2_client = boto3.client('ec2', region_name=self.region)
try:
response = ec2_client.describe_instances(InstanceIds=[path_to_vm])
for reservation in response['Reservations']:
for instance in reservation['Instances']:
private_ip_address = instance.get('PrivateIpAddress', '')
return private_ip_address
return ''
except ClientError as e:
logger.error(f"Failed to retrieve private IP address for the instance {path_to_vm}: {str(e)}")
raise
def save_state(self, path_to_vm: str, snapshot_name: str):
logger.info("Saving AWS VM state...")
ec2_client = boto3.client('ec2', region_name=self.region)
try:
image_response = ec2_client.create_image(InstanceId=path_to_vm, Name=snapshot_name)
image_id = image_response['ImageId']
logger.info(f"AMI {image_id} created successfully from instance {path_to_vm}.")
return image_id
except ClientError as e:
logger.error(f"Failed to create AMI from the instance {path_to_vm}: {str(e)}")
raise
def revert_to_snapshot(self, path_to_vm: str, snapshot_name: str):
logger.info(f"Reverting AWS VM to snapshot: {snapshot_name}...")
ec2_client = boto3.client('ec2', region_name=self.region)
try:
# Get the details of the original instance
instance_details = ec2_client.describe_instances(InstanceIds=[path_to_vm])
instance = instance_details['Reservations'][0]['Instances'][0]
security_groups = [sg['GroupId'] for sg in instance['SecurityGroups']]
subnet_id = instance['SubnetId']
instance_type = instance['InstanceType']
# Terminate the old instance
ec2_client.terminate_instances(InstanceIds=[path_to_vm])
logger.info(f"Old instance {path_to_vm} has been terminated.")
# Rotate to a new proxy
self._rotate_proxy()
# Create the new instance
new_instance_id = self.create_instance_with_proxy(
snapshot_name, instance_type, security_groups, subnet_id
)
return new_instance_id
except ClientError as e:
logger.error(f"Failed to revert to snapshot {snapshot_name} for the instance {path_to_vm}: {str(e)}")
raise
def stop_emulator(self, path_to_vm, region=None):
logger.info(f"Stopping AWS VM {path_to_vm}...")
ec2_client = boto3.client('ec2', region_name=self.region)
try:
ec2_client.stop_instances(InstanceIds=[path_to_vm])
waiter = ec2_client.get_waiter('instance_stopped')
waiter.wait(InstanceIds=[path_to_vm], WaiterConfig={'Delay': WAIT_DELAY, 'MaxAttempts': MAX_ATTEMPTS})
logger.info(f"Instance {path_to_vm} has been stopped.")
except ClientError as e:
logger.error(f"Failed to stop the AWS VM {path_to_vm}: {str(e)}")
raise
def get_current_proxy_info(self) -> Optional[dict]:
"""Get information about the current proxy."""
if self.current_proxy:
return {
'host': self.current_proxy.host,
'port': self.current_proxy.port,
'protocol': self.current_proxy.protocol,
'failed_count': self.current_proxy.failed_count
}
return None
def force_rotate_proxy(self):
"""Force a proxy rotation."""
logger.info("Force rotating proxy...")
if self.current_proxy:
proxy_pool = get_global_proxy_pool()
proxy_pool.mark_proxy_failed(self.current_proxy)
self._rotate_proxy()
def get_proxy_stats(self) -> dict:
"""Get proxy pool statistics."""
proxy_pool = get_global_proxy_pool()
return proxy_pool.get_stats()
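
`_format_proxy_url` above interpolates the username and password into the URL unchanged. The following sketch shows one possible variant that percent-encodes the credentials, so that characters such as `@` or `:` in a password do not break the URL. The standalone `format_proxy_url` function and the trimmed `ProxyInfo` dataclass here are assumptions made for illustration; whether encoded credentials suit every consumer of the URL (for example the lines written to `/etc/environment`) would need to be checked.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import quote


@dataclass
class ProxyInfo:
    # Trimmed copy of the fields needed for URL formatting (illustrative)
    host: str
    port: int
    username: Optional[str] = None
    password: Optional[str] = None
    protocol: str = "http"


def format_proxy_url(proxy: ProxyInfo) -> str:
    """Build a proxy URL, percent-encoding credentials when present."""
    if proxy.username and proxy.password:
        # safe="" also encodes "/" so no reserved character is left in place
        user = quote(proxy.username, safe="")
        password = quote(proxy.password, safe="")
        return f"{proxy.protocol}://{user}:{password}@{proxy.host}:{proxy.port}"
    return f"{proxy.protocol}://{proxy.host}:{proxy.port}"


print(format_proxy_url(ProxyInfo("10.0.0.1", 8080)))
print(format_proxy_url(ProxyInfo("10.0.0.1", 8080, "user", "p@ss:word")))
```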

@@ -0,0 +1,193 @@
import random
import requests
import logging
import time
from typing import List, Dict, Optional
from dataclasses import dataclass
from threading import Lock
import json
logger = logging.getLogger("desktopenv.providers.aws.ProxyPool")
logger.setLevel(logging.INFO)
@dataclass
class ProxyInfo:
host: str
port: int
username: Optional[str] = None
password: Optional[str] = None
protocol: str = "http" # http, https, socks5
failed_count: int = 0
last_used: float = 0
is_active: bool = True
class ProxyPool:
def __init__(self, config_file: str = None):
self.proxies: List[ProxyInfo] = []
self.current_index = 0
self.lock = Lock()
self.max_failures = 3 # Maximum number of failures
self.cooldown_time = 300 # 5-minute cooldown
if config_file:
self.load_proxies_from_file(config_file)
def load_proxies_from_file(self, config_file: str):
"""Load the proxy list from a config file."""
try:
with open(config_file, 'r') as f:
proxy_configs = json.load(f)
for config in proxy_configs:
proxy = ProxyInfo(
host=config['host'],
port=config['port'],
username=config.get('username'),
password=config.get('password'),
protocol=config.get('protocol', 'http')
)
self.proxies.append(proxy)
logger.info(f"Loaded {len(self.proxies)} proxies from {config_file}")
except Exception as e:
logger.error(f"Failed to load proxies from {config_file}: {e}")
def add_proxy(self, host: str, port: int, username: str = None,
password: str = None, protocol: str = "http"):
"""Add a proxy to the pool."""
proxy = ProxyInfo(host=host, port=port, username=username,
password=password, protocol=protocol)
with self.lock:
self.proxies.append(proxy)
logger.info(f"Added proxy {host}:{port}")
def get_next_proxy(self) -> Optional[ProxyInfo]:
"""Get the next available proxy."""
with self.lock:
if not self.proxies:
return None
# Filter out proxies that have failed too many times
active_proxies = [p for p in self.proxies if self._is_proxy_available(p)]
if not active_proxies:
logger.warning("No active proxies available")
return None
# Select a proxy in round-robin order
proxy = active_proxies[self.current_index % len(active_proxies)]
self.current_index += 1
proxy.last_used = time.time()
return proxy
def _is_proxy_available(self, proxy: ProxyInfo) -> bool:
"""Check whether the proxy is available."""
if not proxy.is_active:
return False
if proxy.failed_count >= self.max_failures:
# Check whether the cooldown period has elapsed
if time.time() - proxy.last_used < self.cooldown_time:
return False
else:
# Reset the failure count
proxy.failed_count = 0
return True
def mark_proxy_failed(self, proxy: ProxyInfo):
"""Mark the proxy as failed."""
with self.lock:
proxy.failed_count += 1
if proxy.failed_count >= self.max_failures:
logger.warning(f"Proxy {proxy.host}:{proxy.port} marked as failed "
f"(failures: {proxy.failed_count})")
def mark_proxy_success(self, proxy: ProxyInfo):
"""Mark the proxy as successful."""
with self.lock:
proxy.failed_count = 0
def test_proxy(self, proxy: ProxyInfo, test_url: str = "http://httpbin.org/ip",
timeout: int = 10) -> bool:
"""Test whether the proxy is working."""
try:
proxy_url = self._format_proxy_url(proxy)
proxies = {
'http': proxy_url,
'https': proxy_url
}
response = requests.get(test_url, proxies=proxies, timeout=timeout)
if response.status_code == 200:
self.mark_proxy_success(proxy)
return True
else:
self.mark_proxy_failed(proxy)
return False
except Exception as e:
logger.debug(f"Proxy test failed for {proxy.host}:{proxy.port}: {e}")
self.mark_proxy_failed(proxy)
return False
def _format_proxy_url(self, proxy: ProxyInfo) -> str:
"""Format the proxy URL."""
if proxy.username and proxy.password:
return f"{proxy.protocol}://{proxy.username}:{proxy.password}@{proxy.host}:{proxy.port}"
else:
return f"{proxy.protocol}://{proxy.host}:{proxy.port}"
def get_proxy_dict(self, proxy: ProxyInfo) -> Dict[str, str]:
"""Get the proxy dict used by the requests library."""
proxy_url = self._format_proxy_url(proxy)
return {
'http': proxy_url,
'https': proxy_url
}
def test_all_proxies(self, test_url: str = "http://httpbin.org/ip"):
"""Test all proxies."""
logger.info("Testing all proxies...")
working_count = 0
for proxy in self.proxies:
if self.test_proxy(proxy, test_url):
working_count += 1
logger.info(f"✓ Proxy {proxy.host}:{proxy.port} is working")
else:
logger.warning(f"✗ Proxy {proxy.host}:{proxy.port} failed")
logger.info(f"Proxy test completed: {working_count}/{len(self.proxies)} working")
return working_count
def get_stats(self) -> Dict:
"""Get proxy pool statistics."""
with self.lock:
total = len(self.proxies)
active = len([p for p in self.proxies if self._is_proxy_available(p)])
failed = len([p for p in self.proxies if p.failed_count >= self.max_failures])
return {
'total': total,
'active': active,
'failed': failed,
'success_rate': active / total if total > 0 else 0
}
# Global proxy pool instance
_proxy_pool = None
def get_global_proxy_pool() -> ProxyPool:
"""Get the global proxy pool instance."""
global _proxy_pool
if _proxy_pool is None:
_proxy_pool = ProxyPool()
return _proxy_pool
def init_proxy_pool(config_file: str = None):
"""Initialize the global proxy pool."""
global _proxy_pool
_proxy_pool = ProxyPool(config_file)
return _proxy_pool
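
`load_proxies_from_file` expects a JSON list of objects with `host` and `port` keys and optional `username`, `password`, and `protocol` keys. The sketch below writes an example file of that shape to a temporary path, loads it, and picks entries in round-robin order as `get_next_proxy` does. The host names, credentials, and the helper names `load_proxy_configs` and `round_robin` are placeholders invented for this example.

```python
import json
import os
import tempfile

# Example config in the shape read by load_proxies_from_file (placeholder values)
example_config = [
    {"host": "proxy1.example.com", "port": 8080, "username": "user", "password": "secret"},
    {"host": "proxy2.example.com", "port": 1080, "protocol": "socks5"},
]


def load_proxy_configs(path):
    """Read the JSON list and apply the same defaults as ProxyInfo."""
    with open(path, "r") as f:
        configs = json.load(f)
    return [
        {
            "host": c["host"],
            "port": c["port"],
            "username": c.get("username"),
            "password": c.get("password"),
            "protocol": c.get("protocol", "http"),
        }
        for c in configs
    ]


def round_robin(items, index):
    """Pick the item for a monotonically increasing counter."""
    return items[index % len(items)]


with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    json.dump(example_config, tmp)
    config_path = tmp.name

proxies = load_proxy_configs(config_path)
os.remove(config_path)

picks = [round_robin(proxies, i)["host"] for i in range(3)]
print(picks)
```

Because `get_next_proxy` indexes into the filtered list of currently available proxies, the rotation order can shift whenever a proxy enters or leaves its cooldown; the sketch ignores that filtering.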

@@ -39,5 +39,6 @@
"expected": "true"
}
}
}
}
},
"proxy": false
}

@@ -62,5 +62,6 @@
]
}
}
}
}
},
"proxy": true
}

@@ -53,32 +53,37 @@
"chrome"
],
"evaluator": {
"func": ["is_expected_active_tab", "is_expected_active_tab"],
"func": [
"is_expected_active_tab",
"is_expected_active_tab"
],
"conj": "or",
"result": [
{
"type": "active_url_from_accessTree",
"goto_prefix": "https://www."
"type": "active_url_from_accessTree",
"goto_prefix": "https://www."
},
{
"type": "active_url_from_accessTree",
"goto_prefix": "https://www."
"goto_prefix": "https://www."
}
],
"expected": [
{
"type": "rule",
"rules": {
"type": "url",
"url": "https://www.drugs.com/npc/"
}
"type": "rule",
"rules": {
"type": "url",
"url": "https://www.drugs.com/npc/"
}
},
{
"type": "rule",
"rules": {
"type": "url",
"url": "https://www.drugs.com/npp/"
}
}
}]
}
}
]
},
"proxy": true
}

@@ -3,26 +3,27 @@
"snapshot": "chrome",
"instruction": "Computer, please navigate to the area in my browser settings where my passwords are stored. I want to check my login information for Etsy without revealing it just yet.",
"source": "https://www.quora.com/What-are-the-cool-tricks-to-use-Google-Chrome",
"config": [
"config": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
}
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
}
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
}],
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
@@ -37,8 +38,9 @@
"type": "rule",
"rules": {
"type": "url",
"url":"chrome://password-manager/passwords"
"url": "chrome://password-manager/passwords"
}
}
}
}
},
"proxy": false
}

@@ -63,9 +63,10 @@
"type": "rule",
"rules": {
"items": [
"The Dota 2 Official Soundtrack"
"The Dota 2 Official Soundtrack"
]
}
}
}
}
},
"proxy": true
}

@@ -1,87 +1,99 @@
{
"id": "1704f00f-79e6-43a7-961b-cedd3724d5fd",
"snapshot": "chrome",
"instruction": "Find a large car with lowest price from next Monday to next Friday in Zurich.",
"source": "test_task_0",
"config": [
"id": "1704f00f-79e6-43a7-961b-cedd3724d5fd",
"snapshot": "chrome",
"instruction": "Find a large car with lowest price from next Monday to next Friday in Zurich.",
"source": "test_task_0",
"config": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
}
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://www.rentalcars.com/"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
}
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func": [
"check_direct_json_object",
"check_direct_json_object"
],
"result": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
}
"type": "active_tab_url_parse",
"goto_prefix": "https://www.",
"parse_keys": [
"locationName",
"dropLocationName",
"filterCriteria_carCategory",
"filterCriteria_sortBy"
]
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://www.rentalcars.com/"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
"type": "active_tab_html_parse",
"goto_prefix": "https://www.",
"category": "xpath",
"xpathObject": {
"/html/body/main/div/div/div/section/div/div/div/div[1]/div[1]/p": "from",
"/html/body/main/div/div/div/section/div/div/div/div[1]/div[3]/p": "to"
}
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func":["check_direct_json_object", "check_direct_json_object"],
"result": [{
"type": "active_tab_url_parse",
"goto_prefix": "https://www.",
"parse_keys": ["locationName", "dropLocationName", "filterCriteria_carCategory", "filterCriteria_sortBy"]
},
{
"type": "active_tab_html_parse",
"goto_prefix": "https://www.",
"category": "xpath",
"xpathObject":{
"/html/body/main/div/div/div/section/div/div/div/div[1]/div[1]/p": "from",
"/html/body/main/div/div/div/section/div/div/div/div[1]/div[3]/p": "to"
"expected": [
{
"type": "rule",
"rules": {
"expected": {
"locationName": "Zürich",
"dropLocationName": "Zürich",
"filterCriteria_carCategory": "large",
"filterCriteria_sortBy": "PRICE"
}
}],
"expected":[{
"type": "rule",
"rules":{
"expected": {
"locationName": "Zürich",
"dropLocationName": "Zürich",
"filterCriteria_carCategory": "large",
"filterCriteria_sortBy": "PRICE"
}
}
},
{
"type": "rule_relativeTime",
"rules":{
"relativeTime":{
"from":"next Monday",
"to":"next Friday"
},
"expected": {
"from": "{DoW}, {DayD} {Month} {Year}, 10:00",
"to": "{DoW}, {DayD} {Month} {Year}, 10:00"
}
},
{
"type": "rule_relativeTime",
"rules": {
"relativeTime": {
"from": "next Monday",
"to": "next Friday"
},
"expected": {
"from": "{DoW}, {DayD} {Month} {Year}, 10:00",
"to": "{DoW}, {DayD} {Month} {Year}, 10:00"
}
}}
]
}
}
}
}
]
},
"proxy": true
}

@@ -1,63 +1,62 @@
{
"id": "2888b4e6-5b47-4b57-8bf5-c73827890774",
"snapshot": "chrome",
"instruction": "Find a men's T-Shirt that is in large size with a stripe pattern, short sleeve and under the Sales&Discount.",
"source": "test_task_1",
"config": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
}
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://www.macys.com/"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
}
"id": "2888b4e6-5b47-4b57-8bf5-c73827890774",
"snapshot": "chrome",
"instruction": "Find a men's T-Shirt that is in large size with a stripe pattern, short sleeve and under the Sales&Discount.",
"source": "test_task_1",
"config": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
}
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://www.macys.com/"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func":"exact_match",
"result": {
"type": "url_dashPart",
"goto_prefix": "https://www.",
"partIndex": -1,
"needDeleteId": true,
"returnType": "string"
},
"expected":{
"type": "rule",
"rules":{
"expected": "Stripe,Men,L,Short%20Sleeve,Sales%20%26%20Discounts"
}
}
}
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func": "exact_match",
"result": {
"type": "url_dashPart",
"goto_prefix": "https://www.",
"partIndex": -1,
"needDeleteId": true,
"returnType": "string"
},
"expected": {
"type": "rule",
"rules": {
"expected": "Stripe,Men,L,Short%20Sleeve,Sales%20%26%20Discounts"
}
}
},
"proxy": true
}

@@ -37,8 +37,11 @@
"type": "rule",
"rules": {
"type": "bookmark_bar_folders_names",
"names": ["Favorites"]
"names": [
"Favorites"
]
}
}
}
}
},
"proxy": false
}

@@ -59,5 +59,6 @@
"expected": "Thomas"
}
}
}
}
},
"proxy": false
}

@@ -53,5 +53,6 @@
"expected": "true"
}
}
}
}
},
"proxy": false
}

@@ -48,5 +48,6 @@
"name": "Play Puzzle Game 2048"
}
}
}
}
},
"proxy": true
}

@@ -1,78 +1,87 @@
{
"id": "368d9ba4-203c-40c1-9fa3-da2f1430ce63",
"snapshot": "chrome",
"instruction": "find the Monthly forecast for Manchester, GB for this month",
"source": "test_task_1",
"config": [
"id": "368d9ba4-203c-40c1-9fa3-da2f1430ce63",
"snapshot": "chrome",
"instruction": "find the Monthly forecast for Manchester, GB for this month",
"source": "test_task_1",
"config": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
}
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://www.accuweather.com/"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
}
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func": [
"check_direct_json_object",
"is_expected_url_pattern_match"
],
"result": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
}
"type": "url_dashPart",
"goto_prefix": "https://www.",
"partIndex": -2,
"needDeleteId": false,
"returnType": "json",
"key": "time"
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://www.accuweather.com/"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
}
"type": "active_url_from_accessTree",
"goto_prefix": "https://www."
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func":["check_direct_json_object", "is_expected_url_pattern_match"],
"result": [{
"type": "url_dashPart",
"goto_prefix": "https://www.",
"partIndex": -2,
"needDeleteId": false,
"returnType": "json",
"key":"time"
},
{
"type": "active_url_from_accessTree",
"goto_prefix": "https://www."
}],
"expected":[
{
"type": "rule_relativeTime",
"rules":{
"relativeTime": {
"from": "this month"
},
"expected": {
"time": "{month}-weather"
}
}
},
{
"type": "rule",
"rules":{
"expected": ["\/manchester\/"]
}
}]
}
}
"expected": [
{
"type": "rule_relativeTime",
"rules": {
"relativeTime": {
"from": "this month"
},
"expected": {
"time": "{month}-weather"
}
}
},
{
"type": "rule",
"rules": {
"expected": [
"/manchester/"
]
}
}
]
},
"proxy": true
}

@@ -3,13 +3,13 @@
"snapshot": "chrome",
"instruction": "I am more familiar with Korean as I am from Korea. I want to use chrome with my mother tongue. Could you help me change the Chrome interface language to Korean? ",
"source": "https://superuser.com/questions/984668/change-interface-language-of-chrome-to-english",
"config": [
],
"config": [],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func": "infeasible"
}
}
},
"proxy": false
}

@@ -270,5 +270,6 @@
]
}
}
}
}
},
"proxy": true
}

@@ -1,111 +1,117 @@
{
"id": "47543840-672a-467d-80df-8f7c3b9788c9",
"snapshot": "chrome",
"instruction": "Find and select the car with the most number of seats to pick up in Boston Logan Intl Airport from 10th next month to 11th next month.",
"source": "test_task_1",
"config": [
"id": "47543840-672a-467d-80df-8f7c3b9788c9",
"snapshot": "chrome",
"instruction": "Find and select the car with the most number of seats to pick up in Boston Logan Intl Airport from 10th next month to 11th next month.",
"source": "test_task_1",
"config": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
}
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://www.budget.com/"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
}
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func": [
"is_expected_url_pattern_match",
"check_direct_json_object",
"check_direct_json_object"
],
"conj": "and",
"result": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
"type": "active_url_from_accessTree",
"goto_prefix": "https://www."
},
{
"type": "active_tab_html_parse",
"goto_prefix": "https://www.",
"category": "class",
"class_singleObject": {},
"class_multiObject": {
"location-info": {
"0": "start_location",
"1": "end_location"
},
"day-time-info": {
"0": "from",
"1": "to"
}
}
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://www.budget.com/"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
"type": "active_tab_html_parse",
"goto_prefix": "https://www.",
"category": "xpath",
"xpathObject": {
"/html/body/div[6]/div[2]/div[1]/div/div/div[2]/div[1]/section[1]/div/form/div[1]/div[2]/div/a": "rank"
}
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func":["is_expected_url_pattern_match", "check_direct_json_object", "check_direct_json_object"],
"conj": "and",
"result": [
{
"type": "active_url_from_accessTree",
"goto_prefix": "https://www."
},
{
"type": "active_tab_html_parse",
"goto_prefix": "https://www.",
"category": "class",
"class_singleObject":{},
"class_multiObject":{
"location-info":{
"0": "start_location",
"1": "end_location"
},
"day-time-info":{
"0": "from",
"1": "to"
}
}
},
{
"type": "active_tab_html_parse",
"goto_prefix": "https://www.",
"category": "xpath",
"xpathObject":{
"/html/body/div[6]/div[2]/div[1]/div/div/div[2]/div[1]/section[1]/div/form/div[1]/div[2]/div/a": "rank"
}
}
],
"expected":[
{
"type": "rule",
"rules":{
"expected": ["reservation#\/vehicles"]
}
},
{
"type": "rule_relativeTime",
"rules":{
"relativeTime":{
"from":"10th next month",
"to": "11th next month"
},
"expected": {
"start_location": "Boston Logan Intl Airport,\n\t\t\t\t\t\t\t\tBOS \n\t\t\t\t\t\t\t \n\t\t\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\t \n\t\t\t\t\t\t\t\t \n\t\t\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t \n\t\t\t\t\t\t\t \n\t\t\t\t\t\t\t \n\t\t\t\t\t\t\t \n\t\t\t\t\t\t\t \n\t\t\t\t\t\t\t Pick-Up",
"end_location": "Boston Logan Intl Airport,\n\t\t\t\t\t\t\t\tBOS",
"from": "{DoW}, {Month} {Day0D}, 12:00 PM",
"to": "{DoW}, {Month} {Day0D}, 12:00 PM"
}
}
},
{
"type": "rule",
"rules":{
"expected": {
"rank": "Number of Seats (High to Low)"
}
}
}
]
}
}
"expected": [
{
"type": "rule",
"rules": {
"expected": [
"reservation#/vehicles"
]
}
},
{
"type": "rule_relativeTime",
"rules": {
"relativeTime": {
"from": "10th next month",
"to": "11th next month"
},
"expected": {
"start_location": "Boston Logan Intl Airport,\n\t\t\t\t\t\t\t\tBOS \n\t\t\t\t\t\t\t \n\t\t\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\t \n\t\t\t\t\t\t\t\t \n\t\t\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t \n\t\t\t\t\t\t\t \n\t\t\t\t\t\t\t \n\t\t\t\t\t\t\t \n\t\t\t\t\t\t\t \n\t\t\t\t\t\t\t Pick-Up",
"end_location": "Boston Logan Intl Airport,\n\t\t\t\t\t\t\t\tBOS",
"from": "{DoW}, {Month} {Day0D}, 12:00 PM",
"to": "{DoW}, {Month} {Day0D}, 12:00 PM"
}
}
},
{
"type": "rule",
"rules": {
"expected": {
"rank": "Number of Seats (High to Low)"
}
}
}
]
},
"proxy": true
}

@@ -63,5 +63,6 @@
]
}
}
}
}
},
"proxy": false
}

@@ -65,5 +65,6 @@
"url": "https://www.babycenter.com/baby-names/details/carl-853"
}
}
}
}
},
"proxy": true
}

@@ -57,5 +57,6 @@
"expected": "/home/user/Desktop/helloExtension"
}
}
}
},
"proxy": false
}

@@ -1,78 +1,78 @@
{
"id": "6c4c23a1-42a4-43cc-9db1-2f86ff3738cc",
"snapshot": "chrome",
"instruction": "Find flights from Seattle to New York on 5th next month and only show those that can be purchased with miles.",
"source": "test_task_1",
"config": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
}
"id": "6c4c23a1-42a4-43cc-9db1-2f86ff3738cc",
"snapshot": "chrome",
"instruction": "Find flights from Seattle to New York on 5th next month and only show those that can be purchased with miles.",
"source": "test_task_1",
"config": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
}
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://www.delta.com/"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
}
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func": "check_direct_json_object",
"result": {
"type": "active_tab_html_parse",
"goto_prefix": "https://www.",
"category": "class",
"class_singleObject": {
"search-date": "time",
"price-in-tabs__nav--selected": "category"
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://www.delta.com/"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
"class_multiObject": {
"search-segment-cities__city": {
"0": "start",
"1": "end"
}
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func":"check_direct_json_object",
"result": {
"type": "active_tab_html_parse",
"goto_prefix": "https://www.",
"category": "class",
"class_singleObject":{
"search-date": "time",
"price-in-tabs__nav--selected": "category"
},
"class_multiObject":{
"search-segment-cities__city": {
"0": "start",
"1": "end"
}
}
},
"expected": {
"type": "rule_relativeTime",
"rules": {
"relativeTime": {
"from": "5th next month"
},
"expected":{
"type": "rule_relativeTime",
"rules":{
"relativeTime": {
"from": "5th next month"
},
"expected": {
"start": "SEA",
"end": "NYC",
"time": "{DoW}, {Month} {DayD}, {Year}",
"category": "Miles"
}
}
"expected": {
"start": "SEA",
"end": "NYC",
"time": "{DoW}, {Month} {DayD}, {Year}",
"category": "Miles"
}
}
}
}
},
"proxy": true
}

@@ -46,8 +46,11 @@
"type": "rule",
"rules": {
"type": "bookmark_bar_websites_urls",
"urls": ["https://jalammar.github.io/illustrated-transformer/"]
"urls": [
"https://jalammar.github.io/illustrated-transformer/"
]
}
}
}
}
},
"proxy": true
}

@@ -47,8 +47,11 @@
"type": "rule",
"rules": {
"type": "domains",
"domains": [".amazon.com"]
"domains": [
".amazon.com"
]
}
}
}
}
},
"proxy": true
}

@@ -1,63 +1,66 @@
{
"id": "7f52cab9-535c-4835-ac8c-391ee64dc930",
"snapshot": "chrome",
"instruction": "Create a list of drip coffee makers that are on sale and within $25-60 and have a black finish.",
"source": "test_task_1",
"config": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
}
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://shopping.google.com/"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
"id": "7f52cab9-535c-4835-ac8c-391ee64dc930",
"snapshot": "chrome",
"instruction": "Create a list of drip coffee makers that are on sale and within $25-60 and have a black finish.",
"source": "test_task_1",
"config": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
}
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://shopping.google.com/"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
}
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func": "check_direct_json_object",
"result": {
"type": "active_tab_url_parse",
"goto_prefix": "https://www.",
"parse_keys": [
"q",
"tbs"
]
},
"expected": {
"type": "rule",
"rules": {
"expected": {
"q": "drip coffee maker",
"tbs": "mr:1,price:1,ppr_min:25,ppr_max:60,sales:1,pdtr0:1825161|1825162"
}
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func":"check_direct_json_object",
"result": {
"type": "active_tab_url_parse",
"goto_prefix": "https://www.",
"parse_keys": ["q", "tbs"]
},
"expected":{
"type": "rule",
"rules":{
"expected": {
"q": "drip coffee maker",
"tbs": "mr:1,price:1,ppr_min:25,ppr_max:60,sales:1,pdtr0:1825161|1825162"
}
}
}
}
}
},
"proxy": true
}

@@ -1,65 +1,70 @@
{
"id": "82279c77-8fc6-46f6-9622-3ba96f61b477",
"snapshot": "chrome",
"instruction": "Find electric cars with a maximum price of $50,000 within 50 miles of 10001.",
"source": "test_task_1",
"config": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
}
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://www.cars.com/"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
"id": "82279c77-8fc6-46f6-9622-3ba96f61b477",
"snapshot": "chrome",
"instruction": "Find electric cars with a maximum price of $50,000 within 50 miles of 10001.",
"source": "test_task_1",
"config": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
}
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://www.cars.com/"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
}
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func": "check_direct_json_object",
"result": {
"type": "active_tab_url_parse",
"goto_prefix": "https://www.",
"parse_keys": [
"list_price_max",
"maximum_distance",
"zip",
"fuel_slugs[]"
]
},
"expected": {
"type": "rule",
"rules": {
"expected": {
"list_price_max": "50000",
"maximum_distance": "50",
"zip": "10001",
"fuel_slugs[]": "electric"
}
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func":"check_direct_json_object",
"result": {
"type": "active_tab_url_parse",
"goto_prefix": "https://www.",
"parse_keys": ["list_price_max", "maximum_distance", "zip","fuel_slugs[]"]
},
"expected":{
"type": "rule",
"rules":{
"expected": {
"list_price_max": "50000",
"maximum_distance": "50",
"zip":"10001",
"fuel_slugs[]":"electric"
}
}
}
}
}
},
"proxy": true
}

@@ -1,69 +1,74 @@
{
"id": "82bc8d6a-36eb-4d2d-8801-ef714fb1e55a",
"snapshot": "chrome",
"instruction": "On next Monday, look up a flight from Mumbai to Stockholm.",
"source": "test_task_1",
"config": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
}
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://www.qatarairways.com/en-hk/homepage.html"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
"id": "82bc8d6a-36eb-4d2d-8801-ef714fb1e55a",
"snapshot": "chrome",
"instruction": "On next Monday, look up a flight from Mumbai to Stockholm.",
"source": "test_task_1",
"config": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
}
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://www.qatarairways.com/en-hk/homepage.html"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
}
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func": "check_direct_json_object",
"result": {
"type": "active_tab_url_parse",
"goto_prefix": "https://www.",
"parse_keys": [
"fromStation",
"toStation",
"departing"
],
"replace": {
"departing": "time"
}
},
"expected": {
"type": "rule_relativeTime",
"rules": {
"relativeTime": {
"from": "next Monday"
},
"expected": {
"fromStation": "BOM",
"toStation": "STO",
"time": "{Year}-{Month0D}-{Day0D}"
}
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func":"check_direct_json_object",
"result": {
"type": "active_tab_url_parse",
"goto_prefix": "https://www.",
"parse_keys": ["fromStation", "toStation", "departing"],
"replace":{
"departing": "time"
}
},
"expected":{
"type": "rule_relativeTime",
"rules":{
"relativeTime":{
"from": "next Monday"
},
"expected": {
"fromStation": "BOM",
"toStation": "STO",
"time": "{Year}-{Month0D}-{Day0D}"
}
}
}
}
}
},
"proxy": true
}

@@ -3,13 +3,13 @@
"snapshot": "chrome",
"instruction": "Could you assist me in turning off the dark mode feature in Google Chrome? I've noticed that while dark mode is great for reducing glare, it actually makes it more challenging for me to read text clearly, especially with my astigmatism.",
"source": "https://superuser.com/questions/1417973/how-to-disable-google-chrome-dark-mode",
"config": [
],
"config": [],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func": "infeasible"
}
}
},
"proxy": false
}

@@ -48,5 +48,6 @@
"expected": "true"
}
}
}
}
},
"proxy": false
}

@@ -29,13 +29,15 @@
"chrome"
],
"evaluator": {
"postconfig":[{
"type": "execute",
"parameters": {
"command": "pkill chrome",
"shell": "true"
"postconfig": [
{
"type": "execute",
"parameters": {
"command": "pkill chrome",
"shell": "true"
}
}
}],
],
"func": "exact_match",
"result": {
"type": "data_delete_automacally"
@@ -46,5 +48,6 @@
"expected": "true"
}
}
}
}
},
"proxy": false
}

@@ -1,78 +1,83 @@
{
"id": "9f3f70fc-5afc-4958-a7b7-3bb4fcb01805",
"snapshot": "chrome",
"instruction": "Browse the list of women's Nike jerseys over $60.",
"source": "test_task_1",
"config": [
"id": "9f3f70fc-5afc-4958-a7b7-3bb4fcb01805",
"snapshot": "chrome",
"instruction": "Browse the list of women's Nike jerseys over $60.",
"source": "test_task_1",
"config": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
}
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://www.nba.com/"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
}
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func": [
"is_expected_url_pattern_match",
"check_direct_json_object"
],
"conj": "and",
"result": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
}
"type": "active_tab_info"
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://www.nba.com/"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
"type": "active_tab_html_parse",
"category": "xpath",
"xpathObject": {
"/html/body/div[2]/div/div[6]/div[2]/div[2]/div/div[1]/div[4]/ul/li[2]": "money"
}
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func":["is_expected_url_pattern_match", "check_direct_json_object"],
"conj": "and",
"result": [
{
"type": "active_tab_info"
},
{
"type": "active_tab_html_parse",
"category": "xpath",
"xpathObject":{
"/html/body/div[2]/div/div[6]/div[2]/div[2]/div/div[1]/div[4]/ul/li[2]": "money"
}
}
],
"expected":[
{
"type": "rule",
"rules":{
"expected": ["\/women-jerseys\/"]
}
},
{
"type": "rule",
"rules":{
"expected": {
"money": "over $60"
}
}
}
]
}
}
"expected": [
{
"type": "rule",
"rules": {
"expected": [
"/women-jerseys/"
]
}
},
{
"type": "rule",
"rules": {
"expected": {
"money": "over $60"
}
}
}
]
},
"proxy": true
}

@@ -56,5 +56,6 @@
]
}
}
}
},
"proxy": true
}

@@ -65,5 +65,6 @@
"url": "https://www.dmv.virginia.gov/licenses-ids/license/applying/eligibility"
}
}
}
}
},
"proxy": true
}

@@ -1,69 +1,70 @@
{
"id": "a96b564e-dbe9-42c3-9ccf-b4498073438a",
"snapshot": "chrome",
"instruction": "Find discussions of community and open one with most replies.",
"source": "test_task_0",
"config": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
}
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://www.flightaware.com/"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
}
},
{
"type": "execute",
"parameters": {
"command": [
"python",
"-c",
"import pyautogui; import time; pyautogui.hotkey('alt', 'f10'); time.sleep(0.5);"
]
}
"id": "a96b564e-dbe9-42c3-9ccf-b4498073438a",
"snapshot": "chrome",
"instruction": "Find discussions of community and open one with most replies.",
"source": "test_task_0",
"config": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func":"is_expected_active_tab",
"result": {
"type": "active_tab_info",
"goto_prefix": "https://www."
},
"expected":{
"type": "rule",
"rules":{
"type": "url",
"url": "https://discussions.flightaware.com/t/the-banter-thread/4412"
}
}
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://www.flightaware.com/"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
}
},
{
"type": "execute",
"parameters": {
"command": [
"python",
"-c",
"import pyautogui; import time; pyautogui.hotkey('alt', 'f10'); time.sleep(0.5);"
]
}
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func": "is_expected_active_tab",
"result": {
"type": "active_tab_info",
"goto_prefix": "https://www."
},
"expected": {
"type": "rule",
"rules": {
"type": "url",
"url": "https://discussions.flightaware.com/t/the-banter-thread/4412"
}
}
},
"proxy": true
}

@@ -3,13 +3,13 @@
"snapshot": "chrome",
"instruction": "Could you please change the number of search results displayed on one page to 50? I find that having more results visible at once significantly enhances my research efficiency, as it reduces the need to constantly click through multiple pages. ",
"source": "https://support.google.com/chrome/thread/219988391/increase-search-results-per-page?hl=en",
"config": [
],
"config": [],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func": "infeasible"
}
}
},
"proxy": false
}

@@ -61,5 +61,6 @@
"max": 99999
}
}
}
}
},
"proxy": false
}

@@ -43,24 +43,28 @@
"chrome"
],
"evaluator": {
"func": ["exact_match", "exact_match"],
"conj": "or",
"result": [
{
"func": [
"exact_match",
"exact_match"
],
"conj": "or",
"result": [
{
"type": "url_dashPart",
"goto_prefix": "https://www.",
"partIndex": -1,
"needDeleteId": false,
"returnType": "string"
},
{
"type": "url_dashPart",
"goto_prefix": "https://www.",
"partIndex": -1,
"needDeleteId": false,
"returnType": "string"
}],
"expected": [
},
{
"type": "url_dashPart",
"goto_prefix": "https://www.",
"partIndex": -1,
"needDeleteId": false,
"returnType": "string"
}
],
"expected": [
{
"type": "rule",
"rules": {
@@ -72,6 +76,8 @@
"rules": {
"expected": "tamiflu-side-effects.html"
}
}]
}
}
]
},
"proxy": true
}

@@ -1,67 +1,66 @@
{
"id": "b4f95342-463e-4179-8c3f-193cd7241fb2",
"snapshot": "chrome",
"instruction": "Find the next available date for Albion Basin.",
"source": "test_task_1",
"config": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
}
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://www.recreation.gov/"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
"id": "b4f95342-463e-4179-8c3f-193cd7241fb2",
"snapshot": "chrome",
"instruction": "Find the next available date for Albion Basin.",
"source": "test_task_1",
"config": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
}
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://www.recreation.gov/"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
}
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func": "check_direct_json_object",
"result": {
"type": "active_tab_html_parse",
"goto_prefix": "https://www.",
"category": "class",
"class_singleObject": {},
"class_multiObject": {
"camp-sortable-column-header": {
"2": "camp-sortable-column-header"
}
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func":"check_direct_json_object",
"result": {
"type": "active_tab_html_parse",
"goto_prefix": "https://www.",
"category":"class",
"class_singleObject":{},
"class_multiObject":{
"camp-sortable-column-header":{
"2":"camp-sortable-column-header"
}
}
},
"expected":{
"type":"gotoRecreationPage_and_get_html_content",
"selector": "class",
"class": "camp-sortable-column-header",
"order": "2"
}
},
"expected": {
"type": "gotoRecreationPage_and_get_html_content",
"selector": "class",
"class": "camp-sortable-column-header",
"order": "2"
}
}
},
"proxy": true
}

@@ -1,78 +1,77 @@
{
"id": "b7895e80-f4d1-4648-bee0-4eb45a6f1fa8",
"snapshot": "chrome",
"instruction": "Find a Hotel in New York City with lowest price possible for 2 adults this weekend.",
"source": "test_task_0",
"config": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
}
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://www.tripadvisor.com/"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
"id": "b7895e80-f4d1-4648-bee0-4eb45a6f1fa8",
"snapshot": "chrome",
"instruction": "Find a Hotel in New York City with lowest price possible for 2 adults this weekend.",
"source": "test_task_0",
"config": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
}
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://www.tripadvisor.com/"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
}
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func": "check_direct_json_object",
"result": {
"type": "active_tab_html_parse",
"goto_prefix": "https://www.",
"category": "xpath",
"xpathObject": {
"/html/body/div[1]/main/div[3]/div/div[1]/div[2]/div[1]/div[2]/div/div/div/div/div[1]/div/button/div[3]": "from",
"/html/body/div[1]/main/div[3]/div/div[1]/div[2]/div[1]/div[2]/div/div/div/div/div[2]/button/div[3]": "to",
"/html/body/div[1]/main/div[3]/div/div[1]/div[2]/div[1]/div[1]/div/h1": "city",
"/html/body/div[1]/main/div[3]/div/div[1]/div[2]/div[1]/div[2]/div/div/div/div/div[3]/button/div[3]/span/span[2]": "adult",
"/html/body/div[1]/main/div[3]/div/div[2]/div/div[1]/div/div[2]/div[1]/div/div[1]/div/div[1]/div[2]/div/div[2]/div/button/div/div": "rank"
}
},
"expected": {
"type": "rule_relativeTime",
"rules": {
"relativeTime": {
"from": "this Saturday",
"to": "this Sunday"
},
"expected": {
"from": "{DoW}, {Month} {Day0D}",
"to": "{DoW}, {Month} {Day0D}",
"city": "New York City Hotels",
"adult": "2 adults",
"rank": "Price (low to high)"
}
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func":"check_direct_json_object",
"result": {
"type": "active_tab_html_parse",
"goto_prefix": "https://www.",
"category": "xpath",
"xpathObject":{
"/html/body/div[1]/main/div[3]/div/div[1]/div[2]/div[1]/div[2]/div/div/div/div/div[1]/div/button/div[3]":"from",
"/html/body/div[1]/main/div[3]/div/div[1]/div[2]/div[1]/div[2]/div/div/div/div/div[2]/button/div[3]":"to",
"/html/body/div[1]/main/div[3]/div/div[1]/div[2]/div[1]/div[1]/div/h1":"city",
"/html/body/div[1]/main/div[3]/div/div[1]/div[2]/div[1]/div[2]/div/div/div/div/div[3]/button/div[3]/span/span[2]":"adult",
"/html/body/div[1]/main/div[3]/div/div[2]/div/div[1]/div/div[2]/div[1]/div/div[1]/div/div[1]/div[2]/div/div[2]/div/button/div/div":"rank"
}
},
"expected":
{
"type": "rule_relativeTime",
"rules":{
"relativeTime": {
"from": "this Saturday",
"to": "this Sunday"
},
"expected": {
"from": "{DoW}, {Month} {Day0D}",
"to": "{DoW}, {Month} {Day0D}",
"city": "New York City Hotels",
"adult": "2 adults",
"rank": "Price (low to high)"
}
}
}
}
}
},
"proxy": true
}

@@ -36,8 +36,12 @@
"expected": {
"type": "rule",
"rules": {
"expected": ["Microsoft Bing", "Bing"]
"expected": [
"Microsoft Bing",
"Bing"
]
}
}
}
}
},
"proxy": false
}

@@ -1,69 +1,71 @@
{
"id": "c1fa57f3-c3db-4596-8f09-020701085416",
"snapshot": "chrome",
"instruction": "Open the baggage fee calculator.",
"source": "test_task_1",
"config": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
}
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://www.united.com/en/us"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
}
},
{
"type": "execute",
"parameters": {
"command": [
"python",
"-c",
"import pyautogui; import time; pyautogui.hotkey('alt', 'f10'); time.sleep(0.5);"
]
}
"id": "c1fa57f3-c3db-4596-8f09-020701085416",
"snapshot": "chrome",
"instruction": "Open the baggage fee calculator.",
"source": "test_task_1",
"config": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
}
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://www.united.com/en/us"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
}
},
{
"type": "execute",
"parameters": {
"command": [
"python",
"-c",
"import pyautogui; import time; pyautogui.hotkey('alt', 'f10'); time.sleep(0.5);"
]
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func":"is_expected_url_pattern_match",
"result": {
"type": "active_tab_info",
"goto_prefix": "https://www."
},
"expected":{
"type": "rule",
"rules":{
"expected": ["checked-bag-fee-calculator"]
}
}
}
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func": "is_expected_url_pattern_match",
"result": {
"type": "active_tab_info",
"goto_prefix": "https://www."
},
"expected": {
"type": "rule",
"rules": {
"expected": [
"checked-bag-fee-calculator"
]
}
}
},
"proxy": true
}

@@ -43,20 +43,21 @@
"chrome"
],
"evaluator": {
"func": "is_expected_url_pattern_match",
"result": {
"type": "active_url_from_accessTree",
"goto_prefix": "https://www."
},
"expected": {
"type": "rule",
"rules": {
"expected": [
"AgeAppropriate:Kids",
"search=spider[-%20]?man%20toys",
"S=4"
]
}
"func": "is_expected_url_pattern_match",
"result": {
"type": "active_url_from_accessTree",
"goto_prefix": "https://www."
},
"expected": {
"type": "rule",
"rules": {
"expected": [
"AgeAppropriate:Kids",
"search=spider[-%20]?man%20toys",
"S=4"
]
}
}
}
},
"proxy": true
}

@@ -1,102 +1,109 @@
{
"id": "da46d875-6b82-4681-9284-653b0c7ae241",
"snapshot": "chrome",
"instruction": "Schedule an appointment to apply for transportation access pass in the Charlie Card store on the first Monday four months later, 10:15 am, fill in my details (James Smith, james.smith@gmail.com). And don not click \"book\" directly. Let me review it.",
"source": "test_task_2",
"config": [
"id": "da46d875-6b82-4681-9284-653b0c7ae241",
"snapshot": "chrome",
"instruction": "Schedule an appointment to apply for transportation access pass in the Charlie Card store on the first Monday four months later, 10:15 am, fill in my details (James Smith, james.smith@gmail.com). And don not click \"book\" directly. Let me review it.",
"source": "test_task_2",
"config": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
}
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://www.mbta.com/"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
}
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func": [
"is_expected_url_pattern_match",
"check_direct_json_object",
"check_direct_json_object"
],
"conj": "and",
"result": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
"type": "active_tab_info"
},
{
"type": "active_tab_html_parse",
"category": "class",
"class_singleObject": {},
"class_multiObject": {
"breakword": {
"1": "content",
"2": "time"
}
}
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://www.mbta.com/"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
"type": "active_tab_html_parse",
"category": "input",
"inputObject": {
"/html/body/div/div/form/div[7]/div/div/div[1]/input[1]": "name",
"/html/body/div/div/form/div[7]/div/div/div[1]/input[2]": "mail"
}
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func":["is_expected_url_pattern_match", "check_direct_json_object", "check_direct_json_object"],
"conj": "and",
"result": [
{
"type": "active_tab_info"
},
{
"type": "active_tab_html_parse",
"category": "class",
"class_singleObject":{},
"class_multiObject":{
"breakword":{
"1": "content",
"2": "time"
}
}
},
{
"type": "active_tab_html_parse",
"category": "input",
"inputObject":{
"/html/body/div/div/form/div[7]/div/div/div[1]/input[1]": "name",
"/html/body/div/div/form/div[7]/div/div/div[1]/input[2]": "mail"
}
}
],
"expected":[
{
"type": "rule",
"rules":{
"expected": ["CharlieCardStoreAppointments@mbta.com\/bookings\/"]
}
},
{
"type": "rule_relativeTime",
"rules":{
"relativeTime":{
"from":"first monday four months later"
},
"expected": {
"content": "Apply for Transportation Access Pass (TAP) CharlieCard non-auto approval",
"time": "{MonthFull} {Day0D}, 10:15 am"
}
}
},
{
"type": "rule",
"rules":{
"expected": {
"name": "James Smith",
"mail": "james.smith@gmail.com"
}
}
}
]
}
}
"expected": [
{
"type": "rule",
"rules": {
"expected": [
"CharlieCardStoreAppointments@mbta.com/bookings/"
]
}
},
{
"type": "rule_relativeTime",
"rules": {
"relativeTime": {
"from": "first monday four months later"
},
"expected": {
"content": "Apply for Transportation Access Pass (TAP) CharlieCard non-auto approval",
"time": "{MonthFull} {Day0D}, 10:15 am"
}
}
},
{
"type": "rule",
"rules": {
"expected": {
"name": "James Smith",
"mail": "james.smith@gmail.com"
}
}
}
]
},
"proxy": true
}

@@ -48,5 +48,6 @@
"path": "https://lilianweng.github.io/posts/2023-06-23-agent/",
"dest": "LLM Powered Autonomous Agents _ Lil'Log_gold.pdf"
}
}
}
},
"proxy": true
}

@@ -65,5 +65,6 @@
"url": "https://www.nfl.com/scores/2019/POST4"
}
}
}
}
},
"proxy": true
}

@@ -1,59 +1,61 @@
{
"id": "f3b19d1e-2d48-44e9-b4e1-defcae1a0197",
"snapshot": "chrome",
"instruction": "Find help page about buying tickets.",
"source": "test_task_0",
"config": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
}
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://seatgeek.com/"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
}
"id": "f3b19d1e-2d48-44e9-b4e1-defcae1a0197",
"snapshot": "chrome",
"instruction": "Find help page about buying tickets.",
"source": "test_task_0",
"config": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
}
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://seatgeek.com/"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func":"is_expected_url_pattern_match",
"result": {
"type": "active_tab_info",
"goto_prefix": "https://www."
},
"expected":{
"type": "rule",
"rules":{
"expected": ["Buying-Tickets"]
}
}
}
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func": "is_expected_url_pattern_match",
"result": {
"type": "active_tab_info",
"goto_prefix": "https://www."
},
"expected": {
"type": "rule",
"rules": {
"expected": [
"Buying-Tickets"
]
}
}
},
"proxy": true
}

@@ -65,5 +65,6 @@
"url": "https://www.apple.com/iphone/compare/?modelList=iphone-15-pro-max,iphone-15-pro,iphone-13-pro-max"
}
}
}
}
},
"proxy": true
}

@@ -1,74 +1,82 @@
{
"id": "f79439ad-3ee8-4f99-a518-0eb60e5652b0",
"snapshot": "chrome",
"instruction": "Search for a one way flight from Dublin to Vienna on 10th next month for 2 adults.",
"source": "test_task_2",
"config": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
}
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://www.ryanair.com/gb/en"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
"id": "f79439ad-3ee8-4f99-a518-0eb60e5652b0",
"snapshot": "chrome",
"instruction": "Search for a one way flight from Dublin to Vienna on 10th next month for 2 adults.",
"source": "test_task_2",
"config": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
}
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://www.ryanair.com/gb/en"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
}
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func": "check_direct_json_object",
"result": {
"type": "active_tab_url_parse",
"goto_prefix": "https://www.",
"parse_keys": [
"originIata",
"destinationIata",
"tpAdults",
"tpTeens",
"tpChildren",
"tpStartDate",
"isReturn"
],
"replace": {
"tpStartDate": "time"
}
},
"expected": {
"type": "rule_relativeTime",
"rules": {
"relativeTime": {
"from": "10th next month"
},
"expected": {
"originIata": "DUB",
"destinationIata": "VIE",
"tpAdults": "2",
"tpTeens": "0",
"tpChildren": "0",
"time": "{Year}-{Month0D}-{DayD}",
"isReturn": "false"
}
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func":"check_direct_json_object",
"result": {
"type": "active_tab_url_parse",
"goto_prefix": "https://www.",
"parse_keys":["originIata", "destinationIata", "tpAdults", "tpTeens", "tpChildren", "tpStartDate", "isReturn"],
"replace":{
"tpStartDate": "time"
}
},
"expected":{
"type": "rule_relativeTime",
"rules":{
"relativeTime": {
"from": "10th next month"
},
"expected": {
"originIata": "DUB",
"destinationIata": "VIE",
"tpAdults": "2",
"tpTeens": "0",
"tpChildren": "0",
"time": "{Year}-{Month0D}-{DayD}",
"isReturn":"false"
}
}
}
}
}
},
"proxy": true
}

@@ -1,76 +1,76 @@
{
"id": "fc6d8143-9452-4171-9459-7f515143419a",
"snapshot": "chrome",
"instruction": "Find the status of tomorrow flights from New York airports to Columbus in Ohio.",
"source": "test_task_0",
"config": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
}
"id": "fc6d8143-9452-4171-9459-7f515143419a",
"snapshot": "chrome",
"instruction": "Find the status of tomorrow flights from New York airports to Columbus in Ohio.",
"source": "test_task_0",
"config": [
{
"type": "launch",
"parameters": {
"command": [
"google-chrome",
"--remote-debugging-port=1337"
]
}
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://www.delta.com/"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
}
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func": "check_direct_json_object",
"result": {
"type": "active_tab_html_parse",
"goto_prefix": "https://www.",
"category": "class",
"class_singleObject": {
"search-date": "time"
},
{
"type": "launch",
"parameters": {
"command": [
"socat",
"tcp-listen:9222,fork",
"tcp:localhost:1337"
]
}
},
{
"type": "chrome_open_tabs",
"parameters": {
"urls_to_open": [
"https://www.delta.com/"
]
}
},
{
"type": "activate_window",
"parameters": {
"window_name": "Google Chrome"
"class_multiObject": {
"search-segment-cities__city": {
"0": "start",
"1": "end"
}
}
],
"trajectory": "trajectories/",
"related_apps": [
"chrome"
],
"evaluator": {
"func":"check_direct_json_object",
"result": {
"type": "active_tab_html_parse",
"goto_prefix": "https://www.",
"category": "class",
"class_singleObject":{
"search-date": "time"
},
"class_multiObject":{
"search-segment-cities__city": {
"0": "start",
"1": "end"
}
}
},
"expected": {
"type": "rule_relativeTime",
"rules": {
"relativeTime": {
"from": "tomorrow"
},
"expected":{
"type": "rule_relativeTime",
"rules":{
"relativeTime": {
"from": "tomorrow"
},
"expected": {
"start": "NYC",
"end": "CMH",
"time": "{DoW}, {Month} {DayD}, {Year}"
}
}
"expected": {
"start": "NYC",
"end": "CMH",
"time": "{DoW}, {Month} {DayD}, {Year}"
}
}
}
}
},
"proxy": true
}

@@ -31,5 +31,6 @@
],
"evaluator": {
"func": "infeasible"
}
},
"proxy": false
}

@@ -91,5 +91,6 @@
"path": "/home/user/Desktop/palette_computer.png",
"dest": "palette_computer.png"
}
}
},
"proxy": false
}

@@ -95,5 +95,6 @@
"path": "/home/user/Desktop/dog_without_background.png",
"dest": "dog_without_background.png"
}
}
},
"proxy": false
}

@@ -38,5 +38,6 @@
],
"evaluator": {
"func": "infeasible"
}
},
"proxy": false
}

@@ -22,5 +22,6 @@
],
"evaluator": {
"func": "infeasible"
}
},
"proxy": false
}

@@ -91,5 +91,6 @@
"path": "/home/user/Desktop/edited_colorful.png",
"dest": "edited_colorful.png"
}
}
},
"proxy": false
}

@@ -31,5 +31,6 @@
],
"evaluator": {
"func": "infeasible"
}
},
"proxy": false
}

@@ -3,13 +3,13 @@
"snapshot": "gimp",
"instruction": "Could you help me download the logo of the University of Hong Kong in \".png\" format within GIMP?",
"source": "",
"config": [
],
"config": [],
"trajectory": "trajectories/",
"related_apps": [
"gimp"
],
"evaluator": {
"func": "infeasible"
}
}
},
"proxy": false
}

@@ -22,5 +22,6 @@
],
"evaluator": {
"func": "infeasible"
}
},
"proxy": false
}

@@ -91,5 +91,6 @@
"path": "/home/user/Desktop/berry_mirror.png",
"dest": "berry_mirror.png"
}
}
},
"proxy": false
}

@@ -95,5 +95,6 @@
"path": "/home/user/Desktop/green_background_with_object.png",
"dest": "green_background_with_object.png"
}
}
},
"proxy": false
}

@@ -50,5 +50,6 @@
"file_name": "gimprc",
"dest": "gimprc"
}
}
},
"proxy": false
}

@@ -41,5 +41,6 @@
"path": "/home/user/Desktop/export.jpg",
"dest": "export.jpg"
}
}
},
"proxy": false
}

@@ -91,5 +91,6 @@
"path": "/home/user/Desktop/edited_darker.png",
"dest": "edited_darker.png"
}
}
},
"proxy": false
}

@@ -1,54 +1,55 @@
{
"id": "7b7617bd-57cc-468e-9c91-40c4ec2bcb3d",
"snapshot": "gimp",
"instruction": "Set the minimum number of undo steps to 100.",
"source": "https://www.youtube.com/watch?v=G_PjQAy0iiU",
"config": [
{
"type": "launch",
"parameters": {
"command": [
"gimp"
]
}
}
],
"trajectory": "trajectories/",
"related_apps": [
"gimp"
],
"evaluator": {
"postconfig": [
{
"type": "execute",
"parameters": {
"command": [
"python3",
"-c",
"import pyautogui; pyautogui.hotkey([\"ctrl\", \"q\"]);"
]
}
},
{
"type": "sleep",
"parameters": {
"seconds": 0.5
}
}
],
"func": "check_config_status",
"expected": {
"type": "rule",
"rules": {
"type:": "key-value",
"key": "undo-levels",
"value": "100"
}
},
"result": {
"type": "gimp_config_file",
"file_name": "gimprc",
"dest": "gimprc"
}
"id": "7b7617bd-57cc-468e-9c91-40c4ec2bcb3d",
"snapshot": "gimp",
"instruction": "Set the minimum number of undo steps to 100.",
"source": "https://www.youtube.com/watch?v=G_PjQAy0iiU",
"config": [
{
"type": "launch",
"parameters": {
"command": [
"gimp"
]
}
}
],
"trajectory": "trajectories/",
"related_apps": [
"gimp"
],
"evaluator": {
"postconfig": [
{
"type": "execute",
"parameters": {
"command": [
"python3",
"-c",
"import pyautogui; pyautogui.hotkey([\"ctrl\", \"q\"]);"
]
}
},
{
"type": "sleep",
"parameters": {
"seconds": 0.5
}
}
],
"func": "check_config_status",
"expected": {
"type": "rule",
"rules": {
"type:": "key-value",
"key": "undo-levels",
"value": "100"
}
},
"result": {
"type": "gimp_config_file",
"file_name": "gimprc",
"dest": "gimprc"
}
},
"proxy": false
}

@@ -31,5 +31,6 @@
],
"evaluator": {
"func": "infeasible"
}
},
"proxy": false
}

@@ -65,5 +65,6 @@
"file_name": "action-history",
"dest": "action-history"
}
}
},
"proxy": false
}

@@ -62,5 +62,6 @@
"file_name": "gimprc",
"dest": "gimprc"
}
}
},
"proxy": false
}

@@ -113,5 +113,6 @@
"dest": "resized.png"
}
]
}
},
"proxy": false
}

@@ -1,54 +1,55 @@
{
"id": "d52d6308-ec58-42b7-a2c9-de80e4837b2b",
"snapshot": "gimp",
"instruction": "Could you help me remove the dock on the left side of the screen?",
"source": "https://superuser.com/questions/1447106/how-to-get-rid-of-the-gimp-tool-options-box",
"config": [
{
"type": "launch",
"parameters": {
"command": [
"gimp"
]
}
}
],
"trajectory": "trajectories/",
"related_apps": [
"gimp"
],
"evaluator": {
"postconfig": [
{
"type": "execute",
"parameters": {
"command": [
"python3",
"-c",
"import pyautogui; pyautogui.hotkey([\"ctrl\", \"q\"]);"
]
}
},
{
"type": "sleep",
"parameters": {
"seconds": 0.5
}
}
],
"func": "check_config_status",
"expected": {
"type": "rule",
"rules": {
"type:": "key-value",
"key": "hide-docks",
"value": "yes"
}
},
"result": {
"type": "gimp_config_file",
"file_name": "sessionrc",
"dest": "sessionrc"
}
"id": "d52d6308-ec58-42b7-a2c9-de80e4837b2b",
"snapshot": "gimp",
"instruction": "Could you help me remove the dock on the left side of the screen?",
"source": "https://superuser.com/questions/1447106/how-to-get-rid-of-the-gimp-tool-options-box",
"config": [
{
"type": "launch",
"parameters": {
"command": [
"gimp"
]
}
}
],
"trajectory": "trajectories/",
"related_apps": [
"gimp"
],
"evaluator": {
"postconfig": [
{
"type": "execute",
"parameters": {
"command": [
"python3",
"-c",
"import pyautogui; pyautogui.hotkey([\"ctrl\", \"q\"]);"
]
}
},
{
"type": "sleep",
"parameters": {
"seconds": 0.5
}
}
],
"func": "check_config_status",
"expected": {
"type": "rule",
"rules": {
"type:": "key-value",
"key": "hide-docks",
"value": "yes"
}
},
"result": {
"type": "gimp_config_file",
"file_name": "sessionrc",
"dest": "sessionrc"
}
},
"proxy": false
}

@@ -31,5 +31,6 @@
],
"evaluator": {
"func": "infeasible"
}
},
"proxy": false
}

@@ -3,13 +3,13 @@
"snapshot": "gimp",
"instruction": "Could you tone down the brightness of my photo at desktop?",
"source": "https://www.quora.com/How-do-I-edit-a-photo-in-GIMP",
"config": [
],
"config": [],
"trajectory": "trajectories/",
"related_apps": [
"gimp"
],
"evaluator": {
"func": "infeasible"
}
}
},
"proxy": false
}

@@ -86,5 +86,6 @@
"path": "/home/user/Desktop/leftside_textbox.png",
"dest": "leftside_textbox.png"
}
}
},
"proxy": false
}

@@ -90,5 +90,6 @@
"path": "/home/user/Desktop/Triangle_In_The_Middle.png",
"dest": "Triangle_In_The_Middle.png"
}
}
},
"proxy": false
}

@@ -91,5 +91,6 @@
"path": "/home/user/Desktop/berries_contrast.png",
"dest": "berries_contrast.png"
}
}
},
"proxy": false
}

@@ -3,13 +3,13 @@
"snapshot": "gimp",
"instruction": "Blue is my favorite color, so could you help me change the color theme of GIMP to \"Blue\"?",
"source": "",
"config": [
],
"config": [],
"trajectory": "trajectories/",
"related_apps": [
"gimp"
],
"evaluator": {
"func": "infeasible"
}
}
},
"proxy": false
}

@@ -78,5 +78,6 @@
}
]
}
}
},
"proxy": false
}

@@ -87,5 +87,6 @@
}
]
}
}
},
"proxy": false
}

@@ -83,5 +83,6 @@
}
]
}
}
},
"proxy": false
}

@@ -78,5 +78,6 @@
}
]
}
}
},
"proxy": false
}

@@ -86,5 +86,6 @@
}
]
}
}
},
"proxy": false
}

@@ -105,5 +105,6 @@
}
]
}
}
},
"proxy": false
}

@@ -91,5 +91,6 @@
}
]
}
}
},
"proxy": false
}

View File

@@ -82,5 +82,6 @@
}
]
}
}
},
"proxy": false
}

Some files were not shown because too many files have changed in this diff