diff --git a/PUBLIC_EVALUATION_GUIDELINE.md b/PUBLIC_EVALUATION_GUIDELINE.md
index de36e66..b90f41c 100644
--- a/PUBLIC_EVALUATION_GUIDELINE.md
+++ b/PUBLIC_EVALUATION_GUIDELINE.md
@@ -8,6 +8,7 @@ We have built an AWS-based platform for large-scale parallel evaluation of OSWor
All instances use a preconfigured AMI to ensure a consistent environment.
+

## 1. Platform Deployment & Connection
@@ -89,6 +90,8 @@ In the **Access keys** section, click **"Create access key"** to generate your o
pubeval5

+

+
## 2. Environment Setup
### 2.1 Google Drive Integration
@@ -128,6 +131,7 @@ export AWS_SUBNET_ID="subnet-0a4b0c5b8f6066712"
export AWS_SECURITY_GROUP_ID="sg-08a53433e9b4abde6"
```
+

## 3. Running Evaluations
@@ -155,6 +159,7 @@ Key Parameters:
- `--test_all_meta_path`: Path to the test set metadata
- `--region`: AWS region
+

## 4. Viewing Results