Tianbao Xie
4e11eafd1d
Robust Evaluation, Blocking File Open, Grader Sensitivity, and LibreOffice Writer Fixes (#217)
...
* Refactor evaluator structure in LibreOffice Writer example JSON to support multiple expected and result files, enhancing evaluation flexibility.
* Update instance type to t3.large and add VNC access URL logging for allocated VMs, enhancing remote access capabilities.
* Update time format in get_vm_file function to include hours, minutes, and seconds for more precise file naming with time suffix.
* Add more delay for 936321ce-5236-426a-9a20-e0e3c5dc536f; support one more potential solution.
* Enhance SetupController with a configurable retry limit and improved error handling for file-opening requests. Introduce a new function to compare unique training records, and update logging for better debugging. Adjust JSON examples for evaluation to support multiple expected and result files.
* Clean debug code
---------
Co-authored-by: yuanmengqi <yuanmengqi@mail.ustc.edu.cn>
2025-06-16 21:37:19 +08:00
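The time-suffix change to get_vm_file described in the commit above can be illustrated with a small sketch. The helper name `with_time_suffix` and the exact format string are assumptions made for illustration; the log does not show the format the project actually uses.

```python
from datetime import datetime
from pathlib import Path


def with_time_suffix(file_name: str, now: datetime) -> str:
    """Insert a date-and-time suffix before the file extension (hypothetical helper)."""
    path = Path(file_name)
    # %H%M%S adds hours, minutes, and seconds, so two files fetched on the
    # same day no longer end up with the same name.
    suffix = now.strftime("%Y%m%d_%H%M%S")
    return f"{path.stem}_{suffix}{path.suffix}"


print(with_time_suffix("report.docx", datetime(2025, 6, 16, 21, 37, 19)))
# prints "report_20250616_213719.docx"
```

Passing the timestamp in as an argument, rather than calling `datetime.now()` inside the helper, keeps the naming logic easy to test.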
Kaixin Li
347238e17e
Get VM IP again when getting screenshot fails (#215)
...
In rare cases, the IP of the VM changes after it launches. We can get the IP every time we retry to ensure the correct connection.
2025-06-16 02:40:40 +08:00
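The idea in the commit above, looking up the VM IP again on every retry instead of caching it once, might be sketched as follows. The function and parameter names (`screenshot_with_retry`, `get_ip`, `fetch`) are hypothetical and do not reflect the project's actual code.

```python
from typing import Callable, Optional


def screenshot_with_retry(get_ip: Callable[[], str],
                          fetch: Callable[[str], bytes],
                          max_retries: int = 3) -> bytes:
    """Fetch a screenshot, re-resolving the VM IP before each attempt."""
    last_error: Optional[Exception] = None
    for _ in range(max_retries):
        # Resolve the IP inside the loop: if the address changed after launch,
        # the next attempt connects to the new one.
        ip = get_ip()
        try:
            return fetch(ip)
        except ConnectionError as exc:
            last_error = exc
    raise RuntimeError(f"screenshot failed after {max_retries} attempts") from last_error
```

A caller would pass the provider's IP lookup as `get_ip` and the HTTP request as `fetch`; both are injected here so the retry behavior can be exercised without a real VM.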
Yuan Mengqi
40354322e8
fix pub eval readme typo (#214)
...
* update clean code
* fix pub eval readme typo
2025-06-10 22:57:16 +08:00
Yuan Mengqi
362499330e
update clean code (#213)
2025-06-10 22:18:03 +08:00
Yuan Mengqi
4ce05b89ae
Merge pull request #212 from yuanmengqi/aws_clean
...
AWS OSWorld Provider Enhancement, Proxy Integration, New Agent Operator Implementation
2025-06-10 21:44:18 +08:00
yuanmengqi
8a1fc5c385
edit pub eval readme
2025-06-10 13:37:26 +00:00
yuanmengqi
b8d229cdb3
edit pub eval readme
2025-06-10 13:36:48 +00:00
yuanmengqi
fbe88799cf
edit pub eval readme
2025-06-10 13:36:03 +00:00
yuanmengqi
3b5e4f3b15
edit pub eval readme
2025-06-10 13:34:42 +00:00
yuanmengqi
2d5439d062
edit pub eval readme
2025-06-10 13:32:24 +00:00
yuanmengqi
2d3347ca3e
edit pub eval readme
2025-06-10 13:28:54 +00:00
yuanmengqi
1b09d63cb2
edit pub eval readme
2025-06-10 13:27:53 +00:00
yuanmengqi
2bae228803
merge upstream
2025-06-10 13:23:03 +00:00
yuanmengqi
7315aec6e6
clean code
2025-06-10 04:06:54 +00:00
yuanmengqi
caf487b7cc
Merge remote-tracking branch 'upstream/feat/aws-provider-support'
2025-06-10 02:36:46 +00:00
yuanmengqi
3da32fe5cf
update operator prompt
2025-06-10 02:35:53 +00:00
yuanmengqi
caaa4e5baa
fix: update AMI ID for us-east-1 region in AWS manager
2025-06-10 02:32:24 +00:00
yuanmengqi
02387f2cee
feat: update DesktopEnv to support VMware provider and add proxy configuration
...
- Changed default provider name from "aws" to "vmware".
- Introduced `enable_proxy` parameter to control proxy support.
- Enhanced retry logic in the `reset` method to use a constant for maximum retries.
- Updated proxy handling to respect the new `enable_proxy` setting.
2025-06-09 16:35:13 +00:00
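A rough sketch of how the four changes in the commit above could fit together is shown below. Everything except the names `enable_proxy` and the "vmware" default is an assumption: the class `SketchEnv`, the value of `MAX_RETRIES`, the `"proxy"` key in the task config, and the `run_setup` callback are hypothetical stand-ins, not the real DesktopEnv API.

```python
from typing import Callable

MAX_RETRIES = 5  # assumed value; the log does not state the real constant


class SketchEnv:
    """Hypothetical stand-in used only to illustrate the commit message."""

    def __init__(self, provider_name: str = "vmware", enable_proxy: bool = False):
        self.provider_name = provider_name
        self.enable_proxy = enable_proxy

    def wants_proxy(self, task_config: dict) -> bool:
        # A task's proxy request is honored only when the environment allows it.
        return self.enable_proxy and bool(task_config.get("proxy", False))

    def reset(self, task_config: dict, run_setup: Callable[[bool], bool]) -> bool:
        use_proxy = self.wants_proxy(task_config)
        # The retry bound comes from a named constant rather than a magic number.
        for _ in range(MAX_RETRIES):
            if run_setup(use_proxy):
                return True
        return False
```

With this shape, `SketchEnv().wants_proxy({"proxy": True})` is false because the proxy is disabled by default, while `SketchEnv(enable_proxy=True)` would honor the same request.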
adlsdztony
168a2694f2
Merge branch 'feat/aws-provider-support' of https://github.com/xlang-ai/OSWorld into feat/aws-provider-support
2025-06-09 16:07:48 +00:00
adlsdztony
bfae51d74d
fix: enhance setup method with retry logic and return status
2025-06-09 16:07:13 +00:00
yuanmengqi
692486f8e7
add GDrive guideline
2025-06-09 14:59:47 +00:00
yuanmengqi
630f92fd7c
fix: correct URL encoding in JSON examples for invoice paths
2025-06-09 08:06:27 +00:00
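For the URL-encoding fix above, a minimal sketch using Python's standard library is shown here. The helper name `encode_url_path` and the example URL are made up for illustration; the affected paths in the JSON examples are not shown in this log.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def encode_url_path(url: str) -> str:
    """Percent-encode special characters in the path component of a URL."""
    parts = urlsplit(url)
    # Keep "/" as the path separator and "%" so that sequences that are
    # already encoded are not encoded a second time.
    safe_path = quote(parts.path, safe="/%")
    return urlunsplit((parts.scheme, parts.netloc, safe_path, parts.query, parts.fragment))


print(encode_url_path("https://example.com/files/invoice 2025 (final).pdf"))
# prints "https://example.com/files/invoice%202025%20%28final%29.pdf"
```

Treating `%` as safe makes the helper idempotent on URLs it has already encoded, at the cost of not escaping a literal percent sign in a file name.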
yuanmengqi
b41339c5e5
Merge remote-tracking branch 'upstream/feat/aws-provider-support'
2025-06-09 04:27:07 +00:00
yuanmengqi
aee1207fff
fix error
2025-06-09 04:20:59 +00:00
yuanmengqi
40edf0aba6
Merge remote-tracking branch 'origin/main' into feat/aws-provider-support
2025-06-08 14:40:38 +00:00
yuanmengqi
6029c9d496
Merge branch 'main' into feat/aws-provider-support
2025-06-08 14:24:44 +00:00
tsuky_chen
e55810809e
Fix libreoffice impress evaluation (#209)
...
Co-authored-by: chenjix <211250101@smail.nju.edu.cn>
2025-06-08 22:12:56 +08:00
yuanmengqi
3e541bb393
Merge remote-tracking branch 'upstream/feat/aws-provider-support'
2025-06-08 04:01:35 +00:00
yuanmengqi
d8872634ee
edit prompt
2025-06-08 03:59:31 +00:00
yuanmengqi
eaf7b9e48f
refactor: replace hardcoded AMI ID with dynamic retrieval from IMAGE_ID_MAP in AWS DesktopEnv initialization
2025-06-07 21:17:18 +00:00
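The refactor above replaces a hardcoded AMI ID with a lookup keyed by region. A sketch of that pattern follows; `IMAGE_ID_MAP` is the name used in the commit message, but the AMI values below are placeholders and the `resolve_ami` helper is hypothetical.

```python
# Placeholder IDs for illustration only; these are not real AMIs.
IMAGE_ID_MAP = {
    "us-east-1": "ami-00000000000000001",
    "ap-east-1": "ami-00000000000000002",
}


def resolve_ami(region: str, image_id_map: dict = IMAGE_ID_MAP) -> str:
    """Return the AMI ID registered for a region, failing loudly if it is missing."""
    try:
        return image_id_map[region]
    except KeyError:
        known = ", ".join(sorted(image_id_map))
        raise ValueError(f"no AMI registered for region {region!r}; known regions: {known}") from None


print(resolve_ami("us-east-1"))
# prints "ami-00000000000000001"
```

Keeping the mapping in one place means an AMI update for a region, such as the us-east-1 fixes elsewhere in this log, touches a single dictionary entry.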
yuanmengqi
8853671220
fix: enhance instruction clarity and adjust timing in automation script for LibreOffice Impress example
2025-06-07 21:17:00 +00:00
yuanmengqi
ca65022137
fix: update AMI ID for us-east-1 region in AWS manager configuration
2025-06-07 21:16:26 +00:00
yuanmengqi
bba791f690
Merge remote-tracking branch 'chenjix/fix_impress' into feat/aws-provider-support
2025-06-07 17:41:50 +00:00
yuanmengqi
9fa768d24d
refactor: update URLs in multiple JSON files to ensure proper encoding of special characters
2025-06-07 17:26:45 +00:00
yuanmengqi
54953f82fb
Merge branch 'feat/aws-provider-support'
2025-06-07 16:02:39 +00:00
yuanmengqi
8471394cc1
add branch feat/aws-provider-support
2025-06-07 15:57:18 +00:00
yuanmengqi
f48d80002f
Merge remote-tracking branch 'upstream/feat/aws-provider-support'
2025-06-07 13:22:53 +00:00
yuanmengqi
c57b1d4e7a
eval update
2025-06-07 13:19:22 +00:00
yuanmengqi
bbd4401ff5
Merge branch 'feat/aws-provider-support' of https://github.com/xlang-ai/OSWorld into feat/aws-provider-support
2025-06-07 11:40:26 +00:00
yuanmengqi
fc3ef6b2be
fix: update AMI ID for us-east-1 region in AWS manager configuration
2025-06-07 11:40:09 +00:00
adlsdztony
0375f9d59f
Merge branch 'feat/aws-provider-support' of https://github.com/xlang-ai/OSWorld into feat/aws-provider-support
2025-06-07 11:24:56 +00:00
adlsdztony
493abdeeab
feat&refactor: add proxy setup functionality and update .gitignore for proxy config file
2025-06-07 11:24:49 +00:00
yuanmengqi
8d0ff7c99c
refactor: update VLC command configurations to suppress audio and video title display across multiple JSON examples
2025-06-07 09:02:49 +00:00
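The VLC change above could look roughly like the following when a launch command is assembled. `--no-audio` and `--no-video-title-show` are standard VLC command-line options, but whether the JSON examples use exactly these flags is an assumption, as is the `vlc_command` helper.

```python
from typing import List, Sequence


def vlc_command(media_path: str, extra_args: Sequence[str] = ()) -> List[str]:
    """Build a VLC launch command with audio and the title overlay disabled."""
    return [
        "vlc",
        "--no-audio",             # do not play the audio track
        "--no-video-title-show",  # do not overlay the media title on the video
        *extra_args,
        media_path,
    ]


print(vlc_command("/home/user/video.mp4"))
# prints ['vlc', '--no-audio', '--no-video-title-show', '/home/user/video.mp4']
```

Returning an argument list rather than a single string lets the result be passed directly to `subprocess.run` without shell quoting concerns.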
yuanmengqi
4ade4114da
add problems from the community
2025-06-07 06:50:15 +00:00
yuanmengqi
e61acece84
problems from the community
2025-06-07 05:30:40 +00:00
yuanmengqi
a146c1e0b7
edit prompt
2025-06-07 05:21:04 +00:00
chenjix
5959c0846e
Fix libreoffice impress evaluation
2025-06-07 00:13:38 +08:00
adlsdztony
7d25f902a4
refactor&fix: update README and main.py for improved configuration and task status handling
2025-06-06 12:55:13 +00:00
yuanmengqi
64177045b5
Merge remote-tracking branch 'upstream/feat/aws-provider-support'
2025-06-06 10:22:56 +00:00
yuanmengqi
4ea24ddfd3
add proxy
2025-06-06 09:41:22 +00:00