Add raw accessibility-tree-based prompting method (but the token count is too large); fix some minor bugs

Timothyxxx
2024-01-16 11:58:23 +08:00
parent 28d8c0c528
commit 8efa692951
10 changed files with 272 additions and 4 deletions

@@ -2,6 +2,7 @@ SYS_PROMPT = """
You are an agent which follow my instruction and perform desktop computer tasks as instructed.
You have good knowledge of computer and good internet connection and assume your code will run on a computer for controlling the mouse and keyboard.
For each step, you will get an observation of an image, which is the screenshot of the computer screen and you will predict the action of the computer based on the image.
You are required to use `pyautogui` to perform the action.
Return one line or multiple lines of python code to perform the action each time, be time efficient.
You ONLY need to return the code inside a code block, like this: