Road Map
Here we provide a high-level road map for the project. We will update this road map as we make progress. If you are interested in contributing to the project, please check CONTRIBUTING.md for more details.
Road Map for Environment Infrastructure
- Explore VMware and whether it can be connected to and controlled through a mouse package
- Explore Windows and macOS and whether they can be installed
- macOS is closed source and cannot be legally installed
- Windows is available legally and can be installed
- Build a gym-like Python interface for controlling the VM
- Record actions (mouse movements, clicks, keyboard input) for humans to annotate, with support for replaying and compressing the recordings
- Build a simple task, e.g. open a browser, open a website, click on a button, and close the browser
- Set up a pipeline and build a zero-shot agent implementation for the task
- Decide which tasks inside DesktopEnv to focus on, and start wrapping up the environment for public release
- Start to annotate the examples for training and testing
- Error handling during file passing and file opening, etc.
- Add accessibility tree from the OS into the observation space
- Add pre-process and post-process action support for benchmarking setup and evaluation
- Experiment logging and visualization system
- Add more tasks, maybe scale to 300 for v1.0.0, and create a dynamic leaderboard
- Multiprocess support, which can make reinforcement learning more efficient
- Add support for automatic VM download and configuration, and enable auto-scaling management
- Support running on platforms that have nested virtualization, e.g. Google Cloud, AWS, etc.
- Prepare for the first release of the Windows VM image for the environment
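To make the "gym-like Python interface" and the record, replay, and compress items above more concrete, here is a minimal sketch. It is not the project's actual API: every name in it (`Action`, `FakeVM`, `DesktopEnvSketch`, `compress`, `replay`) is hypothetical, the VM is replaced by an in-memory stand-in, and the reward is a fixed placeholder because task evaluation is not modeled.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple

# All names below are hypothetical and chosen for illustration only.


@dataclass
class Action:
    kind: str      # "move", "click", or "type"
    x: int = 0
    y: int = 0
    text: str = ""


class FakeVM:
    """Stand-in for a real VM controller; it only tracks state in memory."""

    def __init__(self) -> None:
        self.cursor = (0, 0)
        self.typed = ""

    def apply(self, action: Action) -> None:
        if action.kind in ("move", "click"):
            self.cursor = (action.x, action.y)
        elif action.kind == "type":
            self.typed += action.text
        else:
            raise ValueError(f"unknown action kind: {action.kind}")

    def observe(self) -> Dict[str, Any]:
        # A real environment would return a screenshot here, and possibly
        # the accessibility tree once that is added to the observation space.
        return {"cursor": self.cursor, "typed": self.typed}


class DesktopEnvSketch:
    """Gym-like wrapper: reset() returns an observation and step() returns
    (observation, reward, done, info)."""

    def __init__(self, max_steps: int = 10) -> None:
        self.max_steps = max_steps
        self.vm = FakeVM()
        self.trajectory: List[Action] = []

    def reset(self) -> Dict[str, Any]:
        self.vm = FakeVM()
        self.trajectory = []
        return self.vm.observe()

    def step(self, action: Action) -> Tuple[Dict[str, Any], float, bool, Dict[str, Any]]:
        self.vm.apply(action)
        self.trajectory.append(action)  # record every action for later replay
        done = len(self.trajectory) >= self.max_steps
        return self.vm.observe(), 0.0, done, {}  # reward is a placeholder


def compress(actions: List[Action]) -> List[Action]:
    """Collapse each run of consecutive mouse moves into its final position."""
    out: List[Action] = []
    for action in actions:
        if action.kind == "move" and out and out[-1].kind == "move":
            out[-1] = action
        else:
            out.append(action)
    return out


def replay(env: DesktopEnvSketch, actions: List[Action]) -> Dict[str, Any]:
    """Reset the environment, apply the actions in order, return the last observation."""
    obs = env.reset()
    for action in actions:
        obs, _, done, _ = env.step(action)
        if done:
            break
    return obs
```

In this sketch, compressing a recording only drops intermediate mouse positions, so replaying the compressed trajectory ends in the same state as replaying the full one. A real recorder would likely also need timestamps, which are left out here.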
Road Map for the Annotation Tool
- Improve the annotation tool based on DuckTrack, making it more robust and aligned with the accessibility tree
- Annotate the steps of doing the task
- Crawl all the resources we explored from the internet, and make them easy to access
- Set up ways for the community to contribute new examples