multi-node openpi commit

This commit is contained in:
Leon998
2026-03-17 23:05:23 +08:00
parent 28833f0c0f
commit 7411e0e004
156 changed files with 33951 additions and 1 deletions

### Docker Setup
All of the examples in this repo include instructions for running them both natively and with Docker. Although not required, the Docker option is recommended: it simplifies software installation, produces a more stable environment, and, for examples that depend on ROS, lets you avoid installing ROS and cluttering your machine.
- Basic Docker installation instructions are [here](https://docs.docker.com/engine/install/).
- Docker must be installed in [rootless mode](https://docs.docker.com/engine/security/rootless/).
- To use your GPU you must also install the [NVIDIA container toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html).
- The version of docker installed with `snap` is incompatible with the NVIDIA container toolkit, preventing it from accessing `libnvidia-ml.so` ([issue](https://github.com/NVIDIA/nvidia-container-toolkit/issues/154)). The snap version can be uninstalled with `sudo snap remove docker`.
- Docker Desktop is also incompatible with the NVIDIA runtime ([issue](https://github.com/NVIDIA/nvidia-container-toolkit/issues/229)). Docker Desktop can be uninstalled with `sudo apt remove docker-desktop`.
If you are starting from scratch and your host machine runs Ubuntu 22.04, you can accomplish all of the above with the convenience scripts `scripts/docker/install_docker_ubuntu22.sh` and `scripts/docker/install_nvidia_container_toolkit.sh`.
Build the Docker image and start the container with the following command:
```bash
docker compose -f scripts/docker/compose.yml up --build
```
To build and run the Docker image for a specific example, use the following command:
```bash
docker compose -f examples/<example_name>/compose.yml up --build
```
where `<example_name>` is the name of the example you want to run.
During the first run of any example, Docker will build the images. Go grab a coffee while this happens. Subsequent runs will be faster since the images are cached.

# Normalization Statistics
Here we provide instructions for computing **normalization statistics** for **real-world**, **simulation (InternData-A1)**, and **sim2real** tasks. The computed statistics are saved in JSON format and are intended to be reused during training and evaluation in the OpenPI pipeline.
Normalization is computed over:
- `state`
- `actions`
and follows the exact data preprocessing and repacking logic used during training.
---
## 1. Simulation Tasks (InternData-A1)
The script `scripts/compute_norm_stats_sim.py` computes normalization statistics for simulation tasks in the InternData-A1 benchmark.
### Supported Robots
- `split_aloha`
- `lift2`
- `genie1`
- `franka`
### Dataset Structure
Download the InternData-A1 datasets from [here](https://huggingface.co/datasets/InternRobotics/InternData-A1).
The structure of the dataset is as follows:
```
InternData-A1/sim/
└── <task_category>/
└── <robot_name>/
└── <task_name>/ # no subtask
├── data/
├── meta/
└── videos/
```
Some tasks may have subtasks / collections:
```
InternData-A1/sim/
└── <task_category>/
└── <robot_name>/
└── <task_name>/
└── <collect_name>/
├── data/
├── meta/
└── videos/
```
### Usage
```bash
python scripts/compute_norm_stats_sim.py \
--root_data_dir InternData-A1/sim \
--task_category pick_and_place_tasks \
--save_path stats/sim \
--start_ratio 0.0 \
--end_ratio 1.0
```
Arguments
- `root_data_dir`: Root directory of simulation datasets.
- `task_category`: Task category to process (e.g. pick_and_place_tasks).
- `save_path`: Root directory where normalization statistics will be saved.
- `start_ratio`, `end_ratio`: Fraction of tasks to process (useful for sharding large datasets).
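For sharding, `start_ratio` and `end_ratio` simply select a contiguous slice of the (sorted) task list, so multiple workers can process disjoint subsets. A minimal sketch of the idea, using a hypothetical `shard_tasks` helper that is not part of the script:

```python
def shard_tasks(tasks, start_ratio, end_ratio):
    """Select the contiguous slice of tasks covered by [start_ratio, end_ratio)."""
    tasks = sorted(tasks)  # deterministic order so shards never overlap
    n = len(tasks)
    return tasks[int(n * start_ratio):int(n * end_ratio)]

# Two workers splitting ten tasks:
tasks = [f"task_{i}" for i in range(10)]
first = shard_tasks(tasks, 0.0, 0.5)   # task_0 .. task_4
second = shard_tasks(tasks, 0.5, 1.0)  # task_5 .. task_9
```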
### Output Structure
```
<save_path>/
└── <task_category>/
└── <robot_name>/
└── <task_name>/
└── <collect_name>/ # empty if no subtask
└── norm_stats.json
```
During pretraining, set the `stats_dir` argument in `DataConfig` to the `save_path` here.
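Conceptually, the statistics are per-dimension moments computed over all timesteps of `state` and `actions`. A minimal sketch of the computation (the exact JSON schema used by openpi is an assumption here, not taken from the script):

```python
import json

import numpy as np

def compute_norm_stats(states, actions):
    """Per-dimension mean/std over all timesteps of all episodes."""
    def _stats(episodes):
        x = np.concatenate(episodes, axis=0)  # (total_steps, dim)
        return {"mean": x.mean(axis=0).tolist(), "std": x.std(axis=0).tolist()}
    return {"state": _stats(states), "actions": _stats(actions)}

# Two tiny fake episodes with 2-D state and 2-D actions:
stats = compute_norm_stats(
    states=[np.zeros((5, 2)), np.ones((5, 2))],
    actions=[np.full((5, 2), 2.0), np.full((5, 2), 4.0)],
)
print(json.dumps(stats))  # the kind of payload that would land in norm_stats.json
```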
## 2. Real-World Tasks
The script `scripts/compute_norm_stats_real.py` computes normalization statistics for real-world tasks.
### Supported Robots
- `lift2`
- `split_aloha`
- `acone`
- `genie1`
### Dataset Structure
Real-world datasets are expected to follow the LeRobot repository structure:
```
InternData-A1/real/
└── <robot_name>/
└── <task_name>/
└── <collect_name>/ # empty if no subtask
├── data/
├── meta/
└── videos/
```
Example task path:
```
InternData-A1/real/genie1/
└── Pick_a_bag_of_bread_with_the_left_arm__then_handover/set_0
```
### Usage
```bash
python scripts/compute_norm_stats_real.py \
--task_path InternData-A1/real/genie1/Pick_a_bag_of_bread_with_the_left_arm__then_handover/* \
--robot_name genie1 \
--save_path stats/real
```
Arguments
- `task_path`: Path (or glob pattern) to a real-world task dataset (e.g. `InternData-A1/real/genie1/Pick_a_bag_of_bread_with_the_left_arm__then_handover/*`).
- `robot_name`: Robot platform name (must be supported).
- `save_path`: Root directory where normalization statistics will be saved.
### Output Structure
```
<save_path>/
└── <robot_name>/
└── <task_name>/
└── norm_stats.json
```
During finetuning, set the `fixed_stats_dir` argument in `DataConfig` to `<save_path>/<robot_name>/<task_name>` here.
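At train and eval time, statistics like these are used to normalize `state`/`actions` and to un-normalize the policy's predicted actions. A common z-score scheme, sketched here under the assumption that the pipeline's transform has this form:

```python
import numpy as np

def normalize(x, mean, std, eps=1e-6):
    """Map raw values to roughly zero-mean, unit-variance."""
    return (np.asarray(x) - mean) / (np.asarray(std) + eps)

def unnormalize(x, mean, std, eps=1e-6):
    """Inverse transform, applied to predicted (normalized) actions."""
    return np.asarray(x) * (np.asarray(std) + eps) + mean

state = np.array([0.2, -1.3])
mean, std = np.array([0.0, -1.0]), np.array([0.5, 0.3])
roundtrip = unnormalize(normalize(state, mean, std), mean, std)  # recovers state
```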
## 3. Sim2Real Experiments
The script `scripts/compute_norm_stats_sim2real.py` computes normalization statistics for sim2real experiments.
### Supported Robots
- `lift2`
### Dataset Structure
Datasets from InternData-A1 are expected to follow the LeRobot repository structure:
```
InternData-A1/sim/
└── <task_category>/
└── <robot_name>/
└── <task_name>/
└── <collect_name>/
├── data/
├── meta/
└── videos/
```
Example task path:
```
InternData-A1/sim/long_horizon_tasks/lift2/
└── sort_the_rubbish
    ├── Sort_rubbish_1l2r
    ├── Sort_rubbish_2l1r
    └── Sort_rubbish_2l2r
```
### Usage
```bash
python scripts/compute_norm_stats_sim2real.py \
--task_path InternData-A1/sim/long_horizon_tasks/lift2/sort_the_rubbish/* \
--robot_name lift2 \
--save_path stats/sim2real
```
Arguments
- `task_path`: Path (or glob pattern) to a task dataset (e.g. `InternData-A1/sim/long_horizon_tasks/lift2/sort_the_rubbish/*` trains on all the collections in the task).
- `robot_name`: Robot platform name (we only support `lift2` for now, but you can try other robots).
- `save_path`: Root directory where normalization statistics will be saved.
### Output Structure
```
<save_path>/
└── <robot_name>/
└── <task_name>/
└── norm_stats.json
```
During finetuning, set the `fixed_stats_dir` argument in `DataConfig` to `<save_path>/<robot_name>/<task_name>` here.
## Implementation Notes
For simulation tasks and sim2real experiments, computation may stop early (e.g. after 10k steps) to limit runtime.
For sim2real transfer, we set the gripper dimension in the state vector to zero because the state of the gripper in the real world during inference is not aligned with the state in the simulation. See `src/openpi/policies/sim2real_split_aloha_policy.py` for more details.
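The gripper-zeroing trick amounts to masking those entries of the state vector before it is fed to the policy. A sketch with hypothetical gripper indices (the real layout lives in `src/openpi/policies/sim2real_split_aloha_policy.py`):

```python
import numpy as np

# Hypothetical indices for a dual-arm robot; check the policy file for the real layout.
GRIPPER_DIMS = (6, 13)

def mask_gripper_state(state, gripper_dims=GRIPPER_DIMS):
    """Zero the gripper entries so sim and real state distributions agree."""
    state = np.array(state, dtype=np.float32)  # copy, so the caller's array is untouched
    state[..., list(gripper_dims)] = 0.0
    return state

masked = mask_gripper_state(np.ones(14))  # dims 6 and 13 become 0, the rest stay 1
```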

# Running openpi models remotely
We provide utilities for running openpi models remotely. This is useful for running inference on more powerful GPUs off-robot, and also helps keep the robot and policy environments separate (and e.g. avoid dependency hell with robot software).
## Starting a remote policy server
To start a remote policy server, you can simply run the following command:
```bash
uv run scripts/serve_policy.py --env=[DROID | ALOHA | LIBERO]
```
The `env` argument specifies which $\pi_0$ checkpoint should be loaded. Under the hood, this script will execute a command like the following, which you can use to start a policy server, e.g. for checkpoints you trained yourself (here an example for the DROID environment):
```bash
uv run scripts/serve_policy.py policy:checkpoint --policy.config=pi0_fast_droid --policy.dir=gs://openpi-assets/checkpoints/pi0_fast_droid
```
This will start a policy server that will serve the policy specified by the `config` and `dir` arguments. The policy will be served on the specified port (default: 8000).
## Querying the remote policy server from your robot code
We provide a client utility with minimal dependencies that you can easily embed into any robot codebase.
First, install the `openpi-client` package in your robot environment:
```bash
cd $OPENPI_ROOT/packages/openpi-client
pip install -e .
```
Then, you can use the client to query the remote policy server from your robot code. Here's an example of how to do this:
```python
from openpi_client import image_tools
from openpi_client import websocket_client_policy
# Outside of episode loop, initialize the policy client.
# Point to the host and port of the policy server (localhost and 8000 are the defaults).
client = websocket_client_policy.WebsocketClientPolicy(host="localhost", port=8000)
for step in range(num_steps):
    # Inside the episode loop, construct the observation.
    # Resize images on the client side to minimize bandwidth / latency. Always return images in uint8 format.
    # We provide utilities for resizing images + uint8 conversion so you match the training routines.
    # The typical resize_size for pre-trained pi0 models is 224.
    # Note that the proprioceptive `state` can be passed unnormalized, normalization will be handled on the server side.
    observation = {
        "observation/image": image_tools.convert_to_uint8(
            image_tools.resize_with_pad(img, 224, 224)
        ),
        "observation/wrist_image": image_tools.convert_to_uint8(
            image_tools.resize_with_pad(wrist_img, 224, 224)
        ),
        "observation/state": state,
        "prompt": task_instruction,
    }

    # Call the policy server with the current observation.
    # This returns an action chunk of shape (action_horizon, action_dim).
    # Note that you typically only need to call the policy every N steps and execute steps
    # from the predicted action chunk open-loop in the remaining steps.
    action_chunk = client.infer(observation)["actions"]

    # Execute the actions in the environment.
    ...
```
Here, the `host` and `port` arguments specify the IP address and port of the remote policy server. You can also specify these as command-line arguments to your robot code, or hard-code them in your robot codebase. The `observation` is a dictionary of observations and the prompt, following the specification of the policy inputs for the policy you are serving. We have concrete examples of how to construct this dictionary for different environments in the [simple client example](examples/simple_client/main.py).
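The every-N-steps pattern mentioned above can be sketched as a small control loop. `FakeClient` below stands in for `WebsocketClientPolicy` so the sketch is self-contained; the chunk length, action dimension, and observation contents are illustrative only:

```python
class FakeClient:
    """Stand-in for websocket_client_policy.WebsocketClientPolicy, for illustration only."""
    def infer(self, observation):
        # Pretend the server returns a chunk of 10 actions with 7 DoF each.
        return {"actions": [[0.0] * 7 for _ in range(10)]}

def run_episode(client, num_steps, replan_every=10):
    """Query the policy every `replan_every` steps; execute the chunk open-loop in between."""
    executed = []
    action_chunk = None
    for step in range(num_steps):
        if step % replan_every == 0:
            observation = {}  # build the observation dict as shown above
            action_chunk = client.infer(observation)["actions"]
        executed.append(action_chunk[step % replan_every])  # would be sent to the robot
    return executed

actions = run_episode(FakeClient(), num_steps=25)  # 3 server calls, 25 actions executed
```

`replan_every` must not exceed the policy's action horizon, otherwise the loop would index past the end of the chunk.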

# Training Instructions
Here we provide instructions for pretraining on InternData-A1, finetuning on real-world tasks, and finetuning on InternData-A1 tasks for sim2real transfer.
Before training, you need to compute the normalization statistics for the tasks you want to train on. Please refer to [norm_stats.md](norm_stats.md) for more details.
---
## 1. Pretraining on InternData-A1
### Write a training config
We provide a `TrainConfig` example named `pretrain-interndata-a1` in `src/openpi/training/config.py`.
InternData-A1 contains four robot embodiments:
- `split_aloha`
- `lift2`
- `genie1`
- `franka`
Accordingly, we define three `MultiDataConfigFactory` classes:
- `MultiSimSplitAlohaDataConfig` for `split_aloha` and `lift2`
- `MultiSimGenieDataConfig` for `genie1`
- `MultiSimFrankaDataConfig` for `franka`
Please either:
- create a soft link from the InternData-A1 dataset to `data/InternData-A1`, or
- modify the `repo_dir` field in all relevant `MultiDataConfig` entries to point to your local InternData-A1 path.
Set `stats_dir` to your local normalization statistics directory. If you use the default setting, ensure that the normalization statistics for simulation tasks are saved under `stats/sim`.
We initialize the model from PaliGemma-3B using:
```python
weight_loader=weight_loaders.PaliGemmaWeightLoader("checkpoints/jax/paligemma/pt_224.npz")
```
Please download the PaliGemma-3b checkpoint by running
```bash
python scripts/download_paligemma.py
```
You may adjust other training parameters based on your available GPUs and training budget:
- `num_train_steps`: Total number of training steps
- `num_workers`: Number of data loading workers
- `fsdp_devices`: Number of GPUs per node
- `batch_size`: Batch size per GPU
- `save_interval`: Checkpoint saving interval (in steps)
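Since `batch_size` is per GPU, the global batch size scales with the number of GPUs and nodes. A quick sanity-check helper, under the assumption of pure data parallelism (the actual FSDP configuration may shard differently):

```python
def effective_batch_size(batch_size_per_gpu, gpus_per_node, num_nodes=1):
    """Samples the optimizer sees per step, assuming pure data parallelism."""
    return batch_size_per_gpu * gpus_per_node * num_nodes

# e.g. 8 samples per GPU on two 8-GPU nodes:
global_bs = effective_batch_size(8, 8, num_nodes=2)  # 128
```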
### Run training
For multi-node training, run
```bash
bash scripts/training_scripts/multi_node.sh
```
For single-node multi-GPU training, run
```bash
config_name=pretrain-interndata-a1
bash scripts/training_scripts/single_node_multi_gpu.sh ${config_name}
```
The checkpoints will be saved to `checkpoints/${config_name}`.
## 2. Finetuning on Real-World Tasks
### Write a training config
We provide a `TrainConfig` example named `finetune-a2d-pen` in `src/openpi/training/config.py`.
Key arguments you may need to modify include:
- `MultiDataConfigFactory` class:
- `MultiLeRobotReala2dDataConfig` for `genie1`
- `MultiLeRobotRealArxLift2DataConfig` for `lift2` and `acone`
- `repo_dir`: Path to the real-world task dataset.
- `robot_name`: the robot name in `repo_dir`, e.g. "genie1".
- `fixed_stats_dir`: Path to the normalization statistics for the real-world task. When this is set, statistics from `stats_dir` will not be used.
- `weight_loader`: Pretrained checkpoint used for initialization.
You may download our pretrained checkpoints from [here]().
### Run training
For single-node multi-GPU training, run
```bash
config_name=finetune-a2d-pen
bash scripts/training_scripts/single_node_multi_gpu.sh ${config_name}
```
The checkpoints will be saved under `checkpoints/${config_name}`.
## 3. Finetuning on InternData-A1 Tasks for Sim2Real Transfer
### Write a training config
We provide a `TrainConfig` example named `finetune-sim2real-lift2-sort-rubbish` in `src/openpi/training/config.py`.
Key arguments you may need to modify include:
- `MultiDataConfigFactory` class: Currently, sim-to-real transfer is evaluated only on `lift2` tasks:
- `MultiSim2RealSplitAlohaDataConfig` for `lift2`
- `repo_dir`: Path to the corresponding InternData-A1 task.
- `fixed_stats_dir`: Path to the normalization statistics for the sim-to-real task. When specified, statistics from `stats_dir` will not be used.
- `weight_loader`: Pretrained checkpoint used for initialization.
### Run training
For single-node multi-GPU training, run
```bash
config_name=finetune-sim2real-lift2-sort-rubbish
bash scripts/training_scripts/single_node_multi_gpu.sh ${config_name}
```