Add sim tutorial, fix lekiwi motor config, add notebook links (#1275)
Co-authored-by: AdilZouitine <adilzouitinegm@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
Co-authored-by: s1lent4gnt <kmeftah.khalil@gmail.com>
Co-authored-by: Michel Aractingi <michel.aractingi@gmail.com>
Co-authored-by: Eugene Mironov <helper2424@gmail.com>
Co-authored-by: imstevenpmwork <steven.palma@huggingface.co>
Co-authored-by: Simon Alibert <75076266+aliberts@users.noreply.github.com>
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
@@ -5,8 +5,10 @@
     title: Installation
   title: Get started
 - sections:
-  - local: getting_started_real_world_robot
-    title: Getting Started with Real-World Robots
+  - local: il_robots
+    title: Imitation Learning for Robots
+  - local: il_sim
+    title: Imitation Learning in Sim
   - local: cameras
     title: Cameras
   - local: integrate_hardware
@@ -30,6 +32,10 @@
   - local: lekiwi
     title: LeKiwi
   title: "Robots"
+- sections:
+  - local: notebooks
+    title: Notebooks
+  title: "Resources"
 - sections:
   - local: contributing
     title: Contribute to LeRobot
@@ -1,4 +1,4 @@
-# Getting Started with Real-World Robots
+# Imitation Learning on Real-World Robots
 
 This tutorial will explain how to train a neural network to control a real robot autonomously.
 
@@ -273,6 +273,9 @@ python lerobot/scripts/train.py \
   --resume=true
 ```
 
+#### Train using Colab
+If your local computer doesn't have a powerful GPU, you can use Google Colab to train your model by following the [ACT training notebook](./notebooks#training-act).
+
 #### Upload policy checkpoints
 
 Once training is done, upload the latest checkpoint with:
docs/source/il_sim.mdx (new file, 152 lines)
@@ -0,0 +1,152 @@
# Imitation Learning in Sim

This tutorial will explain how to train a neural network to control a robot in simulation with imitation learning.

**You'll learn:**
1. How to record a dataset in simulation with [gym-hil](https://github.com/huggingface/gym-hil) and visualize the dataset.
2. How to train a policy using your data.
3. How to evaluate your policy in simulation and visualize the results.

For the simulation environment, we use the same [repo](https://github.com/huggingface/gym-hil) that is also used for Human-In-the-Loop (HIL) reinforcement learning.
This environment is based on [MuJoCo](https://mujoco.org) and allows you to record datasets in the LeRobotDataset format.
Teleoperation is easiest with a controller like the Logitech F710, but you can also use your keyboard if you are up for the challenge.
## Installation

First, install the `gym_hil` package within the LeRobot environment. Go to your LeRobot folder and run this command:

```bash
pip install -e ".[hilserl]"
```
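To sanity-check the install, you can instantiate a `gym_hil` environment directly and step it with random actions. This is a minimal sketch: the registered env id is an assumption based on the task names used in the config files below, so check the gym-hil README for the exact ids.

```python
import gymnasium as gym
import gym_hil  # noqa: F401 -- importing registers the gym-hil environments

# Env id is an assumption; see the gym-hil README for the registered ids.
env = gym.make("gym_hil/PandaPickCubeBase-v0")

obs, info = env.reset()
for _ in range(100):
    action = env.action_space.sample()  # random actions, just to smoke-test
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()
env.close()
```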
## Teleoperate and Record a Dataset

To use `gym_hil` with LeRobot, you need a configuration file. An example config file can be found [here](https://huggingface.co/datasets/aractingi/lerobot-example-config-files/blob/main/env_config_gym_hil_il.json).

To teleoperate and collect a dataset, modify this config file: add your `repo_id`, e.g. `"repo_id": "il_gym",`, set `"num_episodes": 30,`, and make sure `"mode"` is set to `"record"`.

If you do not have an NVIDIA GPU, also change the `"device": "cuda"` parameter in the config file (for example, to `"mps"` on macOS).

By default the config file assumes you use a controller. To use your keyboard instead, change the environment specified at `"task"` in the config file to `"PandaPickCubeKeyboard-v0"`.
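If you prefer to script these edits, here is a small sketch that patches the downloaded config in place. The file path is an assumption (point it at wherever you saved the example config), and the key names are quoted from the paragraphs above, so check their exact location in the downloaded file:

```python
import json
from pathlib import Path

# Path is an assumption: point it at your downloaded example config.
config_path = Path("env_config_gym_hil_il.json")
config = json.loads(config_path.read_text())

config["repo_id"] = "il_gym"   # dataset repo id to record into
config["num_episodes"] = 30    # number of episodes to record
config["mode"] = "record"      # record a new dataset
config["device"] = "cuda"      # use "mps" on Apple silicon
# Keyboard teleop instead of a gamepad:
# config["task"] = "PandaPickCubeKeyboard-v0"

config_path.write_text(json.dumps(config, indent=2))
```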
Then we can run this command to start:

<hfoptions id="teleop_sim">
<hfoption id="Linux">

```bash
python lerobot/scripts/rl/gym_manipulator.py --config_path path/to/env_config_gym_hil_il.json
```

</hfoption>
<hfoption id="MacOS">

```bash
mjpython lerobot/scripts/rl/gym_manipulator.py --config_path path/to/env_config_gym_hil_il.json
```

</hfoption>
</hfoptions>

Once the environment is rendered, you can teleoperate the robot with the gamepad or keyboard; the controls are listed below.

Note that to teleoperate the robot you have to hold the "Human Take Over Pause Policy" button `RB` to enable control!

**Gamepad Controls**

<p align="center">
  <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobot/gamepad_guide.jpg?raw=true" alt="Figure shows the control mappings on a Logitech gamepad." title="Gamepad Control Mapping" width="100%"></img>
</p>
<p align="center"><i>Gamepad button mapping for robot control and episode management</i></p>

**Keyboard controls**

For keyboard control, use the `spacebar` to enable control and the following keys to move the robot:

```bash
Arrow keys: Move in X-Y plane
Shift and Shift_R: Move in Z axis
Right Ctrl and Left Ctrl: Open and close gripper
ESC: Exit
```
## Visualize a dataset

If you uploaded your dataset to the Hub, you can [visualize your dataset online](https://huggingface.co/spaces/lerobot/visualize_dataset) by copy-pasting your repo id.

<p align="center">
  <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobot/dataset_visualizer_sim.png" alt="Figure shows the dataset visualizer" title="Dataset visualization" width="100%"></img>
</p>
<p align="center"><i>Dataset visualizer</i></p>
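You can also inspect the recorded dataset locally with the `LeRobotDataset` class (the same class the evaluation script below uses). A small sketch; the repo id is a placeholder for your own:

```python
from lerobot.common.datasets.lerobot_dataset import LeRobotDataset

# Replace with the repo_id you set in the recording config.
dataset = LeRobotDataset(repo_id="<hf_user>/il_gym")

print(dataset.meta)  # features, fps, episode/frame counts
frame = dataset[0]   # a single frame as a dict of tensors
print(frame.keys())  # observation and action keys
```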
## Train a policy

To train a policy to control your robot, use the [`python lerobot/scripts/train.py`](../lerobot/scripts/train.py) script. A few arguments are required. Here is an example command:
```bash
python lerobot/scripts/train.py \
  --dataset.repo_id=${HF_USER}/il_gym \
  --policy.type=act \
  --output_dir=outputs/train/il_sim_test \
  --job_name=il_sim_test \
  --policy.device=cuda \
  --wandb.enable=true
```

Let's explain the command:
1. We provided the dataset as argument with `--dataset.repo_id=${HF_USER}/il_gym`.
2. We provided the policy with `policy.type=act`. This loads configurations from [`configuration_act.py`](../lerobot/common/policies/act/configuration_act.py). Importantly, this policy will automatically adapt to the number of motor states, motor actions and cameras of your robot, which have been saved in your dataset.
3. We provided `policy.device=cuda` since we are training on an NVIDIA GPU, but you could use `policy.device=mps` to train on Apple silicon.
4. We provided `wandb.enable=true` to use [Weights and Biases](https://docs.wandb.ai/quickstart) for visualizing training plots. This is optional, but if you use it, make sure you are logged in by running `wandb login`.
Training takes a while: 100k steps (the default) takes about 1 hour on an NVIDIA A100, and several hours on smaller GPUs. You will find checkpoints in `outputs/train/il_sim_test/checkpoints`.
#### Train using Colab
If your local computer doesn't have a powerful GPU, you can use Google Colab to train your model by following the [ACT training notebook](./notebooks#training-act).

#### Upload policy checkpoints

Once training is done, upload the latest checkpoint with:
```bash
huggingface-cli upload ${HF_USER}/il_sim_test \
  outputs/train/il_sim_test/checkpoints/last/pretrained_model
```

You can also upload intermediate checkpoints with:
```bash
CKPT=010000
huggingface-cli upload ${HF_USER}/il_sim_test${CKPT} \
  outputs/train/il_sim_test/checkpoints/${CKPT}/pretrained_model
```
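To double-check the upload, you can pull the checkpoint back down. A minimal sketch, assuming the ACT policy class and using a placeholder model id:

```python
from lerobot.common.policies.act.modeling_act import ACTPolicy

# Model id is a placeholder: use the repo you just uploaded to.
policy = ACTPolicy.from_pretrained("<hf_user>/il_sim_test")
policy.eval()
print(sum(p.numel() for p in policy.parameters()), "parameters")
```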
## Evaluate your policy in Sim

To evaluate your policy, use the config file that can be found [here](https://huggingface.co/datasets/aractingi/lerobot-example-config-files/blob/main/eval_config_gym_hil.json).

Make sure to replace the `repo_id` with the dataset you trained on (for example `pepijn223/il_sim_dataset`), and replace the `pretrained_policy_name_or_path` with your model id (for example `pepijn223/il_sim_model`).
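The evaluation script reads these values as `cfg.dataset.repo_id` and `cfg.env.pretrained_policy_name_or_path` (see `eval_policy.py` below), so in the JSON they should live under the `dataset` and `env` sections respectively. A sketch of patching them, with placeholder ids:

```python
import json
from pathlib import Path

config_path = Path("eval_config_gym_hil.json")
config = json.loads(config_path.read_text())

# Nesting follows how eval_policy.py reads the config (cfg.dataset / cfg.env).
config["dataset"]["repo_id"] = "<hf_user>/il_gym"  # dataset you trained on
config["env"]["pretrained_policy_name_or_path"] = "<hf_user>/il_sim_test"  # your model

config_path.write_text(json.dumps(config, indent=2))
```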
Then you can run this command to visualize your trained policy:

<hfoptions id="eval_policy">
<hfoption id="Linux">

```bash
python lerobot/scripts/rl/eval_policy.py --config_path=path/to/eval_config_gym_hil.json
```

</hfoption>
<hfoption id="MacOS">

```bash
mjpython lerobot/scripts/rl/eval_policy.py --config_path=path/to/eval_config_gym_hil.json
```

</hfoption>
</hfoptions>
> [!WARNING]
> While the main workflow of training ACT in simulation is straightforward, there is significant room for exploring how to set up the task, define the initial state of the environment, and determine the type of data required during collection to learn the most effective policy. If your trained policy doesn't perform well, investigate the quality of the dataset it was trained on using our visualizers, as well as the action values and various hyperparameters related to ACT and the simulation.

Congrats 🎉, you have finished this tutorial. If you want to continue using LeRobot in simulation, follow the [tutorial on reinforcement learning in sim with HIL-SERL](https://huggingface.co/docs/lerobot/hilserl_sim).

> [!TIP]
> If you have any questions or need help, please reach out on [Discord](https://discord.com/invite/s3KuuzsPFb).
@@ -68,3 +68,5 @@ To use [Weights and Biases](https://docs.wandb.ai/quickstart) for experiment tracking
 ```bash
 wandb login
 ```
+
+You can now assemble your robot if it's not ready yet; look for your robot type on the left. Then follow the link below to use LeRobot with your robot.
docs/source/notebooks.mdx (new file, 29 lines)
@@ -0,0 +1,29 @@
# 🤗 LeRobot Notebooks

This repository contains example notebooks for using LeRobot. These notebooks demonstrate how to train policies on real or simulation datasets using standardized policies.

---

### Training ACT

[ACT](https://huggingface.co/papers/2304.13705) (Action Chunking Transformer) is a transformer-based policy architecture for imitation learning that processes robot states and camera inputs to generate smooth, chunked action sequences.

We provide a ready-to-run Google Colab notebook to help you train ACT policies using datasets from the Hugging Face Hub, with optional logging to Weights & Biases.

| Notebook | Colab |
|:---------|:------|
| [Train ACT with LeRobot](https://github.com/huggingface/notebooks/blob/main/lerobot/training-act.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/lerobot/training-act.ipynb) |

Expected training time for 100k steps: ~1.5 hours on an NVIDIA A100 GPU with a batch size of `64`.
### Training SmolVLA

[SmolVLA](https://huggingface.co/papers/2506.01844) is a small but efficient Vision-Language-Action model developed by Hugging Face. It is compact in size, with 450M parameters.

We provide a ready-to-run Google Colab notebook to help you train SmolVLA policies using datasets from the Hugging Face Hub, with optional logging to Weights & Biases.

| Notebook | Colab |
|:---------|:------|
| [Train SmolVLA with LeRobot](https://github.com/huggingface/notebooks/blob/main/lerobot/training-smolvla.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/lerobot/training-smolvla.ipynb) |

Expected training time for 20k steps: ~5 hours on an NVIDIA A100 GPU with a batch size of `64`.
@@ -63,7 +63,7 @@ ALOHA_STATIC_INFO = {
 PUSHT_INFO = {
     "license": "mit",
     "url": "https://diffusion-policy.cs.columbia.edu/",
-    "paper": "https://huggingface.co/papers/2303.04137v5",
+    "paper": "https://huggingface.co/papers/2303.04137",
     "citation_bibtex": dedent(r"""
         @article{chi2024diffusionpolicy,
         author = {Cheng Chi and Zhenjia Xu and Siyuan Feng and Eric Cousineau and Yilun Du and Benjamin Burchfiel and Russ Tedrake and Shuran Song},
@@ -34,7 +34,7 @@ def lekiwi_cameras_config() -> dict[str, CameraConfig]:
 @RobotConfig.register_subclass("lekiwi")
 @dataclass
 class LeKiwiConfig(RobotConfig):
-    port = "/dev/ttyACM0"  # port to connect to the bus
+    port: str = "/dev/ttyACM0"  # port to connect to the bus
 
     disable_torque_on_disconnect: bool = True
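The one-line fix above matters because `dataclass` only turns *annotated* class attributes into fields: without the `: str` annotation, `port` is a plain class attribute, invisible to `__init__` and to any config parsing built on dataclass fields. A quick standalone illustration (not LeRobot code):

```python
from dataclasses import dataclass, fields

@dataclass
class Broken:
    port = "/dev/ttyACM0"  # no annotation: a class attribute, not a field

@dataclass
class Fixed:
    port: str = "/dev/ttyACM0"  # annotated: a real dataclass field

print([f.name for f in fields(Broken)])  # [] -- port can't be set via __init__
print([f.name for f in fields(Fixed)])   # ['port']
print(Fixed(port="/dev/ttyACM1").port)   # /dev/ttyACM1
```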
@@ -43,9 +43,69 @@ First, we will assemble the two SO100/SO101 arms. One to attach to the mobile base
 - [Assemble SO101](./so101#step-by-step-assembly-instructions)
 - [Assemble LeKiwi](https://github.com/SIGRobotics-UIUC/LeKiwi/blob/main/Assembly.md)
 
+### Find the USB ports associated with the motor board
+
+To find the port for each bus servo adapter, run this script:
+```bash
+python lerobot/find_port.py
+```
+
+<hfoptions id="example">
+<hfoption id="Mac">
+
+Example output:
+
+```
+Finding all available ports for the MotorBus.
+['/dev/tty.usbmodem575E0032081']
+Remove the USB cable from your MotorsBus and press Enter when done.
+
+[...Disconnect corresponding leader or follower arm and press Enter...]
+
+The port of this MotorsBus is /dev/tty.usbmodem575E0032081
+Reconnect the USB cable.
+```
+
+Here, the port corresponding to your board is `/dev/tty.usbmodem575E0032081`.
+
+</hfoption>
+<hfoption id="Linux">
+
+On Linux, you might need to give access to the USB ports by running:
+```bash
+sudo chmod 666 /dev/ttyACM0
+sudo chmod 666 /dev/ttyACM1
+```
+
+Example output:
+
+```
+Finding all available ports for the MotorBus.
+['/dev/ttyACM0']
+Remove the USB cable from your MotorsBus and press Enter when done.
+
+[...Disconnect corresponding leader or follower arm and press Enter...]
+
+The port of this MotorsBus is /dev/ttyACM0
+Reconnect the USB cable.
+```
+
+Here, the port corresponding to your board is `/dev/ttyACM0`.
+
+</hfoption>
+</hfoptions>
+
+### Configure motors
+
+The instructions for configuring the motors can be found in the SO101 [docs](./so101#configure-the-motors). Besides the ids for the arm motors, we also need to set the motor ids for the mobile base. These need to be in a specific order to work. Below is an image of the motor ids and motor mounting positions for the mobile base. Note that we only use one motor control board on LeKiwi, so the motor ids for the wheels are 7, 8 and 9.
+
+You can run this command to set up the motors for LeKiwi. It will first set up the arm motors (ids 6 down to 1) and then the wheel motors (ids 9, 8, 7):
+
+```bash
+python -m lerobot.setup_motors \
+  --robot.type=lekiwi \
+  --robot.port=/dev/tty.usbmodem58760431551 # <- paste here the port found at previous step
+```
+
+<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobot/motor_ids.webp" alt="Motor IDs for mobile robot" title="Motor IDs for mobile robot" width="60%">
+
 ### Troubleshoot communication
lerobot/scripts/rl/eval_policy.py (new file, 74 lines)
@@ -0,0 +1,74 @@
#!/usr/bin/env python

# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import logging

from lerobot.common.cameras import opencv  # noqa: F401
from lerobot.common.datasets.lerobot_dataset import LeRobotDataset
from lerobot.common.policies.factory import make_policy
from lerobot.common.robots import (  # noqa: F401
    RobotConfig,
    make_robot_from_config,
    so100_follower,
)
from lerobot.common.teleoperators import (
    gamepad,  # noqa: F401
    so101_leader,  # noqa: F401
)
from lerobot.configs import parser
from lerobot.configs.train import TrainRLServerPipelineConfig
from lerobot.scripts.rl.gym_manipulator import make_robot_env

logging.basicConfig(level=logging.INFO)


def eval_policy(env, policy, n_episodes):
    """Roll out the policy for n_episodes and log each episode's total reward."""
    sum_reward_episode = []
    for _ in range(n_episodes):
        obs, _ = env.reset()
        episode_reward = 0.0
        while True:
            # The policy picks an action from the current observation.
            action = policy.select_action(obs)
            obs, reward, terminated, truncated, _ = env.step(action)
            episode_reward += reward
            if terminated or truncated:
                break
        sum_reward_episode.append(episode_reward)

    logging.info(f"Per-episode rewards: {sum_reward_episode}")
    logging.info(f"Success rate: {sum(sum_reward_episode) / len(sum_reward_episode)}")


@parser.wrap()
def main(cfg: TrainRLServerPipelineConfig):
    env_cfg = cfg.env
    env = make_robot_env(env_cfg)
    dataset_cfg = cfg.dataset
    # The dataset is only used for its metadata (features, shapes), which
    # make_policy needs to build a policy with matching input/output specs.
    dataset = LeRobotDataset(repo_id=dataset_cfg.repo_id)
    dataset_meta = dataset.meta

    policy = make_policy(
        cfg=cfg.policy,
        ds_meta=dataset_meta,
    )
    # from_pretrained returns the loaded policy; keep the result.
    policy = policy.from_pretrained(env_cfg.pretrained_policy_name_or_path)
    policy.eval()

    eval_policy(env, policy=policy, n_episodes=10)


if __name__ == "__main__":
    main()