added tdmpc2 to policy factory; shape fixes in tdmpc2

fixes and updated comments
config comments
2024-11-26 11:58:29 +00:00 · 2024-11-26 09:46:59 +00:00 · 2024-11-25 09:51:33 +00:00 · 2024-11-22 17:11:47 +00:00 · 2024-11-21 17:03:30 +00:00 · 2024-11-21 15:00:03 +00:00
68 changed files with 5631 additions and 975 deletions
--- a/.dockerignore
+++ b/.dockerignore
@@ -65,7 +65,6 @@ htmlcov/
 .nox/
 .coverage
 .coverage.*
-.cache
 nosetests.xml
 coverage.xml
 *.cover
@@ -73,6 +72,11 @@ coverage.xml
 .hypothesis/
 .pytest_cache/

+# Ignore .cache except calibration
+.cache/*
+!.cache/calibration/
+!.cache/calibration/**
+
 # Translations
 *.mo
 *.pot
--- a/.github/workflows/test.yml
+++ b/.github/workflows/test.yml
@@ -47,6 +47,7 @@ jobs:
          pipx install poetry && poetry config virtualenvs.in-project true
          echo "${{ github.workspace }}/.venv/bin" >> $GITHUB_PATH

+      # TODO(rcadene, aliberts): python 3.12 seems to be used in the tests, not python 3.10
      - name: Set up Python 3.10
        uses: actions/setup-python@v5
        with:
@@ -84,6 +85,7 @@ jobs:
          pipx install poetry && poetry config virtualenvs.in-project true
          echo "${{ github.workspace }}/.venv/bin" >> $GITHUB_PATH

+      # TODO(rcadene, aliberts): python 3.12 seems to be used in the tests, not python 3.10
      - name: Set up Python 3.10
        uses: actions/setup-python@v5
        with:
--- a/README.md
+++ b/README.md
@@ -23,15 +23,15 @@
 </div>

 <h2 align="center">
-    <p><a href="https://github.com/huggingface/lerobot/blob/main/examples/7_get_started_with_real_robot.md">Hot new tutorial: Getting started with real-world robots</a></p>
+    <p><a href="https://github.com/huggingface/lerobot/blob/main/examples/10_use_so100.md">New robot in town: SO-100</a></p>
 </h2>

 <div align="center">
-    <img src="media/tutorial/koch_v1_1_leader_follower.webp?raw=true" alt="Koch v1.1 leader and follower arms" title="Koch v1.1 leader and follower arms" width="50%">
-    <p>We just dropped an in-depth tutorial on how to build your own robot!</p>
+    <img src="media/so100/leader_follower.webp?raw=true" alt="SO-100 leader and follower arms" title="SO-100 leader and follower arms" width="50%">
+    <p>We just added a new tutorial on how to build a more affordable robot, at the price of $110 per arm!</p>
    <p>Teach it new skills by showing it a few moves with just a laptop.</p>
    <p>Then watch your homemade robot act autonomously 🤯</p>
-    <p>For more info, see <a href="https://x.com/RemiCadene/status/1825455895561859185">our thread on X</a> or <a href="https://github.com/huggingface/lerobot/blob/main/examples/7_get_started_with_real_robot.md">our tutorial page</a>.</p>
+    <p>Follow the link to the <a href="https://github.com/huggingface/lerobot/blob/main/examples/10_use_so100.md">full tutorial for SO-100</a>.</p>
 </div>

 <br/>
@@ -55,9 +55,9 @@

 <table>
  <tr>
-    <td><img src="http://remicadene.com/assets/gif/aloha_act.gif" width="100%" alt="ACT policy on ALOHA env"/></td>
-    <td><img src="http://remicadene.com/assets/gif/simxarm_tdmpc.gif" width="100%" alt="TDMPC policy on SimXArm env"/></td>
-    <td><img src="http://remicadene.com/assets/gif/pusht_diffusion.gif" width="100%" alt="Diffusion policy on PushT env"/></td>
+    <td><img src="media/gym/aloha_act.gif" width="100%" alt="ACT policy on ALOHA env"/></td>
+    <td><img src="media/gym/simxarm_tdmpc.gif" width="100%" alt="TDMPC policy on SimXArm env"/></td>
+    <td><img src="media/gym/pusht_diffusion.gif" width="100%" alt="Diffusion policy on PushT env"/></td>
  </tr>
  <tr>
    <td align="center">ACT policy on ALOHA env</td>
@@ -144,7 +144,7 @@ wandb login

 ### Visualize datasets

-Check out [example 1](./examples/1_load_lerobot_dataset.py) that illustrates how to use our dataset class which automatically download data from the Hugging Face hub.
+Check out [example 1](./examples/1_load_lerobot_dataset.py) that illustrates how to use our dataset class which automatically downloads data from the Hugging Face hub.

 You can also locally visualize episodes from a dataset on the hub by executing our script from the command line:
 ```bash
@@ -280,7 +280,7 @@ To use wandb for logging training and evaluation curves, make sure you've run `w
    wandb.enable=true
 ```

-A link to the wandb logs for the run will also show up in yellow in your terminal. Here is an example of what they look like in your browser. Please also check [here](https://github.com/huggingface/lerobot/blob/main/examples/4_train_policy_with_script.md#typical-logs-and-metrics) for the explaination of some commonly used metrics in logs.
+A link to the wandb logs for the run will also show up in yellow in your terminal. Here is an example of what they look like in your browser. Please also check [here](https://github.com/huggingface/lerobot/blob/main/examples/4_train_policy_with_script.md#typical-logs-and-metrics) for the explanation of some commonly used metrics in logs.

 ![](media/wandb.png)

--- a/examples/10_use_so100.md
+++ b/examples/10_use_so100.md
@@ -0,0 +1,280 @@
+This tutorial explains how to use [SO-100](https://github.com/TheRobotStudio/SO-ARM100) with LeRobot.
+
+## Source the parts
+
+Follow this [README](https://github.com/TheRobotStudio/SO-ARM100). It contains the bill of materials with links to source the parts, instructions to 3D print the parts, and advice if it's your first time printing or if you don't already own a 3D printer.
+
+**Important**: Before assembling, you will first need to configure your motors. To this end, we provide a nice script, so let's first install LeRobot. After configuration, we will also guide you through assembly.
+
+## Install LeRobot
+
+On your computer:
+
+1. [Install Miniconda](https://docs.anaconda.com/miniconda/#quick-command-line-install):
+```bash
+mkdir -p ~/miniconda3
+wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh
+bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
+rm ~/miniconda3/miniconda.sh
+~/miniconda3/bin/conda init bash
+```
+
+2. Restart shell or `source ~/.bashrc`
+
+3. Create and activate a fresh conda environment for lerobot:
+```bash
+conda create -y -n lerobot python=3.10 && conda activate lerobot
+```
+
+4. Clone LeRobot:
+```bash
+git clone https://github.com/huggingface/lerobot.git ~/lerobot
+```
+
+5. Install LeRobot with dependencies for the feetech motors:
+```bash
+cd ~/lerobot && pip install -e ".[feetech]"
+```
+
+For Linux only (not Mac), install extra dependencies for recording datasets:
+```bash
+conda install -y -c conda-forge ffmpeg
+pip uninstall -y opencv-python
+conda install -y -c conda-forge "opencv>=4.10.0"
+```
+
+## Configure the motors
+
+Follow step 1 of the [assembly video](https://www.youtube.com/watch?v=FioA2oeFZ5I) which illustrates the use of our scripts below.
+
+**Find USB ports associated with your arms**
+To find the correct ports for each arm, run the utility script twice:
+```bash
+python lerobot/scripts/find_motors_bus_port.py
+```
+
+Example output when identifying the leader arm's port (e.g., `/dev/tty.usbmodem575E0031751` on Mac, or possibly `/dev/ttyACM0` on Linux):
+```
+Finding all available ports for the MotorBus.
+['/dev/tty.usbmodem575E0032081', '/dev/tty.usbmodem575E0031751']
+Remove the usb cable from your DynamixelMotorsBus and press Enter when done.
+
+[...Disconnect leader arm and press Enter...]
+
+The port of this DynamixelMotorsBus is /dev/tty.usbmodem575E0031751
+Reconnect the usb cable.
+```
+
+Example output when identifying the follower arm's port (e.g., `/dev/tty.usbmodem575E0032081`, or possibly `/dev/ttyACM1` on Linux):
+```
+Finding all available ports for the MotorBus.
+['/dev/tty.usbmodem575E0032081', '/dev/tty.usbmodem575E0031751']
+Remove the usb cable from your DynamixelMotorsBus and press Enter when done.
+
+[...Disconnect follower arm and press Enter...]
+
+The port of this DynamixelMotorsBus is /dev/tty.usbmodem575E0032081
+Reconnect the usb cable.
+```
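The two example outputs above suggest that the port is identified by listing serial ports before and after unplugging an arm and keeping the one that disappeared. A minimal Python sketch of that idea follows; `find_removed_port` is a hypothetical helper written for illustration and is not the code in `find_motors_bus_port.py`.

```python
def find_removed_port(ports_before, ports_after):
    """Return the single port present before unplugging but absent after."""
    removed = sorted(set(ports_before) - set(ports_after))
    if len(removed) != 1:
        # Zero or several missing ports means the arm cannot be identified.
        raise ValueError(f"expected exactly one removed port, found {removed}")
    return removed[0]


# Port names taken from the example output above.
before = ["/dev/tty.usbmodem575E0032081", "/dev/tty.usbmodem575E0031751"]
after = ["/dev/tty.usbmodem575E0032081"]  # leader arm unplugged
leader_port = find_removed_port(before, after)
print(leader_port)
```

Running the same comparison a second time with the follower arm unplugged gives the other port.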
+
+Troubleshooting: On Linux, you might need to give access to the USB ports by running:
+```bash
+sudo chmod 666 /dev/ttyACM0
+sudo chmod 666 /dev/ttyACM1
+```
+
+**Configure your motors**
+Plug in your first motor and run this script to set its ID to 1. It will also set its present position to 2048, so expect your motor to rotate:
+```bash
+python lerobot/scripts/configure_motor.py \
+  --port /dev/tty.usbmodem58760432961 \
+  --brand feetech \
+  --model sts3215 \
+  --baudrate 1000000 \
+  --ID 1
+```
+
+Note: These motors currently have a limited range. Their position can only take values between 0 and 4096, which corresponds to one full turn; they can't turn more than that. 2048 is the middle of this range, so from there we can take -2048 steps (180 degrees anticlockwise) or +2048 steps (180 degrees clockwise) before reaching the end of the range. The configuration step also sets the homing offset to 0, so that if you misassembled the arm, you can always update the homing offset to account for a shift of up to ± 2048 steps (± 180 degrees).
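The step arithmetic in the note above can be checked with a short Python sketch. It only assumes the figures stated in the note (4096 steps per turn, middle position 2048); the function names are hypothetical and not part of LeRobot.

```python
STEPS_PER_TURN = 4096  # one full turn, as stated in the note
MID_POSITION = 2048    # position set by the configuration script


def steps_to_degrees(steps):
    """Convert a signed step offset into degrees."""
    return steps * 360.0 / STEPS_PER_TURN


def position_to_degrees(position):
    """Angle of an absolute position relative to the middle position."""
    return steps_to_degrees(position - MID_POSITION)


print(steps_to_degrees(2048))     # 180.0
print(steps_to_degrees(-2048))    # -180.0
print(position_to_degrees(3072))  # 90.0
```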
+
+Then unplug your motor, plug in the second motor, and set its ID to 2:
+```bash
+python lerobot/scripts/configure_motor.py \
+  --port /dev/tty.usbmodem58760432961 \
+  --brand feetech \
+  --model sts3215 \
+  --baudrate 1000000 \
+  --ID 2
+```
+
+Repeat the process for each remaining motor, up to ID 6. Do the same for the 6 motors of the leader arm.
+
+**Remove the gears of the 6 leader motors**
+Follow step 2 of the [assembly video](https://www.youtube.com/watch?v=FioA2oeFZ5I). You need to remove the gears from the motors of the leader arm. As a result, you will only use the position encoding of the motors, and the reduced friction makes the leader arm easier to operate.
+
+**Add motor horn to the motors**
+Follow step 3 of the [assembly video](https://www.youtube.com/watch?v=FioA2oeFZ5I). For SO-100, you need to align the holes on the motor horn with the motor spline so that they sit at approximately the 1:30, 4:30, 7:30 and 10:30 clock positions.
+Try to avoid rotating the motor while doing so, to keep position 2048 set during configuration. This is especially tricky for the leader motors, which are more sensitive without the gears, but it's OK if a motor ends up slightly rotated.
+
+## Assemble the arms
+
+Follow step 4 of the [assembly video](https://www.youtube.com/watch?v=FioA2oeFZ5I). The first arm should take a bit more than 1 hour to assemble, but once you get used to it, you can assemble the second arm in under 1 hour.
+
+## Calibrate
+
+Next, you'll need to calibrate your SO-100 robot to ensure that the leader and follower arms have the same position values when they are in the same physical position. This calibration is essential because it allows a neural network trained on one SO-100 robot to work on another.
+
+**Manual calibration of follower arm**
+/!\ Contrary to step 6 of the [assembly video](https://www.youtube.com/watch?v=FioA2oeFZ5I), which illustrates auto calibration, we will actually do manual calibration of the follower arm for now.
+
+You will need to move the follower arm to these positions sequentially:
+
+| 1. Zero position | 2. Rotated position | 3. Rest position |
+|---|---|---|
+| <img src="../media/so100/follower_zero.webp?raw=true" alt="SO-100 follower arm zero position" title="SO-100 follower arm zero position" style="width:100%;"> | <img src="../media/so100/follower_rotated.webp?raw=true" alt="SO-100 follower arm rotated position" title="SO-100 follower arm rotated position" style="width:100%;"> | <img src="../media/so100/follower_rest.webp?raw=true" alt="SO-100 follower arm rest position" title="SO-100 follower arm rest position" style="width:100%;"> |
+
+Make sure both arms are connected and run this script to launch manual calibration:
+```bash
+python lerobot/scripts/control_robot.py calibrate \
+    --robot-path lerobot/configs/robot/so100.yaml \
+    --robot-overrides '~cameras' --arms main_follower
+```
+
+**Manual calibration of leader arm**
+Follow step 6 of the [assembly video](https://www.youtube.com/watch?v=FioA2oeFZ5I) which illustrates the manual calibration. You will need to move the leader arm to these positions sequentially:
+
+| 1. Zero position | 2. Rotated position | 3. Rest position |
+|---|---|---|
+| <img src="../media/so100/leader_zero.webp?raw=true" alt="SO-100 leader arm zero position" title="SO-100 leader arm zero position" style="width:100%;"> | <img src="../media/so100/leader_rotated.webp?raw=true" alt="SO-100 leader arm rotated position" title="SO-100 leader arm rotated position" style="width:100%;"> | <img src="../media/so100/leader_rest.webp?raw=true" alt="SO-100 leader arm rest position" title="SO-100 leader arm rest position" style="width:100%;"> |
+
+Run this script to launch manual calibration:
+```bash
+python lerobot/scripts/control_robot.py calibrate \
+    --robot-path lerobot/configs/robot/so100.yaml \
+    --robot-overrides '~cameras' --arms main_leader
+```
+
+## Teleoperate
+
+**Simple teleop**
+Then you are ready to teleoperate your robot! Run this simple script (it won't connect and display the cameras):
+```bash
+python lerobot/scripts/control_robot.py teleoperate \
+    --robot-path lerobot/configs/robot/so100.yaml \
+    --robot-overrides '~cameras' \
+    --display-cameras 0
+```
+
+
+**Teleop with camera display**
+Follow [this guide to set up your cameras](https://github.com/huggingface/lerobot/blob/main/examples/7_get_started_with_real_robot.md#c-add-your-cameras-with-opencvcamera). Then you will be able to display the cameras on your computer while you are teleoperating by running the following command. This is useful to prepare your setup before recording your first dataset.
+```bash
+python lerobot/scripts/control_robot.py teleoperate \
+    --robot-path lerobot/configs/robot/so100.yaml
+```
+
+## Record a dataset
+
+Once you're familiar with teleoperation, you can record your first dataset with SO-100.
+
+If you want to use the Hugging Face hub features for uploading your dataset and you haven't previously done it, make sure you've logged in using a write-access token, which can be generated from the [Hugging Face settings](https://huggingface.co/settings/tokens):
+```bash
+huggingface-cli login --token ${HUGGINGFACE_TOKEN} --add-to-git-credential
+```
+
+Store your Hugging Face username in a variable so that you can run these commands:
+```bash
+HF_USER=$(huggingface-cli whoami | head -n 1)
+echo $HF_USER
+```
+
+Record 2 episodes and upload your dataset to the hub:
+```bash
+python lerobot/scripts/control_robot.py record \
+    --robot-path lerobot/configs/robot/so100.yaml \
+    --fps 30 \
+    --root data \
+    --repo-id ${HF_USER}/so100_test \
+    --tags so100 tutorial \
+    --warmup-time-s 5 \
+    --episode-time-s 40 \
+    --reset-time-s 10 \
+    --num-episodes 2 \
+    --push-to-hub 1
+```
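The timing flags in the command above determine roughly how long a recording session lasts. The Python sketch below estimates an upper bound, assuming the warmup happens once and a reset period follows every episode (whether the script resets after the last episode is an assumption here); the helper names are hypothetical.

```python
def estimate_session_seconds(warmup_s, episode_s, reset_s, num_episodes):
    """Upper bound on session length: one warmup, then episode + reset each time."""
    return warmup_s + num_episodes * (episode_s + reset_s)


def frames_per_episode(fps, episode_s):
    """Number of frames recorded in one episode at a fixed frame rate."""
    return fps * episode_s


# Values from the record command above.
print(estimate_session_seconds(5, 40, 10, 2))  # 105
print(frames_per_episode(30, 40))              # 1200
```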
+
+## Visualize a dataset
+
+If you uploaded your dataset to the hub with `--push-to-hub 1`, you can [visualize your dataset online](https://huggingface.co/spaces/lerobot/visualize_dataset) by copy-pasting your repo id given by:
+```bash
+echo ${HF_USER}/so100_test
+```
+
+If you didn't upload your dataset (i.e. you used `--push-to-hub 0`), you can still visualize it locally with:
+```bash
+python lerobot/scripts/visualize_dataset_html.py \
+  --root data \
+  --repo-id ${HF_USER}/so100_test
+```
+
+## Replay an episode
+
+Now try to replay the first episode on your robot:
+```bash
+DATA_DIR=data python lerobot/scripts/control_robot.py replay \
+    --robot-path lerobot/configs/robot/so100.yaml \
+    --fps 30 \
+    --root data \
+    --repo-id ${HF_USER}/so100_test \
+    --episode 0
+```
+
+## Train a policy
+
+To train a policy to control your robot, use the [`python lerobot/scripts/train.py`](../lerobot/scripts/train.py) script. A few arguments are required. Here is an example command:
+```bash
+DATA_DIR=data python lerobot/scripts/train.py \
+  dataset_repo_id=${HF_USER}/so100_test \
+  policy=act_so100_real \
+  env=so100_real \
+  hydra.run.dir=outputs/train/act_so100_test \
+  hydra.job.name=act_so100_test \
+  device=cuda \
+  wandb.enable=true
+```
+
+Let's explain it:
+1. We provided the dataset as argument with `dataset_repo_id=${HF_USER}/so100_test`.
+2. We provided the policy with `policy=act_so100_real`. This loads configurations from [`lerobot/configs/policy/act_so100_real.yaml`](../lerobot/configs/policy/act_so100_real.yaml). Importantly, this policy uses 2 cameras as input: `laptop` and `phone`.
+3. We provided an environment as argument with `env=so100_real`. This loads configurations from [`lerobot/configs/env/so100_real.yaml`](../lerobot/configs/env/so100_real.yaml).
+4. We provided `device=cuda` since we are training on an Nvidia GPU, but you can also use `device=mps` if you are using a Mac with Apple silicon, or `device=cpu` otherwise.
+5. We provided `wandb.enable=true` to use [Weights and Biases](https://docs.wandb.ai/quickstart) for visualizing training plots. This is optional but if you use it, make sure you are logged in by running `wandb login`.
+6. We added `DATA_DIR=data` to access your dataset stored in your local `data` directory. If you don't provide `DATA_DIR`, your dataset will be downloaded from the Hugging Face hub to your cache folder `$HOME/.cache/huggingface`. In future versions of `lerobot`, both directories will be in sync.
+
+Training should take several hours. You will find checkpoints in `outputs/train/act_so100_test/checkpoints`.
+
+## Evaluate your policy
+
+You can use the `record` function from [`lerobot/scripts/control_robot.py`](../lerobot/scripts/control_robot.py) but with a policy checkpoint as input. For instance, run this command to record 10 evaluation episodes:
+```bash
+python lerobot/scripts/control_robot.py record \
+  --robot-path lerobot/configs/robot/so100.yaml \
+  --fps 30 \
+  --root data \
+  --repo-id ${HF_USER}/eval_act_so100_test \
+  --tags so100 tutorial eval \
+  --warmup-time-s 5 \
+  --episode-time-s 40 \
+  --reset-time-s 10 \
+  --num-episodes 10 \
+  -p outputs/train/act_so100_test/checkpoints/last/pretrained_model
+```
+
+As you can see, it's almost the same command as previously used to record your training dataset. Two things changed:
+1. There is an additional `-p` argument which indicates the path to your policy checkpoint (e.g. `-p outputs/train/act_so100_test/checkpoints/last/pretrained_model`). You can also use the model repository if you uploaded a model checkpoint to the hub (e.g. `-p ${HF_USER}/act_so100_test`).
+2. The name of the dataset begins with `eval` to reflect that you are running inference (e.g. `--repo-id ${HF_USER}/eval_act_so100_test`).
+
+## More
+
+Follow this [previous tutorial](https://github.com/huggingface/lerobot/blob/main/examples/7_get_started_with_real_robot.md#4-train-a-policy-on-your-data) for a more in-depth tutorial on controlling real robots with LeRobot.
+
+If you have any questions or need help, please reach out on Discord in the channel [`#so100-arm`](https://discord.com/channels/1216765309076115607/1237741463832363039).
--- a/examples/11_use_moss.md
+++ b/examples/11_use_moss.md
@@ -0,0 +1,280 @@
+This tutorial explains how to use [Moss v1](https://github.com/jess-moss/moss-robot-arms) with LeRobot.
+
+## Source the parts
+
+Follow this [README](https://github.com/jess-moss/moss-robot-arms). It contains the bill of materials with links to source the parts, instructions to 3D print the parts, and advice if it's your first time printing or if you don't already own a 3D printer.
+
+**Important**: Before assembling, you will first need to configure your motors. To this end, we provide a nice script, so let's first install LeRobot. After configuration, we will also guide you through assembly.
+
+## Install LeRobot
+
+On your computer:
+
+1. [Install Miniconda](https://docs.anaconda.com/miniconda/#quick-command-line-install):
+```bash
+mkdir -p ~/miniconda3
+wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh
+bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
+rm ~/miniconda3/miniconda.sh
+~/miniconda3/bin/conda init bash
+```
+
+2. Restart shell or `source ~/.bashrc`
+
+3. Create and activate a fresh conda environment for lerobot:
+```bash
+conda create -y -n lerobot python=3.10 && conda activate lerobot
+```
+
+4. Clone LeRobot:
+```bash
+git clone https://github.com/huggingface/lerobot.git ~/lerobot
+```
+
+5. Install LeRobot with dependencies for the feetech motors:
+```bash
+cd ~/lerobot && pip install -e ".[feetech]"
+```
+
+For Linux only (not Mac), install extra dependencies for recording datasets:
+```bash
+conda install -y -c conda-forge ffmpeg
+pip uninstall -y opencv-python
+conda install -y -c conda-forge "opencv>=4.10.0"
+```
+
+## Configure the motors
+
+Follow step 1 of the [assembly video](https://www.youtube.com/watch?v=DA91NJOtMic) which illustrates the use of our scripts below.
+
+**Find USB ports associated with your arms**
+To find the correct ports for each arm, run the utility script twice:
+```bash
+python lerobot/scripts/find_motors_bus_port.py
+```
+
+Example output when identifying the leader arm's port (e.g., `/dev/tty.usbmodem575E0031751` on Mac, or possibly `/dev/ttyACM0` on Linux):
+```
+Finding all available ports for the MotorBus.
+['/dev/tty.usbmodem575E0032081', '/dev/tty.usbmodem575E0031751']
+Remove the usb cable from your DynamixelMotorsBus and press Enter when done.
+
+[...Disconnect leader arm and press Enter...]
+
+The port of this DynamixelMotorsBus is /dev/tty.usbmodem575E0031751
+Reconnect the usb cable.
+```
+
+Example output when identifying the follower arm's port (e.g., `/dev/tty.usbmodem575E0032081`, or possibly `/dev/ttyACM1` on Linux):
+```
+Finding all available ports for the MotorBus.
+['/dev/tty.usbmodem575E0032081', '/dev/tty.usbmodem575E0031751']
+Remove the usb cable from your DynamixelMotorsBus and press Enter when done.
+
+[...Disconnect follower arm and press Enter...]
+
+The port of this DynamixelMotorsBus is /dev/tty.usbmodem575E0032081
+Reconnect the usb cable.
+```
+
+Troubleshooting: On Linux, you might need to give access to the USB ports by running:
+```bash
+sudo chmod 666 /dev/ttyACM0
+sudo chmod 666 /dev/ttyACM1
+```
+
+**Configure your motors**
+Plug in your first motor and run this script to set its ID to 1. It will also set its present position to 2048, so expect your motor to rotate:
+```bash
+python lerobot/scripts/configure_motor.py \
+  --port /dev/tty.usbmodem58760432961 \
+  --brand feetech \
+  --model sts3215 \
+  --baudrate 1000000 \
+  --ID 1
+```
+
+Note: These motors currently have a limited range. Their position can only take values between 0 and 4096, which corresponds to one full turn; they can't turn more than that. 2048 is the middle of this range, so from there we can take -2048 steps (180 degrees anticlockwise) or +2048 steps (180 degrees clockwise) before reaching the end of the range. The configuration step also sets the homing offset to 0, so that if you misassembled the arm, you can always update the homing offset to account for a shift of up to ± 2048 steps (± 180 degrees).
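To illustrate how a homing offset can absorb an assembly shift, here is a small Python sketch. It assumes a simple additive convention (corrected position = raw position + offset), which may differ from the convention LeRobot or the motor firmware actually uses; `homing_offset_for` and `apply_offset` are hypothetical helpers.

```python
MID_POSITION = 2048  # position set by the configuration script
MAX_SHIFT = 2048     # largest shift the note says an offset can absorb


def homing_offset_for(raw_position, target=MID_POSITION):
    """Offset that makes `raw_position` read as `target` (additive convention)."""
    offset = target - raw_position
    if abs(offset) > MAX_SHIFT:
        raise ValueError(f"shift of {offset} steps exceeds +/- {MAX_SHIFT}")
    return offset


def apply_offset(raw_position, offset):
    """Corrected position under the assumed additive convention."""
    return raw_position + offset


# Example: a horn mounted slightly off, so the joint reads 2300 instead of 2048.
offset = homing_offset_for(2300)
print(offset, apply_offset(2300, offset))  # -252 2048
```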
+
+Then unplug your motor, plug in the second motor, and set its ID to 2:
+```bash
+python lerobot/scripts/configure_motor.py \
+  --port /dev/tty.usbmodem58760432961 \
+  --brand feetech \
+  --model sts3215 \
+  --baudrate 1000000 \
+  --ID 2
+```
+
+Repeat the process for each remaining motor, up to ID 6. Do the same for the 6 motors of the leader arm.
+
+**Remove the gears of the 6 leader motors**
+Follow step 2 of the [assembly video](https://www.youtube.com/watch?v=DA91NJOtMic). You need to remove the gears from the motors of the leader arm. As a result, you will only use the position encoding of the motors, and the reduced friction makes the leader arm easier to operate.
+
+**Add motor horn to the motors**
+Follow step 3 of the [assembly video](https://www.youtube.com/watch?v=DA91NJOtMic). For Moss v1, you need to align the holes on the motor horn with the motor spline so that they sit at approximately the 3, 6, 9 and 12 o'clock positions.
+Try to avoid rotating the motor while doing so, to keep position 2048 set during configuration. This is especially tricky for the leader motors, which are more sensitive without the gears, but it's OK if a motor ends up slightly rotated.
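The horn alignments for the two robots are given as clock positions. The Python sketch below converts them to angles measured clockwise from 12 o'clock, which shows that the SO-100 pattern is the Moss v1 pattern rotated by 45 degrees. `clock_to_degrees` is a hypothetical helper used only for this illustration.

```python
def clock_to_degrees(hour, minute=0):
    """Angle in degrees, clockwise from 12 o'clock, of a clock-face position."""
    return ((hour % 12) + minute / 60.0) * 30.0


moss = [clock_to_degrees(h) for h in (3, 6, 9, 12)]
so100 = [clock_to_degrees(h, 30) for h in (1, 4, 7, 10)]
print(moss)   # [90.0, 180.0, 270.0, 0.0]
print(so100)  # [45.0, 135.0, 225.0, 315.0]
```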
+
+## Assemble the arms
+
+Follow step 4 of the [assembly video](https://www.youtube.com/watch?v=DA91NJOtMic). The first arm should take a bit more than 1 hour to assemble, but once you get used to it, you can assemble the second arm in under 1 hour.
+
+## Calibrate
+
+Next, you'll need to calibrate your Moss v1 robot to ensure that the leader and follower arms have the same position values when they are in the same physical position. This calibration is essential because it allows a neural network trained on one Moss v1 robot to work on another.
+
+**Manual calibration of follower arm**
+/!\ Contrary to step 6 of the [assembly video](https://www.youtube.com/watch?v=DA91NJOtMic), which illustrates auto calibration, we will actually do manual calibration of the follower arm for now.
+
+You will need to move the follower arm to these positions sequentially:
+
+| 1. Zero position | 2. Rotated position | 3. Rest position |
+|---|---|---|
+| <img src="../media/moss/follower_zero.webp?raw=true" alt="Moss v1 follower arm zero position" title="Moss v1 follower arm zero position" style="width:100%;"> | <img src="../media/moss/follower_rotated.webp?raw=true" alt="Moss v1 follower arm rotated position" title="Moss v1 follower arm rotated position" style="width:100%;"> | <img src="../media/moss/follower_rest.webp?raw=true" alt="Moss v1 follower arm rest position" title="Moss v1 follower arm rest position" style="width:100%;"> |
+
+Make sure both arms are connected and run this script to launch manual calibration:
+```bash
+python lerobot/scripts/control_robot.py calibrate \
+    --robot-path lerobot/configs/robot/moss.yaml \
+    --robot-overrides '~cameras' --arms main_follower
+```
+
+**Manual calibration of leader arm**
+Follow step 6 of the [assembly video](https://www.youtube.com/watch?v=DA91NJOtMic) which illustrates the manual calibration. You will need to move the leader arm to these positions sequentially:
+
+| 1. Zero position | 2. Rotated position | 3. Rest position |
+|---|---|---|
+| <img src="../media/moss/leader_zero.webp?raw=true" alt="Moss v1 leader arm zero position" title="Moss v1 leader arm zero position" style="width:100%;"> | <img src="../media/moss/leader_rotated.webp?raw=true" alt="Moss v1 leader arm rotated position" title="Moss v1 leader arm rotated position" style="width:100%;"> | <img src="../media/moss/leader_rest.webp?raw=true" alt="Moss v1 leader arm rest position" title="Moss v1 leader arm rest position" style="width:100%;"> |
+
+Run this script to launch manual calibration:
+```bash
+python lerobot/scripts/control_robot.py calibrate \
+    --robot-path lerobot/configs/robot/moss.yaml \
+    --robot-overrides '~cameras' --arms main_leader
+```
+
+## Teleoperate
+
+**Simple teleop**
+Then you are ready to teleoperate your robot! Run this simple script (it won't connect and display the cameras):
+```bash
+python lerobot/scripts/control_robot.py teleoperate \
+    --robot-path lerobot/configs/robot/moss.yaml \
+    --robot-overrides '~cameras' \
+    --display-cameras 0
+```
+
+
+**Teleop with camera display**
+Follow [this guide to set up your cameras](https://github.com/huggingface/lerobot/blob/main/examples/7_get_started_with_real_robot.md#c-add-your-cameras-with-opencvcamera). Then you will be able to display the cameras on your computer while you are teleoperating by running the following command. This is useful to prepare your setup before recording your first dataset.
+```bash
+python lerobot/scripts/control_robot.py teleoperate \
+    --robot-path lerobot/configs/robot/moss.yaml
+```
+
+## Record a dataset
+
+Once you're familiar with teleoperation, you can record your first dataset with Moss v1.
+
+If you want to use the Hugging Face hub features for uploading your dataset and you haven't previously done it, make sure you've logged in using a write-access token, which can be generated from the [Hugging Face settings](https://huggingface.co/settings/tokens):
+```bash
+huggingface-cli login --token ${HUGGINGFACE_TOKEN} --add-to-git-credential
+```
+
+Store your Hugging Face username in a variable so that you can run these commands:
+```bash
+HF_USER=$(huggingface-cli whoami | head -n 1)
+echo $HF_USER
+```
+
+Record 2 episodes and upload your dataset to the hub:
+```bash
+python lerobot/scripts/control_robot.py record \
+    --robot-path lerobot/configs/robot/moss.yaml \
+    --fps 30 \
+    --root data \
+    --repo-id ${HF_USER}/moss_test \
+    --tags moss tutorial \
+    --warmup-time-s 5 \
+    --episode-time-s 40 \
+    --reset-time-s 10 \
+    --num-episodes 2 \
+    --push-to-hub 1
+```
+
+## Visualize a dataset
+
+If you uploaded your dataset to the hub with `--push-to-hub 1`, you can [visualize your dataset online](https://huggingface.co/spaces/lerobot/visualize_dataset) by copy-pasting your repo id given by:
+```bash
+echo ${HF_USER}/moss_test
+```
+
+If you didn't upload your dataset (i.e. you used `--push-to-hub 0`), you can still visualize it locally with:
+```bash
+python lerobot/scripts/visualize_dataset_html.py \
+  --root data \
+  --repo-id ${HF_USER}/moss_test
+```
+
+## Replay an episode
+
+Now try to replay the first episode on your robot:
+```bash
+DATA_DIR=data python lerobot/scripts/control_robot.py replay \
+    --robot-path lerobot/configs/robot/moss.yaml \
+    --fps 30 \
+    --root data \
+    --repo-id ${HF_USER}/moss_test \
+    --episode 0
+```
+
+## Train a policy
+
+To train a policy to control your robot, use the [`python lerobot/scripts/train.py`](../lerobot/scripts/train.py) script. A few arguments are required. Here is an example command:
+```bash
+DATA_DIR=data python lerobot/scripts/train.py \
+  dataset_repo_id=${HF_USER}/moss_test \
+  policy=act_moss_real \
+  env=moss_real \
+  hydra.run.dir=outputs/train/act_moss_test \
+  hydra.job.name=act_moss_test \
+  device=cuda \
+  wandb.enable=true
+```
+
+Let's explain it:
+1. We provided the dataset as argument with `dataset_repo_id=${HF_USER}/moss_test`.
+2. We provided the policy with `policy=act_moss_real`. This loads configurations from [`lerobot/configs/policy/act_moss_real.yaml`](../lerobot/configs/policy/act_moss_real.yaml). Importantly, this policy uses 2 cameras as input: `laptop` and `phone`.
+3. We provided an environment as argument with `env=moss_real`. This loads configurations from [`lerobot/configs/env/moss_real.yaml`](../lerobot/configs/env/moss_real.yaml).
+4. We provided `device=cuda` since we are training on an Nvidia GPU, but you can also use `device=mps` if you are using a Mac with Apple silicon, or `device=cpu` otherwise.
+5. We provided `wandb.enable=true` to use [Weights and Biases](https://docs.wandb.ai/quickstart) for visualizing training plots. This is optional but if you use it, make sure you are logged in by running `wandb login`.
+6. We added `DATA_DIR=data` to access your dataset stored in your local `data` directory. If you don't provide `DATA_DIR`, your dataset will be downloaded from the Hugging Face hub to your cache folder `$HOME/.cache/huggingface`. In future versions of `lerobot`, both directories will be in sync.
+
+Training should take several hours. You will find checkpoints in `outputs/train/act_moss_test/checkpoints`.
+
+## Evaluate your policy
+
+You can use the `record` function from [`lerobot/scripts/control_robot.py`](../lerobot/scripts/control_robot.py) but with a policy checkpoint as input. For instance, run this command to record 10 evaluation episodes:
+```bash
+python lerobot/scripts/control_robot.py record \
+  --robot-path lerobot/configs/robot/moss.yaml \
+  --fps 30 \
+  --root data \
+  --repo-id ${HF_USER}/eval_act_moss_test \
+  --tags moss tutorial eval \
+  --warmup-time-s 5 \
+  --episode-time-s 40 \
+  --reset-time-s 10 \
+  --num-episodes 10 \
+  -p outputs/train/act_moss_test/checkpoints/last/pretrained_model
+```
+
+As you can see, it's almost the same command as previously used to record your training dataset. Two things changed:
+1. There is an additional `-p` argument which indicates the path to your policy checkpoint (e.g. `-p outputs/train/act_moss_test/checkpoints/last/pretrained_model`). You can also use the model repository if you uploaded a model checkpoint to the hub (e.g. `-p ${HF_USER}/act_moss_test`).
+2. The name of the dataset begins with `eval` to reflect that you are running inference (e.g. `--repo-id ${HF_USER}/eval_act_moss_test`).
+
+## More
+
+Follow this [previous tutorial](https://github.com/huggingface/lerobot/blob/main/examples/7_get_started_with_real_robot.md#4-train-a-policy-on-your-data) for a more in-depth tutorial on controlling real robots with LeRobot.
+
+If you have any questions or need help, please reach out on Discord in the channel [`#moss-arm`](https://discord.com/channels/1216765309076115607/1275374638985252925).
--- a/examples/7_get_started_with_real_robot.md
+++ b/examples/7_get_started_with_real_robot.md
@@ -11,7 +11,7 @@ This tutorial will guide you through the process of setting up and training a ne

 By following these steps, you'll be able to replicate tasks like picking up a Lego block and placing it in a bin with a high success rate, as demonstrated in [this video](https://x.com/RemiCadene/status/1814680760592572934).

-This tutorial is specifically made for the affordable [Koch v1.1](https://github.com/jess-moss/koch-v1-1) robot, but it contains additional information to be easily adapted to various types of robots like [Aloha bimanual robot](aloha-2.github.io) by changing some configurations. The Koch v1.1 consists of a leader arm and a follower arm, each with 6 motors. It can work with one or several cameras to record the scene, which serve as visual sensors for the robot.
+This tutorial is specifically made for the affordable [Koch v1.1](https://github.com/jess-moss/koch-v1-1) robot, but it contains additional information to be easily adapted to various types of robots like [Aloha bimanual robot](https://aloha-2.github.io) by changing some configurations. The Koch v1.1 consists of a leader arm and a follower arm, each with 6 motors. It can work with one or several cameras to record the scene, which serve as visual sensors for the robot.

 During the data collection phase, you will control the follower arm by moving the leader arm. This process is known as "teleoperation." This technique is used to collect robot trajectories. Afterward, you'll train a neural network to imitate these trajectories and deploy the network to enable your robot to operate autonomously.

@@ -78,12 +78,12 @@ To begin, create two instances of the  [`DynamixelMotorsBus`](../lerobot/common/

 To find the correct ports for each arm, run the utility script twice:
 ```bash
-python lerobot/common/robot_devices/motors/dynamixel.py
+python lerobot/scripts/find_motors_bus_port.py
 ```

 Example output when identifying the leader arm's port (e.g., `/dev/tty.usbmodem575E0031751` on Mac, or possibly `/dev/ttyACM0` on Linux):
 ```
-Finding all available ports for the DynamixelMotorsBus.
+Finding all available ports for the MotorBus.
 ['/dev/tty.usbmodem575E0032081', '/dev/tty.usbmodem575E0031751']
 Remove the usb cable from your DynamixelMotorsBus and press Enter when done.

@@ -95,7 +95,7 @@ Reconnect the usb cable.

 Example output when identifying the follower arm's port (e.g., `/dev/tty.usbmodem575E0032081`, or possibly `/dev/ttyACM1` on Linux):
 ```
-Finding all available ports for the DynamixelMotorsBus.
+Finding all available ports for the MotorBus.
 ['/dev/tty.usbmodem575E0032081', '/dev/tty.usbmodem575E0031751']
 Remove the usb cable from your DynamixelMotorsBus and press Enter when done.

--- a/examples/8_use_stretch.md
+++ b/examples/8_use_stretch.md
@@ -50,7 +50,7 @@ cd ~/lerobot && pip install -e ".[stretch]"

 > **Note:** If you get this message, you can ignore it: `ERROR: pip's dependency resolver does not currently take into account all the packages that are installed.`

-And install extra dependencies for recording datasets on Linux:
+For Linux only (not Mac), install extra dependencies for recording datasets:
 ```bash
 conda install -y -c conda-forge ffmpeg
 pip uninstall -y opencv-python
--- a/examples/9_use_aloha.md
+++ b/examples/9_use_aloha.md
@@ -0,0 +1,179 @@
+This tutorial explains how to use [Aloha and Aloha 2 stationary](https://www.trossenrobotics.com/aloha-stationary) with LeRobot.
+
+## Setup
+
+Follow the [documentation from Trossen Robotics](https://docs.trossenrobotics.com/aloha_docs/getting_started/stationary/hardware_setup.html) for setting up the hardware and plugging the 4 arms and 4 cameras into your computer.
+
+
+## Install LeRobot
+
+On your computer:
+
+1. [Install Miniconda](https://docs.anaconda.com/miniconda/#quick-command-line-install):
+```bash
+mkdir -p ~/miniconda3
+wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh
+bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
+rm ~/miniconda3/miniconda.sh
+~/miniconda3/bin/conda init bash
+```
+
+2. Restart shell or `source ~/.bashrc`
+
+3. Create and activate a fresh conda environment for lerobot:
+```bash
+conda create -y -n lerobot python=3.10 && conda activate lerobot
+```
+
+4. Clone LeRobot:
+```bash
+git clone https://github.com/huggingface/lerobot.git ~/lerobot
+```
+
+5. Install LeRobot with dependencies for the Aloha motors (dynamixel) and cameras (intelrealsense):
+```bash
+cd ~/lerobot && pip install -e ".[dynamixel, intelrealsense]"
+```
+
+For Linux only (not Mac), install extra dependencies for recording datasets:
+```bash
+conda install -y -c conda-forge ffmpeg
+pip uninstall -y opencv-python
+conda install -y -c conda-forge "opencv>=4.10.0"
+```
+
+## Teleoperate
+
+**/!\ FOR SAFETY, READ THIS /!\**
+Teleoperation consists of manually operating the leader arms to move the follower arms. Importantly:
+1. Make sure your leader arms are in the same position as the follower arms, so that the follower arms don't move too fast to match the leader arms.
+2. Our code assumes that your robot has been assembled following Trossen Robotics instructions. This allows us to skip calibration, as we use the pre-defined calibration files in `.cache/calibration/aloha_default`. If you replace a motor, make sure you follow the exact instructions from Trossen Robotics.
+
+By running the following code, you can start your first **SAFE** teleoperation:
+```bash
+python lerobot/scripts/control_robot.py teleoperate \
+    --robot-path lerobot/configs/robot/aloha.yaml \
+    --robot-overrides max_relative_target=5
+```
+
+By adding `--robot-overrides max_relative_target=5`, we override the default value for `max_relative_target` defined in `lerobot/configs/robot/aloha.yaml`. A value of `5` limits the magnitude of each movement for more safety, but the teleoperation won't be smooth. When you feel confident, you can disable this limit by adding `--robot-overrides max_relative_target=null` to the command line:
+```bash
+python lerobot/scripts/control_robot.py teleoperate \
+    --robot-path lerobot/configs/robot/aloha.yaml \
+    --robot-overrides max_relative_target=null
+```
+
+## Record a dataset
+
+Once you're familiar with teleoperation, you can record your first dataset with Aloha.
+
+If you want to use the Hugging Face hub features for uploading your dataset and you haven't previously done it, make sure you've logged in using a write-access token, which can be generated from the [Hugging Face settings](https://huggingface.co/settings/tokens):
+```bash
+huggingface-cli login --token ${HUGGINGFACE_TOKEN} --add-to-git-credential
+```
+
+Store your Hugging Face repository name in a variable to run these commands:
+```bash
+HF_USER=$(huggingface-cli whoami | head -n 1)
+echo $HF_USER
+```
+
+Record 2 episodes and upload your dataset to the hub:
+```bash
+python lerobot/scripts/control_robot.py record \
+    --robot-path lerobot/configs/robot/aloha.yaml \
+    --robot-overrides max_relative_target=null \
+    --fps 30 \
+    --root data \
+    --repo-id ${HF_USER}/aloha_test \
+    --tags aloha tutorial \
+    --warmup-time-s 5 \
+    --episode-time-s 40 \
+    --reset-time-s 10 \
+    --num-episodes 2 \
+    --push-to-hub 1
+```
+
+## Visualize a dataset
+
+If you uploaded your dataset to the hub with `--push-to-hub 1`, you can [visualize your dataset online](https://huggingface.co/spaces/lerobot/visualize_dataset) by copying and pasting your repo id given by:
+```bash
+echo ${HF_USER}/aloha_test
+```
+
+If you didn't upload your dataset (i.e. you used `--push-to-hub 0`), you can also visualize it locally with:
+```bash
+python lerobot/scripts/visualize_dataset_html.py \
+  --root data \
+  --repo-id ${HF_USER}/aloha_test
+```
+
+## Replay an episode
+
+**/!\ FOR SAFETY, READ THIS /!\**
+Replay consists of automatically replaying the sequence of actions (i.e. goal positions for your motors) recorded in a given dataset episode. Make sure the current initial position of your robot is similar to the one in your episode, so that your follower arms don't move too fast to go to the first goal positions. For safety, you might want to add `--robot-overrides max_relative_target=5` to your command line as explained above.
+
+Now try to replay the first episode on your robot:
+```bash
+python lerobot/scripts/control_robot.py replay \
+    --robot-path lerobot/configs/robot/aloha.yaml \
+    --robot-overrides max_relative_target=null \
+    --fps 30 \
+    --root data \
+    --repo-id ${HF_USER}/aloha_test \
+    --episode 0
+```
+
+## Train a policy
+
+To train a policy to control your robot, use the [`python lerobot/scripts/train.py`](../lerobot/scripts/train.py) script. A few arguments are required. Here is an example command:
+```bash
+DATA_DIR=data python lerobot/scripts/train.py \
+  dataset_repo_id=${HF_USER}/aloha_test \
+  policy=act_aloha_real \
+  env=aloha_real \
+  hydra.run.dir=outputs/train/act_aloha_test \
+  hydra.job.name=act_aloha_test \
+  device=cuda \
+  wandb.enable=true
+```
+
+Let's explain it:
+1. We provided the dataset as argument with `dataset_repo_id=${HF_USER}/aloha_test`.
+2. We provided the policy with `policy=act_aloha_real`. This loads configurations from [`lerobot/configs/policy/act_aloha_real.yaml`](../lerobot/configs/policy/act_aloha_real.yaml). Importantly, this policy uses 4 cameras as input: `cam_right_wrist`, `cam_left_wrist`, `cam_high`, and `cam_low`.
+3. We provided an environment as argument with `env=aloha_real`. This loads configurations from [`lerobot/configs/env/aloha_real.yaml`](../lerobot/configs/env/aloha_real.yaml). Note: this yaml defines 18 dimensions for the `state_dim` and `action_dim`, corresponding to 18 motors, not 14 motors as used in previous Aloha work. This is because we include the `shoulder_shadow` and `elbow_shadow` motors for simplicity.
+4. We provided `device=cuda` since we are training on an NVIDIA GPU.
+5. We provided `wandb.enable=true` to use [Weights and Biases](https://docs.wandb.ai/quickstart) for visualizing training plots. This is optional but if you use it, make sure you are logged in by running `wandb login`.
+6. We added `DATA_DIR=data` to access your dataset stored in your local `data` directory. If you don't provide `DATA_DIR`, your dataset will be downloaded from the Hugging Face hub to your cache folder `$HOME/.cache/huggingface`. In future versions of `lerobot`, both directories will be in sync.
+
+Training should take several hours. You will find checkpoints in `outputs/train/act_aloha_test/checkpoints`.
+
+## Evaluate your policy
+
+You can use the `record` function from [`lerobot/scripts/control_robot.py`](../lerobot/scripts/control_robot.py) but with a policy checkpoint as input. For instance, run this command to record 10 evaluation episodes:
+```bash
+python lerobot/scripts/control_robot.py record \
+  --robot-path lerobot/configs/robot/aloha.yaml \
+  --robot-overrides max_relative_target=null \
+  --fps 30 \
+  --root data \
+  --repo-id ${HF_USER}/eval_act_aloha_test \
+  --tags aloha tutorial eval \
+  --warmup-time-s 5 \
+  --episode-time-s 40 \
+  --reset-time-s 10 \
+  --num-episodes 10 \
+  --num-image-writer-processes 1 \
+  -p outputs/train/act_aloha_test/checkpoints/last/pretrained_model
+```
+
+As you can see, it's almost the same command as previously used to record your training dataset. Three things changed:
+1. There is an additional `-p` argument which indicates the path to your policy checkpoint (e.g. `-p outputs/train/act_aloha_test/checkpoints/last/pretrained_model`). You can also use the model repository if you uploaded a model checkpoint to the hub (e.g. `-p ${HF_USER}/act_aloha_test`).
+2. The name of the dataset begins with `eval` to reflect that you are running inference (e.g. `--repo-id ${HF_USER}/eval_act_aloha_test`).
+3. We use `--num-image-writer-processes 1` instead of the default value (`0`). On our computer, using a dedicated process to write images from the 4 cameras to disk lets us reach a constant 30 fps during inference. Feel free to explore different values for `--num-image-writer-processes`.
+
+## More
+
+Follow this [previous tutorial](https://github.com/huggingface/lerobot/blob/main/examples/7_get_started_with_real_robot.md#4-train-a-policy-on-your-data) for a more in-depth explanation.
+
+If you have any questions or need help, please reach out on Discord in the channel `#aloha-arm`.
--- a/lerobot/__init__.py
+++ b/lerobot/__init__.py
@@ -195,9 +195,11 @@ available_policies = [

 # lists all available robots from `lerobot/common/robot_devices/robots`
 available_robots = [
-    # "koch",
-    # "koch_bimanual",
+    "koch",
+    "koch_bimanual",
    "aloha",
+    "so100",
+    "moss",
 ]

 # lists all available cameras from `lerobot/common/robot_devices/cameras`
@@ -209,6 +211,7 @@ available_cameras = [
 # lists all available motors from `lerobot/common/robot_devices/motors`
 available_motors = [
    "dynamixel",
+    "feetech",
 ]

 # keys and values refer to yaml files
--- a/lerobot/common/datasets/populate_dataset.py
+++ b/lerobot/common/datasets/populate_dataset.py
@@ -0,0 +1,468 @@
+"""Functions to create an empty dataset, and populate it with frames."""
+# TODO(rcadene, aliberts): to adapt as class methods of next version of LeRobotDataset
+
+import concurrent.futures
+import json
+import logging
+import multiprocessing
+import shutil
+from pathlib import Path
+
+import torch
+import tqdm
+from PIL import Image
+
+from lerobot.common.datasets.compute_stats import compute_stats
+from lerobot.common.datasets.lerobot_dataset import CODEBASE_VERSION, LeRobotDataset
+from lerobot.common.datasets.push_dataset_to_hub.aloha_hdf5_format import to_hf_dataset
+from lerobot.common.datasets.push_dataset_to_hub.utils import concatenate_episodes, get_default_encoding
+from lerobot.common.datasets.utils import calculate_episode_data_index, create_branch
+from lerobot.common.datasets.video_utils import encode_video_frames
+from lerobot.common.utils.utils import log_say
+from lerobot.scripts.push_dataset_to_hub import (
+    push_dataset_card_to_hub,
+    push_meta_data_to_hub,
+    push_videos_to_hub,
+    save_meta_data,
+)
+
+########################################################################################
+# Asynchronous saving of images on disk
+########################################################################################
+
+
+def safe_stop_image_writer(func):
+    # TODO(aliberts): Allow to pass custom exceptions
+    # (e.g. ThreadServiceExit, KeyboardInterrupt, SystemExit, UnpluggedError, DynamixelCommError)
+    def wrapper(*args, **kwargs):
+        try:
+            return func(*args, **kwargs)
+        except Exception as e:
+            image_writer = kwargs.get("dataset", {}).get("image_writer")
+            if image_writer is not None:
+                print("Waiting for image writer to terminate...")
+                stop_image_writer(image_writer, timeout=20)
+            raise e
+
+    return wrapper
+
+
+def save_image(img_tensor, key, frame_index, episode_index, videos_dir: str):
+    img = Image.fromarray(img_tensor.numpy())
+    path = Path(videos_dir) / f"{key}_episode_{episode_index:06d}" / f"frame_{frame_index:06d}.png"
+    path.parent.mkdir(parents=True, exist_ok=True)
+    img.save(str(path), quality=100)
+
+
+def loop_to_save_images_in_threads(image_queue, num_threads):
+    if num_threads < 1:
+        raise NotImplementedError(f"Only `num_threads>=1` is supported for now, but {num_threads=} given.")
+
+    with concurrent.futures.ThreadPoolExecutor(max_workers=num_threads) as executor:
+        futures = []
+        while True:
+            # Blocks until a frame is available
+            frame_data = image_queue.get()
+
+            # As usually done, exit loop when receiving None to stop the worker
+            if frame_data is None:
+                break
+
+            image, key, frame_index, episode_index, videos_dir = frame_data
+            futures.append(executor.submit(save_image, image, key, frame_index, episode_index, videos_dir))
+
+        # Before exiting function, wait for all threads to complete
+        with tqdm.tqdm(total=len(futures), desc="Writing images") as progress_bar:
+            concurrent.futures.wait(futures)
+            progress_bar.update(len(futures))
+
+
+def start_image_writer_processes(image_queue, num_processes, num_threads_per_process):
+    if num_processes < 1:
+        raise ValueError(f"Only `num_processes>=1` is supported, but {num_processes=} given.")
+
+    if num_threads_per_process < 1:
+        raise NotImplementedError(
+            f"Only `num_threads_per_process>=1` is supported for now, but {num_threads_per_process=} given."
+        )
+
+    processes = []
+    for _ in range(num_processes):
+        process = multiprocessing.Process(
+            target=loop_to_save_images_in_threads,
+            args=(image_queue, num_threads_per_process),
+        )
+        process.start()
+        processes.append(process)
+    return processes
+
+
+def stop_processes(processes, queue, timeout):
+    # Send None to each process to signal them to stop
+    for _ in processes:
+        queue.put(None)
+
+    # Wait at most `timeout` seconds for each process to terminate
+    for process in processes:
+        process.join(timeout=timeout)
+
+    # Force termination of every process that is still alive after the timeout
+    for process in (p for p in processes if p.is_alive()):
+        process.terminate()
+
+    # Close the queue, no more items can be put in the queue
+    queue.close()
+
+    # Ensure all background queue threads have finished
+    queue.join_thread()
+
+
+def start_image_writer(num_processes, num_threads):
+    """This function abstracts away the initialisation of processes and/or threads to
+    save images on disk asynchronously, which is critical to control a robot and record data
+    at a high frame rate.
+
+    When `num_processes=0`, it returns a dictionary containing a thread pool of size `num_threads`.
+    When `num_processes>0`, it returns a dictionary containing a process pool of size `num_processes`,
+    where each subprocess starts its own thread pool of size `num_threads`.
+
+    The optimal number of processes and threads depends on your computer capabilities.
+    We advise using 4 threads per camera with 0 processes. If the fps is not stable, try to increase or lower
+    the number of threads. If it is still not stable, try to use 1 subprocess, or more.
+    """
+    image_writer = {}
+
+    if num_processes == 0:
+        futures = []
+        threads_pool = concurrent.futures.ThreadPoolExecutor(max_workers=num_threads)
+        image_writer["threads_pool"], image_writer["futures"] = threads_pool, futures
+    else:
+        # TODO(rcadene): When using num_processes>1, `multiprocessing.Manager().Queue()`
+        # might be better than `multiprocessing.Queue()`. Source: https://www.geeksforgeeks.org/python-multiprocessing-queue-vs-multiprocessing-manager-queue
+        image_queue = multiprocessing.Queue()
+        processes_pool = start_image_writer_processes(
+            image_queue, num_processes=num_processes, num_threads_per_process=num_threads
+        )
+        image_writer["processes_pool"], image_writer["image_queue"] = processes_pool, image_queue
+
+    return image_writer
+
+
+def async_save_image(image_writer, image, key, frame_index, episode_index, videos_dir):
+    """This function abstracts away the saving of an image on disk asynchronously. It uses a dictionary
+    called image writer which contains either a pool of processes or a pool of threads.
+    """
+    if "threads_pool" in image_writer:
+        threads_pool, futures = image_writer["threads_pool"], image_writer["futures"]
+        futures.append(threads_pool.submit(save_image, image, key, frame_index, episode_index, videos_dir))
+    else:
+        image_queue = image_writer["image_queue"]
+        image_queue.put((image, key, frame_index, episode_index, videos_dir))
+
+
+def stop_image_writer(image_writer, timeout):
+    if "threads_pool" in image_writer:
+        futures = image_writer["futures"]
+        # Before exiting function, wait for all threads to complete
+        with tqdm.tqdm(total=len(futures), desc="Writing images") as progress_bar:
+            concurrent.futures.wait(futures, timeout=timeout)
+            progress_bar.update(len(futures))
+    else:
+        processes_pool, image_queue = image_writer["processes_pool"], image_writer["image_queue"]
+        stop_processes(processes_pool, image_queue, timeout=timeout)
+
+
+########################################################################################
+# Functions to initialize, resume and populate a dataset
+########################################################################################
+
+
+def init_dataset(
+    repo_id,
+    root,
+    force_override,
+    fps,
+    video,
+    write_images,
+    num_image_writer_processes,
+    num_image_writer_threads,
+):
+    local_dir = Path(root) / repo_id
+    if local_dir.exists() and force_override:
+        shutil.rmtree(local_dir)
+
+    episodes_dir = local_dir / "episodes"
+    episodes_dir.mkdir(parents=True, exist_ok=True)
+
+    videos_dir = local_dir / "videos"
+    videos_dir.mkdir(parents=True, exist_ok=True)
+
+    # Logic to resume data recording
+    rec_info_path = episodes_dir / "data_recording_info.json"
+    if rec_info_path.exists():
+        with open(rec_info_path) as f:
+            rec_info = json.load(f)
+        num_episodes = rec_info["last_episode_index"] + 1
+    else:
+        num_episodes = 0
+
+    dataset = {
+        "repo_id": repo_id,
+        "local_dir": local_dir,
+        "videos_dir": videos_dir,
+        "episodes_dir": episodes_dir,
+        "fps": fps,
+        "video": video,
+        "rec_info_path": rec_info_path,
+        "num_episodes": num_episodes,
+    }
+
+    if write_images:
+        # Initialize processes or/and threads dedicated to save images on disk asynchronously,
+        # which is critical to control a robot and record data at a high frame rate.
+        image_writer = start_image_writer(
+            num_processes=num_image_writer_processes,
+            num_threads=num_image_writer_threads,
+        )
+        dataset["image_writer"] = image_writer
+
+    return dataset
+
+
+def add_frame(dataset, observation, action):
+    if "current_episode" not in dataset:
+        # initialize episode dictionary
+        ep_dict = {}
+        for key in observation:
+            if key not in ep_dict:
+                ep_dict[key] = []
+        for key in action:
+            if key not in ep_dict:
+                ep_dict[key] = []
+
+        ep_dict["episode_index"] = []
+        ep_dict["frame_index"] = []
+        ep_dict["timestamp"] = []
+        ep_dict["next.done"] = []
+
+        dataset["current_episode"] = ep_dict
+        dataset["current_frame_index"] = 0
+
+    ep_dict = dataset["current_episode"]
+    episode_index = dataset["num_episodes"]
+    frame_index = dataset["current_frame_index"]
+    videos_dir = dataset["videos_dir"]
+    video = dataset["video"]
+    fps = dataset["fps"]
+
+    ep_dict["episode_index"].append(episode_index)
+    ep_dict["frame_index"].append(frame_index)
+    ep_dict["timestamp"].append(frame_index / fps)
+    ep_dict["next.done"].append(False)
+
+    img_keys = [key for key in observation if "image" in key]
+    non_img_keys = [key for key in observation if "image" not in key]
+
+    # Save all observed modalities except images
+    for key in non_img_keys:
+        ep_dict[key].append(observation[key])
+
+    # Save actions
+    for key in action:
+        ep_dict[key].append(action[key])
+
+    if "image_writer" not in dataset:
+        dataset["current_frame_index"] += 1
+        return
+
+    # Save images
+    image_writer = dataset["image_writer"]
+    for key in img_keys:
+        imgs_dir = videos_dir / f"{key}_episode_{episode_index:06d}"
+        async_save_image(
+            image_writer,
+            image=observation[key],
+            key=key,
+            frame_index=frame_index,
+            episode_index=episode_index,
+            videos_dir=str(videos_dir),
+        )
+
+        if video:
+            fname = f"{key}_episode_{episode_index:06d}.mp4"
+            frame_info = {"path": f"videos/{fname}", "timestamp": frame_index / fps}
+        else:
+            frame_info = str(imgs_dir / f"frame_{frame_index:06d}.png")
+
+        ep_dict[key].append(frame_info)
+
+    dataset["current_frame_index"] += 1
+
+
+def delete_current_episode(dataset):
+    del dataset["current_episode"]
+    del dataset["current_frame_index"]
+
+    # delete temporary images
+    episode_index = dataset["num_episodes"]
+    videos_dir = dataset["videos_dir"]
+    for tmp_imgs_dir in videos_dir.glob(f"*_episode_{episode_index:06d}"):
+        shutil.rmtree(tmp_imgs_dir)
+
+
+def save_current_episode(dataset):
+    episode_index = dataset["num_episodes"]
+    ep_dict = dataset["current_episode"]
+    episodes_dir = dataset["episodes_dir"]
+    rec_info_path = dataset["rec_info_path"]
+
+    ep_dict["next.done"][-1] = True
+
+    for key in ep_dict:
+        if "observation" in key and "image" not in key:
+            ep_dict[key] = torch.stack(ep_dict[key])
+
+    ep_dict["action"] = torch.stack(ep_dict["action"])
+    ep_dict["episode_index"] = torch.tensor(ep_dict["episode_index"])
+    ep_dict["frame_index"] = torch.tensor(ep_dict["frame_index"])
+    ep_dict["timestamp"] = torch.tensor(ep_dict["timestamp"])
+    ep_dict["next.done"] = torch.tensor(ep_dict["next.done"])
+
+    ep_path = episodes_dir / f"episode_{episode_index}.pth"
+    torch.save(ep_dict, ep_path)
+
+    rec_info = {
+        "last_episode_index": episode_index,
+    }
+    with open(rec_info_path, "w") as f:
+        json.dump(rec_info, f)
+
+    # force re-initialization of episode dictionary during add_frame
+    del dataset["current_episode"]
+
+    dataset["num_episodes"] += 1
+
+
+def encode_videos(dataset, image_keys, play_sounds):
+    log_say("Encoding videos", play_sounds)
+
+    num_episodes = dataset["num_episodes"]
+    videos_dir = dataset["videos_dir"]
+    local_dir = dataset["local_dir"]
+    fps = dataset["fps"]
+
+    # Use ffmpeg to convert frames stored as png into mp4 videos
+    for episode_index in tqdm.tqdm(range(num_episodes)):
+        for key in image_keys:
+            # key = f"observation.images.{name}"
+            tmp_imgs_dir = videos_dir / f"{key}_episode_{episode_index:06d}"
+            fname = f"{key}_episode_{episode_index:06d}.mp4"
+            video_path = local_dir / "videos" / fname
+            if video_path.exists():
+                # Skip if video is already encoded. Could be the case when resuming data recording.
+                continue
+            # note: `encode_video_frames` is a blocking call. Making it asynchronous shouldn't speedup encoding,
+            # since video encoding with ffmpeg is already using multithreading.
+            encode_video_frames(tmp_imgs_dir, video_path, fps, overwrite=True)
+            shutil.rmtree(tmp_imgs_dir)
+
+
+def from_dataset_to_lerobot_dataset(dataset, play_sounds):
+    log_say("Consolidate episodes", play_sounds)
+
+    num_episodes = dataset["num_episodes"]
+    episodes_dir = dataset["episodes_dir"]
+    videos_dir = dataset["videos_dir"]
+    video = dataset["video"]
+    fps = dataset["fps"]
+    repo_id = dataset["repo_id"]
+
+    ep_dicts = []
+    for episode_index in tqdm.tqdm(range(num_episodes)):
+        ep_path = episodes_dir / f"episode_{episode_index}.pth"
+        ep_dict = torch.load(ep_path)
+        ep_dicts.append(ep_dict)
+    data_dict = concatenate_episodes(ep_dicts)
+
+    if video:
+        image_keys = [key for key in data_dict if "image" in key]
+        encode_videos(dataset, image_keys, play_sounds)
+
+    hf_dataset = to_hf_dataset(data_dict, video)
+    episode_data_index = calculate_episode_data_index(hf_dataset)
+
+    info = {
+        "codebase_version": CODEBASE_VERSION,
+        "fps": fps,
+        "video": video,
+    }
+    if video:
+        info["encoding"] = get_default_encoding()
+
+    lerobot_dataset = LeRobotDataset.from_preloaded(
+        repo_id=repo_id,
+        hf_dataset=hf_dataset,
+        episode_data_index=episode_data_index,
+        info=info,
+        videos_dir=videos_dir,
+    )
+
+    return lerobot_dataset
+
+
+def save_lerobot_dataset_on_disk(lerobot_dataset):
+    hf_dataset = lerobot_dataset.hf_dataset
+    info = lerobot_dataset.info
+    stats = lerobot_dataset.stats
+    episode_data_index = lerobot_dataset.episode_data_index
+    local_dir = lerobot_dataset.videos_dir.parent
+    meta_data_dir = local_dir / "meta_data"
+
+    hf_dataset = hf_dataset.with_format(None)  # to remove transforms that can't be saved
+    hf_dataset.save_to_disk(str(local_dir / "train"))
+
+    save_meta_data(info, stats, episode_data_index, meta_data_dir)
+
+
+def push_lerobot_dataset_to_hub(lerobot_dataset, tags):
+    hf_dataset = lerobot_dataset.hf_dataset
+    local_dir = lerobot_dataset.videos_dir.parent
+    videos_dir = lerobot_dataset.videos_dir
+    repo_id = lerobot_dataset.repo_id
+    video = lerobot_dataset.video
+    meta_data_dir = local_dir / "meta_data"
+
+    if not (local_dir / "train").exists():
+        raise ValueError(
+            "You need to run `save_lerobot_dataset_on_disk(lerobot_dataset)` before pushing to the hub."
+        )
+
+    hf_dataset.push_to_hub(repo_id, revision="main")
+    push_meta_data_to_hub(repo_id, meta_data_dir, revision="main")
+    push_dataset_card_to_hub(repo_id, revision="main", tags=tags)
+    if video:
+        push_videos_to_hub(repo_id, videos_dir, revision="main")
+    create_branch(repo_id, repo_type="dataset", branch=CODEBASE_VERSION)
+
+
+def create_lerobot_dataset(dataset, run_compute_stats, push_to_hub, tags, play_sounds):
+    if "image_writer" in dataset:
+        logging.info("Waiting for image writer to terminate...")
+        image_writer = dataset["image_writer"]
+        stop_image_writer(image_writer, timeout=20)
+
+    lerobot_dataset = from_dataset_to_lerobot_dataset(dataset, play_sounds)
+
+    if run_compute_stats:
+        log_say("Computing dataset statistics", play_sounds)
+        lerobot_dataset.stats = compute_stats(lerobot_dataset)
+    else:
+        logging.info("Skipping computation of the dataset statistics")
+        lerobot_dataset.stats = {}
+
+    save_lerobot_dataset_on_disk(lerobot_dataset)
+
+    if push_to_hub:
+        push_lerobot_dataset_to_hub(lerobot_dataset, tags)
+
+    return lerobot_dataset
--- a/lerobot/common/logger.py
+++ b/lerobot/common/logger.py
@@ -189,7 +189,7 @@ class Logger:
            training_state["scheduler"] = scheduler.state_dict()
        torch.save(training_state, save_dir / self.training_state_file_name)

-    def save_checkpont(
+    def save_checkpoint(
        self,
        train_step: int,
        policy: Policy,
--- a/lerobot/common/policies/diffusion/configuration_diffusion.py
+++ b/lerobot/common/policies/diffusion/configuration_diffusion.py
@@ -67,6 +67,7 @@ class DiffusionConfig:
        use_group_norm: Whether to replace batch normalization with group normalization in the backbone.
            The group sizes are set to be about 16 (to be precise, feature_dim // 16).
        spatial_softmax_num_keypoints: Number of keypoints for SpatialSoftmax.
+        use_separate_rgb_encoder_per_camera: Whether to use a separate RGB encoder for each camera view.
        down_dims: Feature dimension for each stage of temporal downsampling in the diffusion modeling Unet.
            You may provide a variable number of dimensions, therefore also controlling the degree of
            downsampling.
@@ -130,6 +131,7 @@ class DiffusionConfig:
    pretrained_backbone_weights: str | None = None
    use_group_norm: bool = True
    spatial_softmax_num_keypoints: int = 32
+    use_separate_rgb_encoder_per_camera: bool = False
    # Unet.
    down_dims: tuple[int, ...] = (512, 1024, 2048)
    kernel_size: int = 5
--- a/lerobot/common/policies/diffusion/modeling_diffusion.py
+++ b/lerobot/common/policies/diffusion/modeling_diffusion.py
@@ -182,8 +182,13 @@ class DiffusionModel(nn.Module):
        self._use_env_state = False
        if num_images > 0:
            self._use_images = True
-            self.rgb_encoder = DiffusionRgbEncoder(config)
-            global_cond_dim += self.rgb_encoder.feature_dim * num_images
+            if self.config.use_separate_rgb_encoder_per_camera:
+                encoders = [DiffusionRgbEncoder(config) for _ in range(num_images)]
+                self.rgb_encoder = nn.ModuleList(encoders)
+                global_cond_dim += encoders[0].feature_dim * num_images
+            else:
+                self.rgb_encoder = DiffusionRgbEncoder(config)
+                global_cond_dim += self.rgb_encoder.feature_dim * num_images
        if "observation.environment_state" in config.input_shapes:
            self._use_env_state = True
            global_cond_dim += config.input_shapes["observation.environment_state"][0]
@@ -239,16 +244,32 @@ class DiffusionModel(nn.Module):
        """Encode image features and concatenate them all together along with the state vector."""
        batch_size, n_obs_steps = batch["observation.state"].shape[:2]
        global_cond_feats = [batch["observation.state"]]
-        # Extract image feature (first combine batch, sequence, and camera index dims).
+        # Extract image features.
        if self._use_images:
-            img_features = self.rgb_encoder(
-                einops.rearrange(batch["observation.images"], "b s n ... -> (b s n) ...")
-            )
-            # Separate batch dim and sequence dim back out. The camera index dim gets absorbed into the
-            # feature dim (effectively concatenating the camera features).
-            img_features = einops.rearrange(
-                img_features, "(b s n) ... -> b s (n ...)", b=batch_size, s=n_obs_steps
-            )
+            if self.config.use_separate_rgb_encoder_per_camera:
+                # Combine batch and sequence dims while rearranging to make the camera index dimension first.
+                images_per_camera = einops.rearrange(batch["observation.images"], "b s n ... -> n (b s) ...")
+                img_features_list = torch.cat(
+                    [
+                        encoder(images)
+                        for encoder, images in zip(self.rgb_encoder, images_per_camera, strict=True)
+                    ]
+                )
+                # Separate batch and sequence dims back out. The camera index dim gets absorbed into the
+                # feature dim (effectively concatenating the camera features).
+                img_features = einops.rearrange(
+                    img_features_list, "(n b s) ... -> b s (n ...)", b=batch_size, s=n_obs_steps
+                )
+            else:
+                # Combine batch, sequence, and "which camera" dims before passing to shared encoder.
+                img_features = self.rgb_encoder(
+                    einops.rearrange(batch["observation.images"], "b s n ... -> (b s n) ...")
+                )
+                # Separate batch dim and sequence dim back out. The camera index dim gets absorbed into the
+                # feature dim (effectively concatenating the camera features).
+                img_features = einops.rearrange(
+                    img_features, "(b s n) ... -> b s (n ...)", b=batch_size, s=n_obs_steps
+                )
            global_cond_feats.append(img_features)

        if self._use_env_state:
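
The separate-encoder branch above concatenates the per-camera encoder outputs camera-first and then folds the camera index into the feature dimension. Below is a minimal pure-Python sketch of that index bookkeeping, not the code from the patch. `concat_camera_features` is a hypothetical helper standing in for the `torch.cat` plus `einops.rearrange(..., "(n b s) ... -> b s (n ...)")` pair, and it assumes each encoder returns its rows in batch-major order (row index `b * n_obs_steps + s`).

```python
def concat_camera_features(per_camera, batch_size, n_obs_steps):
    """Pure-Python stand-in for the "(n b s) f -> b s (n f)" rearrangement.

    per_camera[n] holds batch_size * n_obs_steps feature vectors (lists) from
    camera n's encoder, ordered batch-major: row index = b * n_obs_steps + s.
    """
    # Concatenating the per-camera outputs puts the camera index first (n-major).
    flat = [row for camera_rows in per_camera for row in camera_rows]
    n_cameras = len(per_camera)
    out = []
    for b in range(batch_size):
        steps = []
        for s in range(n_obs_steps):
            merged = []
            for n in range(n_cameras):
                # Position of (n, b, s) along the flattened "(n b s)" axis.
                idx = (n * batch_size + b) * n_obs_steps + s
                merged.extend(flat[idx])
            steps.append(merged)
        out.append(steps)
    return out


# Two cameras, batch of two, one observation step, one feature per camera.
camera_0 = [[0], [1]]    # rows for b=0 and b=1
camera_1 = [[10], [11]]  # rows for b=0 and b=1
features = concat_camera_features([camera_0, camera_1], batch_size=2, n_obs_steps=1)
```

Each `features[b][s]` holds camera 0's features followed by camera 1's. This matches the layout the shared-encoder branch produces, so the downstream `global_cond_dim` arithmetic is the same in both branches.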
--- a/lerobot/common/policies/factory.py
+++ b/lerobot/common/policies/factory.py
@@ -51,6 +51,13 @@ def get_policy_and_config_classes(name: str) -> tuple[Policy, object]:
        from lerobot.common.policies.tdmpc.modeling_tdmpc import TDMPCPolicy

        return TDMPCPolicy, TDMPCConfig
+
+    elif name == "tdmpc2":
+        from lerobot.common.policies.tdmpc2.configuration_tdmpc2 import TDMPC2Config
+        from lerobot.common.policies.tdmpc2.modeling_tdmpc2 import TDMPC2Policy
+
+        return TDMPC2Policy, TDMPC2Config
+
    elif name == "diffusion":
        from lerobot.common.policies.diffusion.configuration_diffusion import DiffusionConfig
        from lerobot.common.policies.diffusion.modeling_diffusion import DiffusionPolicy
--- a/lerobot/common/policies/tdmpc2/configuration_tdmpc2.py
+++ b/lerobot/common/policies/tdmpc2/configuration_tdmpc2.py
@@ -0,0 +1,193 @@
+#!/usr/bin/env python
+
+# Copyright 2024 Nicklas Hansen, Xiaolong Wang, Hao Su,
+# and The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+from dataclasses import dataclass, field
+
+
+@dataclass
+class TDMPC2Config:
+    """Configuration class for TDMPC2Policy.
+
+    Defaults are configured for training with xarm_lift_medium_replay providing proprioceptive and single
+    camera observations.
+
+    The parameters you will most likely need to change are the ones which depend on the environment / sensors.
+    Those are: `input_shapes`, `output_shapes`, and perhaps `max_random_shift_ratio`.
+
+    Args:
+        n_action_repeats: The number of times to repeat the action returned by the planning. (hint: Google
+            action repeats in Q-learning or ask your favorite chatbot)
+        horizon: Horizon for model predictive control.
+        n_action_steps: Number of action steps to take from the plan given by model predictive control. This
+            is an alternative to using action repeats. If this is set to more than 1, then we require
+            `n_action_repeats == 1`, `use_mpc == True` and `n_action_steps <= horizon`. Note that this
+            approach of using multiple steps from the plan is not in the original implementation.
+        input_shapes: A dictionary defining the shapes of the input data for the policy. The key represents
+            the input data name, and the value is a list indicating the dimensions of the corresponding data.
+            For example, "observation.image" refers to an input from a camera with dimensions [3, 96, 96],
+            indicating it has three color channels and 96x96 resolution. Importantly, `input_shapes` doesn't
+            include batch dimension or temporal dimension.
+        output_shapes: A dictionary defining the shapes of the output data for the policy. The key represents
+            the output data name, and the value is a list indicating the dimensions of the corresponding data.
+            For example, "action" refers to an output shape of [14], indicating 14-dimensional actions.
+            Importantly, `output_shapes` doesn't include batch dimension or temporal dimension.
+        input_normalization_modes: A dictionary with key representing the modality (e.g. "observation.state"),
+            and the value specifies the normalization mode to apply. The two available modes are "mean_std"
+            which subtracts the mean and divides by the standard deviation and "min_max" which rescales to a
+            [-1, 1] range. Note that here this defaults to None meaning inputs are not normalized. This is to
+            match the original implementation.
+        output_normalization_modes: Similar dictionary to `input_normalization_modes`, but used to
+            unnormalize to the original scale. Note that this is also used for normalizing the training
+            targets. NOTE: Clipping to [-1, +1] is used during MPPI/CEM. Therefore, it is recommended that
+            you stick with "min_max" normalization mode here.
+        image_encoder_hidden_dim: Number of channels for the convolutional layers used for image encoding.
+        state_encoder_hidden_dim: Hidden dimension for MLP used for state vector encoding.
+        latent_dim: Observation's latent embedding dimension.
+        q_ensemble_size: Number of Q function estimators to use in an ensemble for uncertainty estimation.
+        mlp_dim: Hidden dimension of MLPs used for modelling the dynamics encoder, reward function, policy
+            (π), Q ensemble, and V.
+        discount: Discount factor (γ) to use for the reinforcement learning formalism.
+        use_mpc: Whether to use model predictive control. The alternative is to just sample the policy model
+            (π) for each step.
+        cem_iterations: Number of iterations for the MPPI/CEM loop in MPC.
+        max_std: Maximum standard deviation for actions sampled from the gaussian PDF in CEM.
+        min_std: Minimum standard deviation for noise applied to actions sampled from the policy model (π).
+            Doubles up as the minimum standard deviation for actions sampled from the gaussian PDF in CEM.
+        n_gaussian_samples: Number of samples to draw from the gaussian distribution every CEM iteration. Must
+            be non-zero.
+        n_pi_samples: Number of samples to draw from the policy / world model rollout every CEM iteration. Can
+            be zero.
+        n_elites: The number of elite samples to use for updating the gaussian parameters every CEM iteration.
+        elite_weighting_temperature: The temperature to use for softmax weighting (by trajectory value) of the
+            elites, when updating the gaussian parameters for CEM.
+        max_random_shift_ratio: Maximum random shift (as a proportion of the image size) to apply to the
+            image(s) (in units of pixels) for training-time augmentation. If set to 0, no such augmentation
+            is applied. Note that the input images are assumed to be square for this augmentation.
+        reward_coeff: Loss weighting coefficient for the reward regression loss.
+        value_coeff: Loss weighting coefficient for both the state-action value (Q) TD loss, and the state
+            value (V) expectile regression loss.
+        consistency_coeff: Loss weighting coefficient for the consistency loss.
+        temporal_decay_coeff: Exponential decay coefficient for decaying the loss coefficient for future time-
+            steps. Hint: each loss computation involves `horizon` steps worth of actions starting from the
+            current time step.
+        target_model_momentum: Interpolation weight (τ) used for EMA updates of the target models. Updates are
+            calculated with `torch.lerp` as ϕ ← (1-τ)ϕ + τθ where ϕ are the parameters of the target model
+            and θ are the parameters of the model being trained.
+    """
+
+    # Input / output structure.
+    n_action_repeats: int = 1
+    horizon: int = 3
+    n_action_steps: int = 1
+
+    input_shapes: dict[str, list[int]] = field(
+        default_factory=lambda: {
+            "observation.image": [3, 84, 84],
+            "observation.state": [4],
+        }
+    )
+    output_shapes: dict[str, list[int]] = field(
+        default_factory=lambda: {
+            "action": [4],
+        }
+    )
+
+    # Normalization / Unnormalization
+    input_normalization_modes: dict[str, str] | None = None
+    output_normalization_modes: dict[str, str] = field(
+        default_factory=lambda: {"action": "min_max"},
+    )
+
+    # Architecture / modeling.
+    # Neural networks.
+    image_encoder_hidden_dim: int = 32
+    state_encoder_hidden_dim: int = 256
+    latent_dim: int = 512
+    q_ensemble_size: int = 5
+    num_enc_layers: int = 2
+    mlp_dim: int = 512
+    # Reinforcement learning.
+    discount: float = 0.9
+    simnorm_dim: int = 8
+    dropout: float = 0.01
+
+    # actor
+    log_std_min: float = -10
+    log_std_max: float = 2
+
+    # critic
+    num_bins: int = 101
+    vmin: int = -10
+    vmax: int = +10
+
+    # Inference.
+    use_mpc: bool = True
+    cem_iterations: int = 6
+    max_std: float = 2.0
+    min_std: float = 0.05
+    n_gaussian_samples: int = 512
+    n_pi_samples: int = 24
+    n_elites: int = 64
+    elite_weighting_temperature: float = 0.5
+
+    # Training and loss computation.
+    max_random_shift_ratio: float = 0.0476
+    # Loss coefficients.
+    reward_coeff: float = 0.1
+    value_coeff: float = 0.1
+    consistency_coeff: float = 20.0
+    entropy_coef: float = 1e-4
+    temporal_decay_coeff: float = 0.5
+    # Target model. NOTE (michel_aractingi) this is equivalent to
+    # 1 - target_model_momentum of our TD-MPC1 implementation because
+    # of the use of `torch.lerp`
+    target_model_momentum: float = 0.01
+
+    def __post_init__(self):
+        """Input validation (not exhaustive)."""
+        # There should only be one image key.
+        image_keys = {k for k in self.input_shapes if k.startswith("observation.image")}
+        if len(image_keys) > 1:
+            raise ValueError(
+                f"{self.__class__.__name__} handles at most one image for now. Got image keys {image_keys}."
+            )
+        if len(image_keys) > 0:
+            image_key = next(iter(image_keys))
+            if self.input_shapes[image_key][-2] != self.input_shapes[image_key][-1]:
+                # TODO(alexander-soare): This limitation is solely because of code in the random shift
+                # augmentation. It should be able to be removed.
+                raise ValueError(
+                    f"Only square images are handled now. Got image shape {self.input_shapes[image_key]}."
+                )
+        if self.n_gaussian_samples <= 0:
+            raise ValueError(
+                f"The number of gaussian samples for CEM should be non-zero. Got `{self.n_gaussian_samples=}`"
+            )
+        if self.output_normalization_modes != {"action": "min_max"}:
+            raise ValueError(
+                "TD-MPC assumes the action space dimensions to all be in [-1, 1]. Therefore it is strongly "
+                f"advised that you stick with the default. See {self.__class__.__name__} docstring for more "
+                "information."
+            )
+        if self.n_action_steps > 1:
+            if self.n_action_repeats != 1:
+                raise ValueError(
+                    "If `n_action_steps > 1`, `n_action_repeats` must be left to its default value of 1."
+                )
+            if not self.use_mpc:
+                raise ValueError("If `n_action_steps > 1`, `use_mpc` must be set to `True`.")
+            if self.n_action_steps > self.horizon:
+                raise ValueError("`n_action_steps` must be less than or equal to `horizon`.")
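
The NOTE on `target_model_momentum` above mentions `torch.lerp`, which computes `start + weight * (end - start)`. Below is a minimal pure-Python sketch of how that convention relates to the momentum form ϕ ← αϕ + (1-α)θ. It assumes the target network is updated as `torch.lerp(target, online, target_model_momentum)`, which the NOTE suggests but this excerpt does not show. The helpers `lerp`, `ema_update`, and `momentum_update` are hypothetical scalar stand-ins, not functions from the patch.

```python
def lerp(start, end, weight):
    """Scalar linear interpolation: start + weight * (end - start)."""
    return start + weight * (end - start)


def ema_update(target_params, online_params, tau):
    """Move every target parameter a fraction `tau` of the way toward the online one."""
    return [lerp(t, o, tau) for t, o in zip(target_params, online_params)]


def momentum_update(target_params, online_params, alpha):
    """Momentum form: keep a fraction `alpha` of the target, take the rest from online."""
    return [alpha * t + (1 - alpha) * o for t, o in zip(target_params, online_params)]


target = [1.0, -2.0]
online = [0.0, 4.0]
tau = 0.01  # default `target_model_momentum` in TDMPC2Config

lerp_result = ema_update(target, online, tau)
momentum_result = momentum_update(target, online, 1 - tau)
```

Both calls give the same parameters up to floating-point rounding. Under this assumption, a lerp weight of 0.01 corresponds to a momentum of 0.99 in the αϕ + (1-α)θ form.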
--- a/lerobot/common/policies/tdmpc2/modeling_tdmpc2.py
+++ b/lerobot/common/policies/tdmpc2/modeling_tdmpc2.py
@@ -0,0 +1,834 @@
+#!/usr/bin/env python
+
+# Copyright 2024 Nicklas Hansen and The HuggingFace Inc. team.
+# All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Implementation of TD-MPC2: Scalable, Robust World Models for Continuous Control
+
+We refer to the main paper and codebase:
+    TD-MPC2 paper: (https://arxiv.org/abs/2310.16828)
+    TD-MPC2 code:  (https://github.com/nicklashansen/tdmpc2)
+"""
+
+# ruff: noqa: N806
+
+from collections import deque
+from copy import deepcopy
+from functools import partial
+from typing import Callable
+
+import einops
+import numpy as np
+import torch
+import torch.nn as nn
+import torch.nn.functional as F  # noqa: N812
+from huggingface_hub import PyTorchModelHubMixin
+from torch import Tensor
+
+from lerobot.common.policies.normalize import Normalize, Unnormalize
+from lerobot.common.policies.tdmpc2.configuration_tdmpc2 import TDMPC2Config
+from lerobot.common.policies.tdmpc2.tdmpc2_utils import (
+    NormedLinear,
+    SimNorm,
+    gaussian_logprob,
+    soft_cross_entropy,
+    squash,
+    two_hot_inv,
+)
+from lerobot.common.policies.utils import get_device_from_parameters, populate_queues
+
+
+class TDMPC2Policy(
+    nn.Module,
+    PyTorchModelHubMixin,
+    library_name="lerobot",
+    repo_url="https://github.com/huggingface/lerobot",
+    tags=["robotics", "tdmpc2"],
+):
+    """Implementation of TD-MPC2 learning + inference."""
+
+    name = "tdmpc2"
+
+    def __init__(
+        self, config: TDMPC2Config | None = None, dataset_stats: dict[str, dict[str, Tensor]] | None = None
+    ):
+        """
+        Args:
+            config: Policy configuration class instance or None, in which case the default instantiation of
+                the configuration class is used.
+            dataset_stats: Dataset statistics to be used for normalization. If not passed here, it is expected
+                that they will be passed with a call to `load_state_dict` before the policy is used.
+        """
+        super().__init__()
+
+        if config is None:
+            config = TDMPC2Config()
+        self.config = config
+        self.model = TDMPC2WorldModel(config)
+        # TODO (michel-aractingi) temp fix for gpu; falls back to CPU when CUDA is unavailable.
+        self.model = self.model.to("cuda:0" if torch.cuda.is_available() else "cpu")
+
+        if config.input_normalization_modes is not None:
+            self.normalize_inputs = Normalize(
+                config.input_shapes, config.input_normalization_modes, dataset_stats
+            )
+        else:
+            self.normalize_inputs = nn.Identity()
+        self.normalize_targets = Normalize(
+            config.output_shapes, config.output_normalization_modes, dataset_stats
+        )
+        self.unnormalize_outputs = Unnormalize(
+            config.output_shapes, config.output_normalization_modes, dataset_stats
+        )
+
+        image_keys = [k for k in config.input_shapes if k.startswith("observation.image")]
+        # Note: This check is covered in the post-init of the config, but keep a sanity check just in case.
+        self._use_image = False
+        self._use_env_state = False
+        if len(image_keys) > 0:
+            assert len(image_keys) == 1
+            self._use_image = True
+            self.input_image_key = image_keys[0]
+        if "observation.environment_state" in config.input_shapes:
+            self._use_env_state = True
+
+        self.scale = RunningScale(self.config.target_model_momentum)
+        self.discount = (
+            self.config.discount
+        )  # TODO (michel-aractingi) downscale discount according to episode length
+
+        self.reset()
+
+    def reset(self):
+        """
+        Clear observation and action queues. Clear previous means for warm starting of MPPI/CEM. Should be
+        called on `env.reset()`
+        """
+        self._queues = {
+            "observation.state": deque(maxlen=1),
+            "action": deque(maxlen=max(self.config.n_action_steps, self.config.n_action_repeats)),
+        }
+        if self._use_image:
+            self._queues["observation.image"] = deque(maxlen=1)
+        if self._use_env_state:
+            self._queues["observation.environment_state"] = deque(maxlen=1)
+        # Previous mean obtained from the cross-entropy method (CEM) used during MPC. It is used to warm start
+        # CEM for the next step.
+        self._prev_mean: torch.Tensor | None = None
+
+    @torch.no_grad()
+    def select_action(self, batch: dict[str, Tensor]) -> Tensor:
+        """Select a single action given environment observations."""
+        batch = self.normalize_inputs(batch)
+        if self._use_image:
+            batch = dict(batch)  # shallow copy so that adding a key doesn't modify the original
+            batch["observation.image"] = batch[self.input_image_key]
+
+        self._queues = populate_queues(self._queues, batch)
+
+        # When the action queue is depleted, populate it again by querying the policy.
+        if len(self._queues["action"]) == 0:
+            batch = {key: torch.stack(list(self._queues[key]), dim=1) for key in batch}
+
+            # Remove the time dimension as it is not handled yet.
+            for key in batch:
+                assert batch[key].shape[1] == 1
+                batch[key] = batch[key][:, 0]
+
+            # NOTE: Order of observations matters here.
+            encode_keys = []
+            if self._use_image:
+                encode_keys.append("observation.image")
+            if self._use_env_state:
+                encode_keys.append("observation.environment_state")
+            encode_keys.append("observation.state")
+            z = self.model.encode({k: batch[k] for k in encode_keys})
+            if self.config.use_mpc:  # noqa: SIM108
+                actions = self.plan(z)  # (horizon, batch, action_dim)
+            else:
+                # Plan with the policy (π) alone. This always returns one action so unsqueeze to get a
+                # sequence dimension like in the MPC branch.
+                actions = self.model.pi(z)[0].unsqueeze(0)
+
+            actions = torch.clamp(actions, -1, +1)
+
+            actions = self.unnormalize_outputs({"action": actions})["action"]
+
+            if self.config.n_action_repeats > 1:
+                for _ in range(self.config.n_action_repeats):
+                    self._queues["action"].append(actions[0])
+            else:
+                # `actions` is (horizon, batch_size, action_dim): queue the first `n_action_steps` of them.
+                self._queues["action"].extend(actions[: self.config.n_action_steps])
+
+        action = self._queues["action"].popleft()
+        return action
+
+    @torch.no_grad()
+    def plan(self, z: Tensor) -> Tensor:
+        """Plan a sequence of actions using TD-MPC inference.
+
+        Args:
+            z: (batch, latent_dim,) tensor for the initial state.
+        Returns:
+            (horizon, batch, action_dim,) tensor for the planned trajectory of actions.
+        """
+        device = get_device_from_parameters(self)
+
+        batch_size = z.shape[0]
+
+        # Sample Nπ trajectories from the policy.
+        pi_actions = torch.empty(
+            self.config.horizon,
+            self.config.n_pi_samples,
+            batch_size,
+            self.config.output_shapes["action"][0],
+            device=device,
+        )
+        if self.config.n_pi_samples > 0:
+            _z = einops.repeat(z, "b d -> n b d", n=self.config.n_pi_samples)
+            for t in range(self.config.horizon):
+                # Note: Adding a small amount of noise here doesn't hurt during inference and may even be
+                # helpful for CEM.
+                pi_actions[t] = self.model.pi(_z)[0]
+                _z = self.model.latent_dynamics(_z, pi_actions[t])
+
+        # In the CEM loop we will need this for a call to estimate_value with the gaussian sampled
+        # trajectories.
+        z = einops.repeat(z, "b d -> n b d", n=self.config.n_gaussian_samples + self.config.n_pi_samples)
+
+        # Model Predictive Path Integral (MPPI) with the cross-entropy method (CEM) as the optimization
+        # algorithm.
+        # The initial mean and standard deviation for the cross-entropy method (CEM).
+        mean = torch.zeros(
+            self.config.horizon, batch_size, self.config.output_shapes["action"][0], device=device
+        )
+        # Maybe warm start CEM with the mean from the previous step.
+        if self._prev_mean is not None:
+            mean[:-1] = self._prev_mean[1:]
+        std = self.config.max_std * torch.ones_like(mean)
+
+        for _ in range(self.config.cem_iterations):
+            # Randomly sample action trajectories from the gaussian distribution.
+            std_normal_noise = torch.randn(
+                self.config.horizon,
+                self.config.n_gaussian_samples,
+                batch_size,
+                self.config.output_shapes["action"][0],
+                device=std.device,
+            )
+            gaussian_actions = torch.clamp(mean.unsqueeze(1) + std.unsqueeze(1) * std_normal_noise, -1, 1)
+
+            # Compute elite actions.
+            actions = torch.cat([gaussian_actions, pi_actions], dim=1)
+            value = self.estimate_value(z, actions).nan_to_num_(0).squeeze()
+            elite_idxs = torch.topk(value, self.config.n_elites, dim=0).indices  # (n_elites, batch)
+            elite_value = value.take_along_dim(elite_idxs, dim=0)  # (n_elites, batch)
+            # (horizon, n_elites, batch, action_dim)
+            elite_actions = actions.take_along_dim(einops.rearrange(elite_idxs, "n b -> 1 n b 1"), dim=1)
+
+            # Update gaussian PDF parameters to be the (weighted) mean and standard deviation of the elites.
+            max_value = elite_value.max(0, keepdim=True)[0]  # (1, batch)
+            # The weighting is a softmax over trajectory values. Note that this is not the same as the usage
+            # of Ω in eqn 4 of the TD-MPC paper. Instead it is the normalized version of it: s = Ω/ΣΩ. This
+            # makes the equations: μ = Σ(s⋅Γ), σ = √(Σ(s⋅(Γ-μ)²)).
+            score = torch.exp(self.config.elite_weighting_temperature * (elite_value - max_value))
+            score /= score.sum(axis=0, keepdim=True)
+            # (horizon, batch, action_dim)
+            mean = torch.sum(einops.rearrange(score, "n b -> n b 1") * elite_actions, dim=1) / (
+                einops.rearrange(score.sum(0), "b -> 1 b 1") + 1e-9
+            )
+            std = torch.sqrt(
+                torch.sum(
+                    einops.rearrange(score, "n b -> n b 1")
+                    * (elite_actions - einops.rearrange(mean, "h b d -> h 1 b d")) ** 2,
+                    dim=1,
+                )
+                / (einops.rearrange(score.sum(0), "b -> 1 b 1") + 1e-9)
+            ).clamp_(self.config.min_std, self.config.max_std)
+
+        # Keep track of the mean for warm-starting subsequent steps.
+        self._prev_mean = mean
+
+        # Randomly select one of the elite actions from the last iteration of MPPI/CEM using the softmax
+        # scores from the last iteration.
+        actions = elite_actions[:, torch.multinomial(score.T, 1).squeeze(), torch.arange(batch_size)]
+        return actions
+
+    @torch.no_grad()
+    def estimate_value(self, z: Tensor, actions: Tensor):
+        """Estimates the value of a trajectory as per eqn 4 of the FOWM paper.
+
+        Args:
+            z: (batch, latent_dim) tensor of initial latent states.
+            actions: (horizon, batch, action_dim) tensor of action trajectories.
+        Returns:
+            (batch,) tensor of values.
+        """
+        # Initialize return and running discount factor.
+        G, running_discount = 0, 1
+        # Iterate over the actions in the trajectory to simulate the trajectory using the latent dynamics
+        # model. Keep track of return.
+        for t in range(actions.shape[0]):
+            # Estimate the next state (latent) and reward.
+            z, reward = self.model.latent_dynamics_and_reward(z, actions[t], discretize_reward=True)
+            # Update the return and running discount.
+            G += running_discount * reward
+            running_discount *= self.config.discount
+
+        # next_action = self.model.pi(z)[0]  # (batch, action_dim)
+        # terminal_values = self.model.Qs(z, next_action, return_type="avg")  # (ensemble, batch)
+
+        return G + running_discount * self.model.Qs(z, self.model.pi(z)[0], return_type="avg")
+
+    def forward(self, batch: dict[str, Tensor]) -> dict[str, Tensor | float]:
+        """Run the batch through the model and compute the loss.
+
+        Returns a dictionary with loss as a tensor, and other information as native floats.
+        """
+        device = get_device_from_parameters(self)
+
+        batch = self.normalize_inputs(batch)
+        if self._use_image:
+            batch = dict(batch)  # shallow copy so that adding a key doesn't modify the original
+            batch["observation.image"] = batch[self.input_image_key]
+        batch = self.normalize_targets(batch)
+
+        info = {}
+
+        # (b, t) -> (t, b)
+        for key in batch:
+            if batch[key].ndim > 1:
+                batch[key] = batch[key].transpose(1, 0)
+
+        action = batch["action"]  # (t, b, action_dim)
+        reward = batch["next.reward"]  # (t, b)
+        observations = {k: v for k, v in batch.items() if k.startswith("observation.")}
+
+        # Apply random image augmentations.
+        if self._use_image and self.config.max_random_shift_ratio > 0:
+            observations["observation.image"] = flatten_forward_unflatten(
+                partial(random_shifts_aug, max_random_shift_ratio=self.config.max_random_shift_ratio),
+                observations["observation.image"],
+            )
+
+        # Get the current observation for predicting trajectories, and all future observations for use in
+        # the latent consistency loss and TD loss.
+        current_observation, next_observations = {}, {}
+        for k in observations:
+            current_observation[k] = observations[k][0]
+            next_observations[k] = observations[k][1:]
+        horizon, batch_size = next_observations[
+            "observation.image" if self._use_image else "observation.environment_state"
+        ].shape[:2]
+
+        # Run latent rollout using the latent dynamics model and policy model.
+        # Note this has shape `horizon+1` because there are `horizon` actions and a current `z`. Each action
+        # gives us a next `z`.
+        batch_size = batch["index"].shape[0]
+        z_preds = torch.empty(horizon + 1, batch_size, self.config.latent_dim, device=device)
+        z_preds[0] = self.model.encode(current_observation)
+        reward_preds = torch.empty(horizon, batch_size, self.config.num_bins, device=device)
+        for t in range(horizon):
+            z_preds[t + 1], reward_preds[t] = self.model.latent_dynamics_and_reward(z_preds[t], action[t])
+
+        # Compute Q value predictions based on the latent rollout.
+        q_preds_ensemble = self.model.Qs(
+            z_preds[:-1], action, return_type="all"
+        )  # (ensemble, horizon, batch)
+        info.update({"Q": q_preds_ensemble.mean().item()})
+
+        # Compute various targets with stopgrad.
+        with torch.no_grad():
+            # Latent state consistency targets for consistency loss.
+            z_targets = self.model.encode(next_observations)
+
+            # Compute the TD-target from a reward and the next observation
+            pi = self.model.pi(z_targets)[0]
+            td_targets = (
+                reward
+                + self.config.discount
+                * self.model.Qs(z_targets, pi, return_type="min", target=True).squeeze()
+            )
+
+        # Compute losses.
+        # Exponentially decay the loss weight with respect to the timestep. Steps that are more distant in the
+        # future have less impact on the loss. Note: unsqueeze will let us broadcast to (seq, batch).
+        temporal_loss_coeffs = torch.pow(
+            self.config.temporal_decay_coeff, torch.arange(horizon, device=device)
+        ).unsqueeze(-1)
+
+        # Compute consistency loss as MSE loss between latents predicted from the rollout and latents
+        # computed by the observation encoder from the next observations (with gradients stopped).
+        consistency_loss = (
+            (
+                temporal_loss_coeffs
+                * F.mse_loss(z_preds[1:], z_targets, reduction="none").mean(dim=-1)
+                # `z_preds` depends on the current observation and the actions.
+                * ~batch["observation.state_is_pad"][0]
+                * ~batch["action_is_pad"]
+                # `z_targets` depends on the next observation.
+                * ~batch["observation.state_is_pad"][1:]
+            )
+            .sum(0)
+            .mean()
+        )
+        # Compute the reward loss as the soft cross-entropy between rewards predicted from the rollout
+        # (logits over bins) and the two-hot encoded dataset rewards.
+        reward_loss = (
+            (
+                temporal_loss_coeffs
+                * soft_cross_entropy(reward_preds, reward, self.config).mean(1)
+                * ~batch["next.reward_is_pad"]
+                * ~batch["observation.state_is_pad"][0]
+                * ~batch["action_is_pad"]
+            )
+            .sum(0)
+            .mean()
+        )
+
+        # Compute state-action value loss (TD loss) for all of the Q functions in the ensemble.
+        ce_value_loss = 0.0
+        for i in range(self.config.q_ensemble_size):
+            ce_value_loss += soft_cross_entropy(q_preds_ensemble[i], td_targets, self.config).mean(1)
+
+        q_value_loss = (
+            (
+                temporal_loss_coeffs
+                * ce_value_loss
+                # `q_preds_ensemble` depends on the first observation and the actions.
+                * ~batch["observation.state_is_pad"][0]
+                * ~batch["action_is_pad"]
+                # `td_targets` depends on the reward and the next observations.
+                * ~batch["next.reward_is_pad"]
+                * ~batch["observation.state_is_pad"][1:]
+            )
+            .sum(0)
+            .mean()
+        )
+
+        # Calculate the policy loss for π: maximize the scaled Q-value estimates with an entropy bonus.
+        # We won't need these gradients again so detach.
+        z_preds = z_preds.detach()
+        action_preds, _, log_pis, _ = self.model.pi(z_preds[:-1])
+
+        with torch.no_grad():
+            # avoid unnecessary computation of the gradients during policy optimization
+            # TODO (michel-aractingi): the same logic should be extended when adding task embeddings
+            qs = self.model.Qs(z_preds[:-1], action_preds, return_type="avg")
+            self.scale.update(qs[0])
+            qs = self.scale(qs)
+
+        pi_loss = (
+            (self.config.entropy_coef * log_pis - qs).mean(dim=2)
+            * temporal_loss_coeffs
+            # `action_preds` depends on the first observation and the actions.
+            * ~batch["observation.state_is_pad"][0]
+            * ~batch["action_is_pad"]
+        ).mean()
+
+        loss = (
+            self.config.consistency_coeff * consistency_loss
+            + self.config.reward_coeff * reward_loss
+            + self.config.value_coeff * q_value_loss
+            + pi_loss
+        )
+
+        info.update(
+            {
+                "consistency_loss": consistency_loss.item(),
+                "reward_loss": reward_loss.item(),
+                "Q_value_loss": q_value_loss.item(),
+                "pi_loss": pi_loss.item(),
+                "loss": loss,
+                "sum_loss": loss.item() * self.config.horizon,
+                "pi_scale": float(self.scale.value),
+            }
+        )
+
+        # Undo (b, t) -> (t, b).
+        for key in batch:
+            if batch[key].ndim > 1:
+                batch[key] = batch[key].transpose(1, 0)
+
+        return info
+
+    def update(self):
+        """Update the target model's parameters using Polyak averaging."""
+        self.model.update_target_Q()
+
+
+class TDMPC2WorldModel(nn.Module):
+    """Latent dynamics model used in TD-MPC2."""
+
+    def __init__(self, config: TDMPC2Config):
+        super().__init__()
+        self.config = config
+
+        self._encoder = TDMPC2ObservationEncoder(config)
+
+        # Define latent dynamics head
+        self._dynamics = nn.Sequential(
+            NormedLinear(config.latent_dim + config.output_shapes["action"][0], config.mlp_dim),
+            NormedLinear(config.mlp_dim, config.mlp_dim),
+            NormedLinear(config.mlp_dim, config.latent_dim, act=SimNorm(config.simnorm_dim)),
+        )
+
+        # Define reward head
+        self._reward = nn.Sequential(
+            NormedLinear(config.latent_dim + config.output_shapes["action"][0], config.mlp_dim),
+            NormedLinear(config.mlp_dim, config.mlp_dim),
+            nn.Linear(config.mlp_dim, max(config.num_bins, 1)),
+        )
+
+        # Define policy head
+        self._pi = nn.Sequential(
+            NormedLinear(config.latent_dim, config.mlp_dim),
+            NormedLinear(config.mlp_dim, config.mlp_dim),
+            nn.Linear(config.mlp_dim, 2 * config.output_shapes["action"][0]),
+        )
+
+        # Define ensemble of Q functions
+        self._Qs = nn.ModuleList(
+            [
+                nn.Sequential(
+                    NormedLinear(
+                        config.latent_dim + config.output_shapes["action"][0],
+                        config.mlp_dim,
+                        dropout=config.dropout,
+                    ),
+                    NormedLinear(config.mlp_dim, config.mlp_dim),
+                    nn.Linear(config.mlp_dim, max(config.num_bins, 1)),
+                )
+                for _ in range(config.q_ensemble_size)
+            ]
+        )
+
+        self._init_weights()
+
+        self._target_Qs = deepcopy(self._Qs).requires_grad_(False)
+
+        self.log_std_min = torch.tensor(config.log_std_min)
+        self.log_std_dif = torch.tensor(config.log_std_max) - self.log_std_min
+
+        self.bins = torch.linspace(config.vmin, config.vmax, config.num_bins)
+        self.config.bin_size = (config.vmax - config.vmin) / (config.num_bins - 1)
+
+    def _init_weights(self):
+        """Initialize model weights.
+        Custom weight initializations proposed in TD-MPC2.
+
+        """
+
+        def _apply_fn(m):
+            if isinstance(m, nn.Linear):
+                nn.init.trunc_normal_(m.weight, std=0.02)
+                if m.bias is not None:
+                    nn.init.constant_(m.bias, 0)
+            elif isinstance(m, nn.ParameterList):
+                for i, p in enumerate(m):
+                    if p.dim() == 3:  # Linear
+                        nn.init.trunc_normal_(p, std=0.02)  # Weight
+                        nn.init.constant_(m[i + 1], 0)  # Bias
+
+        self.apply(_apply_fn)
+
+        # Zero-initialize the weights of the last layer of the reward head and of each Q function.
+        for m in [self._reward, *self._Qs]:
+            assert isinstance(
+                m[-1], nn.Linear
+            ), "Sanity check. The last linear layer needs 0 initialization on weights."
+            nn.init.zeros_(m[-1].weight)
+
+    def to(self, *args, **kwargs):
+        """
+        Overriding `to` method to also move additional tensors to device.
+        """
+        super().to(*args, **kwargs)
+        self.log_std_min = self.log_std_min.to(*args, **kwargs)
+        self.log_std_dif = self.log_std_dif.to(*args, **kwargs)
+        self.bins = self.bins.to(*args, **kwargs)
+        return self
+
+    def train(self, mode: bool = True):
+        super().train(mode)
+        self._target_Qs.train(False)
+        return self
+
+    def encode(self, obs: dict[str, Tensor]) -> Tensor:
+        """Encodes an observation into its latent representation."""
+        return self._encoder(obs)
+
+    def latent_dynamics_and_reward(
+        self, z: Tensor, a: Tensor, discretize_reward: bool = False
+    ) -> tuple[Tensor, Tensor]:
+        """Predict the next state's latent representation and the reward given a current latent and action.
+
+        Args:
+            z: (*, latent_dim) tensor for the current state's latent representation.
+            a: (*, action_dim) tensor for the action to be applied.
+        Returns:
+            A tuple containing:
+                - (*, latent_dim) tensor for the next state's latent representation.
+                - (*, num_bins) reward logits, or the decoded reward if `discretize_reward` is True.
+        """
+        x = torch.cat([z, a], dim=-1)
+        reward = self._reward(x).squeeze(-1)
+        if discretize_reward:
+            reward = two_hot_inv(reward, self.bins)
+        return self._dynamics(x), reward
+
+    def latent_dynamics(self, z: Tensor, a: Tensor) -> Tensor:
+        """Predict the next state's latent representation given a current latent and action.
+
+        Args:
+            z: (*, latent_dim) tensor for the current state's latent representation.
+            a: (*, action_dim) tensor for the action to be applied.
+        Returns:
+            (*, latent_dim) tensor for the next state's latent representation.
+        """
+        x = torch.cat([z, a], dim=-1)
+        return self._dynamics(x)
+
+    def pi(self, z: Tensor) -> tuple[Tensor, Tensor, Tensor, Tensor]:
+        """Samples an action from the learned policy.
+
+        The action is sampled from a tanh-squashed Gaussian whose mean and log standard deviation are
+        predicted from the latent state.
+
+        Args:
+            z: (*, latent_dim) tensor for the current state's latent representation.
+        Returns:
+            A tuple `(pi, mu, log_pi, log_std)`: the sampled action and the squashed mean, both
+            (*, action_dim), the log probability of the sample (*, 1), and the log std (*, action_dim).
+        """
+        mu, log_std = self._pi(z).chunk(2, dim=-1)
+        log_std = self.log_std_min + 0.5 * self.log_std_dif * (torch.tanh(log_std) + 1)
+        eps = torch.randn_like(mu)
+
+        log_pi = gaussian_logprob(eps, log_std)
+        pi = mu + eps * log_std.exp()
+        mu, pi, log_pi = squash(mu, pi, log_pi)
+
+        return pi, mu, log_pi, log_std
+
+    def Qs(self, z: Tensor, a: Tensor, return_type: str = "min", target=False) -> Tensor:  # noqa: N802
+        """Predict state-action value for all of the learned Q functions.
+
+        Args:
+            z: (*, latent_dim) tensor for the current state's latent representation.
+            a: (*, action_dim) tensor for the action to be applied.
+            return_type: "all" returns every Q output; "min" returns the min of two random Qs, else their mean.
+        Returns:
+            (q_ensemble, *) tensor of predictions for "all", otherwise the min or mean of two sampled Qs.
+        """
+        x = torch.cat([z, a], dim=-1)
+
+        if target:
+            out = torch.stack([q(x).squeeze(-1) for q in self._target_Qs], dim=0)
+        else:
+            out = torch.stack([q(x).squeeze(-1) for q in self._Qs], dim=0)
+
+        if return_type == "all":
+            return out
+
+        Q1, Q2 = out[np.random.choice(len(self._Qs), size=2, replace=False)]
+        Q1, Q2 = two_hot_inv(Q1, self.bins), two_hot_inv(Q2, self.bins)
+        return torch.min(Q1, Q2) if return_type == "min" else (Q1 + Q2) / 2
+
+    def update_target_Q(self):
+        """
+        Soft-update target Q-networks using Polyak averaging.
+        """
+        with torch.no_grad():
+            for p, p_target in zip(self._Qs.parameters(), self._target_Qs.parameters(), strict=False):
+                p_target.data.lerp_(p.data, self.config.target_model_momentum)
+
+
+class TDMPC2ObservationEncoder(nn.Module):
+    """Encode image and/or state vector observations."""
+
+    def __init__(self, config: TDMPC2Config):
+        """
+        Creates encoders for pixel and/or state modalities.
+        TODO(alexander-soare): The original work allows for multiple images by concatenating them along the
+            channel dimension. Re-implement this capability.
+        """
+        super().__init__()
+        self.config = config
+
+        # Define an observation encoder for each input, depending on whether it is pixels or a state vector
+        encoder_dict = {}
+        for obs_key in config.input_shapes:
+            if "observation.image" in obs_key:
+                encoder_module = nn.Sequential(
+                    nn.Conv2d(config.input_shapes[obs_key][0], config.image_encoder_hidden_dim, 7, stride=2),
+                    nn.ReLU(inplace=True),
+                    nn.Conv2d(config.image_encoder_hidden_dim, config.image_encoder_hidden_dim, 5, stride=2),
+                    nn.ReLU(inplace=True),
+                    nn.Conv2d(config.image_encoder_hidden_dim, config.image_encoder_hidden_dim, 3, stride=2),
+                    nn.ReLU(inplace=True),
+                    nn.Conv2d(config.image_encoder_hidden_dim, config.image_encoder_hidden_dim, 3, stride=1),
+                )
+                dummy_batch = torch.zeros(1, *config.input_shapes[obs_key])
+                with torch.inference_mode():
+                    out_shape = encoder_module(dummy_batch).shape[1:]
+                encoder_module.extend(
+                    nn.Sequential(
+                        nn.Flatten(),
+                        NormedLinear(np.prod(out_shape), config.latent_dim, act=SimNorm(config.simnorm_dim)),
+                    )
+                )
+
+            elif (
+                "observation.state" in obs_key
+                or "observation.environment_state" in obs_key
+            ):
+                encoder_module = nn.ModuleList()
+                encoder_module.append(
+                    NormedLinear(config.input_shapes[obs_key][0], config.state_encoder_hidden_dim)
+                )
+                assert config.num_enc_layers > 0
+                for _ in range(config.num_enc_layers - 1):
+                    encoder_module.append(
+                        NormedLinear(config.state_encoder_hidden_dim, config.state_encoder_hidden_dim)
+                    )
+                encoder_module.append(
+                    NormedLinear(
+                        config.state_encoder_hidden_dim, config.latent_dim, act=SimNorm(config.simnorm_dim)
+                    )
+                )
+                encoder_module = nn.Sequential(*encoder_module)
+
+            else:
+                raise NotImplementedError(f"No corresponding encoder module for key {obs_key}.")
+
+            encoder_dict[obs_key.replace(".", "")] = encoder_module
+
+        self.encoder = nn.ModuleDict(encoder_dict)
+
+    def forward(self, obs_dict: dict[str, Tensor]) -> Tensor:
+        """Encode the image and/or state vector.
+
+        Each modality is encoded into a feature vector of size (latent_dim,) and then a uniform mean is taken
+        over all features.
+        """
+        feat = []
+        for obs_key in self.config.input_shapes:
+            if "observation.image" in obs_key:
+                feat.append(
+                    flatten_forward_unflatten(self.encoder[obs_key.replace(".", "")], obs_dict[obs_key])
+                )
+            else:
+                feat.append(self.encoder[obs_key.replace(".", "")](obs_dict[obs_key]))
+        return torch.stack(feat, dim=0).mean(0)
+
+
+def random_shifts_aug(x: Tensor, max_random_shift_ratio: float) -> Tensor:
+    """Randomly shifts images horizontally and vertically.
+
+    Adapted from https://github.com/facebookresearch/drqv2
+    """
+    b, _, h, w = x.size()
+    assert h == w, "non-square images not handled yet"
+    pad = int(round(max_random_shift_ratio * h))
+    x = F.pad(x, tuple([pad] * 4), "replicate")
+    eps = 1.0 / (h + 2 * pad)
+    arange = torch.linspace(
+        -1.0 + eps,
+        1.0 - eps,
+        h + 2 * pad,
+        device=x.device,
+        dtype=torch.float32,
+    )[:h]
+    arange = einops.repeat(arange, "w -> h w 1", h=h)
+    base_grid = torch.cat([arange, arange.transpose(1, 0)], dim=2)
+    base_grid = einops.repeat(base_grid, "h w c -> b h w c", b=b)
+    # A random shift in units of pixels and within the boundaries of the padding.
+    shift = torch.randint(
+        0,
+        2 * pad + 1,
+        size=(b, 1, 1, 2),
+        device=x.device,
+        dtype=torch.float32,
+    )
+    shift *= 2.0 / (h + 2 * pad)
+    grid = base_grid + shift
+    return F.grid_sample(x, grid, padding_mode="zeros", align_corners=False)
+
+
+def flatten_forward_unflatten(fn: Callable[[Tensor], Tensor], image_tensor: Tensor) -> Tensor:
+    """Helper to temporarily flatten extra dims at the start of the image tensor.
+
+    Args:
+        fn: Callable that the image tensor will be passed to. It should accept (B, C, H, W) and return
+            (B, *), where * is any number of dimensions.
+        image_tensor: An image tensor of shape (**, C, H, W), where ** is any number of dimensions, generally
+            different from *.
+    Returns:
+        A return value from the callable reshaped to (**, *).
+    """
+    if image_tensor.ndim == 4:
+        return fn(image_tensor)
+    start_dims = image_tensor.shape[:-3]
+    inp = torch.flatten(image_tensor, end_dim=-4)
+    flat_out = fn(inp)
+    return torch.reshape(flat_out, (*start_dims, *flat_out.shape[1:]))
+
+
+class RunningScale:
+    """Running trimmed scale estimator."""
+
+    def __init__(self, tau):
+        self.tau = tau
+        self._value = torch.ones(1, device="cuda" if torch.cuda.is_available() else "cpu")
+        self._percentiles = torch.tensor([5, 95], dtype=torch.float32, device=self._value.device)
+
+    def state_dict(self):
+        return dict(value=self._value, percentiles=self._percentiles)
+
+    def load_state_dict(self, state_dict):
+        self._value.data.copy_(state_dict["value"])
+        self._percentiles.data.copy_(state_dict["percentiles"])
+
+    @property
+    def value(self):
+        return self._value.cpu().item()
+
+    def _percentile(self, x):
+        x_dtype, x_shape = x.dtype, x.shape
+        x = x.view(x.shape[0], -1)
+        in_sorted, _ = torch.sort(x, dim=0)
+        positions = self._percentiles * (x.shape[0] - 1) / 100
+        floored = torch.floor(positions)
+        ceiled = floored + 1
+        ceiled[ceiled > x.shape[0] - 1] = x.shape[0] - 1
+        weight_ceiled = positions - floored
+        weight_floored = 1.0 - weight_ceiled
+        d0 = in_sorted[floored.long(), :] * weight_floored[:, None]
+        d1 = in_sorted[ceiled.long(), :] * weight_ceiled[:, None]
+        return (d0 + d1).view(-1, *x_shape[1:]).type(x_dtype)
+
+    def update(self, x):
+        percentiles = self._percentile(x.detach())
+        value = torch.clamp(percentiles[1] - percentiles[0], min=1.0)
+        self._value.data.lerp_(value, self.tau)
+
+    def __call__(self, x, update=False):
+        if update:
+            self.update(x)
+        return x * (1 / self.value)
+
+    def __repr__(self):
+        return f"RunningScale(S: {self.value})"
--- a/lerobot/common/policies/tdmpc2/tdmpc2_utils.py
+++ b/lerobot/common/policies/tdmpc2/tdmpc2_utils.py
@@ -0,0 +1,164 @@
+import torch
+import torch.nn as nn
+import torch.nn.functional as F
+from functorch import combine_state_for_ensemble
+
+
+class Ensemble(nn.Module):
+    """
+    Vectorized ensemble of modules.
+    """
+
+    def __init__(self, modules, **kwargs):
+        super().__init__()
+        modules = nn.ModuleList(modules)
+        fn, params, _ = combine_state_for_ensemble(modules)
+        self.vmap = torch.vmap(fn, in_dims=(0, 0, None), randomness="different", **kwargs)
+        self.params = nn.ParameterList([nn.Parameter(p) for p in params])
+        self._repr = str(modules)
+
+    def forward(self, *args, **kwargs):
+        return self.vmap([p for p in self.params], (), *args, **kwargs)
+
+    def __repr__(self):
+        return "Vectorized " + self._repr
+
+
+class SimNorm(nn.Module):
+    """
+    Simplicial normalization.
+    Adapted from https://arxiv.org/abs/2204.00616.
+    """
+
+    def __init__(self, dim):
+        super().__init__()
+        self.dim = dim
+
+    def forward(self, x):
+        shp = x.shape
+        x = x.view(*shp[:-1], -1, self.dim)
+        x = F.softmax(x, dim=-1)
+        return x.view(*shp)
+
+    def __repr__(self):
+        return f"SimNorm(dim={self.dim})"
+
+
+class NormedLinear(nn.Linear):
+    """
+    Linear layer with LayerNorm, activation, and optionally dropout.
+    """
+
+    def __init__(self, *args, dropout=0.0, act=nn.Mish(inplace=True), **kwargs):
+        super().__init__(*args, **kwargs)
+        self.ln = nn.LayerNorm(self.out_features)
+        self.act = act
+        self.dropout = nn.Dropout(dropout, inplace=True) if dropout else None
+
+    def forward(self, x):
+        x = super().forward(x)
+        if self.dropout:
+            x = self.dropout(x)
+        return self.act(self.ln(x))
+
+    def __repr__(self):
+        repr_dropout = f", dropout={self.dropout.p}" if self.dropout else ""
+        return (
+            f"NormedLinear(in_features={self.in_features}, "
+            f"out_features={self.out_features}, "
+            f"bias={self.bias is not None}{repr_dropout}, "
+            f"act={self.act.__class__.__name__})"
+        )
+
+
+def soft_cross_entropy(pred, target, cfg):
+    """Computes the cross entropy loss between predictions and soft targets."""
+    pred = F.log_softmax(pred, dim=-1)
+    target = two_hot(target, cfg)
+    return -(target * pred).sum(-1, keepdim=True)
+
+
+@torch.jit.script
+def log_std(x, low, dif):
+    return low + 0.5 * dif * (torch.tanh(x) + 1)
+
+
+@torch.jit.script
+def _gaussian_residual(eps, log_std):
+    return -0.5 * eps.pow(2) - log_std
+
+
+@torch.jit.script
+def _gaussian_logprob(residual):
+    return residual - 0.5 * torch.log(2 * torch.pi)
+
+
+def gaussian_logprob(eps, log_std, size=None):
+    """Compute Gaussian log probability."""
+    residual = _gaussian_residual(eps, log_std).sum(-1, keepdim=True)
+    if size is None:
+        size = eps.size(-1)
+    return _gaussian_logprob(residual) * size
+
+
+@torch.jit.script
+def _squash(pi):
+    return torch.log(F.relu(1 - pi.pow(2)) + 1e-6)
+
+
+def squash(mu, pi, log_pi):
+    """Apply squashing function."""
+    mu = torch.tanh(mu)
+    pi = torch.tanh(pi)
+    log_pi -= _squash(pi).sum(-1, keepdim=True)
+    return mu, pi, log_pi
+
+
+@torch.jit.script
+def symlog(x):
+    """
+    Symmetric logarithmic function.
+    Adapted from https://github.com/danijar/dreamerv3.
+    """
+    return torch.sign(x) * torch.log(1 + torch.abs(x))
+
+
+@torch.jit.script
+def symexp(x):
+    """
+    Symmetric exponential function.
+    Adapted from https://github.com/danijar/dreamerv3.
+    """
+    return torch.sign(x) * (torch.exp(torch.abs(x)) - 1)
+
+
+def two_hot(x, cfg):
+    """Converts a batch of scalars to soft two-hot encoded targets for discrete regression."""
+
+    # x shape [horizon, num_features]
+    if cfg.num_bins == 0:
+        return x
+    elif cfg.num_bins == 1:
+        return symlog(x)
+    x = torch.clamp(symlog(x), cfg.vmin, cfg.vmax)
+    bin_idx = torch.floor((x - cfg.vmin) / cfg.bin_size).long()  # shape [num_features]
+    bin_offset = ((x - cfg.vmin) / cfg.bin_size - bin_idx.float()).unsqueeze(-1)  # shape [num_features , 1]
+    soft_two_hot = torch.zeros(
+        *x.shape, cfg.num_bins, device=x.device
+    )  # shape [horizon, num_features, num_bins]
+    soft_two_hot.scatter_(2, bin_idx.unsqueeze(-1), 1 - bin_offset)
+    soft_two_hot.scatter_(2, (bin_idx.unsqueeze(-1) + 1) % cfg.num_bins, bin_offset)
+    return soft_two_hot
+
+
+def two_hot_inv(x, bins):
+    """Converts a batch of soft two-hot encoded vectors to scalars."""
+    num_bins = bins.shape[0]
+    if num_bins == 0:
+        return x
+    elif num_bins == 1:
+        return symexp(x)
+
+    x = F.softmax(x, dim=-1)
+    x = torch.sum(x * bins, dim=-1, keepdim=True)
+    return symexp(x)
--- a/lerobot/common/robot_devices/cameras/intelrealsense.py
+++ b/lerobot/common/robot_devices/cameras/intelrealsense.py
@@ -21,9 +21,9 @@ from PIL import Image
 from lerobot.common.robot_devices.utils import (
    RobotDeviceAlreadyConnectedError,
    RobotDeviceNotConnectedError,
+    busy_wait,
 )
 from lerobot.common.utils.utils import capture_timestamp_utc
-from lerobot.scripts.control_robot import busy_wait

 SERIAL_NUMBER_INDEX = 1

--- a/lerobot/common/robot_devices/cameras/opencv.py
+++ b/lerobot/common/robot_devices/cameras/opencv.py
@@ -156,7 +156,7 @@ def save_images_from_cameras(
                executor.submit(
                    save_image,
                    image,
-                    camera.index,
+                    camera.camera_index,
                    frame_index,
                    images_dir,
                )
--- a/lerobot/common/robot_devices/control_utils.py
+++ b/lerobot/common/robot_devices/control_utils.py
@@ -0,0 +1,330 @@
+########################################################################################
+# Utilities
+########################################################################################
+
+
+import logging
+import time
+import traceback
+from contextlib import nullcontext
+from copy import copy
+from functools import cache
+
+import cv2
+import torch
+import tqdm
+from termcolor import colored
+
+from lerobot.common.datasets.populate_dataset import add_frame, safe_stop_image_writer
+from lerobot.common.policies.factory import make_policy
+from lerobot.common.robot_devices.robots.utils import Robot
+from lerobot.common.robot_devices.utils import busy_wait
+from lerobot.common.utils.utils import get_safe_torch_device, init_hydra_config, set_global_seed
+from lerobot.scripts.eval import get_pretrained_policy_path
+
+
+def log_control_info(robot: Robot, dt_s, episode_index=None, frame_index=None, fps=None):
+    log_items = []
+    if episode_index is not None:
+        log_items.append(f"ep:{episode_index}")
+    if frame_index is not None:
+        log_items.append(f"frame:{frame_index}")
+
+    def log_dt(shortname, dt_val_s):
+        nonlocal log_items, fps
+        info_str = f"{shortname}:{dt_val_s * 1000:5.2f} ({1 / dt_val_s:3.1f}hz)"
+        if fps is not None:
+            actual_fps = 1 / dt_val_s
+            if actual_fps < fps - 1:
+                info_str = colored(info_str, "yellow")
+        log_items.append(info_str)
+
+    # total step time displayed in milliseconds and its frequency
+    log_dt("dt", dt_s)
+
+    # TODO(aliberts): move robot-specific logs logic in robot.print_logs()
+    if not robot.robot_type.startswith("stretch"):
+        for name in robot.leader_arms:
+            key = f"read_leader_{name}_pos_dt_s"
+            if key in robot.logs:
+                log_dt("dtRlead", robot.logs[key])
+
+        for name in robot.follower_arms:
+            key = f"write_follower_{name}_goal_pos_dt_s"
+            if key in robot.logs:
+                log_dt("dtWfoll", robot.logs[key])
+
+            key = f"read_follower_{name}_pos_dt_s"
+            if key in robot.logs:
+                log_dt("dtRfoll", robot.logs[key])
+
+        for name in robot.cameras:
+            key = f"read_camera_{name}_dt_s"
+            if key in robot.logs:
+                log_dt(f"dtR{name}", robot.logs[key])
+
+    info_str = " ".join(log_items)
+    logging.info(info_str)
+
+
+@cache
+def is_headless():
+    """Detects if python is running without a monitor."""
+    try:
+        import pynput  # noqa
+
+        return False
+    except Exception:
+        print(
+            "Error trying to import pynput. Switching to headless mode. "
+            "As a result, the video stream from the cameras won't be shown, "
+            "and you won't be able to change the control flow with keyboards. "
+            "For more info, see traceback below.\n"
+        )
+        traceback.print_exc()
+        print()
+        return True
+
+
+def has_method(_object: object, method_name: str):
+    return hasattr(_object, method_name) and callable(getattr(_object, method_name))
+
+
+def predict_action(observation, policy, device, use_amp):
+    observation = copy(observation)
+    with (
+        torch.inference_mode(),
+        torch.autocast(device_type=device.type) if device.type == "cuda" and use_amp else nullcontext(),
+    ):
+        # Convert to PyTorch format: channel first and float32 in [0,1] with batch dimension
+        for name in observation:
+            if "image" in name:
+                observation[name] = observation[name].type(torch.float32) / 255
+                observation[name] = observation[name].permute(2, 0, 1).contiguous()
+            observation[name] = observation[name].unsqueeze(0)
+            observation[name] = observation[name].to(device)
+
+        # Compute the next action with the policy
+        # based on the current observation
+        action = policy.select_action(observation)
+
+        # Remove batch dimension
+        action = action.squeeze(0)
+
+        # Move to cpu, if not already the case
+        action = action.to("cpu")
+
+    return action
+
+
+def init_keyboard_listener():
+    # Allow exiting early while recording an episode or resetting the environment
+    # by tapping the right arrow key '->'. This might require sudo permissions
+    # to allow your terminal to monitor keyboard events.
+    events = {}
+    events["exit_early"] = False
+    events["rerecord_episode"] = False
+    events["stop_recording"] = False
+
+    if is_headless():
+        logging.warning(
+            "Headless environment detected. On-screen camera display and keyboard inputs will not be available."
+        )
+        listener = None
+        return listener, events
+
+    # Only import pynput if not in a headless environment
+    from pynput import keyboard
+
+    def on_press(key):
+        try:
+            if key == keyboard.Key.right:
+                print("Right arrow key pressed. Exiting loop...")
+                events["exit_early"] = True
+            elif key == keyboard.Key.left:
+                print("Left arrow key pressed. Exiting loop and re-recording the last episode...")
+                events["rerecord_episode"] = True
+                events["exit_early"] = True
+            elif key == keyboard.Key.esc:
+                print("Escape key pressed. Stopping data recording...")
+                events["stop_recording"] = True
+                events["exit_early"] = True
+        except Exception as e:
+            print(f"Error handling key press: {e}")
+
+    listener = keyboard.Listener(on_press=on_press)
+    listener.start()
+
+    return listener, events
+
+
+def init_policy(pretrained_policy_name_or_path, policy_overrides):
+    """Instantiate the policy and load fps, device and use_amp from config yaml"""
+    pretrained_policy_path = get_pretrained_policy_path(pretrained_policy_name_or_path)
+    hydra_cfg = init_hydra_config(pretrained_policy_path / "config.yaml", policy_overrides)
+    policy = make_policy(hydra_cfg=hydra_cfg, pretrained_policy_name_or_path=pretrained_policy_path)
+
+    # Check device is available
+    device = get_safe_torch_device(hydra_cfg.device, log=True)
+    use_amp = hydra_cfg.use_amp
+    policy_fps = hydra_cfg.env.fps
+
+    policy.eval()
+    policy.to(device)
+
+    torch.backends.cudnn.benchmark = True
+    torch.backends.cuda.matmul.allow_tf32 = True
+    set_global_seed(hydra_cfg.seed)
+    return policy, policy_fps, device, use_amp
+
+
+def warmup_record(
+    robot,
+    events,
+    enable_teloperation,
+    warmup_time_s,
+    display_cameras,
+    fps,
+):
+    control_loop(
+        robot=robot,
+        control_time_s=warmup_time_s,
+        display_cameras=display_cameras,
+        events=events,
+        fps=fps,
+        teleoperate=enable_teloperation,
+    )
+
+
+def record_episode(
+    robot,
+    dataset,
+    events,
+    episode_time_s,
+    display_cameras,
+    policy,
+    device,
+    use_amp,
+    fps,
+):
+    control_loop(
+        robot=robot,
+        control_time_s=episode_time_s,
+        display_cameras=display_cameras,
+        dataset=dataset,
+        events=events,
+        policy=policy,
+        device=device,
+        use_amp=use_amp,
+        fps=fps,
+        teleoperate=policy is None,
+    )
+
+
+@safe_stop_image_writer
+def control_loop(
+    robot,
+    control_time_s=None,
+    teleoperate=False,
+    display_cameras=False,
+    dataset=None,
+    events=None,
+    policy=None,
+    device=None,
+    use_amp=None,
+    fps=None,
+):
+    # TODO(rcadene): Add option to record logs
+    if not robot.is_connected:
+        robot.connect()
+
+    if events is None:
+        events = {"exit_early": False}
+
+    if control_time_s is None:
+        control_time_s = float("inf")
+
+    if teleoperate and policy is not None:
+        raise ValueError("When `teleoperate` is True, `policy` should be None.")
+
+    if dataset is not None and fps is not None and dataset["fps"] != fps:
+        raise ValueError(f"The dataset fps should be equal to requested fps ({dataset['fps']} != {fps}).")
+
+    timestamp = 0
+    start_episode_t = time.perf_counter()
+    while timestamp < control_time_s:
+        start_loop_t = time.perf_counter()
+
+        if teleoperate:
+            observation, action = robot.teleop_step(record_data=True)
+        else:
+            observation = robot.capture_observation()
+
+            if policy is not None:
+                pred_action = predict_action(observation, policy, device, use_amp)
+                # Action can eventually be clipped using `max_relative_target`,
+                # so action actually sent is saved in the dataset.
+                action = robot.send_action(pred_action)
+                action = {"action": action}
+
+        if dataset is not None:
+            add_frame(dataset, observation, action)
+
+        if display_cameras and not is_headless():
+            image_keys = [key for key in observation if "image" in key]
+            for key in image_keys:
+                cv2.imshow(key, cv2.cvtColor(observation[key].numpy(), cv2.COLOR_RGB2BGR))
+            cv2.waitKey(1)
+
+        if fps is not None:
+            dt_s = time.perf_counter() - start_loop_t
+            busy_wait(1 / fps - dt_s)
+
+        dt_s = time.perf_counter() - start_loop_t
+        log_control_info(robot, dt_s, fps=fps)
+
+        timestamp = time.perf_counter() - start_episode_t
+        if events["exit_early"]:
+            events["exit_early"] = False
+            break
+
+
+def reset_environment(robot, events, reset_time_s):
+    # TODO(rcadene): refactor warmup_record and reset_environment
+    # TODO(aliberts): allow for teleop during reset
+    if has_method(robot, "teleop_safety_stop"):
+        robot.teleop_safety_stop()
+
+    timestamp = 0
+    start_vencod_t = time.perf_counter()
+
+    # Wait if necessary
+    with tqdm.tqdm(total=reset_time_s, desc="Waiting") as pbar:
+        while timestamp < reset_time_s:
+            time.sleep(1)
+            timestamp = time.perf_counter() - start_vencod_t
+            pbar.update(1)
+            if events["exit_early"]:
+                events["exit_early"] = False
+                break
+
+
+def stop_recording(robot, listener, display_cameras):
+    robot.disconnect()
+
+    if not is_headless():
+        if listener is not None:
+            listener.stop()
+
+        if display_cameras:
+            cv2.destroyAllWindows()
+
+
+def sanity_check_dataset_name(repo_id, policy):
+    _, dataset_name = repo_id.split("/")
+    # Valid cases: either the dataset name doesn't start with "eval_" and there is no policy,
+    # or the dataset name starts with "eval_" and there is a policy.
+    if dataset_name.startswith("eval_") == (policy is None):
+        raise ValueError(
+            f"The dataset name must start with 'eval_' if and only if a policy is provided (dataset name: {dataset_name}, policy: {policy})."
+        )
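Review note on `sanity_check_dataset_name`: comparing the two booleans with `==` rejects both mismatched cases, not only the one the original error message describes. Below is a minimal sketch of the same check with one message per case. `dataset_name_error` and its `has_policy` flag are hypothetical stand-ins for illustration, not part of this PR, and the sketch returns a message where the PR's function raises `ValueError`.

```python
from __future__ import annotations


def dataset_name_error(repo_id: str, has_policy: bool) -> str | None:
    """Return an error message if the name and policy are inconsistent, else None.

    Hypothetical helper for illustration only.
    """
    _, dataset_name = repo_id.split("/")
    is_eval = dataset_name.startswith("eval_")

    # Case 1: evaluation-style name, but nothing to evaluate.
    if is_eval and not has_policy:
        return f"Dataset name '{dataset_name}' starts with 'eval_' but no policy is provided."
    # Case 2: a policy is being evaluated, but the name does not say so.
    if not is_eval and has_policy:
        return f"Dataset name '{dataset_name}' does not start with 'eval_' but a policy is provided."
    return None


if __name__ == "__main__":
    cases = [
        ("user/eval_pusht", True),
        ("user/pusht", False),
        ("user/eval_pusht", False),
        ("user/pusht", True),
    ]
    for repo_id, has_policy in cases:
        print(repo_id, has_policy, dataset_name_error(repo_id, has_policy))
```

Splitting the condition this way keeps the accepted combinations identical to the `==` form while letting each failure explain itself.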
--- a/lerobot/common/robot_devices/motors/dynamixel.py
+++ b/lerobot/common/robot_devices/motors/dynamixel.py
@@ -4,7 +4,6 @@ import math
 import time
 import traceback
 from copy import deepcopy
-from pathlib import Path

 import numpy as np
 import tqdm
@@ -229,35 +228,6 @@ def assert_same_address(model_ctrl_table, motor_models, data_name):
        )


-def find_available_ports():
-    ports = []
-    for path in Path("/dev").glob("tty*"):
-        ports.append(str(path))
-    return ports
-
-
-def find_port():
-    print("Finding all available ports for the DynamixelMotorsBus.")
-    ports_before = find_available_ports()
-    print(ports_before)
-
-    print("Remove the usb cable from your DynamixelMotorsBus and press Enter when done.")
-    input()
-
-    time.sleep(0.5)
-    ports_after = find_available_ports()
-    ports_diff = list(set(ports_before) - set(ports_after))
-
-    if len(ports_diff) == 1:
-        port = ports_diff[0]
-        print(f"The port of this DynamixelMotorsBus is '{port}'")
-        print("Reconnect the usb cable.")
-    elif len(ports_diff) == 0:
-        raise OSError(f"Could not detect the port. No difference was found ({ports_diff}).")
-    else:
-        raise OSError(f"Could not detect the port. More than one port was found ({ports_diff}).")
-
-
 class TorqueMode(enum.Enum):
    ENABLED = 1
    DISABLED = 0
@@ -290,8 +260,8 @@ class DynamixelMotorsBus:
    A DynamixelMotorsBus instance requires a port (e.g. `DynamixelMotorsBus(port="/dev/tty.usbmodem575E0031751"`)).
    To find the port, you can run our utility script:
    ```bash
-    python lerobot/common/robot_devices/motors/dynamixel.py
-    >>> Finding all available ports for the DynamixelMotorsBus.
+    python lerobot/scripts/find_motors_bus_port.py
+    >>> Finding all available ports for the MotorsBus.
    >>> ['/dev/tty.usbmodem575E0032081', '/dev/tty.usbmodem575E0031751']
    >>> Remove the usb cable from your DynamixelMotorsBus and press Enter when done.
    >>> The port of this DynamixelMotorsBus is /dev/tty.usbmodem575E0031751.
@@ -369,7 +339,7 @@ class DynamixelMotorsBus:
        except Exception:
            traceback.print_exc()
            print(
-                "\nTry running `python lerobot/common/robot_devices/motors/dynamixel.py` to make sure you are using the correct port.\n"
+                "\nTry running `python lerobot/scripts/find_motors_bus_port.py` to make sure you are using the correct port.\n"
            )
            raise

@@ -378,20 +348,6 @@ class DynamixelMotorsBus:

        self.port_handler.setPacketTimeoutMillis(TIMEOUT_MS)

-        # Set expected baudrate for the bus
-        self.set_bus_baudrate(BAUDRATE)
-
-        if not self.are_motors_configured():
-            input(
-                "\n/!\\ A configuration issue has been detected with your motors: \n"
-                "If it's the first time that you use these motors, press enter to configure your motors... but before "
-                "verify that all the cables are connected the proper way. If you find an issue, before making a modification, "
-                "kill the python process, unplug the power cord to not damage the motors, rewire correctly, then plug the power "
-                "again and relaunch the script.\n"
-            )
-            print()
-            self.configure_motors()
-
    def reconnect(self):
        if self.mock:
            import tests.mock_dynamixel_sdk as dxl
@@ -415,120 +371,14 @@ class DynamixelMotorsBus:
            print(e)
            return False

-    def configure_motors(self):
-        # TODO(rcadene): This script assumes motors follow the X_SERIES baudrates
-        # TODO(rcadene): Refactor this function with intermediate high-level functions
-
-        print("Scanning all baudrates and motor indices")
-        all_baudrates = set(X_SERIES_BAUDRATE_TABLE.values())
-        ids_per_baudrate = {}
-        for baudrate in all_baudrates:
-            self.set_bus_baudrate(baudrate)
-            present_ids = self.find_motor_indices()
-            if len(present_ids) > 0:
-                ids_per_baudrate[baudrate] = present_ids
-        print(f"Motor indices detected: {ids_per_baudrate}")
-        print()
-
-        possible_baudrates = list(ids_per_baudrate.keys())
-        possible_ids = list({idx for sublist in ids_per_baudrate.values() for idx in sublist})
-        untaken_ids = list(set(range(MAX_ID_RANGE)) - set(possible_ids) - set(self.motor_indices))
-
-        # Connect successively one motor to the chain and write a unique random index for each
-        for i in range(len(self.motors)):
-            self.disconnect()
-            input(
-                "1. Unplug the power cord\n"
-                "2. Plug/unplug minimal number of cables to only have the first "
-                f"{i+1} motor(s) ({self.motor_names[:i+1]}) connected.\n"
-                "3. Re-plug the power cord\n"
-                "Press Enter to continue..."
-            )
-            print()
-            self.reconnect()
-
-            if i > 0:
-                try:
-                    self._read_with_motor_ids(self.motor_models, untaken_ids[:i], "ID")
-                except ConnectionError:
-                    print(f"Failed to read from {untaken_ids[:i+1]}. Make sure the power cord is plugged in.")
-                    input("Press Enter to continue...")
-                    print()
-                    self.reconnect()
-
-            print("Scanning possible baudrates and motor indices")
-            motor_found = False
-            for baudrate in possible_baudrates:
-                self.set_bus_baudrate(baudrate)
-                present_ids = self.find_motor_indices(possible_ids)
-                if len(present_ids) == 1:
-                    present_idx = present_ids[0]
-                    print(f"Detected motor with index {present_idx}")
-
-                    if baudrate != BAUDRATE:
-                        print(f"Setting its baudrate to {BAUDRATE}")
-                        baudrate_idx = list(X_SERIES_BAUDRATE_TABLE.values()).index(BAUDRATE)
-
-                        # The write can fail, so we allow retries
-                        for _ in range(NUM_WRITE_RETRY):
-                            self._write_with_motor_ids(
-                                self.motor_models, present_idx, "Baud_Rate", baudrate_idx
-                            )
-                            time.sleep(0.5)
-                            self.set_bus_baudrate(BAUDRATE)
-                            try:
-                                present_baudrate_idx = self._read_with_motor_ids(
-                                    self.motor_models, present_idx, "Baud_Rate"
-                                )
-                            except ConnectionError:
-                                print("Failed to write baudrate. Retrying.")
-                                self.set_bus_baudrate(baudrate)
-                                continue
-                            break
-                        else:
-                            raise
-
-                        if present_baudrate_idx != baudrate_idx:
-                            raise OSError("Failed to write baudrate.")
-
-                    print(f"Setting its index to a temporary untaken index ({untaken_ids[i]})")
-                    self._write_with_motor_ids(self.motor_models, present_idx, "ID", untaken_ids[i])
-
-                    present_idx = self._read_with_motor_ids(self.motor_models, untaken_ids[i], "ID")
-                    if present_idx != untaken_ids[i]:
-                        raise OSError("Failed to write index.")
-
-                    motor_found = True
-                    break
-                elif len(present_ids) > 1:
-                    raise OSError(f"More than one motor detected ({present_ids}), but only one was expected.")
-
-            if not motor_found:
-                raise OSError(
-                    "No motor found, but one new motor expected. Verify power cord is plugged in and retry."
-                )
-            print()
-
-        print(f"Setting expected motor indices: {self.motor_indices}")
-        self.set_bus_baudrate(BAUDRATE)
-        self._write_with_motor_ids(
-            self.motor_models, untaken_ids[: len(self.motors)], "ID", self.motor_indices
-        )
-        print()
-
-        if (self.read("ID") != self.motor_indices).any():
-            raise OSError("Failed to write motors indices.")
-
-        print("Configuration is done!")
-
-    def find_motor_indices(self, possible_ids=None):
+    def find_motor_indices(self, possible_ids=None, num_retry=2):
        if possible_ids is None:
            possible_ids = range(MAX_ID_RANGE)

        indices = []
        for idx in tqdm.tqdm(possible_ids):
            try:
-                present_idx = self._read_with_motor_ids(self.motor_models, [idx], "ID")[0]
+                present_idx = self.read_with_motor_ids(self.motor_models, [idx], "ID", num_retry=num_retry)[0]
            except ConnectionError:
                continue

@@ -788,7 +638,7 @@ class DynamixelMotorsBus:
        values = np.round(values).astype(np.int32)
        return values

-    def _read_with_motor_ids(self, motor_models, motor_ids, data_name):
+    def read_with_motor_ids(self, motor_models, motor_ids, data_name, num_retry=NUM_READ_RETRY):
        if self.mock:
            import tests.mock_dynamixel_sdk as dxl
        else:
@@ -805,7 +655,11 @@ class DynamixelMotorsBus:
        for idx in motor_ids:
            group.addParam(idx)

-        comm = group.txRxPacket()
+        for _ in range(num_retry):
+            comm = group.txRxPacket()
+            if comm == dxl.COMM_SUCCESS:
+                break
+
        if comm != dxl.COMM_SUCCESS:
            raise ConnectionError(
                f"Read failed due to communication error on port {self.port_handler.port_name} for indices {motor_ids}: "
@@ -895,7 +749,7 @@ class DynamixelMotorsBus:

        return values

-    def _write_with_motor_ids(self, motor_models, motor_ids, data_name, values):
+    def write_with_motor_ids(self, motor_models, motor_ids, data_name, values, num_retry=NUM_WRITE_RETRY):
        if self.mock:
            import tests.mock_dynamixel_sdk as dxl
        else:
@@ -913,7 +767,11 @@ class DynamixelMotorsBus:
            data = convert_to_bytes(value, bytes, self.mock)
            group.addParam(idx, data)

-        comm = group.txPacket()
+        for _ in range(num_retry):
+            comm = group.txPacket()
+            if comm == dxl.COMM_SUCCESS:
+                break
+
        if comm != dxl.COMM_SUCCESS:
            raise ConnectionError(
                f"Write failed due to communication error on port {self.port_handler.port_name} for indices {motor_ids}: "
@@ -1007,8 +865,3 @@ class DynamixelMotorsBus:
    def __del__(self):
        if getattr(self, "is_connected", False):
            self.disconnect()
-
-
-if __name__ == "__main__":
-    # Helper to find the usb port associated to all your DynamixelMotorsBus.
-    find_port()
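Review note on the retry loops added to `read_with_motor_ids` and `write_with_motor_ids`: both resend the packet until the status is `COMM_SUCCESS` or the retry budget is spent, then leave it to the following check to raise on the last status. Below is a minimal sketch of that pattern, assuming a hypothetical `send` callable in place of `group.txRxPacket()` and stand-in status constants, not the real SDK values.

```python
from typing import Callable

COMM_SUCCESS = 0  # stand-in for dxl.COMM_SUCCESS (assumed value, for illustration only)
COMM_FAILURE = -1  # stand-in for any non-success status code


def send_with_retry(send: Callable[[], int], num_retry: int) -> int:
    """Call `send` up to `num_retry` times and return the last status code."""
    if num_retry < 1:
        # With zero attempts there would be no status to report.
        raise ValueError("num_retry must be at least 1")

    comm = COMM_FAILURE
    for _ in range(num_retry):
        comm = send()
        if comm == COMM_SUCCESS:
            break
    return comm


class FlakyBus:
    """Fake bus that fails a fixed number of times before succeeding."""

    def __init__(self, failures: int):
        self.failures = failures
        self.calls = 0

    def send(self) -> int:
        self.calls += 1
        return COMM_FAILURE if self.calls <= self.failures else COMM_SUCCESS


if __name__ == "__main__":
    bus = FlakyBus(failures=2)
    status = send_with_retry(bus.send, num_retry=5)
    print(status == COMM_SUCCESS, bus.calls)
```

Unlike the loops in the diff, the sketch rejects `num_retry < 1` up front. With zero attempts, `comm` would otherwise be unbound when the `comm != dxl.COMM_SUCCESS` check runs.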
--- a/lerobot/common/robot_devices/motors/feetech.py
+++ b/lerobot/common/robot_devices/motors/feetech.py
@@ -0,0 +1,887 @@
+import enum
+import logging
+import math
+import time
+import traceback
+from copy import deepcopy
+
+import numpy as np
+import tqdm
+
+from lerobot.common.robot_devices.utils import RobotDeviceAlreadyConnectedError, RobotDeviceNotConnectedError
+from lerobot.common.utils.utils import capture_timestamp_utc
+
+PROTOCOL_VERSION = 0
+BAUDRATE = 1_000_000
+TIMEOUT_MS = 1000
+
+MAX_ID_RANGE = 252
+
+# The following bounds define the lower and upper limits of the joint range (after calibration).
+# For joints in degrees (i.e. revolute joints), the nominal range is [-180, 180] degrees,
+# which corresponds to half a rotation to the left and half a rotation to the right.
+# Some joints might require a wider range, so we allow up to [-270, 270] degrees before
+# an error is raised.
+LOWER_BOUND_DEGREE = -270
+UPPER_BOUND_DEGREE = 270
+# For joints in percentage (i.e. joints that move linearly like the prismatic joint of a gripper),
+# the nominal range is [0, 100] %. For instance, for the Aloha gripper, 0% is fully
+# closed, and 100% is fully open. To account for slight calibration issues, we allow up to
+# [-10, 110] before an error is raised.
+LOWER_BOUND_LINEAR = -10
+UPPER_BOUND_LINEAR = 110
+
+HALF_TURN_DEGREE = 180
+
+
+# See this link for STS3215 Memory Table:
+# https://docs.google.com/spreadsheets/d/1GVs7W1VS1PqdhA1nW-abeyAHhTUxKUdR/edit?usp=sharing&ouid=116566590112741600240&rtpof=true&sd=true
+# data_name: (address, size_byte)
+SCS_SERIES_CONTROL_TABLE = {
+    "Model": (3, 2),
+    "ID": (5, 1),
+    "Baud_Rate": (6, 1),
+    "Return_Delay": (7, 1),
+    "Response_Status_Level": (8, 1),
+    "Min_Angle_Limit": (9, 2),
+    "Max_Angle_Limit": (11, 2),
+    "Max_Temperature_Limit": (13, 1),
+    "Max_Voltage_Limit": (14, 1),
+    "Min_Voltage_Limit": (15, 1),
+    "Max_Torque_Limit": (16, 2),
+    "Phase": (18, 1),
+    "Unloading_Condition": (19, 1),
+    "LED_Alarm_Condition": (20, 1),
+    "P_Coefficient": (21, 1),
+    "D_Coefficient": (22, 1),
+    "I_Coefficient": (23, 1),
+    "Minimum_Startup_Force": (24, 2),
+    "CW_Dead_Zone": (26, 1),
+    "CCW_Dead_Zone": (27, 1),
+    "Protection_Current": (28, 2),
+    "Angular_Resolution": (30, 1),
+    "Offset": (31, 2),
+    "Mode": (33, 1),
+    "Protective_Torque": (34, 1),
+    "Protection_Time": (35, 1),
+    "Overload_Torque": (36, 1),
+    "Speed_closed_loop_P_proportional_coefficient": (37, 1),
+    "Over_Current_Protection_Time": (38, 1),
+    "Velocity_closed_loop_I_integral_coefficient": (39, 1),
+    "Torque_Enable": (40, 1),
+    "Acceleration": (41, 1),
+    "Goal_Position": (42, 2),
+    "Goal_Time": (44, 2),
+    "Goal_Speed": (46, 2),
+    "Torque_Limit": (48, 2),
+    "Lock": (55, 1),
+    "Present_Position": (56, 2),
+    "Present_Speed": (58, 2),
+    "Present_Load": (60, 2),
+    "Present_Voltage": (62, 1),
+    "Present_Temperature": (63, 1),
+    "Status": (65, 1),
+    "Moving": (66, 1),
+    "Present_Current": (69, 2),
+    # Not in the Memory Table
+    "Maximum_Acceleration": (85, 2),
+}
+
+SCS_SERIES_BAUDRATE_TABLE = {
+    0: 1_000_000,
+    1: 500_000,
+    2: 250_000,
+    3: 128_000,
+    4: 115_200,
+    5: 57_600,
+    6: 38_400,
+    7: 19_200,
+}
+
+CALIBRATION_REQUIRED = ["Goal_Position", "Present_Position"]
+CONVERT_UINT32_TO_INT32_REQUIRED = ["Goal_Position", "Present_Position"]
+
+
+MODEL_CONTROL_TABLE = {
+    "scs_series": SCS_SERIES_CONTROL_TABLE,
+    "sts3215": SCS_SERIES_CONTROL_TABLE,
+}
+
+MODEL_RESOLUTION = {
+    "scs_series": 4096,
+    "sts3215": 4096,
+}
+
+MODEL_BAUDRATE_TABLE = {
+    "scs_series": SCS_SERIES_BAUDRATE_TABLE,
+    "sts3215": SCS_SERIES_BAUDRATE_TABLE,
+}
+
+# A higher number of retries is needed for Feetech motors than for Dynamixel motors.
+NUM_READ_RETRY = 20
+NUM_WRITE_RETRY = 20
+
+
+def convert_degrees_to_steps(degrees: float | np.ndarray, models: str | list[str]) -> np.ndarray:
+    """Convert motor rotations expressed in degrees to motor steps.
+    It assumes a motor achieves a full rotation by going from the -180 degree position to the +180 degree position.
+    The motor resolution (e.g. 4096) corresponds to the number of steps needed to achieve a full rotation.
+    """
+    resolutions = [MODEL_RESOLUTION[model] for model in models]
+    steps = degrees / 180 * np.array(resolutions) / 2
+    steps = steps.astype(int)
+    return steps
+
+
+def convert_to_bytes(value, bytes, mock=False):
+    if mock:
+        return value
+
+    import scservo_sdk as scs
+
+    # Note: No need to convert back into unsigned int, since this byte preprocessing
+    # already handles it for us.
+    if bytes == 1:
+        data = [
+            scs.SCS_LOBYTE(scs.SCS_LOWORD(value)),
+        ]
+    elif bytes == 2:
+        data = [
+            scs.SCS_LOBYTE(scs.SCS_LOWORD(value)),
+            scs.SCS_HIBYTE(scs.SCS_LOWORD(value)),
+        ]
+    elif bytes == 4:
+        data = [
+            scs.SCS_LOBYTE(scs.SCS_LOWORD(value)),
+            scs.SCS_HIBYTE(scs.SCS_LOWORD(value)),
+            scs.SCS_LOBYTE(scs.SCS_HIWORD(value)),
+            scs.SCS_HIBYTE(scs.SCS_HIWORD(value)),
+        ]
+    else:
+        raise NotImplementedError(
+            f"Value of the number of bytes to be sent is expected to be in [1, 2, 4], but "
+            f"{bytes} is provided instead."
+        )
+    return data
+
+
+def get_group_sync_key(data_name, motor_names):
+    group_key = f"{data_name}_" + "_".join(motor_names)
+    return group_key
+
+
+def get_result_name(fn_name, data_name, motor_names):
+    group_key = get_group_sync_key(data_name, motor_names)
+    rslt_name = f"{fn_name}_{group_key}"
+    return rslt_name
+
+
+def get_queue_name(fn_name, data_name, motor_names):
+    group_key = get_group_sync_key(data_name, motor_names)
+    queue_name = f"{fn_name}_{group_key}"
+    return queue_name
+
+
+def get_log_name(var_name, fn_name, data_name, motor_names):
+    group_key = get_group_sync_key(data_name, motor_names)
+    log_name = f"{var_name}_{fn_name}_{group_key}"
+    return log_name
+
+
+def assert_same_address(model_ctrl_table, motor_models, data_name):
+    all_addr = []
+    all_bytes = []
+    for model in motor_models:
+        addr, bytes = model_ctrl_table[model][data_name]
+        all_addr.append(addr)
+        all_bytes.append(bytes)
+
+    if len(set(all_addr)) != 1:
+        raise NotImplementedError(
+            f"At least two motor models use a different address for `data_name`='{data_name}' ({list(zip(motor_models, all_addr, strict=False))}). Contact a LeRobot maintainer."
+        )
+
+    if len(set(all_bytes)) != 1:
+        raise NotImplementedError(
+            f"At least two motor models use a different bytes representation for `data_name`='{data_name}' ({list(zip(motor_models, all_bytes, strict=False))}). Contact a LeRobot maintainer."
+        )
+
+
+class TorqueMode(enum.Enum):
+    ENABLED = 1
+    DISABLED = 0
+
+
+class DriveMode(enum.Enum):
+    NON_INVERTED = 0
+    INVERTED = 1
+
+
+class CalibrationMode(enum.Enum):
+    # Joints with rotational motions are expressed in degrees in nominal range of [-180, 180]
+    DEGREE = 0
+    # Joints with linear motions (like the gripper of Aloha) are expressed in nominal range of [0, 100]
+    LINEAR = 1
+
+
+class JointOutOfRangeError(Exception):
+    def __init__(self, message="Joint is out of range"):
+        self.message = message
+        super().__init__(self.message)
+
+
+class FeetechMotorsBus:
+    """
+    The FeetechMotorsBus class allows efficient reads from and writes to the attached motors. It relies on
+    the Python Feetech SDK to communicate with the motors. For more info, see the [feetech SDK Documentation](https://emanual.robotis.com/docs/en/software/feetech/feetech_sdk/sample_code/python_read_write_protocol_2_0/#python-read-write-protocol-20).
+
+    A FeetechMotorsBus instance requires a port (e.g. `FeetechMotorsBus(port="/dev/tty.usbmodem575E0031751")`).
+    To find the port, you can run our utility script:
+    ```bash
+    python lerobot/scripts/find_motors_bus_port.py
+    >>> Finding all available ports for the MotorsBus.
+    >>> ['/dev/tty.usbmodem575E0032081', '/dev/tty.usbmodem575E0031751']
+    >>> Remove the usb cable from your FeetechMotorsBus and press Enter when done.
+    >>> The port of this FeetechMotorsBus is /dev/tty.usbmodem575E0031751.
+    >>> Reconnect the usb cable.
+    ```
+
+    Example of usage for 1 motor connected to the bus:
+    ```python
+    motor_name = "gripper"
+    motor_index = 6
+    motor_model = "sts3215"
+
+    motors_bus = FeetechMotorsBus(
+        port="/dev/tty.usbmodem575E0031751",
+        motors={motor_name: (motor_index, motor_model)},
+    )
+    motors_bus.connect()
+
+    position = motors_bus.read("Present_Position")
+
+    # move by a few motor steps as an example
+    few_steps = 30
+    motors_bus.write("Goal_Position", position + few_steps)
+
+    # when done, consider disconnecting
+    motors_bus.disconnect()
+    ```
+    """
+
+    def __init__(
+        self,
+        port: str,
+        motors: dict[str, tuple[int, str]],
+        extra_model_control_table: dict[str, list[tuple]] | None = None,
+        extra_model_resolution: dict[str, int] | None = None,
+        mock=False,
+    ):
+        self.port = port
+        self.motors = motors
+        self.mock = mock
+
+        self.model_ctrl_table = deepcopy(MODEL_CONTROL_TABLE)
+        if extra_model_control_table:
+            self.model_ctrl_table.update(extra_model_control_table)
+
+        self.model_resolution = deepcopy(MODEL_RESOLUTION)
+        if extra_model_resolution:
+            self.model_resolution.update(extra_model_resolution)
+
+        self.port_handler = None
+        self.packet_handler = None
+        self.calibration = None
+        self.is_connected = False
+        self.group_readers = {}
+        self.group_writers = {}
+        self.logs = {}
+
+        self.track_positions = {}
+
+    def connect(self):
+        if self.is_connected:
+            raise RobotDeviceAlreadyConnectedError(
+                f"FeetechMotorsBus({self.port}) is already connected. Do not call `motors_bus.connect()` twice."
+            )
+
+        if self.mock:
+            import tests.mock_scservo_sdk as scs
+        else:
+            import scservo_sdk as scs
+
+        self.port_handler = scs.PortHandler(self.port)
+        self.packet_handler = scs.PacketHandler(PROTOCOL_VERSION)
+
+        try:
+            if not self.port_handler.openPort():
+                raise OSError(f"Failed to open port '{self.port}'.")
+        except Exception:
+            traceback.print_exc()
+            print(
+                "\nTry running `python lerobot/scripts/find_motors_bus_port.py` to make sure you are using the correct port.\n"
+            )
+            raise
+
+        # Allow reading and writing
+        self.is_connected = True
+
+        self.port_handler.setPacketTimeoutMillis(TIMEOUT_MS)
+
+    def reconnect(self):
+        if self.mock:
+            import tests.mock_scservo_sdk as scs
+        else:
+            import scservo_sdk as scs
+
+        self.port_handler = scs.PortHandler(self.port)
+        self.packet_handler = scs.PacketHandler(PROTOCOL_VERSION)
+
+        if not self.port_handler.openPort():
+            raise OSError(f"Failed to open port '{self.port}'.")
+
+        self.is_connected = True
+
+    def are_motors_configured(self):
+        # Only check the motor indices and not baudrate, since if the motor baudrates are incorrect,
+        # a ConnectionError will be raised anyway.
+        try:
+            return (self.motor_indices == self.read("ID")).all()
+        except ConnectionError as e:
+            print(e)
+            return False
+
+    def find_motor_indices(self, possible_ids=None, num_retry=2):
+        if possible_ids is None:
+            possible_ids = range(MAX_ID_RANGE)
+
+        indices = []
+        for idx in tqdm.tqdm(possible_ids):
+            try:
+                present_idx = self.read_with_motor_ids(self.motor_models, [idx], "ID", num_retry=num_retry)[0]
+            except ConnectionError:
+                continue
+
+            if idx != present_idx:
+                # sanity check
+                raise OSError(
+                    "Motor index used to communicate through the bus is not the same as the one present in the motor memory. The motor memory might be damaged."
+                )
+            indices.append(idx)
+
+        return indices
+
+    def set_bus_baudrate(self, baudrate):
+        present_bus_baudrate = self.port_handler.getBaudRate()
+        if present_bus_baudrate != baudrate:
+            print(f"Setting bus baud rate to {baudrate}. Previously {present_bus_baudrate}.")
+            self.port_handler.setBaudRate(baudrate)
+
+            if self.port_handler.getBaudRate() != baudrate:
+                raise OSError("Failed to write bus baud rate.")
+
+    @property
+    def motor_names(self) -> list[str]:
+        return list(self.motors.keys())
+
+    @property
+    def motor_models(self) -> list[str]:
+        return [model for _, model in self.motors.values()]
+
+    @property
+    def motor_indices(self) -> list[int]:
+        return [idx for idx, _ in self.motors.values()]
+
+    def set_calibration(self, calibration: dict[str, list]):
+        self.calibration = calibration
+
+    def apply_calibration_autocorrect(self, values: np.ndarray | list, motor_names: list[str] | None):
+        """This function applies the calibration, automatically detects out-of-range errors in motor values, and attempts to correct them.
+
+        For more info, see docstring of `apply_calibration` and `autocorrect_calibration`.
+        """
+        try:
+            values = self.apply_calibration(values, motor_names)
+        except JointOutOfRangeError as e:
+            print(e)
+            self.autocorrect_calibration(values, motor_names)
+            values = self.apply_calibration(values, motor_names)
+        return values
+
+    def apply_calibration(self, values: np.ndarray | list, motor_names: list[str] | None):
+        """Convert from unsigned int32 joint position range [0, 2**32[ to the universal float32 nominal degree range ]-180.0, 180.0[ with
+        a "zero position" at 0 degree.
+
+        Note: We say "nominal degree range" since the motors can take values outside this range. For instance, 190 degrees, if the motor
+        rotates more than half a turn from the zero position. However, most motors can't rotate more than 180 degrees and will stay in this range.
+
+        Joint values are originally in [0, 2**32[ (unsigned int32). Each motor is expected to complete a full rotation
+        when given a goal position that is + or - its resolution. For instance, the feetech sts3215 has a resolution of 4096, and
+        at any position in its original range, let's say the position 56734, it completes a full rotation clockwise by moving to 60830,
+        or anticlockwise by moving to 52638. The position in the original range is arbitrary and might vary a lot between motors.
+        To harmonize between motors of the same model, different robots, or even models of different brands, we propose to work
+        in the centered nominal degree range ]-180, 180[.
+        """
+        if motor_names is None:
+            motor_names = self.motor_names
+
+        # Convert from unsigned int32 original range [0, 2**32[ to signed float32 range
+        values = values.astype(np.float32)
+
+        for i, name in enumerate(motor_names):
+            calib_idx = self.calibration["motor_names"].index(name)
+            calib_mode = self.calibration["calib_mode"][calib_idx]
+
+            if CalibrationMode[calib_mode] == CalibrationMode.DEGREE:
+                drive_mode = self.calibration["drive_mode"][calib_idx]
+                homing_offset = self.calibration["homing_offset"][calib_idx]
+                _, model = self.motors[name]
+                resolution = self.model_resolution[model]
+
+                # Update the direction of rotation of the motor so that leader and follower match.
+                # Indeed, the leader motor for a given joint can be assembled with the opposite
+                # direction of rotation compared to the follower motor on the same joint.
+                if drive_mode:
+                    values[i] *= -1
+
+                # Convert from range [-2**31, 2**31[ to
+                # nominal range ]-resolution/2, resolution/2[ (e.g. ]-2048, 2048[)
+                values[i] += homing_offset
+
+                # Convert from range ]-resolution/2, resolution/2[ to
+                # universal float32 centered degree range ]-180, 180[
+                values[i] = values[i] / (resolution // 2) * HALF_TURN_DEGREE
+
+                if (values[i] < LOWER_BOUND_DEGREE) or (values[i] > UPPER_BOUND_DEGREE):
+                    raise JointOutOfRangeError(
+                        f"Wrong motor position range detected for {name}. "
+                        f"Expected to be in nominal range of [-{HALF_TURN_DEGREE}, {HALF_TURN_DEGREE}] degrees (a full rotation), "
+                        f"with a maximum range of [{LOWER_BOUND_DEGREE}, {UPPER_BOUND_DEGREE}] degrees to account for joints that can rotate a bit more, "
+                        f"but present value is {values[i]} degree. "
+                        "This might be due to a cable connection issue creating an artificial 360 degrees jump in motor values. "
+                        "You need to recalibrate by running: `python lerobot/scripts/control_robot.py calibrate`"
+                    )
+
+            elif CalibrationMode[calib_mode] == CalibrationMode.LINEAR:
+                start_pos = self.calibration["start_pos"][calib_idx]
+                end_pos = self.calibration["end_pos"][calib_idx]
+
+                # Rescale the present position to a nominal range [0, 100] %,
+                # useful for joints with linear motions like Aloha gripper
+                values[i] = (values[i] - start_pos) / (end_pos - start_pos) * 100
+
+                if (values[i] < LOWER_BOUND_LINEAR) or (values[i] > UPPER_BOUND_LINEAR):
+                    raise JointOutOfRangeError(
+                        f"Wrong motor position range detected for {name}. "
+                        f"Expected to be in nominal range of [0, 100] % (a full linear translation), "
+                        f"with a maximum range of [{LOWER_BOUND_LINEAR}, {UPPER_BOUND_LINEAR}] % to account for some imprecision during calibration, "
+                        f"but present value is {values[i]} %. "
+                        "This might be due to a cable connection issue creating an artificial jump in motor values. "
+                        "You need to recalibrate by running: `python lerobot/scripts/control_robot.py calibrate`"
+                    )
+
+        return values
+
+    def autocorrect_calibration(self, values: np.ndarray | list, motor_names: list[str] | None):
+        """This function automatically detects issues with values of motors after calibration, and corrects them.
+
+        Some motors might have values outside of expected maximum bounds after calibration.
+        For instance, for a joint in degree, its value can be outside [-270, 270] degrees, which is totally unexpected given
+        a nominal range of [-180, 180] degrees, which represents half a turn to the left or right starting from zero position.
+
+        Known issues:
+        #1: Motor value randomly shifts by a full turn, caused by hardware/connection errors.
+        #2: Motor internal homing offset is shifted by a full turn, caused by using default calibration (e.g. Aloha).
+        #3: Motor internal homing offset is shifted by less or more than a full turn, caused by using default calibration
+            or by human error during manual calibration.
+
+        Issues #1 and #2 can be solved by shifting the calibration homing offset by a full turn.
+        Issue #3 will be visually detected by the user and potentially captured by the safety feature `max_relative_target`,
+        which will slow down the motor and raise an error asking to recalibrate. Manual recalibration will solve the issue.
+
+        Note: A full turn corresponds to 360 degrees but also to 4096 steps for a motor resolution of 4096.
+        """
+        if motor_names is None:
+            motor_names = self.motor_names
+
+        # Convert from unsigned int32 original range [0, 2**32] to signed float32 range
+        values = values.astype(np.float32)
+
+        for i, name in enumerate(motor_names):
+            calib_idx = self.calibration["motor_names"].index(name)
+            calib_mode = self.calibration["calib_mode"][calib_idx]
+
+            if CalibrationMode[calib_mode] == CalibrationMode.DEGREE:
+                drive_mode = self.calibration["drive_mode"][calib_idx]
+                homing_offset = self.calibration["homing_offset"][calib_idx]
+                _, model = self.motors[name]
+                resolution = self.model_resolution[model]
+
+                if drive_mode:
+                    values[i] *= -1
+
+                # Convert from initial range to range [-180, 180] degrees
+                calib_val = (values[i] + homing_offset) / (resolution // 2) * HALF_TURN_DEGREE
+                in_range = (calib_val > LOWER_BOUND_DEGREE) and (calib_val < UPPER_BOUND_DEGREE)
+
+                # Solve this inequality to find the factor to shift the range into [-180, 180] degrees
+                # values[i] = (values[i] + homing_offset + resolution * factor) / (resolution // 2) * HALF_TURN_DEGREE
+                # - HALF_TURN_DEGREE <= (values[i] + homing_offset + resolution * factor) / (resolution // 2) * HALF_TURN_DEGREE <= HALF_TURN_DEGREE
+                # (- HALF_TURN_DEGREE / HALF_TURN_DEGREE * (resolution // 2) - values[i] - homing_offset) / resolution <= factor <= (HALF_TURN_DEGREE / HALF_TURN_DEGREE * (resolution // 2) - values[i] - homing_offset) / resolution
+                low_factor = (
+                    -HALF_TURN_DEGREE / HALF_TURN_DEGREE * (resolution // 2) - values[i] - homing_offset
+                ) / resolution
+                upp_factor = (
+                    HALF_TURN_DEGREE / HALF_TURN_DEGREE * (resolution // 2) - values[i] - homing_offset
+                ) / resolution
+
+            elif CalibrationMode[calib_mode] == CalibrationMode.LINEAR:
+                start_pos = self.calibration["start_pos"][calib_idx]
+                end_pos = self.calibration["end_pos"][calib_idx]
+                resolution = self.model_resolution[self.motors[name][1]]
+                # Convert from initial range to range [0, 100] in %
+                calib_val = (values[i] - start_pos) / (end_pos - start_pos) * 100
+                in_range = (calib_val > LOWER_BOUND_LINEAR) and (calib_val < UPPER_BOUND_LINEAR)
+
+                # Solve this inequality to find the factor to shift the range into [0, 100] %
+                # values[i] = (values[i] - start_pos + resolution * factor) / (end_pos + resolution * factor - start_pos - resolution * factor) * 100
+                # values[i] = (values[i] - start_pos + resolution * factor) / (end_pos - start_pos) * 100
+                # 0 <= (values[i] - start_pos + resolution * factor) / (end_pos - start_pos) * 100 <= 100
+                # (start_pos - values[i]) / resolution <= factor <= (end_pos - values[i]) / resolution
+                low_factor = (start_pos - values[i]) / resolution
+                upp_factor = (end_pos - values[i]) / resolution
+
+            if not in_range:
+                # Get first integer between the two bounds
+                if low_factor < upp_factor:
+                    factor = math.ceil(low_factor)
+
+                    if factor > upp_factor:
+                        raise ValueError(f"No integer found between bounds [{low_factor=}, {upp_factor=}]")
+                else:
+                    factor = math.ceil(upp_factor)
+
+                    if factor > low_factor:
+                        raise ValueError(f"No integer found between bounds [{low_factor=}, {upp_factor=}]")
+
+                if CalibrationMode[calib_mode] == CalibrationMode.DEGREE:
+                    out_of_range_str = f"{LOWER_BOUND_DEGREE} < {calib_val} < {UPPER_BOUND_DEGREE} degrees"
+                    in_range_str = f"{LOWER_BOUND_DEGREE} < {calib_val + 2 * HALF_TURN_DEGREE * factor} < {UPPER_BOUND_DEGREE} degrees"
+                elif CalibrationMode[calib_mode] == CalibrationMode.LINEAR:
+                    out_of_range_str = f"{LOWER_BOUND_LINEAR} < {calib_val} < {UPPER_BOUND_LINEAR} %"
+                    in_range_str = f"{LOWER_BOUND_LINEAR} < {(values[i] - start_pos + resolution * factor) / (end_pos - start_pos) * 100} < {UPPER_BOUND_LINEAR} %"
+
+                logging.warning(
+                    f"Auto-correct calibration of motor '{name}' by shifting value by {abs(factor)} full turns, "
+                    f"from '{out_of_range_str}' to '{in_range_str}'."
+                )
+
+                # A full turn corresponds to 360 degrees but also to 4096 steps for a motor resolution of 4096.
+                self.calibration["homing_offset"][calib_idx] += resolution * factor
+
+    def revert_calibration(self, values: np.ndarray | list, motor_names: list[str] | None):
+        """Inverse of `apply_calibration`."""
+        if motor_names is None:
+            motor_names = self.motor_names
+
+        for i, name in enumerate(motor_names):
+            calib_idx = self.calibration["motor_names"].index(name)
+            calib_mode = self.calibration["calib_mode"][calib_idx]
+
+            if CalibrationMode[calib_mode] == CalibrationMode.DEGREE:
+                drive_mode = self.calibration["drive_mode"][calib_idx]
+                homing_offset = self.calibration["homing_offset"][calib_idx]
+                _, model = self.motors[name]
+                resolution = self.model_resolution[model]
+
+                # Convert from nominal 0-centered degree range [-180, 180] to
+                # 0-centered resolution range (e.g. [-2048, 2048] for resolution=4096)
+                values[i] = values[i] / HALF_TURN_DEGREE * (resolution // 2)
+
+                # Subtract the homing offsets to come back to actual motor range of values
+                # which can be arbitrary.
+                values[i] -= homing_offset
+
+                # Remove drive mode, which is the rotation direction of the motor, to come back to
+                # actual motor rotation direction which can be arbitrary.
+                if drive_mode:
+                    values[i] *= -1
+
+            elif CalibrationMode[calib_mode] == CalibrationMode.LINEAR:
+                start_pos = self.calibration["start_pos"][calib_idx]
+                end_pos = self.calibration["end_pos"][calib_idx]
+
+                # Convert from nominal linear range of [0, 100] % to
+                # actual motor range of values which can be arbitrary.
+                values[i] = values[i] / 100 * (end_pos - start_pos) + start_pos
+
+        values = np.round(values).astype(np.int32)
+        return values
+
+    def avoid_rotation_reset(self, values, motor_names, data_name):
+        if data_name not in self.track_positions:
+            self.track_positions[data_name] = {
+                "prev": [None] * len(self.motor_names),
+                # Assume False at initialization
+                "below_zero": [False] * len(self.motor_names),
+                "above_max": [False] * len(self.motor_names),
+            }
+
+        track = self.track_positions[data_name]
+
+        if motor_names is None:
+            motor_names = self.motor_names
+
+        for i, name in enumerate(motor_names):
+            idx = self.motor_names.index(name)
+
+            if track["prev"][idx] is None:
+                track["prev"][idx] = values[i]
+                continue
+
+            # Detect that a full rotation occurred
+            if abs(track["prev"][idx] - values[i]) > 2048:
+                # Position went below 0 and got reset to 4095
+                if track["prev"][idx] < values[i]:
+                    # So we make the value negative by subtracting a full rotation
+                    values[i] -= 4096
+
+                # Position went above 4095 and got reset to 0
+                elif track["prev"][idx] > values[i]:
+                    # So we add a full rotation
+                    values[i] += 4096
+
+            track["prev"][idx] = values[i]
+
+        return values
+
+    def read_with_motor_ids(self, motor_models, motor_ids, data_name, num_retry=NUM_READ_RETRY):
+        if self.mock:
+            import tests.mock_scservo_sdk as scs
+        else:
+            import scservo_sdk as scs
+
+        return_list = True
+        if not isinstance(motor_ids, list):
+            return_list = False
+            motor_ids = [motor_ids]
+
+        assert_same_address(self.model_ctrl_table, motor_models, data_name)
+        addr, bytes = self.model_ctrl_table[motor_models[0]][data_name]
+        group = scs.GroupSyncRead(self.port_handler, self.packet_handler, addr, bytes)
+        for idx in motor_ids:
+            group.addParam(idx)
+
+        for _ in range(num_retry):
+            comm = group.txRxPacket()
+            if comm == scs.COMM_SUCCESS:
+                break
+
+        if comm != scs.COMM_SUCCESS:
+            raise ConnectionError(
+                f"Read failed due to communication error on port {self.port_handler.port_name} for indices {motor_ids}: "
+                f"{self.packet_handler.getTxRxResult(comm)}"
+            )
+
+        values = []
+        for idx in motor_ids:
+            value = group.getData(idx, addr, bytes)
+            values.append(value)
+
+        if return_list:
+            return values
+        else:
+            return values[0]
+
+    def read(self, data_name, motor_names: str | list[str] | None = None):
+        if self.mock:
+            import tests.mock_scservo_sdk as scs
+        else:
+            import scservo_sdk as scs
+
+        if not self.is_connected:
+            raise RobotDeviceNotConnectedError(
+                f"FeetechMotorsBus({self.port}) is not connected. You need to run `motors_bus.connect()`."
+            )
+
+        start_time = time.perf_counter()
+
+        if motor_names is None:
+            motor_names = self.motor_names
+
+        if isinstance(motor_names, str):
+            motor_names = [motor_names]
+
+        motor_ids = []
+        models = []
+        for name in motor_names:
+            motor_idx, model = self.motors[name]
+            motor_ids.append(motor_idx)
+            models.append(model)
+
+        assert_same_address(self.model_ctrl_table, models, data_name)
+        addr, bytes = self.model_ctrl_table[model][data_name]
+        group_key = get_group_sync_key(data_name, motor_names)
+
+        if group_key not in self.group_readers:
+            # create new group reader
+            self.group_readers[group_key] = scs.GroupSyncRead(
+                self.port_handler, self.packet_handler, addr, bytes
+            )
+            for idx in motor_ids:
+                self.group_readers[group_key].addParam(idx)
+
+        for _ in range(NUM_READ_RETRY):
+            comm = self.group_readers[group_key].txRxPacket()
+            if comm == scs.COMM_SUCCESS:
+                break
+
+        if comm != scs.COMM_SUCCESS:
+            raise ConnectionError(
+                f"Read failed due to communication error on port {self.port} for group_key {group_key}: "
+                f"{self.packet_handler.getTxRxResult(comm)}"
+            )
+
+        values = []
+        for idx in motor_ids:
+            value = self.group_readers[group_key].getData(idx, addr, bytes)
+            values.append(value)
+
+        values = np.array(values)
+
+        # Convert to signed int to use range [-2048, 2048] for our motor positions.
+        if data_name in CONVERT_UINT32_TO_INT32_REQUIRED:
+            values = values.astype(np.int32)
+
+        if data_name in CALIBRATION_REQUIRED:
+            values = self.avoid_rotation_reset(values, motor_names, data_name)
+
+        if data_name in CALIBRATION_REQUIRED and self.calibration is not None:
+            values = self.apply_calibration_autocorrect(values, motor_names)
+
+        # log the number of seconds it took to read the data from the motors
+        delta_ts_name = get_log_name("delta_timestamp_s", "read", data_name, motor_names)
+        self.logs[delta_ts_name] = time.perf_counter() - start_time
+
+        # log the utc time at which the data was received
+        ts_utc_name = get_log_name("timestamp_utc", "read", data_name, motor_names)
+        self.logs[ts_utc_name] = capture_timestamp_utc()
+
+        return values
+
+    def write_with_motor_ids(self, motor_models, motor_ids, data_name, values, num_retry=NUM_WRITE_RETRY):
+        if self.mock:
+            import tests.mock_scservo_sdk as scs
+        else:
+            import scservo_sdk as scs
+
+        if not isinstance(motor_ids, list):
+            motor_ids = [motor_ids]
+        if not isinstance(values, list):
+            values = [values]
+
+        assert_same_address(self.model_ctrl_table, motor_models, data_name)
+        addr, bytes = self.model_ctrl_table[motor_models[0]][data_name]
+        group = scs.GroupSyncWrite(self.port_handler, self.packet_handler, addr, bytes)
+        for idx, value in zip(motor_ids, values, strict=True):
+            data = convert_to_bytes(value, bytes, self.mock)
+            group.addParam(idx, data)
+
+        for _ in range(num_retry):
+            comm = group.txPacket()
+            if comm == scs.COMM_SUCCESS:
+                break
+
+        if comm != scs.COMM_SUCCESS:
+            raise ConnectionError(
+                f"Write failed due to communication error on port {self.port_handler.port_name} for indices {motor_ids}: "
+                f"{self.packet_handler.getTxRxResult(comm)}"
+            )
+
+    def write(self, data_name, values: int | float | np.ndarray, motor_names: str | list[str] | None = None):
+        if not self.is_connected:
+            raise RobotDeviceNotConnectedError(
+                f"FeetechMotorsBus({self.port}) is not connected. You need to run `motors_bus.connect()`."
+            )
+
+        start_time = time.perf_counter()
+
+        if self.mock:
+            import tests.mock_scservo_sdk as scs
+        else:
+            import scservo_sdk as scs
+
+        if motor_names is None:
+            motor_names = self.motor_names
+
+        if isinstance(motor_names, str):
+            motor_names = [motor_names]
+
+        if isinstance(values, (int, float, np.integer)):
+            values = [int(values)] * len(motor_names)
+
+        values = np.array(values)
+
+        motor_ids = []
+        models = []
+        for name in motor_names:
+            motor_idx, model = self.motors[name]
+            motor_ids.append(motor_idx)
+            models.append(model)
+
+        if data_name in CALIBRATION_REQUIRED and self.calibration is not None:
+            values = self.revert_calibration(values, motor_names)
+
+        values = values.tolist()
+
+        assert_same_address(self.model_ctrl_table, models, data_name)
+        addr, bytes = self.model_ctrl_table[model][data_name]
+        group_key = get_group_sync_key(data_name, motor_names)
+
+        init_group = group_key not in self.group_writers
+        if init_group:
+            self.group_writers[group_key] = scs.GroupSyncWrite(
+                self.port_handler, self.packet_handler, addr, bytes
+            )
+
+        for idx, value in zip(motor_ids, values, strict=True):
+            data = convert_to_bytes(value, bytes, self.mock)
+            if init_group:
+                self.group_writers[group_key].addParam(idx, data)
+            else:
+                self.group_writers[group_key].changeParam(idx, data)
+
+        comm = self.group_writers[group_key].txPacket()
+        if comm != scs.COMM_SUCCESS:
+            raise ConnectionError(
+                f"Write failed due to communication error on port {self.port} for group_key {group_key}: "
+                f"{self.packet_handler.getTxRxResult(comm)}"
+            )
+
+        # log the number of seconds it took to write the data to the motors
+        delta_ts_name = get_log_name("delta_timestamp_s", "write", data_name, motor_names)
+        self.logs[delta_ts_name] = time.perf_counter() - start_time
+
+        # TODO(rcadene): should we log the time before sending the write command?
+        # log the utc time when the write has been completed
+        ts_utc_name = get_log_name("timestamp_utc", "write", data_name, motor_names)
+        self.logs[ts_utc_name] = capture_timestamp_utc()
+
+    def disconnect(self):
+        if not self.is_connected:
+            raise RobotDeviceNotConnectedError(
+                f"FeetechMotorsBus({self.port}) is not connected. Try running `motors_bus.connect()` first."
+            )
+
+        if self.port_handler is not None:
+            self.port_handler.closePort()
+            self.port_handler = None
+
+        self.packet_handler = None
+        self.group_readers = {}
+        self.group_writers = {}
+        self.is_connected = False
+
+    def __del__(self):
+        if getattr(self, "is_connected", False):
+            self.disconnect()
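
Review note: the DEGREE branch of `apply_calibration` and its inverse in `revert_calibration` can be checked in isolation. The following is a minimal sketch under stated assumptions, not the bus implementation: `steps_to_degrees` and `degrees_to_steps` are hypothetical helper names, and the resolution of 4096 and the homing offsets used below are assumed example values.

```python
HALF_TURN_DEGREE = 180


def steps_to_degrees(steps, homing_offset, drive_mode, resolution=4096):
    """Hypothetical helper following the DEGREE branch of `apply_calibration`."""
    value = float(steps)
    # Invert the rotation direction first, then shift by the homing offset.
    if drive_mode:
        value *= -1
    value += homing_offset
    # Half a turn (resolution // 2 steps) maps to 180 degrees.
    return value / (resolution // 2) * HALF_TURN_DEGREE


def degrees_to_steps(degrees, homing_offset, drive_mode, resolution=4096):
    """Hypothetical helper following the DEGREE branch of `revert_calibration`."""
    value = degrees / HALF_TURN_DEGREE * (resolution // 2)
    # Undo the two operations above in reverse order.
    value -= homing_offset
    if drive_mode:
        value *= -1
    return int(round(value))
```

With these assumed values, a motor at raw position 3072 with a homing offset of -2048 reads as a quarter turn, and converting back recovers the raw position, which is the round-trip property the two methods rely on.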
--- a/lerobot/common/robot_devices/robots/dynamixel_calibration.py
+++ b/lerobot/common/robot_devices/robots/dynamixel_calibration.py
@@ -0,0 +1,130 @@
+"""Logic to calibrate a robot arm built with dynamixel motors"""
+# TODO(rcadene, aliberts): move this logic into the robot code when refactoring
+
+import numpy as np
+
+from lerobot.common.robot_devices.motors.dynamixel import (
+    CalibrationMode,
+    TorqueMode,
+    convert_degrees_to_steps,
+)
+from lerobot.common.robot_devices.motors.utils import MotorsBus
+
+URL_TEMPLATE = (
+    "https://raw.githubusercontent.com/huggingface/lerobot/main/media/{robot}/{arm}_{position}.webp"
+)
+
+# The following positions are provided in nominal degree range ]-180, +180[
+# For more info on these constants, see comments in the code where they get used.
+ZERO_POSITION_DEGREE = 0
+ROTATED_POSITION_DEGREE = 90
+
+
+def assert_drive_mode(drive_mode):
+    # `drive_mode` is in [0,1], where 0 means the original rotation direction of the motor and 1 means inverted.
+    if not np.all(np.isin(drive_mode, [0, 1])):
+        raise ValueError(f"`drive_mode` contains values other than 0 or 1: ({drive_mode})")
+
+
+def apply_drive_mode(position, drive_mode):
+    assert_drive_mode(drive_mode)
+    # Convert `drive_mode` from [0, 1], where 0 indicates the original rotation direction and 1 inverted,
+    # to [-1, 1], where 1 indicates the original rotation direction and -1 inverted.
+    signed_drive_mode = -(drive_mode * 2 - 1)
+    position *= signed_drive_mode
+    return position
+
+
+def compute_nearest_rounded_position(position, models):
+    delta_turn = convert_degrees_to_steps(ROTATED_POSITION_DEGREE, models)
+    nearest_pos = np.round(position.astype(float) / delta_turn) * delta_turn
+    return nearest_pos.astype(position.dtype)
+
+
+def run_arm_calibration(arm: MotorsBus, robot_type: str, arm_name: str, arm_type: str):
+    """This function ensures that a neural network trained on data collected on a given robot
+    can work on another robot. For instance, before calibration, setting the same goal position
+    for each motor of two different robots will result in two very different positions. But after calibration,
+    the two robots will move to the same position. To this end, this function computes the homing offset
+    and the drive mode for each motor of a given robot.
+
+    Homing offset is used to shift the motor position to a ]-2048, +2048[ nominal range (when the motor uses 2048 steps
+    to complete half a turn). This range is set around an arbitrary "zero position" corresponding to all motor positions
+    being 0. During the calibration process, you will need to manually move the robot to this "zero position".
+
+    Drive mode is used to invert the rotation direction of the motor. This is useful when some motors have been assembled
+    in the opposite orientation for some robots. During the calibration process, you will need to manually move the robot
+    to the "rotated position".
+
+    After calibration, the homing offsets and drive modes are stored in a cache.
+
+    Example of usage:
+    ```python
+    run_arm_calibration(arm, "koch", "left", "follower")
+    ```
+    """
+    if (arm.read("Torque_Enable") != TorqueMode.DISABLED.value).any():
+        raise ValueError("To run calibration, the torque must be disabled on all motors.")
+
+    print(f"\nRunning calibration of {robot_type} {arm_name} {arm_type}...")
+
+    print("\nMove arm to zero position")
+    print("See: " + URL_TEMPLATE.format(robot=robot_type, arm=arm_type, position="zero"))
+    input("Press Enter to continue...")
+
+    # We arbitrarily chose our zero target position to be a straight horizontal position with gripper upwards and closed.
+    # It is easy to identify and all motors are in a "quarter turn" position. Once calibration is done, this position will
+    # correspond to every motor angle being 0. If you set all Goal Positions to 0, the arm will move to this position.
+    zero_target_pos = convert_degrees_to_steps(ZERO_POSITION_DEGREE, arm.motor_models)
+
+    # Compute homing offset so that `present_position + homing_offset ~= target_position`.
+    zero_pos = arm.read("Present_Position")
+    zero_nearest_pos = compute_nearest_rounded_position(zero_pos, arm.motor_models)
+    homing_offset = zero_target_pos - zero_nearest_pos
+
+    # The rotated target position corresponds to a rotation of a quarter turn from the zero position.
+    # This allows us to identify the rotation direction of each motor.
+    # For instance, if the motor rotates 90 degrees, and its value is -90 after applying the homing offset, then we know its rotation direction
+    # is inverted. However, for the calibration to be successful, we need everyone to follow the same target position.
+    # Sometimes, there is only one possible rotation direction. For instance, if the gripper is closed, there is only one direction which
+    # corresponds to opening the gripper. When the rotation direction is ambiguous, we arbitrarily rotate clockwise from the point of view
+    # of the previous motor in the kinematic chain.
+    print("\nMove arm to rotated target position")
+    print("See: " + URL_TEMPLATE.format(robot=robot_type, arm=arm_type, position="rotated"))
+    input("Press Enter to continue...")
+
+    rotated_target_pos = convert_degrees_to_steps(ROTATED_POSITION_DEGREE, arm.motor_models)
+
+    # Find drive mode by rotating each motor by a quarter of a turn.
+    # Drive mode indicates if the motor rotation direction should be inverted (=1) or not (=0).
+    rotated_pos = arm.read("Present_Position")
+    drive_mode = (rotated_pos < zero_pos).astype(np.int32)
+
+    # Re-compute homing offset to take into account drive mode
+    rotated_drived_pos = apply_drive_mode(rotated_pos, drive_mode)
+    rotated_nearest_pos = compute_nearest_rounded_position(rotated_drived_pos, arm.motor_models)
+    homing_offset = rotated_target_pos - rotated_nearest_pos
+
+    print("\nMove arm to rest position")
+    print("See: " + URL_TEMPLATE.format(robot=robot_type, arm=arm_type, position="rest"))
+    input("Press Enter to continue...")
+    print()
+
+    # Joints with rotational motions are expressed in degrees in nominal range of [-180, 180]
+    calib_mode = [CalibrationMode.DEGREE.name] * len(arm.motor_names)
+
+    # TODO(rcadene): make type of joints (DEGREE or LINEAR) configurable from yaml?
+    if robot_type in ["aloha"] and "gripper" in arm.motor_names:
+        # Joints with linear motions (like gripper of Aloha) are expressed in nominal range of [0, 100]
+        calib_idx = arm.motor_names.index("gripper")
+        calib_mode[calib_idx] = CalibrationMode.LINEAR.name
+
+    calib_data = {
+        "homing_offset": homing_offset.tolist(),
+        "drive_mode": drive_mode.tolist(),
+        "start_pos": zero_pos.tolist(),
+        "end_pos": rotated_pos.tolist(),
+        "calib_mode": calib_mode,
+        "motor_names": arm.motor_names,
+    }
+    return calib_data
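
Review note: the two small transformations used by `run_arm_calibration` (the drive-mode sign conversion and the rounding to the nearest quarter turn) can be exercised without hardware. This is a minimal sketch, not the module's code: `signed_drive_mode` and `nearest_quarter_turn` are hypothetical names, and it assumes a single motor resolution of 4096 so that 90 degrees corresponds to 1024 steps, whereas the real code derives this from `convert_degrees_to_steps` and the motor models.

```python
import numpy as np


def signed_drive_mode(drive_mode):
    """Map drive_mode 0 -> +1 (original direction) and 1 -> -1 (inverted)."""
    drive_mode = np.asarray(drive_mode)
    return -(drive_mode * 2 - 1)


def nearest_quarter_turn(position, resolution=4096):
    """Round positions (in steps) to the nearest multiple of a quarter turn."""
    # Assumption: a quarter turn is resolution / 4 steps for every motor.
    delta_turn = resolution // 4
    position = np.asarray(position)
    rounded = np.round(position.astype(float) / delta_turn) * delta_turn
    return rounded.astype(position.dtype)


# Example: positions read near, but not exactly at, quarter-turn multiples.
positions = np.array([1000, 2100, -500])
print(nearest_quarter_turn(positions))
print(signed_drive_mode([0, 1]))
```

Rounding to the nearest quarter turn is what lets the homing offset tolerate a user placing the arm only approximately at the zero or rotated position.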
--- a/lerobot/common/robot_devices/robots/feetech_calibration.py
+++ b/lerobot/common/robot_devices/robots/feetech_calibration.py
@@ -0,0 +1,484 @@
+"""Logic to calibrate a robot arm built with feetech motors"""
+# TODO(rcadene, aliberts): move this logic into the robot code when refactoring
+
+import time
+
+import numpy as np
+
+from lerobot.common.robot_devices.motors.feetech import (
+    CalibrationMode,
+    TorqueMode,
+    convert_degrees_to_steps,
+)
+from lerobot.common.robot_devices.motors.utils import MotorsBus
+
+URL_TEMPLATE = (
+    "https://raw.githubusercontent.com/huggingface/lerobot/main/media/{robot}/{arm}_{position}.webp"
+)
+
+# The following positions are provided in nominal degree range ]-180, +180[
+# For more info on these constants, see comments in the code where they get used.
+ZERO_POSITION_DEGREE = 0
+ROTATED_POSITION_DEGREE = 90
+
+
+def assert_drive_mode(drive_mode):
+    # `drive_mode` is in [0,1], where 0 means the original rotation direction of the motor and 1 means inverted.
+    if not np.all(np.isin(drive_mode, [0, 1])):
+        raise ValueError(f"`drive_mode` contains values other than 0 or 1: ({drive_mode})")
+
+
+def apply_drive_mode(position, drive_mode):
+    assert_drive_mode(drive_mode)
+    # Convert `drive_mode` from [0, 1], where 0 indicates the original rotation direction and 1 inverted,
+    # to [-1, 1], where 1 indicates the original rotation direction and -1 inverted.
+    signed_drive_mode = -(drive_mode * 2 - 1)
+    position *= signed_drive_mode
+    return position
+
+
+def move_until_block(arm, motor_name, positive_direction=True, while_move_hook=None):
+    count = 0
+    while True:
+        present_pos = arm.read("Present_Position", motor_name)
+        if positive_direction:
+            # Move +100 steps every time. Lower the steps to lower the speed at which the arm moves.
+            arm.write("Goal_Position", present_pos + 100, motor_name)
+        else:
+            arm.write("Goal_Position", present_pos - 100, motor_name)
+
+        if while_move_hook is not None:
+            while_move_hook()
+
+        present_pos = arm.read("Present_Position", motor_name).item()
+        present_speed = arm.read("Present_Speed", motor_name).item()
+        present_current = arm.read("Present_Current", motor_name).item()
+        # present_load = arm.read("Present_Load", motor_name).item()
+        # present_voltage = arm.read("Present_Voltage", motor_name).item()
+        # present_temperature = arm.read("Present_Temperature", motor_name).item()
+
+        # print(f"{present_pos=}")
+        # print(f"{present_speed=}")
+        # print(f"{present_current=}")
+        # print(f"{present_load=}")
+        # print(f"{present_voltage=}")
+        # print(f"{present_temperature=}")
+
+        if present_speed == 0 and present_current > 40:
+            count += 1
+            if count > 100 or present_current > 300:
+                return present_pos
+        else:
+            count = 0
+
+
+def move_to_calibrate(
+    arm,
+    motor_name,
+    invert_drive_mode=False,
+    positive_first=True,
+    in_between_move_hook=None,
+    while_move_hook=None,
+):
+    initial_pos = arm.read("Present_Position", motor_name)
+
+    if positive_first:
+        p_present_pos = move_until_block(
+            arm, motor_name, positive_direction=True, while_move_hook=while_move_hook
+        )
+    else:
+        n_present_pos = move_until_block(
+            arm, motor_name, positive_direction=False, while_move_hook=while_move_hook
+        )
+
+    if in_between_move_hook is not None:
+        in_between_move_hook()
+
+    if positive_first:
+        n_present_pos = move_until_block(
+            arm, motor_name, positive_direction=False, while_move_hook=while_move_hook
+        )
+    else:
+        p_present_pos = move_until_block(
+            arm, motor_name, positive_direction=True, while_move_hook=while_move_hook
+        )
+
+    zero_pos = (n_present_pos + p_present_pos) / 2
+
+    calib_data = {
+        "initial_pos": initial_pos,
+        "homing_offset": zero_pos if invert_drive_mode else -zero_pos,
+        "invert_drive_mode": invert_drive_mode,
+        "drive_mode": -1 if invert_drive_mode else 0,
+        "zero_pos": zero_pos,
+        "start_pos": n_present_pos if invert_drive_mode else p_present_pos,
+        "end_pos": p_present_pos if invert_drive_mode else n_present_pos,
+    }
+    return calib_data
+
+
+def apply_offset(calib, offset):
+    calib["zero_pos"] += offset
+    if calib["drive_mode"]:
+        calib["homing_offset"] += offset
+    else:
+        calib["homing_offset"] -= offset
+    return calib
+
+
+def run_arm_auto_calibration(arm: MotorsBus, robot_type: str, arm_name: str, arm_type: str):
+    if robot_type == "so100":
+        return run_arm_auto_calibration_so100(arm, robot_type, arm_name, arm_type)
+    elif robot_type == "moss":
+        return run_arm_auto_calibration_moss(arm, robot_type, arm_name, arm_type)
+    else:
+        raise ValueError(robot_type)
+
+
+def run_arm_auto_calibration_so100(arm: MotorsBus, robot_type: str, arm_name: str, arm_type: str):
+    """All the offsets and magic numbers are hand tuned, and are unique to SO-100 follower arms"""
+    if (arm.read("Torque_Enable") != TorqueMode.DISABLED.value).any():
+        raise ValueError("To run calibration, the torque must be disabled on all motors.")
+
+    if not (robot_type == "so100" and arm_type == "follower"):
+        raise NotImplementedError("Auto calibration only supports the follower of so100 arms for now.")
+
+    print(f"\nRunning calibration of {robot_type} {arm_name} {arm_type}...")
+
+    print("\nMove arm to initial position")
+    print("See: " + URL_TEMPLATE.format(robot=robot_type, arm=arm_type, position="initial"))
+    input("Press Enter to continue...")
+
+    # Lower the acceleration of the motors (in [0,254])
+    initial_acceleration = arm.read("Acceleration")
+    arm.write("Lock", 0)
+    arm.write("Acceleration", 10)
+    time.sleep(1)
+
+    arm.write("Torque_Enable", TorqueMode.ENABLED.value)
+
+    print(f'{arm.read("Present_Position", "elbow_flex")=}')
+
+    calib = {}
+
+    init_wf_pos = arm.read("Present_Position", "wrist_flex")
+    init_sl_pos = arm.read("Present_Position", "shoulder_lift")
+    init_ef_pos = arm.read("Present_Position", "elbow_flex")
+    arm.write("Goal_Position", init_wf_pos - 800, "wrist_flex")
+    arm.write("Goal_Position", init_sl_pos + 150 + 1024, "shoulder_lift")
+    arm.write("Goal_Position", init_ef_pos - 2048, "elbow_flex")
+    time.sleep(2)
+
+    print("Calibrate shoulder_pan")
+    calib["shoulder_pan"] = move_to_calibrate(arm, "shoulder_pan")
+    arm.write("Goal_Position", calib["shoulder_pan"]["zero_pos"], "shoulder_pan")
+    time.sleep(1)
+
+    print("Calibrate gripper")
+    calib["gripper"] = move_to_calibrate(arm, "gripper", invert_drive_mode=True)
+    time.sleep(1)
+
+    print("Calibrate wrist_flex")
+    calib["wrist_flex"] = move_to_calibrate(arm, "wrist_flex")
+    calib["wrist_flex"] = apply_offset(calib["wrist_flex"], offset=80)
+
+    def in_between_move_hook():
+        nonlocal arm, calib
+        time.sleep(2)
+        ef_pos = arm.read("Present_Position", "elbow_flex")
+        sl_pos = arm.read("Present_Position", "shoulder_lift")
+        arm.write("Goal_Position", ef_pos + 1024, "elbow_flex")
+        arm.write("Goal_Position", sl_pos - 1024, "shoulder_lift")
+        time.sleep(2)
+
+    print("Calibrate elbow_flex")
+    calib["elbow_flex"] = move_to_calibrate(
+        arm, "elbow_flex", positive_first=False, in_between_move_hook=in_between_move_hook
+    )
+    calib["elbow_flex"] = apply_offset(calib["elbow_flex"], offset=80 - 1024)
+
+    arm.write("Goal_Position", calib["elbow_flex"]["zero_pos"] + 1024 + 512, "elbow_flex")
+    time.sleep(1)
+
+    def in_between_move_hook():
+        nonlocal arm, calib
+        arm.write("Goal_Position", calib["elbow_flex"]["zero_pos"], "elbow_flex")
+
+    print("Calibrate shoulder_lift")
+    calib["shoulder_lift"] = move_to_calibrate(
+        arm,
+        "shoulder_lift",
+        invert_drive_mode=True,
+        positive_first=False,
+        in_between_move_hook=in_between_move_hook,
+    )
+    # Apply a hand-tuned offset to align shoulder_lift with the body
+    calib["shoulder_lift"] = apply_offset(calib["shoulder_lift"], offset=1024 - 50)
+
+    def while_move_hook():
+        nonlocal arm, calib
+        positions = {
+            "shoulder_lift": round(calib["shoulder_lift"]["zero_pos"] - 1600),
+            "elbow_flex": round(calib["elbow_flex"]["zero_pos"] + 1700),
+            "wrist_flex": round(calib["wrist_flex"]["zero_pos"] + 800),
+            "gripper": round(calib["gripper"]["end_pos"]),
+        }
+        arm.write("Goal_Position", list(positions.values()), list(positions.keys()))
+
+    arm.write("Goal_Position", round(calib["shoulder_lift"]["zero_pos"] - 1600), "shoulder_lift")
+    time.sleep(2)
+    arm.write("Goal_Position", round(calib["elbow_flex"]["zero_pos"] + 1700), "elbow_flex")
+    time.sleep(2)
+    arm.write("Goal_Position", round(calib["wrist_flex"]["zero_pos"] + 800), "wrist_flex")
+    time.sleep(2)
+    arm.write("Goal_Position", round(calib["gripper"]["end_pos"]), "gripper")
+    time.sleep(2)
+
+    print("Calibrate wrist_roll")
+    calib["wrist_roll"] = move_to_calibrate(
+        arm, "wrist_roll", invert_drive_mode=True, positive_first=False, while_move_hook=while_move_hook
+    )
+
+    arm.write("Goal_Position", calib["wrist_roll"]["zero_pos"], "wrist_roll")
+    time.sleep(1)
+    arm.write("Goal_Position", calib["gripper"]["start_pos"], "gripper")
+    time.sleep(1)
+    arm.write("Goal_Position", calib["wrist_flex"]["zero_pos"], "wrist_flex")
+    time.sleep(1)
+    arm.write("Goal_Position", calib["elbow_flex"]["zero_pos"] + 2048, "elbow_flex")
+    arm.write("Goal_Position", calib["shoulder_lift"]["zero_pos"] - 2048, "shoulder_lift")
+    time.sleep(1)
+    arm.write("Goal_Position", calib["shoulder_pan"]["zero_pos"], "shoulder_pan")
+    time.sleep(1)
+
+    calib_modes = []
+    for name in arm.motor_names:
+        if name == "gripper":
+            calib_modes.append(CalibrationMode.LINEAR.name)
+        else:
+            calib_modes.append(CalibrationMode.DEGREE.name)
+
+    calib_dict = {
+        "homing_offset": [calib[name]["homing_offset"] for name in arm.motor_names],
+        "drive_mode": [calib[name]["drive_mode"] for name in arm.motor_names],
+        "start_pos": [calib[name]["start_pos"] for name in arm.motor_names],
+        "end_pos": [calib[name]["end_pos"] for name in arm.motor_names],
+        "calib_mode": calib_modes,
+        "motor_names": arm.motor_names,
+    }
+
+    # Restore the original acceleration
+    arm.write("Lock", 0)
+    arm.write("Acceleration", initial_acceleration)
+    time.sleep(1)
+
+    return calib_dict
+
+
+def run_arm_auto_calibration_moss(arm: MotorsBus, robot_type: str, arm_name: str, arm_type: str):
+    """All the offsets and magic numbers are hand tuned, and are unique to Moss follower arms"""
+    if (arm.read("Torque_Enable") != TorqueMode.DISABLED.value).any():
+        raise ValueError("To run calibration, the torque must be disabled on all motors.")
+
+    if not (robot_type == "moss" and arm_type == "follower"):
+        raise NotImplementedError("Auto calibration only supports the follower of moss arms for now.")
+
+    print(f"\nRunning calibration of {robot_type} {arm_name} {arm_type}...")
+
+    print("\nMove arm to initial position")
+    print("See: " + URL_TEMPLATE.format(robot=robot_type, arm=arm_type, position="initial"))
+    input("Press Enter to continue...")
+
+    # Lower the acceleration of the motors (in [0,254])
+    initial_acceleration = arm.read("Acceleration")
+    arm.write("Lock", 0)
+    arm.write("Acceleration", 10)
+    time.sleep(1)
+
+    arm.write("Torque_Enable", TorqueMode.ENABLED.value)
+
+    sl_pos = arm.read("Present_Position", "shoulder_lift")
+    arm.write("Goal_Position", sl_pos - 1024 - 450, "shoulder_lift")
+    ef_pos = arm.read("Present_Position", "elbow_flex")
+    arm.write("Goal_Position", ef_pos + 1024 + 450, "elbow_flex")
+    time.sleep(2)
+
+    calib = {}
+
+    print("Calibrate shoulder_pan")
+    calib["shoulder_pan"] = move_to_calibrate(arm, "shoulder_pan")
+    arm.write("Goal_Position", calib["shoulder_pan"]["zero_pos"], "shoulder_pan")
+    time.sleep(1)
+
+    print("Calibrate gripper")
+    calib["gripper"] = move_to_calibrate(arm, "gripper", invert_drive_mode=True)
+    time.sleep(1)
+
+    print("Calibrate wrist_flex")
+    calib["wrist_flex"] = move_to_calibrate(arm, "wrist_flex", invert_drive_mode=True)
+    calib["wrist_flex"] = apply_offset(calib["wrist_flex"], offset=-210 + 1024)
+
+    wr_pos = arm.read("Present_Position", "wrist_roll")
+    arm.write("Goal_Position", calib["wrist_flex"]["zero_pos"] - 1024, "wrist_flex")
+    time.sleep(1)
+    arm.write("Goal_Position", wr_pos - 1024, "wrist_roll")
+    time.sleep(1)
+    arm.write("Goal_Position", calib["wrist_flex"]["zero_pos"] - 2048, "wrist_flex")
+    time.sleep(1)
+    arm.write("Goal_Position", calib["gripper"]["end_pos"], "gripper")
+    time.sleep(1)
+
+    print("Calibrate wrist_roll")
+    calib["wrist_roll"] = move_to_calibrate(arm, "wrist_roll", invert_drive_mode=True)
+    calib["wrist_roll"] = apply_offset(calib["wrist_roll"], offset=790)
+
+    arm.write("Goal_Position", calib["wrist_roll"]["zero_pos"] - 1024, "wrist_roll")
+    arm.write("Goal_Position", calib["gripper"]["start_pos"], "gripper")
+    arm.write("Goal_Position", calib["wrist_flex"]["zero_pos"] - 1024, "wrist_flex")
+    time.sleep(1)
+    arm.write("Goal_Position", calib["wrist_roll"]["zero_pos"], "wrist_roll")
+    arm.write("Goal_Position", calib["wrist_flex"]["zero_pos"] - 2048, "wrist_flex")
+
+    def in_between_move_elbow_flex_hook():
+        nonlocal arm, calib
+        arm.write("Goal_Position", calib["wrist_flex"]["zero_pos"], "wrist_flex")
+
+    print("Calibrate elbow_flex")
+    calib["elbow_flex"] = move_to_calibrate(
+        arm,
+        "elbow_flex",
+        invert_drive_mode=True,
+        in_between_move_hook=in_between_move_elbow_flex_hook,
+    )
+    arm.write("Goal_Position", calib["wrist_flex"]["zero_pos"] - 1024, "wrist_flex")
+
+    def in_between_move_shoulder_lift_hook():
+        nonlocal arm, calib
+        sl = arm.read("Present_Position", "shoulder_lift")
+        arm.write("Goal_Position", sl - 1500, "shoulder_lift")
+        time.sleep(1)
+        arm.write("Goal_Position", calib["elbow_flex"]["zero_pos"] + 1536, "elbow_flex")
+        time.sleep(1)
+        arm.write("Goal_Position", calib["wrist_flex"]["start_pos"], "wrist_flex")
+        time.sleep(1)
+
+    print("Calibrate shoulder_lift")
+    calib["shoulder_lift"] = move_to_calibrate(
+        arm, "shoulder_lift", in_between_move_hook=in_between_move_shoulder_lift_hook
+    )
+    calib["shoulder_lift"] = apply_offset(calib["shoulder_lift"], offset=-1024)
+
+    arm.write("Goal_Position", calib["wrist_flex"]["zero_pos"] - 1024, "wrist_flex")
+    time.sleep(1)
+    arm.write("Goal_Position", calib["shoulder_lift"]["zero_pos"] + 2048, "shoulder_lift")
+    arm.write("Goal_Position", calib["elbow_flex"]["zero_pos"] - 1024 - 400, "elbow_flex")
+    time.sleep(2)
+
+    calib_modes = []
+    for name in arm.motor_names:
+        if name == "gripper":
+            calib_modes.append(CalibrationMode.LINEAR.name)
+        else:
+            calib_modes.append(CalibrationMode.DEGREE.name)
+
+    calib_dict = {
+        "homing_offset": [calib[name]["homing_offset"] for name in arm.motor_names],
+        "drive_mode": [calib[name]["drive_mode"] for name in arm.motor_names],
+        "start_pos": [calib[name]["start_pos"] for name in arm.motor_names],
+        "end_pos": [calib[name]["end_pos"] for name in arm.motor_names],
+        "calib_mode": calib_modes,
+        "motor_names": arm.motor_names,
+    }
+
+    # Restore the original acceleration
+    arm.write("Lock", 0)
+    arm.write("Acceleration", initial_acceleration)
+    time.sleep(1)
+
+    return calib_dict
+
+
+def run_arm_manual_calibration(arm: MotorsBus, robot_type: str, arm_name: str, arm_type: str):
+    """This function ensures that a neural network trained on data collected on a given robot
+    can work on another robot. For instance, before calibration, setting the same goal position
+    for each motor of two different robots will result in two very different positions. But after calibration,
+    the two robots will move to the same position. To this end, this function computes the homing offset
+    and the drive mode for each motor of a given robot.
+
+    Homing offset is used to shift the motor position to a ]-2048, +2048[ nominal range (when the motor uses 2048 steps
+    to complete half a turn). This range is set around an arbitrary "zero position" corresponding to all motor positions
+    being 0. During the calibration process, you will need to manually move the robot to this "zero position".
+
+    Drive mode is used to invert the rotation direction of the motor. This is useful when some motors have been assembled
+    in the opposite orientation for some robots. During the calibration process, you will need to manually move the robot
+    to the "rotated position".
+
+    After calibration, the homing offsets and drive modes are stored in a cache.
+
+    Example of usage:
+    ```python
+    run_arm_manual_calibration(arm, "so100", "left", "follower")
+    ```
+    """
+    if (arm.read("Torque_Enable") != TorqueMode.DISABLED.value).any():
+        raise ValueError("To run calibration, the torque must be disabled on all motors.")
+
+    print(f"\nRunning calibration of {robot_type} {arm_name} {arm_type}...")
+
+    print("\nMove arm to zero position")
+    print("See: " + URL_TEMPLATE.format(robot=robot_type, arm=arm_type, position="zero"))
+    input("Press Enter to continue...")
+
+    # We arbitrarily chose our zero target position to be a straight horizontal position with gripper upwards and closed.
+    # It is easy to identify and all motors are in a "quarter turn" position. Once calibration is done, this position will
+    # correspond to every motor angle being 0. If you set all 0 as Goal Position, the arm will move to this position.
+    zero_target_pos = convert_degrees_to_steps(ZERO_POSITION_DEGREE, arm.motor_models)
+
+    # Compute homing offset so that `present_position + homing_offset ~= target_position`.
+    zero_pos = arm.read("Present_Position")
+    homing_offset = zero_target_pos - zero_pos
+
+    # The rotated target position corresponds to a rotation of a quarter turn from the zero position.
+    # This makes it possible to identify the rotation direction of each motor.
+    # For instance, if the motor rotates 90 degrees, and its value is -90 after applying the homing offset, then we know its rotation direction
+    # is inverted. However, for the calibration to be successful, we need everyone to follow the same target position.
+    # Sometimes, there is only one possible rotation direction. For instance, if the gripper is closed, there is only one direction which
+    # corresponds to opening the gripper. When the rotation direction is ambiguous, we arbitrarily rotate clockwise from the point of view
+    # of the previous motor in the kinematic chain.
+    print("\nMove arm to rotated target position")
+    print("See: " + URL_TEMPLATE.format(robot=robot_type, arm=arm_type, position="rotated"))
+    input("Press Enter to continue...")
+
+    rotated_target_pos = convert_degrees_to_steps(ROTATED_POSITION_DEGREE, arm.motor_models)
+
+    # Find drive mode by rotating each motor by a quarter of a turn.
+    # Drive mode indicates if the motor rotation direction should be inverted (=1) or not (=0).
+    rotated_pos = arm.read("Present_Position")
+    drive_mode = (rotated_pos < zero_pos).astype(np.int32)
+
+    # Re-compute homing offset to take into account drive mode
+    rotated_drived_pos = apply_drive_mode(rotated_pos, drive_mode)
+    homing_offset = rotated_target_pos - rotated_drived_pos
+
+    print("\nMove arm to rest position")
+    print("See: " + URL_TEMPLATE.format(robot=robot_type, arm=arm_type, position="rest"))
+    input("Press Enter to continue...")
+    print()
+
+    # Joints with rotational motions are expressed in degrees in nominal range of [-180, 180]
+    calib_modes = []
+    for name in arm.motor_names:
+        if name == "gripper":
+            calib_modes.append(CalibrationMode.LINEAR.name)
+        else:
+            calib_modes.append(CalibrationMode.DEGREE.name)
+
+    calib_dict = {
+        "homing_offset": homing_offset.tolist(),
+        "drive_mode": drive_mode.tolist(),
+        "start_pos": zero_pos.tolist(),
+        "end_pos": rotated_pos.tolist(),
+        "calib_mode": calib_modes,
+        "motor_names": arm.motor_names,
+    }
+    return calib_dict
--- a/lerobot/common/robot_devices/robots/manipulator.py
+++ b/lerobot/common/robot_devices/robots/manipulator.py
@@ -1,3 +1,9 @@
+"""Contains logic to instantiate a robot, read information from its motors and cameras,
+and send orders to its motors.
+"""
+# TODO(rcadene, aliberts): reorganize the codebase into one file per robot, with the associated
+# calibration procedure, to make it easy for people to add their own robot.
+
 import json
 import logging
 import time
@@ -10,138 +16,10 @@ import numpy as np
 import torch

 from lerobot.common.robot_devices.cameras.utils import Camera
-from lerobot.common.robot_devices.motors.dynamixel import (
-    CalibrationMode,
-    TorqueMode,
-    convert_degrees_to_steps,
-)
 from lerobot.common.robot_devices.motors.utils import MotorsBus
 from lerobot.common.robot_devices.robots.utils import get_arm_id
 from lerobot.common.robot_devices.utils import RobotDeviceAlreadyConnectedError, RobotDeviceNotConnectedError

-########################################################################
-# Calibration logic
-########################################################################
-
-URL_TEMPLATE = (
-    "https://raw.githubusercontent.com/huggingface/lerobot/main/media/{robot}/{arm}_{position}.webp"
-)
-
-# The following positions are provided in nominal degree range ]-180, +180[
-# For more info on these constants, see comments in the code where they get used.
-ZERO_POSITION_DEGREE = 0
-ROTATED_POSITION_DEGREE = 90
-
-
-def assert_drive_mode(drive_mode):
-    # `drive_mode` is in [0,1] with 0 means original rotation direction for the motor, and 1 means inverted.
-    if not np.all(np.isin(drive_mode, [0, 1])):
-        raise ValueError(f"`drive_mode` contains values other than 0 or 1: ({drive_mode})")
-
-
-def apply_drive_mode(position, drive_mode):
-    assert_drive_mode(drive_mode)
-    # Convert `drive_mode` from [0, 1] with 0 indicates original rotation direction and 1 inverted,
-    # to [-1, 1] with 1 indicates original rotation direction and -1 inverted.
-    signed_drive_mode = -(drive_mode * 2 - 1)
-    position *= signed_drive_mode
-    return position
-
-
-def compute_nearest_rounded_position(position, models):
-    delta_turn = convert_degrees_to_steps(ROTATED_POSITION_DEGREE, models)
-    nearest_pos = np.round(position.astype(float) / delta_turn) * delta_turn
-    return nearest_pos.astype(position.dtype)
-
-
-def run_arm_calibration(arm: MotorsBus, robot_type: str, arm_name: str, arm_type: str):
-    """This function ensures that a neural network trained on data collected on a given robot
-    can work on another robot. For instance before calibration, setting a same goal position
-    for each motor of two different robots will get two very different positions. But after calibration,
-    the two robots will move to the same position.To this end, this function computes the homing offset
-    and the drive mode for each motor of a given robot.
-
-    Homing offset is used to shift the motor position to a ]-2048, +2048[ nominal range (when the motor uses 2048 steps
-    to complete a half a turn). This range is set around an arbitrary "zero position" corresponding to all motor positions
-    being 0. During the calibration process, you will need to manually move the robot to this "zero position".
-
-    Drive mode is used to invert the rotation direction of the motor. This is useful when some motors have been assembled
-    in the opposite orientation for some robots. During the calibration process, you will need to manually move the robot
-    to the "rotated position".
-
-    After calibration, the homing offsets and drive modes are stored in a cache.
-
-    Example of usage:
-    ```python
-    run_arm_calibration(arm, "koch", "left", "follower")
-    ```
-    """
-    if (arm.read("Torque_Enable") != TorqueMode.DISABLED.value).any():
-        raise ValueError("To run calibration, the torque must be disabled on all motors.")
-
-    print(f"\nRunning calibration of {robot_type} {arm_name} {arm_type}...")
-
-    print("\nMove arm to zero position")
-    print("See: " + URL_TEMPLATE.format(robot=robot_type, arm=arm_type, position="zero"))
-    input("Press Enter to continue...")
-
-    # We arbitrarily chose our zero target position to be a straight horizontal position with gripper upwards and closed.
-    # It is easy to identify and all motors are in a "quarter turn" position. Once calibration is done, this position will
-    # correspond to every motor angle being 0. If you set all 0 as Goal Position, the arm will move in this position.
-    zero_target_pos = convert_degrees_to_steps(ZERO_POSITION_DEGREE, arm.motor_models)
-
-    # Compute homing offset so that `present_position + homing_offset ~= target_position`.
-    zero_pos = arm.read("Present_Position")
-    zero_nearest_pos = compute_nearest_rounded_position(zero_pos, arm.motor_models)
-    homing_offset = zero_target_pos - zero_nearest_pos
-
-    # The rotated target position corresponds to a rotation of a quarter turn from the zero position.
-    # This allows to identify the rotation direction of each motor.
-    # For instance, if the motor rotates 90 degree, and its value is -90 after applying the homing offset, then we know its rotation direction
-    # is inverted. However, for the calibration being successful, we need everyone to follow the same target position.
-    # Sometimes, there is only one possible rotation direction. For instance, if the gripper is closed, there is only one direction which
-    # corresponds to opening the gripper. When the rotation direction is ambiguous, we arbitrarely rotate clockwise from the point of view
-    # of the previous motor in the kinetic chain.
-    print("\nMove arm to rotated target position")
-    print("See: " + URL_TEMPLATE.format(robot=robot_type, arm=arm_type, position="rotated"))
-    input("Press Enter to continue...")
-
-    rotated_target_pos = convert_degrees_to_steps(ROTATED_POSITION_DEGREE, arm.motor_models)
-
-    # Find drive mode by rotating each motor by a quarter of a turn.
-    # Drive mode indicates if the motor rotation direction should be inverted (=1) or not (=0).
-    rotated_pos = arm.read("Present_Position")
-    drive_mode = (rotated_pos < zero_pos).astype(np.int32)
-
-    # Re-compute homing offset to take into account drive mode
-    rotated_drived_pos = apply_drive_mode(rotated_pos, drive_mode)
-    rotated_nearest_pos = compute_nearest_rounded_position(rotated_drived_pos, arm.motor_models)
-    homing_offset = rotated_target_pos - rotated_nearest_pos
-
-    print("\nMove arm to rest position")
-    print("See: " + URL_TEMPLATE.format(robot=robot_type, arm=arm_type, position="rest"))
-    input("Press Enter to continue...")
-    print()
-
-    # Joints with rotational motions are expressed in degrees in nominal range of [-180, 180]
-    calib_mode = [CalibrationMode.DEGREE.name] * len(arm.motor_names)
-
-    # TODO(rcadene): make type of joints (DEGREE or LINEAR) configurable from yaml?
-    if robot_type == "aloha" and "gripper" in arm.motor_names:
-        # Joints with linear motions (like gripper of Aloha) are experessed in nominal range of [0, 100]
-        calib_idx = arm.motor_names.index("gripper")
-        calib_mode[calib_idx] = CalibrationMode.LINEAR.name
-
-    calib_data = {
-        "homing_offset": homing_offset.tolist(),
-        "drive_mode": drive_mode.tolist(),
-        "start_pos": zero_pos.tolist(),
-        "end_pos": rotated_pos.tolist(),
-        "calib_mode": calib_mode,
-        "motor_names": arm.motor_names,
-    }
-    return calib_data
-

 def ensure_safe_goal_position(
    goal_pos: torch.Tensor, present_pos: torch.Tensor, max_relative_target: float | list[float]
@@ -163,11 +41,6 @@ def ensure_safe_goal_position(
    return safe_goal_pos


-########################################################################
-# Manipulator robot
-########################################################################
-
-
@dataclass
 class ManipulatorRobotConfig:
    """
@@ -178,7 +51,7 @@ class ManipulatorRobotConfig:
    """

    # Define all components of the robot
-    robot_type: str | None = None
+    robot_type: str = "koch"
    leader_arms: dict[str, MotorsBus] = field(default_factory=lambda: {})
    follower_arms: dict[str, MotorsBus] = field(default_factory=lambda: {})
    cameras: dict[str, Camera] = field(default_factory=lambda: {})
@@ -207,6 +80,10 @@ class ManipulatorRobotConfig:
                    )
        super().__setattr__(prop, val)

+    def __post_init__(self):
+        if self.robot_type not in ["koch", "koch_bimanual", "aloha", "so100", "moss"]:
+            raise ValueError(f"Provided robot type ({self.robot_type}) is not supported.")
+

 class ManipulatorRobot:
    # TODO(rcadene): Implement force feedback
@@ -349,6 +226,25 @@ class ManipulatorRobot:
        self.is_connected = False
        self.logs = {}

+    @property
+    def has_camera(self):
+        return len(self.cameras) > 0
+
+    @property
+    def num_cameras(self):
+        return len(self.cameras)
+
+    @property
+    def available_arms(self):
+        available_arms = []
+        for name in self.follower_arms:
+            arm_id = get_arm_id(name, "follower")
+            available_arms.append(arm_id)
+        for name in self.leader_arms:
+            arm_id = get_arm_id(name, "leader")
+            available_arms.append(arm_id)
+        return available_arms
+
    def connect(self):
        if self.is_connected:
            raise RobotDeviceAlreadyConnectedError(
@@ -364,9 +260,15 @@ class ManipulatorRobot:
        for name in self.follower_arms:
            print(f"Connecting {name} follower arm.")
            self.follower_arms[name].connect()
+        for name in self.leader_arms:
            print(f"Connecting {name} leader arm.")
            self.leader_arms[name].connect()

+        if self.robot_type in ["koch", "koch_bimanual", "aloha"]:
+            from lerobot.common.robot_devices.motors.dynamixel import TorqueMode
+        elif self.robot_type in ["so100", "moss"]:
+            from lerobot.common.robot_devices.motors.feetech import TorqueMode
+
        # We assume that at connection time, arms are in a rest position, and torque can
        # be safely disabled to run calibration and/or set robot preset configurations.
        for name in self.follower_arms:
@@ -377,12 +279,12 @@ class ManipulatorRobot:
        self.activate_calibration()

        # Set robot preset (e.g. torque in leader gripper for Koch v1.1)
-        if self.robot_type == "koch":
+        if self.robot_type in ["koch", "koch_bimanual"]:
            self.set_koch_robot_preset()
        elif self.robot_type == "aloha":
            self.set_aloha_robot_preset()
-        else:
-            warnings.warn(f"No preset found for robot type: {self.robot_type}", stacklevel=1)
+        elif self.robot_type in ["so100", "moss"]:
+            self.set_so100_robot_preset()

        # Enable torque on all motors of the follower arms
        for name in self.follower_arms:
@@ -390,12 +292,22 @@ class ManipulatorRobot:
            self.follower_arms[name].write("Torque_Enable", 1)

        if self.config.gripper_open_degree is not None:
+            if self.robot_type not in ["koch", "koch_bimanual"]:
+                raise NotImplementedError(
+                    f"{self.robot_type} does not support position AND current control in the handle, which is required to set the gripper open."
+                )
            # Set the leader arm in torque mode with the gripper motor set to an angle. This makes it possible
            # to squeeze the gripper and have it spring back to an open position on its own.
            for name in self.leader_arms:
                self.leader_arms[name].write("Torque_Enable", 1, "gripper")
                self.leader_arms[name].write("Goal_Position", self.config.gripper_open_degree, "gripper")

+        # Check both arms can be read
+        for name in self.follower_arms:
+            self.follower_arms[name].read("Present_Position")
+        for name in self.leader_arms:
+            self.leader_arms[name].read("Present_Position")
+
        # Connect the cameras
        for name in self.cameras:
            self.cameras[name].connect()
@@ -416,8 +328,20 @@ class ManipulatorRobot:
                with open(arm_calib_path) as f:
                    calibration = json.load(f)
            else:
+                # TODO(rcadene): display a warning in __init__ if calibration file not available
                print(f"Missing calibration file '{arm_calib_path}'")
-                calibration = run_arm_calibration(arm, self.robot_type, name, arm_type)
+
+                if self.robot_type in ["koch", "koch_bimanual", "aloha"]:
+                    from lerobot.common.robot_devices.robots.dynamixel_calibration import run_arm_calibration
+
+                    calibration = run_arm_calibration(arm, self.robot_type, name, arm_type)
+
+                elif self.robot_type in ["so100", "moss"]:
+                    from lerobot.common.robot_devices.robots.feetech_calibration import (
+                        run_arm_manual_calibration,
+                    )
+
+                    calibration = run_arm_manual_calibration(arm, self.robot_type, name, arm_type)

                print(f"Calibration is done! Saving calibration file '{arm_calib_path}'")
                arm_calib_path.parent.mkdir(parents=True, exist_ok=True)
@@ -435,6 +359,8 @@ class ManipulatorRobot:

    def set_koch_robot_preset(self):
        def set_operating_mode_(arm):
+            from lerobot.common.robot_devices.motors.dynamixel import TorqueMode
+
            if (arm.read("Torque_Enable") != TorqueMode.DISABLED.value).any():
                raise ValueError("To run set robot preset, the torque must be disabled on all motors.")

@@ -522,6 +448,23 @@ class ManipulatorRobot:
                stacklevel=1,
            )

+    def set_so100_robot_preset(self):
+        for name in self.follower_arms:
+            # Mode=0 for Position Control
+            self.follower_arms[name].write("Mode", 0)
+            # Set P_Coefficient to lower value to avoid shakiness (Default is 32)
+            self.follower_arms[name].write("P_Coefficient", 16)
+            # Set I_Coefficient and D_Coefficient to their default values (0 and 32)
+            self.follower_arms[name].write("I_Coefficient", 0)
+            self.follower_arms[name].write("D_Coefficient", 32)
+            # Close the write lock so that Maximum_Acceleration gets written to EPROM address,
+            # which is mandatory for Maximum_Acceleration to take effect after rebooting.
+            self.follower_arms[name].write("Lock", 0)
+            # Set Maximum_Acceleration to 254 to speed up acceleration and deceleration of
+            # the motors. Note: this configuration is not in the official STS3215 Memory Table.
+            self.follower_arms[name].write("Maximum_Acceleration", 254)
+            self.follower_arms[name].write("Acceleration", 254)
+
    def teleop_step(
        self, record_data=False
    ) -> None | tuple[dict[str, torch.Tensor], dict[str, torch.Tensor]]:
--- a/lerobot/common/utils/utils.py
+++ b/lerobot/common/utils/utils.py
@@ -16,6 +16,7 @@
 import logging
 import os
 import os.path as osp
+import platform
 import random
 from contextlib import contextmanager
 from datetime import datetime, timezone
@@ -28,6 +29,12 @@ import torch
 from omegaconf import DictConfig


+def none_or_int(value):
+    if value == "None":
+        return None
+    return int(value)
+
+
 def inside_slurm():
    """Check whether the python process was launched through slurm"""
    # TODO(rcadene): return False for interactive mode `--pty bash`
@@ -183,3 +190,30 @@ def print_cuda_memory_usage():

 def capture_timestamp_utc():
    return datetime.now(timezone.utc)
+
+
+def say(text, blocking=False):
+    # Check if mac, linux, or windows.
+    if platform.system() == "Darwin":
+        cmd = f'say "{text}"'
+        if not blocking:
+            cmd += " &"
+    elif platform.system() == "Linux":
+        cmd = f'spd-say "{text}"'
+        if blocking:
+            cmd += " --wait"
+    elif platform.system() == "Windows":
+        # TODO(rcadene): Make blocking option work for Windows
+        cmd = (
+            'PowerShell -Command "Add-Type -AssemblyName System.Speech; '
+            f"(New-Object System.Speech.Synthesis.SpeechSynthesizer).Speak('{text}')\""
+        )
+
+    os.system(cmd)
+
+
+def log_say(text, play_sounds, blocking=False):
+    logging.info(text)
+
+    if play_sounds:
+        say(text, blocking)
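The platform dispatch in `say` can be sketched as a pure function that only builds the command string, which makes it easy to check without playing any sound. This is a minimal sketch, not part of the patch: `build_say_command` and its `system` parameter are hypothetical names, and raising on an unknown platform is an assumption about the desired behavior.

```python
def build_say_command(text, system, blocking=False):
    """Return the shell command that a `say`-style helper would run on `system`.

    `system` is expected to be a value as returned by `platform.system()`.
    """
    if system == "Darwin":
        cmd = f'say "{text}"'
        if not blocking:
            # Run in the background so the caller is not blocked
            cmd += " &"
    elif system == "Linux":
        cmd = f'spd-say "{text}"'
        if blocking:
            # Ask spd-say to wait until the message has been spoken
            cmd += " --wait"
    elif system == "Windows":
        cmd = (
            'PowerShell -Command "Add-Type -AssemblyName System.Speech; '
            f"(New-Object System.Speech.Synthesis.SpeechSynthesizer).Speak('{text}')\""
        )
    else:
        # Assumption: fail loudly instead of leaving `cmd` undefined
        raise NotImplementedError(f"Text-to-speech is not supported on '{system}'.")
    return cmd
```

A caller would pass `platform.system()` as `system` and hand the result to `os.system`, as the patched helper does.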
--- a/lerobot/configs/env/aloha_real.yaml
+++ b/lerobot/configs/env/aloha_real.yaml
@@ -5,6 +5,6 @@ fps: 30
 env:
  name: real_world
  task: null
-  state_dim: 14
-  action_dim: 14
+  state_dim: 18
+  action_dim: 18
  fps: ${fps}
--- a/lerobot/configs/env/moss_real.yaml
+++ b/lerobot/configs/env/moss_real.yaml
@@ -0,0 +1,10 @@
+# @package _global_
+
+fps: 30
+
+env:
+  name: real_world
+  task: null
+  state_dim: 6
+  action_dim: 6
+  fps: ${fps}
--- a/lerobot/configs/env/so100_real.yaml
+++ b/lerobot/configs/env/so100_real.yaml
@@ -0,0 +1,10 @@
+# @package _global_
+
+fps: 30
+
+env:
+  name: real_world
+  task: null
+  state_dim: 6
+  action_dim: 6
+  fps: ${fps}
--- a/lerobot/configs/policy/act_moss_real.yaml
+++ b/lerobot/configs/policy/act_moss_real.yaml
@@ -0,0 +1,102 @@
+# @package _global_
+
+# Use `act_moss_real.yaml` to train on real-world datasets collected on Moss v1 robots.
+# Compared to `act.yaml`, it contains 2 cameras (i.e. laptop, phone) instead of 1 camera (i.e. top).
+# Also, `training.eval_freq` is set to -1. This setting controls how often (in training steps) checkpoints are evaluated.
+# Setting it to -1 deactivates evaluation, because real-world evaluation is done through our `control_robot.py` script.
+# Look at the documentation in the header of `control_robot.py` for more information on how to collect data, train and evaluate a policy.
+#
+# Example of usage for training:
+# ```bash
+# python lerobot/scripts/train.py \
+#   policy=act_moss_real \
+#   env=moss_real
+# ```
+
+seed: 1000
+dataset_repo_id: lerobot/moss_pick_place_lego
+
+override_dataset_stats:
+  observation.images.laptop:
+    # stats from imagenet, since we use a pretrained vision model
+    mean: [[[0.485]], [[0.456]], [[0.406]]]  # (c,1,1)
+    std: [[[0.229]], [[0.224]], [[0.225]]]  # (c,1,1)
+  observation.images.phone:
+    # stats from imagenet, since we use a pretrained vision model
+    mean: [[[0.485]], [[0.456]], [[0.406]]]  # (c,1,1)
+    std: [[[0.229]], [[0.224]], [[0.225]]]  # (c,1,1)
+
+training:
+  offline_steps: 80000
+  online_steps: 0
+  eval_freq: -1
+  save_freq: 10000
+  log_freq: 100
+  save_checkpoint: true
+
+  batch_size: 8
+  lr: 1e-5
+  lr_backbone: 1e-5
+  weight_decay: 1e-4
+  grad_clip_norm: 10
+  online_steps_between_rollouts: 1
+
+  delta_timestamps:
+    action: "[i / ${fps} for i in range(${policy.chunk_size})]"
+
+eval:
+  n_episodes: 50
+  batch_size: 50
+
+# See `configuration_act.py` for more details.
+policy:
+  name: act
+
+  # Input / output structure.
+  n_obs_steps: 1
+  chunk_size: 100
+  n_action_steps: 100
+
+  input_shapes:
+    # TODO(rcadene, alexander-soare): add variables for height and width from the dataset/env?
+    observation.images.laptop: [3, 480, 640]
+    observation.images.phone: [3, 480, 640]
+    observation.state: ["${env.state_dim}"]
+  output_shapes:
+    action: ["${env.action_dim}"]
+
+  # Normalization / Unnormalization
+  input_normalization_modes:
+    observation.images.laptop: mean_std
+    observation.images.phone: mean_std
+    observation.state: mean_std
+  output_normalization_modes:
+    action: mean_std
+
+  # Architecture.
+  # Vision backbone.
+  vision_backbone: resnet18
+  pretrained_backbone_weights: ResNet18_Weights.IMAGENET1K_V1
+  replace_final_stride_with_dilation: false
+  # Transformer layers.
+  pre_norm: false
+  dim_model: 512
+  n_heads: 8
+  dim_feedforward: 3200
+  feedforward_activation: relu
+  n_encoder_layers: 4
+  # Note: Although the original ACT implementation has 7 for `n_decoder_layers`, there is a bug in the code
+  # that means only the first layer is used. Here we match the original implementation by setting this to 1.
+  # See this issue https://github.com/tonyzhaozh/act/issues/25#issue-2258740521.
+  n_decoder_layers: 1
+  # VAE.
+  use_vae: true
+  latent_dim: 32
+  n_vae_encoder_layers: 4
+
+  # Inference.
+  temporal_ensemble_momentum: null
+
+  # Training and loss computation.
+  dropout: 0.1
+  kl_weight: 10.0
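The `delta_timestamps.action` entry above is a string holding a Python list comprehension. A minimal sketch of what it yields, assuming the string is evaluated as a Python expression after `${fps}` and `${policy.chunk_size}` are interpolated with the values from these configs (30 and 100):

```python
fps = 30  # from `fps: 30` in the env config
chunk_size = 100  # from `policy.chunk_size`

# One timestamp offset (in seconds) per action in the chunk, starting at the current frame
action_delta_timestamps = [i / fps for i in range(chunk_size)]

print(len(action_delta_timestamps), action_delta_timestamps[:3])
```

Each chunk of 100 actions therefore spans a little over three seconds of robot motion at 30 fps.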
--- a/lerobot/configs/policy/act_so100_real.yaml
+++ b/lerobot/configs/policy/act_so100_real.yaml
@@ -0,0 +1,102 @@
+# @package _global_
+
+# Use `act_so100_real.yaml` to train on real-world datasets collected on SO-100 robots.
+# Compared to `act.yaml`, it contains 2 cameras (i.e. laptop, phone) instead of 1 camera (i.e. top).
+# Also, `training.eval_freq` is set to -1. This setting controls how often (in training steps) checkpoints are evaluated.
+# Setting it to -1 deactivates evaluation, because real-world evaluation is done through our `control_robot.py` script.
+# Look at the documentation in the header of `control_robot.py` for more information on how to collect data, train and evaluate a policy.
+#
+# Example of usage for training:
+# ```bash
+# python lerobot/scripts/train.py \
+#   policy=act_so100_real \
+#   env=so100_real
+# ```
+
+seed: 1000
+dataset_repo_id: lerobot/so100_pick_place_lego
+
+override_dataset_stats:
+  observation.images.laptop:
+    # stats from imagenet, since we use a pretrained vision model
+    mean: [[[0.485]], [[0.456]], [[0.406]]]  # (c,1,1)
+    std: [[[0.229]], [[0.224]], [[0.225]]]  # (c,1,1)
+  observation.images.phone:
+    # stats from imagenet, since we use a pretrained vision model
+    mean: [[[0.485]], [[0.456]], [[0.406]]]  # (c,1,1)
+    std: [[[0.229]], [[0.224]], [[0.225]]]  # (c,1,1)
+
+training:
+  offline_steps: 80000
+  online_steps: 0
+  eval_freq: -1
+  save_freq: 10000
+  log_freq: 100
+  save_checkpoint: true
+
+  batch_size: 8
+  lr: 1e-5
+  lr_backbone: 1e-5
+  weight_decay: 1e-4
+  grad_clip_norm: 10
+  online_steps_between_rollouts: 1
+
+  delta_timestamps:
+    action: "[i / ${fps} for i in range(${policy.chunk_size})]"
+
+eval:
+  n_episodes: 50
+  batch_size: 50
+
+# See `configuration_act.py` for more details.
+policy:
+  name: act
+
+  # Input / output structure.
+  n_obs_steps: 1
+  chunk_size: 100
+  n_action_steps: 100
+
+  input_shapes:
+    # TODO(rcadene, alexander-soare): add variables for height and width from the dataset/env?
+    observation.images.laptop: [3, 480, 640]
+    observation.images.phone: [3, 480, 640]
+    observation.state: ["${env.state_dim}"]
+  output_shapes:
+    action: ["${env.action_dim}"]
+
+  # Normalization / Unnormalization
+  input_normalization_modes:
+    observation.images.laptop: mean_std
+    observation.images.phone: mean_std
+    observation.state: mean_std
+  output_normalization_modes:
+    action: mean_std
+
+  # Architecture.
+  # Vision backbone.
+  vision_backbone: resnet18
+  pretrained_backbone_weights: ResNet18_Weights.IMAGENET1K_V1
+  replace_final_stride_with_dilation: false
+  # Transformer layers.
+  pre_norm: false
+  dim_model: 512
+  n_heads: 8
+  dim_feedforward: 3200
+  feedforward_activation: relu
+  n_encoder_layers: 4
+  # Note: Although the original ACT implementation has 7 for `n_decoder_layers`, there is a bug in the code
+  # that means only the first layer is used. Here we match the original implementation by setting this to 1.
+  # See this issue https://github.com/tonyzhaozh/act/issues/25#issue-2258740521.
+  n_decoder_layers: 1
+  # VAE.
+  use_vae: true
+  latent_dim: 32
+  n_vae_encoder_layers: 4
+
+  # Inference.
+  temporal_ensemble_momentum: null
+
+  # Training and loss computation.
+  dropout: 0.1
+  kl_weight: 10.0
--- a/lerobot/configs/robot/aloha.yaml
+++ b/lerobot/configs/robot/aloha.yaml
@@ -1,11 +1,13 @@
-# Aloha: A Low-Cost Hardware for Bimanual Teleoperation
+# [Aloha: A Low-Cost Hardware for Bimanual Teleoperation](https://www.trossenrobotics.com/aloha-stationary)
 # https://aloha-2.github.io
-# https://www.trossenrobotics.com/aloha-stationary

 # Requires installing extras packages
 # With pip: `pip install -e ".[dynamixel intelrealsense]"`
 # With poetry: `poetry install --sync --extras "dynamixel intelrealsense"`

+# See [tutorial](https://github.com/huggingface/lerobot/blob/main/examples/9_use_aloha.md)
+
+
 _target_: lerobot.common.robot_devices.robots.manipulator.ManipulatorRobot
 robot_type: aloha
 # Specific to Aloha, LeRobot comes with default calibration files. Assuming the motors have been
--- a/lerobot/configs/robot/koch_bimanual.yaml
+++ b/lerobot/configs/robot/koch_bimanual.yaml
@@ -1,5 +1,5 @@
 _target_: lerobot.common.robot_devices.robots.manipulator.ManipulatorRobot
-robot_type: koch
+robot_type: koch_bimanual
 calibration_dir: .cache/calibration/koch_bimanual

 # `max_relative_target` limits the magnitude of the relative positional target vector for safety purposes.
--- a/lerobot/configs/robot/moss.yaml
+++ b/lerobot/configs/robot/moss.yaml
@@ -0,0 +1,56 @@
+# [Moss v1 robot arm](https://github.com/jess-moss/moss-robot-arms)
+
+# Requires installing extras packages
+# With pip: `pip install -e ".[feetech]"`
+# With poetry: `poetry install --sync --extras "feetech"`
+
+# See [tutorial](https://github.com/huggingface/lerobot/blob/main/examples/11_use_moss.md)
+
+_target_: lerobot.common.robot_devices.robots.manipulator.ManipulatorRobot
+robot_type: moss
+calibration_dir: .cache/calibration/moss
+
+# `max_relative_target` limits the magnitude of the relative positional target vector for safety purposes.
+# Set this to a positive scalar to have the same value for all motors, or a list that is the same length as
+# the number of motors in your follower arms.
+max_relative_target: null
+
+leader_arms:
+  main:
+    _target_: lerobot.common.robot_devices.motors.feetech.FeetechMotorsBus
+    port: /dev/tty.usbmodem58760431091
+    motors:
+      # name: (index, model)
+      shoulder_pan: [1, "sts3215"]
+      shoulder_lift: [2, "sts3215"]
+      elbow_flex: [3, "sts3215"]
+      wrist_flex: [4, "sts3215"]
+      wrist_roll: [5, "sts3215"]
+      gripper: [6, "sts3215"]
+
+follower_arms:
+  main:
+    _target_: lerobot.common.robot_devices.motors.feetech.FeetechMotorsBus
+    port: /dev/tty.usbmodem58760431191
+    motors:
+      # name: (index, model)
+      shoulder_pan: [1, "sts3215"]
+      shoulder_lift: [2, "sts3215"]
+      elbow_flex: [3, "sts3215"]
+      wrist_flex: [4, "sts3215"]
+      wrist_roll: [5, "sts3215"]
+      gripper: [6, "sts3215"]
+
+cameras:
+  laptop:
+    _target_: lerobot.common.robot_devices.cameras.opencv.OpenCVCamera
+    camera_index: 0
+    fps: 30
+    width: 640
+    height: 480
+  phone:
+    _target_: lerobot.common.robot_devices.cameras.opencv.OpenCVCamera
+    camera_index: 1
+    fps: 30
+    width: 640
+    height: 480
--- a/lerobot/configs/robot/so100.yaml
+++ b/lerobot/configs/robot/so100.yaml
@@ -0,0 +1,56 @@
+# [SO-100 robot arm](https://github.com/TheRobotStudio/SO-ARM100)
+
+# Requires installing extras packages
+# With pip: `pip install -e ".[feetech]"`
+# With poetry: `poetry install --sync --extras "feetech"`
+
+# See [tutorial](https://github.com/huggingface/lerobot/blob/main/examples/10_use_so100.md)
+
+_target_: lerobot.common.robot_devices.robots.manipulator.ManipulatorRobot
+robot_type: so100
+calibration_dir: .cache/calibration/so100
+
+# `max_relative_target` limits the magnitude of the relative positional target vector for safety purposes.
+# Set this to a positive scalar to have the same value for all motors, or a list that is the same length as
+# the number of motors in your follower arms.
+max_relative_target: null
+
+leader_arms:
+  main:
+    _target_: lerobot.common.robot_devices.motors.feetech.FeetechMotorsBus
+    port: /dev/tty.usbmodem585A0077581
+    motors:
+      # name: (index, model)
+      shoulder_pan: [1, "sts3215"]
+      shoulder_lift: [2, "sts3215"]
+      elbow_flex: [3, "sts3215"]
+      wrist_flex: [4, "sts3215"]
+      wrist_roll: [5, "sts3215"]
+      gripper: [6, "sts3215"]
+
+follower_arms:
+  main:
+    _target_: lerobot.common.robot_devices.motors.feetech.FeetechMotorsBus
+    port: /dev/tty.usbmodem585A0080971
+    motors:
+      # name: (index, model)
+      shoulder_pan: [1, "sts3215"]
+      shoulder_lift: [2, "sts3215"]
+      elbow_flex: [3, "sts3215"]
+      wrist_flex: [4, "sts3215"]
+      wrist_roll: [5, "sts3215"]
+      gripper: [6, "sts3215"]
+
+cameras:
+  laptop:
+    _target_: lerobot.common.robot_devices.cameras.opencv.OpenCVCamera
+    camera_index: 0
+    fps: 30
+    width: 640
+    height: 480
+  phone:
+    _target_: lerobot.common.robot_devices.cameras.opencv.OpenCVCamera
+    camera_index: 1
+    fps: 30
+    width: 640
+    height: 480
--- a/lerobot/configs/robot/stretch.yaml
+++ b/lerobot/configs/robot/stretch.yaml
@@ -1,3 +1,12 @@
+# [Stretch3 from Hello Robot](https://hello-robot.com/stretch-3-product)
+
+# Requires installing extras packages
+# With pip: `pip install -e ".[stretch]"`
+# With poetry: `poetry install --sync --extras "stretch"`
+
+# See [tutorial](https://github.com/huggingface/lerobot/blob/main/examples/8_use_stretch.md)
+
+
 _target_: lerobot.common.robot_devices.robots.stretch.StretchRobot
 robot_type: stretch3

--- a/lerobot/scripts/configure_motor.py
+++ b/lerobot/scripts/configure_motor.py
@@ -0,0 +1,145 @@
+"""
+This script configures a single motor at a time, setting it to a given ID and baudrate.
+
+Example of usage:
+```bash
+python lerobot/scripts/configure_motor.py \
+  --port /dev/tty.usbmodem585A0080521 \
+  --brand feetech \
+  --model sts3215 \
+  --baudrate 1000000 \
+  --ID 1
+```
+"""
+
+import argparse
+import time
+
+
+def configure_motor(port, brand, model, motor_idx_des, baudrate_des):
+    if brand == "feetech":
+        from lerobot.common.robot_devices.motors.feetech import MODEL_BAUDRATE_TABLE
+        from lerobot.common.robot_devices.motors.feetech import (
+            SCS_SERIES_BAUDRATE_TABLE as SERIES_BAUDRATE_TABLE,
+        )
+        from lerobot.common.robot_devices.motors.feetech import FeetechMotorsBus as MotorsBusClass
+    elif brand == "dynamixel":
+        from lerobot.common.robot_devices.motors.dynamixel import MODEL_BAUDRATE_TABLE
+        from lerobot.common.robot_devices.motors.dynamixel import (
+            X_SERIES_BAUDRATE_TABLE as SERIES_BAUDRATE_TABLE,
+        )
+        from lerobot.common.robot_devices.motors.dynamixel import DynamixelMotorsBus as MotorsBusClass
+    else:
+        raise ValueError(
+            f"Currently we do not support this motor brand: {brand}. We currently support feetech and dynamixel motors."
+        )
+
+    # Check if the provided model exists in MODEL_BAUDRATE_TABLE
+    if model not in MODEL_BAUDRATE_TABLE:
+        raise ValueError(
+            f"Invalid model '{model}' for brand '{brand}'. Supported models: {list(MODEL_BAUDRATE_TABLE.keys())}"
+        )
+
+    # Setup motor names, indices, and models
+    motor_name = "motor"
+    motor_index_arbitrary = motor_idx_des  # Use the motor ID passed via argument
+    motor_model = model  # Use the motor model passed via argument
+
+    # Initialize the MotorBus with the correct port and motor configurations
+    motor_bus = MotorsBusClass(port=port, motors={motor_name: (motor_index_arbitrary, motor_model)})
+
+    # Try to connect to the motor bus and handle any connection-specific errors
+    try:
+        motor_bus.connect()
+        print(f"Connected on port {motor_bus.port}")
+    except OSError as e:
+        print(f"Error occurred when connecting to the motor bus: {e}")
+        return
+
+    # Motor bus is connected, proceed with the rest of the operations
+    try:
+        print("Scanning all baudrates and motor indices")
+        all_baudrates = set(SERIES_BAUDRATE_TABLE.values())
+        motor_index = -1  # Set the motor index to an out-of-range value.
+
+        for baudrate in all_baudrates:
+            motor_bus.set_bus_baudrate(baudrate)
+            present_ids = motor_bus.find_motor_indices(list(range(1, 10)))
+            if len(present_ids) > 1:
+                raise ValueError(
+                    "Error: More than one motor ID detected. This script is designed to only handle one motor at a time. Please disconnect all but one motor."
+                )
+
+            if len(present_ids) == 1:
+                if motor_index != -1:
+                    raise ValueError(
+                        "Error: More than one motor ID detected. This script is designed to only handle one motor at a time. Please disconnect all but one motor."
+                    )
+                motor_index = present_ids[0]
+
+        if motor_index == -1:
+            raise ValueError("No motors detected. Please ensure you have one motor connected.")
+
+        print(f"Motor index found at: {motor_index}")
+
+        if brand == "feetech":
+            # Allows ID and BAUDRATE to be written in memory
+            motor_bus.write_with_motor_ids(motor_bus.motor_models, motor_index, "Lock", 0)
+
+        if baudrate != baudrate_des:
+            print(f"Setting its baudrate to {baudrate_des}")
+            baudrate_idx = list(SERIES_BAUDRATE_TABLE.values()).index(baudrate_des)
+
+            # The write can fail, so we read the value back (with retries) to verify it
+            motor_bus.write_with_motor_ids(motor_bus.motor_models, motor_index, "Baud_Rate", baudrate_idx)
+            time.sleep(0.5)
+            motor_bus.set_bus_baudrate(baudrate_des)
+            present_baudrate_idx = motor_bus.read_with_motor_ids(
+                motor_bus.motor_models, motor_index, "Baud_Rate", num_retry=2
+            )
+
+            if present_baudrate_idx != baudrate_idx:
+                raise OSError("Failed to write baudrate.")
+
+        print(f"Setting its index to desired index {motor_idx_des}")
+        motor_bus.write_with_motor_ids(motor_bus.motor_models, motor_index, "Lock", 0)
+        motor_bus.write_with_motor_ids(motor_bus.motor_models, motor_index, "ID", motor_idx_des)
+
+        present_idx = motor_bus.read_with_motor_ids(motor_bus.motor_models, motor_idx_des, "ID", num_retry=2)
+        if present_idx != motor_idx_des:
+            raise OSError("Failed to write index.")
+
+        if brand == "feetech":
+            # Set Maximum_Acceleration to 254 to speed up acceleration and deceleration of
+            # the motors. Note: this configuration is not in the official STS3215 Memory Table
+            motor_bus.write("Lock", 0)
+            motor_bus.write("Maximum_Acceleration", 254)
+
+            motor_bus.write("Goal_Position", 2048)
+            time.sleep(4)
+            print("Present Position", motor_bus.read("Present_Position"))
+
+            motor_bus.write("Offset", 0)
+            time.sleep(4)
+            print("Offset", motor_bus.read("Offset"))
+
+    except Exception as e:
+        print(f"Error occurred during motor configuration: {e}")
+
+    finally:
+        motor_bus.disconnect()
+        print("Disconnected from motor bus.")
+
+
+if __name__ == "__main__":
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--port", type=str, required=True, help="Motors bus port (e.g. /dev/tty.usbmodem585A0080521)")
+    parser.add_argument("--brand", type=str, required=True, help="Motor brand (e.g. dynamixel,feetech)")
+    parser.add_argument("--model", type=str, required=True, help="Motor model (e.g. xl330-m077,sts3215)")
+    parser.add_argument("--ID", type=int, required=True, help="Desired ID of the current motor (e.g. 1,2,3)")
+    parser.add_argument(
+        "--baudrate", type=int, default=1000000, help="Desired baudrate for the motor (default: 1000000)"
+    )
+    args = parser.parse_args()
+
+    configure_motor(args.port, args.brand, args.model, args.ID, args.baudrate)
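The baudrate scan in `configure_motor` can be sketched in isolation against a fake bus. This is a minimal sketch under stated assumptions: `FakeBus` and `scan_for_single_motor` are hypothetical names, and only the `set_bus_baudrate` and `find_motor_indices` calls mirror the bus methods used in the script. The sketch also returns the baudrate at which the motor answered, since a later comparison against the desired baudrate depends on that value rather than on the last baudrate tried.

```python
class FakeBus:
    """Hypothetical stand-in for a motors bus with a single motor attached.

    The motor has ID `motor_id` and only answers when the bus is set to `motor_baudrate`.
    """

    def __init__(self, motor_id, motor_baudrate):
        self.motor_id = motor_id
        self.motor_baudrate = motor_baudrate
        self.bus_baudrate = None

    def set_bus_baudrate(self, baudrate):
        self.bus_baudrate = baudrate

    def find_motor_indices(self, candidate_ids):
        if self.bus_baudrate != self.motor_baudrate:
            return []
        return [i for i in candidate_ids if i == self.motor_id]


def scan_for_single_motor(bus, baudrates, candidate_ids=range(1, 10)):
    """Return (motor_index, baudrate) for the only motor on the bus."""
    found = None
    for baudrate in baudrates:
        bus.set_bus_baudrate(baudrate)
        present_ids = bus.find_motor_indices(list(candidate_ids))
        # Reject several IDs at one baudrate, or motors answering at two baudrates
        if len(present_ids) > 1 or (present_ids and found is not None):
            raise ValueError("More than one motor ID detected.")
        if present_ids:
            found = (present_ids[0], baudrate)
    if found is None:
        raise ValueError("No motors detected.")
    # Leave the bus at the baudrate where the motor actually answers
    bus.set_bus_baudrate(found[1])
    return found
```

With `bus = FakeBus(3, 57600)`, calling `scan_for_single_motor(bus, [1000000, 57600, 115200])` finds motor 3 at 57600 and leaves the bus at that baudrate, whatever order the baudrates were tried in.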
--- a/lerobot/scripts/control_robot.py
+++ b/lerobot/scripts/control_robot.py
@@ -99,204 +99,35 @@ python lerobot/scripts/control_robot.py record \
 """

 import argparse
-import concurrent.futures
-import json
 import logging
-import multiprocessing
-import os
-import platform
-import shutil
 import time
-import traceback
-from contextlib import nullcontext
-from functools import cache
 from pathlib import Path
-
-import cv2
-import torch
-import tqdm
-from omegaconf import DictConfig
-from PIL import Image
-from termcolor import colored
+from typing import List

 # from safetensors.torch import load_file, save_file
-from lerobot.common.datasets.compute_stats import compute_stats
-from lerobot.common.datasets.lerobot_dataset import CODEBASE_VERSION, LeRobotDataset
-from lerobot.common.datasets.push_dataset_to_hub.aloha_hdf5_format import to_hf_dataset
-from lerobot.common.datasets.push_dataset_to_hub.utils import concatenate_episodes, get_default_encoding
-from lerobot.common.datasets.utils import calculate_episode_data_index, create_branch
-from lerobot.common.datasets.video_utils import encode_video_frames
-from lerobot.common.policies.factory import make_policy
-from lerobot.common.robot_devices.robots.factory import make_robot
-from lerobot.common.robot_devices.robots.utils import Robot, get_arm_id
-from lerobot.common.robot_devices.utils import busy_wait, safe_disconnect
-from lerobot.common.utils.utils import get_safe_torch_device, init_hydra_config, init_logging, set_global_seed
-from lerobot.scripts.eval import get_pretrained_policy_path
-from lerobot.scripts.push_dataset_to_hub import (
-    push_dataset_card_to_hub,
-    push_meta_data_to_hub,
-    push_videos_to_hub,
-    save_meta_data,
+from lerobot.common.datasets.lerobot_dataset import LeRobotDataset
+from lerobot.common.datasets.populate_dataset import (
+    create_lerobot_dataset,
+    delete_current_episode,
+    init_dataset,
+    save_current_episode,
 )
-
-########################################################################################
-# Utilities
-########################################################################################
-
-
-def say(text, blocking=False):
-    # Check if mac, linux, or windows.
-    if platform.system() == "Darwin":
-        cmd = f'say "{text}"'
-    elif platform.system() == "Linux":
-        cmd = f'spd-say "{text}"'
-    elif platform.system() == "Windows":
-        cmd = (
-            'PowerShell -Command "Add-Type -AssemblyName System.Speech; '
-            f"(New-Object System.Speech.Synthesis.SpeechSynthesizer).Speak('{text}')\""
-        )
-
-    if not blocking and platform.system() in ["Darwin", "Linux"]:
-        # TODO(rcadene): Make it work for Windows
-        # Use the ampersand to run command in the background
-        cmd += " &"
-
-    os.system(cmd)
-
-
-def save_image(img_tensor, key, frame_index, episode_index, videos_dir):
-    img = Image.fromarray(img_tensor.numpy())
-    path = videos_dir / f"{key}_episode_{episode_index:06d}" / f"frame_{frame_index:06d}.png"
-    path.parent.mkdir(parents=True, exist_ok=True)
-    img.save(str(path), quality=100)
-
-
-def none_or_int(value):
-    if value == "None":
-        return None
-    return int(value)
-
-
-def log_control_info(robot: Robot, dt_s, episode_index=None, frame_index=None, fps=None):
-    log_items = []
-    if episode_index is not None:
-        log_items.append(f"ep:{episode_index}")
-    if frame_index is not None:
-        log_items.append(f"frame:{frame_index}")
-
-    def log_dt(shortname, dt_val_s):
-        nonlocal log_items, fps
-        info_str = f"{shortname}:{dt_val_s * 1000:5.2f} ({1/ dt_val_s:3.1f}hz)"
-        if fps is not None:
-            actual_fps = 1 / dt_val_s
-            if actual_fps < fps - 1:
-                info_str = colored(info_str, "yellow")
-        log_items.append(info_str)
-
-    # total step time displayed in milliseconds and its frequency
-    log_dt("dt", dt_s)
-
-    # TODO(aliberts): move robot-specific logs logic in robot.print_logs()
-    if not robot.robot_type.startswith("stretch"):
-        for name in robot.leader_arms:
-            key = f"read_leader_{name}_pos_dt_s"
-            if key in robot.logs:
-                log_dt("dtRlead", robot.logs[key])
-
-        for name in robot.follower_arms:
-            key = f"write_follower_{name}_goal_pos_dt_s"
-            if key in robot.logs:
-                log_dt("dtWfoll", robot.logs[key])
-
-            key = f"read_follower_{name}_pos_dt_s"
-            if key in robot.logs:
-                log_dt("dtRfoll", robot.logs[key])
-
-        for name in robot.cameras:
-            key = f"read_camera_{name}_dt_s"
-            if key in robot.logs:
-                log_dt(f"dtR{name}", robot.logs[key])
-
-    info_str = " ".join(log_items)
-    logging.info(info_str)
-
-
-@cache
-def is_headless():
-    """Detects if python is running without a monitor."""
-    try:
-        import pynput  # noqa
-
-        return False
-    except Exception:
-        print(
-            "Error trying to import pynput. Switching to headless mode. "
-            "As a result, the video stream from the cameras won't be shown, "
-            "and you won't be able to change the control flow with keyboards. "
-            "For more info, see traceback below.\n"
-        )
-        traceback.print_exc()
-        print()
-        return True
-
-
-def loop_to_save_frame_in_threads(frame_queue, num_image_writers):
-    with concurrent.futures.ThreadPoolExecutor(max_workers=num_image_writers) as executor:
-        futures = []
-        while True:
-            # Blocks until a frame is available
-            frame_data = frame_queue.get()
-
-            # Exit if we send None to stop the worker
-            if frame_data is None:
-                # Wait for all submitted futures to complete before exiting
-                for _ in tqdm.tqdm(
-                    concurrent.futures.as_completed(futures), total=len(futures), desc="Writting images"
-                ):
-                    pass
-                break
-
-            frame, key, frame_index, episode_index, videos_dir = frame_data
-            futures.append(executor.submit(save_image, frame, key, frame_index, episode_index, videos_dir))
-
-
-def start_frame_workers(frame_queue, num_image_writers, num_workers=1):
-    workers = []
-    for _ in range(num_workers):
-        worker = multiprocessing.Process(
-            target=loop_to_save_frame_in_threads,
-            args=(frame_queue, num_image_writers),
-        )
-        worker.start()
-        workers.append(worker)
-    return workers
-
-
-def stop_workers(workers, frame_queue):
-    # Send None to each process to signal it to stop
-    for _ in workers:
-        frame_queue.put(None)
-
-    # Wait for all processes to terminate
-    for process in workers:
-        process.join()
-
-
-def has_method(_object: object, method_name: str):
-    return hasattr(_object, method_name) and callable(getattr(_object, method_name))
-
-
-def get_available_arms(robot):
-    # TODO(rcadene): moves this function in manipulator class?
-    available_arms = []
-    for name in robot.follower_arms:
-        arm_id = get_arm_id(name, "follower")
-        available_arms.append(arm_id)
-    for name in robot.leader_arms:
-        arm_id = get_arm_id(name, "leader")
-        available_arms.append(arm_id)
-    return available_arms
-
+from lerobot.common.robot_devices.control_utils import (
+    control_loop,
+    has_method,
+    init_keyboard_listener,
+    init_policy,
+    log_control_info,
+    record_episode,
+    reset_environment,
+    sanity_check_dataset_name,
+    stop_recording,
+    warmup_record,
+)
+from lerobot.common.robot_devices.robots.factory import make_robot
+from lerobot.common.robot_devices.robots.utils import Robot
+from lerobot.common.robot_devices.utils import busy_wait, safe_disconnect
+from lerobot.common.utils.utils import init_hydra_config, init_logging, log_say, none_or_int

 ########################################################################################
 # Control modes
@@ -313,9 +144,11 @@ def calibrate(robot: Robot, arms: list[str] | None):
            robot.home()
        return

-    available_arms = get_available_arms(robot)
-    unknown_arms = [arm_id for arm_id in arms if arm_id not in available_arms]
-    available_arms_str = " ".join(available_arms)
+    if arms is None:
+        arms = robot.available_arms
+
+    unknown_arms = [arm_id for arm_id in arms if arm_id not in robot.available_arms]
+    available_arms_str = " ".join(robot.available_arms)
    unknown_arms_str = " ".join(unknown_arms)

    if arms is None or len(arms) == 0:
@@ -348,35 +181,26 @@ def calibrate(robot: Robot, arms: list[str] | None):


@safe_disconnect
-def teleoperate(robot: Robot, fps: int | None = None, teleop_time_s: float | None = None):
-    # TODO(rcadene): Add option to record logs
-    if not robot.is_connected:
-        robot.connect()
-
-    start_teleop_t = time.perf_counter()
-    while True:
-        start_loop_t = time.perf_counter()
-        robot.teleop_step()
-
-        if fps is not None:
-            dt_s = time.perf_counter() - start_loop_t
-            busy_wait(1 / fps - dt_s)
-
-        dt_s = time.perf_counter() - start_loop_t
-        log_control_info(robot, dt_s, fps=fps)
-
-        if teleop_time_s is not None and time.perf_counter() - start_teleop_t > teleop_time_s:
-            break
+def teleoperate(
+    robot: Robot, fps: int | None = None, teleop_time_s: float | None = None, display_cameras: bool = False
+):
+    control_loop(
+        robot,
+        control_time_s=teleop_time_s,
+        fps=fps,
+        teleoperate=True,
+        display_cameras=display_cameras,
+    )


@safe_disconnect
 def record(
    robot: Robot,
-    policy: torch.nn.Module | None = None,
-    hydra_cfg: DictConfig | None = None,
+    root: str,
+    repo_id: str,
+    pretrained_policy_name_or_path: str | None = None,
+    policy_overrides: List[str] | None = None,
    fps: int | None = None,
-    root="data",
-    repo_id="lerobot/debug",
    warmup_time_s=2,
    episode_time_s=10,
    reset_time_s=5,
@@ -385,389 +209,115 @@ def record(
    run_compute_stats=True,
    push_to_hub=True,
    tags=None,
-    num_image_writers_per_camera=4,
+    num_image_writer_processes=0,
+    num_image_writer_threads_per_camera=4,
    force_override=False,
    display_cameras=True,
+    play_sounds=True,
 ):
    # TODO(rcadene): Add option to record logs
-    # TODO(rcadene): Clean this function via decomposition in higher level functions
+    listener = None
+    events = None
+    policy = None
+    device = None
+    use_amp = None

-    _, dataset_name = repo_id.split("/")
-    if dataset_name.startswith("eval_") and policy is None:
-        raise ValueError(
-            f"Your dataset name begins by 'eval_' ({dataset_name}) but no policy is provided ({policy})."
-        )
+    # Load pretrained policy
+    if pretrained_policy_name_or_path is not None:
+        policy, policy_fps, device, use_amp = init_policy(pretrained_policy_name_or_path, policy_overrides)
+
+        if fps is None:
+            fps = policy_fps
+            logging.warning(f"No fps provided, so using the fps from policy config ({policy_fps}).")
+        elif fps != policy_fps:
+            logging.warning(
+                f"There is a mismatch between the provided fps ({fps}) and the one from policy config ({policy_fps})."
+            )
+
+    # Create empty dataset or load existing saved episodes
+    sanity_check_dataset_name(repo_id, policy)
+    dataset = init_dataset(
+        repo_id,
+        root,
+        force_override,
+        fps,
+        video,
+        write_images=robot.has_camera,
+        num_image_writer_processes=num_image_writer_processes,
+        num_image_writer_threads=num_image_writer_threads_per_camera * robot.num_cameras,
+    )

    if not robot.is_connected:
        robot.connect()

-    local_dir = Path(root) / repo_id
-    if local_dir.exists() and force_override:
-        shutil.rmtree(local_dir)
+    listener, events = init_keyboard_listener()

-    episodes_dir = local_dir / "episodes"
-    episodes_dir.mkdir(parents=True, exist_ok=True)
-
-    videos_dir = local_dir / "videos"
-    videos_dir.mkdir(parents=True, exist_ok=True)
-
-    # Logic to resume data recording
-    rec_info_path = episodes_dir / "data_recording_info.json"
-    if rec_info_path.exists():
-        with open(rec_info_path) as f:
-            rec_info = json.load(f)
-        episode_index = rec_info["last_episode_index"] + 1
-    else:
-        episode_index = 0
-
-    if is_headless():
-        logging.warning(
-            "Headless environment detected. On-screen cameras display and keyboard inputs will not be available."
-        )
-
-    # Allow to exit early while recording an episode or resetting the environment,
-    # by tapping the right arrow key '->'. This might require a sudo permission
-    # to allow your terminal to monitor keyboard events.
-    exit_early = False
-    rerecord_episode = False
-    stop_recording = False
-
-    # Only import pynput if not in a headless environment
-    if not is_headless():
-        from pynput import keyboard
-
-        def on_press(key):
-            nonlocal exit_early, rerecord_episode, stop_recording
-            try:
-                if key == keyboard.Key.right:
-                    print("Right arrow key pressed. Exiting loop...")
-                    exit_early = True
-                elif key == keyboard.Key.left:
-                    print("Left arrow key pressed. Exiting loop and rerecord the last episode...")
-                    rerecord_episode = True
-                    exit_early = True
-                elif key == keyboard.Key.esc:
-                    print("Escape key pressed. Stopping data recording...")
-                    stop_recording = True
-                    exit_early = True
-            except Exception as e:
-                print(f"Error handling key press: {e}")
-
-        listener = keyboard.Listener(on_press=on_press)
-        listener.start()
-
-    # Load policy if any
-    if policy is not None:
-        # Check device is available
-        device = get_safe_torch_device(hydra_cfg.device, log=True)
-
-        policy.eval()
-        policy.to(device)
-
-        torch.backends.cudnn.benchmark = True
-        torch.backends.cuda.matmul.allow_tf32 = True
-        set_global_seed(hydra_cfg.seed)
-
-        # override fps using policy fps
-        fps = hydra_cfg.env.fps
-
-    # Execute a few seconds without recording data, to give times
-    # to the robot devices to connect and start synchronizing.
-    timestamp = 0
-    start_warmup_t = time.perf_counter()
-    is_warmup_print = False
-    while timestamp < warmup_time_s:
-        if not is_warmup_print:
-            logging.info("Warming up (no data recording)")
-            say("Warming up")
-            is_warmup_print = True
-
-        start_loop_t = time.perf_counter()
-
-        if policy is None:
-            observation, action = robot.teleop_step(record_data=True)
-        else:
-            observation = robot.capture_observation()
-
-        if display_cameras and not is_headless():
-            image_keys = [key for key in observation if "image" in key]
-            for key in image_keys:
-                cv2.imshow(key, cv2.cvtColor(observation[key].numpy(), cv2.COLOR_RGB2BGR))
-            cv2.waitKey(1)
-
-        dt_s = time.perf_counter() - start_loop_t
-        busy_wait(1 / fps - dt_s)
-
-        dt_s = time.perf_counter() - start_loop_t
-        log_control_info(robot, dt_s, fps=fps)
-
-        timestamp = time.perf_counter() - start_warmup_t
+    # Execute a few seconds without recording to:
+    # 1. teleoperate the robot to move it into its starting position if no policy is provided,
+    # 2. give time to the robot devices to connect and start synchronizing,
+    # 3. place the camera windows on screen
+    enable_teleoperation = policy is None
+    log_say("Warmup record", play_sounds)
+    warmup_record(robot, events, enable_teleoperation, warmup_time_s, display_cameras, fps)

    if has_method(robot, "teleop_safety_stop"):
        robot.teleop_safety_stop()

-    # Save images using threads to reach high fps (30 and more)
-    # Using `with` to exist smoothly if an execption is raised.
-    num_image_writers = num_image_writers_per_camera * len(robot.cameras)
-    num_image_writers = max(num_image_writers, 1)
-    frame_queue = multiprocessing.Queue()
-    frame_workers = start_frame_workers(frame_queue, num_image_writers)
+    while True:
+        if dataset["num_episodes"] >= num_episodes:
+            break

-    # Using `try` to exist smoothly if an exception is raised
-    try:
-        # Start recording all episodes
-        while episode_index < num_episodes:
-            logging.info(f"Recording episode {episode_index}")
-            say(f"Recording episode {episode_index}")
-            ep_dict = {}
-            frame_index = 0
-            timestamp = 0
-            start_episode_t = time.perf_counter()
-            while timestamp < episode_time_s:
-                start_loop_t = time.perf_counter()
+        episode_index = dataset["num_episodes"]
+        log_say(f"Recording episode {episode_index}", play_sounds)
+        record_episode(
+            dataset=dataset,
+            robot=robot,
+            events=events,
+            episode_time_s=episode_time_s,
+            display_cameras=display_cameras,
+            policy=policy,
+            device=device,
+            use_amp=use_amp,
+            fps=fps,
+        )

-                if policy is None:
-                    observation, action = robot.teleop_step(record_data=True)
-                else:
-                    observation = robot.capture_observation()
+        # Execute a few seconds without recording to give time to manually reset the environment
+        # The current code logic doesn't allow teleoperation during this time.
+        # TODO(rcadene): add an option to enable teleoperation during reset
+        # Skip reset for the last episode to be recorded
+        if not events["stop_recording"] and (
+            (episode_index < num_episodes - 1) or events["rerecord_episode"]
+        ):
+            log_say("Reset the environment", play_sounds)
+            reset_environment(robot, events, reset_time_s)

-                image_keys = [key for key in observation if "image" in key]
-                not_image_keys = [key for key in observation if "image" not in key]
+        if events["rerecord_episode"]:
+            log_say("Re-record episode", play_sounds)
+            events["rerecord_episode"] = False
+            events["exit_early"] = False
+            delete_current_episode(dataset)
+            continue

-                for key in image_keys:
-                    frame_queue.put((observation[key], key, frame_index, episode_index, videos_dir))
+        # Increments dataset["current_episode_index"] by one
+        save_current_episode(dataset)

-                if display_cameras and not is_headless():
-                    image_keys = [key for key in observation if "image" in key]
-                    for key in image_keys:
-                        cv2.imshow(key, cv2.cvtColor(observation[key].numpy(), cv2.COLOR_RGB2BGR))
-                    cv2.waitKey(1)
+        if events["stop_recording"]:
+            break

-                for key in not_image_keys:
-                    if key not in ep_dict:
-                        ep_dict[key] = []
-                    ep_dict[key].append(observation[key])
+    log_say("Stop recording", play_sounds, blocking=True)
+    stop_recording(robot, listener, display_cameras)

-                if policy is not None:
-                    with (
-                        torch.inference_mode(),
-                        torch.autocast(device_type=device.type)
-                        if device.type == "cuda" and hydra_cfg.use_amp
-                        else nullcontext(),
-                    ):
-                        # Convert to pytorch format: channel first and float32 in [0,1] with batch dimension
-                        for name in observation:
-                            if "image" in name:
-                                observation[name] = observation[name].type(torch.float32) / 255
-                                observation[name] = observation[name].permute(2, 0, 1).contiguous()
-                            observation[name] = observation[name].unsqueeze(0)
-                            observation[name] = observation[name].to(device)
+    lerobot_dataset = create_lerobot_dataset(dataset, run_compute_stats, push_to_hub, tags, play_sounds)

-                        # Compute the next action with the policy
-                        # based on the current observation
-                        action = policy.select_action(observation)
-
-                        # Remove batch dimension
-                        action = action.squeeze(0)
-
-                        # Move to cpu, if not already the case
-                        action = action.to("cpu")
-
-                    # Order the robot to move
-                    action_sent = robot.send_action(action)
-
-                    # Action can eventually be clipped using `max_relative_target`,
-                    # so action actually sent is saved in the dataset.
-                    action = {"action": action_sent}
-
-                for key in action:
-                    if key not in ep_dict:
-                        ep_dict[key] = []
-                    ep_dict[key].append(action[key])
-
-                frame_index += 1
-
-                dt_s = time.perf_counter() - start_loop_t
-                busy_wait(1 / fps - dt_s)
-
-                dt_s = time.perf_counter() - start_loop_t
-                log_control_info(robot, dt_s, fps=fps)
-
-                timestamp = time.perf_counter() - start_episode_t
-                if exit_early:
-                    exit_early = False
-                    break
-
-            # TODO(alibets): allow for teleop during reset
-            if has_method(robot, "teleop_safety_stop"):
-                robot.teleop_safety_stop()
-
-            if not stop_recording:
-                # Start resetting env while the executor are finishing
-                logging.info("Reset the environment")
-                say("Reset the environment")
-
-            timestamp = 0
-            start_vencod_t = time.perf_counter()
-
-            # During env reset we save the data and encode the videos
-            num_frames = frame_index
-
-            for key in image_keys:
-                if video:
-                    tmp_imgs_dir = videos_dir / f"{key}_episode_{episode_index:06d}"
-                    fname = f"{key}_episode_{episode_index:06d}.mp4"
-                    video_path = local_dir / "videos" / fname
-                    if video_path.exists():
-                        video_path.unlink()
-                    # Store the reference to the video frame, even tho the videos are not yet encoded
-                    ep_dict[key] = []
-                    for i in range(num_frames):
-                        ep_dict[key].append({"path": f"videos/{fname}", "timestamp": i / fps})
-
-                else:
-                    imgs_dir = videos_dir / f"{key}_episode_{episode_index:06d}"
-                    ep_dict[key] = []
-                    for i in range(num_frames):
-                        img_path = imgs_dir / f"frame_{i:06d}.png"
-                        ep_dict[key].append({"path": str(img_path)})
-
-            for key in not_image_keys:
-                ep_dict[key] = torch.stack(ep_dict[key])
-
-            for key in action:
-                ep_dict[key] = torch.stack(ep_dict[key])
-
-            ep_dict["episode_index"] = torch.tensor([episode_index] * num_frames)
-            ep_dict["frame_index"] = torch.arange(0, num_frames, 1)
-            ep_dict["timestamp"] = torch.arange(0, num_frames, 1) / fps
-
-            done = torch.zeros(num_frames, dtype=torch.bool)
-            done[-1] = True
-            ep_dict["next.done"] = done
-
-            ep_path = episodes_dir / f"episode_{episode_index}.pth"
-            print("Saving episode dictionary...")
-            torch.save(ep_dict, ep_path)
-
-            rec_info = {
-                "last_episode_index": episode_index,
-            }
-            with open(rec_info_path, "w") as f:
-                json.dump(rec_info, f)
-
-            is_last_episode = stop_recording or (episode_index == (num_episodes - 1))
-
-            # Wait if necessary
-            with tqdm.tqdm(total=reset_time_s, desc="Waiting") as pbar:
-                while timestamp < reset_time_s and not is_last_episode:
-                    time.sleep(1)
-                    timestamp = time.perf_counter() - start_vencod_t
-                    pbar.update(1)
-                    if exit_early:
-                        exit_early = False
-                        break
-
-            # Skip updating episode index which forces re-recording episode
-            if rerecord_episode:
-                rerecord_episode = False
-                continue
-
-            episode_index += 1
-
-            if is_last_episode:
-                logging.info("Done recording")
-                say("Done recording", blocking=True)
-                if not is_headless():
-                    listener.stop()
-
-                logging.info("Waiting for threads writing the images on disk to terminate...")
-                stop_workers(frame_workers, frame_queue)
-
-    except Exception:
-        traceback.print_exc()
-        stop_workers(frame_workers, frame_queue)
-
-    robot.disconnect()
-    if display_cameras and not is_headless():
-        cv2.destroyAllWindows()
-
-    num_episodes = episode_index
-
-    if video:
-        logging.info("Encoding videos")
-        say("Encoding videos")
-        # Use ffmpeg to convert frames stored as png into mp4 videos
-        for episode_index in tqdm.tqdm(range(num_episodes)):
-            for key in image_keys:
-                tmp_imgs_dir = videos_dir / f"{key}_episode_{episode_index:06d}"
-                fname = f"{key}_episode_{episode_index:06d}.mp4"
-                video_path = local_dir / "videos" / fname
-                if video_path.exists():
-                    # Skip if video is already encoded. Could be the case when resuming data recording.
-                    continue
-                # note: `encode_video_frames` is a blocking call. Making it asynchronous shouldn't speedup encoding,
-                # since video encoding with ffmpeg is already using multithreading.
-                encode_video_frames(tmp_imgs_dir, video_path, fps, overwrite=True)
-                shutil.rmtree(tmp_imgs_dir)
-
-    logging.info("Concatenating episodes")
-    ep_dicts = []
-    for episode_index in tqdm.tqdm(range(num_episodes)):
-        ep_path = episodes_dir / f"episode_{episode_index}.pth"
-        ep_dict = torch.load(ep_path)
-        ep_dicts.append(ep_dict)
-    data_dict = concatenate_episodes(ep_dicts)
-
-    total_frames = data_dict["frame_index"].shape[0]
-    data_dict["index"] = torch.arange(0, total_frames, 1)
-
-    hf_dataset = to_hf_dataset(data_dict, video)
-    episode_data_index = calculate_episode_data_index(hf_dataset)
-    info = {
-        "codebase_version": CODEBASE_VERSION,
-        "fps": fps,
-        "video": video,
-    }
-    if video:
-        info["encoding"] = get_default_encoding()
-
-    lerobot_dataset = LeRobotDataset.from_preloaded(
-        repo_id=repo_id,
-        hf_dataset=hf_dataset,
-        episode_data_index=episode_data_index,
-        info=info,
-        videos_dir=videos_dir,
-    )
-    if run_compute_stats:
-        logging.info("Computing dataset statistics")
-        say("Computing dataset statistics")
-        stats = compute_stats(lerobot_dataset)
-        lerobot_dataset.stats = stats
-    else:
-        stats = {}
-        logging.info("Skipping computation of the dataset statistics")
-
-    hf_dataset = hf_dataset.with_format(None)  # to remove transforms that cant be saved
-    hf_dataset.save_to_disk(str(local_dir / "train"))
-
-    meta_data_dir = local_dir / "meta_data"
-    save_meta_data(info, stats, episode_data_index, meta_data_dir)
-
-    if push_to_hub:
-        hf_dataset.push_to_hub(repo_id, revision="main")
-        push_meta_data_to_hub(repo_id, meta_data_dir, revision="main")
-        push_dataset_card_to_hub(repo_id, revision="main", tags=tags)
-        if video:
-            push_videos_to_hub(repo_id, videos_dir, revision="main")
-        create_branch(repo_id, repo_type="dataset", branch=CODEBASE_VERSION)
-
-    logging.info("Exiting")
-    say("Exiting")
+    log_say("Exiting", play_sounds)
    return lerobot_dataset


-def replay(robot: Robot, episode: int, fps: int | None = None, root="data", repo_id="lerobot/debug"):
+@safe_disconnect
+def replay(
+    robot: Robot, episode: int, fps: int | None = None, root="data", repo_id="lerobot/debug", play_sounds=True
+):
+    # TODO(rcadene, aliberts): refactor with control_loop, once `dataset` is an instance of LeRobotDataset
    # TODO(rcadene): Add option to record logs
    local_dir = Path(root) / repo_id
    if not local_dir.exists():
@@ -781,8 +331,7 @@ def replay(robot: Robot, episode: int, fps: int | None = None, root="data", repo
    if not robot.is_connected:
        robot.connect()

-    logging.info("Replaying episode")
-    say("Replaying episode", blocking=True)
+    log_say("Replaying episode", play_sounds, blocking=True)
    for idx in range(from_idx, to_idx):
        start_episode_t = time.perf_counter()

@@ -827,6 +376,12 @@ if __name__ == "__main__":
    parser_teleop.add_argument(
        "--fps", type=none_or_int, default=None, help="Frames per second (set to None to disable)"
    )
+    parser_teleop.add_argument(
+        "--display-cameras",
+        type=int,
+        default=1,
+        help="Display all cameras on screen (set to 1 to display or 0 to disable).",
+    )

    parser_record = subparsers.add_parser("record", parents=[base_parser])
    parser_record.add_argument(
@@ -882,12 +437,23 @@ if __name__ == "__main__":
        help="Add tags to your dataset on the hub.",
    )
    parser_record.add_argument(
-        "--num-image-writers-per-camera",
+        "--num-image-writer-processes",
+        type=int,
+        default=0,
+        help=(
+            "Number of subprocesses handling the saving of frames as PNGs. Set to 0 to use threads only; "
+            "set to ≥1 to use subprocesses, each using threads to write images. The best number of processes "
+            "and threads depends on your system. We recommend 4 threads per camera with 0 processes. "
+            "If fps is unstable, adjust the thread count. If still unstable, try using 1 or more subprocesses."
+        ),
+    )
+    parser_record.add_argument(
+        "--num-image-writer-threads-per-camera",
        type=int,
        default=4,
        help=(
            "Number of threads writing the frames as png images on disk, per camera. "
-            "Too much threads might cause unstable teleoperation fps due to main thread being blocked. "
+            "Too many threads might cause unstable teleoperation fps due to main thread being blocked. "
            "Not enough threads might cause low camera fps."
        ),
    )
@@ -953,19 +519,7 @@ if __name__ == "__main__":
        teleoperate(robot, **kwargs)

    elif control_mode == "record":
-        pretrained_policy_name_or_path = args.pretrained_policy_name_or_path
-        policy_overrides = args.policy_overrides
-        del kwargs["pretrained_policy_name_or_path"]
-        del kwargs["policy_overrides"]
-
-        policy_cfg = None
-        if pretrained_policy_name_or_path is not None:
-            pretrained_policy_path = get_pretrained_policy_path(pretrained_policy_name_or_path)
-            policy_cfg = init_hydra_config(pretrained_policy_path / "config.yaml", policy_overrides)
-            policy = make_policy(hydra_cfg=policy_cfg, pretrained_policy_name_or_path=pretrained_policy_path)
-            record(robot, policy, policy_cfg, **kwargs)
-        else:
-            record(robot, **kwargs)
+        record(robot, **kwargs)

    elif control_mode == "replay":
        replay(robot, **kwargs)
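Reviewer sketch (not part of the patch): the fps handling that `record()` now performs after loading a pretrained policy can be summarized as a small pure function. The name `resolve_fps` is hypothetical and does not exist in the repository; it only mirrors the `if fps is None / elif fps != policy_fps` branch shown in the hunk above, under the assumption that the caller-provided fps wins on a mismatch.

```python
import logging
from typing import Optional


def resolve_fps(fps: Optional[int], policy_fps: Optional[int]) -> Optional[int]:
    """Hypothetical helper mirroring the fps branch in record()."""
    if policy_fps is None:
        # No pretrained policy was loaded: keep whatever the caller passed.
        return fps
    if fps is None:
        # Fall back to the fps stored in the policy config.
        logging.warning("No fps provided, so using the fps from policy config (%s).", policy_fps)
        return policy_fps
    if fps != policy_fps:
        # A mismatch is only reported; the provided fps is still used.
        logging.warning(
            "There is a mismatch between the provided fps (%s) and the one from policy config (%s).",
            fps,
            policy_fps,
        )
    return fps
```

Keeping this decision in one place makes it easy to unit test without a robot or a policy checkpoint.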
--- a/lerobot/scripts/find_motors_bus_port.py
+++ b/lerobot/scripts/find_motors_bus_port.py
@@ -0,0 +1,36 @@
+import time
+from pathlib import Path
+
+
+def find_available_ports():
+    ports = []
+    for path in Path("/dev").glob("tty*"):
+        ports.append(str(path))
+    return ports
+
+
+def find_port():
+    print("Finding all available ports for the MotorsBus.")
+    ports_before = find_available_ports()
+    print(ports_before)
+
+    print("Remove the usb cable from your MotorsBus and press Enter when done.")
+    input()
+
+    time.sleep(0.5)
+    ports_after = find_available_ports()
+    ports_diff = list(set(ports_before) - set(ports_after))
+
+    if len(ports_diff) == 1:
+        port = ports_diff[0]
+        print(f"The port of this MotorsBus is '{port}'")
+        print("Reconnect the usb cable.")
+    elif len(ports_diff) == 0:
+        raise OSError(f"Could not detect the port. No difference was found ({ports_diff}).")
+    else:
+        raise OSError(f"Could not detect the port. More than one port was found ({ports_diff}).")
+
+
+if __name__ == "__main__":
+    # Helper to find the USB port associated with your MotorsBus.
+    find_port()
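Reviewer sketch (not part of the patch): the detection in `find_port()` boils down to a set difference between the port listing taken before and after unplugging the cable. The function `detect_removed_port` below is a hypothetical, side-effect-free restatement of that logic, written so it can be tested without touching `/dev`; the error messages are paraphrased from the script.

```python
from typing import Iterable


def detect_removed_port(ports_before: Iterable[str], ports_after: Iterable[str]) -> str:
    """Return the single port present before unplugging but absent afterwards."""
    ports_diff = sorted(set(ports_before) - set(ports_after))
    if len(ports_diff) == 1:
        return ports_diff[0]
    if not ports_diff:
        raise OSError(f"Could not detect the port. No difference was found ({ports_diff}).")
    raise OSError(f"Could not detect the port. More than one port was found ({ports_diff}).")
```

For example, `detect_removed_port(["/dev/tty0", "/dev/ttyACM0"], ["/dev/tty0"])` returns `"/dev/ttyACM0"`, while identical listings raise `OSError`.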
--- a/lerobot/scripts/train.py
+++ b/lerobot/scripts/train.py
@@ -93,6 +93,18 @@ def make_optimizer_and_scheduler(cfg, policy):
    elif policy.name == "tdmpc":
        optimizer = torch.optim.Adam(policy.parameters(), cfg.training.lr)
        lr_scheduler = None
+
+    elif policy.name == "tdmpc2":
+        params_group = [
+            {"params": policy.model._encoder.parameters(), "lr": cfg.training.lr * cfg.training.enc_lr_scale},
+            {"params": policy.model._dynamics.parameters()},
+            {"params": policy.model._reward.parameters()},
+            {"params": policy.model._Qs.parameters()},
+            {"params": policy.model._pi.parameters(), "eps": 1e-5},
+        ]
+        optimizer = torch.optim.Adam(params_group, lr=cfg.training.lr)
+        lr_scheduler = None
+
    elif cfg.policy.name == "vqbet":
        from lerobot.common.policies.vqbet.modeling_vqbet import VQBeTOptimizer, VQBeTScheduler

@@ -383,7 +395,7 @@ def train(cfg: DictConfig, out_dir: str | None = None, job_name: str | None = No
            logging.info(f"Checkpoint policy after step {step}")
            # Note: Save with step as the identifier, and format it to have at least 6 digits but more if
            # needed (choose 6 as a minimum for consistency without being overkill).
-            logger.save_checkpont(
+            logger.save_checkpoint(
                step,
                policy,
                optimizer,
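Reviewer sketch (not part of the patch): the `tdmpc2` branch added to `make_optimizer_and_scheduler` relies on per-parameter-group options, where a group that omits a key (such as `lr` or `eps`) falls back to the default passed to the optimizer constructor. The helper `build_param_groups` and the module names used as dictionary keys are hypothetical; the sketch builds the same list-of-dicts shape in plain Python so it runs without PyTorch.

```python
from typing import Any, Dict, List


def build_param_groups(
    params_by_module: Dict[str, Any], lr: float, enc_lr_scale: float
) -> List[Dict[str, Any]]:
    """Build optimizer parameter groups with per-module overrides (hypothetical helper)."""
    groups = []
    for name, params in params_by_module.items():
        group: Dict[str, Any] = {"params": params}
        if name == "encoder":
            # The encoder trains with a scaled learning rate.
            group["lr"] = lr * enc_lr_scale
        elif name == "pi":
            # The policy head uses a larger epsilon than the Adam default.
            group["eps"] = 1e-5
        # Other modules carry no overrides and inherit the optimizer defaults.
        groups.append(group)
    return groups
```

In the patch itself, the resulting list is passed to `torch.optim.Adam(params_group, lr=cfg.training.lr)`, so only the encoder and policy groups deviate from the shared defaults.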
--- a/lerobot/templates/visualize_dataset_template.html
+++ b/lerobot/templates/visualize_dataset_template.html
@@ -250,7 +250,7 @@
                    if(!canPlayVideos){
                        this.videoCodecError = true;
                    }
-                    
+
                    // process CSV data
                    this.videos = document.querySelectorAll('video');
                    this.video = this.videos[0];
--- a/media/gym/aloha_act.gif
+++ b/media/gym/aloha_act.gif
--- a/media/gym/pusht_diffusion.gif
+++ b/media/gym/pusht_diffusion.gif
--- a/media/gym/simxarm_tdmpc.gif
+++ b/media/gym/simxarm_tdmpc.gif
--- a/media/moss/follower_initial.webp
+++ b/media/moss/follower_initial.webp
--- a/media/moss/follower_rest.webp
+++ b/media/moss/follower_rest.webp
--- a/media/moss/follower_rotated.webp
+++ b/media/moss/follower_rotated.webp
--- a/media/moss/follower_zero.webp
+++ b/media/moss/follower_zero.webp
--- a/media/moss/leader_rest.webp
+++ b/media/moss/leader_rest.webp
--- a/media/moss/leader_rotated.webp
+++ b/media/moss/leader_rotated.webp
--- a/media/moss/leader_zero.webp
+++ b/media/moss/leader_zero.webp
--- a/media/so100/follower_initial.webp
+++ b/media/so100/follower_initial.webp
--- a/media/so100/follower_rest.webp
+++ b/media/so100/follower_rest.webp
--- a/media/so100/follower_rotated.webp
+++ b/media/so100/follower_rotated.webp
--- a/media/so100/follower_zero.webp
+++ b/media/so100/follower_zero.webp
--- a/media/so100/leader_follower.webp
+++ b/media/so100/leader_follower.webp
--- a/media/so100/leader_rest.webp
+++ b/media/so100/leader_rest.webp
--- a/media/so100/leader_rotated.webp
+++ b/media/so100/leader_rotated.webp
--- a/media/so100/leader_zero.webp
+++ b/media/so100/leader_zero.webp
--- a/poetry.lock
+++ b/poetry.lock
@@ -1413,6 +1413,19 @@ files = [
 [package.extras]
 devel = ["colorama", "json-spec", "jsonschema", "pylint", "pytest", "pytest-benchmark", "pytest-cache", "validictory"]

+[[package]]
+name = "feetech-servo-sdk"
+version = "1.0.0"
+description = "This is source code from official feetech repository"
+optional = true
+python-versions = "*"
+files = [
+    {file = "feetech-servo-sdk-1.0.0.tar.gz", hash = "sha256:d4d3832e4b1b22a8222133a414db9f868224c2fb639426a1b11d96ddfe84e69c"},
+]
+
+[package.dependencies]
+pyserial = "*"
+
 [[package]]
 name = "filelock"
 version = "3.16.1"
@@ -5245,7 +5258,7 @@ docs = ["sphinx", "sphinx-automodapi", "sphinx-rtd-theme"]
 name = "pyserial"
 version = "3.5"
 description = "Python Serial Port Extension"
-optional = false
+optional = true
 python-versions = "*"
 files = [
    {file = "pyserial-3.5-py2.py3-none-any.whl", hash = "sha256:c4451db6ba391ca6ca299fb3ec7bae67a5c55dde170964c7a14ceefec02f2cf0"},
@@ -7416,6 +7429,7 @@ aloha = ["gym-aloha"]
 dev = ["debugpy", "pre-commit"]
 dora = ["gym-dora"]
 dynamixel = ["dynamixel-sdk", "pynput"]
+feetech = ["feetech-servo-sdk", "pynput"]
 intelrealsense = ["pyrealsense2"]
 pusht = ["gym-pusht"]
 stretch = ["hello-robot-stretch-body", "pynput", "pyrealsense2", "pyrender"]
@@ -7427,4 +7441,4 @@ xarm = ["gym-xarm"]
 [metadata]
 lock-version = "2.0"
 python-versions = ">=3.10,<3.13"
-content-hash = "78f31561a7e4b6f0a97e27a65ec00c2c1826f420d2587396762bb5485d12f676"
+content-hash = "7ff63d36a5524a77cba916d212741082adcb49dfdc0dc49f8bf8ccee53c02254"
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -44,7 +44,8 @@ diffusers = ">=0.27.2"
 torchvision = ">=0.17.1"
 h5py = ">=3.10.0"
 huggingface-hub = {extras = ["hf-transfer", "cli"], version = ">=0.25.0"}
-gymnasium = ">=0.29.1"
+# TODO(rcadene, aliberts): Make gym 1.0.0 work
+gymnasium = "==0.29.1"
 cmake = ">=3.29.0.1"
 gym-dora = { git = "https://github.com/dora-rs/dora-lerobot.git", subdirectory = "gym_dora", optional = true }
 gym-pusht = { version = ">=0.1.5", optional = true}
@@ -64,6 +65,7 @@ pandas = {version = ">=2.2.2", optional = true}
 scikit-image = {version = ">=0.23.2", optional = true}
 dynamixel-sdk = {version = ">=3.7.31", optional = true}
 pynput = {version = ">=1.7.7", optional = true}
+feetech-servo-sdk = {version = ">=1.0.0", optional = true}
 setuptools = {version = "!=71.0.1", optional = true}  # TODO(rcadene, aliberts): 71.0.1 has a bug
 pyrealsense2 = {version = ">=2.55.1.6486", markers = "sys_platform != 'darwin'", optional = true}  # TODO(rcadene, aliberts): Fix on Mac
 pyrender = {git = "https://github.com/mmatl/pyrender.git", markers = "sys_platform == 'linux'", optional = true}
@@ -81,6 +83,7 @@ test = ["pytest", "pytest-cov", "pyserial"]
 umi = ["imagecodecs"]
 video_benchmark = ["scikit-image", "pandas"]
 dynamixel = ["dynamixel-sdk", "pynput"]
+feetech = ["feetech-servo-sdk", "pynput"]
 intelrealsense = ["pyrealsense2"]
 stretch = ["hello-robot-stretch-body", "pyrender", "pyrealsense2", "pynput"]

--- a/tests/conftest.py
+++ b/tests/conftest.py
@@ -52,8 +52,9 @@ def is_robot_available(robot_type):
            print(f"\nInstall module '{e.name}'")
        elif isinstance(e, SerialException):
            print("\nNo physical motors bus detected.")
+        else:
+            traceback.print_exc()

-        traceback.print_exc()
        return False


@@ -77,8 +78,9 @@ def is_camera_available(camera_type):
            print(f"\nInstall module '{e.name}'")
        elif isinstance(e, ValueError) and "camera_index" in e.args[0]:
            print("\nNo physical camera detected.")
+        else:
+            traceback.print_exc()

-        traceback.print_exc()
        return False


@@ -102,8 +104,9 @@ def is_motor_available(motor_type):
            print(f"\nInstall module '{e.name}'")
        elif isinstance(e, SerialException):
            print("\nNo physical motors bus detected.")
+        else:
+            traceback.print_exc()

-        traceback.print_exc()
        return False


--- a/tests/mock_dynamixel_sdk.py
+++ b/tests/mock_dynamixel_sdk.py
@@ -18,6 +18,19 @@ def convert_to_bytes(value, bytes):
    return value


+def get_default_motor_values(motor_index):
+    return {
+        # Keys (int) are from X_SERIES_CONTROL_TABLE
+        7: motor_index,  # ID
+        8: DEFAULT_BAUDRATE,  # Baud_rate
+        10: 0,  # Drive_Mode
+        64: 0,  # Torque_Enable
+        # Set 2560 since calibration values for Aloha gripper are between start_pos=2499 and end_pos=3144
+        # For other joints, 2560 will be autocorrected to be in calibration range
+        132: 2560,  # Present_Position
+    }
+
+
 class PortHandler:
    def __init__(self, port):
        self.port = port
@@ -52,18 +65,9 @@ class GroupSyncRead:
        self.packet_handler = packet_handler

    def addParam(self, motor_index):  # noqa: N802
+        # Initialize motor default values
        if motor_index not in self.packet_handler.data:
-            # Initialize motor default values
-            self.packet_handler.data[motor_index] = {
-                # Key (int) are from X_SERIES_CONTROL_TABLE
-                7: motor_index,  # ID
-                8: DEFAULT_BAUDRATE,  # Baud_rate
-                10: 0,  # Drive_Mode
-                64: 0,  # Torque_Enable
-                # Set 2560 since calibration values for Aloha gripper is between start_pos=2499 and end_pos=3144
-                # For other joints, 2560 will be autocorrected to be in calibration range
-                132: 2560,  # Present_Position
-            }
+            self.packet_handler.data[motor_index] = get_default_motor_values(motor_index)

    def txRxPacket(self):  # noqa: N802
        return COMM_SUCCESS
@@ -78,6 +82,9 @@ class GroupSyncWrite:
        self.address = address

    def addParam(self, index, data):  # noqa: N802
+        # Initialize motor default values
+        if index not in self.packet_handler.data:
+            self.packet_handler.data[index] = get_default_motor_values(index)
        self.changeParam(index, data)

    def txPacket(self):  # noqa: N802
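Reviewer sketch (not part of the patch): both `GroupSyncRead.addParam` and `GroupSyncWrite.addParam` now lazily initialize a motor's register table the first time its index is seen. The classes and field names below are hypothetical stand-ins for the mock, kept minimal to show that the lazy initialization never overwrites values that were already written.

```python
class FakePacketHandler:
    """Hypothetical stand-in for the mocked packet handler: holds per-motor registers."""

    def __init__(self):
        self.data = {}


def default_motor_values(motor_index):
    # Simplified defaults; the real mock keys these by control-table address.
    return {"id": motor_index, "present_position": 2560}


def ensure_motor(handler, motor_index):
    """Initialize defaults only if this motor has not been seen before."""
    if motor_index not in handler.data:
        handler.data[motor_index] = default_motor_values(motor_index)
    return handler.data[motor_index]
```

Sharing one initializer between the read and write paths is what lets a test write to a motor before ever reading from it.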
--- a/tests/mock_scservo_sdk.py
+++ b/tests/mock_scservo_sdk.py
@@ -0,0 +1,103 @@
+"""Mocked classes and functions from scservo_sdk to allow for continuous integration
+and testing code logic that requires hardware and devices (e.g. robot arms, cameras).
+
+Warning: These mocked versions are minimalist. They do not exactly mock every behavior
+of the original classes and functions (e.g. return types might be None instead of boolean).
+"""
+
+# from dynamixel_sdk import COMM_SUCCESS
+
+DEFAULT_BAUDRATE = 1_000_000
+COMM_SUCCESS = 0  # tx or rx packet communication success
+
+
+def convert_to_bytes(value, bytes):
+    # TODO(rcadene): remove need to mock `convert_to_bytes` by implementing the inverse transform
+    # `convert_bytes_to_value`
+    del bytes  # unused
+    return value
+
+
+def get_default_motor_values(motor_index):
+    return {
+        # Keys (int) are from SCS_SERIES_CONTROL_TABLE
+        5: motor_index,  # ID
+        6: DEFAULT_BAUDRATE,  # Baud_rate
+        10: 0,  # Drive_Mode
+        21: 32,  # P_Coefficient
+        22: 32,  # D_Coefficient
+        23: 0,  # I_Coefficient
+        40: 0,  # Torque_Enable
+        41: 254,  # Acceleration
+        31: -2047,  # Offset
+        33: 0,  # Mode
+        55: 1,  # Lock
+        # Set 2560 since calibration values for Aloha gripper are between start_pos=2499 and end_pos=3144
+        # For other joints, 2560 will be autocorrected to be in calibration range
+        56: 2560,  # Present_Position
+        58: 0,  # Present_Speed
+        69: 0,  # Present_Current
+        85: 150,  # Maximum_Acceleration
+    }
+
+
+class PortHandler:
+    def __init__(self, port):
+        self.port = port
+        # factory default baudrate
+        self.baudrate = DEFAULT_BAUDRATE
+
+    def openPort(self):  # noqa: N802
+        return True
+
+    def closePort(self):  # noqa: N802
+        pass
+
+    def setPacketTimeoutMillis(self, timeout_ms):  # noqa: N802
+        del timeout_ms  # unused
+
+    def getBaudRate(self):  # noqa: N802
+        return self.baudrate
+
+    def setBaudRate(self, baudrate):  # noqa: N802
+        self.baudrate = baudrate
+
+
+class PacketHandler:
+    def __init__(self, protocol_version):
+        del protocol_version  # unused
+        # Use packet_handler.data to communicate across Read and Write
+        self.data = {}
+
+
+class GroupSyncRead:
+    def __init__(self, port_handler, packet_handler, address, bytes):
+        self.packet_handler = packet_handler
+
+    def addParam(self, motor_index):  # noqa: N802
+        # Initialize motor default values
+        if motor_index not in self.packet_handler.data:
+            self.packet_handler.data[motor_index] = get_default_motor_values(motor_index)
+
+    def txRxPacket(self):  # noqa: N802
+        return COMM_SUCCESS
+
+    def getData(self, index, address, bytes):  # noqa: N802
+        return self.packet_handler.data[index][address]
+
+
+class GroupSyncWrite:
+    def __init__(self, port_handler, packet_handler, address, bytes):
+        self.packet_handler = packet_handler
+        self.address = address
+
+    def addParam(self, index, data):  # noqa: N802
+        if index not in self.packet_handler.data:
+            self.packet_handler.data[index] = get_default_motor_values(index)
+        self.changeParam(index, data)
+
+    def txPacket(self):  # noqa: N802
+        return COMM_SUCCESS
+
+    def changeParam(self, index, data):  # noqa: N802
+        self.packet_handler.data[index][self.address] = data
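The lazy initialization added to `GroupSyncWrite.addParam` matters when a motor is written to before it has ever been read. The following is a minimal, self-contained sketch of that situation, not the mock itself: the classes are trimmed-down stand-ins with simplified constructors, and only three of the control-table addresses from the diff above are kept.

```python
DEFAULT_BAUDRATE = 1_000_000


def get_default_motor_values(motor_index):
    # Trimmed-down defaults; the addresses follow the diff above
    # (5: ID, 6: Baud_rate, 56: Present_Position).
    return {5: motor_index, 6: DEFAULT_BAUDRATE, 56: 2560}


class PacketHandler:
    def __init__(self):
        # Shared store used by both read and write groups: {motor_index: {address: value}}
        self.data = {}


class GroupSyncWrite:
    def __init__(self, packet_handler, address):
        self.packet_handler = packet_handler
        self.address = address

    def addParam(self, index, data):  # noqa: N802
        # Without this check, the assignment below would raise KeyError
        # for a motor that was never registered through a read.
        if index not in self.packet_handler.data:
            self.packet_handler.data[index] = get_default_motor_values(index)
        self.packet_handler.data[index][self.address] = data


class GroupSyncRead:
    def __init__(self, packet_handler):
        self.packet_handler = packet_handler

    def addParam(self, motor_index):  # noqa: N802
        if motor_index not in self.packet_handler.data:
            self.packet_handler.data[motor_index] = get_default_motor_values(motor_index)

    def getData(self, index, address):  # noqa: N802
        return self.packet_handler.data[index][address]


packet_handler = PacketHandler()

# Write first: motor 3 has never been read, so its defaults are created on the fly.
writer = GroupSyncWrite(packet_handler, address=56)
writer.addParam(3, 1000)

# A later read sees the written position and the default ID.
reader = GroupSyncRead(packet_handler)
reader.addParam(3)
position = reader.getData(3, 56)
motor_id = reader.getData(3, 5)
```

Because both groups share `packet_handler.data`, the value written through one group is visible through the other, which is what lets the tests exercise write-then-read logic without hardware.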
--- a/tests/test_control_robot.py
+++ b/tests/test_control_robot.py
@@ -23,27 +23,33 @@ pytest -sx 'tests/test_control_robot.py::test_teleoperate[aloha-True]'
 ```
 """

+import multiprocessing
 from pathlib import Path
+from unittest.mock import patch

 import pytest

+from lerobot.common.datasets.populate_dataset import add_frame, init_dataset
+from lerobot.common.logger import Logger
 from lerobot.common.policies.factory import make_policy
 from lerobot.common.utils.utils import init_hydra_config
-from lerobot.scripts.control_robot import calibrate, get_available_arms, record, replay, teleoperate
+from lerobot.scripts.control_robot import calibrate, record, replay, teleoperate
+from lerobot.scripts.train import make_optimizer_and_scheduler
 from tests.test_robots import make_robot
-from tests.utils import DEFAULT_CONFIG_PATH, DEVICE, TEST_ROBOT_TYPES, require_robot
+from tests.utils import DEFAULT_CONFIG_PATH, DEVICE, TEST_ROBOT_TYPES, mock_calibration_dir, require_robot


@pytest.mark.parametrize("robot_type, mock", TEST_ROBOT_TYPES)
@require_robot
 def test_teleoperate(tmpdir, request, robot_type, mock):
-    if mock:
+    if mock and robot_type != "aloha":
        request.getfixturevalue("patch_builtins_input")

        # Create an empty calibration directory to trigger manual calibration
        # and avoid writing calibration files in user .cache/calibration folder
        tmpdir = Path(tmpdir)
        calibration_dir = tmpdir / robot_type
+        mock_calibration_dir(calibration_dir)
        overrides = [f"calibration_dir={calibration_dir}"]
    else:
        # Use the default .cache/calibration folder when mock=False
@@ -68,7 +74,7 @@ def test_calibrate(tmpdir, request, robot_type, mock):
    overrides_calibration_dir = [f"calibration_dir={calibration_dir}"]

    robot = make_robot(robot_type, overrides=overrides_calibration_dir, mock=mock)
-    calibrate(robot, arms=get_available_arms(robot))
+    calibrate(robot, arms=robot.available_arms)
    del robot


@@ -78,12 +84,13 @@ def test_record_without_cameras(tmpdir, request, robot_type, mock):
    # Avoid using cameras
    overrides = ["~cameras"]

-    if mock:
+    if mock and robot_type != "aloha":
        request.getfixturevalue("patch_builtins_input")

        # Create an empty calibration directory to trigger manual calibration
        # and avoid writing calibration files in user .cache/calibration folder
        calibration_dir = Path(tmpdir) / robot_type
+        mock_calibration_dir(calibration_dir)
        overrides.append(f"calibration_dir={calibration_dir}")

    root = Path(tmpdir) / "data"
@@ -101,49 +108,55 @@ def test_record_without_cameras(tmpdir, request, robot_type, mock):
        run_compute_stats=False,
        push_to_hub=False,
        video=False,
+        play_sounds=False,
    )


@pytest.mark.parametrize("robot_type, mock", TEST_ROBOT_TYPES)
@require_robot
 def test_record_and_replay_and_policy(tmpdir, request, robot_type, mock):
-    if mock:
+    tmpdir = Path(tmpdir)
+
+    if mock and robot_type != "aloha":
        request.getfixturevalue("patch_builtins_input")

        # Create an empty calibration directory to trigger manual calibration
        # and avoid writing calibration files in user .cache/calibration folder
-        calibration_dir = Path(tmpdir) / robot_type
+        calibration_dir = tmpdir / robot_type
+        mock_calibration_dir(calibration_dir)
        overrides = [f"calibration_dir={calibration_dir}"]
    else:
-        # Use the default .cache/calibration folder when mock=False
+        # Use the default .cache/calibration folder when mock=False or for aloha
        overrides = None

-    if robot_type == "aloha":
-        pytest.skip("TODO(rcadene): enable test once aloha_real and act_aloha_real are merged")
-
    env_name = "koch_real"
    policy_name = "act_koch_real"

-    root = Path(tmpdir) / "data"
+    root = tmpdir / "data"
    repo_id = "lerobot/debug"
+    eval_repo_id = "lerobot/eval_debug"

    robot = make_robot(robot_type, overrides=overrides, mock=mock)
    dataset = record(
        robot,
-        fps=30,
-        root=root,
-        repo_id=repo_id,
+        root,
+        repo_id,
+        fps=1,
        warmup_time_s=1,
        episode_time_s=1,
+        reset_time_s=1,
        num_episodes=2,
        push_to_hub=False,
        # TODO(rcadene, aliberts): test video=True
        video=False,
        # TODO(rcadene): display cameras through cv2 sometimes crashes on mac
        display_cameras=False,
+        play_sounds=False,
    )
+    assert dataset.num_episodes == 2
+    assert len(dataset) == 2

-    replay(robot, episode=0, fps=30, root=root, repo_id=repo_id)
+    replay(robot, episode=0, fps=1, root=root, repo_id=repo_id, play_sounds=False)

    # TODO(rcadene, aliberts): rethink this design
    if robot_type == "aloha":
@@ -152,6 +165,12 @@ def test_record_and_replay_and_policy(tmpdir, request, robot_type, mock):
    elif robot_type in ["koch", "koch_bimanual"]:
        env_name = "koch_real"
        policy_name = "act_koch_real"
+    elif robot_type == "so100":
+        env_name = "so100_real"
+        policy_name = "act_so100_real"
+    elif robot_type == "moss":
+        env_name = "moss_real"
+        policy_name = "act_moss_real"
    else:
        raise NotImplementedError(robot_type)

@@ -164,24 +183,281 @@ def test_record_and_replay_and_policy(tmpdir, request, robot_type, mock):
    if robot_type == "koch_bimanual":
        overrides += ["env.state_dim=12", "env.action_dim=12"]

+    overrides += ["wandb.enable=false"]
+    overrides += ["env.fps=1"]
+
    cfg = init_hydra_config(
        DEFAULT_CONFIG_PATH,
        overrides=overrides,
    )

    policy = make_policy(hydra_cfg=cfg, dataset_stats=dataset.stats)
+    optimizer, lr_scheduler = make_optimizer_and_scheduler(cfg, policy)
+    out_dir = tmpdir / "logger"
+    logger = Logger(cfg, out_dir, wandb_job_name="debug")
+    logger.save_checkpoint(
+        0,
+        policy,
+        optimizer,
+        lr_scheduler,
+        identifier=0,
+    )
+    pretrained_policy_name_or_path = out_dir / "checkpoints/last/pretrained_model"
+
+    # In `examples/9_use_aloha.md`, we advise using `num_image_writer_processes=1`
+    # during inference, to reach constant fps, so we test this here.
+    if robot_type == "aloha":
+        num_image_writer_processes = 1
+
+        # `multiprocessing.set_start_method("spawn", force=True)` avoids a hanging issue
+        # before exiting pytest. However, it outputs the following error in the log:
+        # Traceback (most recent call last):
+        #     File "<string>", line 1, in <module>
+        #     File "/Users/rcadene/miniconda3/envs/lerobot/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
+        #         exitcode = _main(fd, parent_sentinel)
+        #     File "/Users/rcadene/miniconda3/envs/lerobot/lib/python3.10/multiprocessing/spawn.py", line 126, in _main
+        #         self = reduction.pickle.load(from_parent)
+        #     File "/Users/rcadene/miniconda3/envs/lerobot/lib/python3.10/multiprocessing/synchronize.py", line 110, in __setstate__
+        #         self._semlock = _multiprocessing.SemLock._rebuild(*state)
+        # FileNotFoundError: [Errno 2] No such file or directory
+        # TODO(rcadene, aliberts): fix FileNotFoundError in multiprocessing
+        multiprocessing.set_start_method("spawn", force=True)
+    else:
+        num_image_writer_processes = 0

    record(
        robot,
-        policy,
-        cfg,
+        root,
+        eval_repo_id,
+        pretrained_policy_name_or_path,
        warmup_time_s=1,
        episode_time_s=1,
+        reset_time_s=1,
        num_episodes=2,
        run_compute_stats=False,
        push_to_hub=False,
        video=False,
        display_cameras=False,
+        play_sounds=False,
+        num_image_writer_processes=num_image_writer_processes,
    )

+    assert dataset.num_episodes == 2
+    assert len(dataset) == 2
+
    del robot
+
+
+@pytest.mark.parametrize("robot_type, mock", [("koch", True)])
+@require_robot
+def test_resume_record(tmpdir, request, robot_type, mock):
+    if mock and robot_type != "aloha":
+        request.getfixturevalue("patch_builtins_input")
+
+        # Create an empty calibration directory to trigger manual calibration
+        # and avoid writing calibration files in user .cache/calibration folder
+        calibration_dir = tmpdir / robot_type
+        mock_calibration_dir(calibration_dir)
+        overrides = [f"calibration_dir={calibration_dir}"]
+    else:
+        # Use the default .cache/calibration folder when mock=False or for aloha
+        overrides = []
+
+    robot = make_robot(robot_type, overrides=overrides, mock=mock)
+
+    root = Path(tmpdir) / "data"
+    repo_id = "lerobot/debug"
+
+    dataset = record(
+        robot,
+        root,
+        repo_id,
+        fps=1,
+        warmup_time_s=0,
+        episode_time_s=1,
+        num_episodes=1,
+        push_to_hub=False,
+        video=False,
+        display_cameras=False,
+        play_sounds=False,
+        run_compute_stats=False,
+    )
+    assert len(dataset) == 1, "`dataset` should contain only 1 frame"
+
+    init_dataset_return_value = {}
+
+    def wrapped_init_dataset(*args, **kwargs):
+        nonlocal init_dataset_return_value
+        init_dataset_return_value = init_dataset(*args, **kwargs)
+        return init_dataset_return_value
+
+    with patch("lerobot.scripts.control_robot.init_dataset", wraps=wrapped_init_dataset):
+        dataset = record(
+            robot,
+            root,
+            repo_id,
+            fps=1,
+            warmup_time_s=0,
+            episode_time_s=1,
+            num_episodes=2,
+            push_to_hub=False,
+            video=False,
+            display_cameras=False,
+            play_sounds=False,
+            run_compute_stats=False,
+        )
+        assert len(dataset) == 2, "`dataset` should contain 2 frames"
+        assert (
+            init_dataset_return_value["num_episodes"] == 2
+        ), "`init_dataset` should load the previous episode"
+
+
+@pytest.mark.parametrize("robot_type, mock", [("koch", True)])
+@require_robot
+def test_record_with_event_rerecord_episode(tmpdir, request, robot_type, mock):
+    if mock and robot_type != "aloha":
+        request.getfixturevalue("patch_builtins_input")
+
+        # Create an empty calibration directory to trigger manual calibration
+        # and avoid writing calibration files in user .cache/calibration folder
+        calibration_dir = tmpdir / robot_type
+        mock_calibration_dir(calibration_dir)
+        overrides = [f"calibration_dir={calibration_dir}"]
+    else:
+        # Use the default .cache/calibration folder when mock=False or for aloha
+        overrides = []
+
+    robot = make_robot(robot_type, overrides=overrides, mock=mock)
+    with (
+        patch("lerobot.scripts.control_robot.init_keyboard_listener") as mock_listener,
+        patch("lerobot.common.robot_devices.control_utils.add_frame", wraps=add_frame) as mock_add_frame,
+    ):
+        mock_events = {}
+        mock_events["exit_early"] = True
+        mock_events["rerecord_episode"] = True
+        mock_events["stop_recording"] = False
+        mock_listener.return_value = (None, mock_events)
+
+        root = Path(tmpdir) / "data"
+        repo_id = "lerobot/debug"
+
+        dataset = record(
+            robot,
+            root,
+            repo_id,
+            fps=1,
+            warmup_time_s=0,
+            episode_time_s=1,
+            num_episodes=1,
+            push_to_hub=False,
+            video=False,
+            display_cameras=False,
+            play_sounds=False,
+            run_compute_stats=False,
+        )
+
+        assert not mock_events["rerecord_episode"], "`rerecord_episode` wasn't properly reset to False"
+        assert not mock_events["exit_early"], "`exit_early` wasn't properly reset to False"
+        assert mock_add_frame.call_count == 2, "`add_frame` should have been called 2 times"
+        assert len(dataset) == 1, "`dataset` should contain only 1 frame"
+
+
+@pytest.mark.parametrize("robot_type, mock", [("koch", True)])
+@require_robot
+def test_record_with_event_exit_early(tmpdir, request, robot_type, mock):
+    if mock:
+        request.getfixturevalue("patch_builtins_input")
+
+        # Create an empty calibration directory to trigger manual calibration
+        # and avoid writing calibration files in user .cache/calibration folder
+        calibration_dir = tmpdir / robot_type
+        mock_calibration_dir(calibration_dir)
+        overrides = [f"calibration_dir={calibration_dir}"]
+    else:
+        # Use the default .cache/calibration folder when mock=False or for aloha
+        overrides = []
+
+    robot = make_robot(robot_type, overrides=overrides, mock=mock)
+    with (
+        patch("lerobot.scripts.control_robot.init_keyboard_listener") as mock_listener,
+        patch("lerobot.common.robot_devices.control_utils.add_frame", wraps=add_frame) as mock_add_frame,
+    ):
+        mock_events = {}
+        mock_events["exit_early"] = True
+        mock_events["rerecord_episode"] = False
+        mock_events["stop_recording"] = False
+        mock_listener.return_value = (None, mock_events)
+
+        root = Path(tmpdir) / "data"
+        repo_id = "lerobot/debug"
+
+        dataset = record(
+            robot,
+            fps=2,
+            root=root,
+            repo_id=repo_id,
+            warmup_time_s=0,
+            episode_time_s=1,
+            num_episodes=1,
+            push_to_hub=False,
+            video=False,
+            display_cameras=False,
+            play_sounds=False,
+            run_compute_stats=False,
+        )
+
+        assert not mock_events["exit_early"], "`exit_early` wasn't properly reset to False"
+        assert mock_add_frame.call_count == 1, "`add_frame` should have been called 1 time"
+        assert len(dataset) == 1, "`dataset` should contain only 1 frame"
+
+
+@pytest.mark.parametrize(
+    "robot_type, mock, num_image_writer_processes", [("koch", True, 0), ("koch", True, 1)]
+)
+@require_robot
+def test_record_with_event_stop_recording(tmpdir, request, robot_type, mock, num_image_writer_processes):
+    if mock:
+        request.getfixturevalue("patch_builtins_input")
+
+        # Create an empty calibration directory to trigger manual calibration
+        # and avoid writing calibration files in user .cache/calibration folder
+        calibration_dir = tmpdir / robot_type
+        mock_calibration_dir(calibration_dir)
+        overrides = [f"calibration_dir={calibration_dir}"]
+    else:
+        # Use the default .cache/calibration folder when mock=False or for aloha
+        overrides = []
+
+    robot = make_robot(robot_type, overrides=overrides, mock=mock)
+    with (
+        patch("lerobot.scripts.control_robot.init_keyboard_listener") as mock_listener,
+        patch("lerobot.common.robot_devices.control_utils.add_frame", wraps=add_frame) as mock_add_frame,
+    ):
+        mock_events = {}
+        mock_events["exit_early"] = True
+        mock_events["rerecord_episode"] = False
+        mock_events["stop_recording"] = True
+        mock_listener.return_value = (None, mock_events)
+
+        root = Path(tmpdir) / "data"
+        repo_id = "lerobot/debug"
+
+        dataset = record(
+            robot,
+            root,
+            repo_id,
+            fps=1,
+            warmup_time_s=0,
+            episode_time_s=1,
+            num_episodes=2,
+            push_to_hub=False,
+            video=False,
+            display_cameras=False,
+            play_sounds=False,
+            run_compute_stats=False,
+            num_image_writer_processes=num_image_writer_processes,
+        )
+
+        assert not mock_events["exit_early"], "`exit_early` wasn't properly reset to False"
+        assert mock_add_frame.call_count == 1, "`add_frame` should have been called 1 time"
+        assert len(dataset) == 1, "`dataset` should contain only 1 frame"
--- a/tests/test_motors.py
+++ b/tests/test_motors.py
@@ -30,8 +30,8 @@ import time
 import numpy as np
 import pytest

-from lerobot.common.robot_devices.motors.dynamixel import find_port
 from lerobot.common.robot_devices.utils import RobotDeviceAlreadyConnectedError, RobotDeviceNotConnectedError
+from lerobot.scripts.find_motors_bus_port import find_port
 from tests.utils import TEST_MOTOR_TYPES, make_motors_bus, require_motor


@@ -52,12 +52,24 @@ def test_configure_motors_all_ids_1(request, motor_type, mock):
    if mock:
        request.getfixturevalue("patch_builtins_input")

+    if motor_type == "dynamixel":
+        # see X_SERIES_BAUDRATE_TABLE
+        smaller_baudrate = 9_600
+        smaller_baudrate_value = 0
+    elif motor_type == "feetech":
+        # see SCS_SERIES_BAUDRATE_TABLE
+        smaller_baudrate = 19_200
+        smaller_baudrate_value = 7
+    else:
+        raise ValueError(motor_type)
+
    input("Are you sure you want to re-configure the motors? Press enter to continue...")
     # This test expects the configuration to already be correct.
    motors_bus = make_motors_bus(motor_type, mock=mock)
    motors_bus.connect()
-    motors_bus.write("Baud_Rate", [0] * len(motors_bus.motors))
-    motors_bus.set_bus_baudrate(9_600)
+    motors_bus.write("Baud_Rate", [smaller_baudrate_value] * len(motors_bus.motors))
+
+    motors_bus.set_bus_baudrate(smaller_baudrate)
    motors_bus.write("ID", [1] * len(motors_bus.motors))
    del motors_bus
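The `Baud_Rate` register stores an index into a per-family baudrate table, not the baudrate itself, which is why the test writes `0` for Dynamixel and `7` for Feetech. A minimal sketch of that lookup, assuming tables shaped like the ones named in the comments above: only the two entries quoted in the test are included, and `baudrate_register_value` is a hypothetical helper.

```python
# Partial tables: only the entries quoted in the test above are reproduced here.
X_SERIES_BAUDRATE_TABLE = {0: 9_600}
SCS_SERIES_BAUDRATE_TABLE = {7: 19_200}

BAUDRATE_TABLES = {
    "dynamixel": X_SERIES_BAUDRATE_TABLE,
    "feetech": SCS_SERIES_BAUDRATE_TABLE,
}


def baudrate_register_value(motor_type, baudrate):
    """Return the register value that selects `baudrate` for the given motor family."""
    if motor_type not in BAUDRATE_TABLES:
        raise ValueError(motor_type)
    for value, rate in BAUDRATE_TABLES[motor_type].items():
        if rate == baudrate:
            return value
    raise ValueError(f"Baudrate {baudrate} is not in the table for '{motor_type}'.")


dynamixel_value = baudrate_register_value("dynamixel", 9_600)
feetech_value = baudrate_register_value("feetech", 19_200)
```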

--- a/tests/test_robots.py
+++ b/tests/test_robots.py
@@ -30,7 +30,7 @@ import torch

 from lerobot.common.robot_devices.robots.manipulator import ManipulatorRobot
 from lerobot.common.robot_devices.utils import RobotDeviceAlreadyConnectedError, RobotDeviceNotConnectedError
-from tests.utils import TEST_ROBOT_TYPES, make_robot, require_robot
+from tests.utils import TEST_ROBOT_TYPES, make_robot, mock_calibration_dir, require_robot


@pytest.mark.parametrize("robot_type, mock", TEST_ROBOT_TYPES)
@@ -39,7 +39,6 @@ def test_robot(tmpdir, request, robot_type, mock):
    # TODO(rcadene): measure fps in nightly?
    # TODO(rcadene): test logs
    # TODO(rcadene): add compatibility with other robots
-
    robot_kwargs = {"robot_type": robot_type}

    if robot_type == "aloha" and mock:
@@ -54,6 +53,7 @@ def test_robot(tmpdir, request, robot_type, mock):
        tmpdir = Path(tmpdir)
        calibration_dir = tmpdir / robot_type
        overrides_calibration_dir = [f"calibration_dir={calibration_dir}"]
+        mock_calibration_dir(calibration_dir)
        robot_kwargs["calibration_dir"] = calibration_dir

    # Test connecting without devices raises an error
@@ -127,6 +127,7 @@ def test_robot(tmpdir, request, robot_type, mock):
            # TODO(rcadene): skipping image for now as it's challenging to assess equality between two consecutive frames
            continue
        assert torch.allclose(captured_observation[name], observation[name], atol=1)
+        assert captured_observation[name].shape == observation[name].shape

    # Test send_action can run
    robot.send_action(action["action"])
--- a/tests/utils.py
+++ b/tests/utils.py
@@ -13,10 +13,12 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
+import json
 import os
 import platform
 from copy import copy
 from functools import wraps
+from pathlib import Path

 import pytest
 import torch
@@ -52,7 +54,7 @@ for motor_type in available_motors:
 OPENCV_CAMERA_INDEX = int(os.environ.get("LEROBOT_TEST_OPENCV_CAMERA_INDEX", 0))
 INTELREALSENSE_CAMERA_INDEX = int(os.environ.get("LEROBOT_TEST_INTELREALSENSE_CAMERA_INDEX", 128422271614))

-DYNAMIXEL_PORT = "/dev/tty.usbmodem575E0032081"
+DYNAMIXEL_PORT = os.environ.get("LEROBOT_TEST_DYNAMIXEL_PORT", "/dev/tty.usbmodem575E0032081")
 DYNAMIXEL_MOTORS = {
    "shoulder_pan": [1, "xl430-w250"],
    "shoulder_lift": [2, "xl430-w250"],
@@ -62,6 +64,16 @@ DYNAMIXEL_MOTORS = {
    "gripper": [6, "xl330-m288"],
 }

+FEETECH_PORT = os.environ.get("LEROBOT_TEST_FEETECH_PORT", "/dev/tty.usbmodem585A0080971")
+FEETECH_MOTORS = {
+    "shoulder_pan": [1, "sts3215"],
+    "shoulder_lift": [2, "sts3215"],
+    "elbow_flex": [3, "sts3215"],
+    "wrist_flex": [4, "sts3215"],
+    "wrist_roll": [5, "sts3215"],
+    "gripper": [6, "sts3215"],
+}
+

 def require_x86_64_kernel(func):
    """
@@ -271,13 +283,39 @@ def require_motor(func):
    return wrapper


+def mock_calibration_dir(calibration_dir):
+    # TODO(rcadene): remove this hack
+    # calibration file produced with Moss v1, but works with Koch, Koch bimanual and SO-100
+    example_calib = {
+        "homing_offset": [-1416, -845, 2130, 2872, 1950, -2211],
+        "drive_mode": [0, 0, 1, 1, 1, 0],
+        "start_pos": [1442, 843, 2166, 2849, 1988, 1835],
+        "end_pos": [2440, 1869, -1106, -1848, -926, 3235],
+        "calib_mode": ["DEGREE", "DEGREE", "DEGREE", "DEGREE", "DEGREE", "LINEAR"],
+        "motor_names": ["shoulder_pan", "shoulder_lift", "elbow_flex", "wrist_flex", "wrist_roll", "gripper"],
+    }
+    Path(str(calibration_dir)).mkdir(parents=True, exist_ok=True)
+    with open(calibration_dir / "main_follower.json", "w") as f:
+        json.dump(example_calib, f)
+    with open(calibration_dir / "main_leader.json", "w") as f:
+        json.dump(example_calib, f)
+    with open(calibration_dir / "left_follower.json", "w") as f:
+        json.dump(example_calib, f)
+    with open(calibration_dir / "left_leader.json", "w") as f:
+        json.dump(example_calib, f)
+    with open(calibration_dir / "right_follower.json", "w") as f:
+        json.dump(example_calib, f)
+    with open(calibration_dir / "right_leader.json", "w") as f:
+        json.dump(example_calib, f)
+
+
 def make_robot(robot_type: str, overrides: list[str] | None = None, mock=False) -> Robot:
    if mock:
        overrides = [] if overrides is None else copy(overrides)

         # Explicitly add mock argument to the cameras and set it to true
        # TODO(rcadene, aliberts): redesign when we drop hydra
-        if robot_type == "koch":
+        if robot_type in ["koch", "so100", "moss"]:
            overrides.append("+leader_arms.main.mock=true")
            overrides.append("+follower_arms.main.mock=true")
            if "~cameras" not in overrides:
@@ -338,5 +376,12 @@ def make_motors_bus(motor_type: str, **kwargs) -> MotorsBus:
        motors = kwargs.pop("motors", DYNAMIXEL_MOTORS)
        return DynamixelMotorsBus(port, motors, **kwargs)

+    elif motor_type == "feetech":
+        from lerobot.common.robot_devices.motors.feetech import FeetechMotorsBus
+
+        port = kwargs.pop("port", FEETECH_PORT)
+        motors = kwargs.pop("motors", FEETECH_MOTORS)
+        return FeetechMotorsBus(port, motors, **kwargs)
+
    else:
        raise ValueError(f"The motor type '{motor_type}' is not valid.")
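The six near-identical `open`/`json.dump` pairs in `mock_calibration_dir` could be written as a loop over arm names. The sketch below is one possible shape for that, not the code in the diff: `write_mock_calibration` and `ARM_NAMES` are hypothetical names, and a temporary directory stands in for pytest's `tmpdir`.

```python
import json
import tempfile
from pathlib import Path

# File stems taken from the diff above
ARM_NAMES = [
    "main_follower",
    "main_leader",
    "left_follower",
    "left_leader",
    "right_follower",
    "right_leader",
]

# Calibration values copied from the diff above
example_calib = {
    "homing_offset": [-1416, -845, 2130, 2872, 1950, -2211],
    "drive_mode": [0, 0, 1, 1, 1, 0],
    "start_pos": [1442, 843, 2166, 2849, 1988, 1835],
    "end_pos": [2440, 1869, -1106, -1848, -926, 3235],
    "calib_mode": ["DEGREE", "DEGREE", "DEGREE", "DEGREE", "DEGREE", "LINEAR"],
    "motor_names": ["shoulder_pan", "shoulder_lift", "elbow_flex", "wrist_flex", "wrist_roll", "gripper"],
}


def write_mock_calibration(calibration_dir, calib):
    """Write the same calibration dict once per arm and return the written paths."""
    calibration_dir = Path(str(calibration_dir))
    calibration_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for arm_name in ARM_NAMES:
        path = calibration_dir / f"{arm_name}.json"
        with open(path, "w") as f:
            json.dump(calib, f)
        paths.append(path)
    return paths


with tempfile.TemporaryDirectory() as tmp:
    written = write_mock_calibration(Path(tmp) / "koch", example_calib)
    file_names = sorted(path.name for path in written)
    with open(written[0]) as f:
        reloaded = json.load(f)
```

Converting through `Path(str(...))` once at the top keeps the helper usable with both `pathlib.Path` and the `py.path.local` objects that pytest's `tmpdir` fixture provides.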
Author	SHA1	Message	Date
Michel Aractingi	14490148f3	added tdmpc2 to policy factory; shape fixes in tdmpc2	2024-11-26 11:58:29 +00:00
Michel Aractingi	16edbbdeee	fixes and updated comments	2024-11-26 09:46:59 +00:00
Michel Aractingi	15090c2544	config comments	2024-11-25 09:51:33 +00:00
Michel Aractingi	166c1fc776	updated configuration parameters	2024-11-22 17:11:47 +00:00
Michel Aractingi	31984645da	simplified estimate_value function in policy	2024-11-21 17:03:30 +00:00
Michel Aractingi	c41ec08ec1	remove self.model_target and added a target q ensemble only without the need to copy the entire policy	2024-11-21 15:00:03 +00:00
Michel Aractingi	a146544765	added new implementation of tdmpc2	2024-11-20 17:30:19 +00:00
Ivelin Ivanov	963738d983	fix: broken images and a few minor typos in README (#499) Signed-off-by: ivelin <ivelin117@gmail.com>	2024-11-05 15:30:59 +01:00
Arsen Ohanyan	e0df56de62	Fix config file (#495)	2024-10-31 16:41:49 +01:00
Hirokazu Ishida	538455a965	feat: enable to use multiple rgb encoders per camera in diffusion policy (#484) Co-authored-by: Alexander Soare <alexander.soare159@gmail.com>	2024-10-30 11:00:05 +01:00
Remi	172809a502	[Fix] Move back to manual calibration (#488)	2024-10-26 15:27:21 +02:00
Remi	55e4ff6742	Fix autocalib moss (#486)	2024-10-26 12:15:17 +02:00
Remi	07e8716315	Add FeetechMotorsBus, SO-100, Moss-v1 (#419) Co-authored-by: jess-moss <jess.moss@huggingface.co> Co-authored-by: Simon Alibert <75076266+aliberts@users.noreply.github.com>	2024-10-25 11:23:55 +02:00
Arsen Ohanyan	114870d703	Fix link (#482) Co-authored-by: Remi <remi.cadene@huggingface.co>	2024-10-23 16:24:06 +02:00
Bastian Krohg	2efee45ef1	Update 9_use_aloha.md, missing comma (#479)	2024-10-23 16:13:26 +02:00
Boris Zimka	c351e1fff9	Fix gymnasium version as pre-1.0.0 (#471) Co-authored-by: Remi <re.cadene@gmail.com> Co-authored-by: Remi <remi.cadene@huggingface.co>	2024-10-18 10:23:27 +02:00
Alexander Soare	cd0fc261c0	Make `say(blocking=True)` work for Linux (#460)	2024-10-17 15:22:21 +01:00
Remi	77478d50e5	Refactor `record` with `add_frame` (#468) Co-authored-by: Simon Alibert <75076266+aliberts@users.noreply.github.com>	2024-10-16 20:51:35 +02:00
Remi	97b1feb0b3	Add policy/act_aloha_real.yaml + env/act_real.yaml (#429) Co-authored-by: Simon Alibert <75076266+aliberts@users.noreply.github.com>	2024-10-10 17:12:45 +02:00
Eugene Mironov	c29e70e5a1	Fix issue with wrong using index instead of camera_index in opencv (#466)	2024-10-09 11:35:19 +02:00
Remi	d5b669634a	Fix nightly by updating .cache in dockerignore (#464)	2024-10-07 11:35:35 +02:00