Compare commits

...

121 Commits

Author SHA1 Message Date
Remi Cadene
abbf9c5bee Fix record 2024-10-24 15:27:33 +02:00
Remi Cadene
102e1d4436 Merge remote-tracking branch 'origin/user/aliberts/2024_09_25_reshape_dataset' into user/rcadene/2024_10_24_feetech_dataset_v2 2024-10-24 13:23:31 +02:00
Remi Cadene
35dd9f88bf Revert "isolate tests"
This reverts commit 8d624643aa.
2024-10-24 11:52:16 +02:00
Remi Cadene
3368e8c4d7 move mock_calibration_dir in utils 2024-10-24 11:45:02 +02:00
Remi Cadene
8d624643aa isolate tests 2024-10-24 11:44:39 +02:00
Simon Alibert
8bcf81fa24 Add todo 2024-10-24 11:38:32 +02:00
Simon Alibert
615894d3fb Add test_same_attributes_defined 2024-10-24 11:37:44 +02:00
Remi Cadene
4d03ece607 fix unit tests 2024-10-24 11:15:18 +02:00
Remi Cadene
5d64ba5da1 fix unit tests 2024-10-24 10:26:25 +02:00
Simon Alibert
450eae310b Add error msg 2024-10-24 00:13:53 +02:00
Simon Alibert
60865e8980 Allow dataset creation without robot 2024-10-24 00:13:21 +02:00
Simon Alibert
0d77be90ee Move ImageWriter creation inside the dataset 2024-10-23 23:12:44 +02:00
Simon Alibert
0098bd264e Nits 2024-10-23 20:55:54 +02:00
Simon Alibert
1aba80d93f Fix consolidate 2024-10-23 18:45:59 +02:00
Remi Cadene
2b558dfc0c Add mock to feetech 2024-10-23 18:22:52 +02:00
Simon Alibert
07570f867f Fix _query_videos return shapes 2024-10-23 18:18:28 +02:00
Simon Alibert
b8bdbc1c5b Fix check_delta_timestamps 2024-10-23 18:17:56 +02:00
Remi Cadene
67b28e17dc first auto-review 2024-10-23 17:50:40 +02:00
Remi Cadene
68d7ab9729 Merge remote-tracking branch 'origin/main' into user/rcadene/2024_09_04_feetech 2024-10-23 17:01:30 +02:00
Remi Cadene
ea6b27d6c7 Fix configure, autocalibration, Add media 2024-10-23 16:11:26 +02:00
Simon Alibert
7ae8d05326 Fix visualization 2024-10-23 14:20:27 +02:00
Simon Alibert
a2a8538ac9 add write_stats, changes names, add some typing 2024-10-23 11:38:07 +02:00
Simon Alibert
fb73cdb9a4 Update dataset doc 2024-10-23 00:32:28 +02:00
Simon Alibert
9dca233d7e Fix episode chunk 2024-10-23 00:27:14 +02:00
Simon Alibert
c3c0141738 Update & fix conversion script 2024-10-23 00:05:31 +02:00
Simon Alibert
c72dc23c43 Remove total_episodes from default parquet path 2024-10-23 00:03:30 +02:00
Simon Alibert
237a484be0 Fix paths & add add_frame doc 2024-10-22 22:46:34 +02:00
Simon Alibert
6c2cb6e107 Remove populate dataset 2024-10-22 20:21:26 +02:00
Simon Alibert
b46db7ea73 Fix tests 2024-10-22 20:14:06 +02:00
Simon Alibert
ee52b8b782 Add channels to intelrealsense 2024-10-22 20:07:11 +02:00
Simon Alibert
a805458c7e Add local_files_only, encode_videos, fix bugs to pass tests (WIP) 2024-10-22 19:57:52 +02:00
Simon Alibert
e991a31061 Improve consistency between __init__() and create(), WIP on consolidate 2024-10-22 00:19:25 +02:00
Simon Alibert
c4c0a43de7 add delete_episode, WIP on consolidate 2024-10-21 20:10:13 +02:00
Simon Alibert
299451af81 Add add_episode & task logic 2024-10-21 19:30:20 +02:00
Simon Alibert
9ebf8b88ec Merge remote-tracking branch 'origin/main' into user/aliberts/2024_09_25_reshape_dataset 2024-10-21 00:22:30 +02:00
Simon Alibert
c1232a01e2 Add add_frame, empty dataset creation 2024-10-21 00:16:52 +02:00
Simon Alibert
3b925c3dce Add ImageWriter 2024-10-21 00:15:09 +02:00
Simon Alibert
e46bdb9d30 Change card creation 2024-10-20 14:01:10 +02:00
Simon Alibert
9316cf46ef Add file paths 2024-10-20 14:00:19 +02:00
Remi Cadene
1d92acf1e3 fix 2024-10-20 02:56:20 +02:00
Remi Cadene
bee2b3c1b1 Update tutorial 2024-10-19 17:59:05 +02:00
Remi Cadene
48da694883 auto calibration works 2024-10-19 17:52:47 +02:00
Simon Alibert
ac3798bd62 Move default paths, use jsonlines for tasks 2024-10-18 17:53:25 +02:00
Simon Alibert
bce3dc3bfa Add load_metadata 2024-10-18 14:59:09 +02:00
Simon Alibert
1a51505ec6 Add download_metadata, move default paths 2024-10-18 14:48:34 +02:00
Simon Alibert
e7355ba595 Fix episodes.jsonl 2024-10-18 12:03:29 +02:00
Remi Cadene
994209d1b0 Refactor to have dynamixel_calibration and feetech_calibration 2024-10-18 11:31:37 +02:00
Simon Alibert
91e8ce772b Remove caret requirement 2024-10-18 09:48:38 +02:00
Simon Alibert
3a9f964429 Add copyrights 2024-10-18 00:31:21 +02:00
Simon Alibert
beacb7e957 Cleanup 2024-10-18 00:30:16 +02:00
Simon Alibert
be64d54bd9 Update doc 2024-10-18 00:29:50 +02:00
Simon Alibert
354f37aed3 Merge remote-tracking branch 'origin/main' into user/aliberts/2024_09_25_reshape_dataset 2024-10-17 23:34:00 +02:00
Simon Alibert
d0d8193625 Add unitreeh and umi 2024-10-17 23:33:51 +02:00
Simon Alibert
7242c57400 Cleanup 2024-10-17 16:08:37 +02:00
Simon Alibert
3ee3739edc Add batch conversion log 2024-10-17 13:08:58 +02:00
Simon Alibert
ad3f112d16 Add fixes for lfs tracking 2024-10-17 12:58:48 +02:00
Simon Alibert
50a75ad3fe Write episodes as jsonlines 2024-10-17 10:17:27 +02:00
Simon Alibert
c146ba936f Add episode chunks logic, move_videos & lfs tracking fix 2024-10-16 23:34:54 +02:00
Remi Cadene
1990f9c9bc make it work 2024-10-16 20:37:46 +02:00
Remi Cadene
79ac1ad271 fix teleop 2024-10-16 20:36:57 +02:00
Remi Cadene
eeea3e5d1d merge but remove refactor of save_camera_images 2024-10-16 18:10:01 +02:00
Remi Cadene
cfa5ce02b8 Merge remote-tracking branch 'origin/user/rcadene/2024_09_04_feetech' into user/rcadene/2024_09_04_feetech 2024-10-16 16:58:35 +02:00
Remi Cadene
29f6abc997 Add safe_disconnect to replay 2024-10-15 19:38:40 +02:00
Simon Alibert
110264000f Add fixes for batch convert 2024-10-15 19:03:11 +02:00
Remi Cadene
19d410a372 fix unit tests 2024-10-15 18:51:55 +02:00
Remi Cadene
cb30d7a8bf Do not override fps 2024-10-15 18:29:25 +02:00
Remi
3960125b07 Update lerobot/common/robot_devices/robots/manipulator.py
Co-authored-by: Simon Alibert <75076266+aliberts@users.noreply.github.com>
2024-10-15 18:06:36 +02:00
Remi
091177d302 Update .github/workflows/test.yml
Co-authored-by: Simon Alibert <75076266+aliberts@users.noreply.github.com>
2024-10-15 18:06:17 +02:00
Simon Alibert
9433ac52ec WIP add batch convert 2024-10-15 13:08:31 +02:00
Simon Alibert
da78bbfd16 Update load_tasks doc 2024-10-15 11:06:28 +02:00
Simon Alibert
835ab5a81b Cleanup, fix load_tasks 2024-10-15 11:05:16 +02:00
Remi Cadene
cde4945b29 small fix unit tests 2024-10-14 17:19:25 +02:00
Remi Cadene
e0394961cb small fix unit tests 2024-10-14 17:08:08 +02:00
Remi Cadene
87fef10f89 small fix 2024-10-14 16:30:10 +02:00
Remi Cadene
99414f356f Handle fps=None in control_loop 2024-10-14 15:54:25 +02:00
Remi Cadene
94c3a480e1 small fix 2024-10-14 15:27:19 +02:00
Remi Cadene
904aaa497c Refactor -> control_loop() 2024-10-14 15:15:28 +02:00
Simon Alibert
f96773de10 Fix safe_version 2024-10-14 13:51:40 +02:00
Simon Alibert
cbc51e1341 Extend v1 compatibility 2024-10-14 10:14:27 +02:00
Simon Alibert
cf633344be Add multitask support, refactor conversion script 2024-10-13 21:21:40 +02:00
Remi Cadene
20f5ac31fa remove test isolation 2024-10-13 18:38:13 +02:00
Remi Cadene
daf18dc36e isolate tests 2024-10-13 18:32:59 +02:00
Remi Cadene
eed7b55fe3 Fix unit tests 2024-10-13 18:31:34 +02:00
Remi Cadene
d02e204e10 test 2024-10-13 17:15:40 +02:00
Remi Cadene
1f0bdecaf8 test 2024-10-13 17:11:07 +02:00
Remi Cadene
876fcc4a27 setup python 3.10 before 2024-10-13 16:46:55 +02:00
Remi Cadene
71245e3746 python --version 2024-10-13 16:43:10 +02:00
Remi Cadene
b1fd099228 TOREMOVE: isolate test 2024-10-13 16:27:46 +02:00
Remi Cadene
517a261d71 reset_time_s=1 in tests 2024-10-13 16:27:14 +02:00
Remi Cadene
63bd5013bf overall improve, fix some issues with events, add some tests for events 2024-10-13 15:44:14 +02:00
Remi Cadene
9f5c586c1a fix unit test 2024-10-13 12:32:48 +02:00
Remi Cadene
7115fefd21 fix 2024-10-12 20:56:45 +02:00
Remi Cadene
b07f91b710 Refactor record 2024-10-12 20:44:55 +02:00
Simon Alibert
8bd406e607 Add suggestions from code review 2024-10-11 18:52:11 +02:00
Simon Alibert
3ea53124e0 Add padding keys and download_data option 2024-10-11 17:38:47 +02:00
Simon Alibert
7f680886b0 Add huggingface-hub patch for offline snapshot_download with local_dir 2024-10-11 11:03:11 +02:00
Simon Alibert
6d2bc11365 Add doc, scrap video_frame_keys attribute 2024-10-11 10:59:38 +02:00
Simon Alibert
b417cebc4e Update LeRobotDataset.__get_item__ 2024-10-10 21:32:14 +02:00
Simon Alibert
3113038beb Merge remote-tracking branch 'origin/main' into user/aliberts/2024_09_25_reshape_dataset 2024-10-09 14:33:58 +02:00
Simon Alibert
096824b5ff Rework LeRobotDataset.__init__ 2024-10-09 14:33:26 +02:00
Simon Alibert
2d75b93ba0 Update info.json format 2024-10-08 15:31:37 +02:00
Simon Alibert
21ba4b5263 Add pixel channels 2024-10-06 11:16:49 +02:00
Simon Alibert
028c17fd48 Merge remote-tracking branch 'origin/main' into user/aliberts/2024_09_25_reshape_dataset 2024-10-04 18:59:41 +02:00
Simon Alibert
07e113ce21 Add info.json link 2024-10-04 14:36:11 +02:00
Simon Alibert
1016a983a1 Add upload folders 2024-10-04 14:26:50 +02:00
Simon Alibert
17a1214e25 Merge remote-tracking branch 'origin/main' into user/aliberts/2024_09_25_reshape_dataset 2024-10-04 11:22:22 +02:00
Simon Alibert
ad115b6c27 WIP 2024-10-03 20:00:44 +02:00
Remi Cadene
6ca54a11dd WIP 2024-10-03 14:34:28 +02:00
Remi Cadene
62651cb2a9 WIP 2024-09-25 11:31:14 +02:00
Remi Cadene
8b51a20b56 WIP to remove 2024-09-23 22:53:06 +02:00
jess-moss
34dcd05a96 Made script to save camera images instead of these funcitons existing in the camera driver classes.
Fixed test_cameras.py script so it passes.
2024-09-20 15:25:58 -05:00
jess-moss
be5336f9d3 Made script to save camera images instead of these funcitons existing in the camera driver classes. 2024-09-20 14:53:15 -05:00
jess-moss
5aa2a88280 Merge branch 'main' of github.com:huggingface/lerobot into HEAD 2024-09-17 17:49:17 -05:00
jess-moss
aa22946a7f Removed configuration and port finding code from classes no longer necessary. 2024-09-17 17:33:46 -05:00
jess-moss
1c1882e5eb Added find_motor_bus_ports.py 2024-09-17 16:19:26 -05:00
jess-moss
89da9f73b5 Added a configuration script that can be used for feetech and dynamixel calibration. 2024-09-17 15:44:01 -05:00
jess-moss
0035bb962a Merge branch 'user/rcadene/2024_09_04_feetech' of github.com:huggingface/lerobot into user/rcadene/2024_09_04_feetech 2024-09-17 11:42:29 -05:00
Remi Cadene
50c4b51049 TO REMOVE, before merging 2024-09-17 18:35:58 +02:00
Remi Cadene
2826c226d2 2 fixes 2024-09-17 18:35:32 +02:00
Remi Cadene
9d23d04691 WIP 2024-09-11 02:01:29 +02:00
Remi Cadene
dfdfeab366 Add feetech (WIP) 2024-09-11 02:01:23 +02:00
47 changed files with 5373 additions and 1866 deletions

examples/10_use_so100.md (new file, +208 lines)

@@ -0,0 +1,208 @@
This tutorial explains how to use [SO-100](https://github.com/TheRobotStudio/SO-ARM100) with LeRobot.
## Source the parts
Follow this [README](https://github.com/TheRobotStudio/SO-ARM100). It contains the bill of materials with links to source the parts, as well as instructions to 3D print the parts, and advice if it's your first time printing or if you don't already own a 3D printer.
**Important**: Before assembling, you will first need to configure your motors. To this end, we provide a script, so let's install LeRobot first. We will provide a tutorial for assembly next.
## Install LeRobot
On your computer:
1. [Install Miniconda](https://docs.anaconda.com/miniconda/#quick-command-line-install):
```bash
mkdir -p ~/miniconda3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
rm ~/miniconda3/miniconda.sh
~/miniconda3/bin/conda init bash
```
2. Restart shell or `source ~/.bashrc`
3. Create and activate a fresh conda environment for lerobot
```bash
conda create -y -n lerobot python=3.10 && conda activate lerobot
```
4. Clone LeRobot:
```bash
git clone https://github.com/huggingface/lerobot.git ~/lerobot
```
5. Install LeRobot with dependencies for the feetech motors:
```bash
cd ~/lerobot && pip install -e ".[feetech]"
```
For Linux only (not Mac), install extra dependencies for recording datasets:
```bash
conda install -y -c conda-forge ffmpeg
pip uninstall -y opencv-python
conda install -y -c conda-forge "opencv>=4.10.0"
```
## Configure the motors
Run this script twice (once for each arm) to find the ports (e.g. "/dev/tty.usbmodem58760432961") of your motor buses:
```bash
python lerobot/scripts/find_motors_bus_port.py
```
Then plug in your first motor, corresponding to "shoulder_pan", and run this script to set its ID to 1 and its present position and offset to ~2048 (useful for calibration).
```bash
python lerobot/scripts/configure_motor.py \
--port /dev/tty.usbmodem58760432961 \
--brand feetech \
--model sts3215 \
--baudrate 1000000 \
--ID 1
```
Then unplug that motor, plug in the second motor, corresponding to "shoulder_lift", and set its ID to 2.
```bash
python lerobot/scripts/configure_motor.py \
--port /dev/tty.usbmodem58760432961 \
--brand feetech \
--model sts3215 \
--baudrate 1000000 \
--ID 2
```
Repeat the process for the remaining motors, up to the gripper with ID 6. Then do the same for the motors of the leader arm, from ID 1 up to ID 6.
## Assemble the arms
TODO
## Calibrate
```bash
python lerobot/scripts/control_robot.py calibrate \
--robot-path lerobot/configs/robot/so100.yaml \
--robot-overrides '~cameras'
```
## Teleoperate
Without displaying the cameras:
```bash
python lerobot/scripts/control_robot.py teleoperate \
--robot-path lerobot/configs/robot/so100.yaml \
--robot-overrides '~cameras' \
--display-cameras 0
```
With the cameras displayed:
```bash
python lerobot/scripts/control_robot.py teleoperate \
--robot-path lerobot/configs/robot/so100.yaml
```
## Record a dataset
Once you're familiar with teleoperation, you can record your first dataset with so100.
If you want to use the Hugging Face hub features for uploading your dataset and you haven't previously done it, make sure you've logged in using a write-access token, which can be generated from the [Hugging Face settings](https://huggingface.co/settings/tokens):
```bash
huggingface-cli login --token ${HUGGINGFACE_TOKEN} --add-to-git-credential
```
Store your Hugging Face repository name in a variable to run these commands:
```bash
HF_USER=$(huggingface-cli whoami | head -n 1)
echo $HF_USER
```
Record 2 episodes and upload your dataset to the hub:
```bash
python lerobot/scripts/control_robot.py record \
--robot-path lerobot/configs/robot/so100.yaml \
--fps 30 \
--root data \
--repo-id ${HF_USER}/so100_test \
--tags so100 tutorial \
--warmup-time-s 5 \
--episode-time-s 40 \
--reset-time-s 10 \
--num-episodes 2 \
--push-to-hub 1
```
## Visualize a dataset
If you uploaded your dataset to the hub with `--push-to-hub 1`, you can [visualize your dataset online](https://huggingface.co/spaces/lerobot/visualize_dataset) by copy-pasting your repo id, given by:
```bash
echo ${HF_USER}/so100_test
```
If you didn't upload it (with `--push-to-hub 0`), you can also visualize it locally with:
```bash
python lerobot/scripts/visualize_dataset_html.py \
--root data \
--repo-id ${HF_USER}/so100_test
```
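You can also inspect the dataset programmatically with the `LeRobotDataset` class. A minimal sketch (assuming you pushed `so100_test` to the hub under your username; replace `your_username` with your `$HF_USER`):
```python
from lerobot.common.datasets.lerobot_dataset import LeRobotDataset

# Download (or reuse the local cache of) the dataset and load its first episode only.
dataset = LeRobotDataset("your_username/so100_test", episodes=[0])
print(dataset)  # repo id, number of selected episodes/samples, and the info.json contents

frame = dataset[0]  # a dict of torch tensors: "action", "observation.state", camera keys, ...
print(frame["observation.state"].shape)
```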
## Replay an episode
Now try to replay the first episode on your robot:
```bash
DATA_DIR=data python lerobot/scripts/control_robot.py replay \
--robot-path lerobot/configs/robot/so100.yaml \
--fps 30 \
--root data \
--repo-id ${HF_USER}/so100_test \
--episode 0
```
## Train a policy
To train a policy to control your robot, use the [`python lerobot/scripts/train.py`](../lerobot/scripts/train.py) script. A few arguments are required. Here is an example command:
```bash
DATA_DIR=data python lerobot/scripts/train.py \
dataset_repo_id=${HF_USER}/so100_test \
policy=act_so100_real \
env=so100_real \
hydra.run.dir=outputs/train/act_so100_test \
hydra.job.name=act_so100_test \
device=cuda \
wandb.enable=true
```
Let's explain it:
1. We provided the dataset as argument with `dataset_repo_id=${HF_USER}/so100_test`.
2. We provided the policy with `policy=act_so100_real`. This loads configurations from [`lerobot/configs/policy/act_so100_real.yaml`](../lerobot/configs/policy/act_so100_real.yaml). Importantly, this policy uses 2 cameras as input: `laptop` and `phone`.
3. We provided an environment as argument with `env=so100_real`. This loads configurations from [`lerobot/configs/env/so100_real.yaml`](../lerobot/configs/env/so100_real.yaml).
4. We provided `device=cuda` since we are training on an Nvidia GPU, but you can also use `device=mps` if you are using a Mac with Apple silicon, or `device=cpu` otherwise.
5. We provided `wandb.enable=true` to use [Weights and Biases](https://docs.wandb.ai/quickstart) for visualizing training plots. This is optional, but if you use it, make sure you are logged in by running `wandb login`.
6. We added `DATA_DIR=data` to access your dataset stored in your local `data` directory. If you don't provide `DATA_DIR`, your dataset will be downloaded from the Hugging Face hub to your cache folder `$HOME/.cache/huggingface`. In future versions of `lerobot`, both directories will be in sync.
Training should take several hours. You will find checkpoints in `outputs/train/act_so100_test/checkpoints`.
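If you want to load the resulting checkpoint in your own Python code rather than through the scripts, here is a minimal sketch (assuming an ACT policy and the output path from the command above; observation shapes are illustrative):
```python
import torch

from lerobot.common.policies.act.modeling_act import ACTPolicy

# Load the last checkpoint saved by the training run above.
policy = ACTPolicy.from_pretrained("outputs/train/act_so100_test/checkpoints/last/pretrained_model")
policy.eval()

# select_action expects a batch of observations matching the training inputs.
observation = {
    "observation.state": torch.zeros(1, 6),
    "observation.images.laptop": torch.zeros(1, 3, 480, 640),
    "observation.images.phone": torch.zeros(1, 3, 480, 640),
}
with torch.inference_mode():
    action = policy.select_action(observation)
```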
## Evaluate your policy
You can use the `record` function from [`lerobot/scripts/control_robot.py`](../lerobot/scripts/control_robot.py) but with a policy checkpoint as input. For instance, run this command to record 10 evaluation episodes:
```bash
python lerobot/scripts/control_robot.py record \
--robot-path lerobot/configs/robot/so100.yaml \
--fps 30 \
--root data \
--repo-id ${HF_USER}/eval_act_so100_test \
--tags so100 tutorial eval \
--warmup-time-s 5 \
--episode-time-s 40 \
--reset-time-s 10 \
--num-episodes 10 \
-p outputs/train/act_so100_test/checkpoints/last/pretrained_model
```
As you can see, it's almost the same command as previously used to record your training dataset. Two things changed:
1. There is an additional `-p` argument which indicates the path to your policy checkpoint (e.g. `-p outputs/train/act_so100_test/checkpoints/last/pretrained_model`). You can also use the model repository if you uploaded a model checkpoint to the hub (e.g. `-p ${HF_USER}/act_so100_test`).
2. The name of the dataset begins with `eval` to reflect that you are running inference (e.g. `--repo-id ${HF_USER}/eval_act_so100_test`).
## More
Follow this [previous tutorial](https://github.com/huggingface/lerobot/blob/main/examples/7_get_started_with_real_robot.md#4-train-a-policy-on-your-data) for a more in-depth explanation.
If you have any questions or need help, please reach out on Discord in the channel `#so100-arm`.


@@ -78,12 +78,12 @@ To begin, create two instances of the [`DynamixelMotorsBus`](../lerobot/common/
To find the correct ports for each arm, run the utility script twice:
```bash
-python lerobot/common/robot_devices/motors/dynamixel.py
+python lerobot/scripts/find_motors_bus_port.py
```
Example output when identifying the leader arm's port (e.g., `/dev/tty.usbmodem575E0031751` on Mac, or possibly `/dev/ttyACM0` on Linux):
```
-Finding all available ports for the DynamixelMotorsBus.
+Finding all available ports for the MotorBus.
['/dev/tty.usbmodem575E0032081', '/dev/tty.usbmodem575E0031751']
Remove the usb cable from your DynamixelMotorsBus and press Enter when done.
@@ -95,7 +95,7 @@ Reconnect the usb cable.
Example output when identifying the follower arm's port (e.g., `/dev/tty.usbmodem575E0032081`, or possibly `/dev/ttyACM1` on Linux):
```
-Finding all available ports for the DynamixelMotorsBus.
+Finding all available ports for the MotorBus.
['/dev/tty.usbmodem575E0032081', '/dev/tty.usbmodem575E0031751']
Remove the usb cable from your DynamixelMotorsBus and press Enter when done.


@@ -50,7 +50,7 @@ cd ~/lerobot && pip install -e ".[stretch]"
> **Note:** If you get this message, you can ignore it: `ERROR: pip's dependency resolver does not currently take into account all the packages that are installed.`
-And install extra dependencies for recording datasets on Linux:
+For Linux only (not Mac), install extra dependencies for recording datasets:
```bash
conda install -y -c conda-forge ffmpeg
pip uninstall -y opencv-python


@@ -35,7 +35,7 @@ git clone https://github.com/huggingface/lerobot.git ~/lerobot
cd ~/lerobot && pip install -e ".[dynamixel, intelrealsense]"
```
-And install extra dependencies for recording datasets on Linux:
+For Linux only (not Mac), install extra dependencies for recording datasets:
```bash
conda install -y -c conda-forge ffmpeg
pip uninstall -y opencv-python


@@ -181,8 +181,8 @@ available_real_world_datasets = [
"lerobot/usc_cloth_sim",
]
-available_datasets = list(
-    itertools.chain(*available_datasets_per_env.values(), available_real_world_datasets)
+available_datasets = sorted(
+    set(itertools.chain(*available_datasets_per_env.values(), available_real_world_datasets))
)
# lists all available policies from `lerobot/common/policies`
@@ -198,6 +198,8 @@ available_robots = [
"koch",
"koch_bimanual",
"aloha",
"so100",
"moss",
]
# lists all available cameras from `lerobot/common/robot_devices/cameras`
@@ -209,6 +211,7 @@ available_cameras = [
# lists all available motors from `lerobot/common/robot_devices/motors`
available_motors = [
"dynamixel",
"feetech",
]
# keys and values refer to yaml files


@@ -91,9 +91,9 @@ def make_dataset(cfg, split: str = "train") -> LeRobotDataset | MultiLeRobotData
)
if isinstance(cfg.dataset_repo_id, str):
+# TODO (aliberts): add 'episodes' arg from config after removing hydra
dataset = LeRobotDataset(
cfg.dataset_repo_id,
-split=split,
delta_timestamps=cfg.training.get("delta_timestamps"),
image_transforms=image_transforms,
video_backend=cfg.video_backend,
@@ -101,7 +101,6 @@ def make_dataset(cfg, split: str = "train") -> LeRobotDataset | MultiLeRobotData
else:
dataset = MultiLeRobotDataset(
cfg.dataset_repo_id,
-split=split,
delta_timestamps=cfg.training.get("delta_timestamps"),
image_transforms=image_transforms,
video_backend=cfg.video_backend,


@@ -0,0 +1,134 @@
#!/usr/bin/env python
# Copyright 2024 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import multiprocessing
from concurrent.futures import ThreadPoolExecutor, wait
from pathlib import Path
import torch
import tqdm
from PIL import Image
DEFAULT_IMAGE_PATH = "{image_key}/episode_{episode_index:06d}/frame_{frame_index:06d}.png"
def safe_stop_image_writer(func):
def wrapper(*args, **kwargs):
try:
return func(*args, **kwargs)
except Exception as e:
dataset = kwargs.get("dataset", None)
image_writer = getattr(dataset, "image_writer", None) if dataset else None
if image_writer is not None:
print("Waiting for image writer to terminate...")
image_writer.stop()
raise e
return wrapper
class ImageWriter:
"""This class abstract away the initialisation of processes or/and threads to
save images on disk asynchrounously, which is critical to control a robot and record data
at a high frame rate.
When `num_processes=0`, it creates a threads pool of size `num_threads`.
When `num_processes>0`, it creates processes pool of size `num_processes`, where each subprocess starts
their own threads pool of size `num_threads`.
The optimal number of processes and threads depends on your computer capabilities.
We advise to use 4 threads per camera with 0 processes. If the fps is not stable, try to increase or lower
the number of threads. If it is still not stable, try to use 1 subprocess, or more.
"""
def __init__(self, write_dir: Path, num_processes: int = 0, num_threads: int = 1):
self.dir = write_dir
self.dir.mkdir(parents=True, exist_ok=True)
self.image_path = DEFAULT_IMAGE_PATH
self.num_processes = num_processes
self.num_threads = self.num_threads_per_process = num_threads
if self.num_processes <= 0:
self.type = "threads"
self.threads = ThreadPoolExecutor(max_workers=self.num_threads)
self.futures = []
else:
self.type = "processes"
self.num_threads_per_process = self.num_threads
self.image_queue = multiprocessing.Queue()
self.processes: list[multiprocessing.Process] = []
for _ in range(num_processes):
process = multiprocessing.Process(target=self._loop_to_save_images_in_threads)
process.start()
self.processes.append(process)
def _loop_to_save_images_in_threads(self) -> None:
with ThreadPoolExecutor(max_workers=self.num_threads) as executor:
futures = []
while True:
frame_data = self.image_queue.get()
if frame_data is None:
break
image, file_path = frame_data
futures.append(executor.submit(self._save_image, image, file_path))
with tqdm.tqdm(total=len(futures), desc="Writing images") as progress_bar:
wait(futures)
progress_bar.update(len(futures))
def async_save_image(self, image: torch.Tensor, file_path: Path) -> None:
"""Save an image asynchronously using threads or processes."""
if self.type == "threads":
self.futures.append(self.threads.submit(self._save_image, image, file_path))
else:
self.image_queue.put((image, file_path))
def _save_image(self, image: torch.Tensor, file_path: Path) -> None:
img = Image.fromarray(image.numpy())
img.save(str(file_path), quality=100)
def get_image_file_path(self, episode_index: int, image_key: str, frame_index: int) -> Path:
fpath = self.image_path.format(
image_key=image_key, episode_index=episode_index, frame_index=frame_index
)
return self.dir / fpath
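# With DEFAULT_IMAGE_PATH, e.g. episode 0, image key "observation.images.laptop", frame 0
# resolves to: write_dir / "observation.images.laptop/episode_000000/frame_000000.png"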
def get_episode_dir(self, episode_index: int, image_key: str) -> Path:
return self.get_image_file_path(
episode_index=episode_index, image_key=image_key, frame_index=0
).parent
def stop(self, timeout=20) -> None:
"""Stop the image writer, waiting for all processes or threads to finish."""
if self.type == "threads":
with tqdm.tqdm(total=len(self.futures), desc="Writing images") as progress_bar:
wait(self.futures, timeout=timeout)
progress_bar.update(len(self.futures))
else:
self._stop_processes(timeout)
def _stop_processes(self, timeout) -> None:
for _ in self.processes:
self.image_queue.put(None)
for process in self.processes:
process.join(timeout=timeout)
if process.is_alive():
process.terminate()
self.image_queue.close()
self.image_queue.join_thread()
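
The threads/processes trade-off described in the docstring can be exercised on its own. A minimal usage sketch (directory and image values are illustrative):
```python
from pathlib import Path

import torch

from lerobot.common.datasets.image_writer import ImageWriter

# Pure thread pool (num_processes=0); the docstring advises 4 threads per camera.
writer = ImageWriter(write_dir=Path("outputs/images"), num_processes=0, num_threads=4)

# An HWC uint8 tensor, as expected by PIL's Image.fromarray.
image = (torch.rand(480, 640, 3) * 255).to(torch.uint8)
fpath = writer.get_image_file_path(
    episode_index=0, image_key="observation.images.laptop", frame_index=0
)
fpath.parent.mkdir(parents=True, exist_ok=True)  # the writer does not create episode dirs
writer.async_save_image(image, fpath)

writer.stop()  # blocks until pending writes finish (default timeout: 20s)
```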


@@ -13,63 +13,312 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import json
import logging
import os
import shutil
from functools import cached_property
from pathlib import Path
from typing import Callable
import datasets
import pyarrow.parquet as pq
import torch
import torch.utils
from datasets import load_dataset
from huggingface_hub import snapshot_download, upload_folder
from lerobot.common.datasets.compute_stats import aggregate_stats
from lerobot.common.datasets.compute_stats import aggregate_stats, compute_stats
from lerobot.common.datasets.image_writer import ImageWriter
from lerobot.common.datasets.utils import (
calculate_episode_data_index,
load_episode_data_index,
load_hf_dataset,
EPISODES_PATH,
INFO_PATH,
STATS_PATH,
TASKS_PATH,
_get_info_from_robot,
append_jsonl,
check_delta_timestamps,
check_timestamps_sync,
check_version_compatibility,
create_branch,
create_empty_dataset_info,
get_delta_indices,
get_episode_data_index,
get_hub_safe_version,
hf_transform_to_torch,
load_episode_dicts,
load_info,
load_previous_and_future_frames,
load_stats,
load_videos,
reset_episode_index,
load_tasks,
write_json,
write_stats,
)
from lerobot.common.datasets.video_utils import VideoFrame, load_from_videos
from lerobot.common.datasets.video_utils import (
VideoFrame,
decode_video_frames_torchvision,
encode_video_frames,
)
from lerobot.common.robot_devices.robots.utils import Robot
# For maintainers, see lerobot/common/datasets/push_dataset_to_hub/CODEBASE_VERSION.md
CODEBASE_VERSION = "v1.6"
DATA_DIR = Path(os.environ["DATA_DIR"]) if "DATA_DIR" in os.environ else None
CODEBASE_VERSION = "v2.0"
LEROBOT_HOME = Path(os.getenv("LEROBOT_HOME", "~/.cache/huggingface/lerobot")).expanduser()
class LeRobotDataset(torch.utils.data.Dataset):
def __init__(
self,
repo_id: str,
root: Path | None = DATA_DIR,
split: str = "train",
root: Path | None = None,
episodes: list[int] | None = None,
image_transforms: Callable | None = None,
delta_timestamps: dict[list[float]] | None = None,
tolerance_s: float = 1e-4,
download_videos: bool = True,
local_files_only: bool = False,
video_backend: str | None = None,
image_writer: ImageWriter | None = None,
):
"""LeRobotDataset encapsulates 3 main things:
- metadata:
- info contains various information about the dataset like shapes, keys, fps etc.
- stats stores the dataset statistics of the different modalities for normalization
- tasks contains the prompts for each task of the dataset, which can be used for
task-conditioned training.
- hf_dataset (from datasets.Dataset), which will read any values from parquet files.
- (optional) videos from which frames are loaded to be synchronous with data from parquet files.
3 modes are available for this class, depending on 3 different use cases:
1. Your dataset already exists on the Hugging Face Hub at the address
https://huggingface.co/datasets/{repo_id} and is not on your local disk in the 'root' folder:
Instantiating this class with this 'repo_id' will download the dataset from that address and load
it, provided your dataset is compliant with codebase_version v2.0. If your dataset was created
before this new format, you will be prompted to convert it using our conversion script from v1.6
to v2.0, which you can find at lerobot/common/datasets/v2/convert_dataset_v1_to_v2.py.
2. Your dataset already exists on your local disk in the 'root' folder:
This is typically the case when you recorded your dataset locally and you may or may not have
pushed it to the hub yet. Instantiating this class with 'root' will load your dataset directly
from disk. This can happen while you're offline (no internet connection).
3. Your dataset doesn't already exist (either on local disk or on the Hub):
[TODO(aliberts): add classmethod for this case?]
In terms of files, a typical LeRobotDataset looks like this from its root path:
.
├── data
│ ├── chunk-000
│ │ ├── episode_000000.parquet
│ │ ├── episode_000001.parquet
│ │ ├── episode_000002.parquet
│ │ └── ...
│ ├── chunk-001
│ │ ├── episode_001000.parquet
│ │ ├── episode_001001.parquet
│ │ ├── episode_001002.parquet
│ │ └── ...
│ └── ...
├── meta
│ ├── episodes.jsonl
│ ├── info.json
│ ├── stats.json
│ └── tasks.jsonl
└── videos (optional)
├── chunk-000
│ ├── observation.images.laptop
│ │ ├── episode_000000.mp4
│ │ ├── episode_000001.mp4
│ │ ├── episode_000002.mp4
│ │ └── ...
│ ├── observation.images.phone
│ │ ├── episode_000000.mp4
│ │ ├── episode_000001.mp4
│ │ ├── episode_000002.mp4
│ │ └── ...
├── chunk-001
└── ...
Note that this file-based structure is designed to be as versatile as possible. The files are split by
episodes which allows a more granular control over which episodes one wants to use and download. The
structure of the dataset is entirely described in the info.json file, which can be easily downloaded
or viewed directly on the hub before downloading any actual data. The types of files used are very
simple and do not need complex tools to be read: only .parquet, .json and .mp4 files are used (and .md
for the README).
Args:
repo_id (str): This is the repo id that will be used to fetch the dataset. Locally, the dataset
will be stored under root/repo_id.
root (Path | None, optional): Local directory to use for downloading/writing files. You can also
set the LEROBOT_HOME environment variable to point to a different location. Defaults to
'~/.cache/huggingface/lerobot'.
episodes (list[int] | None, optional): If specified, this will only load episodes specified by
their episode_index in this list. Defaults to None.
image_transforms (Callable | None, optional): You can pass standard v2 image transforms from
torchvision.transforms.v2 here which will be applied to visual modalities (whether they come
from videos or images). Defaults to None.
delta_timestamps (dict[list[float]] | None, optional): Mapping from data keys to lists of relative
timestamps (in seconds) used to query additional frames around each sample, e.g. to return a short
history of observations or a chunk of future actions. Each timestamp must be a multiple of 1/fps.
Defaults to None.
tolerance_s (float, optional): Tolerance in seconds used to ensure data timestamps are actually in
sync with the fps value. It is used at the init of the dataset to make sure that each
timestamp is separated from the next by 1/fps +/- tolerance_s. This also applies to frames
decoded from video files. It is also used to check that `delta_timestamps` (when provided) are
multiples of 1/fps. Defaults to 1e-4.
download_videos (bool, optional): Flag to download the videos. Note that when set to True but the
video files are already present on local disk, they won't be downloaded again. Defaults to
True.
video_backend (str | None, optional): Video backend to use for decoding videos. There is currently
a single option which is the pyav decoder used by Torchvision. Defaults to pyav.
"""
super().__init__()
self.repo_id = repo_id
self.root = root
self.split = split
self.root = root if root is not None else LEROBOT_HOME / repo_id
self.image_transforms = image_transforms
self.delta_timestamps = delta_timestamps
# load data from hub or locally when root is provided
self.episodes = episodes
self.tolerance_s = tolerance_s
self.video_backend = video_backend if video_backend is not None else "pyav"
self.delta_indices = None
self.local_files_only = local_files_only
self.consolidated = True
# Unused attributes
self.image_writer = None
self.episode_buffer = {}
# Load metadata
self.root.mkdir(exist_ok=True, parents=True)
self.pull_from_repo(allow_patterns="meta/")
self.info = load_info(self.root)
self.stats = load_stats(self.root)
self.tasks = load_tasks(self.root)
self.episode_dicts = load_episode_dicts(self.root)
# Check version
check_version_compatibility(self.repo_id, self._version, CODEBASE_VERSION)
# Load actual data
self.download_episodes(download_videos)
self.hf_dataset = self.load_hf_dataset()
self.episode_data_index = get_episode_data_index(self.episodes, self.episode_dicts)
# Check timestamps
check_timestamps_sync(self.hf_dataset, self.episode_data_index, self.fps, self.tolerance_s)
# Setup delta_indices
if self.delta_timestamps is not None:
check_delta_timestamps(self.delta_timestamps, self.fps, self.tolerance_s)
self.delta_indices = get_delta_indices(self.delta_timestamps, self.fps)
# TODO(aliberts):
# - [X] Move delta_timestamp logic outside __get_item__
# - [X] Update __get_item__
# - [/] Add doc
# - [ ] Add self.add_frame()
# - [ ] Add self.consolidate() for:
# - [X] Check timestamps sync
# - [ ] Sanity checks (episodes num, shapes, files, etc.)
# - [ ] Update episode_index (arg update=True)
# - [ ] Update info.json (arg update=True)
@cached_property
def _hub_version(self) -> str | None:
return None if self.local_files_only else get_hub_safe_version(self.repo_id, CODEBASE_VERSION)
@property
def _version(self) -> str:
"""Codebase version used to create this dataset."""
return self.info["codebase_version"]
def push_to_hub(self, push_videos: bool = True) -> None:
if not self.consolidated:
raise RuntimeError(
"You are trying to upload to the hub a LeRobotDataset that has not been consolidated yet."
"Please call the dataset 'consolidate()' method first."
)
ignore_patterns = ["images/"]
if not push_videos:
ignore_patterns.append("videos/")
upload_folder(
repo_id=self.repo_id,
folder_path=self.root,
repo_type="dataset",
ignore_patterns=ignore_patterns,
)
create_branch(repo_id=self.repo_id, branch=CODEBASE_VERSION, repo_type="dataset")
def pull_from_repo(
self,
allow_patterns: list[str] | str | None = None,
ignore_patterns: list[str] | str | None = None,
) -> None:
snapshot_download(
self.repo_id,
repo_type="dataset",
revision=self._hub_version,
local_dir=self.root,
allow_patterns=allow_patterns,
ignore_patterns=ignore_patterns,
local_files_only=self.local_files_only,
)
def download_episodes(self, download_videos: bool = True) -> None:
"""Downloads the dataset from the given 'repo_id' at the provided version. If 'episodes' is given, this
will only download those episodes (selected by their episode_index). If 'episodes' is None, the whole
dataset will be downloaded. Thanks to the behavior of snapshot_download, if the files are already present
in 'local_dir', they won't be downloaded again.
"""
# TODO(rcadene, aliberts): implement faster transfer
# https://huggingface.co/docs/huggingface_hub/en/guides/download#faster-downloads
self.hf_dataset = load_hf_dataset(repo_id, CODEBASE_VERSION, root, split)
if split == "train":
self.episode_data_index = load_episode_data_index(repo_id, CODEBASE_VERSION, root)
files = None
ignore_patterns = None if download_videos else "videos/"
if self.episodes is not None:
files = [str(self.get_data_file_path(ep_idx)) for ep_idx in self.episodes]
if len(self.video_keys) > 0 and download_videos:
video_files = [
str(self.get_video_file_path(ep_idx, vid_key))
for vid_key in self.video_keys
for ep_idx in self.episodes
]
files += video_files
self.pull_from_repo(allow_patterns=files, ignore_patterns=ignore_patterns)
def load_hf_dataset(self) -> datasets.Dataset:
"""hf_dataset contains all the observations, states, actions, rewards, etc."""
if self.episodes is None:
path = str(self.root / "data")
hf_dataset = load_dataset("parquet", data_dir=path, split="train")
else:
self.episode_data_index = calculate_episode_data_index(self.hf_dataset)
self.hf_dataset = reset_episode_index(self.hf_dataset)
self.stats = load_stats(repo_id, CODEBASE_VERSION, root)
self.info = load_info(repo_id, CODEBASE_VERSION, root)
if self.video:
self.videos_dir = load_videos(repo_id, CODEBASE_VERSION, root)
self.video_backend = video_backend if video_backend is not None else "pyav"
files = [str(self.root / self.get_data_file_path(ep_idx)) for ep_idx in self.episodes]
hf_dataset = load_dataset("parquet", data_files=files, split="train")
hf_dataset.set_transform(hf_transform_to_torch)
return hf_dataset
def get_data_file_path(self, ep_index: int) -> Path:
ep_chunk = self.get_episode_chunk(ep_index)
fpath = self.data_path.format(episode_chunk=ep_chunk, episode_index=ep_index)
return Path(fpath)
def get_video_file_path(self, ep_index: int, vid_key: str) -> Path:
ep_chunk = self.get_episode_chunk(ep_index)
fpath = self.videos_path.format(episode_chunk=ep_chunk, video_key=vid_key, episode_index=ep_index)
return Path(fpath)
def get_episode_chunk(self, ep_index: int) -> int:
return ep_index // self.chunks_size
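# e.g. with chunks_size=1000, episode 1001 lives in chunk 1, so its parquet file
# resolves to "data/chunk-001/episode_001001.parquet" per the data_path template.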
@property
def data_path(self) -> str:
"""Formattable string for the parquet files."""
return self.info["data_path"]
@property
def videos_path(self) -> str | None:
"""Formattable string for the video files."""
return self.info["videos"]["videos_path"] if len(self.video_keys) > 0 else None
@property
def fps(self) -> int:
@@ -77,140 +326,495 @@ class LeRobotDataset(torch.utils.data.Dataset):
return self.info["fps"]
@property
def video(self) -> bool:
"""Returns True if this dataset loads video frames from mp4 files.
Returns False if it only loads images from png files.
"""
return self.info.get("video", False)
def keys(self) -> list[str]:
"""Keys to access non-image data (state, actions etc.)."""
return self.info["keys"]
@property
def features(self) -> datasets.Features:
return self.hf_dataset.features
def image_keys(self) -> list[str]:
"""Keys to access visual modalities stored as images."""
return self.info["image_keys"]
@property
def video_keys(self) -> list[str]:
"""Keys to access visual modalities stored as videos."""
return self.info["video_keys"]
@property
def camera_keys(self) -> list[str]:
"""Keys to access image and video stream from cameras."""
keys = []
for key, feats in self.hf_dataset.features.items():
if isinstance(feats, (datasets.Image, VideoFrame)):
keys.append(key)
return keys
"""Keys to access visual modalities (regardless of their storage method)."""
return self.image_keys + self.video_keys
@property
def video_frame_keys(self) -> list[str]:
"""Keys to access video frames that requires to be decoded into images.
Note: It is empty if the dataset contains images only,
or equal to `self.cameras` if the dataset contains videos only,
or can even be a subset of `self.cameras` in a case of a mixed image/video dataset.
"""
video_frame_keys = []
for key, feats in self.hf_dataset.features.items():
if isinstance(feats, VideoFrame):
video_frame_keys.append(key)
return video_frame_keys
def names(self) -> dict[list[str]]:
"""Names of the various dimensions of vector modalities."""
return self.info["names"]
@property
def num_samples(self) -> int:
"""Number of samples/frames."""
"""Number of samples/frames in selected episodes."""
return len(self.hf_dataset)
@property
def num_episodes(self) -> int:
"""Number of episodes."""
return len(self.hf_dataset.unique("episode_index"))
"""Number of episodes selected."""
return len(self.episodes) if self.episodes is not None else self.total_episodes
@property
def tolerance_s(self) -> float:
"""Tolerance in seconds used to discard loaded frames when their timestamps
are not close enough from the requested frames. It is only used when `delta_timestamps`
is provided or when loading video frames from mp4 files.
def total_episodes(self) -> int:
"""Total number of episodes available."""
return self.info["total_episodes"]
@property
def total_frames(self) -> int:
"""Total number of frames saved in this dataset."""
return self.info["total_frames"]
@property
def total_tasks(self) -> int:
"""Total number of different tasks performed in this dataset."""
return self.info["total_tasks"]
@property
def total_chunks(self) -> int:
"""Total number of chunks (groups of episodes)."""
return self.info["total_chunks"]
@property
def chunks_size(self) -> int:
"""Max number of episodes per chunk."""
return self.info["chunks_size"]
@property
def shapes(self) -> dict:
"""Shapes for the different features."""
return self.info["shapes"]
@property
def features(self) -> datasets.Features:
"""Features of the hf_dataset."""
if self.hf_dataset is not None:
return self.hf_dataset.features
elif self.episode_buffer is None:
raise NotImplementedError(
"Dataset features must be infered from an existing hf_dataset or episode_buffer."
)
features = {}
for key in self.episode_buffer:
if key in ["episode_index", "frame_index", "index", "task_index"]:
features[key] = datasets.Value(dtype="int64")
elif key in ["next.done", "next.success"]:
features[key] = datasets.Value(dtype="bool")
elif key in ["timestamp", "next.reward"]:
features[key] = datasets.Value(dtype="float32")
elif key in self.image_keys:
features[key] = datasets.Image()
elif key in self.keys:
features[key] = datasets.Sequence(
length=self.shapes[key], feature=datasets.Value(dtype="float32")
)
return datasets.Features(features)
@property
def task_to_task_index(self) -> dict:
return {task: task_idx for task_idx, task in self.tasks.items()}
def get_task_index(self, task: str) -> int:
"""
# 1e-4 to account for possible numerical error
return 1 / self.fps - 1e-4
Given a task in natural language, returns its task_index if the task already exists in the dataset,
otherwise creates a new task_index.
"""
task_index = self.task_to_task_index.get(task, None)
return task_index if task_index is not None else self.total_tasks
def current_episode_index(self, idx: int) -> int:
episode_index = self.hf_dataset["episode_index"][idx]
if self.episodes is not None:
# get episode_index from selected episodes
episode_index = self.episodes.index(episode_index)
return episode_index
def episode_length(self, episode_index) -> int:
"""Number of samples/frames for given episode."""
return self.info["episodes"][episode_index]["length"]
def _get_query_indices(self, idx: int, ep_idx: int) -> tuple[dict[str, list[int | bool]]]:
ep_start = self.episode_data_index["from"][ep_idx]
ep_end = self.episode_data_index["to"][ep_idx]
query_indices = {
key: [max(ep_start.item(), min(ep_end.item() - 1, idx + delta)) for delta in delta_idx]
for key, delta_idx in self.delta_indices.items()
}
padding = { # Pad values outside of current episode range
f"{key}_is_pad": torch.BoolTensor(
[(idx + delta < ep_start.item()) | (idx + delta >= ep_end.item()) for delta in delta_idx]
)
for key, delta_idx in self.delta_indices.items()
}
return query_indices, padding
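# e.g. with delta_indices {"action": [0, 1, ..., 14]} near the end of an episode, the
# indices are clamped to ep_end - 1 and the corresponding "action_is_pad" flags are True.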
def _get_query_timestamps(
self,
current_ts: float,
query_indices: dict[str, list[int]] | None = None,
) -> dict[str, list[float]]:
query_timestamps = {}
for key in self.video_keys:
if query_indices is not None and key in query_indices:
timestamps = self.hf_dataset.select(query_indices[key])["timestamp"]
query_timestamps[key] = torch.stack(timestamps).tolist()
else:
query_timestamps[key] = [current_ts]
return query_timestamps
def _query_hf_dataset(self, query_indices: dict[str, list[int]]) -> dict:
return {
key: torch.stack(self.hf_dataset.select(q_idx)[key])
for key, q_idx in query_indices.items()
if key not in self.video_keys
}
def _query_videos(self, query_timestamps: dict[str, list[float]], ep_idx: int) -> dict:
"""Note: When using data workers (e.g. DataLoader with num_workers>0), do not call this function
in the main process (e.g. by using a second Dataloader with num_workers=0). It will result in a
Segmentation Fault. This probably happens because a memory reference to the video loader is created in
the main process and a subprocess fails to access it.
"""
item = {}
for vid_key, query_ts in query_timestamps.items():
video_path = self.root / self.get_video_file_path(ep_idx, vid_key)
frames = decode_video_frames_torchvision(
video_path, query_ts, self.tolerance_s, self.video_backend
)
item[vid_key] = frames.squeeze(0)
return item
def _add_padding_keys(self, item: dict, padding: dict[str, list[bool]]) -> dict:
for key, val in padding.items():
item[key] = torch.BoolTensor(val)
return item
def __len__(self):
return self.num_samples
def __getitem__(self, idx):
def __getitem__(self, idx) -> dict:
item = self.hf_dataset[idx]
ep_idx = item["episode_index"].item()
if self.delta_timestamps is not None:
item = load_previous_and_future_frames(
item,
self.hf_dataset,
self.episode_data_index,
self.delta_timestamps,
self.tolerance_s,
)
query_indices = None
if self.delta_indices is not None:
current_ep_idx = self.episodes.index(ep_idx) if self.episodes is not None else ep_idx
query_indices, padding = self._get_query_indices(idx, current_ep_idx)
query_result = self._query_hf_dataset(query_indices)
item = {**item, **padding}
for key, val in query_result.items():
item[key] = val
if self.video:
item = load_from_videos(
item,
self.video_frame_keys,
self.videos_dir,
self.tolerance_s,
self.video_backend,
)
if len(self.video_keys) > 0:
current_ts = item["timestamp"].item()
query_timestamps = self._get_query_timestamps(current_ts, query_indices)
video_frames = self._query_videos(query_timestamps, ep_idx)
item = {**video_frames, **item}
if self.image_transforms is not None:
for cam in self.camera_keys:
image_keys = self.camera_keys
for cam in image_keys:
item[cam] = self.image_transforms(item[cam])
return item
def __repr__(self):
return (
f"{self.__class__.__name__}(\n"
f"{self.__class__.__name__}\n"
f" Repository ID: '{self.repo_id}',\n"
f" Split: '{self.split}',\n"
f" Number of Samples: {self.num_samples},\n"
f" Number of Episodes: {self.num_episodes},\n"
f" Type: {'video (.mp4)' if self.video else 'image (.png)'},\n"
f" Recorded Frames per Second: {self.fps},\n"
f" Camera Keys: {self.camera_keys},\n"
f" Video Frame Keys: {self.video_frame_keys if self.video else 'N/A'},\n"
f" Transformations: {self.image_transforms},\n"
f" Codebase Version: {self.info.get('codebase_version', '< v1.6')},\n"
f")"
f" Selected episodes: {self.episodes},\n"
f" Number of selected episodes: {self.num_episodes},\n"
f" Number of selected samples: {self.num_samples},\n"
f"\n{json.dumps(self.info, indent=4)}\n"
)
@classmethod
def from_preloaded(
cls,
repo_id: str = "from_preloaded",
root: Path | None = None,
split: str = "train",
transform: callable = None,
delta_timestamps: dict[list[float]] | None = None,
# additional preloaded attributes
hf_dataset=None,
episode_data_index=None,
stats=None,
info=None,
videos_dir=None,
video_backend=None,
) -> "LeRobotDataset":
"""Create a LeRobot Dataset from existing data and attributes instead of loading from the filesystem.
def _create_episode_buffer(self, episode_index: int | None = None) -> dict:
# TODO(aliberts): Handle resume
return {
"size": 0,
"episode_index": self.total_episodes if episode_index is None else episode_index,
"task_index": None,
"frame_index": [],
"timestamp": [],
"next.done": [],
**{key: [] for key in self.keys},
**{key: [] for key in self.image_keys},
}
It is especially useful when converting raw data into LeRobotDataset before saving the dataset
on the filesystem or uploading to the hub.
Note: Meta-data attributes like `repo_id`, `version`, `root`, etc are optional and potentially
meaningless depending on the downstream usage of the return dataset.
def add_frame(self, frame: dict) -> None:
"""
# create an empty object of type LeRobotDataset
This function only adds the frame to the episode_buffer. Apart from images — which are written in a
temporary directory — nothing is written to disk. To save those frames, the 'add_episode()' method
then needs to be called.
"""
frame_index = self.episode_buffer["size"]
self.episode_buffer["frame_index"].append(frame_index)
self.episode_buffer["timestamp"].append(frame_index / self.fps)
self.episode_buffer["next.done"].append(False)
# Save all observed modalities except images
for key in self.keys:
self.episode_buffer[key].append(frame[key])
self.episode_buffer["size"] += 1
if self.image_writer is None:
return
# Save images
for cam_key in self.camera_keys:
img_path = self.image_writer.get_image_file_path(
episode_index=self.episode_buffer["episode_index"], image_key=cam_key, frame_index=frame_index
)
if frame_index == 0:
img_path.parent.mkdir(parents=True, exist_ok=True)
self.image_writer.async_save_image(
image=frame[cam_key],
file_path=img_path,
)
if cam_key in self.image_keys:
self.episode_buffer[cam_key].append(str(img_path))
def add_episode(self, task: str, encode_videos: bool = False) -> None:
"""
This will save to disk the current episode in self.episode_buffer. Note that since it affects files on
disk, it sets self.consolidated to False to ensure proper consolidation later on before uploading to
the hub.
Use 'encode_videos' if you want to encode videos during the saving of each episode. Otherwise,
you can do it later with dataset.consolidate(). This is to give more flexibility on when to spend
time for video encoding.
"""
episode_length = self.episode_buffer.pop("size")
episode_index = self.episode_buffer["episode_index"]
if episode_index != self.total_episodes:
# TODO(aliberts): Add option to use existing episode_index
raise NotImplementedError()
task_index = self.get_task_index(task)
self.episode_buffer["next.done"][-1] = True
for key in self.episode_buffer:
if key in self.image_keys:
continue
if key in self.keys:
self.episode_buffer[key] = torch.stack(self.episode_buffer[key])
elif key == "episode_index":
self.episode_buffer[key] = torch.full((episode_length,), episode_index)
elif key == "task_index":
self.episode_buffer[key] = torch.full((episode_length,), task_index)
else:
self.episode_buffer[key] = torch.tensor(self.episode_buffer[key])
self.episode_buffer["index"] = torch.arange(self.total_frames, self.total_frames + episode_length)
self._save_episode_to_metadata(episode_index, episode_length, task, task_index)
self._save_episode_table(episode_index)
if encode_videos and len(self.video_keys) > 0:
self.encode_videos()
# Reset the buffer
self.episode_buffer = self._create_episode_buffer()
self.consolidated = False
def _save_episode_table(self, episode_index: int) -> None:
ep_dataset = datasets.Dataset.from_dict(self.episode_buffer, features=self.features, split="train")
ep_table = ep_dataset._data.table
ep_data_path = self.root / self.get_data_file_path(ep_index=episode_index)
ep_data_path.parent.mkdir(parents=True, exist_ok=True)
pq.write_table(ep_table, ep_data_path)
def _save_episode_to_metadata(
self, episode_index: int, episode_length: int, task: str, task_index: int
) -> None:
self.info["total_episodes"] += 1
self.info["total_frames"] += episode_length
if task_index not in self.tasks:
self.info["total_tasks"] += 1
self.tasks[task_index] = task
task_dict = {
"task_index": task_index,
"task": task,
}
append_jsonl(task_dict, self.root / TASKS_PATH)
chunk = self.get_episode_chunk(episode_index)
if chunk >= self.total_chunks:
self.info["total_chunks"] += 1
self.info["splits"] = {"train": f"0:{self.info['total_episodes']}"}
self.info["total_videos"] += len(self.video_keys)
write_json(self.info, self.root / INFO_PATH)
episode_dict = {
"episode_index": episode_index,
"tasks": [task],
"length": episode_length,
}
self.episode_dicts.append(episode_dict)
append_jsonl(episode_dict, self.root / EPISODES_PATH)
def clear_episode_buffer(self) -> None:
episode_index = self.episode_buffer["episode_index"]
if self.image_writer is not None:
for cam_key in self.camera_keys:
img_dir = self.image_writer.get_episode_dir(episode_index, cam_key)
if img_dir.is_dir():
shutil.rmtree(img_dir)
# Reset the buffer
self.episode_buffer = self._create_episode_buffer()
def start_image_writter(self, num_processes: int = 0, num_threads: int = 1) -> None:
if isinstance(self.image_writer, ImageWriter):
logging.warning(
"You are starting a new ImageWriter that is replacing an already exising one in the dataset."
)
self.image_writer = ImageWriter(
write_dir=self.root / "images",
num_processes=num_processes,
num_threads=num_threads,
)
def stop_image_writter(self) -> None:
"""
Whenever wrapping this dataset inside a parallelized DataLoader, this needs to be called first to
remove the image_writer in order for the LeRobotDataset object to be pickleable and parallelized.
"""
if self.image_writer is not None:
self.image_writer.stop()
self.image_writer = None
def encode_videos(self) -> None:
# Use ffmpeg to convert frames stored as png into mp4 videos
for episode_index in range(self.num_episodes):
for key in self.video_keys:
# TODO: create video_buffer to store the state of encoded/unencoded videos and remove the need
# to call self.image_writer here
tmp_imgs_dir = self.image_writer.get_episode_dir(episode_index, key)
video_path = self.root / self.get_video_file_path(episode_index, key)
if video_path.is_file():
# Skip if video is already encoded. Could be the case when resuming data recording.
continue
# note: `encode_video_frames` is a blocking call. Making it asynchronous shouldn't speedup encoding,
# since video encoding with ffmpeg is already using multithreading.
encode_video_frames(tmp_imgs_dir, video_path, self.fps, overwrite=True)
def consolidate(self, run_compute_stats: bool = True, keep_image_files: bool = False) -> None:
self.hf_dataset = self.load_hf_dataset()
self.episode_data_index = get_episode_data_index(self.episodes, self.episode_dicts)
check_timestamps_sync(self.hf_dataset, self.episode_data_index, self.fps, self.tolerance_s)
if len(self.video_keys) > 0:
self.encode_videos()
if not keep_image_files and self.image_writer is not None:
shutil.rmtree(self.image_writer.dir)
if run_compute_stats:
self.stop_image_writter()
self.stats = compute_stats(self)
write_stats(self.stats, self.root / STATS_PATH)
self.consolidated = True
else:
logging.warning(
"Skipping computation of the dataset statistics, dataset is not fully consolidated."
)
# TODO(aliberts)
# - [ ] add video info in info.json
# Sanity checks:
# - [ ] shapes
# - [ ] ep_lengths
# - [ ] number of files
@classmethod
def create(
cls,
repo_id: str,
fps: int,
root: Path | None = None,
robot: Robot | None = None,
robot_type: str | None = None,
keys: list[str] | None = None,
image_keys: list[str] | None = None,
video_keys: list[str] = None,
shapes: dict | None = None,
names: dict | None = None,
tolerance_s: float = 1e-4,
image_writer_processes: int = 0,
image_writer_threads_per_camera: int = 0,
use_videos: bool = True,
video_backend: str | None = None,
) -> "LeRobotDataset":
"""Create a LeRobot Dataset from scratch in order to record data."""
obj = cls.__new__(cls)
obj.repo_id = repo_id
obj.root = root
obj.split = split
obj.image_transforms = transform
obj.delta_timestamps = delta_timestamps
obj.hf_dataset = hf_dataset
obj.episode_data_index = episode_data_index
obj.stats = stats
obj.info = info if info is not None else {}
obj.videos_dir = videos_dir
obj.root = root if root is not None else LEROBOT_HOME / repo_id
obj.tolerance_s = tolerance_s
obj.image_writer = None
if robot is not None:
robot_type, keys, image_keys, video_keys, shapes, names = _get_info_from_robot(robot, use_videos)
if not all(cam.fps == fps for cam in robot.cameras.values()):
logging.warning(
f"Some cameras in your {robot.robot_type} robot don't have an fps matching the fps of your dataset."
"In this case, frames from lower fps cameras will be repeated to fill in the blanks"
)
if len(robot.cameras) > 0 and (image_writer_processes or image_writer_threads_per_camera):
obj.start_image_writer(
image_writer_processes, image_writer_threads_per_camera * robot.num_cameras
)
elif (
robot_type is None
or keys is None
or image_keys is None
or video_keys is None
or shapes is None
or names is None
):
raise ValueError(
"Dataset info (robot_type, keys, shapes...) must either come from a Robot or be passed explicitly upon creation."
)
if len(video_keys) > 0 and not use_videos:
raise ValueError("`video_keys` were provided but `use_videos` is set to False.")
obj.tasks, obj.stats, obj.episode_dicts = {}, {}, []
obj.info = create_empty_dataset_info(
CODEBASE_VERSION, fps, robot_type, keys, image_keys, video_keys, shapes, names
)
write_json(obj.info, obj.root / INFO_PATH)
# TODO(aliberts, rcadene, alexander-soare): Merge this with OnlineBuffer/DataBuffer
obj.episode_buffer = obj._create_episode_buffer()
# This bool indicates that the current LeRobotDataset instance is in sync with the files on disk. It
# is used to know when certain operations are needed (for instance, computing dataset statistics). In
# order to be able to push the dataset to the hub, it needs to be consolidated first by calling
# self.consolidate().
obj.consolidated = True
obj.episodes = None
obj.hf_dataset = None
obj.image_transforms = None
obj.delta_timestamps = None
obj.delta_indices = None
obj.local_files_only = True
obj.episode_data_index = None
obj.video_backend = video_backend if video_backend is not None else "pyav"
return obj
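# Illustrative recording sketch (assumes the `add_frame` and `save_episode` methods defined earlier
# in this class; `robot.teleop_step` and the argument names are placeholders, not guaranteed API):
#
#   dataset = LeRobotDataset.create("user/my_dataset", fps=30, robot=robot)
#   for _ in range(num_frames):
#       observation, action = robot.teleop_step(record_data=True)
#       dataset.add_frame({**observation, **action})
#   dataset.save_episode(task="Pick up the cube.")
#   dataset.consolidate()  # required before pushing to the hub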
@@ -225,8 +829,8 @@ class MultiLeRobotDataset(torch.utils.data.Dataset):
def __init__(
self,
repo_ids: list[str],
- root: Path | None = DATA_DIR,
- split: str = "train",
root: Path | None = None,
episodes: dict | None = None,
image_transforms: Callable | None = None,
delta_timestamps: dict[list[float]] | None = None,
video_backend: str | None = None,
@@ -238,8 +842,8 @@ class MultiLeRobotDataset(torch.utils.data.Dataset):
self._datasets = [
LeRobotDataset(
repo_id,
- root=root,
- split=split,
root=root / repo_id if root is not None else None,
episodes=episodes[repo_id] if episodes is not None else None,
delta_timestamps=delta_timestamps,
image_transforms=image_transforms,
video_backend=video_backend,
@@ -275,7 +879,6 @@ class MultiLeRobotDataset(torch.utils.data.Dataset):
self.disabled_data_keys.update(extra_keys)
self.root = root
- self.split = split
self.image_transforms = image_transforms
self.delta_timestamps = delta_timestamps
self.stats = aggregate_stats(self._datasets)
@@ -389,7 +992,6 @@ class MultiLeRobotDataset(torch.utils.data.Dataset):
return (
f"{self.__class__.__name__}(\n"
f" Repository IDs: '{self.repo_ids}',\n"
f" Split: '{self.split}',\n"
f" Number of Samples: {self.num_samples},\n"
f" Number of Episodes: {self.num_episodes},\n"
f" Type: {'video (.mp4)' if self.video else 'image (.png)'},\n"

View File

@@ -1,468 +0,0 @@
"""Functions to create an empty dataset, and populate it with frames."""
# TODO(rcadene, aliberts): to adapt as class methods of next version of LeRobotDataset
import concurrent
import json
import logging
import multiprocessing
import shutil
from pathlib import Path
import torch
import tqdm
from PIL import Image
from lerobot.common.datasets.compute_stats import compute_stats
from lerobot.common.datasets.lerobot_dataset import CODEBASE_VERSION, LeRobotDataset
from lerobot.common.datasets.push_dataset_to_hub.aloha_hdf5_format import to_hf_dataset
from lerobot.common.datasets.push_dataset_to_hub.utils import concatenate_episodes, get_default_encoding
from lerobot.common.datasets.utils import calculate_episode_data_index, create_branch
from lerobot.common.datasets.video_utils import encode_video_frames
from lerobot.common.utils.utils import log_say
from lerobot.scripts.push_dataset_to_hub import (
push_dataset_card_to_hub,
push_meta_data_to_hub,
push_videos_to_hub,
save_meta_data,
)
########################################################################################
# Asynchronous saving of images on disk
########################################################################################
def safe_stop_image_writer(func):
# TODO(aliberts): Allow to pass custom exceptions
# (e.g. ThreadServiceExit, KeyboardInterrupt, SystemExit, UnpluggedError, DynamixelCommError)
def wrapper(*args, **kwargs):
try:
return func(*args, **kwargs)
except Exception as e:
image_writer = kwargs.get("dataset", {}).get("image_writer")
if image_writer is not None:
print("Waiting for image writer to terminate...")
stop_image_writer(image_writer, timeout=20)
raise e
return wrapper
def save_image(img_tensor, key, frame_index, episode_index, videos_dir: str):
img = Image.fromarray(img_tensor.numpy())
path = Path(videos_dir) / f"{key}_episode_{episode_index:06d}" / f"frame_{frame_index:06d}.png"
path.parent.mkdir(parents=True, exist_ok=True)
img.save(str(path), quality=100)
def loop_to_save_images_in_threads(image_queue, num_threads):
if num_threads < 1:
raise NotImplementedError(f"Only `num_threads>=1` is supported for now, but {num_threads=} given.")
with concurrent.futures.ThreadPoolExecutor(max_workers=num_threads) as executor:
futures = []
while True:
# Blocks until a frame is available
frame_data = image_queue.get()
# As usually done, exit loop when receiving None to stop the worker
if frame_data is None:
break
image, key, frame_index, episode_index, videos_dir = frame_data
futures.append(executor.submit(save_image, image, key, frame_index, episode_index, videos_dir))
# Before exiting function, wait for all threads to complete
with tqdm.tqdm(total=len(futures), desc="Writing images") as progress_bar:
concurrent.futures.wait(futures)
progress_bar.update(len(futures))
def start_image_writer_processes(image_queue, num_processes, num_threads_per_process):
if num_processes < 1:
raise ValueError(f"Only `num_processes>=1` is supported, but {num_processes=} given.")
if num_threads_per_process < 1:
raise NotImplementedError(
"Only `num_threads_per_process>=1` is supported for now, but {num_threads_per_process=} given."
)
processes = []
for _ in range(num_processes):
process = multiprocessing.Process(
target=loop_to_save_images_in_threads,
args=(image_queue, num_threads_per_process),
)
process.start()
processes.append(process)
return processes
def stop_processes(processes, queue, timeout):
# Send None to each process to signal them to stop
for _ in processes:
queue.put(None)
# Wait up to `timeout` seconds for each process to terminate
for process in processes:
process.join(timeout=timeout)
# If a process is still alive after the timeout, force termination
if process.is_alive():
process.terminate()
# Close the queue, no more items can be put in the queue
queue.close()
# Ensure all background queue threads have finished
queue.join_thread()
def start_image_writer(num_processes, num_threads):
"""This function abstract away the initialisation of processes or/and threads to
save images on disk asynchrounously, which is critical to control a robot and record data
at a high frame rate.
When `num_processes=0`, it returns a dictionary containing a threads pool of size `num_threads`.
When `num_processes>0`, it returns a dictionary containing a processes pool of size `num_processes`,
where each subprocess starts their own threads pool of size `num_threads`.
The optimal number of processes and threads depends on your computer capabilities.
We advise to use 4 threads per camera with 0 processes. If the fps is not stable, try to increase or lower
the number of threads. If it is still not stable, try to use 1 subprocess, or more.
"""
image_writer = {}
if num_processes == 0:
futures = []
threads_pool = concurrent.futures.ThreadPoolExecutor(max_workers=num_threads)
image_writer["threads_pool"], image_writer["futures"] = threads_pool, futures
else:
# TODO(rcadene): When using num_processes>1, `multiprocessing.Manager().Queue()`
# might be better than `multiprocessing.Queue()`. Source: https://www.geeksforgeeks.org/python-multiprocessing-queue-vs-multiprocessing-manager-queue
image_queue = multiprocessing.Queue()
processes_pool = start_image_writer_processes(
image_queue, num_processes=num_processes, num_threads_per_process=num_threads
)
image_writer["processes_pool"], image_writer["image_queue"] = processes_pool, image_queue
return image_writer
def async_save_image(image_writer, image, key, frame_index, episode_index, videos_dir):
"""This function abstract away the saving of an image on disk asynchrounously. It uses a dictionary
called image writer which contains either a pool of processes or a pool of threads.
"""
if "threads_pool" in image_writer:
threads_pool, futures = image_writer["threads_pool"], image_writer["futures"]
futures.append(threads_pool.submit(save_image, image, key, frame_index, episode_index, videos_dir))
else:
image_queue = image_writer["image_queue"]
image_queue.put((image, key, frame_index, episode_index, videos_dir))
def stop_image_writer(image_writer, timeout):
if "threads_pool" in image_writer:
futures = image_writer["futures"]
# Before exiting function, wait for all threads to complete
with tqdm.tqdm(total=len(futures), desc="Writing images") as progress_bar:
concurrent.futures.wait(futures, timeout=timeout)
progress_bar.update(len(futures))
else:
processes_pool, image_queue = image_writer["processes_pool"], image_writer["image_queue"]
stop_processes(processes_pool, image_queue, timeout=timeout)
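# Illustrative usage of this functional API (removed by this PR in favor of the ImageWriter class):
#
#   image_writer = start_image_writer(num_processes=0, num_threads=4 * num_cameras)
#   async_save_image(image_writer, image, key, frame_index, episode_index, videos_dir)
#   stop_image_writer(image_writer, timeout=20)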
########################################################################################
# Functions to initialize, resume and populate a dataset
########################################################################################
def init_dataset(
repo_id,
root,
force_override,
fps,
video,
write_images,
num_image_writer_processes,
num_image_writer_threads,
):
local_dir = Path(root) / repo_id
if local_dir.exists() and force_override:
shutil.rmtree(local_dir)
episodes_dir = local_dir / "episodes"
episodes_dir.mkdir(parents=True, exist_ok=True)
videos_dir = local_dir / "videos"
videos_dir.mkdir(parents=True, exist_ok=True)
# Logic to resume data recording
rec_info_path = episodes_dir / "data_recording_info.json"
if rec_info_path.exists():
with open(rec_info_path) as f:
rec_info = json.load(f)
num_episodes = rec_info["last_episode_index"] + 1
else:
num_episodes = 0
dataset = {
"repo_id": repo_id,
"local_dir": local_dir,
"videos_dir": videos_dir,
"episodes_dir": episodes_dir,
"fps": fps,
"video": video,
"rec_info_path": rec_info_path,
"num_episodes": num_episodes,
}
if write_images:
# Initialize the processes and/or threads dedicated to saving images on disk asynchronously,
# which is critical to control a robot and record data at a high frame rate.
image_writer = start_image_writer(
num_processes=num_image_writer_processes,
num_threads=num_image_writer_threads,
)
dataset["image_writer"] = image_writer
return dataset
def add_frame(dataset, observation, action):
if "current_episode" not in dataset:
# initialize episode dictionary
ep_dict = {}
for key in observation:
if key not in ep_dict:
ep_dict[key] = []
for key in action:
if key not in ep_dict:
ep_dict[key] = []
ep_dict["episode_index"] = []
ep_dict["frame_index"] = []
ep_dict["timestamp"] = []
ep_dict["next.done"] = []
dataset["current_episode"] = ep_dict
dataset["current_frame_index"] = 0
ep_dict = dataset["current_episode"]
episode_index = dataset["num_episodes"]
frame_index = dataset["current_frame_index"]
videos_dir = dataset["videos_dir"]
video = dataset["video"]
fps = dataset["fps"]
ep_dict["episode_index"].append(episode_index)
ep_dict["frame_index"].append(frame_index)
ep_dict["timestamp"].append(frame_index / fps)
ep_dict["next.done"].append(False)
img_keys = [key for key in observation if "image" in key]
non_img_keys = [key for key in observation if "image" not in key]
# Save all observed modalities except images
for key in non_img_keys:
ep_dict[key].append(observation[key])
# Save actions
for key in action:
ep_dict[key].append(action[key])
if "image_writer" not in dataset:
dataset["current_frame_index"] += 1
return
# Save images
image_writer = dataset["image_writer"]
for key in img_keys:
imgs_dir = videos_dir / f"{key}_episode_{episode_index:06d}"
async_save_image(
image_writer,
image=observation[key],
key=key,
frame_index=frame_index,
episode_index=episode_index,
videos_dir=str(videos_dir),
)
if video:
fname = f"{key}_episode_{episode_index:06d}.mp4"
frame_info = {"path": f"videos/{fname}", "timestamp": frame_index / fps}
else:
frame_info = str(imgs_dir / f"frame_{frame_index:06d}.png")
ep_dict[key].append(frame_info)
dataset["current_frame_index"] += 1
def delete_current_episode(dataset):
del dataset["current_episode"]
del dataset["current_frame_index"]
# delete temporary images
episode_index = dataset["num_episodes"]
videos_dir = dataset["videos_dir"]
for tmp_imgs_dir in videos_dir.glob(f"*_episode_{episode_index:06d}"):
shutil.rmtree(tmp_imgs_dir)
def save_current_episode(dataset):
episode_index = dataset["num_episodes"]
ep_dict = dataset["current_episode"]
episodes_dir = dataset["episodes_dir"]
rec_info_path = dataset["rec_info_path"]
ep_dict["next.done"][-1] = True
for key in ep_dict:
if "observation" in key and "image" not in key:
ep_dict[key] = torch.stack(ep_dict[key])
ep_dict["action"] = torch.stack(ep_dict["action"])
ep_dict["episode_index"] = torch.tensor(ep_dict["episode_index"])
ep_dict["frame_index"] = torch.tensor(ep_dict["frame_index"])
ep_dict["timestamp"] = torch.tensor(ep_dict["timestamp"])
ep_dict["next.done"] = torch.tensor(ep_dict["next.done"])
ep_path = episodes_dir / f"episode_{episode_index}.pth"
torch.save(ep_dict, ep_path)
rec_info = {
"last_episode_index": episode_index,
}
with open(rec_info_path, "w") as f:
json.dump(rec_info, f)
# force re-initialization of the episode dictionary during add_frame
del dataset["current_episode"]
dataset["num_episodes"] += 1
def encode_videos(dataset, image_keys, play_sounds):
log_say("Encoding videos", play_sounds)
num_episodes = dataset["num_episodes"]
videos_dir = dataset["videos_dir"]
local_dir = dataset["local_dir"]
fps = dataset["fps"]
# Use ffmpeg to convert frames stored as png into mp4 videos
for episode_index in tqdm.tqdm(range(num_episodes)):
for key in image_keys:
# key = f"observation.images.{name}"
tmp_imgs_dir = videos_dir / f"{key}_episode_{episode_index:06d}"
fname = f"{key}_episode_{episode_index:06d}.mp4"
video_path = local_dir / "videos" / fname
if video_path.exists():
# Skip if video is already encoded. Could be the case when resuming data recording.
continue
# note: `encode_video_frames` is a blocking call. Making it asynchronous shouldn't speed up encoding,
# since video encoding with ffmpeg is already using multithreading.
encode_video_frames(tmp_imgs_dir, video_path, fps, overwrite=True)
shutil.rmtree(tmp_imgs_dir)
def from_dataset_to_lerobot_dataset(dataset, play_sounds):
log_say("Consolidate episodes", play_sounds)
num_episodes = dataset["num_episodes"]
episodes_dir = dataset["episodes_dir"]
videos_dir = dataset["videos_dir"]
video = dataset["video"]
fps = dataset["fps"]
repo_id = dataset["repo_id"]
ep_dicts = []
for episode_index in tqdm.tqdm(range(num_episodes)):
ep_path = episodes_dir / f"episode_{episode_index}.pth"
ep_dict = torch.load(ep_path)
ep_dicts.append(ep_dict)
data_dict = concatenate_episodes(ep_dicts)
if video:
image_keys = [key for key in data_dict if "image" in key]
encode_videos(dataset, image_keys, play_sounds)
hf_dataset = to_hf_dataset(data_dict, video)
episode_data_index = calculate_episode_data_index(hf_dataset)
info = {
"codebase_version": CODEBASE_VERSION,
"fps": fps,
"video": video,
}
if video:
info["encoding"] = get_default_encoding()
lerobot_dataset = LeRobotDataset.from_preloaded(
repo_id=repo_id,
hf_dataset=hf_dataset,
episode_data_index=episode_data_index,
info=info,
videos_dir=videos_dir,
)
return lerobot_dataset
def save_lerobot_dataset_on_disk(lerobot_dataset):
hf_dataset = lerobot_dataset.hf_dataset
info = lerobot_dataset.info
stats = lerobot_dataset.stats
episode_data_index = lerobot_dataset.episode_data_index
local_dir = lerobot_dataset.videos_dir.parent
meta_data_dir = local_dir / "meta_data"
hf_dataset = hf_dataset.with_format(None) # to remove transforms that cant be saved
hf_dataset.save_to_disk(str(local_dir / "train"))
save_meta_data(info, stats, episode_data_index, meta_data_dir)
def push_lerobot_dataset_to_hub(lerobot_dataset, tags):
hf_dataset = lerobot_dataset.hf_dataset
local_dir = lerobot_dataset.videos_dir.parent
videos_dir = lerobot_dataset.videos_dir
repo_id = lerobot_dataset.repo_id
video = lerobot_dataset.video
meta_data_dir = local_dir / "meta_data"
if not (local_dir / "train").exists():
raise ValueError(
"You need to run `save_lerobot_dataset_on_disk(lerobot_dataset)` before pushing to the hub."
)
hf_dataset.push_to_hub(repo_id, revision="main")
push_meta_data_to_hub(repo_id, meta_data_dir, revision="main")
push_dataset_card_to_hub(repo_id, revision="main", tags=tags)
if video:
push_videos_to_hub(repo_id, videos_dir, revision="main")
create_branch(repo_id, repo_type="dataset", branch=CODEBASE_VERSION)
def create_lerobot_dataset(dataset, run_compute_stats, push_to_hub, tags, play_sounds):
if "image_writer" in dataset:
logging.info("Waiting for image writer to terminate...")
image_writer = dataset["image_writer"]
stop_image_writer(image_writer, timeout=20)
lerobot_dataset = from_dataset_to_lerobot_dataset(dataset, play_sounds)
if run_compute_stats:
log_say("Computing dataset statistics", play_sounds)
lerobot_dataset.stats = compute_stats(lerobot_dataset)
else:
logging.info("Skipping computation of the dataset statistics")
lerobot_dataset.stats = {}
save_lerobot_dataset_on_disk(lerobot_dataset)
if push_to_hub:
push_lerobot_dataset_to_hub(lerobot_dataset, tags)
return lerobot_dataset

View File

@@ -14,20 +14,31 @@
# See the License for the specific language governing permissions and
# limitations under the License.
import json
import re
import warnings
from functools import cache
from itertools import accumulate
from pathlib import Path
from pprint import pformat
from typing import Dict
import datasets
import jsonlines
import torch
from datasets import load_dataset, load_from_disk
- from huggingface_hub import DatasetCard, HfApi, hf_hub_download, snapshot_download
from huggingface_hub import DatasetCard, HfApi
from PIL import Image as PILImage
from safetensors.torch import load_file
from torchvision import transforms
from lerobot.common.robot_devices.robots.utils import Robot
DEFAULT_CHUNK_SIZE = 1000 # Max number of episodes per chunk
INFO_PATH = "meta/info.json"
EPISODES_PATH = "meta/episodes.jsonl"
STATS_PATH = "meta/stats.json"
TASKS_PATH = "meta/tasks.jsonl"
DEFAULT_VIDEO_PATH = "videos/chunk-{episode_chunk:03d}/{video_key}/episode_{episode_index:06d}.mp4"
DEFAULT_PARQUET_PATH = "data/chunk-{episode_chunk:03d}/episode_{episode_index:06d}.parquet"
DATASET_CARD_TEMPLATE = """
---
# Metadata will go there
@@ -37,7 +48,7 @@ This dataset was created using [LeRobot](https://github.com/huggingface/lerobot)
"""
- def flatten_dict(d, parent_key="", sep="/"):
def flatten_dict(d: dict, parent_key: str = "", sep: str = "/") -> dict:
"""Flatten a nested dictionary structure by collapsing nested keys into one key with a separator.
For example:
@@ -56,7 +67,7 @@ def flatten_dict(d, parent_key="", sep="/"):
return dict(items)
- def unflatten_dict(d, sep="/"):
def unflatten_dict(d: dict, sep: str = "/") -> dict:
outdict = {}
for key, value in d.items():
parts = key.split(sep)
@@ -69,6 +80,24 @@ def unflatten_dict(d, sep="/"):
return outdict
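# Round-trip sketch (the flatten_dict docstring example above is truncated by the diff context):
#
#   d = {"obs": {"min": 1, "max": 2}}
#   flatten_dict(d)                        # {"obs/min": 1, "obs/max": 2}
#   unflatten_dict(flatten_dict(d)) == d   # True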
def write_json(data: dict, fpath: Path) -> None:
fpath.parent.mkdir(exist_ok=True, parents=True)
with open(fpath, "w") as f:
json.dump(data, f, indent=4, ensure_ascii=False)
def append_jsonl(data: dict, fpath: Path) -> None:
fpath.parent.mkdir(exist_ok=True, parents=True)
with jsonlines.open(fpath, "a") as writer:
writer.write(data)
def write_stats(stats: dict[str, torch.Tensor | dict], fpath: Path) -> None:
serialized_stats = {key: value.tolist() for key, value in flatten_dict(stats).items()}
serialized_stats = unflatten_dict(serialized_stats)
write_json(serialized_stats, fpath)
def hf_transform_to_torch(items_dict: dict[torch.Tensor | None]):
"""Get a transform function that convert items from Hugging Face dataset (pyarrow)
to torch tensors. Importantly, images are converted from PIL, which corresponds to
@@ -80,14 +109,6 @@ def hf_transform_to_torch(items_dict: dict[torch.Tensor | None]):
if isinstance(first_item, PILImage.Image):
to_tensor = transforms.ToTensor()
items_dict[key] = [to_tensor(img) for img in items_dict[key]]
- elif isinstance(first_item, str):
- # TODO (michel-aractingi): add str2embedding via language tokenizer
- # For now we leave this part up to the user to choose how to address
- # language conditioned tasks
- pass
- elif isinstance(first_item, dict) and "path" in first_item and "timestamp" in first_item:
- # video frame will be processed downstream
- pass
elif first_item is None:
pass
else:
@@ -95,8 +116,40 @@ def hf_transform_to_torch(items_dict: dict[torch.Tensor | None]):
return items_dict
- @cache
- def get_hf_dataset_safe_version(repo_id: str, version: str) -> str:
def _get_major_minor(version: str) -> tuple[int, int]:
split = version.strip("v").split(".")
return int(split[0]), int(split[1])
def check_version_compatibility(
repo_id: str, version_to_check: str, current_version: str, enforce_breaking_major: bool = True
) -> None:
current_major, _ = _get_major_minor(current_version)
major_to_check, _ = _get_major_minor(version_to_check)
if major_to_check < current_major and enforce_breaking_major:
raise ValueError(
f"""The dataset you requested ({repo_id}) is in {version_to_check} format. We introduced a new
format with v2.0 that is not backward compatible. Please use our conversion script
first (convert_dataset_v1_to_v2.py) to convert your dataset to this new format."""
)
elif float(version_to_check.strip("v")) < float(current_version.strip("v")):
warnings.warn(
f"""The dataset you requested ({repo_id}) was created with a previous version ({version_to_check}) of the
codebase. The current codebase version is {current_version}. You should be fine since
backward compatibility is maintained. If you encounter a problem, contact LeRobot maintainers on
Discord ('https://discord.com/invite/s3KuuzsPFb') or open an issue on github.""",
stacklevel=1,
)
def get_hub_safe_version(repo_id: str, version: str, enforce_v2: bool = True) -> str:
num_version = float(version.strip("v"))
if num_version < 2 and enforce_v2:
raise ValueError(
f"""The dataset you requested ({repo_id}) is in {version} format. We introduced a new
format with v2.0 that is not backward compatible. Please use our conversion script
first (convert_dataset_v1_to_v2.py) to convert your dataset to this new format."""
)
api = HfApi()
dataset_info = api.list_repo_refs(repo_id, repo_type="dataset")
branches = [b.name for b in dataset_info.branches]
@@ -116,106 +169,185 @@ def get_hf_dataset_safe_version(repo_id: str, version: str) -> str:
return version
- def load_hf_dataset(repo_id: str, version: str, root: Path, split: str) -> datasets.Dataset:
- """hf_dataset contains all the observations, states, actions, rewards, etc."""
- if root is not None:
- hf_dataset = load_from_disk(str(Path(root) / repo_id / "train"))
- # TODO(rcadene): clean this which enables getting a subset of dataset
- if split != "train":
- if "%" in split:
- raise NotImplementedError(f"We dont support splitting based on percentage for now ({split}).")
- match_from = re.search(r"train\[(\d+):\]", split)
- match_to = re.search(r"train\[:(\d+)\]", split)
- if match_from:
- from_frame_index = int(match_from.group(1))
- hf_dataset = hf_dataset.select(range(from_frame_index, len(hf_dataset)))
- elif match_to:
- to_frame_index = int(match_to.group(1))
- hf_dataset = hf_dataset.select(range(to_frame_index))
- else:
- raise ValueError(
- f'`split` ({split}) should either be "train", "train[INT:]", or "train[:INT]"'
- )
- else:
- safe_version = get_hf_dataset_safe_version(repo_id, version)
- hf_dataset = load_dataset(repo_id, revision=safe_version, split=split)
- hf_dataset.set_transform(hf_transform_to_torch)
- return hf_dataset
def load_info(local_dir: Path) -> dict:
with open(local_dir / INFO_PATH) as f:
return json.load(f)
- def load_episode_data_index(repo_id, version, root) -> dict[str, torch.Tensor]:
- """episode_data_index contains the range of indices for each episode
- Example:
- ```python
- from_id = episode_data_index["from"][episode_id].item()
- to_id = episode_data_index["to"][episode_id].item()
- episode_frames = [dataset[i] for i in range(from_id, to_id)]
- ```
- """
- if root is not None:
- path = Path(root) / repo_id / "meta_data" / "episode_data_index.safetensors"
- else:
- safe_version = get_hf_dataset_safe_version(repo_id, version)
- path = hf_hub_download(
- repo_id, "meta_data/episode_data_index.safetensors", repo_type="dataset", revision=safe_version
- )
- return load_file(path)
- def load_stats(repo_id, version, root) -> dict[str, dict[str, torch.Tensor]]:
- """stats contains the statistics per modality computed over the full dataset, such as max, min, mean, std
- Example:
- ```python
- normalized_action = (action - stats["action"]["mean"]) / stats["action"]["std"]
- ```
- """
- if root is not None:
- path = Path(root) / repo_id / "meta_data" / "stats.safetensors"
- else:
- safe_version = get_hf_dataset_safe_version(repo_id, version)
- path = hf_hub_download(
- repo_id, "meta_data/stats.safetensors", repo_type="dataset", revision=safe_version
- )
- stats = load_file(path)
def load_stats(local_dir: Path) -> dict:
with open(local_dir / STATS_PATH) as f:
stats = json.load(f)
stats = {key: torch.tensor(value) for key, value in flatten_dict(stats).items()}
return unflatten_dict(stats)
- def load_info(repo_id, version, root) -> dict:
- """info contains useful information regarding the dataset that are not stored elsewhere
def load_tasks(local_dir: Path) -> dict:
with jsonlines.open(local_dir / TASKS_PATH, "r") as reader:
tasks = list(reader)
- Example:
- ```python
- print("frame per second used to collect the video", info["fps"])
- ```
return {item["task_index"]: item["task"] for item in sorted(tasks, key=lambda x: x["task_index"])}
def load_episode_dicts(local_dir: Path) -> dict:
with jsonlines.open(local_dir / EPISODES_PATH, "r") as reader:
return list(reader)
def _get_info_from_robot(robot: Robot, use_videos: bool) -> tuple[str, list, list, list, dict, dict]:
shapes = {key: len(names) for key, names in robot.names.items()}
camera_shapes = {}
for key, cam in robot.cameras.items():
video_key = f"observation.images.{key}"
camera_shapes[video_key] = {
"width": cam.width,
"height": cam.height,
"channels": cam.channels,
}
keys = list(robot.names)
image_keys = [] if use_videos else list(camera_shapes)
video_keys = list(camera_shapes) if use_videos else []
shapes = {**shapes, **camera_shapes}
names = robot.names
robot_type = robot.robot_type
return robot_type, keys, image_keys, video_keys, shapes, names
def create_empty_dataset_info(
codebase_version: str,
fps: int,
robot_type: str,
keys: list[str],
image_keys: list[str],
video_keys: list[str],
shapes: dict,
names: dict,
) -> dict:
return {
"codebase_version": codebase_version,
"data_path": DEFAULT_PARQUET_PATH,
"robot_type": robot_type,
"total_episodes": 0,
"total_frames": 0,
"total_tasks": 0,
"total_videos": 0,
"total_chunks": 0,
"chunks_size": DEFAULT_CHUNK_SIZE,
"fps": fps,
"splits": {},
"keys": keys,
"video_keys": video_keys,
"image_keys": image_keys,
"shapes": shapes,
"names": names,
"videos": {"videos_path": DEFAULT_VIDEO_PATH} if len(video_keys) > 0 else None,
}
def get_episode_data_index(episodes: list, episode_dicts: list[dict]) -> dict[str, torch.Tensor]:
episode_lengths = {ep_idx: ep_dict["length"] for ep_idx, ep_dict in enumerate(episode_dicts)}
if episodes is not None:
episode_lengths = {ep_idx: episode_lengths[ep_idx] for ep_idx in episodes}
cumulative_lengths = list(accumulate(episode_lengths.values()))
return {
"from": torch.LongTensor([0] + cumulative_lengths[:-1]),
"to": torch.LongTensor(cumulative_lengths),
}
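# Example: with episode lengths [100, 50, 75] and episodes=None, this returns
#   {"from": tensor([0, 100, 150]), "to": tensor([100, 150, 225])}
# i.e. episode i spans frames from[i] (inclusive) to to[i] (exclusive).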
def check_timestamps_sync(
hf_dataset: datasets.Dataset,
episode_data_index: dict[str, torch.Tensor],
fps: int,
tolerance_s: float,
raise_value_error: bool = True,
) -> bool:
"""
- if root is not None:
- path = Path(root) / repo_id / "meta_data" / "info.json"
- else:
- safe_version = get_hf_dataset_safe_version(repo_id, version)
- path = hf_hub_download(repo_id, "meta_data/info.json", repo_type="dataset", revision=safe_version)
This check is to make sure that each timestamp is separated from the next by 1/fps +/- tolerance to
account for possible numerical error.
"""
timestamps = torch.stack(hf_dataset["timestamp"])
# timestamps[2] += tolerance_s # TODO delete
# timestamps[-2] += tolerance_s/2 # TODO delete
diffs = torch.diff(timestamps)
within_tolerance = torch.abs(diffs - 1 / fps) <= tolerance_s
- with open(path) as f:
- info = json.load(f)
- return info
# We mask differences between the timestamp at the end of an episode
# and the one at the start of the next episode, since these are expected
# to be outside tolerance.
mask = torch.ones(len(diffs), dtype=torch.bool)
ignored_diffs = episode_data_index["to"][:-1] - 1
mask[ignored_diffs] = False
filtered_within_tolerance = within_tolerance[mask]
if not torch.all(filtered_within_tolerance):
# Track original indices before masking
original_indices = torch.arange(len(diffs))
filtered_indices = original_indices[mask]
outside_tolerance_filtered_indices = torch.nonzero(~filtered_within_tolerance) # .squeeze()
outside_tolerance_indices = filtered_indices[outside_tolerance_filtered_indices]
episode_indices = torch.stack(hf_dataset["episode_index"])
outside_tolerances = []
for idx in outside_tolerance_indices:
entry = {
"timestamps": [timestamps[idx], timestamps[idx + 1]],
"diff": diffs[idx],
"episode_index": episode_indices[idx].item(),
}
outside_tolerances.append(entry)
if raise_value_error:
raise ValueError(
f"""One or several timestamps unexpectedly violate the tolerance inside episode range.
This might be due to synchronization issues with timestamps during data collection.
\n{pformat(outside_tolerances)}"""
)
return False
return True
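# Numerical sketch: at fps=30, consecutive timestamps should differ by 1/30 ≈ 0.0333s. With
# tolerance_s=1e-4, a diff of 0.0335s is flagged, unless it falls on an episode boundary,
# which is masked out above.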
- def load_videos(repo_id, version, root) -> Path:
- if root is not None:
- path = Path(root) / repo_id / "videos"
- else:
- # TODO(rcadene): we download the whole repo here. see if we can avoid this
- safe_version = get_hf_dataset_safe_version(repo_id, version)
- repo_dir = snapshot_download(repo_id, repo_type="dataset", revision=safe_version)
- path = Path(repo_dir) / "videos"
def check_delta_timestamps(
delta_timestamps: dict[str, list[float]], fps: int, tolerance_s: float, raise_value_error: bool = True
) -> bool:
"""This will check if all the values in delta_timestamps are multiples of 1/fps +/- tolerance.
This is to ensure that these delta_timestamps added to any timestamp from a dataset will themselves be
actual timestamps from the dataset.
"""
outside_tolerance = {}
for key, delta_ts in delta_timestamps.items():
within_tolerance = [abs(ts * fps - round(ts * fps)) <= tolerance_s for ts in delta_ts]
if not all(within_tolerance):
outside_tolerance[key] = [
ts for ts, is_within in zip(delta_ts, within_tolerance, strict=True) if not is_within
]
- return path
if len(outside_tolerance) > 0:
if raise_value_error:
raise ValueError(
f"""
The following delta_timestamps are found outside of tolerance range.
Please make sure they are multiples of 1/{fps} +/- tolerance and adjust
their values accordingly.
\n{pformat(outside_tolerance)}
"""
)
return False
return True
def get_delta_indices(delta_timestamps: dict[str, list[float]], fps: int) -> dict[str, list[int]]:
delta_indices = {}
for key, delta_ts in delta_timestamps.items():
delta_indices[key] = (torch.tensor(delta_ts) * fps).long().tolist()
return delta_indices
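# Example: delta_timestamps = {"action": [-0.1, 0.0, 0.1]} at fps=10 passes
# check_delta_timestamps (every value is a multiple of 1/10) and
# get_delta_indices returns {"action": [-1, 0, 1]}.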
# TODO(aliberts): remove
def load_previous_and_future_frames(
item: dict[str, torch.Tensor],
hf_dataset: datasets.Dataset,
@@ -309,6 +441,7 @@ def load_previous_and_future_frames(
return item
# TODO(aliberts): remove
def calculate_episode_data_index(hf_dataset: datasets.Dataset) -> Dict[str, torch.Tensor]:
"""
Calculate episode data index for the provided HuggingFace Dataset. Relies on episode_index column of hf_dataset.
@@ -363,6 +496,7 @@ def calculate_episode_data_index(hf_dataset: datasets.Dataset) -> Dict[str, torc
return episode_data_index
# TODO(aliberts): remove
def reset_episode_index(hf_dataset: datasets.Dataset) -> datasets.Dataset:
"""Reset the `episode_index` of the provided HuggingFace Dataset.
@@ -400,7 +534,7 @@ def cycle(iterable):
iterator = iter(iterable)
- def create_branch(repo_id, *, branch: str, repo_type: str | None = None):
def create_branch(repo_id, *, branch: str, repo_type: str | None = None) -> None:
"""Create a branch on a existing Hugging Face repo. Delete the branch if it already
exists before creating it.
"""
@@ -415,12 +549,17 @@ def create_branch(repo_id, *, branch: str, repo_type: str | None = None):
api.create_branch(repo_id, repo_type=repo_type, branch=branch)
- def create_lerobot_dataset_card(tags: list | None = None, text: str | None = None) -> DatasetCard:
def create_lerobot_dataset_card(
tags: list | None = None, text: str | None = None, info: dict | None = None
) -> DatasetCard:
card = DatasetCard(DATASET_CARD_TEMPLATE)
card.data.task_categories = ["robotics"]
card.data.tags = ["LeRobot"]
if tags is not None:
card.data.tags += tags
if text is not None:
- card.text += text
card.text += f"{text}\n"
if info is not None:
card.text += "[meta/info.json](meta/info.json)\n"
card.text += f"```json\n{json.dumps(info, indent=4)}\n```"
return card
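# Usage sketch (illustrative; the tags and `info` dict are placeholders). The returned card can be
# pushed with the standard huggingface_hub API:
#
#   card = create_lerobot_dataset_card(tags=["aloha"], info=info)
#   card.push_to_hub(repo_id, repo_type="dataset")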

View File

@@ -0,0 +1,106 @@
#!/usr/bin/env python
# Copyright 2024 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import traceback
from pathlib import Path
from lerobot import available_datasets
from lerobot.common.datasets.v2.convert_dataset_v1_to_v2 import convert_dataset, parse_robot_config
LOCAL_DIR = Path("data/")
ALOHA_SINGLE_TASKS_REAL = {
"aloha_mobile_cabinet": "Open the top cabinet, store the pot inside it then close the cabinet.",
"aloha_mobile_chair": "Push the chairs in front of the desk to place them against it.",
"aloha_mobile_elevator": "Take the elevator to the 1st floor.",
"aloha_mobile_shrimp": "Sauté the raw shrimp on both sides, then serve it in the bowl.",
"aloha_mobile_wash_pan": "Pick up the pan, rinse it in the sink and then place it in the drying rack.",
"aloha_mobile_wipe_wine": "Pick up the wet cloth on the faucet and use it to clean the spilled wine on the table and underneath the glass.",
"aloha_static_battery": "Place the battery into the slot of the remote controller.",
"aloha_static_candy": "Pick up the candy and unwrap it.",
"aloha_static_coffee": "Place the coffee capsule inside the capsule container, then place the cup onto the center of the cup tray, then push the 'Hot Water' and 'Travel Mug' buttons.",
"aloha_static_coffee_new": "Place the coffee capsule inside the capsule container, then place the cup onto the center of the cup tray.",
"aloha_static_cups_open": "Pick up the plastic cup and open its lid.",
"aloha_static_fork_pick_up": "Pick up the fork and place it on the plate.",
"aloha_static_pingpong_test": "Transfer one of the two balls in the right glass into the left glass, then transfer it back to the right glass.",
"aloha_static_pro_pencil": "Pick up the pencil with the right arm, hand it over to the left arm then place it back onto the table.",
"aloha_static_screw_driver": "Pick up the screwdriver with the right arm, hand it over to the left arm then place it into the cup.",
"aloha_static_tape": "Cut a small piece of tape from the tape dispenser then place it on the cardboard box's edge.",
"aloha_static_thread_velcro": "Pick up the velcro cable tie with the left arm, then insert the end of the velcro tie into the other end's loop with the right arm.",
"aloha_static_towel": "Pick up a piece of paper towel and place it on the spilled liquid.",
"aloha_static_vinh_cup": "Pick up the platic cup with the right arm, then pop its lid open with the left arm.",
"aloha_static_vinh_cup_left": "Pick up the platic cup with the left arm, then pop its lid open with the right arm.",
"aloha_static_ziploc_slide": "Slide open the ziploc bag.",
}
ALOHA_CONFIG = Path("lerobot/configs/robot/aloha.yaml")
def batch_convert():
status = {}
logfile = LOCAL_DIR / "conversion_log.txt"
for num, repo_id in enumerate(available_datasets):
print(f"\nConverting {repo_id} ({num}/{len(available_datasets)})")
print("---------------------------------------------------------")
name = repo_id.split("/")[1]
single_task, tasks_col, robot_config = None, None, None
if "aloha" in name:
robot_config = parse_robot_config(ALOHA_CONFIG)
if "sim_insertion" in name:
single_task = "Insert the peg into the socket."
elif "sim_transfer" in name:
single_task = "Pick up the cube with the right arm and transfer it to the left arm."
else:
single_task = ALOHA_SINGLE_TASKS_REAL[name]
elif "unitreeh1" in name:
if "fold_clothes" in name:
single_task = "Fold the sweatshirt."
elif "rearrange_objects" in name or "rearrange_objects" in name:
single_task = "Put the object into the bin."
elif "two_robot_greeting" in name:
single_task = "Greet the other robot with a high five."
elif "warehouse" in name:
single_task = (
"Grab the spray paint on the shelf and place it in the bin on top of the robot dog."
)
elif name != "columbia_cairlab_pusht_real" and "pusht" in name:
single_task = "Push the T-shaped block onto the T-shaped target."
elif "xarm_lift" in name or "xarm_push" in name:
single_task = "Pick up the cube and lift it."
elif name == "umi_cup_in_the_wild":
single_task = "Put the cup on the plate."
else:
tasks_col = "language_instruction"
try:
convert_dataset(
repo_id=repo_id,
local_dir=LOCAL_DIR,
single_task=single_task,
tasks_col=tasks_col,
robot_config=robot_config,
)
status = f"{repo_id}: success."
with open(logfile, "a") as file:
file.write(status + "\n")
except Exception:
status = f"{repo_id}: failed\n {traceback.format_exc()}"
with open(logfile, "a") as file:
file.write(status + "\n")
continue
if __name__ == "__main__":
batch_convert()

View File

@@ -0,0 +1,814 @@
#!/usr/bin/env python
# Copyright 2024 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
This script will help you convert any LeRobot dataset already pushed to the hub from codebase version 1.6 to
2.0. You will be required to provide the 'tasks', which are short but accurate descriptions in plain English
for each of the tasks performed in the dataset. This will make it easy to train models with task-conditioning.
We support 3 different scenarios for these tasks (see instructions below):
1. Single task dataset: all episodes of your dataset have the same single task.
2. Single task episodes: the episodes of your dataset each contain a single task but they can differ from
one episode to the next.
3. Multi task episodes: episodes of your dataset may each contain several different tasks.
You can also provide a robot config .yaml file (optional) to this script via the option
'--robot-config' so that it writes information about the robot (robot type, motor names) this dataset was
recorded with. For now, only Aloha/Koch type robots are supported with this option.
# 1. Single task dataset
If your dataset contains a single task, you can simply provide it directly via the CLI with the
'--single-task' option.
Examples:
```bash
python lerobot/common/datasets/v2/convert_dataset_v1_to_v2.py \
--repo-id lerobot/aloha_sim_insertion_human_image \
--single-task "Insert the peg into the socket." \
--robot-config lerobot/configs/robot/aloha.yaml \
--local-dir data
```
```bash
python lerobot/common/datasets/v2/convert_dataset_v1_to_v2.py \
--repo-id aliberts/koch_tutorial \
--single-task "Pick the Lego block and drop it in the box on the right." \
--robot-config lerobot/configs/robot/koch.yaml \
--local-dir data
```
# 2. Single task episodes
If your dataset is a multi-task dataset, you have two options to provide the tasks to this script:
- If your dataset already contains a language instruction column in its parquet file, you can simply provide
this column's name with the '--tasks-col' arg.
Example:
```bash
python lerobot/common/datasets/v2/convert_dataset_v1_to_v2.py \
--repo-id lerobot/stanford_kuka_multimodal_dataset \
--tasks-col "language_instruction" \
--local-dir data
```
- If your dataset doesn't contain a language instruction, you should provide the path to a .json file with the
'--tasks-path' arg. This file should have the following structure where keys correspond to each
episode_index in the dataset, and values are the language instruction for that episode.
Example:
```json
{
"0": "Do something",
"1": "Do something else",
"2": "Do something",
"3": "Go there",
...
}
```
# 3. Multi task episodes
If you have multiple tasks per episodes, your dataset should contain a language instruction column in its
parquet file, and you must provide this column's name with the '--tasks-col' arg.
Example:
```bash
python lerobot/common/datasets/v2/convert_dataset_v1_to_v2.py \
--repo-id lerobot/stanford_kuka_multimodal_dataset \
--tasks-col "language_instruction" \
--local-dir data
```
"""
import argparse
import contextlib
import filecmp
import json
import math
import shutil
import subprocess
import warnings
from pathlib import Path
import datasets
import jsonlines
import pyarrow.compute as pc
import pyarrow.parquet as pq
import torch
from datasets import Dataset
from huggingface_hub import HfApi
from huggingface_hub.errors import EntryNotFoundError
from PIL import Image
from safetensors.torch import load_file
from lerobot.common.datasets.utils import (
DEFAULT_CHUNK_SIZE,
DEFAULT_PARQUET_PATH,
DEFAULT_VIDEO_PATH,
EPISODES_PATH,
INFO_PATH,
STATS_PATH,
TASKS_PATH,
create_branch,
create_lerobot_dataset_card,
flatten_dict,
get_hub_safe_version,
unflatten_dict,
)
from lerobot.common.datasets.video_utils import VideoFrame # noqa: F401
from lerobot.common.utils.utils import init_hydra_config
V16 = "v1.6"
V20 = "v2.0"
GITATTRIBUTES_REF = "aliberts/gitattributes_reference"
V1_VIDEO_FILE = "{video_key}_episode_{episode_index:06d}.mp4"
V1_INFO_PATH = "meta_data/info.json"
V1_STATS_PATH = "meta_data/stats.safetensors"
def parse_robot_config(config_path: Path, config_overrides: list[str] | None = None) -> tuple[str, dict]:
robot_cfg = init_hydra_config(config_path, config_overrides)
if robot_cfg["robot_type"] in ["aloha", "koch"]:
state_names = [
f"{arm}_{motor}" if len(robot_cfg["follower_arms"]) > 1 else motor
for arm in robot_cfg["follower_arms"]
for motor in robot_cfg["follower_arms"][arm]["motors"]
]
action_names = [
# f"{arm}_{motor}" for arm in ["left", "right"] for motor in robot_cfg["leader_arms"][arm]["motors"]
f"{arm}_{motor}" if len(robot_cfg["leader_arms"]) > 1 else motor
for arm in robot_cfg["leader_arms"]
for motor in robot_cfg["leader_arms"][arm]["motors"]
]
# elif robot_cfg["robot_type"] == "stretch3": TODO
else:
raise NotImplementedError(
"Please provide robot_config={'robot_type': ..., 'names': ...} directly to convert_dataset()."
)
return {
"robot_type": robot_cfg["robot_type"],
"names": {
"observation.state": state_names,
"action": action_names,
},
}
def load_json(fpath: Path) -> dict:
with open(fpath) as f:
return json.load(f)
def write_json(data: dict, fpath: Path) -> None:
fpath.parent.mkdir(exist_ok=True, parents=True)
with open(fpath, "w") as f:
json.dump(data, f, indent=4, ensure_ascii=False)
def write_jsonlines(data: dict, fpath: Path) -> None:
fpath.parent.mkdir(exist_ok=True, parents=True)
with jsonlines.open(fpath, "w") as writer:
writer.write_all(data)
def convert_stats_to_json(v1_dir: Path, v2_dir: Path) -> None:
safetensor_path = v1_dir / V1_STATS_PATH
stats = load_file(safetensor_path)
serialized_stats = {key: value.tolist() for key, value in stats.items()}
serialized_stats = unflatten_dict(serialized_stats)
json_path = v2_dir / STATS_PATH
json_path.parent.mkdir(exist_ok=True, parents=True)
with open(json_path, "w") as f:
json.dump(serialized_stats, f, indent=4)
# Sanity check
with open(json_path) as f:
stats_json = json.load(f)
stats_json = flatten_dict(stats_json)
stats_json = {key: torch.tensor(value) for key, value in stats_json.items()}
for key in stats:
torch.testing.assert_close(stats_json[key], stats[key])
def get_keys(dataset: Dataset) -> dict[str, list]:
sequence_keys, image_keys, video_keys = [], [], []
for key, ft in dataset.features.items():
if isinstance(ft, datasets.Sequence):
sequence_keys.append(key)
elif isinstance(ft, datasets.Image):
image_keys.append(key)
elif ft._type == "VideoFrame":
video_keys.append(key)
return {
"sequence": sequence_keys,
"image": image_keys,
"video": video_keys,
}
def add_task_index_by_episodes(dataset: Dataset, tasks_by_episodes: dict) -> tuple[Dataset, list[str]]:
df = dataset.to_pandas()
tasks = list(set(tasks_by_episodes.values()))
tasks_to_task_index = {task: task_idx for task_idx, task in enumerate(tasks)}
episodes_to_task_index = {ep_idx: tasks_to_task_index[task] for ep_idx, task in tasks_by_episodes.items()}
df["task_index"] = df["episode_index"].map(episodes_to_task_index).astype(int)
features = dataset.features
features["task_index"] = datasets.Value(dtype="int64")
dataset = Dataset.from_pandas(df, features=features, split="train")
return dataset, tasks
def add_task_index_from_tasks_col(
dataset: Dataset, tasks_col: str
) -> tuple[Dataset, dict[str, list[str]], list[str]]:
df = dataset.to_pandas()
# HACK: This is to clean some of the instructions in our version of Open X datasets
prefix_to_clean = "tf.Tensor(b'"
suffix_to_clean = "', shape=(), dtype=string)"
df[tasks_col] = df[tasks_col].str.removeprefix(prefix_to_clean).str.removesuffix(suffix_to_clean)
# Create task_index col
tasks_by_episode = df.groupby("episode_index")[tasks_col].unique().apply(lambda x: x.tolist()).to_dict()
tasks = df[tasks_col].unique().tolist()
tasks_to_task_index = {task: idx for idx, task in enumerate(tasks)}
df["task_index"] = df[tasks_col].map(tasks_to_task_index).astype(int)
# Build the dataset back from df
features = dataset.features
features["task_index"] = datasets.Value(dtype="int64")
dataset = Dataset.from_pandas(df, features=features, split="train")
dataset = dataset.remove_columns(tasks_col)
return dataset, tasks, tasks_by_episode
def split_parquet_by_episodes(
dataset: Dataset,
keys: dict[str, list],
total_episodes: int,
total_chunks: int,
output_dir: Path,
) -> list:
table = dataset.remove_columns(keys["video"])._data.table
episode_lengths = []
for ep_chunk in range(total_chunks):
ep_chunk_start = DEFAULT_CHUNK_SIZE * ep_chunk
ep_chunk_end = min(DEFAULT_CHUNK_SIZE * (ep_chunk + 1), total_episodes)
chunk_dir = "/".join(DEFAULT_PARQUET_PATH.split("/")[:-1]).format(episode_chunk=ep_chunk)
(output_dir / chunk_dir).mkdir(parents=True, exist_ok=True)
for ep_idx in range(ep_chunk_start, ep_chunk_end):
ep_table = table.filter(pc.equal(table["episode_index"], ep_idx))
episode_lengths.insert(ep_idx, len(ep_table))
output_file = output_dir / DEFAULT_PARQUET_PATH.format(
episode_chunk=ep_chunk, episode_index=ep_idx
)
pq.write_table(ep_table, output_file)
return episode_lengths
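# Path sketch: with DEFAULT_CHUNK_SIZE = 1000, episode 1042 is written to
#   data/chunk-001/episode_001042.parquet
# since its episode_chunk is 1042 // 1000 = 1.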
def move_videos(
repo_id: str,
video_keys: list[str],
total_episodes: int,
total_chunks: int,
work_dir: Path,
clean_gitattributes: Path,
branch: str = "main",
) -> None:
"""
HACK: Since HfApi() doesn't provide a way to move files directly in a repo, this function runs git
commands to fetch the git lfs video file references and move them into subdirectories without having
to actually download the videos.
"""
_lfs_clone(repo_id, work_dir, branch)
videos_moved = False
video_files = [str(f.relative_to(work_dir)) for f in work_dir.glob("videos*/*.mp4")]
if len(video_files) == 0:
video_files = [str(f.relative_to(work_dir)) for f in work_dir.glob("videos*/*/*/*.mp4")]
videos_moved = True # Videos have already been moved
assert len(video_files) == total_episodes * len(video_keys)
lfs_untracked_videos = _get_lfs_untracked_videos(work_dir, video_files)
current_gitattributes = work_dir / ".gitattributes"
if not filecmp.cmp(current_gitattributes, clean_gitattributes, shallow=False):
fix_gitattributes(work_dir, current_gitattributes, clean_gitattributes)
if lfs_untracked_videos:
fix_lfs_video_files_tracking(work_dir, lfs_untracked_videos)
if videos_moved:
return
video_dirs = sorted(work_dir.glob("videos*/"))
for ep_chunk in range(total_chunks):
ep_chunk_start = DEFAULT_CHUNK_SIZE * ep_chunk
ep_chunk_end = min(DEFAULT_CHUNK_SIZE * (ep_chunk + 1), total_episodes)
for vid_key in video_keys:
chunk_dir = "/".join(DEFAULT_VIDEO_PATH.split("/")[:-1]).format(
episode_chunk=ep_chunk, video_key=vid_key
)
(work_dir / chunk_dir).mkdir(parents=True, exist_ok=True)
for ep_idx in range(ep_chunk_start, ep_chunk_end):
target_path = DEFAULT_VIDEO_PATH.format(
episode_chunk=ep_chunk, video_key=vid_key, episode_index=ep_idx
)
video_file = V1_VIDEO_FILE.format(video_key=vid_key, episode_index=ep_idx)
if len(video_dirs) == 1:
video_path = video_dirs[0] / video_file
else:
for video_dir in video_dirs:
if (video_dir / video_file).is_file():
video_path = video_dir / video_file
break
video_path.rename(work_dir / target_path)
commit_message = "Move video files into chunk subdirectories"
subprocess.run(["git", "add", "."], cwd=work_dir, check=True)
subprocess.run(["git", "commit", "-m", commit_message], cwd=work_dir, check=True)
subprocess.run(["git", "push"], cwd=work_dir, check=True)
def fix_lfs_video_files_tracking(work_dir: Path, lfs_untracked_videos: list[str]) -> None:
"""
HACK: This function fixes git lfs tracking, which was not properly set up on some repos. In that case,
there's no other option than to download the actual files and re-upload them with lfs tracking.
"""
for i in range(0, len(lfs_untracked_videos), 100):
files = lfs_untracked_videos[i : i + 100]
try:
subprocess.run(["git", "rm", "--cached", *files], cwd=work_dir, capture_output=True, check=True)
except subprocess.CalledProcessError as e:
print("git rm --cached ERROR:")
print(e.stderr)
subprocess.run(["git", "add", *files], cwd=work_dir, check=True)
commit_message = "Track video files with git lfs"
subprocess.run(["git", "commit", "-m", commit_message], cwd=work_dir, check=True)
subprocess.run(["git", "push"], cwd=work_dir, check=True)
def fix_gitattributes(work_dir: Path, current_gitattributes: Path, clean_gitattributes: Path) -> None:
shutil.copyfile(clean_gitattributes, current_gitattributes)
subprocess.run(["git", "add", ".gitattributes"], cwd=work_dir, check=True)
subprocess.run(["git", "commit", "-m", "Fix .gitattributes"], cwd=work_dir, check=True)
subprocess.run(["git", "push"], cwd=work_dir, check=True)
def _lfs_clone(repo_id: str, work_dir: Path, branch: str) -> None:
subprocess.run(["git", "lfs", "install"], cwd=work_dir, check=True)
repo_url = f"https://huggingface.co/datasets/{repo_id}"
env = {"GIT_LFS_SKIP_SMUDGE": "1"} # Prevent downloading LFS files
subprocess.run(
["git", "clone", "--branch", branch, "--single-branch", "--depth", "1", repo_url, str(work_dir)],
check=True,
env=env,
)
def _get_lfs_untracked_videos(work_dir: Path, video_files: list[str]) -> list[str]:
lfs_tracked_files = subprocess.run(
["git", "lfs", "ls-files", "-n"], cwd=work_dir, capture_output=True, text=True, check=True
)
lfs_tracked_files = set(lfs_tracked_files.stdout.splitlines())
return [f for f in video_files if f not in lfs_tracked_files]
def _get_audio_info(video_path: Path | str) -> dict:
ffprobe_audio_cmd = [
"ffprobe",
"-v",
"error",
"-select_streams",
"a:0",
"-show_entries",
"stream=channels,codec_name,bit_rate,sample_rate,bit_depth,channel_layout,duration",
"-of",
"json",
str(video_path),
]
result = subprocess.run(ffprobe_audio_cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
if result.returncode != 0:
raise RuntimeError(f"Error running ffprobe: {result.stderr}")
info = json.loads(result.stdout)
audio_stream_info = info["streams"][0] if info.get("streams") else None
if audio_stream_info is None:
return {"has_audio": False}
# Return the information, defaulting to None if no audio stream is present
return {
"has_audio": True,
"audio.channels": audio_stream_info.get("channels", None),
"audio.codec": audio_stream_info.get("codec_name", None),
"audio.bit_rate": int(audio_stream_info["bit_rate"]) if audio_stream_info.get("bit_rate") else None,
"audio.sample_rate": int(audio_stream_info["sample_rate"])
if audio_stream_info.get("sample_rate")
else None,
"audio.bit_depth": audio_stream_info.get("bit_depth", None),
"audio.channel_layout": audio_stream_info.get("channel_layout", None),
}
def _get_video_info(video_path: Path | str) -> dict:
ffprobe_video_cmd = [
"ffprobe",
"-v",
"error",
"-select_streams",
"v:0",
"-show_entries",
"stream=r_frame_rate,width,height,codec_name,nb_frames,duration,pix_fmt",
"-of",
"json",
str(video_path),
]
result = subprocess.run(ffprobe_video_cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
if result.returncode != 0:
raise RuntimeError(f"Error running ffprobe: {result.stderr}")
info = json.loads(result.stdout)
video_stream_info = info["streams"][0]
# Calculate fps from r_frame_rate
r_frame_rate = video_stream_info["r_frame_rate"]
num, denom = map(int, r_frame_rate.split("/"))
fps = num / denom
pixel_channels = get_video_pixel_channels(video_stream_info["pix_fmt"])
video_info = {
"video.fps": fps,
"video.width": video_stream_info["width"],
"video.height": video_stream_info["height"],
"video.channels": pixel_channels,
"video.codec": video_stream_info["codec_name"],
"video.pix_fmt": video_stream_info["pix_fmt"],
"video.is_depth_map": False,
**_get_audio_info(video_path),
}
return video_info
def get_videos_info(repo_id: str, local_dir: Path, video_keys: list[str], branch: str) -> dict:
hub_api = HfApi()
videos_info_dict = {"videos_path": DEFAULT_VIDEO_PATH}
# Assumes first episode
video_files = [
DEFAULT_VIDEO_PATH.format(episode_chunk=0, video_key=vid_key, episode_index=0)
for vid_key in video_keys
]
hub_api.snapshot_download(
repo_id=repo_id, repo_type="dataset", local_dir=local_dir, revision=branch, allow_patterns=video_files
)
for vid_key, vid_path in zip(video_keys, video_files, strict=True):
videos_info_dict[vid_key] = _get_video_info(local_dir / vid_path)
return videos_info_dict
def get_video_pixel_channels(pix_fmt: str) -> int:
if "gray" in pix_fmt or "depth" in pix_fmt or "monochrome" in pix_fmt:
return 1
elif "rgba" in pix_fmt or "yuva" in pix_fmt:
return 4
elif "rgb" in pix_fmt or "yuv" in pix_fmt:
return 3
else:
raise ValueError("Unknown format")
def get_image_pixel_channels(image: Image):
if image.mode == "L":
return 1 # Grayscale
elif image.mode == "LA":
return 2 # Grayscale + Alpha
elif image.mode == "RGB":
return 3 # RGB
elif image.mode == "RGBA":
return 4 # RGBA
else:
raise ValueError("Unknown format")
def get_video_shapes(videos_info: dict, video_keys: list) -> dict:
video_shapes = {}
for img_key in video_keys:
channels = get_video_pixel_channels(videos_info[img_key]["video.pix_fmt"])
video_shapes[img_key] = {
"width": videos_info[img_key]["video.width"],
"height": videos_info[img_key]["video.height"],
"channels": channels,
}
return video_shapes
def get_image_shapes(dataset: Dataset, image_keys: list) -> dict:
image_shapes = {}
for img_key in image_keys:
image = dataset[0][img_key] # Assuming first row
channels = get_image_pixel_channels(image)
image_shapes[img_key] = {
"width": image.width,
"height": image.height,
"channels": channels,
}
return image_shapes
def get_generic_motor_names(sequence_shapes: dict) -> dict:
return {key: [f"motor_{i}" for i in range(length)] for key, length in sequence_shapes.items()}
def convert_dataset(
repo_id: str,
local_dir: Path,
single_task: str | None = None,
tasks_path: Path | None = None,
tasks_col: Path | None = None,
robot_config: dict | None = None,
test_branch: str | None = None,
):
v1 = get_hub_safe_version(repo_id, V16, enforce_v2=False)
v1x_dir = local_dir / V16 / repo_id
v20_dir = local_dir / V20 / repo_id
v1x_dir.mkdir(parents=True, exist_ok=True)
v20_dir.mkdir(parents=True, exist_ok=True)
hub_api = HfApi()
hub_api.snapshot_download(
repo_id=repo_id, repo_type="dataset", revision=v1, local_dir=v1x_dir, ignore_patterns="videos*/"
)
branch = "main"
if test_branch:
branch = test_branch
create_branch(repo_id=repo_id, branch=test_branch, repo_type="dataset")
metadata_v1 = load_json(v1x_dir / V1_INFO_PATH)
dataset = datasets.load_dataset("parquet", data_dir=v1x_dir / "data", split="train")
keys = get_keys(dataset)
if single_task and "language_instruction" in dataset.column_names:
warnings.warn(
"'single_task' provided but 'language_instruction' tasks_col found. Using 'language_instruction'.",
stacklevel=1,
)
single_task = None
tasks_col = "language_instruction"
# Episodes & chunks
episode_indices = sorted(dataset.unique("episode_index"))
total_episodes = len(episode_indices)
assert episode_indices == list(range(total_episodes))
total_videos = total_episodes * len(keys["video"])
total_chunks = total_episodes // DEFAULT_CHUNK_SIZE
if total_episodes % DEFAULT_CHUNK_SIZE != 0:
total_chunks += 1
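# e.g. with a chunk size of 1000, 2500 episodes are split into 3 chunks (ceil division)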
# Tasks
if single_task:
tasks_by_episodes = {ep_idx: single_task for ep_idx in episode_indices}
dataset, tasks = add_task_index_by_episodes(dataset, tasks_by_episodes)
tasks_by_episodes = {ep_idx: [task] for ep_idx, task in tasks_by_episodes.items()}
elif tasks_path:
tasks_by_episodes = load_json(tasks_path)
tasks_by_episodes = {int(ep_idx): task for ep_idx, task in tasks_by_episodes.items()}
# tasks = list(set(tasks_by_episodes.values()))
dataset, tasks = add_task_index_by_episodes(dataset, tasks_by_episodes)
tasks_by_episodes = {ep_idx: [task] for ep_idx, task in tasks_by_episodes.items()}
elif tasks_col:
dataset, tasks, tasks_by_episodes = add_task_index_from_tasks_col(dataset, tasks_col)
else:
raise ValueError("Expected one of 'single_task', 'tasks_path' or 'tasks_col' to be provided.")
assert set(tasks) == {task for ep_tasks in tasks_by_episodes.values() for task in ep_tasks}
tasks = [{"task_index": task_idx, "task": task} for task_idx, task in enumerate(tasks)]
write_jsonlines(tasks, v20_dir / TASKS_PATH)
# Shapes
sequence_shapes = {key: dataset.features[key].length for key in keys["sequence"]}
image_shapes = get_image_shapes(dataset, keys["image"]) if len(keys["image"]) > 0 else {}
# Videos
if len(keys["video"]) > 0:
assert metadata_v1.get("video", False)
tmp_video_dir = local_dir / "videos" / V20 / repo_id
tmp_video_dir.mkdir(parents=True, exist_ok=True)
clean_gitattr = Path(
hub_api.hf_hub_download(
repo_id=GITATTRIBUTES_REF, repo_type="dataset", local_dir=local_dir, filename=".gitattributes"
)
).absolute()
move_videos(
repo_id, keys["video"], total_episodes, total_chunks, tmp_video_dir, clean_gitattr, branch
)
videos_info = get_videos_info(repo_id, v1x_dir, video_keys=keys["video"], branch=branch)
video_shapes = get_video_shapes(videos_info, keys["video"])
for img_key in keys["video"]:
assert math.isclose(videos_info[img_key]["video.fps"], metadata_v1["fps"], rel_tol=1e-3)
if "encoding" in metadata_v1:
assert videos_info[img_key]["video.pix_fmt"] == metadata_v1["encoding"]["pix_fmt"]
else:
assert metadata_v1.get("video", 0) == 0
videos_info = None
video_shapes = {}
# Split data into 1 parquet file by episode
episode_lengths = split_parquet_by_episodes(dataset, keys, total_episodes, total_chunks, v20_dir)
# Names
if robot_config is not None:
robot_type = robot_config["robot_type"]
names = robot_config["names"]
if "observation.effort" in keys["sequence"]:
names["observation.effort"] = names["observation.state"]
if "observation.velocity" in keys["sequence"]:
names["observation.velocity"] = names["observation.state"]
repo_tags = [robot_type]
else:
robot_type = "unknown"
names = get_generic_motor_names(sequence_shapes)
repo_tags = None
assert set(names) == set(keys["sequence"])
for key in sequence_shapes:
assert len(names[key]) == sequence_shapes[key]
# Episodes
episodes = [
{"episode_index": ep_idx, "tasks": tasks_by_episodes[ep_idx], "length": episode_lengths[ep_idx]}
for ep_idx in episode_indices
]
write_jsonlines(episodes, v20_dir / EPISODES_PATH)
# Assemble metadata v2.0
metadata_v2_0 = {
"codebase_version": V20,
"data_path": DEFAULT_PARQUET_PATH,
"robot_type": robot_type,
"total_episodes": total_episodes,
"total_frames": len(dataset),
"total_tasks": len(tasks),
"total_videos": total_videos,
"total_chunks": total_chunks,
"chunks_size": DEFAULT_CHUNK_SIZE,
"fps": metadata_v1["fps"],
"splits": {"train": f"0:{total_episodes}"},
"keys": keys["sequence"],
"video_keys": keys["video"],
"image_keys": keys["image"],
"shapes": {**sequence_shapes, **video_shapes, **image_shapes},
"names": names,
"videos": videos_info,
}
write_json(metadata_v2_0, v20_dir / INFO_PATH)
convert_stats_to_json(v1x_dir, v20_dir)
with contextlib.suppress(EntryNotFoundError):
hub_api.delete_folder(repo_id=repo_id, path_in_repo="data", repo_type="dataset", revision=branch)
with contextlib.suppress(EntryNotFoundError):
hub_api.delete_folder(repo_id=repo_id, path_in_repo="meta_data", repo_type="dataset", revision=branch)
with contextlib.suppress(EntryNotFoundError):
hub_api.delete_folder(repo_id=repo_id, path_in_repo="meta", repo_type="dataset", revision=branch)
hub_api.upload_folder(
repo_id=repo_id,
path_in_repo="data",
folder_path=v20_dir / "data",
repo_type="dataset",
revision=branch,
)
hub_api.upload_folder(
repo_id=repo_id,
path_in_repo="meta",
folder_path=v20_dir / "meta",
repo_type="dataset",
revision=branch,
)
card = create_lerobot_dataset_card(tags=repo_tags, info=metadata_v2_0)
card.push_to_hub(repo_id=repo_id, repo_type="dataset", revision=branch)
if not test_branch:
create_branch(repo_id=repo_id, branch=V20, repo_type="dataset")
# TODO:
# - [X] Add shapes
# - [X] Add keys
# - [X] Add paths
# - [X] convert stats.json
# - [X] Add task.json
# - [X] Add names
# - [X] Add robot_type
# - [X] Add splits
# - [X] Push properly to branch v2.0 and delete v1.6 stuff from that branch
# - [X] Handle multitask datasets
# - [X] Handle hf hub repo limits (add chunks logic)
# - [X] Add test-branch
# - [X] Use jsonlines for episodes
# - [X] Add sanity checks (encoding, shapes)
def main():
parser = argparse.ArgumentParser()
task_args = parser.add_mutually_exclusive_group(required=True)
parser.add_argument(
"--repo-id",
type=str,
required=True,
help="Repository identifier on Hugging Face: a community or a user name `/` the name of the dataset (e.g. `lerobot/pusht`, `cadene/aloha_sim_insertion_human`).",
)
task_args.add_argument(
"--single-task",
type=str,
help="A short but accurate description of the single task performed in the dataset.",
)
task_args.add_argument(
"--tasks-col",
type=str,
help="The name of the column containing language instructions",
)
task_args.add_argument(
"--tasks-path",
type=Path,
help="The path to a .json file containing one language instruction for each episode_index",
)
parser.add_argument(
"--robot-config",
type=Path,
default=None,
help="Path to the robot's config yaml the dataset during conversion.",
)
parser.add_argument(
"--robot-overrides",
type=str,
nargs="*",
help="Any key=value arguments to override the robot config values (use dots for.nested=overrides)",
)
parser.add_argument(
"--local-dir",
type=Path,
default=None,
help="Local directory to store the dataset during conversion. Defaults to /tmp/lerobot_dataset_v2",
)
parser.add_argument(
"--test-branch",
type=str,
default=None,
help="Repo branch to test your conversion first (e.g. 'v2.0.test')",
)
args = parser.parse_args()
if not args.local_dir:
args.local_dir = Path("/tmp/lerobot_dataset_v2")
robot_config = parse_robot_config(args.robot_config, args.robot_overrides) if args.robot_config else None
del args.robot_config, args.robot_overrides
convert_dataset(**vars(args), robot_config=robot_config)
if __name__ == "__main__":
main()

View File

@@ -27,45 +27,8 @@ import torchvision
from datasets.features.features import register_feature
def load_from_videos(
item: dict[str, torch.Tensor],
video_frame_keys: list[str],
videos_dir: Path,
tolerance_s: float,
backend: str = "pyav",
):
"""Note: When using data workers (e.g. DataLoader with num_workers>0), do not call this function
in the main process (e.g. by using a second DataLoader with num_workers=0). It will result in a segmentation fault.
This probably happens because a memory reference to the video loader is created in the main process and a
subprocess fails to access it.
"""
# since video path already contains "videos" (e.g. videos_dir="data/videos", path="videos/episode_0.mp4")
data_dir = videos_dir.parent
for key in video_frame_keys:
if isinstance(item[key], list):
# load multiple frames at once (expected when delta_timestamps is not None)
timestamps = [frame["timestamp"] for frame in item[key]]
paths = [frame["path"] for frame in item[key]]
if len(set(paths)) > 1:
raise NotImplementedError("All video paths are expected to be the same for now.")
video_path = data_dir / paths[0]
frames = decode_video_frames_torchvision(video_path, timestamps, tolerance_s, backend)
item[key] = frames
else:
# load one frame
timestamps = [item[key]["timestamp"]]
video_path = data_dir / item[key]["path"]
frames = decode_video_frames_torchvision(video_path, timestamps, tolerance_s, backend)
item[key] = frames[0]
return item
def decode_video_frames_torchvision(
video_path: str,
video_path: Path | str,
timestamps: list[float],
tolerance_s: float,
backend: str = "pyav",
@@ -163,8 +126,8 @@ def decode_video_frames_torchvision(
def encode_video_frames(
imgs_dir: Path,
video_path: Path,
imgs_dir: Path | str,
video_path: Path | str,
fps: int,
vcodec: str = "libsvtav1",
pix_fmt: str = "yuv420p",

View File

@@ -21,9 +21,9 @@ from PIL import Image
from lerobot.common.robot_devices.utils import (
RobotDeviceAlreadyConnectedError,
RobotDeviceNotConnectedError,
busy_wait,
)
from lerobot.common.utils.utils import capture_timestamp_utc
from lerobot.scripts.control_robot import busy_wait
SERIAL_NUMBER_INDEX = 1
@@ -168,6 +168,7 @@ class IntelRealSenseCameraConfig:
width: int | None = None
height: int | None = None
color_mode: str = "rgb"
channels: int | None = None
use_depth: bool = False
force_hardware_reset: bool = True
rotation: int | None = None
@@ -179,6 +180,8 @@ class IntelRealSenseCameraConfig:
f"`color_mode` is expected to be 'rgb' or 'bgr', but {self.color_mode} is provided."
)
self.channels = 3
at_least_one_is_not_none = self.fps is not None or self.width is not None or self.height is not None
at_least_one_is_none = self.fps is None or self.width is None or self.height is None
if at_least_one_is_not_none and at_least_one_is_none:
@@ -200,7 +203,8 @@ class IntelRealSenseCamera:
To find the camera indices of your cameras, you can run our utility script that will save a few frames for each camera:
```bash
python lerobot/common/robot_devices/cameras/intelrealsense.py --images-dir outputs/images_from_intelrealsense_cameras
python lerobot/common/robot_devices/cameras/intelrealsense.py \
--images-dir outputs/images_from_intelrealsense_cameras
```
When an IntelRealSenseCamera is instantiated, if no specific config is provided, the default fps, width, height and color_mode
@@ -254,6 +258,7 @@ class IntelRealSenseCamera:
self.fps = config.fps
self.width = config.width
self.height = config.height
self.channels = config.channels
self.color_mode = config.color_mode
self.use_depth = config.use_depth
self.force_hardware_reset = config.force_hardware_reset

View File

@@ -192,6 +192,7 @@ class OpenCVCameraConfig:
width: int | None = None
height: int | None = None
color_mode: str = "rgb"
channels: int | None = None
rotation: int | None = None
mock: bool = False
@@ -201,6 +202,8 @@ class OpenCVCameraConfig:
f"`color_mode` is expected to be 'rgb' or 'bgr', but {self.color_mode} is provided."
)
self.channels = 3
if self.rotation not in [-90, None, 90, 180]:
raise ValueError(f"`rotation` must be in [-90, None, 90, 180] (got {self.rotation})")
@@ -216,7 +219,8 @@ class OpenCVCamera:
To find the camera indices of your cameras, you can run our utility script that will save a few frames for each camera:
```bash
python lerobot/common/robot_devices/cameras/opencv.py --images-dir outputs/images_from_opencv_cameras
python lerobot/common/robot_devices/cameras/opencv.py \
--images-dir outputs/images_from_opencv_cameras
```
When an OpenCVCamera is instantiated, if no specific config is provided, the default fps, width, height and color_mode
@@ -268,6 +272,7 @@ class OpenCVCamera:
self.fps = config.fps
self.width = config.width
self.height = config.height
self.channels = config.channels
self.color_mode = config.color_mode
self.mock = config.mock
@@ -323,7 +328,7 @@ class OpenCVCamera:
if self.camera_index not in available_cam_ids:
raise ValueError(
f"`camera_index` is expected to be one of these available cameras {available_cam_ids}, but {self.camera_index} is provided instead. "
"To find the camera index you should use, run `python lerobot/common/robot_devices/cameras/opencv.py`."
"To find the camera index you should use, run `python lerobot/lerobot/common/robot_devices/cameras/opencv.py`."
)
raise OSError(f"Can't access OpenCVCamera({camera_idx}).")

View File

@@ -15,7 +15,8 @@ import torch
import tqdm
from termcolor import colored
from lerobot.common.datasets.populate_dataset import add_frame, safe_stop_image_writer
from lerobot.common.datasets.image_writer import safe_stop_image_writer
from lerobot.common.datasets.lerobot_dataset import LeRobotDataset
from lerobot.common.policies.factory import make_policy
from lerobot.common.robot_devices.robots.utils import Robot
from lerobot.common.robot_devices.utils import busy_wait
@@ -227,7 +228,7 @@ def control_loop(
control_time_s=None,
teleoperate=False,
display_cameras=False,
dataset=None,
dataset: LeRobotDataset | None = None,
events=None,
policy=None,
device=None,
@@ -247,7 +248,7 @@ def control_loop(
if teleoperate and policy is not None:
raise ValueError("When `teleoperate` is True, `policy` should be None.")
if dataset is not None and fps is not None and dataset["fps"] != fps:
if dataset is not None and fps is not None and dataset.fps != fps:
raise ValueError(f"The dataset fps should be equal to requested fps ({dataset['fps']} != {fps}).")
timestamp = 0
@@ -268,7 +269,8 @@ def control_loop(
action = {"action": action}
if dataset is not None:
add_frame(dataset, observation, action)
frame = {**observation, **action}
dataset.add_frame(frame)
if display_cameras and not is_headless():
image_keys = [key for key in observation if "image" in key]

View File

@@ -4,7 +4,6 @@ import math
import time
import traceback
from copy import deepcopy
from pathlib import Path
import numpy as np
import tqdm
@@ -229,35 +228,6 @@ def assert_same_address(model_ctrl_table, motor_models, data_name):
)
def find_available_ports():
ports = []
for path in Path("/dev").glob("tty*"):
ports.append(str(path))
return ports
def find_port():
print("Finding all available ports for the DynamixelMotorsBus.")
ports_before = find_available_ports()
print(ports_before)
print("Remove the usb cable from your DynamixelMotorsBus and press Enter when done.")
input()
time.sleep(0.5)
ports_after = find_available_ports()
ports_diff = list(set(ports_before) - set(ports_after))
if len(ports_diff) == 1:
port = ports_diff[0]
print(f"The port of this DynamixelMotorsBus is '{port}'")
print("Reconnect the usb cable.")
elif len(ports_diff) == 0:
raise OSError(f"Could not detect the port. No difference was found ({ports_diff}).")
else:
raise OSError(f"Could not detect the port. More than one port was found ({ports_diff}).")
class TorqueMode(enum.Enum):
ENABLED = 1
DISABLED = 0
@@ -290,8 +260,8 @@ class DynamixelMotorsBus:
A DynamixelMotorsBus instance requires a port (e.g. `DynamixelMotorsBus(port="/dev/tty.usbmodem575E0031751")`).
To find the port, you can run our utility script:
```bash
python lerobot/common/robot_devices/motors/dynamixel.py
>>> Finding all available ports for the DynamixelMotorsBus.
python lerobot/scripts/find_motors_bus_port.py
>>> Finding all available ports for the MotorBus.
>>> ['/dev/tty.usbmodem575E0032081', '/dev/tty.usbmodem575E0031751']
>>> Remove the usb cable from your DynamixelMotorsBus and press Enter when done.
>>> The port of this DynamixelMotorsBus is /dev/tty.usbmodem575E0031751.
@@ -369,7 +339,7 @@ class DynamixelMotorsBus:
except Exception:
traceback.print_exc()
print(
"\nTry running `python lerobot/common/robot_devices/motors/dynamixel.py` to make sure you are using the correct port.\n"
"\nTry running `python lerobot/scripts/find_motors_bus_port.py` to make sure you are using the correct port.\n"
)
raise
@@ -378,20 +348,6 @@ class DynamixelMotorsBus:
self.port_handler.setPacketTimeoutMillis(TIMEOUT_MS)
# Set expected baudrate for the bus
self.set_bus_baudrate(BAUDRATE)
if not self.are_motors_configured():
input(
"\n/!\\ A configuration issue has been detected with your motors: \n"
"If it's the first time that you use these motors, press enter to configure your motors... but before "
"verify that all the cables are connected the proper way. If you find an issue, before making a modification, "
"kill the python process, unplug the power cord to not damage the motors, rewire correctly, then plug the power "
"again and relaunch the script.\n"
)
print()
self.configure_motors()
def reconnect(self):
if self.mock:
import tests.mock_dynamixel_sdk as dxl
@@ -415,120 +371,14 @@ class DynamixelMotorsBus:
print(e)
return False
def configure_motors(self):
# TODO(rcadene): This script assumes motors follow the X_SERIES baudrates
# TODO(rcadene): Refactor this function with intermediate high-level functions
print("Scanning all baudrates and motor indices")
all_baudrates = set(X_SERIES_BAUDRATE_TABLE.values())
ids_per_baudrate = {}
for baudrate in all_baudrates:
self.set_bus_baudrate(baudrate)
present_ids = self.find_motor_indices()
if len(present_ids) > 0:
ids_per_baudrate[baudrate] = present_ids
print(f"Motor indices detected: {ids_per_baudrate}")
print()
possible_baudrates = list(ids_per_baudrate.keys())
possible_ids = list({idx for sublist in ids_per_baudrate.values() for idx in sublist})
untaken_ids = list(set(range(MAX_ID_RANGE)) - set(possible_ids) - set(self.motor_indices))
# Connect successively one motor to the chain and write a unique random index for each
for i in range(len(self.motors)):
self.disconnect()
input(
"1. Unplug the power cord\n"
"2. Plug/unplug minimal number of cables to only have the first "
f"{i+1} motor(s) ({self.motor_names[:i+1]}) connected.\n"
"3. Re-plug the power cord\n"
"Press Enter to continue..."
)
print()
self.reconnect()
if i > 0:
try:
self._read_with_motor_ids(self.motor_models, untaken_ids[:i], "ID")
except ConnectionError:
print(f"Failed to read from {untaken_ids[:i+1]}. Make sure the power cord is plugged in.")
input("Press Enter to continue...")
print()
self.reconnect()
print("Scanning possible baudrates and motor indices")
motor_found = False
for baudrate in possible_baudrates:
self.set_bus_baudrate(baudrate)
present_ids = self.find_motor_indices(possible_ids)
if len(present_ids) == 1:
present_idx = present_ids[0]
print(f"Detected motor with index {present_idx}")
if baudrate != BAUDRATE:
print(f"Setting its baudrate to {BAUDRATE}")
baudrate_idx = list(X_SERIES_BAUDRATE_TABLE.values()).index(BAUDRATE)
# The write can fail, so we allow retries
for _ in range(NUM_WRITE_RETRY):
self._write_with_motor_ids(
self.motor_models, present_idx, "Baud_Rate", baudrate_idx
)
time.sleep(0.5)
self.set_bus_baudrate(BAUDRATE)
try:
present_baudrate_idx = self._read_with_motor_ids(
self.motor_models, present_idx, "Baud_Rate"
)
except ConnectionError:
print("Failed to write baudrate. Retrying.")
self.set_bus_baudrate(baudrate)
continue
break
else:
raise
if present_baudrate_idx != baudrate_idx:
raise OSError("Failed to write baudrate.")
print(f"Setting its index to a temporary untaken index ({untaken_ids[i]})")
self._write_with_motor_ids(self.motor_models, present_idx, "ID", untaken_ids[i])
present_idx = self._read_with_motor_ids(self.motor_models, untaken_ids[i], "ID")
if present_idx != untaken_ids[i]:
raise OSError("Failed to write index.")
motor_found = True
break
elif len(present_ids) > 1:
raise OSError(f"More than one motor detected ({present_ids}), but only one was expected.")
if not motor_found:
raise OSError(
"No motor found, but one new motor expected. Verify power cord is plugged in and retry."
)
print()
print(f"Setting expected motor indices: {self.motor_indices}")
self.set_bus_baudrate(BAUDRATE)
self._write_with_motor_ids(
self.motor_models, untaken_ids[: len(self.motors)], "ID", self.motor_indices
)
print()
if (self.read("ID") != self.motor_indices).any():
raise OSError("Failed to write motors indices.")
print("Configuration is done!")
def find_motor_indices(self, possible_ids=None):
def find_motor_indices(self, possible_ids=None, num_retry=2):
if possible_ids is None:
possible_ids = range(MAX_ID_RANGE)
indices = []
for idx in tqdm.tqdm(possible_ids):
try:
present_idx = self._read_with_motor_ids(self.motor_models, [idx], "ID")[0]
present_idx = self.read_with_motor_ids(self.motor_models, [idx], "ID", num_retry=num_retry)[0]
except ConnectionError:
continue
@@ -788,7 +638,7 @@ class DynamixelMotorsBus:
values = np.round(values).astype(np.int32)
return values
def _read_with_motor_ids(self, motor_models, motor_ids, data_name):
def read_with_motor_ids(self, motor_models, motor_ids, data_name, num_retry=NUM_READ_RETRY):
if self.mock:
import tests.mock_dynamixel_sdk as dxl
else:
@@ -805,7 +655,11 @@ class DynamixelMotorsBus:
for idx in motor_ids:
group.addParam(idx)
comm = group.txRxPacket()
for _ in range(num_retry):
comm = group.txRxPacket()
if comm == dxl.COMM_SUCCESS:
break
if comm != dxl.COMM_SUCCESS:
raise ConnectionError(
f"Read failed due to communication error on port {self.port_handler.port_name} for indices {motor_ids}: "
@@ -895,7 +749,7 @@ class DynamixelMotorsBus:
return values
def _write_with_motor_ids(self, motor_models, motor_ids, data_name, values):
def write_with_motor_ids(self, motor_models, motor_ids, data_name, values, num_retry=NUM_WRITE_RETRY):
if self.mock:
import tests.mock_dynamixel_sdk as dxl
else:
@@ -913,7 +767,11 @@ class DynamixelMotorsBus:
data = convert_to_bytes(value, bytes, self.mock)
group.addParam(idx, data)
comm = group.txPacket()
for _ in range(num_retry):
comm = group.txPacket()
if comm == dxl.COMM_SUCCESS:
break
if comm != dxl.COMM_SUCCESS:
raise ConnectionError(
f"Write failed due to communication error on port {self.port_handler.port_name} for indices {motor_ids}: "
@@ -1007,8 +865,3 @@ class DynamixelMotorsBus:
def __del__(self):
if getattr(self, "is_connected", False):
self.disconnect()
if __name__ == "__main__":
# Helper to find the usb port associated to all your DynamixelMotorsBus.
find_port()

View File

@@ -0,0 +1,897 @@
import enum
import logging
import math
import time
import traceback
from copy import deepcopy
import numpy as np
import tqdm
from lerobot.common.robot_devices.utils import RobotDeviceAlreadyConnectedError, RobotDeviceNotConnectedError
from lerobot.common.utils.utils import capture_timestamp_utc
PROTOCOL_VERSION = 0
BAUDRATE = 1_000_000
TIMEOUT_MS = 1000
MAX_ID_RANGE = 252
# The following bounds define the lower and upper joints range (after calibration).
# For joints in degree (i.e. revolute joints), their nominal range is [-180, 180] degrees
# which corresponds to a half rotation on the left and half rotation on the right.
# Some joints might require a wider range, so we allow up to [-270, 270] degrees until
# an error is raised.
LOWER_BOUND_DEGREE = -270
UPPER_BOUND_DEGREE = 270
# For joints in percentage (i.e. joints that move linearly like the prismatic joint of a gripper),
# their nominal range is [0, 100] %. For instance, for the Aloha gripper, 0% is fully
# closed, and 100% is fully open. To account for slight calibration issues, we allow up to
# [-10, 110] until an error is raised.
LOWER_BOUND_LINEAR = -10
UPPER_BOUND_LINEAR = 110
HALF_TURN_DEGREE = 180
# See this link for STS3215 Memory Table:
# https://docs.google.com/spreadsheets/d/1GVs7W1VS1PqdhA1nW-abeyAHhTUxKUdR/edit?usp=sharing&ouid=116566590112741600240&rtpof=true&sd=true
# data_name: (address, size_byte)
SCS_SERIES_CONTROL_TABLE = {
"Model": (3, 2),
"ID": (5, 1),
"Baud_Rate": (6, 1),
"Return_Delay": (7, 1),
"Response_Status_Level": (8, 1),
"Min_Angle_Limit": (9, 2),
"Max_Angle_Limit": (11, 2),
"Max_Temperature_Limit": (13, 1),
"Max_Voltage_Limit": (14, 1),
"Min_Voltage_Limit": (15, 1),
"Max_Torque_Limit": (16, 2),
"Phase": (18, 1),
"Unloading_Condition": (19, 1),
"LED_Alarm_Condition": (20, 1),
"P_Coefficient": (21, 1),
"D_Coefficient": (22, 1),
"I_Coefficient": (23, 1),
"Minimum_Startup_Force": (24, 2),
"CW_Dead_Zone": (26, 1),
"CCW_Dead_Zone": (27, 1),
"Protection_Current": (28, 2),
"Angular_Resolution": (30, 1),
"Offset": (31, 2),
"Mode": (33, 1),
"Protective_Torque": (34, 1),
"Protection_Time": (35, 1),
"Overload_Torque": (36, 1),
"Speed_closed_loop_P_proportional_coefficient": (37, 1),
"Over_Current_Protection_Time": (38, 1),
"Velocity_closed_loop_I_integral_coefficient": (39, 1),
"Torque_Enable": (40, 1),
"Acceleration": (41, 1),
"Goal_Position": (42, 2),
"Goal_Time": (44, 2),
"Goal_Speed": (46, 2),
"Torque_Limit": (48, 2),
"Lock": (55, 1),
"Present_Position": (56, 2),
"Present_Speed": (58, 2),
"Present_Load": (60, 2),
"Present_Voltage": (62, 1),
"Present_Temperature": (63, 1),
"Status": (65, 1),
"Moving": (66, 1),
"Present_Current": (69, 2),
# Not in the Memory Table
"Maximum_Acceleration": (85, 2),
}
SCS_SERIES_BAUDRATE_TABLE = {
0: 1_000_000,
1: 500_000,
2: 250_000,
3: 128_000,
4: 115_200,
5: 57_600,
6: 38_400,
7: 19_200,
}
CALIBRATION_REQUIRED = ["Goal_Position", "Present_Position"]
CONVERT_UINT32_TO_INT32_REQUIRED = ["Goal_Position", "Present_Position"]
MODEL_CONTROL_TABLE = {
"scs_series": SCS_SERIES_CONTROL_TABLE,
"sts3215": SCS_SERIES_CONTROL_TABLE,
}
MODEL_RESOLUTION = {
"scs_series": 4096,
"sts3215": 4096,
}
MODEL_BAUDRATE_TABLE = {
"scs_series": SCS_SERIES_BAUDRATE_TABLE,
"sts3215": SCS_SERIES_BAUDRATE_TABLE,
}
# High number of retries is needed for feetech compared to dynamixel motors.
NUM_READ_RETRY = 20
NUM_WRITE_RETRY = 20
def convert_degrees_to_steps(degrees: float | np.ndarray, models: str | list[str]) -> np.ndarray:
"""This function converts the degree range to the step range for indicating motors rotation.
It assumes a motor achieves a full rotation by going from -180 degree position to +180.
The motor resolution (e.g. 4096) corresponds to the number of steps needed to achieve a full rotation.
"""
resolutions = [MODEL_RESOLUTION[model] for model in models]
steps = degrees / 180 * np.array(resolutions) / 2
steps = steps.astype(int)
return steps
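# Worked example: with the sts3215 resolution of 4096 defined above, a quarter turn
# of 90 degrees maps to 90 / 180 * 4096 / 2 = 1024 steps:
# convert_degrees_to_steps(90, ["sts3215"]) -> array([1024])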
def convert_to_bytes(value, bytes, mock=False):
if mock:
return value
import scservo_sdk as scs
# Note: No need to convert back into unsigned int, since this byte preprocessing
# already handles it for us.
if bytes == 1:
data = [
scs.SCS_LOBYTE(scs.SCS_LOWORD(value)),
]
elif bytes == 2:
data = [
scs.SCS_LOBYTE(scs.SCS_LOWORD(value)),
scs.SCS_HIBYTE(scs.SCS_LOWORD(value)),
]
elif bytes == 4:
data = [
scs.SCS_LOBYTE(scs.SCS_LOWORD(value)),
scs.SCS_HIBYTE(scs.SCS_LOWORD(value)),
scs.SCS_LOBYTE(scs.SCS_HIWORD(value)),
scs.SCS_HIBYTE(scs.SCS_HIWORD(value)),
]
else:
raise NotImplementedError(
f"Value of the number of bytes to be sent is expected to be in [1, 2, 4], but "
f"{bytes} is provided instead."
)
return data
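# For example, a 2-byte write of the value 2048 (0x0800) is split little-endian into
# [0x00, 0x08] (low byte first) before being sent on the bus.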
def get_group_sync_key(data_name, motor_names):
group_key = f"{data_name}_" + "_".join(motor_names)
return group_key
def get_result_name(fn_name, data_name, motor_names):
group_key = get_group_sync_key(data_name, motor_names)
rslt_name = f"{fn_name}_{group_key}"
return rslt_name
def get_queue_name(fn_name, data_name, motor_names):
group_key = get_group_sync_key(data_name, motor_names)
queue_name = f"{fn_name}_{group_key}"
return queue_name
def get_log_name(var_name, fn_name, data_name, motor_names):
group_key = get_group_sync_key(data_name, motor_names)
log_name = f"{var_name}_{fn_name}_{group_key}"
return log_name
def assert_same_address(model_ctrl_table, motor_models, data_name):
all_addr = []
all_bytes = []
for model in motor_models:
addr, bytes = model_ctrl_table[model][data_name]
all_addr.append(addr)
all_bytes.append(bytes)
if len(set(all_addr)) != 1:
raise NotImplementedError(
f"At least two motor models use a different address for `data_name`='{data_name}' ({list(zip(motor_models, all_addr, strict=False))}). Contact a LeRobot maintainer."
)
if len(set(all_bytes)) != 1:
raise NotImplementedError(
f"At least two motor models use a different bytes representation for `data_name`='{data_name}' ({list(zip(motor_models, all_bytes, strict=False))}). Contact a LeRobot maintainer."
)
class TorqueMode(enum.Enum):
ENABLED = 1
DISABLED = 0
class DriveMode(enum.Enum):
NON_INVERTED = 0
INVERTED = 1
class CalibrationMode(enum.Enum):
# Joints with rotational motions are expressed in degrees in nominal range of [-180, 180]
DEGREE = 0
# Joints with linear motions (like the gripper of Aloha) are expressed in a nominal range of [0, 100]
LINEAR = 1
class JointOutOfRangeError(Exception):
def __init__(self, message="Joint is out of range"):
self.message = message
super().__init__(self.message)
class FeetechMotorsBus:
"""
The FeetechMotorsBus class allows reading from and writing to the attached motors efficiently. It relies on
the Python feetech SDK to communicate with the motors. For more info, see the [feetech SDK Documentation](https://emanual.robotis.com/docs/en/software/feetech/feetech_sdk/sample_code/python_read_write_protocol_2_0/#python-read-write-protocol-20).
A FeetechMotorsBus instance requires a port (e.g. `FeetechMotorsBus(port="/dev/tty.usbmodem575E0031751")`).
To find the port, you can run our utility script:
```bash
python lerobot/scripts/find_motors_bus_port.py
>>> Finding all available ports for the MotorsBus.
>>> ['/dev/tty.usbmodem575E0032081', '/dev/tty.usbmodem575E0031751']
>>> Remove the usb cable from your FeetechMotorsBus and press Enter when done.
>>> The port of this FeetechMotorsBus is /dev/tty.usbmodem575E0031751.
>>> Reconnect the usb cable.
```
Example of usage for 1 motor connected to the bus:
```python
motor_name = "gripper"
motor_index = 6
motor_model = "sts3215"
motors_bus = FeetechMotorsBus(
port="/dev/tty.usbmodem575E0031751",
motors={motor_name: (motor_index, motor_model)},
)
motors_bus.connect()
position = motors_bus.read("Present_Position")
# move from a few motor steps as an example
few_steps = 30
motors_bus.write("Goal_Position", position + few_steps)
# when done, consider disconnecting
motors_bus.disconnect()
```
"""
def __init__(
self,
port: str,
motors: dict[str, tuple[int, str]],
extra_model_control_table: dict[str, list[tuple]] | None = None,
extra_model_resolution: dict[str, int] | None = None,
mock=False,
):
self.port = port
self.motors = motors
self.mock = mock
self.model_ctrl_table = deepcopy(MODEL_CONTROL_TABLE)
if extra_model_control_table:
self.model_ctrl_table.update(extra_model_control_table)
self.model_resolution = deepcopy(MODEL_RESOLUTION)
if extra_model_resolution:
self.model_resolution.update(extra_model_resolution)
self.port_handler = None
self.packet_handler = None
self.calibration = None
self.is_connected = False
self.group_readers = {}
self.group_writers = {}
self.logs = {}
self.track_positions = {}
self.present_pos = {
"prev": [None] * len(self.motor_names),
"below_zero": [None] * len(self.motor_names),
"above_max": [None] * len(self.motor_names),
}
self.goal_pos = {
"prev": [None] * len(self.motor_names),
"below_zero": [None] * len(self.motor_names),
"above_max": [None] * len(self.motor_names),
}
def connect(self):
if self.is_connected:
raise RobotDeviceAlreadyConnectedError(
f"FeetechMotorsBus({self.port}) is already connected. Do not call `motors_bus.connect()` twice."
)
if self.mock:
import tests.mock_scservo_sdk as scs
else:
import scservo_sdk as scs
self.port_handler = scs.PortHandler(self.port)
self.packet_handler = scs.PacketHandler(PROTOCOL_VERSION)
try:
if not self.port_handler.openPort():
raise OSError(f"Failed to open port '{self.port}'.")
except Exception:
traceback.print_exc()
print(
"\nTry running `python lerobot/scripts/find_motors_bus_port.py` to make sure you are using the correct port.\n"
)
raise
# Allow to read and write
self.is_connected = True
self.port_handler.setPacketTimeoutMillis(TIMEOUT_MS)
def reconnect(self):
if self.mock:
import tests.mock_scservo_sdk as scs
else:
import scservo_sdk as scs
self.port_handler = scs.PortHandler(self.port)
self.packet_handler = scs.PacketHandler(PROTOCOL_VERSION)
if not self.port_handler.openPort():
raise OSError(f"Failed to open port '{self.port}'.")
self.is_connected = True
def are_motors_configured(self):
# Only check the motor indices and not baudrate, since if the motor baudrates are incorrect,
# a ConnectionError will be raised anyway.
try:
return (self.motor_indices == self.read("ID")).all()
except ConnectionError as e:
print(e)
return False
def find_motor_indices(self, possible_ids=None, num_retry=2):
if possible_ids is None:
possible_ids = range(MAX_ID_RANGE)
indices = []
for idx in tqdm.tqdm(possible_ids):
try:
present_idx = self.read_with_motor_ids(self.motor_models, [idx], "ID", num_retry=num_retry)[0]
except ConnectionError:
continue
if idx != present_idx:
# sanity check
raise OSError(
"Motor index used to communicate through the bus is not the same as the one present in the motor memory. The motor memory might be damaged."
)
indices.append(idx)
return indices
def set_bus_baudrate(self, baudrate):
present_bus_baudrate = self.port_handler.getBaudRate()
if present_bus_baudrate != baudrate:
print(f"Setting bus baud rate to {baudrate}. Previously {present_bus_baudrate}.")
self.port_handler.setBaudRate(baudrate)
if self.port_handler.getBaudRate() != baudrate:
raise OSError("Failed to write bus baud rate.")
@property
def motor_names(self) -> list[str]:
return list(self.motors.keys())
@property
def motor_models(self) -> list[str]:
return [model for _, model in self.motors.values()]
@property
def motor_indices(self) -> list[int]:
return [idx for idx, _ in self.motors.values()]
def set_calibration(self, calibration: dict[str, list]):
self.calibration = calibration
def apply_calibration_autocorrect(self, values: np.ndarray | list, motor_names: list[str] | None):
"""This function apply the calibration, automatically detects out of range errors for motors values and attempt to correct.
For more info, see docstring of `apply_calibration` and `autocorrect_calibration`.
"""
try:
values = self.apply_calibration(values, motor_names)
except JointOutOfRangeError as e:
print(e)
self.autocorrect_calibration(values, motor_names)
values = self.apply_calibration(values, motor_names)
return values
def apply_calibration(self, values: np.ndarray | list, motor_names: list[str] | None):
"""Convert from unsigned int32 joint position range [0, 2**32[ to the universal float32 nominal degree range ]-180.0, 180.0[ with
a "zero position" at 0 degree.
Note: We say "nominal degree range" since the motors can take values outside this range. For instance, 190 degrees, if the motor
rotate more than a half a turn from the zero position. However, most motors can't rotate more than 180 degrees and will stay in this range.
Joints values are original in [0, 2**32[ (unsigned int32). Each motor are expected to complete a full rotation
when given a goal position that is + or - their resolution. For instance, feetech xl330-m077 have a resolution of 4096, and
at any position in their original range, let's say the position 56734, they complete a full rotation clockwise by moving to 60830,
or anticlockwise by moving to 52638. The position in the original range is arbitrary and might change a lot between each motor.
To harmonize between motors of the same model, different robots, or even models of different brands, we propose to work
in the centered nominal degree range ]-180, 180[.
"""
if motor_names is None:
motor_names = self.motor_names
# Convert from unsigned int32 original range [0, 2**32] to signed float32 range
values = values.astype(np.float32)
for i, name in enumerate(motor_names):
calib_idx = self.calibration["motor_names"].index(name)
calib_mode = self.calibration["calib_mode"][calib_idx]
if CalibrationMode[calib_mode] == CalibrationMode.DEGREE:
drive_mode = self.calibration["drive_mode"][calib_idx]
homing_offset = self.calibration["homing_offset"][calib_idx]
_, model = self.motors[name]
resolution = self.model_resolution[model]
# Update direction of rotation of the motor to match between leader and follower.
# In fact, the motor of the leader for a given joint can be assembled in an
# opposite direction in term of rotation than the motor of the follower on the same joint.
if drive_mode:
values[i] *= -1
# Convert from range [-2**31, 2**31[ to
# nominal range ]-resolution, resolution[ (e.g. ]-2048, 2048[)
values[i] += homing_offset
# Convert from range ]-resolution, resolution[ to
# universal float32 centered degree range ]-180, 180[
values[i] = values[i] / (resolution // 2) * HALF_TURN_DEGREE
if (values[i] < LOWER_BOUND_DEGREE) or (values[i] > UPPER_BOUND_DEGREE):
raise JointOutOfRangeError(
f"Wrong motor position range detected for {name}. "
f"Expected to be in nominal range of [-{HALF_TURN_DEGREE}, {HALF_TURN_DEGREE}] degrees (a full rotation), "
f"with a maximum range of [{LOWER_BOUND_DEGREE}, {UPPER_BOUND_DEGREE}] degrees to account for joints that can rotate a bit more, "
f"but present value is {values[i]} degree. "
"This might be due to a cable connection issue creating an artificial 360 degrees jump in motor values. "
"You need to recalibrate by running: `python lerobot/scripts/control_robot.py calibrate`"
)
elif CalibrationMode[calib_mode] == CalibrationMode.LINEAR:
start_pos = self.calibration["start_pos"][calib_idx]
end_pos = self.calibration["end_pos"][calib_idx]
# Rescale the present position to a nominal range [0, 100] %,
# useful for joints with linear motions like Aloha gripper
values[i] = (values[i] - start_pos) / (end_pos - start_pos) * 100
if (values[i] < LOWER_BOUND_LINEAR) or (values[i] > UPPER_BOUND_LINEAR):
raise JointOutOfRangeError(
f"Wrong motor position range detected for {name}. "
f"Expected to be in nominal range of [0, 100] % (a full linear translation), "
f"with a maximum range of [{LOWER_BOUND_LINEAR}, {UPPER_BOUND_LINEAR}] % to account for some imprecision during calibration, "
f"but present value is {values[i]} %. "
"This might be due to a cable connection issue creating an artificial jump in motor values. "
"You need to recalibrate by running: `python lerobot/scripts/control_robot.py calibrate`"
)
return values
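# Worked example for DEGREE mode: a raw position of 3072 with homing_offset=-2048,
# drive_mode=0 and resolution 4096 becomes (3072 - 2048) / 2048 * 180 = 90 degrees.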
def autocorrect_calibration(self, values: np.ndarray | list, motor_names: list[str] | None):
"""This function automatically detects issues with values of motors after calibration, and correct for these issues.
Some motors might have values outside of expected maximum bounds after calibration.
For instance, for a joint in degree, its value can be outside [-270, 270] degrees, which is totally unexpected given
a nominal range of [-180, 180] degrees, which represents half a turn to the left or right starting from zero position.
Known issues:
#1: Motor value randomly shifts of a full turn, caused by hardware/connection errors.
#2: Motor internal homing offset is shifted of a full turn, caused by using default calibration (e.g Aloha).
#3: motor internal homing offset is shifted of less or more than a full turn, caused by using default calibration
or by human error during manual calibration.
Issues #1 and #2 can be solved by shifting the calibration homing offset by a full turn.
Issue #3 will be visually detected by user and potentially captured by the safety feature `max_relative_target`,
that will slow down the motor, raise an error asking to recalibrate. Manual recalibrating will solve the issue.
Note: A full turn corresponds to 360 degrees but also to 4096 steps for a motor resolution of 4096.
"""
if motor_names is None:
motor_names = self.motor_names
# Convert from unsigned int32 original range [0, 2**32] to signed float32 range
values = values.astype(np.float32)
for i, name in enumerate(motor_names):
calib_idx = self.calibration["motor_names"].index(name)
calib_mode = self.calibration["calib_mode"][calib_idx]
if CalibrationMode[calib_mode] == CalibrationMode.DEGREE:
drive_mode = self.calibration["drive_mode"][calib_idx]
homing_offset = self.calibration["homing_offset"][calib_idx]
_, model = self.motors[name]
resolution = self.model_resolution[model]
if drive_mode:
values[i] *= -1
# Convert from initial range to range [-180, 180] degrees
calib_val = (values[i] + homing_offset) / (resolution // 2) * HALF_TURN_DEGREE
in_range = (calib_val > LOWER_BOUND_DEGREE) and (calib_val < UPPER_BOUND_DEGREE)
# Solve this inequality to find the factor to shift the range into [-180, 180] degrees
# values[i] = (values[i] + homing_offset + resolution * factor) / (resolution // 2) * HALF_TURN_DEGREE
# - HALF_TURN_DEGREE <= (values[i] + homing_offset + resolution * factor) / (resolution // 2) * HALF_TURN_DEGREE <= HALF_TURN_DEGREE
# (- HALF_TURN_DEGREE / HALF_TURN_DEGREE * (resolution // 2) - values[i] - homing_offset) / resolution <= factor <= (HALF_TURN_DEGREE / 180 * (resolution // 2) - values[i] - homing_offset) / resolution
low_factor = (
-HALF_TURN_DEGREE / HALF_TURN_DEGREE * (resolution // 2) - values[i] - homing_offset
) / resolution
upp_factor = (
HALF_TURN_DEGREE / HALF_TURN_DEGREE * (resolution // 2) - values[i] - homing_offset
) / resolution
elif CalibrationMode[calib_mode] == CalibrationMode.LINEAR:
start_pos = self.calibration["start_pos"][calib_idx]
end_pos = self.calibration["end_pos"][calib_idx]
# Convert from initial range to range [0, 100] in %
calib_val = (values[i] - start_pos) / (end_pos - start_pos) * 100
in_range = (calib_val > LOWER_BOUND_LINEAR) and (calib_val < UPPER_BOUND_LINEAR)
# Solve this inequality to find the factor to shift the range into [0, 100] %
# values[i] = (values[i] - start_pos + resolution * factor) / (end_pos + resolution * factor - start_pos - resolution * factor) * 100
# values[i] = (values[i] - start_pos + resolution * factor) / (end_pos - start_pos) * 100
# 0 <= (values[i] - start_pos + resolution * factor) / (end_pos - start_pos) * 100 <= 100
# (start_pos - values[i]) / resolution <= factor <= (end_pos - values[i]) / resolution
low_factor = (start_pos - values[i]) / resolution
upp_factor = (end_pos - values[i]) / resolution
if not in_range:
# Get first integer between the two bounds
if low_factor < upp_factor:
factor = math.ceil(low_factor)
if factor > upp_factor:
raise ValueError(f"No integer found between bounds [{low_factor=}, {upp_factor=}]")
else:
factor = math.ceil(upp_factor)
if factor > low_factor:
raise ValueError(f"No integer found between bounds [{low_factor=}, {upp_factor=}]")
if CalibrationMode[calib_mode] == CalibrationMode.DEGREE:
out_of_range_str = f"{calib_val} degrees, outside of nominal range [{LOWER_BOUND_DEGREE}, {UPPER_BOUND_DEGREE}]"
in_range_str = f"{calib_val + factor * 2 * HALF_TURN_DEGREE} degrees, inside of nominal range [{LOWER_BOUND_DEGREE}, {UPPER_BOUND_DEGREE}]"
elif CalibrationMode[calib_mode] == CalibrationMode.LINEAR:
out_of_range_str = f"{calib_val} %, outside of nominal range [{LOWER_BOUND_LINEAR}, {UPPER_BOUND_LINEAR}]"
in_range_str = f"{calib_val + factor * resolution / (end_pos - start_pos) * 100} %, inside of nominal range [{LOWER_BOUND_LINEAR}, {UPPER_BOUND_LINEAR}]"
logging.warning(
f"Auto-correct calibration of motor '{name}' by shifting value by {abs(factor)} full turns, "
f"from '{out_of_range_str}' to '{in_range_str}'."
)
# A full turn corresponds to 360 degrees but also to 4096 steps for a motor resolution of 4096.
self.calibration["homing_offset"][calib_idx] += resolution * factor
def revert_calibration(self, values: np.ndarray | list, motor_names: list[str] | None):
"""Inverse of `apply_calibration`."""
if motor_names is None:
motor_names = self.motor_names
for i, name in enumerate(motor_names):
calib_idx = self.calibration["motor_names"].index(name)
calib_mode = self.calibration["calib_mode"][calib_idx]
if CalibrationMode[calib_mode] == CalibrationMode.DEGREE:
drive_mode = self.calibration["drive_mode"][calib_idx]
homing_offset = self.calibration["homing_offset"][calib_idx]
_, model = self.motors[name]
resolution = self.model_resolution[model]
# Convert from nominal 0-centered degree range [-180, 180] to
# 0-centered resolution range (e.g. [-2048, 2048] for resolution=4096)
values[i] = values[i] / HALF_TURN_DEGREE * (resolution // 2)
# Subtract the homing offsets to come back to the actual motor range of values
# which can be arbitrary.
values[i] -= homing_offset
# Remove drive mode, which is the rotation direction of the motor, to come back to
# actual motor rotation direction which can be arbitrary.
if drive_mode:
values[i] *= -1
elif CalibrationMode[calib_mode] == CalibrationMode.LINEAR:
start_pos = self.calibration["start_pos"][calib_idx]
end_pos = self.calibration["end_pos"][calib_idx]
# Convert from the nominal linear range of [0, 100] % to
# actual motor range of values which can be arbitrary.
values[i] = values[i] / 100 * (end_pos - start_pos) + start_pos
values = np.round(values).astype(np.int32)
return values
def avoid_rotation_reset(self, values, motor_names, data_name):
if data_name not in self.track_positions:
self.track_positions[data_name] = {
"prev": [None] * len(self.motor_names),
# Assume False at initialization
"below_zero": [False] * len(self.motor_names),
"above_max": [False] * len(self.motor_names),
}
track = self.track_positions[data_name]
if motor_names is None:
motor_names = self.motor_names
for i, name in enumerate(motor_names):
idx = self.motor_names.index(name)
if track["prev"][idx] is None:
track["prev"][idx] = values[i]
continue
# Detect that a full rotation occurred
if abs(track["prev"][idx] - values[i]) > 2048:
# Position went below 0 and got reset to 4095
if track["prev"][idx] < values[i]:
# So we set negative value by adding a full rotation
values[i] -= 4096
# Position went above 4095 and got reset to 0
elif track["prev"][idx] > values[i]:
# So we add a full rotation
values[i] += 4096
track["prev"][idx] = values[i]
return values
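# For example, if the previous position was 4090 and the new raw reading is 5 (the
# counter wrapped past 4095), the jump of 4085 > 2048 is detected and the value is
# unwrapped to 5 + 4096 = 4101, keeping the tracked trajectory continuous.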
def read_with_motor_ids(self, motor_models, motor_ids, data_name, num_retry=NUM_READ_RETRY):
if self.mock:
import tests.mock_scservo_sdk as scs
else:
import scservo_sdk as scs
return_list = True
if not isinstance(motor_ids, list):
return_list = False
motor_ids = [motor_ids]
assert_same_address(self.model_ctrl_table, self.motor_models, data_name)
addr, bytes = self.model_ctrl_table[motor_models[0]][data_name]
group = scs.GroupSyncRead(self.port_handler, self.packet_handler, addr, bytes)
for idx in motor_ids:
group.addParam(idx)
for _ in range(num_retry):
comm = group.txRxPacket()
if comm == scs.COMM_SUCCESS:
break
if comm != scs.COMM_SUCCESS:
raise ConnectionError(
f"Read failed due to communication error on port {self.port_handler.port_name} for indices {motor_ids}: "
f"{self.packet_handler.getTxRxResult(comm)}"
)
values = []
for idx in motor_ids:
value = group.getData(idx, addr, bytes)
values.append(value)
if return_list:
return values
else:
return values[0]
def read(self, data_name, motor_names: str | list[str] | None = None):
if self.mock:
import tests.mock_scservo_sdk as scs
else:
import scservo_sdk as scs
if not self.is_connected:
raise RobotDeviceNotConnectedError(
f"FeetechMotorsBus({self.port}) is not connected. You need to run `motors_bus.connect()`."
)
start_time = time.perf_counter()
if motor_names is None:
motor_names = self.motor_names
if isinstance(motor_names, str):
motor_names = [motor_names]
motor_ids = []
models = []
for name in motor_names:
motor_idx, model = self.motors[name]
motor_ids.append(motor_idx)
models.append(model)
assert_same_address(self.model_ctrl_table, models, data_name)
addr, bytes = self.model_ctrl_table[model][data_name]
group_key = get_group_sync_key(data_name, motor_names)
if group_key not in self.group_readers:
# create new group reader
self.group_readers[group_key] = scs.GroupSyncRead(
self.port_handler, self.packet_handler, addr, bytes
)
for idx in motor_ids:
self.group_readers[group_key].addParam(idx)
for _ in range(NUM_READ_RETRY):
comm = self.group_readers[group_key].txRxPacket()
if comm == scs.COMM_SUCCESS:
break
if comm != scs.COMM_SUCCESS:
raise ConnectionError(
f"Read failed due to communication error on port {self.port} for group_key {group_key}: "
f"{self.packet_handler.getTxRxResult(comm)}"
)
values = []
for idx in motor_ids:
value = self.group_readers[group_key].getData(idx, addr, bytes)
values.append(value)
values = np.array(values)
# Convert to signed int to use range [-2048, 2048] for our motor positions.
if data_name in CONVERT_UINT32_TO_INT32_REQUIRED:
values = values.astype(np.int32)
if data_name in CALIBRATION_REQUIRED:
values = self.avoid_rotation_reset(values, motor_names, data_name)
if data_name in CALIBRATION_REQUIRED and self.calibration is not None:
values = self.apply_calibration_autocorrect(values, motor_names)
# log the number of seconds it took to read the data from the motors
delta_ts_name = get_log_name("delta_timestamp_s", "read", data_name, motor_names)
self.logs[delta_ts_name] = time.perf_counter() - start_time
# log the utc time at which the data was received
ts_utc_name = get_log_name("timestamp_utc", "read", data_name, motor_names)
self.logs[ts_utc_name] = capture_timestamp_utc()
return values
def write_with_motor_ids(self, motor_models, motor_ids, data_name, values, num_retry=NUM_WRITE_RETRY):
if self.mock:
import tests.mock_scservo_sdk as scs
else:
import scservo_sdk as scs
if not isinstance(motor_ids, list):
motor_ids = [motor_ids]
if not isinstance(values, list):
values = [values]
assert_same_address(self.model_ctrl_table, motor_models, data_name)
addr, bytes = self.model_ctrl_table[motor_models[0]][data_name]
group = scs.GroupSyncWrite(self.port_handler, self.packet_handler, addr, bytes)
for idx, value in zip(motor_ids, values, strict=True):
data = convert_to_bytes(value, bytes, self.mock)
group.addParam(idx, data)
for _ in range(num_retry):
comm = group.txPacket()
if comm == scs.COMM_SUCCESS:
break
if comm != scs.COMM_SUCCESS:
raise ConnectionError(
f"Write failed due to communication error on port {self.port_handler.port_name} for indices {motor_ids}: "
f"{self.packet_handler.getTxRxResult(comm)}"
)
def write(self, data_name, values: int | float | np.ndarray, motor_names: str | list[str] | None = None):
if not self.is_connected:
raise RobotDeviceNotConnectedError(
f"FeetechMotorsBus({self.port}) is not connected. You need to run `motors_bus.connect()`."
)
start_time = time.perf_counter()
if self.mock:
import tests.mock_scservo_sdk as scs
else:
import scservo_sdk as scs
if motor_names is None:
motor_names = self.motor_names
if isinstance(motor_names, str):
motor_names = [motor_names]
if isinstance(values, (int, float, np.integer)):
values = [int(values)] * len(motor_names)
values = np.array(values)
motor_ids = []
models = []
for name in motor_names:
motor_idx, model = self.motors[name]
motor_ids.append(motor_idx)
models.append(model)
if data_name in CALIBRATION_REQUIRED and self.calibration is not None:
values = self.revert_calibration(values, motor_names)
values = values.tolist()
assert_same_address(self.model_ctrl_table, models, data_name)
addr, bytes = self.model_ctrl_table[model][data_name]
group_key = get_group_sync_key(data_name, motor_names)
init_group = group_key not in self.group_writers
if init_group:
self.group_writers[group_key] = scs.GroupSyncWrite(
self.port_handler, self.packet_handler, addr, bytes
)
for idx, value in zip(motor_ids, values, strict=True):
data = convert_to_bytes(value, bytes, self.mock)
if init_group:
self.group_writers[group_key].addParam(idx, data)
else:
self.group_writers[group_key].changeParam(idx, data)
comm = self.group_writers[group_key].txPacket()
if comm != scs.COMM_SUCCESS:
raise ConnectionError(
f"Write failed due to communication error on port {self.port} for group_key {group_key}: "
f"{self.packet_handler.getTxRxResult(comm)}"
)
# log the number of seconds it took to write the data to the motors
delta_ts_name = get_log_name("delta_timestamp_s", "write", data_name, motor_names)
self.logs[delta_ts_name] = time.perf_counter() - start_time
# TODO(rcadene): should we log the time before sending the write command?
# log the utc time when the write has been completed
ts_utc_name = get_log_name("timestamp_utc", "write", data_name, motor_names)
self.logs[ts_utc_name] = capture_timestamp_utc()
def disconnect(self):
if not self.is_connected:
raise RobotDeviceNotConnectedError(
f"FeetechMotorsBus({self.port}) is not connected. Try running `motors_bus.connect()` first."
)
if self.port_handler is not None:
self.port_handler.closePort()
self.port_handler = None
self.packet_handler = None
self.group_readers = {}
self.group_writers = {}
self.is_connected = False
def __del__(self):
if getattr(self, "is_connected", False):
self.disconnect()

View File

@@ -0,0 +1,130 @@
"""Logic to calibrate a robot arm built with dynamixel motors"""
# TODO(rcadene, aliberts): move this logic into the robot code when refactoring
import numpy as np
from lerobot.common.robot_devices.motors.dynamixel import (
CalibrationMode,
TorqueMode,
convert_degrees_to_steps,
)
from lerobot.common.robot_devices.motors.utils import MotorsBus
URL_TEMPLATE = (
"https://raw.githubusercontent.com/huggingface/lerobot/main/media/{robot}/{arm}_{position}.webp"
)
# The following positions are provided in nominal degree range ]-180, +180[
# For more info on these constants, see comments in the code where they get used.
ZERO_POSITION_DEGREE = 0
ROTATED_POSITION_DEGREE = 90
def assert_drive_mode(drive_mode):
# `drive_mode` is in [0, 1], where 0 means the motor's original rotation direction and 1 means inverted.
if not np.all(np.isin(drive_mode, [0, 1])):
raise ValueError(f"`drive_mode` contains values other than 0 or 1: ({drive_mode})")
def apply_drive_mode(position, drive_mode):
assert_drive_mode(drive_mode)
# Convert `drive_mode` from [0, 1], where 0 indicates the original rotation direction and 1 the inverted one,
# to [-1, 1], where 1 indicates the original rotation direction and -1 the inverted one.
signed_drive_mode = -(drive_mode * 2 - 1)
position *= signed_drive_mode
return position
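# For example, apply_drive_mode(np.array([1024, 1024]), np.array([0, 1])) returns
# array([1024, -1024]): the second motor's rotation direction is inverted.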
def compute_nearest_rounded_position(position, models):
delta_turn = convert_degrees_to_steps(ROTATED_POSITION_DEGREE, models)
nearest_pos = np.round(position.astype(float) / delta_turn) * delta_turn
return nearest_pos.astype(position.dtype)
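# For example, with a motor resolution of 4096 (delta_turn = 1024 steps per quarter
# turn), a measured position of 1000 rounds to the nearest quarter turn, 1024.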
def run_arm_calibration(arm: MotorsBus, robot_type: str, arm_name: str, arm_type: str):
"""This function ensures that a neural network trained on data collected on a given robot
can work on another robot. For instance before calibration, setting a same goal position
for each motor of two different robots will get two very different positions. But after calibration,
the two robots will move to the same position.To this end, this function computes the homing offset
and the drive mode for each motor of a given robot.
Homing offset is used to shift the motor position to a ]-2048, +2048[ nominal range (when the motor uses 2048 steps
to complete a half a turn). This range is set around an arbitrary "zero position" corresponding to all motor positions
being 0. During the calibration process, you will need to manually move the robot to this "zero position".
Drive mode is used to invert the rotation direction of the motor. This is useful when some motors have been assembled
in the opposite orientation for some robots. During the calibration process, you will need to manually move the robot
to the "rotated position".
After calibration, the homing offsets and drive modes are stored in a cache.
Example of usage:
```python
run_arm_calibration(arm, "koch", "left", "follower")
```
"""
if (arm.read("Torque_Enable") != TorqueMode.DISABLED.value).any():
raise ValueError("To run calibration, the torque must be disabled on all motors.")
print(f"\nRunning calibration of {robot_type} {arm_name} {arm_type}...")
print("\nMove arm to zero position")
print("See: " + URL_TEMPLATE.format(robot=robot_type, arm=arm_type, position="zero"))
input("Press Enter to continue...")
# We arbitrarily chose our zero target position to be a straight horizontal position with gripper upwards and closed.
# It is easy to identify and all motors are in a "quarter turn" position. Once calibration is done, this position will
# correspond to every motor angle being 0. If you set all 0 as Goal Position, the arm will move in this position.
zero_target_pos = convert_degrees_to_steps(ZERO_POSITION_DEGREE, arm.motor_models)
# Compute homing offset so that `present_position + homing_offset ~= target_position`.
zero_pos = arm.read("Present_Position")
zero_nearest_pos = compute_nearest_rounded_position(zero_pos, arm.motor_models)
homing_offset = zero_target_pos - zero_nearest_pos
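# For example, if the nearest rounded positions at the zero pose are [2048, -1024],
# the homing offsets become [-2048, 1024], so that reading Present_Position at the
# zero pose yields approximately 0 for every motor.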
# The rotated target position corresponds to a rotation of a quarter turn from the zero position.
# This allows us to identify the rotation direction of each motor.
# For instance, if a motor rotates 90 degrees and its value is -90 after applying the homing offset, then we know its rotation direction
# is inverted. However, for the calibration to be successful, all motors need to follow the same target position.
# Sometimes, there is only one possible rotation direction. For instance, if the gripper is closed, there is only one direction which
# corresponds to opening the gripper. When the rotation direction is ambiguous, we arbitrarily rotate clockwise from the point of view
# of the previous motor in the kinematic chain.
print("\nMove arm to rotated target position")
print("See: " + URL_TEMPLATE.format(robot=robot_type, arm=arm_type, position="rotated"))
input("Press Enter to continue...")
rotated_target_pos = convert_degrees_to_steps(ROTATED_POSITION_DEGREE, arm.motor_models)
# Find drive mode by rotating each motor by a quarter of a turn.
# Drive mode indicates if the motor rotation direction should be inverted (=1) or not (=0).
rotated_pos = arm.read("Present_Position")
drive_mode = (rotated_pos < zero_pos).astype(np.int32)
# Re-compute homing offset to take into account drive mode
rotated_drived_pos = apply_drive_mode(rotated_pos, drive_mode)
rotated_nearest_pos = compute_nearest_rounded_position(rotated_drived_pos, arm.motor_models)
homing_offset = rotated_target_pos - rotated_nearest_pos
print("\nMove arm to rest position")
print("See: " + URL_TEMPLATE.format(robot=robot_type, arm=arm_type, position="rest"))
input("Press Enter to continue...")
print()
# Joints with rotational motions are expressed in degrees in nominal range of [-180, 180]
calib_mode = [CalibrationMode.DEGREE.name] * len(arm.motor_names)
# TODO(rcadene): make type of joints (DEGREE or LINEAR) configurable from yaml?
if robot_type == "aloha" and "gripper" in arm.motor_names:
# Joints with linear motions (like the gripper of Aloha) are expressed in a nominal range of [0, 100]
calib_idx = arm.motor_names.index("gripper")
calib_mode[calib_idx] = CalibrationMode.LINEAR.name
calib_data = {
"homing_offset": homing_offset.tolist(),
"drive_mode": drive_mode.tolist(),
"start_pos": zero_pos.tolist(),
"end_pos": rotated_pos.tolist(),
"calib_mode": calib_mode,
"motor_names": arm.motor_names,
}
return calib_data
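# Illustrative sketch (hypothetical helper, not part of the original file): how the returned
# calibration could be applied to a raw "Present_Position" reading, following the comment
# `present_position + homing_offset ~= target_position` and the drive-mode convention above.
def apply_calibration_sketch(raw_pos, calib_data):
pos = apply_drive_mode(raw_pos.astype(float), np.array(calib_data["drive_mode"]))
pos = pos + np.array(calib_data["homing_offset"])
# Express in the nominal [-180, 180] degree range (2048 steps = half a turn).
return pos / 2048 * 180
# e.g. degrees = apply_calibration_sketch(arm.read("Present_Position"), calib_data)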

View File

@@ -0,0 +1,485 @@
"""Logic to calibrate a robot arm built with feetech motors"""
# TODO(rcadene, aliberts): move this logic into the robot code when refactoring
import time
import numpy as np
from lerobot.common.robot_devices.motors.feetech import (
CalibrationMode,
TorqueMode,
convert_degrees_to_steps,
)
from lerobot.common.robot_devices.motors.utils import MotorsBus
URL_TEMPLATE = (
"https://raw.githubusercontent.com/huggingface/lerobot/main/media/{robot}/{arm}_{position}.webp"
)
# The following positions are provided in nominal degree range ]-180, +180[
# For more info on these constants, see comments in the code where they get used.
ZERO_POSITION_DEGREE = 0
ROTATED_POSITION_DEGREE = 90
def assert_drive_mode(drive_mode):
# `drive_mode` is in [0, 1], where 0 means the original rotation direction for the motor and 1 means inverted.
if not np.all(np.isin(drive_mode, [0, 1])):
raise ValueError(f"`drive_mode` contains values other than 0 or 1: ({drive_mode})")
def apply_drive_mode(position, drive_mode):
assert_drive_mode(drive_mode)
# Convert `drive_mode` from [0, 1], where 0 indicates the original rotation direction and 1 inverted,
# to [-1, 1], where 1 indicates the original rotation direction and -1 inverted.
signed_drive_mode = -(drive_mode * 2 - 1)
position *= signed_drive_mode
return position
def move_until_block(
arm, motor_name, positive_direction=True, while_move_hook=None, load_threshold=300, count_threshold=100
):
count = 0
while True:
present_pos = arm.read("Present_Position", motor_name)
if positive_direction:
# Move +100 steps every time. Lower the steps to lower the speed at which the arm moves.
arm.write("Goal_Position", present_pos + 100, motor_name)
else:
arm.write("Goal_Position", present_pos - 100, motor_name)
if while_move_hook is not None:
while_move_hook()
present_pos = arm.read("Present_Position", motor_name).item()
present_speed = arm.read("Present_Speed", motor_name).item()
present_current = arm.read("Present_Current", motor_name).item()
# present_load = arm.read("Present_Load", motor_name).item()
# present_voltage = arm.read("Present_Voltage", motor_name).item()
# present_temperature = arm.read("Present_Temperature", motor_name).item()
# print(f"{present_pos=}")
# print(f"{present_speed=}")
# print(f"{present_current=}")
# print(f"{present_load=}")
# print(f"{present_voltage=}")
# print(f"{present_temperature=}")
# A block is inferred when the motor reports zero speed while still drawing current:
# after `count_threshold` consecutive such readings, or a single current spike above
# `load_threshold`, the motor is considered blocked and its position is returned.
if present_speed == 0 and present_current > 50:
count += 1
if count > count_threshold or present_current > load_threshold:
return present_pos
else:
count = 0
def move_to_calibrate(
arm,
motor_name,
invert_drive_mode=False,
positive_first=True,
in_between_move_hook=None,
while_move_hook=None,
load_threshold=300,
count_threshold=100,
):
initial_pos = arm.read("Present_Position", motor_name)
# Forward the block-detection thresholds to `move_until_block`. The Moss calibration
# overrides them (see `run_arm_auto_calibration_moss`).
block_kwargs = {
"while_move_hook": while_move_hook,
"load_threshold": load_threshold,
"count_threshold": count_threshold,
}
if positive_first:
p_present_pos = move_until_block(arm, motor_name, positive_direction=True, **block_kwargs)
else:
n_present_pos = move_until_block(arm, motor_name, positive_direction=False, **block_kwargs)
if in_between_move_hook is not None:
in_between_move_hook()
if positive_first:
n_present_pos = move_until_block(arm, motor_name, positive_direction=False, **block_kwargs)
else:
p_present_pos = move_until_block(arm, motor_name, positive_direction=True, **block_kwargs)
zero_pos = (n_present_pos + p_present_pos) / 2
calib_data = {
"initial_pos": initial_pos,
"homing_offset": zero_pos if invert_drive_mode else -zero_pos,
"invert_drive_mode": invert_drive_mode,
"drive_mode": -1 if invert_drive_mode else 0,
"zero_pos": zero_pos,
"start_pos": n_present_pos if invert_drive_mode else p_present_pos,
"end_pos": p_present_pos if invert_drive_mode else n_present_pos,
}
return calib_data
def apply_offset(calib, offset):
calib["zero_pos"] += offset
if calib["drive_mode"]:
calib["homing_offset"] += offset
else:
calib["homing_offset"] -= offset
return calib
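# Illustration (hypothetical values): with drive_mode=0 (not inverted), shifting zero_pos
# up by +100 lowers homing_offset by 100, so `present_position + homing_offset` still
# maps onto the new zero position.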
def run_arm_auto_calibration(arm: MotorsBus, robot_type: str, arm_name: str, arm_type: str):
"""Dispatch to the robot-specific auto-calibration routine."""
if robot_type == "so100":
return run_arm_auto_calibration_so100(arm, robot_type, arm_name, arm_type)
elif robot_type == "moss":
return run_arm_auto_calibration_moss(arm, robot_type, arm_name, arm_type)
else:
raise ValueError(f"Auto calibration is not supported for robot type: {robot_type}")
def run_arm_auto_calibration_so100(arm: MotorsBus, robot_type: str, arm_name: str, arm_type: str):
"""All the offsets and magic numbers are hand tuned, and are unique to SO-100 follower arms"""
if (arm.read("Torque_Enable") != TorqueMode.DISABLED.value).any():
raise ValueError("To run calibration, the torque must be disabled on all motors.")
if not (robot_type == "so100" and arm_type == "follower"):
raise NotImplementedError("Auto calibration only supports the follower of so100 arms for now.")
print(f"\nRunning calibration of {robot_type} {arm_name} {arm_type}...")
print("\nMove arm to initial position")
print("See: " + URL_TEMPLATE.format(robot=robot_type, arm=arm_type, position="initial"))
input("Press Enter to continue...")
# Lower the acceleration of the motors (in [0,254])
initial_acceleration = arm.read("Acceleration")
arm.write("Lock", 0)
arm.write("Acceleration", 10)
time.sleep(1)
arm.write("Torque_Enable", TorqueMode.ENABLED.value)
print(f'{arm.read("Present_Position", "elbow_flex")=}')
calib = {}
init_wf_pos = arm.read("Present_Position", "wrist_flex")
init_sl_pos = arm.read("Present_Position", "shoulder_lift")
init_ef_pos = arm.read("Present_Position", "elbow_flex")
arm.write("Goal_Position", init_wf_pos - 800, "wrist_flex")
arm.write("Goal_Position", init_sl_pos + 150 + 1024, "shoulder_lift")
arm.write("Goal_Position", init_ef_pos - 2048, "elbow_flex")
time.sleep(2)
print("Calibrate shoulder_pan")
calib["shoulder_pan"] = move_to_calibrate(arm, "shoulder_pan")
arm.write("Goal_Position", calib["shoulder_pan"]["zero_pos"], "shoulder_pan")
time.sleep(1)
print("Calibrate gripper")
calib["gripper"] = move_to_calibrate(arm, "gripper", invert_drive_mode=True)
time.sleep(1)
print("Calibrate wrist_flex")
calib["wrist_flex"] = move_to_calibrate(arm, "wrist_flex")
calib["wrist_flex"] = apply_offset(calib["wrist_flex"], offset=80)
def in_between_move_hook():
nonlocal arm, calib
time.sleep(2)
ef_pos = arm.read("Present_Position", "elbow_flex")
sl_pos = arm.read("Present_Position", "shoulder_lift")
arm.write("Goal_Position", ef_pos + 1024, "elbow_flex")
arm.write("Goal_Position", sl_pos - 1024, "shoulder_lift")
time.sleep(2)
print("Calibrate elbow_flex")
calib["elbow_flex"] = move_to_calibrate(
arm, "elbow_flex", positive_first=False, in_between_move_hook=in_between_move_hook
)
calib["elbow_flex"] = apply_offset(calib["elbow_flex"], offset=80 - 1024)
arm.write("Goal_Position", calib["elbow_flex"]["zero_pos"] + 1024 + 512, "elbow_flex")
time.sleep(1)
def in_between_move_hook():
nonlocal arm, calib
arm.write("Goal_Position", calib["elbow_flex"]["zero_pos"], "elbow_flex")
print("Calibrate shoulder_lift")
calib["shoulder_lift"] = move_to_calibrate(
arm,
"shoulder_lift",
invert_drive_mode=True,
positive_first=False,
in_between_move_hook=in_between_move_hook,
)
# Add a small offset (in steps) to align with the body
calib["shoulder_lift"] = apply_offset(calib["shoulder_lift"], offset=1024 - 50)
def while_move_hook():
nonlocal arm, calib
positions = {
"shoulder_lift": round(calib["shoulder_lift"]["zero_pos"] - 1600),
"elbow_flex": round(calib["elbow_flex"]["zero_pos"] + 1700),
"wrist_flex": round(calib["wrist_flex"]["zero_pos"] + 800),
"gripper": round(calib["gripper"]["end_pos"]),
}
arm.write("Goal_Position", list(positions.values()), list(positions.keys()))
arm.write("Goal_Position", round(calib["shoulder_lift"]["zero_pos"] - 1600), "shoulder_lift")
time.sleep(2)
arm.write("Goal_Position", round(calib["elbow_flex"]["zero_pos"] + 1700), "elbow_flex")
time.sleep(2)
arm.write("Goal_Position", round(calib["wrist_flex"]["zero_pos"] + 800), "wrist_flex")
time.sleep(2)
arm.write("Goal_Position", round(calib["gripper"]["end_pos"]), "gripper")
time.sleep(2)
print("Calibrate wrist_roll")
calib["wrist_roll"] = move_to_calibrate(
arm, "wrist_roll", invert_drive_mode=True, positive_first=False, while_move_hook=while_move_hook
)
arm.write("Goal_Position", calib["wrist_roll"]["zero_pos"], "wrist_roll")
time.sleep(1)
arm.write("Goal_Position", calib["gripper"]["start_pos"], "gripper")
time.sleep(1)
arm.write("Goal_Position", calib["wrist_flex"]["zero_pos"], "wrist_flex")
time.sleep(1)
arm.write("Goal_Position", calib["elbow_flex"]["zero_pos"] + 2048, "elbow_flex")
arm.write("Goal_Position", calib["shoulder_lift"]["zero_pos"] - 2048, "shoulder_lift")
time.sleep(1)
arm.write("Goal_Position", calib["shoulder_pan"]["zero_pos"], "shoulder_pan")
time.sleep(1)
calib_modes = []
for name in arm.motor_names:
if name == "gripper":
calib_modes.append(CalibrationMode.LINEAR.name)
else:
calib_modes.append(CalibrationMode.DEGREE.name)
calib_dict = {
"homing_offset": [calib[name]["homing_offset"] for name in arm.motor_names],
"drive_mode": [calib[name]["drive_mode"] for name in arm.motor_names],
"start_pos": [calib[name]["start_pos"] for name in arm.motor_names],
"end_pos": [calib[name]["end_pos"] for name in arm.motor_names],
"calib_mode": calib_modes,
"motor_names": arm.motor_names,
}
# Restore the original acceleration
arm.write("Lock", 0)
arm.write("Acceleration", initial_acceleration)
time.sleep(1)
return calib_dict
def run_arm_auto_calibration_moss(arm: MotorsBus, robot_type: str, arm_name: str, arm_type: str):
"""All the offsets and magic numbers are hand tuned, and are unique to SO-100 follower arms"""
if (arm.read("Torque_Enable") != TorqueMode.DISABLED.value).any():
raise ValueError("To run calibration, the torque must be disabled on all motors.")
if not (robot_type == "moss" and arm_type == "follower"):
raise NotImplementedError("Auto calibration only supports the follower of moss arms for now.")
print(f"\nRunning calibration of {robot_type} {arm_name} {arm_type}...")
print("\nMove arm to initial position")
print("See: " + URL_TEMPLATE.format(robot=robot_type, arm=arm_type, position="initial"))
input("Press Enter to continue...")
# Lower the acceleration of the motors (in [0,254])
initial_acceleration = arm.read("Acceleration")
arm.write("Lock", 0)
arm.write("Acceleration", 10)
time.sleep(1)
arm.write("Torque_Enable", TorqueMode.ENABLED.value)
sl_pos = arm.read("Present_Position", "shoulder_lift")
arm.write("Goal_Position", sl_pos - 1024 - 450, "shoulder_lift")
ef_pos = arm.read("Present_Position", "elbow_flex")
arm.write("Goal_Position", ef_pos + 1024 + 450, "elbow_flex")
time.sleep(2)
calib = {}
print("Calibrate shoulder_pan")
calib["shoulder_pan"] = move_to_calibrate(arm, "shoulder_pan", load_threshold=350, count_threshold=200)
arm.write("Goal_Position", calib["shoulder_pan"]["zero_pos"], "shoulder_pan")
time.sleep(1)
print("Calibrate gripper")
calib["gripper"] = move_to_calibrate(arm, "gripper", invert_drive_mode=True, count_threshold=200)
time.sleep(1)
print("Calibrate wrist_flex")
calib["wrist_flex"] = move_to_calibrate(arm, "wrist_flex", invert_drive_mode=True, count_threshold=200)
calib["wrist_flex"] = apply_offset(calib["wrist_flex"], offset=-210 + 1024)
wr_pos = arm.read("Present_Position", "wrist_roll")
arm.write("Goal_Position", calib["wrist_flex"]["zero_pos"] - 1024, "wrist_flex")
time.sleep(1)
arm.write("Goal_Position", wr_pos - 1024, "wrist_roll")
time.sleep(1)
arm.write("Goal_Position", calib["wrist_flex"]["zero_pos"] - 2048, "wrist_flex")
time.sleep(1)
arm.write("Goal_Position", calib["gripper"]["end_pos"], "gripper")
time.sleep(1)
print("Calibrate wrist_roll")
calib["wrist_roll"] = move_to_calibrate(arm, "wrist_roll", invert_drive_mode=True, count_threshold=200)
calib["wrist_roll"] = apply_offset(calib["wrist_roll"], offset=790)
arm.write("Goal_Position", calib["wrist_roll"]["zero_pos"] - 1024, "wrist_roll")
arm.write("Goal_Position", calib["gripper"]["start_pos"], "gripper")
arm.write("Goal_Position", calib["wrist_flex"]["zero_pos"] - 1024, "wrist_flex")
time.sleep(1)
arm.write("Goal_Position", calib["wrist_roll"]["zero_pos"], "wrist_roll")
arm.write("Goal_Position", calib["wrist_flex"]["zero_pos"] - 2048, "wrist_flex")
def in_between_move_elbow_flex_hook():
nonlocal arm, calib
arm.write("Goal_Position", calib["wrist_flex"]["zero_pos"], "wrist_flex")
print("Calibrate elbow_flex")
calib["elbow_flex"] = move_to_calibrate(
arm,
"elbow_flex",
invert_drive_mode=True,
count_threshold=200,
in_between_move_hook=in_between_move_elbow_flex_hook,
)
arm.write("Goal_Position", calib["wrist_flex"]["zero_pos"] - 1024, "wrist_flex")
def in_between_move_shoulder_lift_hook():
nonlocal arm, calib
sl = arm.read("Present_Position", "shoulder_lift")
arm.write("Goal_Position", sl - 1500, "shoulder_lift")
time.sleep(1)
arm.write("Goal_Position", calib["elbow_flex"]["zero_pos"] + 1536, "elbow_flex")
time.sleep(1)
arm.write("Goal_Position", calib["wrist_flex"]["start_pos"], "wrist_flex")
time.sleep(1)
print("Calibrate shoulder_lift")
calib["shoulder_lift"] = move_to_calibrate(
arm, "shoulder_lift", in_between_move_hook=in_between_move_shoulder_lift_hook
)
calib["shoulder_lift"] = apply_offset(calib["shoulder_lift"], offset=-1024)
arm.write("Goal_Position", calib["wrist_flex"]["zero_pos"] - 1024, "wrist_flex")
time.sleep(1)
arm.write("Goal_Position", calib["shoulder_lift"]["zero_pos"] + 2048, "shoulder_lift")
arm.write("Goal_Position", calib["elbow_flex"]["zero_pos"] - 1024 - 400, "elbow_flex")
time.sleep(2)
calib_modes = []
for name in arm.motor_names:
if name == "gripper":
calib_modes.append(CalibrationMode.LINEAR.name)
else:
calib_modes.append(CalibrationMode.DEGREE.name)
calib_dict = {
"homing_offset": [calib[name]["homing_offset"] for name in arm.motor_names],
"drive_mode": [calib[name]["drive_mode"] for name in arm.motor_names],
"start_pos": [calib[name]["start_pos"] for name in arm.motor_names],
"end_pos": [calib[name]["end_pos"] for name in arm.motor_names],
"calib_mode": calib_modes,
"motor_names": arm.motor_names,
}
# Restore the original acceleration
arm.write("Lock", 0)
arm.write("Acceleration", initial_acceleration)
time.sleep(1)
return calib_dict
def run_arm_manual_calibration(arm: MotorsBus, robot_type: str, arm_name: str, arm_type: str):
"""This function ensures that a neural network trained on data collected on a given robot
can work on another robot. For instance before calibration, setting a same goal position
for each motor of two different robots will get two very different positions. But after calibration,
the two robots will move to the same position.To this end, this function computes the homing offset
and the drive mode for each motor of a given robot.
Homing offset is used to shift the motor position to a ]-2048, +2048[ nominal range (when the motor uses 2048 steps
to complete a half a turn). This range is set around an arbitrary "zero position" corresponding to all motor positions
being 0. During the calibration process, you will need to manually move the robot to this "zero position".
Drive mode is used to invert the rotation direction of the motor. This is useful when some motors have been assembled
in the opposite orientation for some robots. During the calibration process, you will need to manually move the robot
to the "rotated position".
After calibration, the homing offsets and drive modes are stored in a cache.
Example of usage:
```python
run_arm_calibration(arm, "so100", "left", "follower")
```
"""
if (arm.read("Torque_Enable") != TorqueMode.DISABLED.value).any():
raise ValueError("To run calibration, the torque must be disabled on all motors.")
print(f"\nRunning calibration of {robot_type} {arm_name} {arm_type}...")
print("\nMove arm to zero position")
print("See: " + URL_TEMPLATE.format(robot=robot_type, arm=arm_type, position="zero"))
input("Press Enter to continue...")
# We arbitrarily chose our zero target position to be a straight horizontal position with the gripper upwards and closed.
# It is easy to identify, and all motors are in a "quarter turn" position. Once calibration is done, this position will
# correspond to every motor angle being 0. If you set 0 as the Goal Position for every motor, the arm will move to this position.
zero_target_pos = convert_degrees_to_steps(ZERO_POSITION_DEGREE, arm.motor_models)
# Compute homing offset so that `present_position + homing_offset ~= target_position`.
zero_pos = arm.read("Present_Position")
homing_offset = zero_target_pos - zero_pos
# The rotated target position corresponds to a rotation of a quarter turn from the zero position.
# This makes it possible to identify the rotation direction of each motor.
# For instance, if a motor rotates by 90 degrees but its value is -90 after applying the homing offset, then we know its
# rotation direction is inverted. However, for the calibration to be successful, every motor needs to follow the same
# target position. Sometimes, there is only one possible rotation direction. For instance, if the gripper is closed,
# there is only one direction which corresponds to opening the gripper. When the rotation direction is ambiguous, we
# arbitrarily rotate clockwise from the point of view of the previous motor in the kinematic chain.
print("\nMove arm to rotated target position")
print("See: " + URL_TEMPLATE.format(robot=robot_type, arm=arm_type, position="rotated"))
input("Press Enter to continue...")
rotated_target_pos = convert_degrees_to_steps(ROTATED_POSITION_DEGREE, arm.motor_models)
# Find drive mode by rotating each motor by a quarter of a turn.
# Drive mode indicates if the motor rotation direction should be inverted (=1) or not (=0).
rotated_pos = arm.read("Present_Position")
drive_mode = (rotated_pos < zero_pos).astype(np.int32)
# Re-compute homing offset to take into account drive mode
rotated_drived_pos = apply_drive_mode(rotated_pos, drive_mode)
homing_offset = rotated_target_pos - rotated_drived_pos
print("\nMove arm to rest position")
print("See: " + URL_TEMPLATE.format(robot=robot_type, arm=arm_type, position="rest"))
input("Press Enter to continue...")
print()
# Joints with rotational motions are expressed in degrees in nominal range of [-180, 180]
calib_modes = []
for name in arm.motor_names:
if name == "gripper":
calib_modes.append(CalibrationMode.LINEAR.name)
else:
calib_modes.append(CalibrationMode.DEGREE.name)
calib_dict = {
"homing_offset": homing_offset.tolist(),
"drive_mode": drive_mode.tolist(),
"start_pos": zero_pos.tolist(),
"end_pos": rotated_pos.tolist(),
"calib_mode": calib_modes,
"motor_names": arm.motor_names,
}
return calib_dict

View File

@@ -1,3 +1,9 @@
"""Contains logic to instantiate a robot, read information from its motors and cameras,
and send orders to its motors.
"""
# TODO(rcadene, aliberts): reorganize the codebase into one file per robot, with the associated
# calibration procedure, to make it easy for people to add their own robot.
import json
import logging
import time
@@ -10,138 +16,10 @@ import numpy as np
import torch
from lerobot.common.robot_devices.cameras.utils import Camera
from lerobot.common.robot_devices.motors.dynamixel import (
CalibrationMode,
TorqueMode,
convert_degrees_to_steps,
)
from lerobot.common.robot_devices.motors.utils import MotorsBus
from lerobot.common.robot_devices.robots.utils import get_arm_id
from lerobot.common.robot_devices.utils import RobotDeviceAlreadyConnectedError, RobotDeviceNotConnectedError
########################################################################
# Calibration logic
########################################################################
URL_TEMPLATE = (
"https://raw.githubusercontent.com/huggingface/lerobot/main/media/{robot}/{arm}_{position}.webp"
)
# The following positions are provided in nominal degree range ]-180, +180[
# For more info on these constants, see comments in the code where they get used.
ZERO_POSITION_DEGREE = 0
ROTATED_POSITION_DEGREE = 90
def assert_drive_mode(drive_mode):
# `drive_mode` is in [0, 1], where 0 means the original rotation direction for the motor and 1 means inverted.
if not np.all(np.isin(drive_mode, [0, 1])):
raise ValueError(f"`drive_mode` contains values other than 0 or 1: ({drive_mode})")
def apply_drive_mode(position, drive_mode):
assert_drive_mode(drive_mode)
# Convert `drive_mode` from [0, 1], where 0 indicates the original rotation direction and 1 inverted,
# to [-1, 1], where 1 indicates the original rotation direction and -1 inverted.
signed_drive_mode = -(drive_mode * 2 - 1)
position *= signed_drive_mode
return position
def compute_nearest_rounded_position(position, models):
delta_turn = convert_degrees_to_steps(ROTATED_POSITION_DEGREE, models)
nearest_pos = np.round(position.astype(float) / delta_turn) * delta_turn
return nearest_pos.astype(position.dtype)
def run_arm_calibration(arm: MotorsBus, robot_type: str, arm_name: str, arm_type: str):
"""This function ensures that a neural network trained on data collected on a given robot
can work on another robot. For instance before calibration, setting a same goal position
for each motor of two different robots will get two very different positions. But after calibration,
the two robots will move to the same position.To this end, this function computes the homing offset
and the drive mode for each motor of a given robot.
Homing offset is used to shift the motor position to a ]-2048, +2048[ nominal range (when the motor uses 2048 steps
to complete a half a turn). This range is set around an arbitrary "zero position" corresponding to all motor positions
being 0. During the calibration process, you will need to manually move the robot to this "zero position".
Drive mode is used to invert the rotation direction of the motor. This is useful when some motors have been assembled
in the opposite orientation for some robots. During the calibration process, you will need to manually move the robot
to the "rotated position".
After calibration, the homing offsets and drive modes are stored in a cache.
Example of usage:
```python
run_arm_calibration(arm, "koch", "left", "follower")
```
"""
if (arm.read("Torque_Enable") != TorqueMode.DISABLED.value).any():
raise ValueError("To run calibration, the torque must be disabled on all motors.")
print(f"\nRunning calibration of {robot_type} {arm_name} {arm_type}...")
print("\nMove arm to zero position")
print("See: " + URL_TEMPLATE.format(robot=robot_type, arm=arm_type, position="zero"))
input("Press Enter to continue...")
# We arbitrarily chose our zero target position to be a straight horizontal position with the gripper upwards and closed.
# It is easy to identify, and all motors are in a "quarter turn" position. Once calibration is done, this position will
# correspond to every motor angle being 0. If you set 0 as the Goal Position for every motor, the arm will move to this position.
zero_target_pos = convert_degrees_to_steps(ZERO_POSITION_DEGREE, arm.motor_models)
# Compute homing offset so that `present_position + homing_offset ~= target_position`.
zero_pos = arm.read("Present_Position")
zero_nearest_pos = compute_nearest_rounded_position(zero_pos, arm.motor_models)
homing_offset = zero_target_pos - zero_nearest_pos
# The rotated target position corresponds to a rotation of a quarter turn from the zero position.
# This makes it possible to identify the rotation direction of each motor.
# For instance, if a motor rotates by 90 degrees but its value is -90 after applying the homing offset, then we know its
# rotation direction is inverted. However, for the calibration to be successful, every motor needs to follow the same
# target position. Sometimes, there is only one possible rotation direction. For instance, if the gripper is closed,
# there is only one direction which corresponds to opening the gripper. When the rotation direction is ambiguous, we
# arbitrarily rotate clockwise from the point of view of the previous motor in the kinematic chain.
print("\nMove arm to rotated target position")
print("See: " + URL_TEMPLATE.format(robot=robot_type, arm=arm_type, position="rotated"))
input("Press Enter to continue...")
rotated_target_pos = convert_degrees_to_steps(ROTATED_POSITION_DEGREE, arm.motor_models)
# Find drive mode by rotating each motor by a quarter of a turn.
# Drive mode indicates if the motor rotation direction should be inverted (=1) or not (=0).
rotated_pos = arm.read("Present_Position")
drive_mode = (rotated_pos < zero_pos).astype(np.int32)
# Re-compute homing offset to take into account drive mode
rotated_drived_pos = apply_drive_mode(rotated_pos, drive_mode)
rotated_nearest_pos = compute_nearest_rounded_position(rotated_drived_pos, arm.motor_models)
homing_offset = rotated_target_pos - rotated_nearest_pos
print("\nMove arm to rest position")
print("See: " + URL_TEMPLATE.format(robot=robot_type, arm=arm_type, position="rest"))
input("Press Enter to continue...")
print()
# Joints with rotational motions are expressed in degrees in nominal range of [-180, 180]
calib_mode = [CalibrationMode.DEGREE.name] * len(arm.motor_names)
# TODO(rcadene): make type of joints (DEGREE or LINEAR) configurable from yaml?
if robot_type == "aloha" and "gripper" in arm.motor_names:
# Joints with linear motions (like the gripper of Aloha) are expressed in a nominal range of [0, 100]
calib_idx = arm.motor_names.index("gripper")
calib_mode[calib_idx] = CalibrationMode.LINEAR.name
calib_data = {
"homing_offset": homing_offset.tolist(),
"drive_mode": drive_mode.tolist(),
"start_pos": zero_pos.tolist(),
"end_pos": rotated_pos.tolist(),
"calib_mode": calib_mode,
"motor_names": arm.motor_names,
}
return calib_data
def ensure_safe_goal_position(
goal_pos: torch.Tensor, present_pos: torch.Tensor, max_relative_target: float | list[float]
@@ -163,11 +41,6 @@ def ensure_safe_goal_position(
return safe_goal_pos
########################################################################
# Manipulator robot
########################################################################
@dataclass
class ManipulatorRobotConfig:
"""
@@ -178,7 +51,7 @@ class ManipulatorRobotConfig:
"""
# Define all components of the robot
robot_type: str | None = None
robot_type: str = "koch"
leader_arms: dict[str, MotorsBus] = field(default_factory=lambda: {})
follower_arms: dict[str, MotorsBus] = field(default_factory=lambda: {})
cameras: dict[str, Camera] = field(default_factory=lambda: {})
@@ -207,6 +80,10 @@ class ManipulatorRobotConfig:
)
super().__setattr__(prop, val)
def __post_init__(self):
if self.robot_type not in ["koch", "koch_bimanual", "aloha", "so100", "moss"]:
raise ValueError(f"Provided robot type ({self.robot_type}) is not supported.")
class ManipulatorRobot:
# TODO(rcadene): Implement force feedback
@@ -349,6 +226,13 @@ class ManipulatorRobot:
self.is_connected = False
self.logs = {}
action_names = [f"{arm}_{motor}" for arm, bus in self.leader_arms.items() for motor in bus.motors]
state_names = [f"{arm}_{motor}" for arm, bus in self.follower_arms.items() for motor in bus.motors]
self.names = {
"action": action_names,
"observation.state": state_names,
}
@property
def has_camera(self):
return len(self.cameras) > 0
@@ -387,6 +271,11 @@ class ManipulatorRobot:
print(f"Connecting {name} leader arm.")
self.leader_arms[name].connect()
if self.robot_type in ["koch", "aloha"]:
from lerobot.common.robot_devices.motors.dynamixel import TorqueMode
elif self.robot_type in ["so100", "moss"]:
from lerobot.common.robot_devices.motors.feetech import TorqueMode
# We assume that at connection time, arms are in a rest position, and torque can
# be safely disabled to run calibration and/or set robot preset configurations.
for name in self.follower_arms:
@@ -401,8 +290,8 @@ class ManipulatorRobot:
self.set_koch_robot_preset()
elif self.robot_type == "aloha":
self.set_aloha_robot_preset()
else:
warnings.warn(f"No preset found for robot type: {self.robot_type}", stacklevel=1)
elif self.robot_type in ["so100", "moss"]:
self.set_so100_robot_preset()
# Enable torque on all motors of the follower arms
for name in self.follower_arms:
@@ -410,12 +299,22 @@ class ManipulatorRobot:
self.follower_arms[name].write("Torque_Enable", 1)
if self.config.gripper_open_degree is not None:
if self.robot_type in ["aloha", "so100", "moss"]:
raise NotImplementedError(
f"{self.robot_type} does not support position AND current control in the handle, which is require to set the gripper open."
)
# Set the leader arm in torque mode with the gripper motor set to an angle. This makes it possible
# to squeeze the gripper and have it spring back to an open position on its own.
for name in self.leader_arms:
self.leader_arms[name].write("Torque_Enable", 1, "gripper")
self.leader_arms[name].write("Goal_Position", self.config.gripper_open_degree, "gripper")
# Check both arms can be read
for name in self.follower_arms:
self.follower_arms[name].read("Present_Position")
for name in self.leader_arms:
self.leader_arms[name].read("Present_Position")
# Connect the cameras
for name in self.cameras:
self.cameras[name].connect()
@@ -437,7 +336,25 @@ class ManipulatorRobot:
calibration = json.load(f)
else:
print(f"Missing calibration file '{arm_calib_path}'")
calibration = run_arm_calibration(arm, self.robot_type, name, arm_type)
if self.robot_type in ["koch", "aloha"]:
from lerobot.common.robot_devices.robots.dynamixel_calibration import run_arm_calibration
calibration = run_arm_calibration(arm, self.robot_type, name, arm_type)
elif self.robot_type in ["so100", "moss"]:
from lerobot.common.robot_devices.robots.feetech_calibration import (
run_arm_auto_calibration,
run_arm_manual_calibration,
)
# TODO(rcadene): better way to handle mocking + test run_arm_auto_calibration
if arm_type == "leader" or arm.mock:
calibration = run_arm_manual_calibration(arm, self.robot_type, name, arm_type)
elif arm_type == "follower":
calibration = run_arm_auto_calibration(arm, self.robot_type, name, arm_type)
else:
raise ValueError(arm_type)
print(f"Calibration is done! Saving calibration file '{arm_calib_path}'")
arm_calib_path.parent.mkdir(parents=True, exist_ok=True)
@@ -455,6 +372,8 @@ class ManipulatorRobot:
def set_koch_robot_preset(self):
def set_operating_mode_(arm):
from lerobot.common.robot_devices.motors.dynamixel import TorqueMode
if (arm.read("Torque_Enable") != TorqueMode.DISABLED.value).any():
raise ValueError("To run set robot preset, the torque must be disabled on all motors.")
@@ -542,6 +461,23 @@ class ManipulatorRobot:
stacklevel=1,
)
def set_so100_robot_preset(self):
for name in self.follower_arms:
# Mode=0 for Position Control
self.follower_arms[name].write("Mode", 0)
# Set P_Coefficient to lower value to avoid shakiness (Default is 32)
self.follower_arms[name].write("P_Coefficient", 16)
# Set I_Coefficient and D_Coefficient to default value 0 and 32
self.follower_arms[name].write("I_Coefficient", 0)
self.follower_arms[name].write("D_Coefficient", 32)
# Release the write lock (Lock=0) so that Maximum_Acceleration gets written to the EPROM address,
# which is mandatory for Maximum_Acceleration to take effect after rebooting.
self.follower_arms[name].write("Lock", 0)
# Set Maximum_Acceleration to 254 to speed up the acceleration and deceleration of
# the motors. Note: this configuration is not in the official STS3215 Memory Table.
self.follower_arms[name].write("Maximum_Acceleration", 254)
self.follower_arms[name].write("Acceleration", 254)
def teleop_step(
self, record_data=False
) -> None | tuple[dict[str, torch.Tensor], dict[str, torch.Tensor]]:

View File

@@ -1,11 +1,13 @@
# Aloha: A Low-Cost Hardware for Bimanual Teleoperation
# [Aloha: A Low-Cost Hardware for Bimanual Teleoperation](https://www.trossenrobotics.com/aloha-stationary)
# https://aloha-2.github.io
# https://www.trossenrobotics.com/aloha-stationary
# Requires installing extras packages
# With pip: `pip install -e ".[dynamixel intelrealsense]"`
# With poetry: `poetry install --sync --extras "dynamixel intelrealsense"`
# See [tutorial](https://github.com/huggingface/lerobot/blob/main/examples/9_use_aloha.md)
_target_: lerobot.common.robot_devices.robots.manipulator.ManipulatorRobot
robot_type: aloha
# Specific to Aloha, LeRobot comes with default calibration files. Assuming the motors have been

View File

@@ -0,0 +1,56 @@
# [Moss v1 robot arm](https://github.com/jess-moss/moss-robot-arms)
# Requires installing extras packages
# With pip: `pip install -e ".[feetech]"`
# With poetry: `poetry install --sync --extras "feetech"`
# See [tutorial](https://github.com/huggingface/lerobot/blob/main/examples/11_use_moss.md)
_target_: lerobot.common.robot_devices.robots.manipulator.ManipulatorRobot
robot_type: moss
calibration_dir: .cache/calibration/moss
# `max_relative_target` limits the magnitude of the relative positional target vector for safety purposes.
# Set this to a positive scalar to have the same value for all motors, or a list that is the same length as
# the number of motors in your follower arms.
max_relative_target: null
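# For example (illustrative values, not defaults):
#   max_relative_target: 10                    # one limit shared by all motors
#   max_relative_target: [5, 5, 5, 5, 5, 10]   # one limit per motor (6 motors here)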
leader_arms:
main:
_target_: lerobot.common.robot_devices.motors.feetech.FeetechMotorsBus
port: /dev/tty.usbmodem58760431091
motors:
# name: (index, model)
shoulder_pan: [1, "sts3215"]
shoulder_lift: [2, "sts3215"]
elbow_flex: [3, "sts3215"]
wrist_flex: [4, "sts3215"]
wrist_roll: [5, "sts3215"]
gripper: [6, "sts3215"]
follower_arms:
main:
_target_: lerobot.common.robot_devices.motors.feetech.FeetechMotorsBus
port: /dev/tty.usbmodem58760431191
motors:
# name: (index, model)
shoulder_pan: [1, "sts3215"]
shoulder_lift: [2, "sts3215"]
elbow_flex: [3, "sts3215"]
wrist_flex: [4, "sts3215"]
wrist_roll: [5, "sts3215"]
gripper: [6, "sts3215"]
cameras:
laptop:
_target_: lerobot.common.robot_devices.cameras.opencv.OpenCVCamera
camera_index: 0
fps: 30
width: 640
height: 480
phone:
_target_: lerobot.common.robot_devices.cameras.opencv.OpenCVCamera
camera_index: 1
fps: 30
width: 640
height: 480

View File

@@ -0,0 +1,56 @@
# [SO-100 robot arm](https://github.com/TheRobotStudio/SO-ARM100)
# Requires installing extras packages
# With pip: `pip install -e ".[feetech]"`
# With poetry: `poetry install --sync --extras "feetech"`
# See [tutorial](https://github.com/huggingface/lerobot/blob/main/examples/10_use_so100.md)
_target_: lerobot.common.robot_devices.robots.manipulator.ManipulatorRobot
robot_type: so100
calibration_dir: .cache/calibration/so100
# `max_relative_target` limits the magnitude of the relative positional target vector for safety purposes.
# Set this to a positive scalar to have the same value for all motors, or a list that is the same length as
# the number of motors in your follower arms.
max_relative_target: null
leader_arms:
main:
_target_: lerobot.common.robot_devices.motors.feetech.FeetechMotorsBus
port: /dev/tty.usbmodem585A0077581
motors:
# name: (index, model)
shoulder_pan: [1, "sts3215"]
shoulder_lift: [2, "sts3215"]
elbow_flex: [3, "sts3215"]
wrist_flex: [4, "sts3215"]
wrist_roll: [5, "sts3215"]
gripper: [6, "sts3215"]
follower_arms:
main:
_target_: lerobot.common.robot_devices.motors.feetech.FeetechMotorsBus
port: /dev/tty.usbmodem585A0080971
motors:
# name: (index, model)
shoulder_pan: [1, "sts3215"]
shoulder_lift: [2, "sts3215"]
elbow_flex: [3, "sts3215"]
wrist_flex: [4, "sts3215"]
wrist_roll: [5, "sts3215"]
gripper: [6, "sts3215"]
cameras:
laptop:
_target_: lerobot.common.robot_devices.cameras.opencv.OpenCVCamera
camera_index: 0
fps: 30
width: 640
height: 480
phone:
_target_: lerobot.common.robot_devices.cameras.opencv.OpenCVCamera
camera_index: 1
fps: 30
width: 640
height: 480

View File

@@ -1,3 +1,12 @@
# [Stretch3 from Hello Robot](https://hello-robot.com/stretch-3-product)
# Requires installing extras packages
# With pip: `pip install -e ".[stretch]"`
# With poetry: `poetry install --sync --extras "stretch"`
# See [tutorial](https://github.com/huggingface/lerobot/blob/main/examples/8_use_stretch.md)
_target_: lerobot.common.robot_devices.robots.stretch.StretchRobot
robot_type: stretch3

View File

@@ -0,0 +1,145 @@
"""
This script configures a single motor at a time, setting it to a given ID and baudrate.
Example of usage:
```bash
python lerobot/scripts/configure_motor.py \
--port /dev/tty.usbmodem585A0080521 \
--brand feetech \
--model sts3215 \
--baudrate 1000000 \
--ID 1
```
"""
import argparse
import time
def configure_motor(port, brand, model, motor_idx_des, baudrate_des):
if brand == "feetech":
from lerobot.common.robot_devices.motors.feetech import MODEL_BAUDRATE_TABLE
from lerobot.common.robot_devices.motors.feetech import (
SCS_SERIES_BAUDRATE_TABLE as SERIES_BAUDRATE_TABLE,
)
from lerobot.common.robot_devices.motors.feetech import FeetechMotorsBus as MotorsBusClass
elif brand == "dynamixel":
from lerobot.common.robot_devices.motors.dynamixel import MODEL_BAUDRATE_TABLE
from lerobot.common.robot_devices.motors.dynamixel import (
X_SERIES_BAUDRATE_TABLE as SERIES_BAUDRATE_TABLE,
)
from lerobot.common.robot_devices.motors.dynamixel import DynamixelMotorsBus as MotorsBusClass
else:
raise ValueError(
f"Currently we do not support this motor brand: {brand}. We currently support feetech and dynamixel motors."
)
# Check if the provided model exists in the model_baud_rate_table
if model not in MODEL_BAUDRATE_TABLE:
raise ValueError(
f"Invalid model '{model}' for brand '{brand}'. Supported models: {list(MODEL_BAUDRATE_TABLE.keys())}"
)
# Setup motor names, indices, and models
motor_name = "motor"
motor_index_arbitrary = motor_idx_des # Use the motor ID passed via argument
motor_model = model # Use the motor model passed via argument
# Initialize the MotorBus with the correct port and motor configurations
motor_bus = MotorsBusClass(port=port, motors={motor_name: (motor_index_arbitrary, motor_model)})
# Try to connect to the motor bus and handle any connection-specific errors
try:
motor_bus.connect()
print(f"Connected on port {motor_bus.port}")
except OSError as e:
print(f"Error occurred when connecting to the motor bus: {e}")
return
# Motor bus is connected, proceed with the rest of the operations
try:
print("Scanning all baudrates and motor indices")
all_baudrates = set(SERIES_BAUDRATE_TABLE.values())
motor_index = -1 # Set the motor index to an out-of-range value.
for baudrate in all_baudrates:
motor_bus.set_bus_baudrate(baudrate)
present_ids = motor_bus.find_motor_indices(list(range(1, 10)))
if len(present_ids) > 1:
raise ValueError(
"Error: More than one motor ID detected. This script is designed to only handle one motor at a time. Please disconnect all but one motor."
)
if len(present_ids) == 1:
if motor_index != -1:
raise ValueError(
"Error: More than one motor ID detected. This script is designed to only handle one motor at a time. Please disconnect all but one motor."
)
motor_index = present_ids[0]
if motor_index == -1:
raise ValueError("No motors detected. Please ensure you have one motor connected.")
print(f"Motor index found at: {motor_index}")
if brand == "feetech":
# Allows ID and BAUDRATE to be written in memory
motor_bus.write_with_motor_ids(motor_bus.motor_models, motor_index, "Lock", 0)
if baudrate != baudrate_des:
print(f"Setting its baudrate to {baudrate_des}")
baudrate_idx = list(SERIES_BAUDRATE_TABLE.values()).index(baudrate_des)
# The write can fail, so we allow retries
motor_bus.write_with_motor_ids(motor_bus.motor_models, motor_index, "Baud_Rate", baudrate_idx)
time.sleep(0.5)
motor_bus.set_bus_baudrate(baudrate_des)
present_baudrate_idx = motor_bus.read_with_motor_ids(
motor_bus.motor_models, motor_index, "Baud_Rate", num_retry=2
)
if present_baudrate_idx != baudrate_idx:
raise OSError("Failed to write baudrate.")
print(f"Setting its index to desired index {motor_idx_des}")
motor_bus.write_with_motor_ids(motor_bus.motor_models, motor_index, "Lock", 0)
motor_bus.write_with_motor_ids(motor_bus.motor_models, motor_index, "ID", motor_idx_des)
present_idx = motor_bus.read_with_motor_ids(motor_bus.motor_models, motor_idx_des, "ID", num_retry=2)
if present_idx != motor_idx_des:
raise OSError("Failed to write index.")
if brand == "feetech":
# Set Maximum_Acceleration to 254 to speed up the acceleration and deceleration of
# the motors. Note: this configuration is not in the official STS3215 Memory Table.
motor_bus.write("Lock", 0)
motor_bus.write("Maximum_Acceleration", 254)
motor_bus.write("Goal_Position", 2048)
time.sleep(4)
print("Present Position", motor_bus.read("Present_Position"))
motor_bus.write("Offset", 0)
time.sleep(4)
print("Offset", motor_bus.read("Offset"))
except Exception as e:
print(f"Error occurred during motor configuration: {e}")
finally:
motor_bus.disconnect()
print("Disconnected from motor bus.")
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument("--port", type=str, required=True, help="Motors bus port (e.g. dynamixel,feetech)")
parser.add_argument("--brand", type=str, required=True, help="Motor brand (e.g. dynamixel,feetech)")
parser.add_argument("--model", type=str, required=True, help="Motor model (e.g. xl330-m077,sts3215)")
parser.add_argument("--ID", type=int, required=True, help="Desired ID of the current motor (e.g. 1,2,3)")
parser.add_argument(
"--baudrate", type=int, default=1000000, help="Desired baudrate for the motor (default: 1000000)"
)
args = parser.parse_args()
configure_motor(args.port, args.brand, args.model, args.ID, args.baudrate)

View File

@@ -106,12 +106,6 @@ from typing import List
# from safetensors.torch import load_file, save_file
from lerobot.common.datasets.lerobot_dataset import LeRobotDataset
from lerobot.common.datasets.populate_dataset import (
create_lerobot_dataset,
delete_current_episode,
init_dataset,
save_current_episode,
)
from lerobot.common.robot_devices.control_utils import (
control_loop,
has_method,
@@ -144,6 +138,9 @@ def calibrate(robot: Robot, arms: list[str] | None):
robot.home()
return
if arms is None:
arms = robot.available_arms
unknown_arms = [arm_id for arm_id in arms if arm_id not in robot.available_arms]
available_arms_str = " ".join(robot.available_arms)
unknown_arms_str = " ".join(unknown_arms)
@@ -195,23 +192,24 @@ def record(
robot: Robot,
root: str,
repo_id: str,
single_task: str,
pretrained_policy_name_or_path: str | None = None,
policy_overrides: List[str] | None = None,
fps: int | None = None,
warmup_time_s=2,
episode_time_s=10,
reset_time_s=5,
num_episodes=50,
video=True,
run_compute_stats=True,
push_to_hub=True,
tags=None,
num_image_writer_processes=0,
num_image_writer_threads_per_camera=4,
force_override=False,
display_cameras=True,
play_sounds=True,
):
warmup_time_s: int | float = 2,
episode_time_s: int | float = 10,
reset_time_s: int | float = 5,
num_episodes: int = 50,
video: bool = True,
run_compute_stats: bool = True,
push_to_hub: bool = True,
num_image_writer_processes: int = 0,
num_image_writer_threads_per_camera: int = 4,
display_cameras: bool = True,
play_sounds: bool = True,
tags: list[str] | None = None,
force_override: bool = False,
) -> LeRobotDataset:
# TODO(rcadene): Add option to record logs
listener = None
events = None
@@ -219,6 +217,11 @@ def record(
device = None
use_amp = None
if single_task:
task = single_task
else:
raise NotImplementedError("Only single-task recording is supported for now")
# Load pretrained policy
if pretrained_policy_name_or_path is not None:
policy, policy_fps, device, use_amp = init_policy(pretrained_policy_name_or_path, policy_overrides)
@@ -233,15 +236,14 @@ def record(
# Create empty dataset or load existing saved episodes
sanity_check_dataset_name(repo_id, policy)
dataset = init_dataset(
dataset = LeRobotDataset.create(
repo_id,
root,
force_override,
fps,
video,
write_images=robot.has_camera,
num_image_writer_processes=num_image_writer_processes,
num_image_writer_threads=num_image_writer_threads_per_camera * robot.num_cameras,
root=root,
robot=robot,
image_writer_processes=num_image_writer_processes,
image_writer_threads_per_camera=num_image_writer_threads_per_camera,
use_videos=video,
)
if not robot.is_connected:
@@ -260,11 +262,17 @@ def record(
if has_method(robot, "teleop_safety_stop"):
robot.teleop_safety_stop()
recorded_episodes = 0
while True:
if dataset["num_episodes"] >= num_episodes:
if recorded_episodes >= num_episodes:
break
episode_index = dataset["num_episodes"]
# TODO(aliberts): add task prompt for multitask here. Might need to temporarily disable event if
# input() messes with them.
# if multi_task:
# task = input("Enter your task description: ")
episode_index = dataset.episode_buffer["episode_index"]
log_say(f"Recording episode {episode_index}", play_sounds)
record_episode(
dataset=dataset,
@@ -292,11 +300,11 @@ def record(
log_say("Re-record episode", play_sounds)
events["rerecord_episode"] = False
events["exit_early"] = False
delete_current_episode(dataset)
dataset.clear_episode_buffer()
continue
# Increment dataset["current_episode_index"] by one
save_current_episode(dataset)
dataset.add_episode(task)
recorded_episodes += 1
if events["stop_recording"]:
break
@@ -304,35 +312,47 @@ def record(
log_say("Stop recording", play_sounds, blocking=True)
stop_recording(robot, listener, display_cameras)
lerobot_dataset = create_lerobot_dataset(dataset, run_compute_stats, push_to_hub, tags, play_sounds)
if dataset.image_writer is not None:
logging.info("Waiting for image writer to terminate...")
dataset.image_writer.stop()
if run_compute_stats:
logging.info("Computing dataset statistics")
dataset.consolidate(run_compute_stats)
# lerobot_dataset = create_lerobot_dataset(dataset, run_compute_stats, push_to_hub, tags, play_sounds)
if push_to_hub:
dataset.push_to_hub()
log_say("Exiting", play_sounds)
return lerobot_dataset
return dataset
@safe_disconnect
def replay(
robot: Robot, episode: int, fps: int | None = None, root="data", repo_id="lerobot/debug", play_sounds=True
robot: Robot,
root: Path,
repo_id: str,
episode: int,
fps: int | None = None,
play_sounds: bool = True,
local_files_only: bool = True,
):
# TODO(rcadene, aliberts): refactor with control_loop, once `dataset` is an instance of LeRobotDataset
# TODO(rcadene): Add option to record logs
local_dir = Path(root) / repo_id
if not local_dir.exists():
raise ValueError(local_dir)
dataset = LeRobotDataset(repo_id, root=root)
items = dataset.hf_dataset.select_columns("action")
from_idx = dataset.episode_data_index["from"][episode].item()
to_idx = dataset.episode_data_index["to"][episode].item()
dataset = LeRobotDataset(repo_id, root=root, episodes=[episode], local_files_only=local_files_only)
actions = dataset.hf_dataset.select_columns("action")
if not robot.is_connected:
robot.connect()
log_say("Replaying episode", play_sounds, blocking=True)
for idx in range(from_idx, to_idx):
for idx in range(dataset.num_samples):
start_episode_t = time.perf_counter()
action = items[idx]["action"]
action = actions[idx]["action"]
robot.send_action(action)
dt_s = time.perf_counter() - start_episode_t
@@ -381,9 +401,21 @@ if __name__ == "__main__":
)
parser_record = subparsers.add_parser("record", parents=[base_parser])
task_args = parser_record.add_mutually_exclusive_group(required=True)
parser_record.add_argument(
"--fps", type=none_or_int, default=None, help="Frames per second (set to None to disable)"
)
task_args.add_argument(
"--single-task",
type=str,
help="A short but accurate description of the task performed during the recording.",
)
# TODO(aliberts): add multi-task support
# task_args.add_argument(
# "--multi-task",
# type=int,
# help="You will need to enter the task performed at the start of each episode.",
# )
parser_record.add_argument(
"--root",
type=Path,

View File

@@ -0,0 +1,36 @@
import time
from pathlib import Path
def find_available_ports():
ports = []
for path in Path("/dev").glob("tty*"):
ports.append(str(path))
return ports
def find_port():
print("Finding all available ports for the MotorsBus.")
ports_before = find_available_ports()
print(ports_before)
print("Remove the usb cable from your MotorsBus and press Enter when done.")
input()
time.sleep(0.5)
ports_after = find_available_ports()
ports_diff = list(set(ports_before) - set(ports_after))
if len(ports_diff) == 1:
port = ports_diff[0]
print(f"The port of this MotorsBus is '{port}'")
print("Reconnect the usb cable.")
elif len(ports_diff) == 0:
raise OSError(f"Could not detect the port. No difference was found ({ports_diff}).")
else:
raise OSError(f"Could not detect the port. More than one port was found ({ports_diff}).")
if __name__ == "__main__":
# Helper to find the USB port associated with each of your MotorsBus devices.
find_port()
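# Example session (illustrative; the port name is hypothetical):
#   Finding all available ports for the MotorsBus.
#   ['/dev/tty.usbmodem575E0031751', '/dev/tty.Bluetooth-Incoming-Port']
#   Remove the usb cable from your MotorsBus and press Enter when done.
#   The port of this MotorsBus is '/dev/tty.usbmodem575E0031751'
#   Reconnect the usb cable.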

View File

@@ -260,7 +260,7 @@ def push_dataset_to_hub(
episode_index = 0
tests_videos_dir = tests_data_dir / repo_id / "videos"
tests_videos_dir.mkdir(parents=True, exist_ok=True)
for key in lerobot_dataset.video_frame_keys:
for key in lerobot_dataset.camera_keys:
fname = f"{key}_episode_{episode_index:06d}.mp4"
shutil.copy(videos_dir / fname, tests_videos_dir / fname)

View File

@@ -97,14 +97,13 @@ def run_server(
"num_episodes": dataset.num_episodes,
"fps": dataset.fps,
}
video_paths = get_episode_video_paths(dataset, episode_id)
language_instruction = get_episode_language_instruction(dataset, episode_id)
video_paths = [dataset.get_video_file_path(episode_id, key) for key in dataset.video_keys]
tasks = dataset.episode_dicts[episode_id]["tasks"]
videos_info = [
{"url": url_for("static", filename=video_path), "filename": Path(video_path).name}
{"url": url_for("static", filename=video_path), "filename": video_path.name}
for video_path in video_paths
]
if language_instruction:
videos_info[0]["language_instruction"] = language_instruction
videos_info[0]["language_instruction"] = tasks
ep_csv_url = url_for("static", filename=get_ep_csv_fname(episode_id))
return render_template(
@@ -137,10 +136,10 @@ def write_episode_data_csv(output_dir, file_name, episode_index, dataset):
# init header of csv with state and action names
header = ["timestamp"]
if has_state:
dim_state = len(dataset.hf_dataset["observation.state"][0])
dim_state = dataset.shapes["observation.state"]
header += [f"state_{i}" for i in range(dim_state)]
if has_action:
dim_action = len(dataset.hf_dataset["action"][0])
dim_action = dataset.shapes["action"]
header += [f"action_{i}" for i in range(dim_action)]
columns = ["timestamp"]
@@ -171,8 +170,7 @@ def get_episode_video_paths(dataset: LeRobotDataset, ep_index: int) -> list[str]
# get first frame of episode (hack to get video_path of the episode)
first_frame_idx = dataset.episode_data_index["from"][ep_index].item()
return [
dataset.hf_dataset.select_columns(key)[first_frame_idx][key]["path"]
for key in dataset.video_frame_keys
dataset.hf_dataset.select_columns(key)[first_frame_idx][key]["path"] for key in dataset.video_keys
]
@@ -204,8 +202,8 @@ def visualize_dataset_html(
dataset = LeRobotDataset(repo_id, root=root)
if not dataset.video:
raise NotImplementedError(f"Image datasets ({dataset.video=}) are currently not supported.")
if len(dataset.image_keys) > 0:
raise NotImplementedError(f"Image keys ({dataset.image_keys=}) are currently not supported.")
if output_dir is None:
output_dir = f"outputs/visualize_dataset_html/{repo_id}"
@@ -225,7 +223,7 @@ def visualize_dataset_html(
static_dir.mkdir(parents=True, exist_ok=True)
ln_videos_dir = static_dir / "videos"
if not ln_videos_dir.exists():
ln_videos_dir.symlink_to(dataset.videos_dir.resolve())
ln_videos_dir.symlink_to((dataset.root / "videos").resolve())
template_dir = Path(__file__).resolve().parent.parent / "templates"

Eight binary .webp media files added (not shown), including media/moss/leader_rest.webp and media/moss/leader_zero.webp; sizes range from 86 to 194 KiB.

poetry.lock (generated, 1501 lines): diff suppressed because it is too large.

View File

@@ -43,9 +43,8 @@ opencv-python = ">=4.9.0"
diffusers = ">=0.27.2"
torchvision = ">=0.17.1"
h5py = ">=3.10.0"
huggingface-hub = {extras = ["hf-transfer", "cli"], version = ">=0.25.0"}
# TODO(rcadene, aliberts): Make gym 1.0.0 work
gymnasium = "==0.29.1"
huggingface-hub = {extras = ["hf-transfer", "cli"], version = ">=0.25.2"}
gymnasium = "==0.29.1" # TODO(rcadene, aliberts): Make gym 1.0.0 work
cmake = ">=3.29.0.1"
gym-dora = { git = "https://github.com/dora-rs/dora-lerobot.git", subdirectory = "gym_dora", optional = true }
gym-pusht = { version = ">=0.1.5", optional = true}
@@ -65,11 +64,13 @@ pandas = {version = ">=2.2.2", optional = true}
scikit-image = {version = ">=0.23.2", optional = true}
dynamixel-sdk = {version = ">=3.7.31", optional = true}
pynput = {version = ">=1.7.7", optional = true}
feetech-servo-sdk = {version = ">=1.0.0", optional = true}
setuptools = {version = "!=71.0.1", optional = true} # TODO(rcadene, aliberts): 71.0.1 has a bug
pyrealsense2 = {version = ">=2.55.1.6486", markers = "sys_platform != 'darwin'", optional = true} # TODO(rcadene, aliberts): Fix on Mac
pyrender = {git = "https://github.com/mmatl/pyrender.git", markers = "sys_platform == 'linux'", optional = true}
hello-robot-stretch-body = {version = ">=0.7.27", markers = "sys_platform == 'linux'", optional = true}
pyserial = {version = ">=3.5", optional = true}
jsonlines = ">=4.0.0"
[tool.poetry.extras]
@@ -82,6 +83,7 @@ test = ["pytest", "pytest-cov", "pyserial"]
umi = ["imagecodecs"]
video_benchmark = ["scikit-image", "pandas"]
dynamixel = ["dynamixel-sdk", "pynput"]
feetech = ["feetech-servo-sdk", "pynput"]
intelrealsense = ["pyrealsense2"]
stretch = ["hello-robot-stretch-body", "pyrender", "pyrealsense2", "pynput"]

View File

@@ -18,6 +18,19 @@ def convert_to_bytes(value, bytes):
return value
def get_default_motor_values(motor_index):
return {
# Key (int) are from X_SERIES_CONTROL_TABLE
7: motor_index, # ID
8: DEFAULT_BAUDRATE, # Baud_rate
10: 0, # Drive_Mode
64: 0, # Torque_Enable
# Set 2560 since calibration values for Aloha gripper is between start_pos=2499 and end_pos=3144
# For other joints, 2560 will be autocorrected to be in calibration range
132: 2560, # Present_Position
}
class PortHandler:
def __init__(self, port):
self.port = port
@@ -52,18 +65,9 @@ class GroupSyncRead:
self.packet_handler = packet_handler
def addParam(self, motor_index): # noqa: N802
# Initialize motor default values
if motor_index not in self.packet_handler.data:
# Initialize motor default values
self.packet_handler.data[motor_index] = {
# Key (int) are from X_SERIES_CONTROL_TABLE
7: motor_index, # ID
8: DEFAULT_BAUDRATE, # Baud_rate
10: 0, # Drive_Mode
64: 0, # Torque_Enable
# Set 2560 since calibration values for Aloha gripper is between start_pos=2499 and end_pos=3144
# For other joints, 2560 will be autocorrected to be in calibration range
132: 2560, # Present_Position
}
self.packet_handler.data[motor_index] = get_default_motor_values(motor_index)
def txRxPacket(self): # noqa: N802
return COMM_SUCCESS
@@ -78,6 +82,9 @@ class GroupSyncWrite:
        self.address = address

    def addParam(self, index, data):  # noqa: N802
+        # Initialize motor default values
+        if index not in self.packet_handler.data:
+            self.packet_handler.data[index] = get_default_motor_values(index)
        self.changeParam(index, data)

    def txPacket(self):  # noqa: N802
tests/mock_scservo_sdk.py (new file, 103 lines)
View File

@@ -0,0 +1,103 @@
"""Mocked classes and functions from scservo_sdk to allow for continuous integration
and testing code logic that requires hardware and devices (e.g. robot arms, cameras).

Warning: These mocked versions are minimalist. They do not exactly mock every behavior
of the original classes and functions (e.g. return types might be None instead of boolean).
"""

# from scservo_sdk import COMM_SUCCESS

DEFAULT_BAUDRATE = 1_000_000
COMM_SUCCESS = 0  # tx or rx packet communication success


def convert_to_bytes(value, bytes):
    # TODO(rcadene): remove the need to mock `convert_to_bytes` by implementing the inverse
    # transform `convert_bytes_to_value`
    del bytes  # unused
    return value


def get_default_motor_values(motor_index):
    return {
        # Keys (int) are from SCS_SERIES_CONTROL_TABLE
        5: motor_index,  # ID
        6: DEFAULT_BAUDRATE,  # Baud_rate
        10: 0,  # Drive_Mode
        21: 32,  # P_Coefficient
        22: 32,  # D_Coefficient
        23: 0,  # I_Coefficient
        40: 0,  # Torque_Enable
        41: 254,  # Acceleration
        31: -2047,  # Offset
        33: 0,  # Mode
        55: 1,  # Lock
        # Set 2560 since calibration values for the Aloha gripper are between start_pos=2499 and end_pos=3144
        # For other joints, 2560 will be autocorrected to be in the calibration range
        56: 2560,  # Present_Position
        58: 0,  # Present_Speed
        69: 0,  # Present_Current
        85: 150,  # Maximum_Acceleration
    }


class PortHandler:
    def __init__(self, port):
        self.port = port
        # factory default baudrate
        self.baudrate = DEFAULT_BAUDRATE

    def openPort(self):  # noqa: N802
        return True

    def closePort(self):  # noqa: N802
        pass

    def setPacketTimeoutMillis(self, timeout_ms):  # noqa: N802
        del timeout_ms  # unused

    def getBaudRate(self):  # noqa: N802
        return self.baudrate

    def setBaudRate(self, baudrate):  # noqa: N802
        self.baudrate = baudrate


class PacketHandler:
    def __init__(self, protocol_version):
        del protocol_version  # unused
        # Use packet_handler.data to communicate across Read and Write
        self.data = {}


class GroupSyncRead:
    def __init__(self, port_handler, packet_handler, address, bytes):
        self.packet_handler = packet_handler

    def addParam(self, motor_index):  # noqa: N802
        # Initialize motor default values
        if motor_index not in self.packet_handler.data:
            self.packet_handler.data[motor_index] = get_default_motor_values(motor_index)

    def txRxPacket(self):  # noqa: N802
        return COMM_SUCCESS

    def getData(self, index, address, bytes):  # noqa: N802
        return self.packet_handler.data[index][address]


class GroupSyncWrite:
    def __init__(self, port_handler, packet_handler, address, bytes):
        self.packet_handler = packet_handler
        self.address = address

    def addParam(self, index, data):  # noqa: N802
        if index not in self.packet_handler.data:
            self.packet_handler.data[index] = get_default_motor_values(index)
        self.changeParam(index, data)

    def txPacket(self):  # noqa: N802
        return COMM_SUCCESS

    def changeParam(self, index, data):  # noqa: N802
        self.packet_handler.data[index][self.address] = data
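For context, a hedged sketch of how such a mock is typically wired in: registered under the real module name before the Feetech bus imports scservo_sdk. Whether the test suite patches sys.modules exactly this way (or hides the swap inside the motors bus when mock=True) is an assumption.

import sys

from tests import mock_scservo_sdk  # assumes `tests` is importable as a package

# Assumption: make `import scservo_sdk` resolve to the mock for code under test.
sys.modules["scservo_sdk"] = mock_scservo_sdk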

tests/test_control_robot.py
View File

@@ -29,14 +29,13 @@ from unittest.mock import patch
import pytest

-from lerobot.common.datasets.populate_dataset import add_frame, init_dataset
from lerobot.common.logger import Logger
from lerobot.common.policies.factory import make_policy
from lerobot.common.utils.utils import init_hydra_config
from lerobot.scripts.control_robot import calibrate, record, replay, teleoperate
from lerobot.scripts.train import make_optimizer_and_scheduler
from tests.test_robots import make_robot
-from tests.utils import DEFAULT_CONFIG_PATH, DEVICE, TEST_ROBOT_TYPES, require_robot
+from tests.utils import DEFAULT_CONFIG_PATH, DEVICE, TEST_ROBOT_TYPES, mock_calibration_dir, require_robot
@pytest.mark.parametrize("robot_type, mock", TEST_ROBOT_TYPES)
@@ -49,6 +48,7 @@ def test_teleoperate(tmpdir, request, robot_type, mock):
        # and avoid writing calibration files in user .cache/calibration folder
        tmpdir = Path(tmpdir)
        calibration_dir = tmpdir / robot_type
+        mock_calibration_dir(calibration_dir)
        overrides = [f"calibration_dir={calibration_dir}"]
    else:
        # Use the default .cache/calibration folder when mock=False
@@ -89,10 +89,12 @@ def test_record_without_cameras(tmpdir, request, robot_type, mock):
        # Create an empty calibration directory to trigger manual calibration
        # and avoid writing calibration files in user .cache/calibration folder
        calibration_dir = Path(tmpdir) / robot_type
+        mock_calibration_dir(calibration_dir)
        overrides.append(f"calibration_dir={calibration_dir}")

-    root = Path(tmpdir) / "data"
    repo_id = "lerobot/debug"
+    root = Path(tmpdir) / "data" / repo_id
+    single_task = "Do something."
    robot = make_robot(robot_type, overrides=overrides, mock=mock)
    record(
@@ -100,6 +102,7 @@ def test_record_without_cameras(tmpdir, request, robot_type, mock):
        fps=30,
        root=root,
        repo_id=repo_id,
+        single_task=single_task,
        warmup_time_s=1,
        episode_time_s=1,
        num_episodes=2,
@@ -121,6 +124,7 @@ def test_record_and_replay_and_policy(tmpdir, request, robot_type, mock):
        # Create an empty calibration directory to trigger manual calibration
        # and avoid writing calibration files in user .cache/calibration folder
        calibration_dir = tmpdir / robot_type
+        mock_calibration_dir(calibration_dir)
        overrides = [f"calibration_dir={calibration_dir}"]
    else:
        # Use the default .cache/calibration folder when mock=False or for aloha
@@ -129,17 +133,18 @@ def test_record_and_replay_and_policy(tmpdir, request, robot_type, mock):
        env_name = "koch_real"
        policy_name = "act_koch_real"

-    root = tmpdir / "data"
    repo_id = "lerobot/debug"
-    eval_repo_id = "lerobot/eval_debug"
+    root = tmpdir / "data" / repo_id
+    single_task = "Do something."

    robot = make_robot(robot_type, overrides=overrides, mock=mock)
    dataset = record(
        robot,
        root,
        repo_id,
-        fps=1,
-        warmup_time_s=1,
+        single_task,
+        fps=5,
+        warmup_time_s=0.5,
        episode_time_s=1,
        reset_time_s=1,
        num_episodes=2,
@@ -150,16 +155,16 @@ def test_record_and_replay_and_policy(tmpdir, request, robot_type, mock):
        display_cameras=False,
        play_sounds=False,
    )
-    assert dataset.num_episodes == 2
-    assert len(dataset) == 2
+    assert dataset.total_episodes == 2
+    assert len(dataset) == 10  # 2 episodes x 1 s x 5 fps

-    replay(robot, episode=0, fps=1, root=root, repo_id=repo_id, play_sounds=False)
+    replay(robot, episode=0, fps=5, root=root, repo_id=repo_id, play_sounds=False)

    # TODO(rcadene, aliberts): rethink this design
    if robot_type == "aloha":
        env_name = "aloha_real"
        policy_name = "act_aloha_real"
-    elif robot_type in ["koch", "koch_bimanual"]:
+    elif robot_type in ["koch", "koch_bimanual", "so100", "moss"]:
        env_name = "koch_real"
        policy_name = "act_koch_real"
    else:
else:
@@ -216,10 +221,14 @@ def test_record_and_replay_and_policy(tmpdir, request, robot_type, mock):
    else:
        num_image_writer_processes = 0

-    record(
+    eval_repo_id = "lerobot/eval_debug"
+    eval_root = tmpdir / "data" / eval_repo_id
+
+    dataset = record(
        robot,
-        root,
+        eval_root,
+        eval_repo_id,
+        single_task,
        pretrained_policy_name_or_path,
        warmup_time_s=1,
        episode_time_s=1,
@@ -248,6 +257,7 @@ def test_resume_record(tmpdir, request, robot_type, mock):
        # Create an empty calibration directory to trigger manual calibration
        # and avoid writing calibration files in user .cache/calibration folder
        calibration_dir = tmpdir / robot_type
+        mock_calibration_dir(calibration_dir)
        overrides = [f"calibration_dir={calibration_dir}"]
    else:
        # Use the default .cache/calibration folder when mock=False or for aloha
@@ -255,13 +265,15 @@ def test_resume_record(tmpdir, request, robot_type, mock):
    robot = make_robot(robot_type, overrides=overrides, mock=mock)

-    root = Path(tmpdir) / "data"
    repo_id = "lerobot/debug"
+    root = Path(tmpdir) / "data" / repo_id
+    single_task = "Do something."

    dataset = record(
        robot,
        root,
        repo_id,
+        single_task,
        fps=1,
        warmup_time_s=0,
        episode_time_s=1,
@@ -274,32 +286,33 @@ def test_resume_record(tmpdir, request, robot_type, mock):
    )
    assert len(dataset) == 1, "`dataset` should contain only 1 frame"

-    init_dataset_return_value = {}
+    # init_dataset_return_value = {}

-    def wrapped_init_dataset(*args, **kwargs):
-        nonlocal init_dataset_return_value
-        init_dataset_return_value = init_dataset(*args, **kwargs)
-        return init_dataset_return_value
+    # def wrapped_init_dataset(*args, **kwargs):
+    #     nonlocal init_dataset_return_value
+    #     init_dataset_return_value = init_dataset(*args, **kwargs)
+    #     return init_dataset_return_value

-    with patch("lerobot.scripts.control_robot.init_dataset", wraps=wrapped_init_dataset):
-        dataset = record(
-            robot,
-            root,
-            repo_id,
-            fps=1,
-            warmup_time_s=0,
-            episode_time_s=1,
-            num_episodes=2,
-            push_to_hub=False,
-            video=False,
-            display_cameras=False,
-            play_sounds=False,
-            run_compute_stats=False,
-        )
-        assert len(dataset) == 2, "`dataset` should contain only 1 frame"
-        assert (
-            init_dataset_return_value["num_episodes"] == 2
-        ), "`init_dataset` should load the previous episode"
+    # with patch("lerobot.scripts.control_robot.init_dataset", wraps=wrapped_init_dataset):
+    dataset = record(
+        robot,
+        root,
+        repo_id,
+        single_task,
+        fps=1,
+        warmup_time_s=0,
+        episode_time_s=1,
+        num_episodes=2,
+        push_to_hub=False,
+        video=False,
+        display_cameras=False,
+        play_sounds=False,
+        run_compute_stats=False,
+    )
+    assert len(dataset) == 2, "`dataset` should contain 2 frames"
+    # assert (
+    #     init_dataset_return_value["num_episodes"] == 2
+    # ), "`init_dataset` should load the previous episode"
@pytest.mark.parametrize("robot_type, mock", [("koch", True)])
@@ -311,29 +324,29 @@ def test_record_with_event_rerecord_episode(tmpdir, request, robot_type, mock):
        # Create an empty calibration directory to trigger manual calibration
        # and avoid writing calibration files in user .cache/calibration folder
        calibration_dir = tmpdir / robot_type
+        mock_calibration_dir(calibration_dir)
        overrides = [f"calibration_dir={calibration_dir}"]
    else:
        # Use the default .cache/calibration folder when mock=False or for aloha
        overrides = []

    robot = make_robot(robot_type, overrides=overrides, mock=mock)
-    with (
-        patch("lerobot.scripts.control_robot.init_keyboard_listener") as mock_listener,
-        patch("lerobot.common.robot_devices.control_utils.add_frame", wraps=add_frame) as mock_add_frame,
-    ):
+    with patch("lerobot.scripts.control_robot.init_keyboard_listener") as mock_listener:
        mock_events = {}
        mock_events["exit_early"] = True
        mock_events["rerecord_episode"] = True
        mock_events["stop_recording"] = False
        mock_listener.return_value = (None, mock_events)

-        root = Path(tmpdir) / "data"
        repo_id = "lerobot/debug"
+        root = Path(tmpdir) / "data" / repo_id
+        single_task = "Do something."

        dataset = record(
            robot,
            root,
            repo_id,
+            single_task,
            fps=1,
            warmup_time_s=0,
            episode_time_s=1,
@@ -347,7 +360,6 @@ def test_record_with_event_rerecord_episode(tmpdir, request, robot_type, mock):
        assert not mock_events["rerecord_episode"], "`rerecord_episode` wasn't properly reset to False"
        assert not mock_events["exit_early"], "`exit_early` wasn't properly reset to False"
-        assert mock_add_frame.call_count == 2, "`add_frame` should have been called 2 times"
        assert len(dataset) == 1, "`dataset` should contain only 1 frame"
@@ -360,29 +372,29 @@ def test_record_with_event_exit_early(tmpdir, request, robot_type, mock):
        # Create an empty calibration directory to trigger manual calibration
        # and avoid writing calibration files in user .cache/calibration folder
        calibration_dir = tmpdir / robot_type
+        mock_calibration_dir(calibration_dir)
        overrides = [f"calibration_dir={calibration_dir}"]
    else:
        # Use the default .cache/calibration folder when mock=False or for aloha
        overrides = []

    robot = make_robot(robot_type, overrides=overrides, mock=mock)
-    with (
-        patch("lerobot.scripts.control_robot.init_keyboard_listener") as mock_listener,
-        patch("lerobot.common.robot_devices.control_utils.add_frame", wraps=add_frame) as mock_add_frame,
-    ):
+    with patch("lerobot.scripts.control_robot.init_keyboard_listener") as mock_listener:
        mock_events = {}
        mock_events["exit_early"] = True
        mock_events["rerecord_episode"] = False
        mock_events["stop_recording"] = False
        mock_listener.return_value = (None, mock_events)

-        root = Path(tmpdir) / "data"
        repo_id = "lerobot/debug"
+        root = Path(tmpdir) / "data" / repo_id
+        single_task = "Do something."

        dataset = record(
            robot,
            fps=2,
            root=root,
+            single_task=single_task,
            repo_id=repo_id,
            warmup_time_s=0,
            episode_time_s=1,
@@ -395,7 +407,6 @@ def test_record_with_event_exit_early(tmpdir, request, robot_type, mock):
        )
        assert not mock_events["exit_early"], "`exit_early` wasn't properly reset to False"
-        assert mock_add_frame.call_count == 1, "`add_frame` should have been called 1 time"
        assert len(dataset) == 1, "`dataset` should contain only 1 frame"
@@ -410,29 +421,29 @@ def test_record_with_event_stop_recording(tmpdir, request, robot_type, mock, num
        # Create an empty calibration directory to trigger manual calibration
        # and avoid writing calibration files in user .cache/calibration folder
        calibration_dir = tmpdir / robot_type
+        mock_calibration_dir(calibration_dir)
        overrides = [f"calibration_dir={calibration_dir}"]
    else:
        # Use the default .cache/calibration folder when mock=False or for aloha
        overrides = []

    robot = make_robot(robot_type, overrides=overrides, mock=mock)
-    with (
-        patch("lerobot.scripts.control_robot.init_keyboard_listener") as mock_listener,
-        patch("lerobot.common.robot_devices.control_utils.add_frame", wraps=add_frame) as mock_add_frame,
-    ):
+    with patch("lerobot.scripts.control_robot.init_keyboard_listener") as mock_listener:
        mock_events = {}
        mock_events["exit_early"] = True
        mock_events["rerecord_episode"] = False
        mock_events["stop_recording"] = True
        mock_listener.return_value = (None, mock_events)

-        root = Path(tmpdir) / "data"
        repo_id = "lerobot/debug"
+        root = Path(tmpdir) / "data" / repo_id
+        single_task = "Do something."

        dataset = record(
            robot,
            root,
            repo_id,
+            single_task=single_task,
            fps=1,
            warmup_time_s=0,
            episode_time_s=1,
@@ -446,5 +457,4 @@ def test_record_with_event_stop_recording(tmpdir, request, robot_type, mock, num
        )
        assert not mock_events["exit_early"], "`exit_early` wasn't properly reset to False"
-        assert mock_add_frame.call_count == 1, "`add_frame` should have been called 1 time"
        assert len(dataset) == 1, "`dataset` should contain only 1 frame"

tests/test_datasets.py
View File

@@ -42,7 +42,28 @@ from lerobot.common.datasets.utils import (
    unflatten_dict,
)
from lerobot.common.utils.utils import init_hydra_config, seeded_context
-from tests.utils import DEFAULT_CONFIG_PATH, DEVICE
+from tests.utils import DEFAULT_CONFIG_PATH, DEVICE, make_robot

+# TODO(aliberts): create proper test repo
+TEST_REPO_ID = "aliberts/koch_tutorial"
+
+
+def test_same_attributes_defined():
+    # TODO(aliberts): test with keys, shapes, names etc. provided instead of robot
+    robot = make_robot("koch", mock=True)
+
+    # Instantiate both ways
+    dataset_init = LeRobotDataset(repo_id=TEST_REPO_ID)
+    dataset_create = LeRobotDataset.create(repo_id=TEST_REPO_ID, fps=30, robot=robot)
+
+    # Access the '_hub_version' cached_property in both instances to force its creation
+    _ = dataset_init._hub_version
+    _ = dataset_create._hub_version
+
+    init_attr = set(vars(dataset_init).keys())
+    create_attr = set(vars(dataset_create).keys())
+
+    assert init_attr == create_attr
+
+
@pytest.mark.parametrize(
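The two `_hub_version` accesses in the test above matter because functools.cached_property only materializes in an instance's __dict__ (and therefore in vars()) on first access. A standalone illustration of that behavior, not repo code:

from functools import cached_property


class Demo:
    @cached_property
    def value(self) -> int:
        return 42


d = Demo()
assert "value" not in vars(d)  # not yet computed, so not in the instance dict
_ = d.value
assert "value" in vars(d)  # cached on first access; vars() now sees it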

tests/test_motors.py
View File

@@ -30,8 +30,8 @@ import time
import numpy as np
import pytest

-from lerobot.common.robot_devices.motors.dynamixel import find_port
from lerobot.common.robot_devices.utils import RobotDeviceAlreadyConnectedError, RobotDeviceNotConnectedError
+from lerobot.scripts.find_motors_bus_port import find_port
from tests.utils import TEST_MOTOR_TYPES, make_motors_bus, require_motor
@@ -52,12 +52,24 @@ def test_configure_motors_all_ids_1(request, motor_type, mock):
    if mock:
        request.getfixturevalue("patch_builtins_input")

+    if motor_type == "dynamixel":
+        # see X_SERIES_BAUDRATE_TABLE
+        smaller_baudrate = 9_600
+        smaller_baudrate_value = 0
+    elif motor_type == "feetech":
+        # see SCS_SERIES_BAUDRATE_TABLE
+        smaller_baudrate = 19_200
+        smaller_baudrate_value = 7
+    else:
+        raise ValueError(motor_type)
+
    input("Are you sure you want to re-configure the motors? Press enter to continue...")
    # This test expects the configuration to already be correct.
    motors_bus = make_motors_bus(motor_type, mock=mock)
    motors_bus.connect()
-    motors_bus.write("Baud_Rate", [0] * len(motors_bus.motors))
-    motors_bus.set_bus_baudrate(9_600)
+    motors_bus.write("Baud_Rate", [smaller_baudrate_value] * len(motors_bus.motors))
+    motors_bus.set_bus_baudrate(smaller_baudrate)
    motors_bus.write("ID", [1] * len(motors_bus.motors))
    del motors_bus
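The values written to the Baud_Rate register above are table indices, not baudrates: per the branches in the test, index 0 maps to 9,600 baud in the Dynamixel X-series table and index 7 to 19,200 baud in the Feetech SCS-series table. An illustrative excerpt (the full tables live alongside each motor bus implementation; only the two entries used by the test are shown):

# Excerpts only, taken from the test above.
X_SERIES_BAUDRATE_TABLE = {0: 9_600}     # Dynamixel: register value 0 -> 9,600 baud
SCS_SERIES_BAUDRATE_TABLE = {7: 19_200}  # Feetech: register value 7 -> 19,200 baud


def baudrate_from_register(table: dict[int, int], value: int) -> int:
    # Look up the physical baudrate that a register value selects.
    return table[value]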

tests/test_robots.py
View File

@@ -30,7 +30,7 @@ import torch
from lerobot.common.robot_devices.robots.manipulator import ManipulatorRobot
from lerobot.common.robot_devices.utils import RobotDeviceAlreadyConnectedError, RobotDeviceNotConnectedError
-from tests.utils import TEST_ROBOT_TYPES, make_robot, require_robot
+from tests.utils import TEST_ROBOT_TYPES, make_robot, mock_calibration_dir, require_robot
@pytest.mark.parametrize("robot_type, mock", TEST_ROBOT_TYPES)
@@ -54,6 +54,7 @@ def test_robot(tmpdir, request, robot_type, mock):
    tmpdir = Path(tmpdir)
    calibration_dir = tmpdir / robot_type
    overrides_calibration_dir = [f"calibration_dir={calibration_dir}"]
+    mock_calibration_dir(calibration_dir)
    robot_kwargs["calibration_dir"] = calibration_dir

    # Test connecting without devices raises an error

tests/utils.py
View File

@@ -13,10 +13,12 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
+import json
import os
import platform
from copy import copy
from functools import wraps
+from pathlib import Path

import pytest
import torch
@@ -52,7 +54,7 @@ for motor_type in available_motors:
OPENCV_CAMERA_INDEX = int(os.environ.get("LEROBOT_TEST_OPENCV_CAMERA_INDEX", 0))
INTELREALSENSE_CAMERA_INDEX = int(os.environ.get("LEROBOT_TEST_INTELREALSENSE_CAMERA_INDEX", 128422271614))

-DYNAMIXEL_PORT = "/dev/tty.usbmodem575E0032081"
+DYNAMIXEL_PORT = os.environ.get("LEROBOT_TEST_DYNAMIXEL_PORT", "/dev/tty.usbmodem575E0032081")
DYNAMIXEL_MOTORS = {
    "shoulder_pan": [1, "xl430-w250"],
    "shoulder_lift": [2, "xl430-w250"],
@@ -62,6 +64,16 @@ DYNAMIXEL_MOTORS = {
    "gripper": [6, "xl330-m288"],
}

+FEETECH_PORT = os.environ.get("LEROBOT_TEST_FEETECH_PORT", "/dev/tty.usbmodem585A0080971")
+FEETECH_MOTORS = {
+    "shoulder_pan": [1, "sts3215"],
+    "shoulder_lift": [2, "sts3215"],
+    "elbow_flex": [3, "sts3215"],
+    "wrist_flex": [4, "sts3215"],
+    "wrist_roll": [5, "sts3215"],
+    "gripper": [6, "sts3215"],
+}
def require_x86_64_kernel(func):
"""
@@ -271,13 +283,39 @@ def require_motor(func):
    return wrapper


+def mock_calibration_dir(calibration_dir):
+    # TODO(rcadene): remove this hack
+    # calibration file produced with Moss v1, but it works with Koch, Koch bimanual and SO-100
+    example_calib = {
+        "homing_offset": [-1416, -845, 2130, 2872, 1950, -2211],
+        "drive_mode": [0, 0, 1, 1, 1, 0],
+        "start_pos": [1442, 843, 2166, 2849, 1988, 1835],
+        "end_pos": [2440, 1869, -1106, -1848, -926, 3235],
+        "calib_mode": ["DEGREE", "DEGREE", "DEGREE", "DEGREE", "DEGREE", "LINEAR"],
+        "motor_names": ["shoulder_pan", "shoulder_lift", "elbow_flex", "wrist_flex", "wrist_roll", "gripper"],
+    }
+    Path(str(calibration_dir)).mkdir(parents=True, exist_ok=True)
+    with open(calibration_dir / "main_follower.json", "w") as f:
+        json.dump(example_calib, f)
+    with open(calibration_dir / "main_leader.json", "w") as f:
+        json.dump(example_calib, f)
+    with open(calibration_dir / "left_follower.json", "w") as f:
+        json.dump(example_calib, f)
+    with open(calibration_dir / "left_leader.json", "w") as f:
+        json.dump(example_calib, f)
+    with open(calibration_dir / "right_follower.json", "w") as f:
+        json.dump(example_calib, f)
+    with open(calibration_dir / "right_leader.json", "w") as f:
+        json.dump(example_calib, f)
+
+
def make_robot(robot_type: str, overrides: list[str] | None = None, mock=False) -> Robot:
    if mock:
        overrides = [] if overrides is None else copy(overrides)

        # Explicitly add the mock argument to the cameras and set it to true
        # TODO(rcadene, aliberts): redesign when we drop hydra
-        if robot_type == "koch":
+        if robot_type in ["koch", "so100", "moss"]:
            overrides.append("+leader_arms.main.mock=true")
            overrides.append("+follower_arms.main.mock=true")

            if "~cameras" not in overrides:
@@ -338,5 +376,12 @@ def make_motors_bus(motor_type: str, **kwargs) -> MotorsBus:
        motors = kwargs.pop("motors", DYNAMIXEL_MOTORS)
        return DynamixelMotorsBus(port, motors, **kwargs)
+    elif motor_type == "feetech":
+        from lerobot.common.robot_devices.motors.feetech import FeetechMotorsBus
+
+        port = kwargs.pop("port", FEETECH_PORT)
+        motors = kwargs.pop("motors", FEETECH_MOTORS)
+        return FeetechMotorsBus(port, motors, **kwargs)
    else:
        raise ValueError(f"The motor type '{motor_type}' is not valid.")
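With the feetech branch in place, a motors bus for SO-100/Moss style arms can be built the same way the Dynamixel one is elsewhere in the tests. A short sketch, assuming FeetechMotorsBus accepts a `mock=True` keyword via **kwargs (suggested by the "Add mock to feetech" commit in this compare) and exposes the same connect()/read() API as the Dynamixel bus:

from tests.utils import make_motors_bus

# Assumption: `mock=True` is forwarded through **kwargs to FeetechMotorsBus,
# so no hardware is needed; the port default then comes from FEETECH_PORT.
motors_bus = make_motors_bus("feetech", mock=True)
motors_bus.connect()
positions = motors_bus.read("Present_Position")  # one value per motor in FEETECH_MOTORS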