Compare commits
15 commits
fix/lint_w… → user/pepij…
| Author | SHA1 | Date |
|---|---|---|
| | e8159997c7 | |
| | 1c15bab70f | |
| | 9f0a8a49d0 | |
| | a3cd18eda9 | |
| | 7dc9ffe4c9 | |
| | 0e98c6ee96 | |
| | 974028bd28 | |
| | a36ed39487 | |
| | c37b1d45b6 | |
| | f994febca4 | |
| | 12f52632ed | |
| | 8a64d8268b | |
| | 84565c7c2e | |
| | 05b54733da | |
| | 513b008bcc | |
.github/workflows/test-docker-build.yml (2 changes, vendored)

```diff
@@ -41,7 +41,7 @@ jobs:
       - name: Get changed files
         id: changed-files
-        uses: tj-actions/changed-files@v44
+        uses: tj-actions/changed-files@3f54ebb830831fc121d3263c1857cfbdc310cdb9 #v42
         with:
           files: docker/**
           json: "true"
```
.github/workflows/test.yml (2 changes, vendored)

```diff
@@ -126,7 +126,7 @@ jobs:
         # portaudio19-dev is needed to install pyaudio
         run: |
           sudo apt-get update && \
-          sudo apt-get install -y libegl1-mesa-dev portaudio19-dev
+          sudo apt-get install -y libegl1-mesa-dev ffmpeg portaudio19-dev

       - name: Install uv and python
         uses: astral-sh/setup-uv@v5
```
.gitignore (2 changes, vendored)

```diff
@@ -78,7 +78,7 @@ pip-log.txt
 pip-delete-this-directory.txt

 # Unit test / coverage reports
-!tests/data
+!tests/artifacts
 htmlcov/
 .tox/
 .nox/
```
```diff
@@ -12,10 +12,17 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

-exclude: ^(tests/data)
+exclude: "tests/artifacts/.*\\.safetensors$"
 default_language_version:
   python: python3.10
 repos:
+  ##### Meta #####
+  - repo: meta
+    hooks:
+      - id: check-useless-excludes
+      - id: check-hooks-apply
+
+  ##### Style / Misc. #####
   - repo: https://github.com/pre-commit/pre-commit-hooks
     rev: v5.0.0
@@ -28,31 +35,37 @@ repos:
       - id: check-toml
       - id: end-of-file-fixer
       - id: trailing-whitespace

   - repo: https://github.com/crate-ci/typos
-    rev: v1.30.0
+    rev: v1.30.2
     hooks:
       - id: typos
         args: [--force-exclude]

+  - repo: https://github.com/asottile/pyupgrade
+    rev: v3.19.1
+    hooks:
+      - id: pyupgrade
+
   - repo: https://github.com/astral-sh/ruff-pre-commit
-    rev: v0.9.9
+    rev: v0.9.10
     hooks:
       - id: ruff
         args: [--fix]
       - id: ruff-format

+  ##### Security #####
   - repo: https://github.com/gitleaks/gitleaks
     rev: v8.24.0
     hooks:
       - id: gitleaks

   - repo: https://github.com/woodruffw/zizmor-pre-commit
     rev: v1.4.1
     hooks:
       - id: zizmor

   - repo: https://github.com/PyCQA/bandit
     rev: 1.8.3
     hooks:
```
````diff
@@ -291,7 +291,7 @@ sudo apt-get install git-lfs
 git lfs install
 ```

-Pull artifacts if they're not in [tests/data](tests/data)
+Pull artifacts if they're not in [tests/artifacts](tests/artifacts)
 ```bash
 git lfs pull
 ```
````
````diff
@@ -232,8 +232,8 @@ python lerobot/scripts/eval.py \
   --env.type=pusht \
   --eval.batch_size=10 \
   --eval.n_episodes=10 \
-  --use_amp=false \
-  --device=cuda
+  --policy.use_amp=false \
+  --policy.device=cuda
 ```

 Note: After training your own policy, you can re-evaluate the checkpoints with:
````
```diff
@@ -51,7 +51,7 @@ For a comprehensive list and documentation of these parameters, see the ffmpeg d
 ### Decoding parameters
 **Decoder**
 We tested two video decoding backends from torchvision:
--  `pyav` (default)
+- `pyav`
 - `video_reader` (requires to build torchvision from source)

 **Requested timestamps**
```
```diff
@@ -67,7 +67,7 @@ def parse_int_or_none(value) -> int | None:
 def check_datasets_formats(repo_ids: list) -> None:
     for repo_id in repo_ids:
         dataset = LeRobotDataset(repo_id)
-        if dataset.video:
+        if len(dataset.meta.video_keys) > 0:
             raise ValueError(
                 f"Use only image dataset for running this benchmark. Video dataset provided: {repo_id}"
             )
```
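The check moves from a boolean `dataset.video` attribute to dataset metadata. A minimal sketch of the same guard, assuming `meta.video_keys` lists the camera streams stored as encoded video (an image-only dataset would have an empty list):

```python
# Sketch only: mirrors the guard above; `dataset` is any object exposing
# meta.video_keys, e.g. ["observation.images.laptop"] for a video dataset.
def assert_image_only(dataset) -> None:
    video_keys = dataset.meta.video_keys
    if len(video_keys) > 0:
        raise ValueError(f"Expected an image dataset, found video streams: {video_keys}")
```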
```diff
@@ -454,8 +454,8 @@ Next, you'll need to calibrate your SO-100 robot to ensure that the leader and f

 You will need to move the follower arm to these positions sequentially:

-| 1. Zero position | 2. Rotated position | 3. Rest position |
-|---|---|---|
+| 1. Zero position | 2. Rotated position | 3. Rest position |
+| ------------------------------------------------------------------------------------------------------------------------------------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------ |
 | <img src="../media/so100/follower_zero.webp?raw=true" alt="SO-100 follower arm zero position" title="SO-100 follower arm zero position" style="width:100%;"> | <img src="../media/so100/follower_rotated.webp?raw=true" alt="SO-100 follower arm rotated position" title="SO-100 follower arm rotated position" style="width:100%;"> | <img src="../media/so100/follower_rest.webp?raw=true" alt="SO-100 follower arm rest position" title="SO-100 follower arm rest position" style="width:100%;"> |

 Make sure both arms are connected and run this script to launch manual calibration:
@@ -470,8 +470,8 @@ python lerobot/scripts/control_robot.py \
 #### b. Manual calibration of leader arm
 Follow step 6 of the [assembly video](https://youtu.be/FioA2oeFZ5I?t=724) which illustrates the manual calibration. You will need to move the leader arm to these positions sequentially:

-| 1. Zero position | 2. Rotated position | 3. Rest position |
-|---|---|---|
+| 1. Zero position | 2. Rotated position | 3. Rest position |
+| ------------------------------------------------------------------------------------------------------------------------------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ |
 | <img src="../media/so100/leader_zero.webp?raw=true" alt="SO-100 leader arm zero position" title="SO-100 leader arm zero position" style="width:100%;"> | <img src="../media/so100/leader_rotated.webp?raw=true" alt="SO-100 leader arm rotated position" title="SO-100 leader arm rotated position" style="width:100%;"> | <img src="../media/so100/leader_rest.webp?raw=true" alt="SO-100 leader arm rest position" title="SO-100 leader arm rest position" style="width:100%;"> |

 Run this script to launch manual calibration:
```
````diff
@@ -571,18 +571,25 @@ python lerobot/scripts/train.py \
   --policy.type=act \
   --output_dir=outputs/train/act_so100_test \
   --job_name=act_so100_test \
-  --device=cuda \
+  --policy.device=cuda \
   --wandb.enable=true
 ```

 Let's explain it:
 1. We provided the dataset as argument with `--dataset.repo_id=${HF_USER}/so100_test`.
 2. We provided the policy with `policy.type=act`. This loads configurations from [`configuration_act.py`](../lerobot/common/policies/act/configuration_act.py). Importantly, this policy will automatically adapt to the number of motor states, motor actions and cameras of your robot (e.g. `laptop` and `phone`) which have been saved in your dataset.
-4. We provided `device=cuda` since we are training on a Nvidia GPU, but you could use `device=mps` to train on Apple silicon.
+4. We provided `policy.device=cuda` since we are training on a Nvidia GPU, but you could use `policy.device=mps` to train on Apple silicon.
 5. We provided `wandb.enable=true` to use [Weights and Biases](https://docs.wandb.ai/quickstart) for visualizing training plots. This is optional but if you use it, make sure you are logged in by running `wandb login`.

 Training should take several hours. You will find checkpoints in `outputs/train/act_so100_test/checkpoints`.

+To resume training from a checkpoint, below is an example command to resume from `last` checkpoint of the `act_so100_test` policy:
+```bash
+python lerobot/scripts/train.py \
+  --config_path=outputs/train/act_so100_test/checkpoints/last/pretrained_model/train_config.json \
+  --resume=true
+```

 ## K. Evaluate your policy

 You can use the `record` function from [`lerobot/scripts/control_robot.py`](../lerobot/scripts/control_robot.py) but with a policy checkpoint as input. For instance, run this command to record 10 evaluation episodes:
````
```diff
@@ -366,8 +366,8 @@ Now we have to calibrate the leader arm and the follower arm. The wheel motors d

 You will need to move the follower arm to these positions sequentially:

-| 1. Zero position | 2. Rotated position | 3. Rest position |
-|---|---|---|
+| 1. Zero position | 2. Rotated position | 3. Rest position |
+| ----------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------- |
 | <img src="../media/lekiwi/mobile_calib_zero.webp?raw=true" alt="SO-100 follower arm zero position" title="SO-100 follower arm zero position" style="width:100%;"> | <img src="../media/lekiwi/mobile_calib_rotated.webp?raw=true" alt="SO-100 follower arm rotated position" title="SO-100 follower arm rotated position" style="width:100%;"> | <img src="../media/lekiwi/mobile_calib_rest.webp?raw=true" alt="SO-100 follower arm rest position" title="SO-100 follower arm rest position" style="width:100%;"> |

 Make sure the arm is connected to the Raspberry Pi and run this script (on the Raspberry Pi) to launch manual calibration:
@@ -385,8 +385,8 @@ If you have the **wired** LeKiwi version please run all commands including this
 ### Calibrate leader arm
 Then to calibrate the leader arm (which is attached to the laptop/pc). You will need to move the leader arm to these positions sequentially:

-| 1. Zero position | 2. Rotated position | 3. Rest position |
-|---|---|---|
+| 1. Zero position | 2. Rotated position | 3. Rest position |
+| ------------------------------------------------------------------------------------------------------------------------------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ |
 | <img src="../media/so100/leader_zero.webp?raw=true" alt="SO-100 leader arm zero position" title="SO-100 leader arm zero position" style="width:100%;"> | <img src="../media/so100/leader_rotated.webp?raw=true" alt="SO-100 leader arm rotated position" title="SO-100 leader arm rotated position" style="width:100%;"> | <img src="../media/so100/leader_rest.webp?raw=true" alt="SO-100 leader arm rest position" title="SO-100 leader arm rest position" style="width:100%;"> |

 Run this script (on your laptop/pc) to launch manual calibration:
```
```diff
@@ -416,22 +416,22 @@ python lerobot/scripts/control_robot.py \

 You should see on your laptop something like this: ```[INFO] Connected to remote robot at tcp://172.17.133.91:5555 and video stream at tcp://172.17.133.91:5556.``` Now you can move the leader arm and use the keyboard (w,a,s,d) to drive forward, left, backwards, right. And use (z,x) to turn left or turn right. You can use (r,f) to increase and decrease the speed of the mobile robot. There are three speed modes, see the table below:
 | Speed Mode | Linear Speed (m/s) | Rotation Speed (deg/s) |
-|------------|-------------------|-----------------------|
-| Fast | 0.4 | 90 |
-| Medium | 0.25 | 60 |
-| Slow | 0.1 | 30 |
+| ---------- | ------------------ | ---------------------- |
+| Fast       | 0.4                | 90                     |
+| Medium     | 0.25               | 60                     |
+| Slow       | 0.1                | 30                     |


-| Key | Action |
-|------|--------------------------------|
-| W | Move forward |
-| A | Move left |
-| S | Move backward |
-| D | Move right |
-| Z | Turn left |
-| X | Turn right |
-| R | Increase speed |
-| F | Decrease speed |
+| Key | Action         |
+| --- | -------------- |
+| W   | Move forward   |
+| A   | Move left      |
+| S   | Move backward  |
+| D   | Move right     |
+| Z   | Turn left      |
+| X   | Turn right     |
+| R   | Increase speed |
+| F   | Decrease speed |

 > [!TIP]
 > If you use a different keyboard you can change the keys for each command in the [`LeKiwiRobotConfig`](../lerobot/common/robot_devices/robots/configs.py).
```
````diff
@@ -549,14 +549,14 @@ python lerobot/scripts/train.py \
   --policy.type=act \
   --output_dir=outputs/train/act_lekiwi_test \
   --job_name=act_lekiwi_test \
-  --device=cuda \
+  --policy.device=cuda \
   --wandb.enable=true
 ```

 Let's explain it:
 1. We provided the dataset as argument with `--dataset.repo_id=${HF_USER}/lekiwi_test`.
 2. We provided the policy with `policy.type=act`. This loads configurations from [`configuration_act.py`](../lerobot/common/policies/act/configuration_act.py). Importantly, this policy will automatically adapt to the number of motor states, motor actions and cameras of your robot (e.g. `laptop` and `phone`) which have been saved in your dataset.
-4. We provided `device=cuda` since we are training on a Nvidia GPU, but you could use `device=mps` to train on Apple silicon.
+4. We provided `policy.device=cuda` since we are training on a Nvidia GPU, but you could use `policy.device=mps` to train on Apple silicon.
 5. We provided `wandb.enable=true` to use [Weights and Biases](https://docs.wandb.ai/quickstart) for visualizing training plots. This is optional but if you use it, make sure you are logged in by running `wandb login`.

 Training should take several hours. You will find checkpoints in `outputs/train/act_lekiwi_test/checkpoints`.
````
```diff
@@ -176,8 +176,8 @@ Next, you'll need to calibrate your Moss v1 robot to ensure that the leader and

 You will need to move the follower arm to these positions sequentially:

-| 1. Zero position | 2. Rotated position | 3. Rest position |
-|---|---|---|
+| 1. Zero position | 2. Rotated position | 3. Rest position |
+| ------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------- |
 | <img src="../media/moss/follower_zero.webp?raw=true" alt="Moss v1 follower arm zero position" title="Moss v1 follower arm zero position" style="width:100%;"> | <img src="../media/moss/follower_rotated.webp?raw=true" alt="Moss v1 follower arm rotated position" title="Moss v1 follower arm rotated position" style="width:100%;"> | <img src="../media/moss/follower_rest.webp?raw=true" alt="Moss v1 follower arm rest position" title="Moss v1 follower arm rest position" style="width:100%;"> |

 Make sure both arms are connected and run this script to launch manual calibration:
@@ -192,8 +192,8 @@ python lerobot/scripts/control_robot.py \
 **Manual calibration of leader arm**
 Follow step 6 of the [assembly video](https://www.youtube.com/watch?v=DA91NJOtMic) which illustrates the manual calibration. You will need to move the leader arm to these positions sequentially:

-| 1. Zero position | 2. Rotated position | 3. Rest position |
-|---|---|---|
+| 1. Zero position | 2. Rotated position | 3. Rest position |
+| ------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------- |
 | <img src="../media/moss/leader_zero.webp?raw=true" alt="Moss v1 leader arm zero position" title="Moss v1 leader arm zero position" style="width:100%;"> | <img src="../media/moss/leader_rotated.webp?raw=true" alt="Moss v1 leader arm rotated position" title="Moss v1 leader arm rotated position" style="width:100%;"> | <img src="../media/moss/leader_rest.webp?raw=true" alt="Moss v1 leader arm rest position" title="Moss v1 leader arm rest position" style="width:100%;"> |

 Run this script to launch manual calibration:
```
````diff
@@ -293,14 +293,14 @@ python lerobot/scripts/train.py \
   --policy.type=act \
   --output_dir=outputs/train/act_moss_test \
   --job_name=act_moss_test \
-  --device=cuda \
+  --policy.device=cuda \
   --wandb.enable=true
 ```

 Let's explain it:
 1. We provided the dataset as argument with `--dataset.repo_id=${HF_USER}/moss_test`.
 2. We provided the policy with `policy.type=act`. This loads configurations from [`configuration_act.py`](../lerobot/common/policies/act/configuration_act.py). Importantly, this policy will automatically adapt to the number of motor states, motor actions and cameras of your robot (e.g. `laptop` and `phone`) which have been saved in your dataset.
-4. We provided `device=cuda` since we are training on a Nvidia GPU, but you could use `device=mps` to train on Apple silicon.
+4. We provided `policy.device=cuda` since we are training on a Nvidia GPU, but you could use `policy.device=mps` to train on Apple silicon.
 5. We provided `wandb.enable=true` to use [Weights and Biases](https://docs.wandb.ai/quickstart) for visualizing training plots. This is optional but if you use it, make sure you are logged in by running `wandb login`.

 Training should take several hours. You will find checkpoints in `outputs/train/act_moss_test/checkpoints`.
````
```diff
@@ -1,5 +1,5 @@
 This tutorial will explain the training script, how to use it, and particularly how to configure everything needed for the training run.
-> **Note:** The following assume you're running these commands on a machine equipped with a cuda GPU. If you don't have one (or if you're using a Mac), you can add `--device=cpu` (`--device=mps` respectively). However, be advised that the code executes much slower on cpu.
+> **Note:** The following assume you're running these commands on a machine equipped with a cuda GPU. If you don't have one (or if you're using a Mac), you can add `--policy.device=cpu` (`--policy.device=mps` respectively). However, be advised that the code executes much slower on cpu.


 ## The training script
```
```diff
@@ -386,14 +386,14 @@ When you connect your robot for the first time, the [`ManipulatorRobot`](../lero

 Here are the positions you'll move the follower arm to:

-| 1. Zero position | 2. Rotated position | 3. Rest position |
-|---|---|---|
+| 1. Zero position | 2. Rotated position | 3. Rest position |
+| ----------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------- |
 | <img src="../media/koch/follower_zero.webp?raw=true" alt="Koch v1.1 follower arm zero position" title="Koch v1.1 follower arm zero position" style="width:100%;"> | <img src="../media/koch/follower_rotated.webp?raw=true" alt="Koch v1.1 follower arm rotated position" title="Koch v1.1 follower arm rotated position" style="width:100%;"> | <img src="../media/koch/follower_rest.webp?raw=true" alt="Koch v1.1 follower arm rest position" title="Koch v1.1 follower arm rest position" style="width:100%;"> |

 And here are the corresponding positions for the leader arm:

-| 1. Zero position | 2. Rotated position | 3. Rest position |
-|---|---|---|
+| 1. Zero position | 2. Rotated position | 3. Rest position |
+| ----------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------- |
 | <img src="../media/koch/leader_zero.webp?raw=true" alt="Koch v1.1 leader arm zero position" title="Koch v1.1 leader arm zero position" style="width:100%;"> | <img src="../media/koch/leader_rotated.webp?raw=true" alt="Koch v1.1 leader arm rotated position" title="Koch v1.1 leader arm rotated position" style="width:100%;"> | <img src="../media/koch/leader_rest.webp?raw=true" alt="Koch v1.1 leader arm rest position" title="Koch v1.1 leader arm rest position" style="width:100%;"> |

 You can watch a [video tutorial of the calibration procedure](https://youtu.be/8drnU9uRY24) for more details.
```
````diff
@@ -898,14 +898,14 @@ python lerobot/scripts/train.py \
   --policy.type=act \
   --output_dir=outputs/train/act_koch_test \
   --job_name=act_koch_test \
-  --device=cuda \
+  --policy.device=cuda \
   --wandb.enable=true
 ```

 Let's explain it:
 1. We provided the dataset as argument with `--dataset.repo_id=${HF_USER}/koch_test`.
 2. We provided the policy with `policy.type=act`. This loads configurations from [`configuration_act.py`](../lerobot/common/policies/act/configuration_act.py). Importantly, this policy will automatically adapt to the number of motor states, motor actions and cameras of your robot (e.g. `laptop` and `phone`) which have been saved in your dataset.
-4. We provided `device=cuda` since we are training on a Nvidia GPU, but you could use `device=mps` to train on Apple silicon.
+4. We provided `policy.device=cuda` since we are training on a Nvidia GPU, but you could use `policy.device=mps` to train on Apple silicon.
 5. We provided `wandb.enable=true` to use [Weights and Biases](https://docs.wandb.ai/quickstart) for visualizing training plots. This is optional but if you use it, make sure you are logged in by running `wandb login`.

 For more information on the `train` script see the previous tutorial: [`examples/4_train_policy_with_script.md`](../examples/4_train_policy_with_script.md)
````
````diff
@@ -135,14 +135,14 @@ python lerobot/scripts/train.py \
   --policy.type=act \
   --output_dir=outputs/train/act_aloha_test \
   --job_name=act_aloha_test \
-  --device=cuda \
+  --policy.device=cuda \
   --wandb.enable=true
 ```

 Let's explain it:
 1. We provided the dataset as argument with `--dataset.repo_id=${HF_USER}/aloha_test`.
 2. We provided the policy with `policy.type=act`. This loads configurations from [`configuration_act.py`](../lerobot/common/policies/act/configuration_act.py). Importantly, this policy will automatically adapt to the number of motor states, motor actions and cameras of your robot (e.g. `laptop` and `phone`) which have been saved in your dataset.
-4. We provided `device=cuda` since we are training on a Nvidia GPU, but you could use `device=mps` to train on Apple silicon.
+4. We provided `policy.device=cuda` since we are training on a Nvidia GPU, but you could use `policy.device=mps` to train on Apple silicon.
 5. We provided `wandb.enable=true` to use [Weights and Biases](https://docs.wandb.ai/quickstart) for visualizing training plots. This is optional but if you use it, make sure you are logged in by running `wandb login`.

 For more information on the `train` script see the previous tutorial: [`examples/4_train_policy_with_script.md`](../examples/4_train_policy_with_script.md)
````
```diff
@@ -67,8 +67,9 @@ from lerobot.common.datasets.utils import (
 )
 from lerobot.common.datasets.video_utils import (
     VideoFrame,
-    decode_video_frames_torchvision,
+    decode_video_frames,
     encode_video_frames,
+    get_safe_default_codec,
     get_video_info,
 )
 from lerobot.common.robot_devices.robots.utils import Robot
@@ -462,8 +463,8 @@ class LeRobotDataset(torch.utils.data.Dataset):
             download_videos (bool, optional): Flag to download the videos. Note that when set to True but the
                 video files are already present on local disk, they won't be downloaded again. Defaults to
                 True.
-            video_backend (str | None, optional): Video backend to use for decoding videos. There is currently
-                a single option which is the pyav decoder used by Torchvision. Defaults to pyav.
+            video_backend (str | None, optional): Video backend to use for decoding videos. Defaults to torchcodec when available in the platform; otherwise, defaults to 'pyav'.
+                You can also use the 'pyav' decoder used by Torchvision, which used to be the default option, or 'video_reader' which is another decoder of Torchvision.
         """
         super().__init__()
         self.repo_id = repo_id
@@ -473,7 +474,7 @@ class LeRobotDataset(torch.utils.data.Dataset):
         self.episodes = episodes
         self.tolerance_s = tolerance_s
         self.revision = revision if revision else CODEBASE_VERSION
-        self.video_backend = video_backend if video_backend else "pyav"
+        self.video_backend = video_backend if video_backend else get_safe_default_codec()
         self.delta_indices = None

         # Unused attributes
@@ -707,9 +708,7 @@ class LeRobotDataset(torch.utils.data.Dataset):
         item = {}
         for vid_key, query_ts in query_timestamps.items():
             video_path = self.root / self.meta.get_video_file_path(ep_idx, vid_key)
-            frames = decode_video_frames_torchvision(
-                video_path, query_ts, self.tolerance_s, self.video_backend
-            )
+            frames = decode_video_frames(video_path, query_ts, self.tolerance_s, self.video_backend)
             item[vid_key] = frames.squeeze(0)

         return item
@@ -1029,7 +1028,7 @@ class LeRobotDataset(torch.utils.data.Dataset):
         obj.delta_timestamps = None
         obj.delta_indices = None
         obj.episode_data_index = None
-        obj.video_backend = video_backend if video_backend is not None else "pyav"
+        obj.video_backend = video_backend if video_backend is not None else get_safe_default_codec()
         return obj
```
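Taken together, these hunks mean that `video_backend=None` is now resolved lazily to the best available decoder instead of being hard-coded to `"pyav"`. A hedged usage sketch (the dataset id is illustrative):

```python
from lerobot.common.datasets.lerobot_dataset import LeRobotDataset

# video_backend omitted: resolved via get_safe_default_codec(), i.e.
# "torchcodec" when that package is importable, otherwise "pyav".
ds = LeRobotDataset("lerobot/pusht")

# The previous default can still be forced explicitly:
ds_pyav = LeRobotDataset("lerobot/pusht", video_backend="pyav")
```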
```diff
@@ -13,6 +13,7 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
+import importlib
 import json
 import logging
 import subprocess
@@ -29,6 +30,46 @@ from datasets.features.features import register_feature
 from PIL import Image


+def get_safe_default_codec():
+    if importlib.util.find_spec("torchcodec"):
+        return "torchcodec"
+    else:
+        logging.warning(
+            "'torchcodec' is not available in your platform, falling back to 'pyav' as a default decoder"
+        )
+        return "pyav"
+
+
+def decode_video_frames(
+    video_path: Path | str,
+    timestamps: list[float],
+    tolerance_s: float,
+    backend: str | None = None,
+) -> torch.Tensor:
+    """
+    Decodes video frames using the specified backend.
+
+    Args:
+        video_path (Path): Path to the video file.
+        timestamps (list[float]): List of timestamps to extract frames.
+        tolerance_s (float): Allowed deviation in seconds for frame retrieval.
+        backend (str, optional): Backend to use for decoding. Defaults to "torchcodec" when available in the platform; otherwise, defaults to "pyav".
+
+    Returns:
+        torch.Tensor: Decoded frames.
+
+    Currently supports torchcodec on cpu and pyav.
+    """
+    if backend is None:
+        backend = get_safe_default_codec()
+    if backend == "torchcodec":
+        return decode_video_frames_torchcodec(video_path, timestamps, tolerance_s)
+    elif backend in ["pyav", "video_reader"]:
+        return decode_video_frames_torchvision(video_path, timestamps, tolerance_s, backend)
+    else:
+        raise ValueError(f"Unsupported video backend: {backend}")
+
+
 def decode_video_frames_torchvision(
     video_path: Path | str,
     timestamps: list[float],
```
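A usage sketch for the new dispatcher; the clip path and timestamps below are placeholders, not values from the repository:

```python
from lerobot.common.datasets.video_utils import decode_video_frames

frames = decode_video_frames(
    video_path="episode_000000.mp4",  # hypothetical clip
    timestamps=[0.0, 0.5, 1.0],       # requested frame times, in seconds
    tolerance_s=0.02,                 # max |requested - decoded| allowed per frame
    backend=None,                     # None -> torchcodec if installed, else pyav
)
print(frames.shape)  # (3, C, H, W), float32 in [0, 1]
```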
```diff
@@ -127,6 +168,81 @@ def decode_video_frames_torchvision(
     return closest_frames


+def decode_video_frames_torchcodec(
+    video_path: Path | str,
+    timestamps: list[float],
+    tolerance_s: float,
+    device: str = "cpu",
+    log_loaded_timestamps: bool = False,
+) -> torch.Tensor:
+    """Loads frames associated with the requested timestamps of a video using torchcodec.
+
+    Note: Setting device="cuda" outside the main process, e.g. in data loader workers, will lead to CUDA initialization errors.
+
+    Note: Video benefits from inter-frame compression. Instead of storing every frame individually,
+    the encoder stores a reference frame (or a key frame) and subsequent frames as differences relative to
+    that key frame. As a consequence, to access a requested frame, we need to load the preceding key frame,
+    and all subsequent frames until reaching the requested frame. The number of key frames in a video
+    can be adjusted during encoding to take into account decoding time and video size in bytes.
+    """
+
+    if importlib.util.find_spec("torchcodec"):
+        from torchcodec.decoders import VideoDecoder
+    else:
+        raise ImportError("torchcodec is required but not available.")
+
+    # initialize video decoder
+    decoder = VideoDecoder(video_path, device=device, seek_mode="approximate")
+    loaded_frames = []
+    loaded_ts = []
+    # get metadata for frame information
+    metadata = decoder.metadata
+    average_fps = metadata.average_fps
+
+    # convert timestamps to frame indices
+    frame_indices = [round(ts * average_fps) for ts in timestamps]
+
+    # retrieve frames based on indices
+    frames_batch = decoder.get_frames_at(indices=frame_indices)
+
+    for frame, pts in zip(frames_batch.data, frames_batch.pts_seconds, strict=False):
+        loaded_frames.append(frame)
+        loaded_ts.append(pts.item())
+        if log_loaded_timestamps:
+            logging.info(f"Frame loaded at timestamp={pts:.4f}")
+
+    query_ts = torch.tensor(timestamps)
+    loaded_ts = torch.tensor(loaded_ts)
+
+    # compute distances between each query timestamp and loaded timestamps
+    dist = torch.cdist(query_ts[:, None], loaded_ts[:, None], p=1)
+    min_, argmin_ = dist.min(1)
+
+    is_within_tol = min_ < tolerance_s
+    assert is_within_tol.all(), (
+        f"One or several query timestamps unexpectedly violate the tolerance ({min_[~is_within_tol]} > {tolerance_s=})."
+        "It means that the closest frame that can be loaded from the video is too far away in time."
+        "This might be due to synchronization issues with timestamps during data collection."
+        "To be safe, we advise to ignore this item during training."
+        f"\nqueried timestamps: {query_ts}"
+        f"\nloaded timestamps: {loaded_ts}"
+        f"\nvideo: {video_path}"
+    )
+
+    # get closest frames to the query timestamps
+    closest_frames = torch.stack([loaded_frames[idx] for idx in argmin_])
+    closest_ts = loaded_ts[argmin_]
+
+    if log_loaded_timestamps:
+        logging.info(f"{closest_ts=}")
+
+    # convert to float32 in [0,1] range (channel first)
+    closest_frames = closest_frames.type(torch.float32) / 255
+
+    assert len(timestamps) == len(closest_frames)
+    return closest_frames
+
+
 def encode_video_frames(
     imgs_dir: Path | str,
     video_path: Path | str,
```
```diff
@@ -119,9 +119,7 @@ class ACTPolicy(PreTrainedPolicy):
         batch = self.normalize_inputs(batch)
         if self.config.image_features:
             batch = dict(batch)  # shallow copy so that adding a key doesn't modify the original
-            batch["observation.images"] = torch.stack(
-                [batch[key] for key in self.config.image_features], dim=-4
-            )
+            batch["observation.images"] = [batch[key] for key in self.config.image_features]

         # If we are doing temporal ensembling, do online updates where we keep track of the number of actions
         # we are ensembling over.
@@ -149,9 +147,8 @@ class ACTPolicy(PreTrainedPolicy):
         batch = self.normalize_inputs(batch)
         if self.config.image_features:
             batch = dict(batch)  # shallow copy so that adding a key doesn't modify the original
-            batch["observation.images"] = torch.stack(
-                [batch[key] for key in self.config.image_features], dim=-4
-            )
+            batch["observation.images"] = [batch[key] for key in self.config.image_features]

         batch = self.normalize_targets(batch)
         actions_hat, (mu_hat, log_sigma_x2_hat) = self.model(batch)
@@ -413,11 +410,10 @@ class ACT(nn.Module):
                 "actions must be provided when using the variational objective in training mode."
             )

-        batch_size = (
-            batch["observation.images"]
-            if "observation.images" in batch
-            else batch["observation.environment_state"]
-        ).shape[0]
+        if "observation.images" in batch:
+            batch_size = batch["observation.images"][0].shape[0]
+        else:
+            batch_size = batch["observation.environment_state"].shape[0]

         # Prepare the latent for input to the transformer encoder.
         if self.config.use_vae and "action" in batch:
@@ -490,20 +486,21 @@ class ACT(nn.Module):
             all_cam_features = []
             all_cam_pos_embeds = []

-            for cam_index in range(batch["observation.images"].shape[-4]):
-                cam_features = self.backbone(batch["observation.images"][:, cam_index])["feature_map"]
-                # TODO(rcadene, alexander-soare): remove call to `.to` to speedup forward ; precompute and use
-                # buffer
+            # For a list of images, the H and W may vary but H*W is constant.
+            for img in batch["observation.images"]:
+                cam_features = self.backbone(img)["feature_map"]
                 cam_pos_embed = self.encoder_cam_feat_pos_embed(cam_features).to(dtype=cam_features.dtype)
-                cam_features = self.encoder_img_feat_input_proj(cam_features)  # (B, C, h, w)
+                cam_features = self.encoder_img_feat_input_proj(cam_features)
+
+                # Rearrange features to (sequence, batch, dim).
+                cam_features = einops.rearrange(cam_features, "b c h w -> (h w) b c")
+                cam_pos_embed = einops.rearrange(cam_pos_embed, "b c h w -> (h w) b c")

                 all_cam_features.append(cam_features)
                 all_cam_pos_embeds.append(cam_pos_embed)
-            # Concatenate camera observation feature maps and positional embeddings along the width dimension,
-            # and move to (sequence, batch, dim).
-            all_cam_features = torch.cat(all_cam_features, axis=-1)
-            encoder_in_tokens.extend(einops.rearrange(all_cam_features, "b c h w -> (h w) b c"))
-            all_cam_pos_embeds = torch.cat(all_cam_pos_embeds, axis=-1)
-            encoder_in_pos_embed.extend(einops.rearrange(all_cam_pos_embeds, "b c h w -> (h w) b c"))
+
+            encoder_in_tokens.extend(torch.cat(all_cam_features, axis=0))
+            encoder_in_pos_embed.extend(torch.cat(all_cam_pos_embeds, axis=0))

             # Stack all tokens along the sequence dimension.
             encoder_in_tokens = torch.stack(encoder_in_tokens, axis=0)
```
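The switch from a stacked `(B, N, C, H, W)` tensor to a plain list is what allows cameras with different resolutions: each feature map is flattened to `(h*w, b, c)` tokens on its own, and tokens are concatenated along the sequence axis. A self-contained sketch of that idea (shapes and names are made up, not lerobot code):

```python
import torch

b, c = 2, 8
feat_cam1 = torch.randn(b, c, 4, 6)  # backbone feature map, camera 1 (h*w = 24)
feat_cam2 = torch.randn(b, c, 3, 8)  # camera 2: different h,w, same h*w = 24

def to_tokens(f: torch.Tensor) -> torch.Tensor:
    # (b, c, h, w) -> (h*w, b, c): one token per spatial location
    return f.flatten(2).permute(2, 0, 1)

tokens = torch.cat([to_tokens(feat_cam1), to_tokens(feat_cam2)], dim=0)
print(tokens.shape)  # (48, 2, 8): a stack would have required identical h and w
```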
```diff
@@ -48,7 +48,7 @@ def find_cameras(raise_when_empty=True, mock=False) -> list[dict]:
     connected to the computer.
     """
     if mock:
-        import tests.mock_pyrealsense2 as rs
+        import tests.cameras.mock_pyrealsense2 as rs
     else:
         import pyrealsense2 as rs

@@ -100,7 +100,7 @@ def save_images_from_cameras(
     serial_numbers = [cam["serial_number"] for cam in camera_infos]

     if mock:
-        import tests.mock_cv2 as cv2
+        import tests.cameras.mock_cv2 as cv2
     else:
         import cv2

@@ -114,7 +114,7 @@ def save_images_from_cameras(
         camera = IntelRealSenseCamera(config)
        camera.connect()
         print(
-            f"IntelRealSenseCamera({camera.serial_number}, fps={camera.fps}, width={camera.width}, height={camera.height}, color_mode={camera.color_mode})"
+            f"IntelRealSenseCamera({camera.serial_number}, fps={camera.fps}, width={camera.capture_width}, height={camera.capture_height}, color_mode={camera.color_mode})"
         )
         cameras.append(camera)

@@ -224,9 +224,20 @@ class IntelRealSenseCamera:
             self.serial_number = self.find_serial_number_from_name(config.name)
         else:
             self.serial_number = config.serial_number

+        # Store the raw (capture) resolution from the config.
+        self.capture_width = config.width
+        self.capture_height = config.height
+
+        # If rotated by ±90, swap width and height.
+        if config.rotation in [-90, 90]:
+            self.width = config.height
+            self.height = config.width
+        else:
+            self.width = config.width
+            self.height = config.height
+
         self.fps = config.fps
-        self.width = config.width
-        self.height = config.height
         self.channels = config.channels
         self.color_mode = config.color_mode
         self.use_depth = config.use_depth
```
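The capture/display split introduced here means the sensor is configured with the raw (capture) resolution, while `self.width`/`self.height` report the post-rotation size seen by consumers. A small illustrative sketch of the swap logic (standalone, not lerobot code):

```python
def resolve_dims(width: int, height: int, rotation: int | None):
    capture = (width, height)           # what the sensor is asked to stream
    if rotation in (-90, 90):
        reported = (height, width)      # frames are rotated before being returned
    else:
        reported = (width, height)
    return capture, reported

print(resolve_dims(640, 480, 90))   # ((640, 480), (480, 640))
print(resolve_dims(640, 480, None)) # ((640, 480), (640, 480))
```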
```diff
@@ -242,11 +253,10 @@ class IntelRealSenseCamera:
         self.logs = {}

         if self.mock:
-            import tests.mock_cv2 as cv2
+            import tests.cameras.mock_cv2 as cv2
         else:
             import cv2

-        # TODO(alibets): Do we keep original width/height or do we define them after rotation?
         self.rotation = None
         if config.rotation == -90:
             self.rotation = cv2.ROTATE_90_COUNTERCLOCKWISE
@@ -277,22 +287,26 @@ class IntelRealSenseCamera:
             )

         if self.mock:
-            import tests.mock_pyrealsense2 as rs
+            import tests.cameras.mock_pyrealsense2 as rs
         else:
             import pyrealsense2 as rs

         config = rs.config()
         config.enable_device(str(self.serial_number))

-        if self.fps and self.width and self.height:
+        if self.fps and self.capture_width and self.capture_height:
             # TODO(rcadene): can we set rgb8 directly?
-            config.enable_stream(rs.stream.color, self.width, self.height, rs.format.rgb8, self.fps)
+            config.enable_stream(
+                rs.stream.color, self.capture_width, self.capture_height, rs.format.rgb8, self.fps
+            )
         else:
             config.enable_stream(rs.stream.color)

         if self.use_depth:
-            if self.fps and self.width and self.height:
-                config.enable_stream(rs.stream.depth, self.width, self.height, rs.format.z16, self.fps)
+            if self.fps and self.capture_width and self.capture_height:
+                config.enable_stream(
+                    rs.stream.depth, self.capture_width, self.capture_height, rs.format.z16, self.fps
+                )
             else:
                 config.enable_stream(rs.stream.depth)

@@ -330,18 +344,18 @@ class IntelRealSenseCamera:
             raise OSError(
                 f"Can't set {self.fps=} for IntelRealSenseCamera({self.serial_number}). Actual value is {actual_fps}."
             )
-        if self.width is not None and self.width != actual_width:
+        if self.capture_width is not None and self.capture_width != actual_width:
             raise OSError(
-                f"Can't set {self.width=} for IntelRealSenseCamera({self.serial_number}). Actual value is {actual_width}."
+                f"Can't set {self.capture_width=} for IntelRealSenseCamera({self.serial_number}). Actual value is {actual_width}."
             )
-        if self.height is not None and self.height != actual_height:
+        if self.capture_height is not None and self.capture_height != actual_height:
             raise OSError(
-                f"Can't set {self.height=} for IntelRealSenseCamera({self.serial_number}). Actual value is {actual_height}."
+                f"Can't set {self.capture_height=} for IntelRealSenseCamera({self.serial_number}). Actual value is {actual_height}."
             )

         self.fps = round(actual_fps)
-        self.width = round(actual_width)
-        self.height = round(actual_height)
+        self.capture_width = round(actual_width)
+        self.capture_height = round(actual_height)

         self.is_connected = True

@@ -361,7 +375,7 @@ class IntelRealSenseCamera:
             )

         if self.mock:
-            import tests.mock_cv2 as cv2
+            import tests.cameras.mock_cv2 as cv2
         else:
             import cv2

@@ -387,7 +401,7 @@ class IntelRealSenseCamera:
             color_image = cv2.cvtColor(color_image, cv2.COLOR_RGB2BGR)

         h, w, _ = color_image.shape
-        if h != self.height or w != self.width:
+        if h != self.capture_height or w != self.capture_width:
             raise OSError(
                 f"Can't capture color image with expected height and width ({self.height} x {self.width}). ({h} x {w}) returned instead."
             )
@@ -409,7 +423,7 @@ class IntelRealSenseCamera:
         depth_map = np.asanyarray(depth_frame.get_data())

         h, w = depth_map.shape
-        if h != self.height or w != self.width:
+        if h != self.capture_height or w != self.capture_width:
             raise OSError(
                 f"Can't capture depth map with expected height and width ({self.height} x {self.width}). ({h} x {w}) returned instead."
             )
```
```diff
@@ -80,7 +80,7 @@ def _find_cameras(
     possible_camera_ids: list[int | str], raise_when_empty=False, mock=False
 ) -> list[int | str]:
     if mock:
-        import tests.mock_cv2 as cv2
+        import tests.cameras.mock_cv2 as cv2
     else:
         import cv2

@@ -144,8 +144,8 @@ def save_images_from_cameras(
         camera = OpenCVCamera(config)
         camera.connect()
         print(
-            f"OpenCVCamera({camera.camera_index}, fps={camera.fps}, width={camera.width}, "
-            f"height={camera.height}, color_mode={camera.color_mode})"
+            f"OpenCVCamera({camera.camera_index}, fps={camera.fps}, width={camera.capture_width}, "
+            f"height={camera.capture_height}, color_mode={camera.color_mode})"
         )
         cameras.append(camera)

@@ -244,9 +244,19 @@ class OpenCVCamera:
         else:
             raise ValueError(f"Please check the provided camera_index: {self.camera_index}")

+        # Store the raw (capture) resolution from the config.
+        self.capture_width = config.width
+        self.capture_height = config.height
+
+        # If rotated by ±90, swap width and height.
+        if config.rotation in [-90, 90]:
+            self.width = config.height
+            self.height = config.width
+        else:
+            self.width = config.width
+            self.height = config.height
+
         self.fps = config.fps
-        self.width = config.width
-        self.height = config.height
         self.channels = config.channels
         self.color_mode = config.color_mode
         self.mock = config.mock
@@ -259,11 +269,10 @@ class OpenCVCamera:
         self.logs = {}

         if self.mock:
-            import tests.mock_cv2 as cv2
+            import tests.cameras.mock_cv2 as cv2
         else:
             import cv2

-        # TODO(aliberts): Do we keep original width/height or do we define them after rotation?
         self.rotation = None
         if config.rotation == -90:
             self.rotation = cv2.ROTATE_90_COUNTERCLOCKWISE
@@ -277,7 +286,7 @@ class OpenCVCamera:
             raise RobotDeviceAlreadyConnectedError(f"OpenCVCamera({self.camera_index}) is already connected.")

         if self.mock:
-            import tests.mock_cv2 as cv2
+            import tests.cameras.mock_cv2 as cv2
         else:
             import cv2

@@ -325,10 +334,10 @@ class OpenCVCamera:

         if self.fps is not None:
             self.camera.set(cv2.CAP_PROP_FPS, self.fps)
-        if self.width is not None:
-            self.camera.set(cv2.CAP_PROP_FRAME_WIDTH, self.width)
-        if self.height is not None:
-            self.camera.set(cv2.CAP_PROP_FRAME_HEIGHT, self.height)
+        if self.capture_width is not None:
+            self.camera.set(cv2.CAP_PROP_FRAME_WIDTH, self.capture_width)
+        if self.capture_height is not None:
+            self.camera.set(cv2.CAP_PROP_FRAME_HEIGHT, self.capture_height)

         actual_fps = self.camera.get(cv2.CAP_PROP_FPS)
         actual_width = self.camera.get(cv2.CAP_PROP_FRAME_WIDTH)
@@ -340,19 +349,22 @@ class OpenCVCamera:
             raise OSError(
                 f"Can't set {self.fps=} for OpenCVCamera({self.camera_index}). Actual value is {actual_fps}."
             )
-        if self.width is not None and not math.isclose(self.width, actual_width, rel_tol=1e-3):
+        if self.capture_width is not None and not math.isclose(
+            self.capture_width, actual_width, rel_tol=1e-3
+        ):
             raise OSError(
-                f"Can't set {self.width=} for OpenCVCamera({self.camera_index}). Actual value is {actual_width}."
+                f"Can't set {self.capture_width=} for OpenCVCamera({self.camera_index}). Actual value is {actual_width}."
             )
-        if self.height is not None and not math.isclose(self.height, actual_height, rel_tol=1e-3):
+        if self.capture_height is not None and not math.isclose(
+            self.capture_height, actual_height, rel_tol=1e-3
+        ):
             raise OSError(
-                f"Can't set {self.height=} for OpenCVCamera({self.camera_index}). Actual value is {actual_height}."
+                f"Can't set {self.capture_height=} for OpenCVCamera({self.camera_index}). Actual value is {actual_height}."
             )

         self.fps = round(actual_fps)
-        self.width = round(actual_width)
-        self.height = round(actual_height)
+        self.capture_width = round(actual_width)
+        self.capture_height = round(actual_height)

         self.is_connected = True

     def read(self, temporary_color_mode: str | None = None) -> np.ndarray:
@@ -386,14 +398,14 @@ class OpenCVCamera:
         # so we convert the image color from BGR to RGB.
         if requested_color_mode == "rgb":
             if self.mock:
-                import tests.mock_cv2 as cv2
+                import tests.cameras.mock_cv2 as cv2
             else:
                 import cv2

             color_image = cv2.cvtColor(color_image, cv2.COLOR_BGR2RGB)

         h, w, _ = color_image.shape
-        if h != self.height or w != self.width:
+        if h != self.capture_height or w != self.capture_width:
             raise OSError(
                 f"Can't capture color image with expected height and width ({self.height} x {self.width}). ({h} x {w}) returned instead."
             )
```
```diff
@@ -332,7 +332,7 @@ class DynamixelMotorsBus:
             )

         if self.mock:
-            import tests.mock_dynamixel_sdk as dxl
+            import tests.motors.mock_dynamixel_sdk as dxl
         else:
             import dynamixel_sdk as dxl

@@ -356,7 +356,7 @@ class DynamixelMotorsBus:

     def reconnect(self):
         if self.mock:
-            import tests.mock_dynamixel_sdk as dxl
+            import tests.motors.mock_dynamixel_sdk as dxl
         else:
             import dynamixel_sdk as dxl

@@ -646,7 +646,7 @@ class DynamixelMotorsBus:

     def read_with_motor_ids(self, motor_models, motor_ids, data_name, num_retry=NUM_READ_RETRY):
         if self.mock:
-            import tests.mock_dynamixel_sdk as dxl
+            import tests.motors.mock_dynamixel_sdk as dxl
         else:
             import dynamixel_sdk as dxl

@@ -691,7 +691,7 @@ class DynamixelMotorsBus:
         start_time = time.perf_counter()

         if self.mock:
-            import tests.mock_dynamixel_sdk as dxl
+            import tests.motors.mock_dynamixel_sdk as dxl
         else:
             import dynamixel_sdk as dxl

@@ -757,7 +757,7 @@ class DynamixelMotorsBus:

     def write_with_motor_ids(self, motor_models, motor_ids, data_name, values, num_retry=NUM_WRITE_RETRY):
         if self.mock:
-            import tests.mock_dynamixel_sdk as dxl
+            import tests.motors.mock_dynamixel_sdk as dxl
         else:
             import dynamixel_sdk as dxl

@@ -793,7 +793,7 @@ class DynamixelMotorsBus:
         start_time = time.perf_counter()

         if self.mock:
-            import tests.mock_dynamixel_sdk as dxl
+            import tests.motors.mock_dynamixel_sdk as dxl
         else:
             import dynamixel_sdk as dxl
```
```diff
@@ -313,7 +313,7 @@ class FeetechMotorsBus:
             )

         if self.mock:
-            import tests.mock_scservo_sdk as scs
+            import tests.motors.mock_scservo_sdk as scs
         else:
             import scservo_sdk as scs

@@ -337,7 +337,7 @@ class FeetechMotorsBus:

     def reconnect(self):
         if self.mock:
-            import tests.mock_scservo_sdk as scs
+            import tests.motors.mock_scservo_sdk as scs
         else:
             import scservo_sdk as scs

@@ -664,7 +664,7 @@ class FeetechMotorsBus:

     def read_with_motor_ids(self, motor_models, motor_ids, data_name, num_retry=NUM_READ_RETRY):
         if self.mock:
-            import tests.mock_scservo_sdk as scs
+            import tests.motors.mock_scservo_sdk as scs
         else:
             import scservo_sdk as scs

@@ -702,7 +702,7 @@ class FeetechMotorsBus:

     def read(self, data_name, motor_names: str | list[str] | None = None):
         if self.mock:
-            import tests.mock_scservo_sdk as scs
+            import tests.motors.mock_scservo_sdk as scs
         else:
             import scservo_sdk as scs

@@ -782,7 +782,7 @@ class FeetechMotorsBus:

     def write_with_motor_ids(self, motor_models, motor_ids, data_name, values, num_retry=NUM_WRITE_RETRY):
         if self.mock:
-            import tests.mock_scservo_sdk as scs
+            import tests.motors.mock_scservo_sdk as scs
         else:
             import scservo_sdk as scs

@@ -818,7 +818,7 @@ class FeetechMotorsBus:
         start_time = time.perf_counter()

         if self.mock:
-            import tests.mock_scservo_sdk as scs
+            import tests.motors.mock_scservo_sdk as scs
         else:
             import scservo_sdk as scs
```
```diff
@@ -69,7 +69,13 @@ class WandBLogger:
         os.environ["WANDB_SILENT"] = "True"
         import wandb

-        wandb_run_id = get_wandb_run_id_from_filesystem(self.log_dir) if cfg.resume else None
+        wandb_run_id = (
+            cfg.wandb.run_id
+            if cfg.wandb.run_id
+            else get_wandb_run_id_from_filesystem(self.log_dir)
+            if cfg.resume
+            else None
+        )
         wandb.init(
             id=wandb_run_id,
             project=self.cfg.project,
```
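The nested conditional gives an explicit `cfg.wandb.run_id` priority over the filesystem lookup. A sketch of the precedence it implements (illustrative, not lerobot code):

```python
def resolve_run_id(explicit_id: str | None, resume: bool, fs_lookup) -> str | None:
    # 1) an explicitly configured run id always wins
    if explicit_id:
        return explicit_id
    # 2) on resume, recover the id of the previous run from disk
    if resume:
        return fs_lookup()
    # 3) otherwise start a fresh run
    return None

assert resolve_run_id("abc123", resume=True, fs_lookup=lambda: "old") == "abc123"
assert resolve_run_id(None, resume=True, fs_lookup=lambda: "old") == "old"
assert resolve_run_id(None, resume=False, fs_lookup=lambda: "old") is None
```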
```diff
@@ -20,6 +20,7 @@ from lerobot.common import (
     policies,  # noqa: F401
 )
 from lerobot.common.datasets.transforms import ImageTransformsConfig
+from lerobot.common.datasets.video_utils import get_safe_default_codec


 @dataclass
@@ -35,7 +36,7 @@ class DatasetConfig:
     image_transforms: ImageTransformsConfig = field(default_factory=ImageTransformsConfig)
     revision: str | None = None
     use_imagenet_stats: bool = True
-    video_backend: str = "pyav"
+    video_backend: str = field(default_factory=get_safe_default_codec)


 @dataclass
@@ -46,6 +47,7 @@ class WandBConfig:
     project: str = "lerobot"
     entity: str | None = None
     notes: str | None = None
+    run_id: str | None = None


 @dataclass
```
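Using `field(default_factory=...)` rather than calling the probe at class-definition time matters: the torchcodec availability check runs when a config is instantiated, not once at import. A minimal sketch of the pattern (names are stand-ins, not lerobot code):

```python
from dataclasses import dataclass, field

def probe_backend() -> str:
    # stand-in for get_safe_default_codec(): returns "torchcodec" or "pyav"
    return "pyav"

@dataclass
class DatasetConfigSketch:
    # the factory is invoked per instantiation, so the environment probe
    # (and its fallback warning) happens when the config is actually built
    video_backend: str = field(default_factory=probe_backend)

print(DatasetConfigSketch().video_backend)
```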
@@ -11,7 +11,9 @@
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
import importlib
|
||||
import inspect
|
||||
import pkgutil
|
||||
import sys
|
||||
from argparse import ArgumentError
|
||||
from functools import wraps
|
||||
@@ -23,6 +25,7 @@ import draccus
|
||||
from lerobot.common.utils.utils import has_method
|
||||
|
||||
PATH_KEY = "path"
|
||||
PLUGIN_DISCOVERY_SUFFIX = "discover_packages_path"
|
||||
draccus.set_config_type("json")
|
||||
|
||||
|
||||
@@ -58,6 +61,86 @@ def parse_arg(arg_name: str, args: Sequence[str] | None = None) -> str | None:
|
||||
return None
|
||||
|
||||
|
||||
def parse_plugin_args(plugin_arg_suffix: str, args: Sequence[str]) -> dict:
|
||||
"""Parse plugin-related arguments from command-line arguments.
|
||||
|
||||
This function extracts arguments from command-line arguments that match a specified suffix pattern.
|
||||
It processes arguments in the format '--key=value' and returns them as a dictionary.
|
||||
|
||||
Args:
|
||||
plugin_arg_suffix (str): The suffix to identify plugin-related arguments.
|
||||
cli_args (Sequence[str]): A sequence of command-line arguments to parse.
|
||||
|
||||
Returns:
|
||||
dict: A dictionary containing the parsed plugin arguments where:
|
||||
- Keys are the argument names (with '--' prefix removed if present)
|
||||
- Values are the corresponding argument values
|
||||
|
||||
Example:
|
||||
>>> args = ['--env.discover_packages_path=my_package',
|
||||
... '--other_arg=value']
|
||||
>>> parse_plugin_args('discover_packages_path', args)
|
||||
{'env.discover_packages_path': 'my_package'}
|
||||
"""
|
||||
plugin_args = {}
|
||||
for arg in args:
|
||||
if "=" in arg and plugin_arg_suffix in arg:
|
||||
key, value = arg.split("=", 1)
|
||||
# Remove leading '--' if present
|
||||
if key.startswith("--"):
|
||||
key = key[2:]
|
||||
plugin_args[key] = value
|
||||
return plugin_args
|
||||
|
||||
|
||||
class PluginLoadError(Exception):
    """Raised when a plugin fails to load."""


def load_plugin(plugin_path: str) -> None:
    """Load and initialize a plugin from a given Python package path.

    This function attempts to load a plugin by importing its package and any submodules.
    Plugin registration is expected to happen during package initialization, i.e. when
    the package is imported, the gym environment should be registered and the config classes
    registered with their parents using the `register_subclass` decorator.

    Args:
        plugin_path (str): The Python package path to the plugin (e.g. "mypackage.plugins.myplugin")

    Raises:
        PluginLoadError: If the plugin cannot be loaded due to import errors or if the package path is invalid.

    Examples:
        >>> load_plugin("external_plugin.core")  # Loads plugin from external package

    Notes:
        - The plugin package should handle its own registration during import
        - All submodules in the plugin package will be imported
        - Implementation follows the plugin discovery pattern from Python packaging guidelines

    See Also:
        https://packaging.python.org/en/latest/guides/creating-and-discovering-plugins/
    """
    try:
        package_module = importlib.import_module(plugin_path, __package__)
    except (ImportError, ModuleNotFoundError) as e:
        raise PluginLoadError(
            f"Failed to load plugin '{plugin_path}'. Verify the path and installation: {str(e)}"
        ) from e

    def iter_namespace(ns_pkg):
        return pkgutil.iter_modules(ns_pkg.__path__, ns_pkg.__name__ + ".")

    try:
        for _finder, pkg_name, _ispkg in iter_namespace(package_module):
            importlib.import_module(pkg_name)
    except ImportError as e:
        raise PluginLoadError(
            f"Failed to load plugin '{plugin_path}'. Verify the path and installation: {str(e)}"
        ) from e


def get_path_arg(field_name: str, args: Sequence[str] | None = None) -> str | None:
    return parse_arg(f"{field_name}.{PATH_KEY}", args)

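For illustration, a plugin package that `load_plugin` can consume might look like the sketch below (package, class, and field names are hypothetical; the new test file further down builds essentially the same structure on the fly):

```python
# my_plugin/__init__.py -- hypothetical plugin package
from dataclasses import dataclass

from lerobot.common.envs.configs import EnvConfig


# Registration happens at import time, which is what load_plugin relies on:
# once the package (and its submodules) are imported, "my_env" becomes a
# valid choice for the CLI's --env.type argument.
@EnvConfig.register_subclass("my_env")
@dataclass
class MyEnvConfig:
    task: str = "reach"
    fps: int = 30
```

Assuming the package is importable, it could then be pulled in with `--env.discover_packages_path=my_plugin`.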
@@ -105,10 +188,13 @@ def filter_path_args(fields_to_filter: str | list[str], args: Sequence[str] | No

def wrap(config_path: Path | None = None):
    """
    HACK: Similar to draccus.wrap but does two additional things:
    HACK: Similar to draccus.wrap but does three additional things:
    - Will remove '.path' arguments from CLI in order to process them later on.
    - If a 'config_path' is passed and the main config class has a 'from_pretrained' method, will
      initialize it from there to allow fetching configs from the hub directly.
    - Will load plugins specified in the CLI arguments. These plugins will typically register
      their own subclasses of config classes, so that draccus can find the right class to instantiate
      from the CLI '.type' arguments.
    """

    def wrapper_outer(fn):
@@ -121,6 +207,14 @@ def wrap(config_path: Path | None = None):
                args = args[1:]
            else:
                cli_args = sys.argv[1:]
                plugin_args = parse_plugin_args(PLUGIN_DISCOVERY_SUFFIX, cli_args)
                for plugin_cli_arg, plugin_path in plugin_args.items():
                    try:
                        load_plugin(plugin_path)
                    except PluginLoadError as e:
                        # add the relevant CLI arg to the error message
                        raise PluginLoadError(f"{e}\nFailed plugin CLI Arg: {plugin_cli_arg}") from e
                    cli_args = filter_arg(plugin_cli_arg, cli_args)
                config_path_cli = parse_arg("config_path", cli_args)
            if has_method(argtype, "__get_path_fields__"):
                path_fields = argtype.__get_path_fields__()

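Putting the pieces together, an entry point decorated with `wrap()` loads any `--*.discover_packages_path` plugins before draccus parses the remaining CLI arguments. A minimal sketch, with illustrative script and config names:

```python
from dataclasses import dataclass

from lerobot.common.envs.configs import EnvConfig
from lerobot.configs.parser import wrap


@dataclass
class TrainConfig:
    env: EnvConfig


@wrap()
def main(cfg: TrainConfig):
    # cfg.env resolves to the plugin's registered subclass, e.g. via
    # `python train.py --env.discover_packages_path=my_plugin --env.type=my_env`
    print(type(cfg.env).__name__)


if __name__ == "__main__":
    main()
```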
@@ -79,7 +79,9 @@ class TrainPipelineConfig(HubMixin):
        # The entire train config is already loaded, we just need to get the checkpoint dir
        config_path = parser.parse_arg("config_path")
        if not config_path:
            raise ValueError("A config_path is expected when resuming a run.")
            raise ValueError(
                f"A config_path is expected when resuming a run. Please specify the path to {TRAIN_CONFIG_NAME}."
            )
        if not Path(config_path).resolve().exists():
            raise NotADirectoryError(
                f"{config_path=} is expected to be a local path. "

@@ -265,13 +265,25 @@ def main():
        ),
    )

    parser.add_argument(
        "--tolerance-s",
        type=float,
        default=1e-4,
        help=(
            "Tolerance in seconds used to ensure data timestamps respect the dataset fps value. "
            "This is the argument passed to the constructor of LeRobotDataset and maps to its tolerance_s constructor argument. "
            "If not given, defaults to 1e-4."
        ),
    )

    args = parser.parse_args()
    kwargs = vars(args)
    repo_id = kwargs.pop("repo_id")
    root = kwargs.pop("root")
    tolerance_s = kwargs.pop("tolerance_s")

    logging.info("Loading dataset")
    dataset = LeRobotDataset(repo_id, root=root)
    dataset = LeRobotDataset(repo_id, root=root, tolerance_s=tolerance_s)

    visualize_dataset(dataset, **vars(args))

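The flag maps one-to-one onto the dataset constructor; the programmatic equivalent is simply (repo id illustrative):

```python
from lerobot.common.datasets.lerobot_dataset import LeRobotDataset

# A looser tolerance can unblock datasets whose timestamps drift slightly
# from the nominal fps; 1e-4 s stays the default used by both scripts.
dataset = LeRobotDataset("lerobot/pusht", tolerance_s=1e-4)
```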
@@ -446,15 +446,31 @@ def main():
        help="Delete the output directory if it exists already.",
    )

    parser.add_argument(
        "--tolerance-s",
        type=float,
        default=1e-4,
        help=(
            "Tolerance in seconds used to ensure data timestamps respect the dataset fps value. "
            "This is the argument passed to the constructor of LeRobotDataset and maps to its tolerance_s constructor argument. "
            "If not given, defaults to 1e-4."
        ),
    )

    args = parser.parse_args()
    kwargs = vars(args)
    repo_id = kwargs.pop("repo_id")
    load_from_hf_hub = kwargs.pop("load_from_hf_hub")
    root = kwargs.pop("root")
    tolerance_s = kwargs.pop("tolerance_s")

    dataset = None
    if repo_id:
        dataset = LeRobotDataset(repo_id, root=root) if not load_from_hf_hub else get_dataset_info(repo_id)
        dataset = (
            LeRobotDataset(repo_id, root=root, tolerance_s=tolerance_s)
            if not load_from_hf_hub
            else get_dataset_info(repo_id)
        )

    visualize_dataset_html(dataset, **vars(args))


@@ -56,7 +56,6 @@ dependencies = [
    "gymnasium==0.29.1", # TODO(rcadene, aliberts): Make gym 1.0.0 work
    "h5py>=3.10.0",
    "huggingface-hub[hf-transfer,cli]>=0.27.1 ; python_version < '4.0'",
    "hydra-core>=1.3.2",
    "imageio[ffmpeg]>=2.34.0",
    "jsonlines>=4.0.0",
    "numba>=0.59.0",
@@ -70,6 +69,7 @@ dependencies = [
    "rerun-sdk>=0.21.0",
    "termcolor>=2.4.0",
    "torch>=2.2.1",
    "torchcodec>=0.2.1 ; sys_platform != 'linux' or (sys_platform == 'linux' and platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l')",
    "torchvision>=0.21.0",
    "wandb>=0.16.3",
    "zarr>=2.17.0",
@@ -103,30 +103,7 @@ requires-poetry = ">=2.1"
[tool.ruff]
line-length = 110
target-version = "py310"
exclude = [
    "tests/data",
    ".bzr",
    ".direnv",
    ".eggs",
    ".git",
    ".git-rewrite",
    ".hg",
    ".mypy_cache",
    ".nox",
    ".pants.d",
    ".pytype",
    ".ruff_cache",
    ".svn",
    ".tox",
    ".venv",
    "__pypackages__",
    "_build",
    "buck-out",
    "build",
    "dist",
    "node_modules",
    "venv",
]
exclude = ["tests/artifacts/**/*.safetensors"]

[tool.ruff.lint]
select = ["E4", "E7", "E9", "F", "I", "N", "B", "C4", "SIM"]

@@ -23,7 +23,7 @@ If you know that your change will break backward compatibility, you should write
doesn't need to be merged into the `main` branch. Then you need to run this script and update the test artifacts.

Example usage:
    `python tests/scripts/save_dataset_to_safetensors.py`
    `python tests/artifacts/datasets/save_dataset_to_safetensors.py`
"""

import shutil
@@ -88,4 +88,4 @@ if __name__ == "__main__":
        "lerobot/nyu_franka_play_dataset",
        "lerobot/cmu_stretch",
    ]:
        save_dataset_to_safetensors("tests/data/save_dataset_to_safetensors", repo_id=dataset)
        save_dataset_to_safetensors("tests/artifacts/datasets", repo_id=dataset)
@@ -27,7 +27,7 @@ from lerobot.common.datasets.transforms import (
)
from lerobot.common.utils.random_utils import seeded_context

ARTIFACT_DIR = Path("tests/data/save_image_transforms_to_safetensors")
ARTIFACT_DIR = Path("tests/artifacts/image_transforms")
DATASET_REPO_ID = "lerobot/aloha_mobile_shrimp"


@@ -141,5 +141,5 @@ if __name__ == "__main__":
        raise RuntimeError("No policies were provided!")
    for ds_repo_id, policy, policy_kwargs, file_name_extra in artifacts_cfg:
        ds_name = ds_repo_id.split("/")[-1]
        output_dir = Path("tests/data/save_policy_to_safetensors") / f"{ds_name}_{policy}_{file_name_extra}"
        output_dir = Path("tests/artifacts/policies") / f"{ds_name}_{policy}_{file_name_extra}"
        save_policy_to_safetensors(output_dir, ds_repo_id, policy, policy_kwargs)
@@ -85,8 +85,8 @@ def test_camera(request, camera_type, mock):
    camera.connect()
    assert camera.is_connected
    assert camera.fps is not None
    assert camera.width is not None
    assert camera.height is not None
    assert camera.capture_width is not None
    assert camera.capture_height is not None

    # Test connecting twice raises an error
    with pytest.raises(RobotDeviceAlreadyConnectedError):
@@ -146,7 +146,7 @@ def test_camera(request, camera_type, mock):
        camera.connect()

    if mock:
        import tests.mock_cv2 as cv2
        import tests.cameras.mock_cv2 as cv2
    else:
        import cv2

@@ -204,3 +204,49 @@ def test_save_images_from_cameras(tmp_path, request, camera_type, mock):

    # Small `record_time_s` to speed up unit tests
    save_images_from_cameras(tmp_path, record_time_s=0.02, mock=mock)


@pytest.mark.parametrize("camera_type, mock", TEST_CAMERA_TYPES)
@require_camera
def test_camera_rotation(request, camera_type, mock):
    config_kwargs = {"camera_type": camera_type, "mock": mock, "width": 640, "height": 480, "fps": 30}

    # No rotation.
    camera = make_camera(**config_kwargs, rotation=None)
    camera.connect()
    assert camera.capture_width == 640
    assert camera.capture_height == 480
    assert camera.width == 640
    assert camera.height == 480
    no_rot_img = camera.read()
    h, w, c = no_rot_img.shape
    assert h == 480 and w == 640 and c == 3
    camera.disconnect()

    # Rotation = 90 (clockwise).
    camera = make_camera(**config_kwargs, rotation=90)
    camera.connect()
    # With a 90° rotation, we expect the metadata dimensions to be swapped.
    assert camera.capture_width == 640
    assert camera.capture_height == 480
    assert camera.width == 480
    assert camera.height == 640
    import cv2

    assert camera.rotation == cv2.ROTATE_90_CLOCKWISE
    rot_img = camera.read()
    h, w, c = rot_img.shape
    assert h == 640 and w == 480 and c == 3
    camera.disconnect()

    # Rotation = 180.
    camera = make_camera(**config_kwargs, rotation=180)
    camera.connect()
    assert camera.capture_width == 640
    assert camera.capture_height == 480
    assert camera.width == 640
    assert camera.height == 480
    no_rot_img = camera.read()
    h, w, c = no_rot_img.shape
    assert h == 480 and w == 640 and c == 3
    camera.disconnect()
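The width/height swap asserted above mirrors what `cv2.rotate` does to the underlying array; a quick standalone check of the shape convention (assuming OpenCV and NumPy are installed):

```python
import cv2
import numpy as np

frame = np.zeros((480, 640, 3), dtype=np.uint8)  # h=480, w=640
rotated = cv2.rotate(frame, cv2.ROTATE_90_CLOCKWISE)
assert rotated.shape == (640, 480, 3)  # h and w swap; channels are untouched
```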
89
tests/configs/test_plugin_loading.py
Normal file
@@ -0,0 +1,89 @@
import sys
from dataclasses import dataclass
from pathlib import Path
from typing import Generator

import pytest

from lerobot.common.envs.configs import EnvConfig
from lerobot.configs.parser import PluginLoadError, load_plugin, parse_plugin_args, wrap


def create_plugin_code(*, base_class: str = "EnvConfig", plugin_name: str = "test_env") -> str:
    """Creates a dummy plugin module that implements its own EnvConfig subclass."""
    return f"""
from dataclasses import dataclass
from lerobot.common.envs.configs import {base_class}

@{base_class}.register_subclass("{plugin_name}")
@dataclass
class TestPluginConfig:
    value: int = 42
"""


@pytest.fixture
def plugin_dir(tmp_path: Path) -> Generator[Path, None, None]:
    """Creates a temporary plugin package structure."""
    plugin_pkg = tmp_path / "test_plugin"
    plugin_pkg.mkdir()
    (plugin_pkg / "__init__.py").touch()

    with open(plugin_pkg / "my_plugin.py", "w") as f:
        f.write(create_plugin_code())

    # Add tmp_path to Python path so we can import from it
    sys.path.insert(0, str(tmp_path))
    yield plugin_pkg
    sys.path.pop(0)


def test_parse_plugin_args():
    cli_args = [
        "--env.type=test",
        "--model.discover_packages_path=some.package",
        "--env.discover_packages_path=other.package",
    ]
    plugin_args = parse_plugin_args("discover_packages_path", cli_args)
    assert plugin_args == {
        "model.discover_packages_path": "some.package",
        "env.discover_packages_path": "other.package",
    }


def test_load_plugin_success(plugin_dir: Path):
    # Import should work and register the plugin with the real EnvConfig
    load_plugin("test_plugin")

    assert "test_env" in EnvConfig.get_known_choices()
    plugin_cls = EnvConfig.get_choice_class("test_env")
    plugin_instance = plugin_cls()
    assert plugin_instance.value == 42


def test_load_plugin_failure():
    with pytest.raises(PluginLoadError) as exc_info:
        load_plugin("nonexistent_plugin")
    assert "Failed to load plugin 'nonexistent_plugin'" in str(exc_info.value)


def test_wrap_with_plugin(plugin_dir: Path):
    @dataclass
    class Config:
        env: EnvConfig

    @wrap()
    def dummy_func(cfg: Config):
        return cfg

    # Test loading plugin via CLI args
    sys.argv = [
        "dummy_script.py",
        "--env.discover_packages_path=test_plugin",
        "--env.type=test_env",
    ]

    cfg = dummy_func()
    assert isinstance(cfg, Config)
    assert isinstance(cfg.env, EnvConfig.get_choice_class("test_env"))
    assert cfg.env.value == 42

@@ -473,12 +473,12 @@ def test_flatten_unflatten_dict():
)
@require_x86_64_kernel
def test_backward_compatibility(repo_id):
    """The artifacts for this test have been generated by `tests/scripts/save_dataset_to_safetensors.py`."""
    """The artifacts for this test have been generated by `tests/artifacts/datasets/save_dataset_to_safetensors.py`."""

    # TODO(rcadene, aliberts): remove dataset download
    dataset = LeRobotDataset(repo_id, episodes=[0])

    test_dir = Path("tests/data/save_dataset_to_safetensors") / repo_id
    test_dir = Path("tests/artifacts/datasets") / repo_id

    def load_and_compare(i):
        new_frame = dataset[i]  # noqa: B023
@@ -33,7 +33,7 @@ from lerobot.scripts.visualize_image_transforms import (
    save_all_transforms,
    save_each_transform,
)
from tests.scripts.save_image_transforms_to_safetensors import ARTIFACT_DIR
from tests.artifacts.image_transforms.save_image_transforms_to_safetensors import ARTIFACT_DIR
from tests.utils import require_x86_64_kernel


@@ -1,3 +1,5 @@
#!/usr/bin/env python

# Copyright 2024 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
@@ -11,13 +13,32 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import torch
from datasets import Dataset
from huggingface_hub import DatasetCard

from lerobot.common.datasets.push_dataset_to_hub.utils import calculate_episode_data_index
from lerobot.common.datasets.utils import (
    hf_transform_to_torch,
)
from lerobot.common.datasets.utils import create_lerobot_dataset_card, hf_transform_to_torch


def test_default_parameters():
    card = create_lerobot_dataset_card()
    assert isinstance(card, DatasetCard)
    assert card.data.tags == ["LeRobot"]
    assert card.data.task_categories == ["robotics"]
    assert card.data.configs == [
        {
            "config_name": "default",
            "data_files": "data/*/*.parquet",
        }
    ]


def test_with_tags():
    tags = ["tag1", "tag2"]
    card = create_lerobot_dataset_card(tags=tags)
    assert card.data.tags == ["LeRobot", "tag1", "tag2"]


def test_calculate_episode_data_index():
@@ -23,8 +23,7 @@ from gymnasium.utils.env_checker import check_env
import lerobot
from lerobot.common.envs.factory import make_env, make_env_config
from lerobot.common.envs.utils import preprocess_observation

from .utils import require_env
from tests.utils import require_env

OBS_TYPES = ["state", "pixels", "pixels_agent_pos"]

@@ -1,38 +0,0 @@
#!/usr/bin/env python

# Copyright 2024 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from huggingface_hub import DatasetCard

from lerobot.common.datasets.utils import create_lerobot_dataset_card


def test_default_parameters():
    card = create_lerobot_dataset_card()
    assert isinstance(card, DatasetCard)
    assert card.data.tags == ["LeRobot"]
    assert card.data.task_categories == ["robotics"]
    assert card.data.configs == [
        {
            "config_name": "default",
            "data_files": "data/*/*.parquet",
        }
    ]


def test_with_tags():
    tags = ["tag1", "tag2"]
    card = create_lerobot_dataset_card(tags=tags)
    assert card.data.tags == ["LeRobot", "tag1", "tag2"]

@@ -40,7 +40,7 @@ from lerobot.common.utils.random_utils import seeded_context
from lerobot.configs.default import DatasetConfig
from lerobot.configs.train import TrainPipelineConfig
from lerobot.configs.types import FeatureType, NormalizationMode, PolicyFeature
from tests.scripts.save_policy_to_safetensors import get_policy_stats
from tests.artifacts.policies.save_policy_to_safetensors import get_policy_stats
from tests.utils import DEVICE, require_cpu, require_env, require_x86_64_kernel


@@ -368,7 +368,7 @@ def test_normalize(insert_temporal_dim):
        # was changed to true. For some reason, tests would pass locally, but not in CI. So here we override
        # to test with `policy.use_mpc=false`.
        ("lerobot/xarm_lift_medium", "tdmpc", {"use_mpc": False}, "use_policy"),
        ("lerobot/xarm_lift_medium", "tdmpc", {"use_mpc": True}, "use_mpc"),
        # ("lerobot/xarm_lift_medium", "tdmpc", {"use_mpc": True}, "use_mpc"),
        # TODO(rcadene): the diffusion model was normalizing the image with mean=0.5 std=0.5, which is a hack; it is not supposed
        # to normalize the image at all. In our current codebase we don't normalize at all. But there is still a minor difference
        # that fails the test. However, when normalizing the image with 0.5 0.5 in the current codebase, the test passes.
@@ -407,12 +407,10 @@ def test_backward_compatibility(ds_repo_id: str, policy_name: str, policy_kwargs
    should be updated.
    4. Check that this test now passes.
    5. Remember to restore `tests/scripts/save_policy_to_safetensors.py` to its original state.
    6. Remember to stage and commit the resulting changes to `tests/data`.
    6. Remember to stage and commit the resulting changes to `tests/artifacts`.
    """
    ds_name = ds_repo_id.split("/")[-1]
    artifact_dir = (
        Path("tests/data/save_policy_to_safetensors") / f"{ds_name}_{policy_name}_{file_name_extra}"
    )
    artifact_dir = Path("tests/artifacts/policies") / f"{ds_name}_{policy_name}_{file_name_extra}"
    saved_output_dict = load_file(artifact_dir / "output_dict.safetensors")
    saved_grad_stats = load_file(artifact_dir / "grad_stats.safetensors")
    saved_param_stats = load_file(artifact_dir / "param_stats.safetensors")
@@ -51,7 +51,7 @@ from lerobot.common.robot_devices.control_configs import (
)
from lerobot.configs.policies import PreTrainedConfig
from lerobot.scripts.control_robot import calibrate, record, replay, teleoperate
from tests.test_robots import make_robot
from tests.robots.test_robots import make_robot
from tests.utils import TEST_ROBOT_TYPES, mock_calibration_dir, require_robot