Update pre-commit-config.yaml + pyproject.toml + cap rerun & transformers dependency versions (#1520)

* chore: update .gitignore

* chore: update pre-commit

* chore(deps): update pyproject

* fix(ci): multiple fixes

* chore: pre-commit apply

* chore: address review comments

* Update pyproject.toml

Co-authored-by: Ben Zhang <5977478+ben-z@users.noreply.github.com>
Signed-off-by: Steven Palma <imstevenpmwork@ieee.org>

* chore(deps): add todo

---------

Signed-off-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Ben Zhang <5977478+ben-z@users.noreply.github.com>
Author: Steven Palma
Date: 2025-07-17 14:30:20 +02:00
Committed by: GitHub
Parent: 0938a1d816
Commit: 378e1f0338
78 changed files with 1450 additions and 636 deletions


@@ -3,6 +3,7 @@
This tutorial will explain how to train a neural network to control a robot in simulation with imitation learning.
**You'll learn:**
1. How to record a dataset in simulation with [gym-hil](https://github.com/huggingface/gym-hil) and visualize the dataset.
2. How to train a policy using your data.
3. How to evaluate your policy in simulation and visualize the results.
@@ -55,13 +56,21 @@ Note that to teleoperate the robot you have to hold the "Human Take Over Pause P
**Gamepad Controls**
<p align="center">
<img
src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobot/gamepad_guide.jpg?raw=true"
alt="Figure shows the control mappings on a Logitech gamepad."
title="Gamepad Control Mapping"
width="100%"
></img>
</p>
<p align="center"><i>Gamepad button mapping for robot control and episode management</i></p>
**Keyboard controls**
For keyboard controls, use the `spacebar` to enable control and the following keys to move the robot:
```bash
Arrow keys: Move in X-Y plane
Shift and Shift_R: Move in Z axis
@@ -74,14 +83,21 @@ For keyboard controls use the `spacebar` to enable control and the following key
If you uploaded your dataset to the hub, you can [visualize your dataset online](https://huggingface.co/spaces/lerobot/visualize_dataset) by copy-pasting your repo id.
<p align="center">
<img
src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobot/dataset_visualizer_sim.png"
alt="Figure shows the dataset visualizer"
title="Dataset visualization"
width="100%"
></img>
</p>
<p align="center"><i>Dataset visualizer</i></p>
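If you also want to poke at the recorded data locally, a minimal Python sketch like the one below can help; the `LeRobotDataset` import path and attributes follow recent LeRobot releases and may differ in your version, so treat it as a starting point rather than the canonical API.

```python
# Sketch: quick local inspection of the recorded dataset.
# The import path is an assumption for recent LeRobot versions; older releases
# expose the class under lerobot.common.datasets.lerobot_dataset instead.
from lerobot.datasets.lerobot_dataset import LeRobotDataset

# Use the same repo id you recorded/pushed, e.g. "<HF_USER>/il_gym".
dataset = LeRobotDataset("<HF_USER>/il_gym")

print(f"frames: {len(dataset)}")  # total number of recorded frames
print(f"fps: {dataset.fps}")      # recording frequency

# Each item is a dict (camera images, robot state, action, ...).
frame = dataset[0]
for key, value in frame.items():
    shape = getattr(value, "shape", None)
    print(key, shape if shape is not None else type(value))
```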
## Train a policy
To train a policy to control your robot, use the [`python -m lerobot.scripts.train`](../src/lerobot/scripts/train.py) script. A few arguments are required. Here is an example command:
```bash
python -m lerobot.scripts.train \
--dataset.repo_id=${HF_USER}/il_gym \
@@ -93,25 +109,29 @@ python -m lerobot.scripts.train \
```
Let's explain the command:
1. We provided the dataset as an argument with `--dataset.repo_id=${HF_USER}/il_gym`.
2. We provided the policy with `policy.type=act`. This loads configurations from [`configuration_act.py`](../src/lerobot/policies/act/configuration_act.py). Importantly, this policy will automatically adapt to the number of motor states, motor actions, and cameras of your robot (e.g. `laptop` and `phone`) that have been saved in your dataset.
3. We provided `policy.device=cuda` since we are training on an Nvidia GPU, but you could use `policy.device=mps` to train on Apple silicon.
4. We provided `wandb.enable=true` to use [Weights and Biases](https://docs.wandb.ai/quickstart) for visualizing training plots. This is optional, but if you use it, make sure you are logged in by running `wandb login`.
Training should take several hours; 100k steps (the default) takes about 1 hour on an Nvidia A100. You will find checkpoints in `outputs/train/il_sim_test/checkpoints`.
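Before uploading or evaluating a checkpoint, you can sanity-check that it loads correctly from Python. This is just a sketch; the `ACTPolicy` import path is an assumption based on recent LeRobot versions (older releases use `lerobot.common.policies.act.modeling_act`).

```python
# Sketch: load the latest local checkpoint and report its size.
from lerobot.policies.act.modeling_act import ACTPolicy  # path may differ by version

checkpoint_dir = "outputs/train/il_sim_test/checkpoints/last/pretrained_model"
policy = ACTPolicy.from_pretrained(checkpoint_dir)
policy.eval()

num_params = sum(p.numel() for p in policy.parameters())
print(f"Loaded ACT policy with {num_params / 1e6:.1f}M parameters")
```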
#### Train using Colab
If your local computer doesn't have a powerful GPU, you can use Google Colab to train your model by following the [ACT training notebook](./notebooks#training-act).
#### Upload policy checkpoints
Once training is done, upload the latest checkpoint with:
```bash
huggingface-cli upload ${HF_USER}/il_sim_test \
outputs/train/il_sim_test/checkpoints/last/pretrained_model
```
You can also upload intermediate checkpoints with:
```bash
CKPT=010000
huggingface-cli upload ${HF_USER}/il_sim_test${CKPT} \
@@ -144,9 +164,9 @@ mjpython -m lerobot.scripts.rl.eval_policy --config_path=path/to/eval_config_gym
</hfoptions>
> [!WARNING]
> While the main workflow of training ACT in simulation is straightforward, there is significant room for exploring how to set up the task, define the initial state of the environment, and determine the type of data required during collection to learn the most effective policy. If your trained policy doesn't perform well, investigate the quality of the dataset it was trained on using our visualizers, as well as the action values and various hyperparameters related to ACT and the simulation.
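One concrete way to follow this advice is to look at the recorded action values themselves. The sketch below (again assuming the `LeRobotDataset` import path of recent LeRobot versions) prints per-dimension action statistics so you can spot saturated or nearly constant channels before blaming the policy.

```python
# Sketch: per-dimension action statistics for a recorded dataset.
# Import path is an assumption; older LeRobot versions use lerobot.common.datasets.
import torch
from lerobot.datasets.lerobot_dataset import LeRobotDataset

dataset = LeRobotDataset("<HF_USER>/il_gym")

# Stack every recorded action into a (num_frames, action_dim) tensor.
# For very large datasets, subsample the range instead of iterating everything.
actions = torch.stack([dataset[i]["action"] for i in range(len(dataset))])

for dim in range(actions.shape[1]):
    col = actions[:, dim]
    print(
        f"action[{dim}]: min={col.min().item():.3f} max={col.max().item():.3f} "
        f"mean={col.mean().item():.3f} std={col.std().item():.3f}"
    )
```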
Congrats 🎉, you have finished this tutorial. If you want to continue using LeRobot in simulation, follow this [Tutorial on reinforcement learning in sim with HIL-SERL](https://huggingface.co/docs/lerobot/hilserl_sim).
> [!TIP]
> If you have any questions or need help, please reach out on [Discord](https://discord.com/invite/s3KuuzsPFb).