lerobot

Author	SHA1	Message	Date
AdilZouitine	6f7024242a	Refactor SACConfig properties for improved readability - Simplified the `image_features` property to directly iterate over `input_features`. - Removed unused imports and unnecessary code related to main execution, enhancing clarity and maintainability.	2025-03-28 17:18:48 +00:00
Michel Aractingi	02b9ea9446	Added gripper control mechanism to gym_manipulator Moved HilSerl env config to configs/env/configs.py fixes in actor_server and modeling_sac and configuration_sac added the possibility of ignoring missing keys in env_cfg in get_features_from_env_config function	2025-03-28 17:18:48 +00:00
AdilZouitine	79e0f6e06c	Add WrapperConfig for environment wrappers and update SACConfig properties - Introduced `WrapperConfig` dataclass for environment wrapper configurations. - Updated `ManiskillEnvConfig` to include a `wrapper` field for enhanced environment management. - Modified `SACConfig` to return `None` for `observation_delta_indices` and `action_delta_indices` properties. - Refactored `make_robot_env` function to improve readability and maintainability.	2025-03-28 17:18:48 +00:00
Michel Aractingi	d0b7690bc0	Change HILSerlRobotEnvConfig to inherit from EnvConfig Added support for hil_serl classifier to be trained with train.py run classifier training by python lerobot/scripts/train.py --policy.type=hilserl_classifier fixes in find_joint_limits, control_robot, end_effector_control_utils	2025-03-28 17:18:48 +00:00
AdilZouitine	052a4acfc2	[WIP] Update SAC configuration and environment settings - Reduced frame rate in `ManiskillEnvConfig` from 400 to 200. - Enhanced `SACConfig` with new dataclasses for actor, learner, and network configurations. - Improved input and output feature management in `SACConfig`. - Refactored `actor_server` and `learner_server` to access configuration properties directly. - Updated training pipeline to validate configurations and handle dataset repo IDs more robustly.	2025-03-28 17:18:48 +00:00
AdilZouitine	626e5dd35c	Add wandb run id in config	2025-03-28 17:18:48 +00:00
AdilZouitine	dd37bd412e	[WIP] Non functional yet Add ManiSkill environment configuration and wrappers - Introduced `VideoRecordConfig` for video recording settings. - Added `ManiskillEnvConfig` to encapsulate environment-specific configurations. - Implemented various wrappers for the ManiSkill environment, including observation and action scaling. - Enhanced the `make_maniskill` function to create a wrapped ManiSkill environment with video recording and observation processing. - Updated the `actor_server` and `learner_server` to utilize the new configuration structure. - Refactored the training pipeline to accommodate the new environment and policy configurations.	2025-03-28 17:18:48 +00:00
Michel Aractingi	b7b6d8102f	Change config logic in: - gym_manipulator - find_joint_limits - end_effector_utils	2025-03-28 17:18:48 +00:00
AdilZouitine	f483931fc0	Handle new config with sac	2025-03-28 17:18:48 +00:00
AdilZouitine	b2025b852c	Handle multi optimizers	2025-03-28 17:18:48 +00:00
pre-commit-ci[bot]	7c05755823	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2025-03-28 17:18:48 +00:00
Michel Aractingi	2945bbb221	Removed deprecated files and scripts	2025-03-28 17:18:48 +00:00
pre-commit-ci[bot]	8e6d5f504c	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2025-03-28 17:18:48 +00:00
pre-commit-ci[bot]	81952b2092	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2025-03-28 17:18:48 +00:00
AdilZouitine	0eef49a0f6	Initialize log_alpha with the logarithm of temperature_init in SACPolicy - Updated the SACPolicy class to set log_alpha using the logarithm of the initial temperature value from the configuration.	2025-03-28 17:18:48 +00:00
pre-commit-ci[bot]	2d5effeeba	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2025-03-28 17:18:48 +00:00
AdilZouitine	c5c921cd7c	Remove unused functions and imports from modeling_sac.py - Deleted the `find_and_copy_params` function and the `Ensemble` class, as they were deemed unnecessary. - Cleaned up imports by removing `from_modules` from `tensordict` to enhance code clarity. - Simplified the assertion in the `Policy` class for better readability.	2025-03-28 17:18:48 +00:00
pre-commit-ci[bot]	cb272294f5	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2025-03-28 17:18:48 +00:00
AdilZouitine	4bb2077afa	Refactor SACPolicy and learner server for improved replay buffer management - Updated SACPolicy to create critic heads using a list comprehension for better readability. - Simplified the saving and loading of models using `save_model` and `load_model` functions from the safetensors library. - Introduced `initialize_offline_replay_buffer` function in the learner server to streamline offline dataset handling and replay buffer initialization. - Enhanced logging for dataset loading processes to improve traceability during training.	2025-03-28 17:18:48 +00:00
Michel Aractingi	b82faf7d8c	Add end effector action space to hil-serl (#861) Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2025-03-28 17:18:48 +00:00
AdilZouitine	7960f2c3c1	Enhance SAC configuration and policy with gradient clipping and temperature management - Introduced `grad_clip_norm` parameter in SAC configuration for gradient clipping - Updated SACPolicy to store temperature as an instance variable for consistent usage - Modified loss calculations in SACPolicy to utilize the instance temperature - Enhanced MLP and CriticHead to support a customizable final activation function - Implemented gradient clipping in the learner server during training steps for both actor and critic - Added tracking for gradient norms in training information	2025-03-28 17:18:48 +00:00
pre-commit-ci[bot]	dee154a1a5	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2025-03-28 17:18:48 +00:00
AdilZouitine	a3ef7dc6c3	Add custom save and load methods for SAC policy - Implement `_save_pretrained` method to handle TensorDict state saving - Add `_from_pretrained` class method for loading SAC policy from files - Create utility function `find_and_copy_params` to handle parameter copying	2025-03-28 17:18:48 +00:00
AdilZouitine	7e3e1ce173	Remove torch.no_grad decorator and optimize next action prediction in SAC policy - Removed `@torch.no_grad` decorator from Unnormalize forward method - Added TODO comment for optimizing next action prediction in SAC policy - Minor formatting adjustment in NaN assertion for log standard deviation Co-authored-by: Yoel Chornton <yoel.chornton@gmail.com>	2025-03-28 17:18:48 +00:00
Eugene Mironov	db78fee9de	[HIL-SERL] Migrate threading to multiprocessing (#759) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2025-03-28 17:18:48 +00:00
pre-commit-ci[bot]	38f5fa4523	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2025-03-28 17:18:48 +00:00
AdilZouitine	76df8a31b3	Add storage device configuration for SAC policy and replay buffer - Introduce `storage_device` parameter in SAC configuration and training settings - Update learner server to use configurable storage device for replay buffer - Reduce online buffer capacity in ManiSkill configuration - Modify replay buffer initialization to support custom storage device	2025-03-28 17:18:48 +00:00
Michel Aractingi	ff223c106d	Added caching function in the learner_server and modeling_sac in order to limit the number of forward passes through the pretrained encoder when it's frozen. Added tensordict dependencies Updated the version of torch and torchvision Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com>	2025-03-28 17:18:48 +00:00
Eugene Mironov	d48161da1b	[Port HIL-SERL] Adjust Actor-Learner architecture & clean up dependency management for HIL-SERL (#722)	2025-03-28 17:18:48 +00:00
AdilZouitine	150def839c	Refactor SAC policy with performance optimizations and multi-camera support - Introduced Ensemble and CriticHead classes for more efficient critic network handling - Added support for multiple camera inputs in observation encoder - Optimized image encoding by batching image processing - Updated configuration for ManiSkill environment with reduced image size and action scaling - Compiled critic networks for improved performance - Simplified normalization and ensemble handling in critic networks Co-authored-by: michel-aractingi <michel.aractingi@gmail.com>	2025-03-28 17:18:24 +00:00
Michel Aractingi	795063aa1b	- Fixed big issue in the loading of the policy parameters sent by the learner to the actor -- pass only the actor to the `update_policy_parameters` and remove `strict=False` - Fixed big issue in the normalization of the actions in the `forward` function of the critic -- remove the `torch.no_grad` decorator in `normalize.py` in the normalization function - Fixed performance issue to boost the optimization frequency by setting the storage device to be the same as the device of learning. Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com>	2025-03-28 17:18:24 +00:00
AdilZouitine	279e03b6c8	Improve wandb logging and custom step tracking in logger - Modify logger to support multiple custom step keys - Update logging method to handle custom step keys more flexibly - Enhance logging of optimization step and frequency Co-authored-by: michel-aractingi <michel.aractingi@gmail.com>	2025-03-28 17:18:24 +00:00
Michel Aractingi	61b0e9539f	nit Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com>	2025-03-28 17:18:24 +00:00
Michel Aractingi	0847b2119b	Changed the init_final value to center the starting mean and std of the policy Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com>	2025-03-28 17:18:24 +00:00
Michel Aractingi	eb7e28d9d9	Hardcoded some normalization parameters. TODO refactor Added masking actions on the level of the intervention actions and offline dataset Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com>	2025-03-28 17:18:24 +00:00
Michel Aractingi	a0e0a9a9b1	fix log_alpha in modeling_sac: change to nn.parameter added pretrained vision model in policy Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com>	2025-03-28 17:18:24 +00:00
Michel Aractingi	9c14830cd9	Added possibility to record and replay delta actions during teleoperation rather than absolute actions Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com>	2025-03-28 17:18:24 +00:00
Eugene Mironov	3c58867738	[Port HIL-SERL] Add resnet-10 as default encoder for HIL-SERL (#696) Co-authored-by: Khalil Meftah <kmeftah.khalil@gmail.com> Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com> Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co> Co-authored-by: Ke Wang <superwk1017@gmail.com>	2025-03-28 17:18:24 +00:00
Michel Aractingi	c623824139	- Added JointMaskingActionSpace wrapper in `gym_manipulator` in order to select which joints will be controlled. For example, we can disable the gripper actions for some tasks. - Added NaN detection mechanisms in the actor, learner and gym_manipulator for the case where we encounter NaNs in the loop. - Changed the non-blocking in the `.to(device)` functions to only work for the case of cuda because they were causing NaNs when running the policy on mps - Added some joint clipping and limits in the env, robot and policy configs. TODO clean this part and make the limits in one config file only. Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com>	2025-03-28 17:18:24 +00:00
Michel Aractingi	f4f5b26a21	Several fixes to move the actor_server and learner_server code from the maniskill environment to the real robot environment. Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com>	2025-03-28 17:18:24 +00:00
Michel Aractingi	729b4ed697	- Added `lerobot/scripts/server/gym_manipulator.py` that contains all the necessary wrappers to run a gym-style env around the real robot. - Added `lerobot/scripts/server/find_joint_limits.py` to test the min and max angles of the motion you wish the robot to explore during RL training. - Added logic in `manipulator.py` to limit the maximum possible joint angles to allow motion within a predefined joint position range. The limits are specified in the yaml config for each robot. Check out so100.yaml. Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com>	2025-03-28 17:18:24 +00:00
Michel Aractingi	87c7eca582	Added crop_dataset_roi.py that allows you to load a lerobotdataset -> crop its images -> create a new lerobot dataset with the cropped and resized images. Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com>	2025-03-28 17:18:24 +00:00
Michel Aractingi	b29401e4e2	- Refactor observation encoder in `modeling_sac.py` - added `torch.compile` to the actor and learner servers. - organized imports in `train_sac.py` - optimized the parameters push by not sending the frozen pre-trained encoder. Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com>	2025-03-28 17:18:24 +00:00
Yoel	faab32fe14	[Port HIL-SERL] Add HF vision encoder option in SAC (#651) Added support for a custom pretrained vision encoder in the SAC modeling implementation. Great job @ChorntonYoel!	2025-03-28 17:18:24 +00:00
Michel Aractingi	2023289ce8	Added support for checkpointing the policy. We can save and load the policy state dict, optimizers state, optimization step and interaction step Added functions for converting the replay buffer from and to LeRobotDataset. When we want to save the replay buffer, we convert it first to LeRobotDataset format and save it locally and vice-versa. Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com>	2025-03-28 17:18:24 +00:00
Michel Aractingi	18207d995e	- Added additional logging information in wandb around the timings of the policy loop and optimization loop. - Optimized critic design that improves the performance of the learner loop by a factor of 2 - Cleaned the code and fixed style issues - Completed the config with actor_learner_config field that contains host-ip and port elements that are necessary for the actor-learner servers. Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com>	2025-03-28 17:18:24 +00:00
Michel Aractingi	a0a81c0c12	FREEDOM, added back the optimization loop code in `learner_server.py` Ran experiment with pushcube env from maniskill. The learning seems to work. Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com>	2025-03-28 17:18:24 +00:00
Michel Aractingi	ef64ba91d9	Added server directory in `lerobot/scripts` that contains scripts and the protobuf message types to split training into two processes, acting and learning. The actor rolls out the policy and collects interaction data while the learner receives the data, trains the policy and sends the updated parameters to the actor. The two scripts are run simultaneously Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com>	2025-03-28 17:18:24 +00:00
AdilZouitine	83dc00683c	Stable version of rlpd + drq	2025-03-28 17:18:24 +00:00
AdilZouitine	5b92465e38	Add type annotations and restructure SACConfig class fields	2025-03-28 17:18:24 +00:00

462 Commits