lerobot

Author	SHA1	Message	Date
Michel Aractingi	c1ee25d9f7	nits in configuration classifier and control_robot	2025-04-18 16:18:13 +02:00
Michel Aractingi	9886520d33	Added option to add current readings to the state of the policy	2025-04-18 16:18:13 +02:00
Michel Aractingi	3b24ad3c84	Fixes for the reward classifier	2025-04-18 16:18:13 +02:00
AdilZouitine	54c3c6d684	Enhance MLP class in modeling_sac.py with detailed docstring and refactor layer construction for improved readability. Simplify layer addition logic by removing unnecessary conditions and ensuring consistent handling of activations and dropout.	2025-04-18 14:15:06 +00:00
pre-commit-ci[bot]	fb92935601	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2025-04-18 13:33:37 +00:00
AdilZouitine	dcd850feab	Refactor SACObservationEncoder to improve modularity and readability. Split initialization into dedicated methods for image and state layers, and enhance caching logic for image features. Update forward method to streamline feature encoding and ensure proper normalization handling.	2025-04-18 15:10:22 +02:00
AdilZouitine	1ce368503d	Refactor SACPolicy initialization by breaking down the constructor into smaller methods for normalization, encoders, critics, actor, and temperature setup. This enhances readability and maintainability.	2025-04-18 15:10:22 +02:00
AdilZouitine	fb075a709d	Refactor input and output normalization handling in SACPolicy for improved clarity and efficiency. Consolidate encoder initialization logic and remove redundant else statements.	2025-04-18 15:10:22 +02:00
AdilZouitine	3424644ecd	Fix init temp Co-authored-by: s1lent4gnt <kmeftah.khalil@gmail.com>	2025-04-18 15:10:22 +02:00
AdilZouitine	c37936f2c9	Update log_std_min type to float in PolicyConfig for consistency	2025-04-18 15:10:22 +02:00
AdilZouitine	c5382a450c	fix caching Co-authored-by: s1lent4gnt <kmeftah.khalil@gmail.com>	2025-04-18 15:10:22 +02:00
AdilZouitine	2f7339b410	Handle caching Co-authored-by: s1lent4gnt <kmeftah.khalil@gmail.com>	2025-04-18 15:10:22 +02:00
AdilZouitine	9e5f254db0	change the tanh distribution to match hil serl Co-authored-by: s1lent4gnt <kmeftah.khalil@gmail.com>	2025-04-18 15:10:22 +02:00
AdilZouitine	8122721f6d	match target entropy hil serl Co-authored-by: s1lent4gnt <kmeftah.khalil@gmail.com>	2025-04-18 15:10:22 +02:00
AdilZouitine	5c352ae558	stick to hil serl nn architecture Co-authored-by: s1lent4gnt <kmeftah.khalil@gmail.com>	2025-04-18 15:10:22 +02:00
AdilZouitine	9386892f8e	Refactor modeling_sac and parameter handling for clarity and reusability. Co-authored-by: s1lent4gnt <kmeftah.khalil@gmail.com>	2025-04-18 15:10:22 +02:00
AdilZouitine	267a837a2c	fix encoder training	2025-04-18 15:10:22 +02:00
pre-commit-ci[bot]	28b595c651	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2025-04-18 15:10:22 +02:00
Michel Aractingi	9fd4c21d4d	General fixes in code, removed delta action, fixed grasp penalty, added logic to put gripper reward in info	2025-04-18 15:10:22 +02:00
pre-commit-ci[bot]	02e1ed0bfb	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2025-04-18 15:10:22 +02:00
AdilZouitine	e18274bc9a	fix caching and dataset stats is optional	2025-04-18 15:10:22 +02:00
AdilZouitine	68c271ad25	Add rounding for safety	2025-04-18 15:10:22 +02:00
pre-commit-ci[bot]	a3ada81816	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2025-04-18 15:10:22 +02:00
AdilZouitine	203315d378	fix sign issue	2025-04-18 15:10:22 +02:00
AdilZouitine	78c640b6d8	Refactor complementary_info handling in ReplayBuffer	2025-04-18 15:10:22 +02:00
AdilZouitine	d5a87f67cf	Handle gripper penalty	2025-04-18 15:10:22 +02:00
AdilZouitine	8bcf41761d	fix caching	2025-04-18 15:10:22 +02:00
pre-commit-ci[bot]	1efaf02df9	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2025-04-18 15:10:22 +02:00
AdilZouitine	cf58890bb0	fix indentation issue	2025-04-18 15:10:22 +02:00
AdilZouitine	7c2c67fc3c	Enhance SAC configuration and replay buffer with asynchronous prefetching support - Added async_prefetch parameter to SACConfig for improved buffer management. - Implemented get_iterator method in ReplayBuffer to support asynchronous prefetching of batches. - Updated learner_server to utilize the new iterator for online and offline sampling, enhancing training efficiency.	2025-04-18 15:10:22 +02:00
AdilZouitine	70130b9841	Enhance SACPolicy to support shared encoder and optimize action selection - Cached encoder output in select_action method to reduce redundant computations. - Updated action selection and grasp critic calls to utilize cached encoder features when available.	2025-04-18 15:10:22 +02:00
AdilZouitine	6167886472	Enhance SACPolicy and learner server for improved grasp critic integration - Updated SACPolicy to conditionally compute grasp critic losses based on the presence of discrete actions. - Refactored the forward method to handle grasp critic model selection and loss computation more clearly. - Adjusted learner server to utilize optimized parameters for grasp critic during training. - Improved action handling in the ManiskillMockGripperWrapper to accommodate both tuple and single action inputs.	2025-04-18 15:10:22 +02:00
AdilZouitine	f9fb9d4594	Refactor SACPolicy for improved readability and action dimension handling - Cleaned up code formatting for better readability, including consistent spacing and removal of unnecessary blank lines. - Consolidated continuous action dimension calculation to enhance clarity and maintainability. - Simplified loss return statements in the forward method to improve code structure. - Ensured grasp critic parameters are included conditionally based on configuration settings.	2025-04-18 15:10:22 +02:00
AdilZouitine	d86d29fe21	Add mock gripper support and enhance SAC policy action handling - Introduced mock_gripper parameter in ManiskillEnvConfig to enable gripper simulation. - Added ManiskillMockGripperWrapper to adjust action space for environments with discrete actions. - Updated SACPolicy to compute continuous action dimensions correctly, ensuring compatibility with the new gripper setup. - Refactored action handling in the training loop to accommodate the changes in action dimensions.	2025-04-18 15:10:22 +02:00
AdilZouitine	f83d215e7a	Refactor SAC policy and training loop to enhance discrete action support - Updated SACPolicy to conditionally compute losses for grasp critic based on num_discrete_actions. - Simplified forward method to return loss outputs as a dictionary for better clarity. - Adjusted learner_server to handle both main and grasp critic losses during training. - Ensured optimizers are created conditionally for grasp critic based on configuration settings.	2025-04-18 15:10:22 +02:00
AdilZouitine	7361a11a4d	Refactor SAC configuration and policy to support discrete actions - Removed GraspCriticNetworkConfig class and integrated its parameters into SACConfig. - Added num_discrete_actions parameter to SACConfig for better action handling. - Updated SACPolicy to conditionally create grasp critic networks based on num_discrete_actions. - Enhanced grasp critic forward pass to handle discrete actions and compute losses accordingly.	2025-04-18 15:10:22 +02:00
Michel Aractingi	0cce2fe0fa	Added Gripper quantization wrapper and grasp penalty; removed complementary info from buffer and learner server; removed get_gripper_action function; added gripper parameters to `common/envs/configs.py`	2025-04-18 15:10:22 +02:00
pre-commit-ci[bot]	88d26ae976	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2025-04-18 15:10:22 +02:00
s1lent4gnt	3a2308d86f	Add grasp critic to the training loop - Integrated the grasp critic gradient update to the training loop in learner_server - Added Adam optimizer and configured grasp critic learning rate in configuration_sac - Added target critics networks update after the critics gradient step	2025-04-18 15:10:22 +02:00
s1lent4gnt	fdd04efdb7	Add get_gripper_action method to GamepadController	2025-04-18 15:10:22 +02:00
s1lent4gnt	ff18be18ad	Add gripper penalty wrapper	2025-04-18 15:10:22 +02:00
s1lent4gnt	427720426b	Add complementary info in the replay buffer - Added complementary info in the add method - Added complementary info in the sample method	2025-04-18 15:10:22 +02:00
s1lent4gnt	66693965c0	Add grasp critic - Implemented grasp critic to evaluate gripper actions - Added corresponding config parameters for tuning	2025-04-18 15:10:22 +02:00
pre-commit-ci[bot]	334cf8143e	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2025-04-18 15:10:22 +02:00
AdilZouitine	5b49601072	Fix convergence of sac, multiple torch compile on the same model caused divergence	2025-04-18 15:10:22 +02:00
AdilZouitine	0185a0b6fd	Fix cuda graph break	2025-04-18 15:10:22 +02:00
s1lent4gnt	70d418935d	Fix: Prevent Invalid next_state References When optimize_memory=True (#918)	2025-04-18 15:10:22 +02:00
pre-commit-ci[bot]	eb44a06a9b	[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci	2025-04-18 15:10:22 +02:00
Michel Aractingi	8eb3c1510c	Added support for controlling the gripper with the pygame interface of gamepad; minor modifications in gym_manipulator to quantize the gripper actions; clamped the observations after F.resize in ConvertToLeRobotObservation wrapper, since due to a bug in F.resize images were returned exceeding the maximum value of 1.0	2025-04-18 15:10:22 +02:00
AdilZouitine	4d5ecb082e	Refactor SACPolicy for improved type annotations and readability - Enhanced type annotations for variables in the `SACPolicy` class to improve code clarity. - Updated method calls to use keyword arguments for better readability. - Streamlined the extraction of batch components, ensuring consistent typing across the class methods.	2025-04-18 15:10:22 +02:00


951 Commits