AdilZouitine
4257fe5045
Rename reward classifier
2025-04-25 18:38:52 +02:00
Michel Aractingi
ea89b29fe5
checkout normalize.py to prev commit
2025-04-25 18:10:59 +02:00
AdilZouitine
50e9a8ed6a
Cleaning
2025-04-25 17:22:02 +02:00
pre-commit-ci[bot]
eb44a06a9b
[pre-commit.ci] auto fixes from pre-commit.com hooks
...
for more information, see https://pre-commit.ci
2025-04-18 15:10:22 +02:00
AdilZouitine
80d566eb56
Handle new config with sac
2025-04-18 15:09:27 +02:00
pre-commit-ci[bot]
0ea27704f6
[pre-commit.ci] auto fixes from pre-commit.com hooks
...
for more information, see https://pre-commit.ci
2025-04-18 15:09:25 +02:00
pre-commit-ci[bot]
1c8daf11fd
[pre-commit.ci] auto fixes from pre-commit.com hooks
...
for more information, see https://pre-commit.ci
2025-04-18 15:07:46 +02:00
AdilZouitine
e002c5ec56
Remove torch.no_grad decorator and optimize next action prediction in SAC policy
...
- Removed `@torch.no_grad` decorator from Unnormalize forward method
- Added TODO comment for optimizing next action prediction in SAC policy
- Minor formatting adjustment in NaN assertion for log standard deviation
Co-authored-by: Yoel Chornton <yoel.chornton@gmail.com>
2025-04-18 15:06:52 +02:00
Michel Aractingi
0d88a5ee09
- Fixed a big issue in the loading of the policy parameters sent by the learner to the actor -- pass only the actor to update_policy_parameters and remove strict=False
...
- Fixed a big issue in the normalization of the actions in the `forward` function of the critic -- remove the `torch.no_grad` decorator from the normalization function in `normalize.py`
- Fixed a performance issue to boost the optimization frequency by setting the storage device to be the same as the learning device.
Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com>
2025-04-18 15:04:44 +02:00
Simon Alibert
3354d919fc
LeRobotDataset v2.1 (#711)
...
Co-authored-by: Remi <remi.cadene@huggingface.co>
Co-authored-by: Remi Cadene <re.cadene@gmail.com>
2025-02-25 15:27:29 +01:00
Remi
638d411cd3
Add Pi0 (#681)
...
Co-authored-by: Simon Alibert <simon.alibert@huggingface.co>
Co-authored-by: Simon Alibert <75076266+aliberts@users.noreply.github.com>
Co-authored-by: Pablo <pablo.montalvo.leroux@gmail.com>
2025-02-04 18:01:04 +01:00
Simon Alibert
3c0a209f9f
Simplify configs (#550)
...
Co-authored-by: Remi <remi.cadene@huggingface.co>
Co-authored-by: HUANG TZU-CHUN <137322177+tc-huang@users.noreply.github.com>
2025-01-31 13:57:37 +01:00
Alexander Soare
abbb1d2367
Make sure policies don't mutate the batch (#323)
2024-07-22 20:38:33 +01:00
Ruijie
b0d954c6e1
Fix bug in normalize to avoid divide by zero (#239)
...
Co-authored-by: rj <rj@teleopstrio-razer.lan>
Co-authored-by: Remi <re.cadene@gmail.com>
2024-06-04 12:21:28 +02:00
Simon Alibert
f52f4f2cd2
Add copyrights (#157)
2024-05-15 12:13:09 +02:00
Alexander Soare
a4891095e4
Use PytorchModelHubMixin to save models as safetensors (#125)
...
Co-authored-by: Remi <re.cadene@gmail.com>
2024-05-01 16:17:18 +01:00
Alexander Soare
45f351c618
Make sure targets are normalized too (#106)
2024-04-26 11:18:39 +01:00
Remi
e760e4cd63
Move normalization to policy for act and diffusion (#90)
...
Co-authored-by: Alexander Soare <alexander.soare159@gmail.com>
2024-04-25 11:47:38 +02:00