Commit Graph

20 Commits

Author SHA1 Message Date
Jack Vial
27ba2951d1 fix(tdmpc): Add missing save_freq to tdmpc policy config (#404)
Co-authored-by: Alexander Soare <alexander.soare159@gmail.com>
2024-09-02 19:04:41 +01:00
Alexander Soare
f8a6574698 Add online training with TD-MPC as proof of concept (#338) 2024-07-25 11:16:38 +01:00
Simon Alibert
964f9e86d6 Cleanup config defaults (#300) 2024-07-04 11:53:29 +02:00
Alexander Soare
342f429f1c Add test to make sure policy dataclass configs match yaml configs (#292) 2024-06-26 09:09:40 +01:00
Alexander Soare
2b270d085b Disable online training (#202)
Co-authored-by: Remi <re.cadene@gmail.com>
2024-05-20 18:27:54 +01:00
Alexander Soare
e89521dfa0 Enable tests for TD-MPC (#160) 2024-05-09 13:42:12 +01:00
Simon Alibert
c77633c38c Add regression tests (#119)
- Add `tests/scripts/save_policy_to_safetensor.py` to generate test artifacts
- Add `test_backward_compatibility to test generated outputs from the policies against artifacts
2024-05-04 16:20:30 +02:00
Alexander Soare
bccee745c3 Refactor eval.py (#127) 2024-05-03 17:33:16 +01:00
Alexander Soare
d1855a202a Refactor TD-MPC (#103)
Co-authored-by: Cadene <re.cadene@gmail.com>
Co-authored-by: Simon Alibert <75076266+aliberts@users.noreply.github.com>
2024-05-01 16:40:04 +01:00
Alexander Soare
9d60dce6f3 Tidy up yaml configs (#121) 2024-04-30 16:08:59 +01:00
Remi
e760e4cd63 Move normalization to policy for act and diffusion (#90)
Co-authored-by: Alexander Soare <alexander.soare159@gmail.com>
2024-04-25 11:47:38 +02:00
Cadene
06573d7f67 online training works (loss goes down), remove repeat_action, eval_policy outputs episodes data, eval_policy uses max_episodes_rendered 2024-04-10 11:34:01 +00:00
Cadene
73dfa3c8e3 tests for tdmpc and diffusion policy are passing 2024-04-09 02:50:32 +00:00
Cadene
4371a5570d Remove latency, tdmpc policy passes tests (TODO: make it work with online RL) 2024-04-07 16:01:22 +00:00
Cadene
f56b1a0e16 WIP tdmpc 2024-04-05 13:40:31 +00:00
Simon Alibert
1c24bbda3f WIP Upgrading simxam from mujoco-py to mujoco python bindings 2024-03-25 12:28:07 +01:00
Remi Cadene
cfc304e870 Refactor env queue, Training diffusion works (Still not converging) 2024-03-04 11:00:51 +00:00
Cadene
cf5063e50e Add diffusion policy (train and eval works, TODO: reproduce results) 2024-02-28 15:21:42 +00:00
Cadene
21670dce90 Refactor train, eval_policy, logger, Add diffusion.yaml (WIP) 2024-02-26 01:10:09 +00:00
Cadene
5a219fed6e Refactor policy config 2024-02-25 18:26:44 +00:00