lerobot/lerobot
KeWang1017 ca74a13d61 Refactor SACPolicy for improved action sampling and standard deviation handling
- Updated action selection to use distribution sampling and log probabilities for better stochastic behavior.
- Enhanced standard deviation clamping to prevent extreme values, ensuring stability in policy outputs.
- Cleaned up code by removing unnecessary comments and improving readability.

These changes aim to refine the SAC implementation, enhancing its robustness and performance during training and inference.
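
The two changes listed above can be illustrated with a small sketch. It is not the code from this commit: it uses plain Python scalars rather than tensors, it assumes the tanh-squashed Gaussian commonly used with SAC, and `LOG_STD_MIN`, `LOG_STD_MAX`, `softplus`, and `sample_action` are hypothetical names and values chosen only for illustration.

```python
import math
import random

# Assumed clamping bounds; the values actually used in SACPolicy are not
# stated in the commit message.
LOG_STD_MIN = -5.0
LOG_STD_MAX = 2.0


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def sample_action(mean, log_std, rng=random):
    """Sample a tanh-squashed Gaussian action; return (action, log_prob)."""
    # Clamp the log standard deviation so std stays in a bounded range.
    log_std = max(LOG_STD_MIN, min(LOG_STD_MAX, log_std))
    std = math.exp(log_std)

    # Sample from the distribution instead of returning the mean:
    # u = mean + std * eps, with eps ~ N(0, 1).
    eps = rng.gauss(0.0, 1.0)
    u = mean + std * eps
    action = math.tanh(u)

    # Log-density of u under N(mean, std^2).
    log_prob = -0.5 * eps * eps - log_std - 0.5 * math.log(2.0 * math.pi)
    # Change of variables for tanh: subtract log(1 - tanh(u)^2), written
    # in the stable form 2 * (log 2 - u - softplus(-2u)).
    log_prob -= 2.0 * (math.log(2.0) - u - softplus(-2.0 * u))
    return action, log_prob
```

Without the clamp, a very large or very small predicted `log_std` would produce a standard deviation that overflows or collapses to zero, which is the instability the second bullet refers to.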
2024-12-29 14:17:25 +00:00