* Enhance training and logging functionality with accelerator support
- Added support for multi-GPU training by introducing an `accelerator` parameter in training functions.
- Updated `update_policy` to handle gradient updates based on whether an accelerator is present (see the sketch after this list).
- Modified logging to prevent duplicate messages in non-main processes.
- Enhanced `set_seed` and `get_safe_torch_device` functions to accommodate accelerator usage.
- Updated `MetricsTracker` to account for the number of processes when calculating metrics.
- Added the `accelerate` library as an optional dependency extra in `pyproject.toml`.
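  A minimal sketch of the accelerator-aware update path described in the bullets above; only the `update_policy` name and the `accelerator` parameter come from this PR, while the signature and body are illustrative assumptions rather than the actual implementation.

  ```python
  import torch


  def update_policy(policy, batch, optimizer, grad_clip_norm, accelerator=None):
      """Single gradient step; branches on whether an accelerate Accelerator is present."""
      loss = policy.forward(batch)  # assumed to return a scalar loss tensor

      if accelerator is not None:
          # accelerate handles mixed-precision scaling and gradient sync across GPUs.
          accelerator.backward(loss)
          grad_norm = accelerator.clip_grad_norm_(policy.parameters(), grad_clip_norm)
      else:
          loss.backward()
          grad_norm = torch.nn.utils.clip_grad_norm_(policy.parameters(), grad_clip_norm)

      optimizer.step()
      optimizer.zero_grad()
      return {"loss": loss.item(), "grad_norm": float(grad_norm)}
  ```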
* Initialize logging in training script for both main and non-main processes
- Added `init_logging` calls to ensure proper logging setup when using the accelerator and in standard training mode.
- This keeps logging output clear and consistent across processes during training; a sketch of the per-process setup follows.
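  A hedged sketch of the per-process logging setup these bullets describe; `init_logging` is the name from the bullets, while the use of `accelerate.PartialState` and the exact format string are assumptions.

  ```python
  import logging

  from accelerate import PartialState


  def init_logging() -> None:
      # Non-main processes log at WARNING so progress messages are not
      # duplicated once per GPU; the main process keeps full INFO output.
      is_main = PartialState().is_main_process
      logging.basicConfig(
          level=logging.INFO if is_main else logging.WARNING,
          format="%(asctime)s %(levelname)s %(message)s",
      )
  ```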
* add docs and only push model once
* Place logging under accelerate and update docs
* fix pre-commit
* only log in main process
* main logging
* try with local rank
* add tests
* change runner
* fix test
* don't push to hub in multi-GPU tests
* pre-download dataset in tests
* small fixes
* fix optimizer state path
* update docs, and small improvements in train
* simplify accelerate main process detection
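  The main-process detection referenced in several commits above ("only log in main process", "only push model once") can be reduced to accelerate's own flag. The sketch below is illustrative; `save_and_push` is a hypothetical stand-in for the real checkpoint/hub logic.

  ```python
  from accelerate import Accelerator

  accelerator = Accelerator()


  def save_and_push(step: int) -> None:
      # Hypothetical stand-in for checkpoint saving and hub pushing.
      print(f"step {step}: saving checkpoint and pushing to the hub")


  # Side effects run exactly once, on the main process only.
  if accelerator.is_main_process:
      save_and_push(step=1000)

  # Keep all ranks synchronized before continuing training.
  accelerator.wait_for_everyone()
  ```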
* small improvements in train
* fix OOM bug
* change accelerate detection
* add some debugging
* always use accelerate
* cleanup update method
* cleanup
* fix bug
* scale lr decay if we reduce steps
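  One possible reading of the scheduler change above: with N processes the number of optimizer steps per process shrinks, so the decay horizon is scaled to match. The function name and the choice of `CosineAnnealingLR` below are assumptions for illustration only.

  ```python
  import torch


  def make_lr_scheduler(optimizer, total_steps, accelerator=None):
      # If accelerate splits the run across N processes, each process performs
      # roughly total_steps / N optimizer steps, so shrink the decay horizon.
      num_processes = accelerator.num_processes if accelerator is not None else 1
      effective_steps = max(1, total_steps // num_processes)
      return torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=effective_steps)
  ```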
* cleanup logging
* fix formatting
* incorporate PR feedback
* add min memory to cpu tests
* use accelerate to determine logging
* fix pre-commit and fix tests
* chore: minor details
---------
Co-authored-by: AdilZouitine <adilzouitinegm@gmail.com>
Co-authored-by: Steven Palma <steven.palma@huggingface.co>