* feat(dataset-tools): add dataset utilities and example script
  - Introduced dataset tools for LeRobotDataset, including functions for deleting episodes, splitting datasets, adding/removing features, and merging datasets.
  - Added an example script demonstrating the usage of these utilities.
  - Implemented comprehensive tests for all new functionalities to ensure reliability and correctness.
* style fixes
* move example to dataset dir
* missing license
* fixes mostly path
* clean comments
* move tests to functions instead of class based
* - fix video editing: decode, delete frames and re-encode video
  - copy unchanged video and parquet files to avoid recreating the entire dataset
* Fortify tooling tests
* Fix type issue resulting from saving numpy arrays with shape 3,1,1
* added lerobot_edit_dataset
* - revert changes in examples
  - remove hardcoded split names
* update comment
* fix comment add lerobot-edit-dataset shortcut
* Apply suggestion from @Copilot
  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
  Signed-off-by: Michel Aractingi <michel.aractingi@huggingface.co>
* style nit after copilot review
* fix: bug in dataset root when editing the dataset in place (without setting new_repo_id)
* Fix bug in aggregate.py when accumulating video timestamps; add tests to fortify aggregate videos
* Added missing output repo id
* migrate delete episode to using pyav instead of decoding, writing frames to disk and encoding again.
  Co-authored-by: Caroline Pascal <caroline8.pascal@gmail.com>
* added modified suffix in case repo_id is not set in delete_episode
* adding docs for dataset tools
* bump av version and add back time_base assignment
* linter
* modified push_to_hub logic in lerobot_edit_dataset
* fix(progress bar): fixing the progress bar issue in dataset tools
* chore(concatenate): removing no longer needed concatenate_datasets usage
* fix(file sizes forwarding): forwarding files and chunk sizes in metadata info when splitting and aggregating datasets
* style fix
* refactor(aggregate): Fix video indexing and timestamp bugs in dataset merging
  There were three critical bugs in aggregate.py that prevented correct dataset merging:
  1. Video file indices: Changed from += to = assignment to correctly reference merged video files
  2. Video timestamps: Implemented per-source-file offset tracking to maintain continuous timestamps when merging split datasets (was causing non-monotonic timestamp warnings)
  3. File rotation offsets: Store timestamp offsets after rotation decision to prevent out-of-bounds frame access (was causing "Invalid frame index" errors with small file size limits)
  Changes:
  - Updated update_meta_data() to apply per-source-file timestamp offsets
  - Updated aggregate_videos() to track offsets correctly during file rotation
  - Added get_video_duration_in_s import for duration calculation
* Improved docs for split dataset and added a check for the possible case that the split size results in zero episodes
* chore(docs): update merge documentation details
  Signed-off-by: Steven Palma <imstevenpmwork@ieee.org>
---------
Co-authored-by: CarolinePascal <caroline8.pascal@gmail.com>
Co-authored-by: Jack Vial <vialjack@gmail.com>
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
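The timestamp and file-rotation fixes described in the refactor(aggregate) entry can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in aggregate.py: `Placement`, `place_videos`, and `shift_timestamps` are hypothetical names, and durations and sizes are passed in as plain numbers instead of being read from video files.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    """Where one source video lands in the merged dataset (hypothetical record)."""

    file_index: int  # index of the destination video file
    offset_s: float  # seconds to add to every timestamp of this source


def place_videos(durations_s, sizes_mb, max_file_size_mb):
    """Assign each source video a destination file and a timestamp offset.

    Sources are appended to the current destination file until the next one
    would push it past max_file_size_mb; then a new file is started.
    """
    placements = []
    file_index = 0
    current_size_mb = 0.0
    current_duration_s = 0.0

    for duration_s, size_mb in zip(durations_s, sizes_mb):
        # Decide on rotation first ...
        if current_size_mb > 0 and current_size_mb + size_mb > max_file_size_mb:
            file_index += 1
            current_size_mb = 0.0
            current_duration_s = 0.0
        # ... and only then record the offset, so it refers to the file this
        # source is actually written to.
        placements.append(Placement(file_index, current_duration_s))
        current_size_mb += size_mb
        current_duration_s += duration_s

    return placements


def shift_timestamps(timestamps_s, placement):
    """Apply a source's offset so timestamps stay monotonic in the merged file."""
    return [t + placement.offset_s for t in timestamps_s]
```

Recording the offset after the rotation decision is the point of the third fix: an offset computed before rotating would refer to the previous file and could point past the end of the new one.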
83 lines
1.9 KiB
YAML
- sections:
    - local: index
      title: LeRobot
    - local: installation
      title: Installation
  title: Get started
- sections:
    - local: il_robots
      title: Imitation Learning for Robots
    - local: il_sim
      title: Imitation Learning in Sim
    - local: cameras
      title: Cameras
    - local: integrate_hardware
      title: Bring Your Own Hardware
    - local: hilserl
      title: Train a Robot with RL
    - local: hilserl_sim
      title: Train RL in Simulation
    - local: async
      title: Use Async Inference
  title: "Tutorials"
- sections:
    - local: lerobot-dataset-v3
      title: Using LeRobotDataset
    - local: porting_datasets_v3
      title: Porting Large Datasets
    - local: using_dataset_tools
      title: Using the Dataset Tools
  title: "Datasets"
- sections:
    - local: act
      title: ACT
    - local: smolvla
      title: SmolVLA
    - local: pi0
      title: π₀ (Pi0)
    - local: pi05
      title: π₀.₅ (Pi05)
    - local: libero
      title: Using Libero
  title: "Policies"
- sections:
    - local: introduction_processors
      title: Introduction to Robot Processors
    - local: debug_processor_pipeline
      title: Debug your processor pipeline
    - local: implement_your_own_processor
      title: Implement your own processor
    - local: processors_robots_teleop
      title: Processors for Robots and Teleoperators
  title: "Robot Processors"
- sections:
    - local: so101
      title: SO-101
    - local: so100
      title: SO-100
    - local: koch
      title: Koch v1.1
    - local: lekiwi
      title: LeKiwi
    - local: hope_jr
      title: Hope Jr
    - local: reachy2
      title: Reachy 2
  title: "Robots"
- sections:
    - local: phone_teleop
      title: Phone
  title: "Teleoperators"
- sections:
    - local: notebooks
      title: Notebooks
    - local: feetech
      title: Updating Feetech Firmware
  title: "Resources"
- sections:
    - local: contributing
      title: Contribute to LeRobot
    - local: backwardcomp
      title: Backward compatibility
  title: "About"