# Source: https://internrobotics.github.io/InternDataEngine-Docs/policy/training.html

# Training

This guide covers data format conversion and policy training for validating generated simulation data.

## Part 1: LMDB to LeRobot Data Conversion

The simulation data generated by InternDataEngine is stored in LMDB format. To use this data for policy training, you need to convert it to LeRobot format.

### Step 1: Install LeRobot v2.1

We use LeRobot v2.1 format for data storage. Install the LeRobot 2.1 repo.

### Step 2: Convert LMDB to LeRobot v2.1

Use the conversion scripts in the [policy/lmdb2lerobotv21](https://github.com/InternRobotics/InternDataEngine/tree/master/policy/lmdb2lerobotv21) directory.

We provide conversion scripts for different robot platforms:

- **lmdb2lerobot_lift2_a1.py** (script): Lift2 (ARX).
- **lmdb2lerobot_split_aloha_a1.py** (script): Split Aloha.
- **lmdb2lerobot_genie1_a1.py** (script): Genie1.
- **lmdb2lerobot_franka_a1.py** (script): Franka FR3.
- **lmdb2lerobot_frankarobotiq_a1.py** (script): Franka with Robotiq gripper.

Example usage:

```bash
python lmdb2lerobot_lift2_a1.py \
    --src_path ${src_path} \
    --save_path ${save_path} \
    --repo_id ${repo_id} \
    --num-threads ${num_threads} \
    --num_demos ${num_demos}
```

**Parameters:**

- **--src_path** (str): Path to the source LMDB data directory.
- **--save_path** (str): Path to save the converted LeRobot dataset.
- **--repo_id** (str): Dataset repository identifier.
- **--num-threads** (int): Number of threads for parallel processing.
- **--num_demos** (int): Number of demonstrations to convert (optional).

### Step 3: Convert to LeRobot v3.0 (Optional)

If you need LeRobot v3.0 format for training, please install LeRobot 3.0.
Then use the conversion script:

```bash
python convertv21_to_v30.py --input_path ${v21_path} --output_path ${v30_path}
```

The conversion code is available at [policy/lmdb2lerobotv21/convertv21_to_v30.py](https://github.com/InternRobotics/InternDataEngine/tree/master/policy/lmdb2lerobotv21/convertv21_to_v30.py).

## Part 2: Policy Training with π₀

As described in the [InternData-A1 paper](https://arxiv.org/pdf/2511.16651), we used a multi-machine, multi-GPU, JAX-based π₀ setup for data validation. We have implemented a JAX-based, multi-node, multi-GPU training pipeline for π₀ that supports mixed training on multiple datasets.

### Features

- **Multi-machine, multi-GPU training**: Scale training across multiple nodes
- **Multi-dataset mixed training**: Train on multiple datasets simultaneously
- **JAX-based implementation**: High-performance training with JAX/Flax

### Installation, Training, and Deployment

For detailed instructions on installation, training, and deployment, refer to the [openpi-InternData-A1 README](https://github.com/InternRobotics/InternDataEngine/blob/master/policy/openpi-InternData-A1/README.md).

## References

- [LeRobot](https://github.com/huggingface/lerobot) - Hugging Face LeRobot
- [InternData-A1 Paper](https://arxiv.org/pdf/2511.16651) - InternData-A1: A High-Fidelity Synthetic Data Generator for Robotic Manipulation
- [openpi-InternData-A1](https://github.com/InternRobotics/InternDataEngine/blob/master/policy/openpi-InternData-A1/) - JAX-based π₀ training code
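
## Example: Scripting the Conversion

When converting data for several robot platforms, it can help to assemble the Part 1, Step 2 command programmatically. The following is a minimal sketch, not part of InternDataEngine: the helper `build_convert_cmd`, the short robot keys, and the example paths are hypothetical names chosen for illustration. Only the script filenames and flags are taken from the usage shown in Step 2.

```python
import shlex

# Script filenames come from the list in Part 1, Step 2.
# The short robot keys are hypothetical labels used only in this sketch.
CONVERSION_SCRIPTS = {
    "lift2": "lmdb2lerobot_lift2_a1.py",
    "split_aloha": "lmdb2lerobot_split_aloha_a1.py",
    "genie1": "lmdb2lerobot_genie1_a1.py",
    "franka": "lmdb2lerobot_franka_a1.py",
    "frankarobotiq": "lmdb2lerobot_frankarobotiq_a1.py",
}


def build_convert_cmd(robot, src_path, save_path, repo_id, num_threads, num_demos=None):
    """Return the argument list for one LMDB -> LeRobot v2.1 conversion run."""
    if robot not in CONVERSION_SCRIPTS:
        raise ValueError(
            f"unknown robot {robot!r}; expected one of {sorted(CONVERSION_SCRIPTS)}"
        )
    cmd = [
        "python", CONVERSION_SCRIPTS[robot],
        "--src_path", str(src_path),
        "--save_path", str(save_path),
        "--repo_id", str(repo_id),
        "--num-threads", str(num_threads),
    ]
    # --num_demos is documented as optional, so it is added only when given.
    if num_demos is not None:
        cmd += ["--num_demos", str(num_demos)]
    return cmd


if __name__ == "__main__":
    # Placeholder paths and repo id; replace with your own values.
    cmd = build_convert_cmd("lift2", "/data/lmdb/lift2", "/data/lerobot", "example/lift2", 8)
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from inside the `policy/lmdb2lerobotv21` directory. Building the command as a list rather than a single string avoids shell-quoting problems with paths that contain spaces.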