diff --git a/README.md b/README.md
index 40814b5..bb5c412 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,8 @@
-# InternDataEngine: Pioneering High-Fidelity Synthetic Data Generator for Robotic Manipulation
+# InternDataEngine
+
+**High-Fidelity Synthetic Data Generator for Robotic Manipulation**
@@ -15,30 +17,127 @@
-## 💻 About
+## About
InternDataEngine Overview
-InternDataEngine is a synthetic data generation engine for embodied AI that powers large-scale model training and iteration. Built on NVIDIA Isaac Sim, it unifies high-fidelity physical interaction from InternData-A1, semantic task and scene generation from InternData-M1, and high-throughput scheduling from the Nimbus framework to deliver realistic, task-aligned, and massively scalable robotic manipulation data.
+InternDataEngine is a synthetic data generation engine for embodied AI, built on NVIDIA Isaac Sim. It unifies high-fidelity physical interaction (InternData-A1), semantic task and scene generation (InternData-M1), and high-throughput scheduling (Nimbus) to deliver realistic, task-aligned, and scalable robotic manipulation data.
-- **More realistic physical interaction**: Unified simulation of rigid, articulated, deformable, and fluid objects across single-arm, dual-arm, and humanoid robots, enabling long-horizon, skill-composed manipulation that better supports sim-to-real transfer.
-- **More diverse data generation**: By leveraging the internal state of the simulation engine to extract high-quality ground truth, coupled with multi-dimensional domain randomization (e.g., layout, texture, structure, and lighting), the data distribution is significantly expanded. This approach produces precise and diverse operational data, while simultaneously exporting rich multimodal annotations such as bounding boxes, segmentation masks, and keypoints.
-- **More efficient large-scale production**: Nimbus-powered asynchronous pipelines that decouple planning, rendering, and storage, achieving 2–3× end-to-end throughput, cluster-level load balancing and fault tolerance for billion-scale data generation.
+**Key capabilities:**
-## 🔥 Latest News
+- **Realistic physical interaction** -- Unified simulation of rigid, articulated, deformable, and fluid objects across single-arm, dual-arm, and humanoid robots. Supports long-horizon, skill-composed manipulation for sim-to-real transfer.
+- **Diverse data generation** -- Multi-dimensional domain randomization (layout, texture, structure, lighting) with rich multimodal annotations (bounding boxes, segmentation masks, keypoints).
+- **Efficient large-scale production** -- Nimbus-powered asynchronous pipelines that decouple planning, rendering, and storage, achieving 2-3x end-to-end throughput with cluster-level load balancing and fault tolerance.
-- **[2026/03]** We release the InternDataEngine codebase v1.0, which includes the core modules: InternData-A1 and Nimbus.
+## Prerequisites
-## 🚀 Quickstart
+| Dependency | Version |
+|------------|---------|
+| NVIDIA Isaac Sim | 5.0.0 (Kit 107.x) |
+| CUDA Toolkit | >= 12.8 |
+| Python | 3.10 |
+| GPU | NVIDIA RTX (tested on RTX PRO 6000 Blackwell) |
-Please refer to the [Installation](https://internrobotics.github.io/InternDataEngine-Docs/guides/installation.html) and [Usage](https://internrobotics.github.io/InternDataEngine-Docs/guides/quickstart.html) to start the installation and run your first synthetic data generation task.
+> For detailed environment setup (conda, CUDA, PyTorch, curobo), see [install.md](install.md).
+>
+> If migrating from Isaac Sim 4.5.0, see [migerate/migerate.md](migerate/migerate.md) for known issues and fixes.
-For more details, please check [Documentation](https://internrobotics.github.io/InternDataEngine-Docs/).
+## Quick Start
+
+### 1. Install
+
+```bash
+# Create conda environment
+conda create -n banana500 python=3.10
+conda activate banana500
+
+# Install CUDA 12.8 and set up Isaac Sim 5.0.0
+conda install -y cuda-toolkit=12.8
+source ~/isaacsim500/setup_conda_env.sh
+export CUDA_HOME="$CONDA_PREFIX"
+
+# Install PyTorch (CUDA 12.8)
+pip install torch==2.7.0 torchvision==0.22.0 torchaudio==2.7.0 --index-url https://download.pytorch.org/whl/cu128
+
+# Install project dependencies
+pip install -r requirements.txt
+
+# Install curobo (motion planning)
+cd workflows/simbox/curobo
+export TORCH_CUDA_ARCH_LIST="12.0+PTX"  # Set to your GPU's compute capability
+pip install -e .[isaacsim] --no-build-isolation
+cd ../../..
+```
+
+See [install.md](install.md) for the full step-by-step guide including troubleshooting.
+
+### 2. Run Data Generation
+
+```bash
+# Full pipeline: plan trajectories + render + save
+python launcher.py --config configs/simbox/de_plan_and_render_template.yaml
+```
+
+Output is saved to `output/simbox_plan_and_render/` including:
+- `demo.mp4` -- rendered video from robot cameras
+- LMDB data files for model training
+
+### 3. Available Pipeline Configs
+
+| Config | Description |
+|--------|-------------|
+| `de_plan_and_render_template.yaml` | Full pipeline: plan + render + save |
+| `de_plan_template.yaml` | Plan trajectories only (no rendering) |
+| `de_render_template.yaml` | Render from existing plans |
+| `de_plan_with_render_template.yaml` | Plan with live rendering preview |
+| `de_pipe_template.yaml` | Pipelined mode for throughput |
+
+### 4. Configuration
+
+The main config file (`configs/simbox/de_plan_and_render_template.yaml`) controls:
+
+```yaml
+simulator:
+  headless: True  # Set False for GUI debugging
+  renderer: "RayTracedLighting"  # Or "PathTracing" for higher quality
+  physics_dt: 1/30
+  rendering_dt: 1/30
+```
+
+Task configs are in `workflows/simbox/core/configs/tasks/`. The example task (`sort_the_rubbish`) demonstrates a dual-arm pick-and-place scenario.
+
+## Project Structure
+
+```
+InternDataEngine/
+  configs/simbox/  # Pipeline configuration files
+  launcher.py  # Main entry point
+  nimbus_extension/  # Nimbus framework components
+  workflows/simbox/
+    core/
+      configs/  # Task, robot, arena, camera configs
+      controllers/  # Motion planning (curobo integration)
+      skills/  # Manipulation skills (pick, place, etc.)
+      tasks/  # Task definitions
+    example_assets/  # Example USD assets (robots, objects, tables)
+    curobo/  # GPU-accelerated motion planning library
+  migerate/  # Migration tools and documentation
+  output/  # Generated data output
+```
+
+## Documentation
+
+- [Installation Guide](install.md) -- Environment setup and dependency installation
+- [Migration Guide](migerate/migerate.md) -- Isaac Sim 4.5.0 to 5.0.0 migration notes and tools
+- [Online Documentation](https://internrobotics.github.io/InternDataEngine-Docs/) -- Full API docs, tutorials, and advanced usage
 ## License and Citation
-All the code within this repo are under [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/). Please consider citing our papers if it helps your research.
+
+This project is based on [InternDataEngine](https://github.com/InternRobotics/InternDataEngine) by InternRobotics, licensed under [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/).
+
+If this project helps your research, please cite the following papers:
 ```BibTeX
 @article{tian2025interndata,
@@ -62,13 +161,3 @@ All the code within this repo are under [CC BY-NC-SA 4.0](https://creativecommon
 year={2025}
 }
 ```
-
-
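A note on the `simulator` block added under "4. Configuration": `physics_dt: 1/30` is a plain YAML scalar, and common loaders such as PyYAML's `safe_load` return it as the string `"1/30"`, not as a float. How InternDataEngine converts the value is not shown in this diff. The sketch below is one possible way to normalize such a value; `parse_dt` is a hypothetical helper name and not part of the codebase.

```python
from fractions import Fraction


def parse_dt(value):
    """Return a timestep in seconds from a number or a 'num/den' string.

    Hypothetical helper for illustration; not part of InternDataEngine.
    """
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(value, bool):
        raise TypeError("timestep must be a number or a string, not a bool")
    if isinstance(value, (int, float)):
        dt = float(value)
    elif isinstance(value, str):
        # Fraction accepts "1/30", "30", and "0.02". It raises ValueError on
        # malformed text and ZeroDivisionError on a zero denominator.
        dt = float(Fraction(value.strip()))
    else:
        raise TypeError(f"unsupported timestep type: {type(value).__name__}")
    if dt <= 0:
        raise ValueError(f"timestep must be positive, got {value!r}")
    return dt


if __name__ == "__main__":
    # Values as a YAML loader would typically hand them over (assumption).
    simulator = {"physics_dt": "1/30", "rendering_dt": "1/30"}
    for key, raw in simulator.items():
        print(key, parse_dt(raw))
```

Using `Fraction` instead of `eval` keeps the conversion restricted to numeric text, which matters if config files come from outside the repository.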
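The "Available Pipeline Configs" table lists five templates, but the README shows the launch command only for the first. The sketch below builds the equivalent command for each one. It assumes, without confirmation from this diff, that all five files live in `configs/simbox/` and are started with the same `launcher.py --config` invocation; `launcher_command` is a hypothetical helper name.

```python
import shlex

# Config names and descriptions copied from the README table.
PIPELINE_CONFIGS = {
    "de_plan_and_render_template.yaml": "Full pipeline: plan + render + save",
    "de_plan_template.yaml": "Plan trajectories only (no rendering)",
    "de_render_template.yaml": "Render from existing plans",
    "de_plan_with_render_template.yaml": "Plan with live rendering preview",
    "de_pipe_template.yaml": "Pipelined mode for throughput",
}


def launcher_command(config_name, config_dir="configs/simbox", python="python"):
    """Build the argv list for one pipeline config (assumed layout)."""
    if config_name not in PIPELINE_CONFIGS:
        raise KeyError(f"unknown pipeline config: {config_name}")
    return [python, "launcher.py", "--config", f"{config_dir}/{config_name}"]


if __name__ == "__main__":
    # Print the commands instead of running them; Isaac Sim may not be installed.
    for name, description in PIPELINE_CONFIGS.items():
        print(f"# {description}")
        print(shlex.join(launcher_command(name)))
```

Returning an argv list rather than a shell string lets the same helper feed `subprocess.run` directly if the commands are later executed from a script.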