docs: rewrite README with quick start, project structure, and migration refs

- Rewrite README with clear prerequisites, quick start guide, and project structure
- Add references to install.md and migerate/migerate.md
- Add pipeline config table and configuration examples
- Preserve original license (CC BY-NC-SA 4.0) and paper citations
- Update remote: matai as origin, merge 5.0.0 into master

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tangger
2026-04-03 11:17:43 +08:00
parent 03d9a5b909
commit 6b78ba0d6f

133
README.md

@@ -1,6 +1,8 @@
<div align="center"> <div align="center">
# InternDataEngine: Pioneering High-Fidelity Synthetic Data Generator for Robotic Manipulation # InternDataEngine
**High-Fidelity Synthetic Data Generator for Robotic Manipulation**
</div> </div>
@@ -15,30 +17,127 @@
</div> </div>
## 💻 About ## About
<div align="center"> <div align="center">
<img src="./docs/images/intern_data_engine.jpeg" alt="InternDataEngine Overview" width="80%"> <img src="./docs/images/intern_data_engine.jpeg" alt="InternDataEngine Overview" width="80%">
</div> </div>
InternDataEngine is a synthetic data generation engine for embodied AI that powers large-scale model training and iteration. Built on NVIDIA Isaac Sim, it unifies high-fidelity physical interaction from InternData-A1, semantic task and scene generation from InternData-M1, and high-throughput scheduling from the Nimbus framework to deliver realistic, task-aligned, and massively scalable robotic manipulation data. InternDataEngine is a synthetic data generation engine for embodied AI, built on NVIDIA Isaac Sim. It unifies high-fidelity physical interaction (InternData-A1), semantic task and scene generation (InternData-M1), and high-throughput scheduling (Nimbus) to deliver realistic, task-aligned, and scalable robotic manipulation data.
- **More realistic physical interaction**: Unified simulation of rigid, articulated, deformable, and fluid objects across single-arm, dual-arm, and humanoid robots, enabling long-horizon, skill-composed manipulation that better supports sim-to-real transfer. **Key capabilities:**
- **More diverse data generation**: By leveraging the internal state of the simulation engine to extract high-quality ground truth, coupled with multi-dimensional domain randomization (e.g., layout, texture, structure, and lighting), the data distribution is significantly expanded. This approach produces precise and diverse operational data, while simultaneously exporting rich multimodal annotations such as bounding boxes, segmentation masks, and keypoints.
- **More efficient large-scale production**: Nimbus-powered asynchronous pipelines that decouple planning, rendering, and storage, achieving 2-3× end-to-end throughput, cluster-level load balancing and fault tolerance for billion-scale data generation.
## 🔥 Latest News - **Realistic physical interaction** -- Unified simulation of rigid, articulated, deformable, and fluid objects across single-arm, dual-arm, and humanoid robots. Supports long-horizon, skill-composed manipulation for sim-to-real transfer.
- **Diverse data generation** -- Multi-dimensional domain randomization (layout, texture, structure, lighting) with rich multimodal annotations (bounding boxes, segmentation masks, keypoints).
- **Efficient large-scale production** -- Nimbus-powered asynchronous pipelines that decouple planning, rendering, and storage, achieving 2-3x end-to-end throughput with cluster-level load balancing and fault tolerance.
- **[2026/03]** We release the InternDataEngine codebase v1.0, which includes the core modules: InternData-A1 and Nimbus. ## Prerequisites
## 🚀 Quickstart | Dependency | Version |
|------------|---------|
| NVIDIA Isaac Sim | 5.0.0 (Kit 107.x) |
| CUDA Toolkit | >= 12.8 |
| Python | 3.10 |
| GPU | NVIDIA RTX (tested on RTX PRO 6000 Blackwell) |
Please refer to the [Installation](https://internrobotics.github.io/InternDataEngine-Docs/guides/installation.html) and [Usage](https://internrobotics.github.io/InternDataEngine-Docs/guides/quickstart.html) to start the installation and run your first synthetic data generation task. > For detailed environment setup (conda, CUDA, PyTorch, curobo), see [install.md](install.md).
>
> If migrating from Isaac Sim 4.5.0, see [migerate/migerate.md](migerate/migerate.md) for known issues and fixes.
For more details, please check [Documentation](https://internrobotics.github.io/InternDataEngine-Docs/). ## Quick Start
### 1. Install
```bash
# Create conda environment
conda create -n banana500 python=3.10
conda activate banana500
# Install CUDA 12.8 and set up Isaac Sim 5.0.0
conda install -y cuda-toolkit=12.8
source ~/isaacsim500/setup_conda_env.sh
export CUDA_HOME="$CONDA_PREFIX"
# Install PyTorch (CUDA 12.8)
pip install torch==2.7.0 torchvision==0.22.0 torchaudio==2.7.0 --index-url https://download.pytorch.org/whl/cu128
# Install project dependencies
pip install -r requirements.txt
# Install curobo (motion planning)
cd workflows/simbox/curobo
export TORCH_CUDA_ARCH_LIST="12.0+PTX" # Set to your GPU's compute capability
pip install -e .[isaacsim] --no-build-isolation
cd ../../..
```
See [install.md](install.md) for the full step-by-step guide including troubleshooting.
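The `TORCH_CUDA_ARCH_LIST` value in the install step depends on the GPU. As a minimal sketch, assuming the `major.minor+PTX` format shown there, the helper below (`arch_list`, a hypothetical name that is not part of this repo) formats a compute capability tuple into that string:

```python
def arch_list(capability):
    """Format a (major, minor) compute capability as a TORCH_CUDA_ARCH_LIST value."""
    major, minor = capability
    return f"{major}.{minor}+PTX"


if __name__ == "__main__":
    # (12, 0) corresponds to the "12.0+PTX" value used in the install step.
    # With PyTorch installed, the tuple could instead be read from
    # torch.cuda.get_device_capability().
    print(arch_list((12, 0)))
```

The printed string can then be exported as `TORCH_CUDA_ARCH_LIST` before building curobo.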
### 2. Run Data Generation
```bash
# Full pipeline: plan trajectories + render + save
python launcher.py --config configs/simbox/de_plan_and_render_template.yaml
```
Output is saved to `output/simbox_plan_and_render/` including:
- `demo.mp4` -- rendered video from robot cameras
- LMDB data files for model training
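To confirm that a run produced the expected output, a small check along these lines may help. It is only a sketch: `summarize_output` is a hypothetical helper, and because the exact LMDB file names are not listed here, it simply reports every file other than `demo.mp4`.

```python
from pathlib import Path


def summarize_output(root):
    """Report whether demo.mp4 exists and list the other files under root."""
    root = Path(root)
    if not root.is_dir():
        # Nothing generated yet (or wrong working directory)
        return {"has_video": False, "other_files": []}
    files = sorted(p for p in root.rglob("*") if p.is_file())
    return {
        "has_video": any(p.name == "demo.mp4" for p in files),
        "other_files": [
            p.relative_to(root).as_posix() for p in files if p.name != "demo.mp4"
        ],
    }


if __name__ == "__main__":
    # Run from the repository root after a data generation run
    print(summarize_output("output/simbox_plan_and_render"))
```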
### 3. Available Pipeline Configs
| Config | Description |
|--------|-------------|
| `de_plan_and_render_template.yaml` | Full pipeline: plan + render + save |
| `de_plan_template.yaml` | Plan trajectories only (no rendering) |
| `de_render_template.yaml` | Render from existing plans |
| `de_plan_with_render_template.yaml` | Plan with live rendering preview |
| `de_pipe_template.yaml` | Pipelined mode for throughput |
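Switching between these configs only changes the `--config` argument. The sketch below wraps that in Python, assuming all listed files live in `configs/simbox/` alongside the full-pipeline template and that the script is run from the repository root. `build_command` and `run_pipeline` are hypothetical helpers, not part of the project.

```python
import subprocess
import sys
from pathlib import Path

# Assumption: every config in the table sits in this directory
CONFIG_DIR = Path("configs/simbox")


def build_command(config_name):
    """Build the launcher command line for one pipeline config file."""
    config_path = (CONFIG_DIR / config_name).as_posix()
    return [sys.executable, "launcher.py", "--config", config_path]


def run_pipeline(config_name):
    """Run the launcher; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_command(config_name), check=True)


if __name__ == "__main__":
    # Print the command for a plan-only run without executing it
    print(build_command("de_plan_template.yaml"))
```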
### 4. Configuration
The main config file (`configs/simbox/de_plan_and_render_template.yaml`) controls:
```yaml
simulator:
  headless: True                  # Set False for GUI debugging
  renderer: "RayTracedLighting"   # Or "PathTracing" for higher quality
  physics_dt: 1/30
  rendering_dt: 1/30
```
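One detail worth knowing about the snippet above: a standard YAML parser reads `1/30` as the string `"1/30"`, not as a number. How the launcher converts it is not shown here. As a minimal sketch, assuming the value means a fraction of a second, a hypothetical helper `parse_dt` could handle both fraction strings and plain numbers:

```python
from fractions import Fraction


def parse_dt(value):
    """Convert a timestep given as a number or an 'a/b' string to float seconds."""
    # Fraction accepts strings such as "1/30" and "0.05"
    return float(Fraction(str(value)))


if __name__ == "__main__":
    print(parse_dt("1/30"))
    print(parse_dt(0.05))
```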
Task configs are in `workflows/simbox/core/configs/tasks/`. The example task (`sort_the_rubbish`) demonstrates a dual-arm pick-and-place scenario.
## Project Structure
```
InternDataEngine/
  configs/simbox/          # Pipeline configuration files
  launcher.py              # Main entry point
  nimbus_extension/        # Nimbus framework components
  workflows/simbox/
    core/
      configs/             # Task, robot, arena, camera configs
      controllers/         # Motion planning (curobo integration)
      skills/              # Manipulation skills (pick, place, etc.)
      tasks/               # Task definitions
    example_assets/        # Example USD assets (robots, objects, tables)
    curobo/                # GPU-accelerated motion planning library
  migerate/                # Migration tools and documentation
  output/                  # Generated data output
```
## Documentation
- [Installation Guide](install.md) -- Environment setup and dependency installation
- [Migration Guide](migerate/migerate.md) -- Isaac Sim 4.5.0 to 5.0.0 migration notes and tools
- [Online Documentation](https://internrobotics.github.io/InternDataEngine-Docs/) -- Full API docs, tutorials, and advanced usage
## License and Citation ## License and Citation
All the code within this repo are under [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/). Please consider citing our papers if it helps your research.
This project is based on [InternDataEngine](https://github.com/InternRobotics/InternDataEngine) by InternRobotics, licensed under [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/).
If this project helps your research, please cite the following papers:
```BibTeX ```BibTeX
@article{tian2025interndata, @article{tian2025interndata,
@@ -62,13 +161,3 @@ All the code within this repo are under [CC BY-NC-SA 4.0](https://creativecommon
year={2025} year={2025}
} }
``` ```
<!--
```BibTeX
@misc{interndataengine2026,
title={InternDataEngine: A Synthetic Data Generation Engine for Robotic Learning},
author={InternDataEngine Contributors},
year={2026},
}
}
``` -->