init commit

zyhe
2026-03-16 11:44:10 +00:00
commit 94384a93c9
552 changed files with 363038 additions and 0 deletions

@@ -0,0 +1,118 @@
# Cameras
Template-based cameras for simbox tasks. All cameras currently use a single generic implementation, `CustomCamera`, which is configured entirely from the task YAML.
## Available cameras
| Camera class | Notes |
|-----------------|-------|
| `CustomCamera` | Generic pinhole RGB-D camera with configurable intrinsics and pose. |

Importing `CustomCamera` in your task (e.g. `banana.py`) is enough to register it: the `@register_camera` decorator runs at import time and stores the class in `CAMERA_DICT` under its class name.
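
The registration mechanism can be sketched in isolation. The snippet below reproduces the decorator pattern from the `core.cameras.base_camera` module without Isaac Sim. `DummyCamera` is a hypothetical stand-in for `CustomCamera`, used only so the example runs anywhere, and the assertion message is an addition for readability.

```python
# Standalone sketch of the registry pattern; not the project's module itself.
CAMERA_DICT = {}


def register_camera(target_class):
    """Store the class in CAMERA_DICT under its class name and return it unchanged."""
    key = target_class.__name__
    assert key not in CAMERA_DICT, f"camera '{key}' is already registered"
    CAMERA_DICT[key] = target_class
    return target_class


@register_camera
class DummyCamera:  # hypothetical stand-in for CustomCamera
    def __init__(self, cfg):
        self.cfg = cfg


def get_camera_cls(category_name):
    """Look up a registered camera class by name."""
    return CAMERA_DICT[category_name]


# The decorator ran when the class statement was executed, so the lookup succeeds.
camera_cls = get_camera_cls("DummyCamera")
camera = camera_cls({"name": "front_camera"})
```

Registering a second class with the same name trips the assertion, so each camera class needs a unique class name.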
---
## Customizing a camera configuration
Camera behavior is controlled by the config (`cfg`) that `banana.py` passes into `CustomCamera.__init__`. You typically edit the YAML under `configs/simbox/...`.
### 1. Top-level camera fields
Each camera entry in the YAML should provide:
- **`name`**: Unique camera name (string). Used for prim paths and as the key in `task.cameras`.
- **`parent`**: Optional prim path (under the task root) that the camera mount is attached to. Empty string (`""`) means no specific parent.
- **`translation`**: Initial camera translation in world or parent frame, as a list of three floats `[x, y, z]` (meters).
- **`orientation`**: Initial camera orientation as a quaternion `[w, x, y, z]`.
- **`camera_axes`**: Axes convention passed to `set_local_pose`. Isaac Sim's `Camera.set_local_pose` takes this as a string (`"world"`, `"ros"`, or `"usd"`); follow existing configs.
These values are used in `banana.py` when calling:
```python
camera.set_local_pose(
    translation=cfg["translation"],
    orientation=cfg["orientation"],
    camera_axes=cfg["camera_axes"],
)
```
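
Quaternion component order is a common source of pose bugs, so here is a minimal sketch of producing an `orientation` value in the scalar-first `[w, x, y, z]` order described above. The helper `quat_wxyz_from_yaw` is hypothetical and only covers a rotation about the +Z axis.

```python
import math


def quat_wxyz_from_yaw(yaw_degrees):
    """Quaternion [w, x, y, z] for a rotation of yaw_degrees about the +Z axis."""
    half = math.radians(yaw_degrees) / 2.0
    return [math.cos(half), 0.0, 0.0, math.sin(half)]


identity = quat_wxyz_from_yaw(0.0)       # no rotation
quarter_turn = quat_wxyz_from_yaw(90.0)  # 90 degrees about +Z
```

Some libraries (for example SciPy's `Rotation.as_quat`) default to scalar-last `[x, y, z, w]`, so quaternions copied from such tools need reordering before they go into the YAML.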
### 2. Required `params` fields
Inside each camera config there is a `params` dict that controls the optics and intrinsics. `CustomCamera` expects:

- **`pixel_size`** (`float`, microns)
  Physical pixel size on the sensor. Used to compute horizontal/vertical aperture and focal length.
- **`f_number`** (`float`)
  Lens f-number. Used in `set_lens_aperture(f_number * 100.0)`.
- **`focus_distance`** (`float`, meters)
  Focus distance passed to `set_focus_distance`.
- **`camera_params`** (`[fx, fy, cx, cy]`)
  Intrinsic matrix parameters in pixel units:
  - `fx`, `fy`: focal lengths in x/y (pixels)
  - `cx`, `cy`: principal point (pixels)
- **`resolution_width`** (`int`)
  Image width in pixels.
- **`resolution_height`** (`int`)
  Image height in pixels.

Optional:

- **`output_mode`** (`"rgb"` or `"diffuse_albedo"`, default `"rgb"`)
  Controls which color source is used in `get_observations()`.
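
One way to catch a malformed `params` block before the simulator starts is to check it up front. The helper below is a sketch, not part of the repository: `validate_camera_params` and the example values are hypothetical, and it assumes `output_mode` lives inside `params` as described above.

```python
REQUIRED_PARAMS = (
    "pixel_size",
    "f_number",
    "focus_distance",
    "camera_params",
    "resolution_width",
    "resolution_height",
)
VALID_OUTPUT_MODES = ("rgb", "diffuse_albedo")


def validate_camera_params(params):
    """Return a list of problems; an empty list means the block looks usable."""
    problems = [f"missing required key: {key}" for key in REQUIRED_PARAMS if key not in params]

    intrinsics = params.get("camera_params")
    if intrinsics is not None and len(intrinsics) != 4:
        problems.append("camera_params must be [fx, fy, cx, cy]")

    for key in ("resolution_width", "resolution_height"):
        value = params.get(key)
        if value is not None and (not isinstance(value, int) or value <= 0):
            problems.append(f"{key} must be a positive integer")

    mode = params.get("output_mode", "rgb")
    if mode not in VALID_OUTPUT_MODES:
        problems.append(f"unsupported output_mode: {mode}")
    return problems


# Hypothetical values, chosen only for illustration.
example_params = {
    "pixel_size": 3.0,
    "f_number": 2.0,
    "focus_distance": 0.6,
    "camera_params": [600.0, 600.0, 320.0, 240.0],
    "resolution_width": 640,
    "resolution_height": 480,
}
problems = validate_camera_params(example_params)
```

Returning a list instead of raising on the first error lets a task report every problem in a YAML file in one pass.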
### 3. How the parameters are used in `CustomCamera`
Given `cfg["params"]`, `CustomCamera` does the following:

- Computes the camera apertures and focal length (in millimeters, since `pixel_size` is in microns):
  - `horizontal_aperture = pixel_size * 1e-3 * width`
  - `vertical_aperture = pixel_size * 1e-3 * height`
  - `focal_length_x = fx * pixel_size * 1e-3`
  - `focal_length_y = fy * pixel_size * 1e-3`
  - `focal_length = (focal_length_x + focal_length_y) / 2`
- Sets optical parameters:
  - `set_focal_length(focal_length / 10.0)`
  - `set_focus_distance(focus_distance)`
  - `set_lens_aperture(f_number * 100.0)`
  - `set_horizontal_aperture(horizontal_aperture / 10.0)`
  - `set_vertical_aperture(vertical_aperture / 10.0)`
  - `set_clipping_range(0.05, 1.0e5)`
  - `set_projection_type("pinhole")`
- Recomputes the intrinsic matrix `K` from the values actually set on the camera and stores it in `self.is_camera_matrix`:

  ```python
  fx = width * self.get_focal_length() / self.get_horizontal_aperture()
  fy = height * self.get_focal_length() / self.get_vertical_aperture()
  self.is_camera_matrix = np.array([[fx, 0.0, cx], [0.0, fy, cy], [0.0, 0.0, 1.0]])
  ```
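
The arithmetic above can be traced without Isaac Sim. The sketch below mirrors those formulas in plain Python under one assumption: the getters return exactly the values handed to the setters. The function name `recomputed_intrinsics` and the input numbers are hypothetical.

```python
def recomputed_intrinsics(pixel_size, camera_params, width, height):
    """Mirror the steps above and return the (fx, fy) that end up in K."""
    fx, fy, _cx, _cy = camera_params

    # Sensor size and focal length in millimeters (pixel_size is in microns).
    horizontal_aperture = pixel_size * 1e-3 * width
    vertical_aperture = pixel_size * 1e-3 * height
    focal_length = (fx * pixel_size * 1e-3 + fy * pixel_size * 1e-3) / 2

    # Values handed to the setters; assumed to come back unchanged from the getters.
    set_focal_length = focal_length / 10.0
    set_horizontal_aperture = horizontal_aperture / 10.0
    set_vertical_aperture = vertical_aperture / 10.0

    fx_out = width * set_focal_length / set_horizontal_aperture
    fy_out = height * set_focal_length / set_vertical_aperture
    return fx_out, fy_out


# Hypothetical intrinsics with fx != fy.
fx_out, fy_out = recomputed_intrinsics(
    pixel_size=3.0, camera_params=[610.0, 590.0, 320.0, 240.0], width=640, height=480
)
```

With these inputs both recomputed focal lengths come out at approximately 600, the mean of the configured `fx` and `fy`. Under the stated assumption, `K` reproduces the configured focal lengths exactly only when `fx == fy`; the factors of `pixel_size` and `10.0` cancel in the ratio.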
### 4. Outputs from `get_observations()`
`CustomCamera.get_observations()` returns a dict:
- **`color_image`**: RGB image (`H x W x 3`, float32), either from `get_rgba()` or `DiffuseAlbedo` depending on `output_mode`.
- **`depth_image`**: Depth map from `get_depth()` (same resolution as color).
- **`camera2env_pose`**: 4x4 transform from camera to environment, computed from USD prims.
- **`camera_params`**: 3x3 intrinsic matrix `K` as a Python list.
These are the values consumed by tasks (e.g. `banana.py`) for perception and planning.
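
As an illustration of how a consumer might use these outputs, the sketch below back-projects one pixel into a 3D point in the camera frame from `camera_params` and a depth value. It assumes the depth is the distance along the optical axis and uses the common computer-vision convention (x right, y down, z forward). `deproject_pixel` and the numbers are hypothetical, and mapping the point into the environment frame with `camera2env_pose` additionally depends on the camera axes convention, which is not covered here.

```python
def deproject_pixel(u, v, depth, camera_params):
    """Back-project pixel (u, v) with depth along the optical axis into camera-frame XYZ."""
    fx, cx = camera_params[0][0], camera_params[0][2]
    fy, cy = camera_params[1][1], camera_params[1][2]
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return [x, y, depth]


# Hypothetical 3x3 K as a nested list, in the same layout as obs["camera_params"].
K = [[600.0, 0.0, 320.0], [0.0, 600.0, 240.0], [0.0, 0.0, 1.0]]
point_at_center = deproject_pixel(320, 240, 1.5, K)
point_off_center = deproject_pixel(620, 240, 2.0, K)
```

A pixel at the principal point maps onto the optical axis, while a pixel 300 columns to the right at 2 m depth lands 1 m off-axis for this `K`.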
---
## Summary checklist for a new camera
To add or tweak a camera in a task YAML:
1. **Choose a `name`** and, optionally, a `parent` prim under the task root.
2. **Set pose**: `translation`, `orientation` (quaternion `[w, x, y, z]`), and `camera_axes`.
3. Under `params`, provide:
   - `pixel_size`, `f_number`, `focus_distance`
   - `camera_params = [fx, fy, cx, cy]`
   - `resolution_width`, `resolution_height`
   - optional `output_mode` (`"rgb"` or `"diffuse_albedo"`).
4. Ensure your task (e.g. `banana.py`) constructs `CustomCamera` with this `cfg` (this is already wired up in the current code).

@@ -0,0 +1,21 @@
"""Camera module initialization."""

from core.cameras.base_camera import CAMERA_DICT

from .custom_camera import CustomCamera

__all__ = [
    "CustomCamera",
    "get_camera_cls",
    "get_camera_dict",
]


def get_camera_cls(category_name):
    """Get camera class by category name."""
    return CAMERA_DICT[category_name]


def get_camera_dict():
    """Get camera dictionary."""
    return CAMERA_DICT

@@ -0,0 +1,9 @@
CAMERA_DICT = {}


def register_camera(target_class):
    # key = "_".join(re.sub(r"([A-Z0-9])", r" \1", target_class.__name__).split()).lower()
    key = target_class.__name__
    assert key not in CAMERA_DICT
    CAMERA_DICT[key] = target_class
    return target_class

@@ -0,0 +1,163 @@
import numpy as np
import omni.replicator.core as rep
from core.cameras.base_camera import register_camera
from omni.isaac.core.prims import XFormPrim
from omni.isaac.core.utils.prims import get_prim_at_path
from omni.isaac.core.utils.transformations import (
    get_relative_transform,
    pose_from_tf_matrix,
)
from omni.isaac.sensor import Camera


@register_camera
class CustomCamera(Camera):
    """Generic pinhole RGB-D camera used in simbox tasks."""

    def __init__(self, cfg, prim_path, root_prim_path, reference_path, name, *args, **kwargs):
        """
        Args:
            cfg: Config dict with required keys:
                - params: Dict containing:
                    - pixel_size: Pixel size in microns
                    - f_number: F-number
                    - focus_distance: Focus distance in meters
                    - camera_params: [fx, fy, cx, cy] camera intrinsics
                    - resolution_width: Image width
                    - resolution_height: Image height
                    - output_mode (optional): "rgb" or "diffuse_albedo"
            prim_path: Camera prim path in USD stage
            root_prim_path: Root prim path in USD stage
            reference_path: Reference prim path for camera mounting
            name: Camera name
        """
        params = cfg["params"]

        # ===== Initialize camera =====
        super().__init__(
            prim_path=prim_path,
            name=name,
            resolution=(params["resolution_width"], params["resolution_height"]),
            *args,
            **kwargs,
        )
        self.initialize()
        self.add_motion_vectors_to_frame()
        self.add_semantic_segmentation_to_frame()
        self.add_distance_to_image_plane_to_frame()

        # ===== From cfg =====
        # Required keys are indexed directly so a missing one fails with a clear KeyError.
        pixel_size = params["pixel_size"]
        f_number = params["f_number"]
        focus_distance = params["focus_distance"]
        fx, fy, cx, cy = params["camera_params"]
        width = params["resolution_width"]
        height = params["resolution_height"]
        # output_mode is documented under params; fall back to the top-level key
        # so configs written for the previous lookup keep working.
        self.output_mode = params.get("output_mode", cfg.get("output_mode", "rgb"))

        # ===== Compute and set camera parameters =====
        horizontal_aperture = pixel_size * 1e-3 * width
        vertical_aperture = pixel_size * 1e-3 * height
        focal_length_x = fx * pixel_size * 1e-3
        focal_length_y = fy * pixel_size * 1e-3
        focal_length = (focal_length_x + focal_length_y) / 2

        self.set_focal_length(focal_length / 10.0)
        self.set_focus_distance(focus_distance)
        self.set_lens_aperture(f_number * 100.0)
        self.set_horizontal_aperture(horizontal_aperture / 10.0)
        self.set_vertical_aperture(vertical_aperture / 10.0)
        self.set_clipping_range(0.05, 1.0e5)
        self.set_projection_type("pinhole")

        # Recompute the intrinsics from the values actually stored on the camera.
        fx = width * self.get_focal_length() / self.get_horizontal_aperture()
        fy = height * self.get_focal_length() / self.get_vertical_aperture()
        self.is_camera_matrix = np.array([[fx, 0.0, cx], [0.0, fy, cy], [0.0, 0.0, 1.0]])

        self.reference_path = reference_path
        self.root_prim_path = root_prim_path
        self.parent_camera_prim_path = str(self.prim.GetParent().GetPath())
        self.parent_camera_xform = XFormPrim(self.parent_camera_prim_path)

        if self.output_mode == "diffuse_albedo":
            self.add_diffuse_albedo_to_frame()

    # In the annotator helpers below, .get() covers both a key that was never
    # added and one that was reset to None by the matching remove_* method.

    def add_diffuse_albedo_to_frame(self) -> None:
        """Attach the diffuse_albedo annotator to this camera."""
        if self._custom_annotators.get("DiffuseAlbedo") is None:
            self._custom_annotators["DiffuseAlbedo"] = rep.AnnotatorRegistry.get_annotator("DiffuseAlbedo")
            self._custom_annotators["DiffuseAlbedo"].attach([self._render_product_path])
        self._current_frame["DiffuseAlbedo"] = None

    def remove_diffuse_albedo_from_frame(self) -> None:
        """Detach the diffuse_albedo annotator from this camera."""
        if self._custom_annotators.get("DiffuseAlbedo") is not None:
            self._custom_annotators["DiffuseAlbedo"].detach([self._render_product_path])
            self._custom_annotators["DiffuseAlbedo"] = None
        self._current_frame.pop("DiffuseAlbedo", None)

    def add_specular_albedo_to_frame(self) -> None:
        """Attach the specular_albedo annotator to this camera."""
        if self._custom_annotators.get("SpecularAlbedo") is None:
            self._custom_annotators["SpecularAlbedo"] = rep.AnnotatorRegistry.get_annotator("SpecularAlbedo")
            self._custom_annotators["SpecularAlbedo"].attach([self._render_product_path])
        self._current_frame["SpecularAlbedo"] = None

    def remove_specular_albedo_from_frame(self) -> None:
        """Detach the specular_albedo annotator from this camera."""
        if self._custom_annotators.get("SpecularAlbedo") is not None:
            self._custom_annotators["SpecularAlbedo"].detach([self._render_product_path])
            self._custom_annotators["SpecularAlbedo"] = None
        self._current_frame.pop("SpecularAlbedo", None)

    def add_direct_diffuse_to_frame(self) -> None:
        """Attach the direct_diffuse annotator to this camera."""
        if self._custom_annotators.get("DirectDiffuse") is None:
            self._custom_annotators["DirectDiffuse"] = rep.AnnotatorRegistry.get_annotator("DirectDiffuse")
            self._custom_annotators["DirectDiffuse"].attach([self._render_product_path])
        self._current_frame["DirectDiffuse"] = None

    def remove_direct_diffuse_from_frame(self) -> None:
        """Detach the direct_diffuse annotator from this camera."""
        if self._custom_annotators.get("DirectDiffuse") is not None:
            self._custom_annotators["DirectDiffuse"].detach([self._render_product_path])
            self._custom_annotators["DirectDiffuse"] = None
        self._current_frame.pop("DirectDiffuse", None)

    def add_indirect_diffuse_to_frame(self) -> None:
        """Attach the indirect_diffuse annotator to this camera."""
        if self._custom_annotators.get("IndirectDiffuse") is None:
            self._custom_annotators["IndirectDiffuse"] = rep.AnnotatorRegistry.get_annotator("IndirectDiffuse")
            self._custom_annotators["IndirectDiffuse"].attach([self._render_product_path])
        self._current_frame["IndirectDiffuse"] = None

    def remove_indirect_diffuse_from_frame(self) -> None:
        """Detach the indirect_diffuse annotator from this camera."""
        if self._custom_annotators.get("IndirectDiffuse") is not None:
            self._custom_annotators["IndirectDiffuse"].detach([self._render_product_path])
            self._custom_annotators["IndirectDiffuse"] = None
        self._current_frame.pop("IndirectDiffuse", None)

    def get_observations(self):
        """Return color, depth, camera pose, and intrinsics for the current frame."""
        if self.reference_path:
            # Keep the camera mount aligned with the reference prim.
            camera_mount2env_pose = get_relative_transform(
                get_prim_at_path(self.reference_path), get_prim_at_path(self.root_prim_path)
            )
            camera_mount2env_pose = pose_from_tf_matrix(camera_mount2env_pose)
            self.parent_camera_xform.set_local_pose(
                translation=camera_mount2env_pose[0],
                orientation=camera_mount2env_pose[1],
            )

        camera2env_pose = get_relative_transform(
            get_prim_at_path(self.prim_path), get_prim_at_path(self.root_prim_path)
        )

        if self.output_mode == "rgb":
            color_image = self.get_rgba()[..., :3]
        elif self.output_mode == "diffuse_albedo":
            color_image = self._custom_annotators["DiffuseAlbedo"].get_data()[..., :3]
        else:
            raise NotImplementedError(f"Unsupported output_mode: {self.output_mode}")

        obs = {
            "color_image": color_image,
            "depth_image": self.get_depth(),
            "camera2env_pose": camera2env_pose,
            "camera_params": self.is_camera_matrix.tolist(),
        }
        return obs