diff --git a/CLAUDE.md b/CLAUDE.md
index 483a62e..f987640 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -1,15 +1,16 @@
 # MindBot Project
 
-An Isaac Lab based simulation and learning framework for the MindRobot mobile manipulator, supporting reinforcement learning (RL) and imitation learning (IL).
+An Isaac Lab based simulation and learning framework for the MindRobot mobile manipulator, supporting reinforcement learning (RL), imitation learning (IL), and XR teleoperation.
 
 ## Requirements
 
 - **Isaac Sim**: 5.1.0
 - **Python**: 3.11
+- **GPU**: NVIDIA RTX PRO 6000 96GB (dev machine); enabling DLSS is recommended
 - **Dependencies**: isaaclab, isaaclab_assets, isaaclab_mimic, isaaclab_rl, isaaclab_tasks
 - **Conda environments**:
   - `env_isaaclab` — primary development environment
-  - `xr` — a second environment that also contains isaaclab/isaacsim; either can run Isaac Lab scripts
+  - `xr` — XR teleoperation + sim streaming environment; contains isaaclab/isaacsim + PyAV
 
 **IMPORTANT**: All Isaac Lab scripts must be launched through `~/IsaacLab/isaaclab.sh -p`. Running them with plain `python` fails with `KeyError: 'EXP_PATH'` (that variable is only set by `isaaclab.sh` at startup).
 
@@ -43,8 +44,19 @@
 python scripts/rsl_rl/play.py --task Template-Mindbot-v0
 
 # Teleoperation (SpaceMouse / keyboard)
 python scripts/environments/teleoperation/teleop_se3_agent.py --task Isaac-MindRobot-LeftArm-IK-Rel-v0
 
-# Teleoperation (XR device)
-python scripts/environments/teleoperation/teleop_xr_agent.py --task Isaac-MindRobot-LeftArm-IK-Rel-v0
+# Teleoperation (XR dual arm + head control)
+python scripts/environments/teleoperation/teleop_xr_agent.py --task Isaac-MindRobot-2i-DualArm-IK-Abs-v0
+
+# Teleoperation (XR dual arm, no head)
+python scripts/environments/teleoperation/teleop_xr_agent.py --task Isaac-MindRobot-DualArm-IK-Abs-v0
+
+# Teleoperation + streaming the sim view to a PICO VR headset (requires --enable_cameras)
+python scripts/environments/teleoperation/teleop_xr_agent.py \
+    --task Isaac-MindRobot-2i-DualArm-IK-Abs-v0 \
+    --stream-to <headset-ip> --enable_cameras
+
+# Streaming smoke test (no Isaac Lab needed; run directly in the xr environment)
+conda run -n xr python scripts/test_sim_streamer.py --host <headset-ip>
 
 # Code formatting
 pre-commit run --all-files
@@ -55,7 +67,9 @@ pre-commit run --all-files
 ```
 
 ## Project Structure
 
 ```
 source/mindbot/mindbot/
 ├── robot/mindbot.py           # MindRobot joint/actuator config (MINDBOT_CFG)
-├── utils/assets.py            # USD asset paths (MINDBOT_ASSETS_DIR)
+├── utils/
+│   ├── assets.py              # USD asset paths (MINDBOT_ASSETS_DIR)
+│   └── sim_video_streamer.py  # sim→VR H264 streaming module (PyAV encoding + TCP)
 └── tasks/manager_based/
     ├── rl/                    # reinforcement learning tasks
     │   ├── mindbot/           # Template-Mindbot-v0 (grasping)
@@ -64,19 +78,75 @@ source/mindbot/mindbot/
 │   └── pullUltrasoundLidUp/   # Pull-Ultrasound-Lid-Up-v0
 └── il/                        # imitation learning tasks
     ├── demo/                  # demo tasks (H1, Franka)
-    └── open_drybox/           # Isaac-MindRobot-LeftArm-IK-{Rel,Abs}-v0
+    └── open_drybox/           # dual-arm teleoperation tasks
+
+scripts/environments/teleoperation/
+├── teleop_xr_agent.py         # PICO XR teleoperation (dual/single arm, optional sim streaming)
+├── teleop_se3_agent.py        # SpaceMouse / keyboard teleoperation
+└── test_sim_streamer.py       # standalone streaming test (red/blue test pattern)
+
+deps/
+├── XRoboToolkit-Teleop-Sample-Python/  # XR SDK Python wrapper
+├── XRoboToolkit-RobotVision-PC/        # physical ZED camera capture + H264 streaming (C++)
+└── zed-isaac-sim/                      # ZED simulation extension (Isaac Sim OGN)
 ```
 
 ## Registered Tasks
 
-| Task ID | Type | Description |
-|---|---|---|
-| `Template-Mindbot-v0` | RL | MindRobot grasping |
-| `Template-centrifuge-lidup-v0` | RL | centrifuge rotation |
-| `Template-Pull-v0` | RL | object pulling |
-| `Pull-Ultrasound-Lid-Up-v0` | RL | ultrasound machine lid pulling |
-| `Isaac-MindRobot-LeftArm-IK-Rel-v0` | IL | left arm relative IK control |
-| `Isaac-MindRobot-LeftArm-IK-Abs-v0` | IL | left arm absolute IK control |
+| Task ID | Type | Action dim | Description |
+|---|---|---|---|
+| `Template-Mindbot-v0` | RL | - | MindRobot grasping |
+| `Template-centrifuge-lidup-v0` | RL | - | centrifuge rotation |
+| `Template-Pull-v0` | RL | - | object pulling |
+| `Pull-Ultrasound-Lid-Up-v0` | RL | - | ultrasound machine lid pulling |
+| `Isaac-MindRobot-LeftArm-IK-Rel-v0` | IL | 12D | left arm relative IK control |
+| `Isaac-MindRobot-LeftArm-IK-Abs-v0` | IL | 12D | left arm absolute IK control |
+| `Isaac-MindRobot-DualArm-IK-Abs-v0` | IL | 20D | dual-arm IK (no head) |
+| `Isaac-MindRobot-2i-DualArm-IK-Abs-v0` | IL | 22D | dual-arm IK + head (with VR streaming cameras) |
+
+## XR Teleoperation
+
+### teleop_xr_agent.py features
+
+- **Dual-arm absolute IK control**: the left/right PICO controllers drive the left/right arms
+- **Grip button**: hold to move the arm; release to lock the joints (the root-frame IK command is cached)
+- **Left joystick**: chassis forward/back and turning (direct velocity control, bypassing friction limits)
+- **Left controller Y button**: reset the environment (the right controller B button does not trigger a reset)
+- **Head tracking**: HMD yaw/pitch mapped to the robot head joints (2i tasks only)
+- **`--stream-to`**: stream the simulated stereo camera view to the PICO VR display in real time
+
+### Sim→VR streaming architecture
+
+```
+Isaac Lab env.step()
+  → vr_left_eye / vr_right_eye (CameraCfg, 1280x720)
+  → .cpu().numpy() (GPU→CPU)
+  → SimVideoStreamer (background thread: np.concatenate SBS → PyAV H264 → TCP)
+  → PICO Unity Client (TCP server :12345, H264 decode + per-eye VR display)
+```
+
+- TCP protocol: 4-byte big-endian length header + H264 NALU (same as RobotVisionConsole)
+- The Unity client needs no modification; either the SIM-Stereo or the ZEDMINI config works
+- The `video_source.yml` on the PICO can be hot-updated via adb push:
+  ```bash
+  adb push Assets/StreamingAssets/video_source.yml \
+      /sdcard/Android/data/com.xrobotoolkit.client/files/video_source.yml
+  adb shell am force-stop com.xrobotoolkit.client
+  ```
+
+### Performance tuning
+
+- **decimation=4** (25 Hz control rate, balancing FPS and responsiveness)
+- Disable the debug viewports while streaming to save GPU (off by default with `--stream-to`; `--debug-viewports` turns them on)
+- Enable **DLSS + Frame Generation** in Isaac Sim to raise the render frame rate
+- Encoding runs asynchronously on a background thread and does not block the main sim loop
+
+### IPD problem with the physical ZED 2i camera
+
+The ZED 2i baseline (120 mm) is far larger than the human IPD (65 mm); displaying its feed directly in VR produces excessive disparity at every distance, making 3D fusion impossible.
+- The shader's `stereoOffset` is only a toe-in correction and is valid for a single fixed distance
+- PC-side depth-based correction (DIBR) is theoretically correct but produces severe artifacts
+- **Proper fix**: use virtual cameras with a 65 mm baseline in simulation, or a ZED Mini (63 mm ≈ human IPD)
 
 ## Conventions for Adding Tasks
 
@@ -119,12 +189,13 @@ tasks/manager_based/il//
 | Left gripper | left_hand_joint_left/right |
 | Right gripper | right_hand_joint_left/right |
 | Torso lift | PrismaticJoint |
-| Head | head_revoluteJoint |
+| Head | head_revoluteJoint, head_pitch_Joint |
 | Chassis wheels | front/back revolute joints |
 
 ## Teleoperation Data Collection
 
-- In-sim teleoperation scripts use the `env_isaaclab` environment
-- XR device access depends on `deps/XRoboToolkit-Teleop-Sample-Python/` and requires switching to the `xr` environment
+- In-sim teleoperation scripts use the `xr` environment (contains isaaclab + the XR SDK)
+- XR device access depends on `deps/XRoboToolkit-Teleop-Sample-Python/`
 - Demonstration data formats are compatible with LeRobot / RoboMimic
 - Multi-camera setup: cam_head, cam_chest, cam_left_hand, cam_right_hand, cam_top, cam_side
+- VR stereo cameras: vr_left_eye, vr_right_eye (ZED_X CameraLeft/CameraRight)
diff --git a/deps/XRoboToolkit-Unity-Client b/deps/XRoboToolkit-Unity-Client
new file mode 160000
index 0000000..c932609
--- /dev/null
+++ b/deps/XRoboToolkit-Unity-Client
@@ -0,0 +1 @@
+Subproject commit c9326092ff4d11e8b507b041713194b93470a8e1
diff --git a/scripts/environments/teleoperation/teleop_xr_agent copy.py b/scripts/environments/teleoperation/teleop_xr_agent copy.py
new file mode 100644
index 0000000..1ba5972
--- /dev/null
+++ b/scripts/environments/teleoperation/teleop_xr_agent copy.py
@@ -0,0 +1,856 @@
+#!/usr/bin/env python3
+# Copyright (c) 2022-2026, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
+# All rights reserved.
+#
+# SPDX-License-Identifier: BSD-3-Clause
+
+"""
+Script to run teleoperation with Isaac Lab manipulation environments using PICO XR Controllers.
+
+Only absolute IK mode is supported (Isaac-MindRobot-LeftArm-IK-Abs-v0).
+The controller computes delta pose in Isaac Sim world frame and accumulates
+an absolute EEF target, which is then converted to robot root frame before
+being sent to the DifferentialIK action manager.
+"""
+
+import argparse
+import logging
+import os
+import sys
+from collections.abc import Callable
+
+# Ensure xr_utils (next to this script) is importable when running directly
+sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
+
+from isaaclab.app import AppLauncher
+
+logger = logging.getLogger(__name__)
+
+# add argparse arguments
+parser = argparse.ArgumentParser(
+    description="Teleoperation for Isaac Lab environments with PICO XR Controller (absolute IK mode only)."
+)
+parser.add_argument("--num_envs", type=int, default=1, help="Number of environments to simulate.")
+parser.add_argument(
+    "--task",
+    type=str,
+    default="Isaac-MindRobot-LeftArm-IK-Abs-v0",
+    help="Name of the task (must be an Abs IK task).",
+)
+parser.add_argument(
+    "--sensitivity", type=float, default=None,
+    help="Set both pos and rot sensitivity (overridden by --pos_sensitivity/--rot_sensitivity).",
+)
+parser.add_argument("--pos_sensitivity", type=float, default=None, help="Position sensitivity. Default: 1.0.")
+parser.add_argument("--rot_sensitivity", type=float, default=None, help="Rotation sensitivity. Default: 0.3.")
+parser.add_argument("--arm", type=str, default="left", choices=["left", "right"], help="Which arm/controller to use.")
+parser.add_argument("--base_speed", type=float, default=5.0, help="Max wheel speed (rad/s) for joystick full forward.")
+parser.add_argument("--base_turn", type=float, default=2.0, help="Max wheel differential (rad/s) for joystick full left/right.")
+parser.add_argument("--drive_speed", type=float, default=0.5, help="Max robot linear speed (m/s) for direct base control. Default: 0.5")
+parser.add_argument("--drive_turn", type=float, default=1.5, help="Max robot yaw rate (rad/s) for direct base control. Default: 1.5")
+parser.add_argument(
+    "--stream-to", type=str, default=None, dest="stream_to",
+    help="PICO VR headset IP for sim stereo streaming. Streams H264 SBS to the Unity client TCP server. "
+    "Requires vr_left_eye/vr_right_eye cameras in the env scene config.",
+)
+parser.add_argument("--stream-port", type=int, default=12345, dest="stream_port", help="TCP port for video streaming.")
+parser.add_argument("--stream-bitrate", type=int, default=20_000_000, dest="stream_bitrate", help="H264 bitrate (bps).")
+parser.add_argument(
+    "--debug-viewports", action="store_true", dest="debug_viewports", default=False,
+    help="Enable debug viewport windows (Left Hand, Right Hand, Head, Chest). Off by default when --stream-to is used.",
+)
+
+AppLauncher.add_app_launcher_args(parser)
+args_cli = parser.parse_args()
+app_launcher_args = vars(args_cli)
+app_launcher_args["xr"] = False
+
+app_launcher = AppLauncher(app_launcher_args)
+simulation_app = app_launcher.app
+
+"""Rest everything follows."""
+
+import gymnasium as gym
+import numpy as np
+import torch
+from scipy.spatial.transform import Rotation as R
+
+from isaaclab.envs import ManagerBasedRLEnvCfg
+
+import isaaclab_tasks  # noqa: F401
+import mindbot.tasks  # noqa: F401
+from isaaclab_tasks.utils import parse_env_cfg
+
+from xr_utils import XrClient, transform_xr_pose, quat_diff_as_rotvec_xyzw, is_valid_quaternion
+from xr_utils.geometry import R_HEADSET_TO_WORLD
+
+
+# =====================================================================
+# Robot Camera Viewport Utilities
+# =====================================================================
+
+# Resolved prim paths for env_0 — must match CameraCfg prim_path with
+# {ENV_REGEX_NS} → /World/envs/env_0
+_ROBOT_CAM_PRIMS: dict[str, str] = {
+    "Left Hand": "/World/envs/env_0/Robot/rm_65_fb_left/Link_6/robot___left_8_02/cam_left_hand",
+    "Right Hand": "/World/envs/env_0/Robot/rm_65_b_right/Link6/robot___right_8_2/cam_right_hand",
+    "Head": "/World/envs/env_0/Robot/robot_head/ZED_X/base_link/ZED_X/CameraLeft",
+    "Chest": "/World/envs/env_0/Robot/robot_trunk/cam_chest",
+}
+
+# Stores created viewport window objects
+_robot_viewports: dict[str, object] = {}
+
+
+def create_robot_viewports() -> None:
+    """Create 4 viewport windows and bind each to a robot camera prim.
+
+    Must be called after env.reset() so all prims exist on the USD stage.
+    """
+    try:
+        import omni.kit.viewport.utility as vp_util
+    except ImportError:
+        logger.warning("[Viewport] omni.kit.viewport.utility not available — skipping viewport creation.")
+        return
+
+    for name, cam_path in _ROBOT_CAM_PRIMS.items():
+        vp_win = vp_util.create_viewport_window(
+            f"Robot {name} View", width=640, height=360
+        )
+        vp_win.viewport_api.camera_path = cam_path
+        _robot_viewports[name] = vp_win
+        print(f"[INFO] Viewport 'Robot {name} View' bound to: {cam_path}")
+
+
+def rebind_robot_viewports() -> None:
+    """Re-bind all robot viewports to their camera paths.
+
+    Call this after every env reset so viewports stay locked to the cameras.
+    """
+    for name, vp_win in _robot_viewports.items():
+        cam_path = _ROBOT_CAM_PRIMS[name]
+        vp_win.viewport_api.camera_path = cam_path
+
+
+# =====================================================================
+# Helpers
+# =====================================================================
+
+def _quat_wxyz_to_rotation(quat_wxyz: np.ndarray) -> R:
+    """Convert Isaac-style wxyz quaternion to scipy Rotation."""
+    return R.from_quat([quat_wxyz[1], quat_wxyz[2], quat_wxyz[3], quat_wxyz[0]])
+
+
+def _rotation_to_quat_wxyz(rot: R) -> np.ndarray:
+    """Convert scipy Rotation to Isaac-style wxyz quaternion."""
+    q = rot.as_quat()
+    return np.array([q[3], q[0], q[1], q[2]])
+
+
+def world_pose_to_root_frame(
+    pos_w: np.ndarray,
+    quat_wxyz: np.ndarray,
+    root_pos_w: np.ndarray,
+    root_quat_wxyz: np.ndarray,
+) -> tuple[np.ndarray, np.ndarray]:
+    """Express a world-frame pose in the robot root frame."""
+    root_rot = _quat_wxyz_to_rotation(root_quat_wxyz)
+    pose_rot = _quat_wxyz_to_rotation(quat_wxyz)
+    pos_root = root_rot.inv().apply(pos_w - root_pos_w)
+    quat_root = _rotation_to_quat_wxyz(root_rot.inv() * pose_rot)
+    return pos_root, quat_root
+
+
+# =====================================================================
+# Teleoperation Controller
+# =====================================================================
+
+class XrTeleopController:
+    """Teleop controller for PICO XR headset (absolute IK mode).
+
+    Accumulates an absolute EEF target in Isaac Sim world frame.
+    Grip button acts as clutch; B/Y buttons trigger environment reset.
+    """
+
+    def __init__(self, arm="left", pos_sensitivity=1.0, rot_sensitivity=0.3,
+                 max_pos_per_step=0.05, max_rot_per_step=0.08, xr_client=None):
+        self.xr_client = xr_client if xr_client is not None else XrClient()
+        self._owns_client = xr_client is None  # only close if we created it
+        self.pos_sensitivity = pos_sensitivity
+        self.rot_sensitivity = rot_sensitivity
+        # Hard caps per physics step to prevent IK divergence.
+        # max_rot_per_step ~4.6°, max_pos_per_step 5 cm.
+        self.max_pos_per_step = max_pos_per_step
+        self.max_rot_per_step = max_rot_per_step
+        self.arm = arm
+
+        self.controller_name = "left_controller" if arm == "left" else "right_controller"
+        self.grip_name = "left_grip" if arm == "left" else "right_grip"
+        self.trigger_name = "left_trigger" if arm == "left" else "right_trigger"
+
+        self.R_headset_world = R_HEADSET_TO_WORLD
+
+        self.prev_xr_pos = None
+        self.prev_xr_quat = None
+        self.target_eef_pos = None   # world frame
+        self.target_eef_quat = None  # world frame, wxyz
+
+        self.grip_active = False
+        self.frame_count = 0
+        self.reset_button_latched = False
+        self.require_grip_reengage = False
+        self.grip_engage_threshold = 0.8
+        self.grip_release_threshold = 0.2
+        self.callbacks = {}
+
+    def add_callback(self, name: str, func: Callable):
+        self.callbacks[name] = func
+
+    def reset(self):
+        self.prev_xr_pos = None
+        self.prev_xr_quat = None
+        self.grip_active = False
+        self.frame_count = 0
+        self.target_eef_pos = None
+        self.target_eef_quat = None
+        # Require grip release after reset to avoid driving to stale pose.
+        self.require_grip_reengage = True
+
+    def close(self):
+        if self._owns_client:
+            self.xr_client.close()
+
+    def get_zero_action(self, trigger, current_eef_pos=None, current_eef_quat=None) -> torch.Tensor:
+        """Return 8D hold action [x, y, z, qw, qx, qy, qz, gripper] at frozen target pose."""
+        action = torch.zeros(8)
+        if self.target_eef_pos is not None and self.target_eef_quat is not None:
+            # Prefer frozen world-frame target so IK holds a fixed world position
+            # even if the robot chassis drifts slightly under gravity.
+            action[:3] = torch.tensor(self.target_eef_pos.copy())
+            action[3:7] = torch.tensor(self.target_eef_quat.copy())
+        elif current_eef_pos is not None and current_eef_quat is not None:
+            action[:3] = torch.tensor(current_eef_pos.copy())
+            action[3:7] = torch.tensor(current_eef_quat.copy())
+        else:
+            action[3] = 1.0  # identity quaternion
+        action[7] = 1.0 if trigger > 0.5 else -1.0
+        return action
+
+    def advance(self, current_eef_pos=None, current_eef_quat=None) -> torch.Tensor:
+        """Read XR controller and return 8D absolute action [x, y, z, qw, qx, qy, qz, gripper].
+
+        Position and quaternion are in Isaac Sim world frame.
+        Caller must convert to robot root frame before sending to IK.
+        """
+        try:
+            # Only left controller Y button resets (not right controller B)
+            if self.arm == "left":
+                reset_pressed = self.xr_client.get_button("Y")
+            else:
+                reset_pressed = False
+            if reset_pressed and not self.reset_button_latched:
+                if "RESET" in self.callbacks:
+                    self.callbacks["RESET"]()
+            self.reset_button_latched = reset_pressed
+        except Exception:
+            pass
+
+        try:
+            raw_pose = self.xr_client.get_pose(self.controller_name)
+            grip = self.xr_client.get_key_value(self.grip_name)
+            trigger = self.xr_client.get_key_value(self.trigger_name)
+        except Exception:
+            return self.get_zero_action(0.0, current_eef_pos, current_eef_quat)
+
+        if not is_valid_quaternion(raw_pose[3:]):
+            return self.get_zero_action(trigger, current_eef_pos, current_eef_quat)
+
+        # Wait for grip release after reset
+        if self.require_grip_reengage:
+            if grip <= self.grip_release_threshold:
+                self.require_grip_reengage = False
+            else:
+                if current_eef_pos is not None and current_eef_quat is not None:
+                    self.target_eef_pos = current_eef_pos.copy()
+                    self.target_eef_quat = current_eef_quat.copy()
+                return self.get_zero_action(trigger, current_eef_pos, current_eef_quat)
+
+        # Grip button as clutch with hysteresis
+        grip_pressed = grip > self.grip_release_threshold if self.grip_active else grip >= self.grip_engage_threshold
+
+        if not grip_pressed:
+            self.prev_xr_pos = None
+            self.prev_xr_quat = None
+            # Only snapshot target on the transition frame (grip just released) or first ever frame.
+            # After that, keep it frozen so IK holds a fixed world-frame position.
+            if self.grip_active or self.target_eef_pos is None:
+                if current_eef_pos is not None and current_eef_quat is not None:
+                    self.target_eef_pos = current_eef_pos.copy()
+                    self.target_eef_quat = current_eef_quat.copy()
+            self.grip_active = False
+            return self.get_zero_action(trigger, current_eef_pos, current_eef_quat)
+
+        if not self.grip_active:
+            self.prev_xr_pos = raw_pose[:3].copy()
+            self.prev_xr_quat = raw_pose[3:].copy()
+            if current_eef_pos is not None and current_eef_quat is not None:
+                self.target_eef_pos = current_eef_pos.copy()
+                self.target_eef_quat = current_eef_quat.copy()
+            self.grip_active = True
+            return self.get_zero_action(trigger, current_eef_pos, current_eef_quat)
+
+        # Project current and previous XR poses into Isaac Sim world frame
+        xr_world_pos, xr_world_quat_xyzw = transform_xr_pose(raw_pose[:3], raw_pose[3:])
+        prev_xr_world_pos, prev_xr_world_quat_xyzw = transform_xr_pose(self.prev_xr_pos, self.prev_xr_quat)
+
+        # Delta position (world frame), clamped per step
+        world_delta_pos = (xr_world_pos - prev_xr_world_pos) * self.pos_sensitivity
+        pos_norm = np.linalg.norm(world_delta_pos)
+        if pos_norm > self.max_pos_per_step:
+            world_delta_pos *= self.max_pos_per_step / pos_norm
+
+        # Delta rotation (world frame), clamped per step
+        world_delta_rot = quat_diff_as_rotvec_xyzw(prev_xr_world_quat_xyzw, xr_world_quat_xyzw) * self.rot_sensitivity
+        rot_norm = np.linalg.norm(world_delta_rot)
+        if rot_norm > self.max_rot_per_step:
+            world_delta_rot *= self.max_rot_per_step / rot_norm
+
+        gripper_action = 1.0 if trigger > 0.5 else -1.0
+
+        # Accumulate absolute target
+        if self.target_eef_pos is None:
+            self.target_eef_pos = np.zeros(3)
+            self.target_eef_quat = np.array([1.0, 0.0, 0.0, 0.0])
+
+        self.target_eef_pos += world_delta_pos
+
+        # Apply rotation delta in world frame so controller direction = EEF world direction
+        target_r = _quat_wxyz_to_rotation(self.target_eef_quat)
+        self.target_eef_quat = _rotation_to_quat_wxyz(R.from_rotvec(world_delta_rot) * target_r)
+
+        self.prev_xr_pos = raw_pose[:3].copy()
+        self.prev_xr_quat = raw_pose[3:].copy()
+        self.frame_count += 1
+
+        action = torch.tensor([
+            *self.target_eef_pos,
+            *self.target_eef_quat,
+            gripper_action,
+        ], dtype=torch.float32)
+
+        if self.frame_count % 30 == 0:
+            np.set_printoptions(precision=4, suppress=True, floatmode='fixed')
+            print("\n====================== [VR TELEOP DEBUG] ======================")
+            print(f"| Raw VR Pos (OpenXR):     {np.array(raw_pose[:3])}")
+            print(f"| Raw VR Quat (xyzw):      {np.array(raw_pose[3:])}")
+            print(f"| XR Delta Pos (world):    {world_delta_pos} (norm={np.linalg.norm(world_delta_pos):.4f})")
+            print(f"| XR Delta Rot (world):    {world_delta_rot} (norm={np.linalg.norm(world_delta_rot):.4f})")
+            print(f"| Targ Pos (world):        {action[:3].numpy()}")
+            print(f"| Targ Quat (world, wxyz): {action[3:7].numpy()}")
+            print(f"| Gripper & Trigger:       Grip={grip:.2f}, Trig={trigger:.2f}")
+            print("===============================================================")
+
+        return action
+
+
+# =====================================================================
+# Head Tracking Utility
+# =====================================================================
+
+def get_head_joint_targets(
+    xr_client, neutral_rot: R | None
+) -> tuple[np.ndarray, R | None]:
+    """Extract head joint targets [yaw, pitch] from HMD pose.
+
+    At first call (neutral_rot is None) the current headset orientation
+    is recorded as the neutral reference and zeros are returned.
+    Subsequent calls return yaw/pitch relative to that neutral pose.
+
+    Returns:
+        (joint_targets, neutral_rot) where joint_targets is shape (2,)
+        float32 [head_yaw_Joint_rad, head_pitch_Joint_rad].
+    """
+    try:
+        raw_pose = xr_client.get_pose("headset")
+        if not is_valid_quaternion(raw_pose[3:]):
+            return np.zeros(2, dtype=np.float32), neutral_rot
+        _, head_world_quat_xyzw = transform_xr_pose(raw_pose[:3], raw_pose[3:])
+        head_rot = R.from_quat(head_world_quat_xyzw)
+        if neutral_rot is None:
+            # First call — record neutral and hold zero
+            return np.zeros(2, dtype=np.float32), head_rot
+        # Relative rotation from neutral heading
+        rel_rot = head_rot * neutral_rot.inv()
+        # ZYX Euler: [0]=yaw (Z), [1]=pitch (Y)
+        euler_zyx = rel_rot.as_euler("ZYX")
+        yaw = float(np.clip(euler_zyx[0], -1.57, 1.57))    # ±90°
+        pitch = float(np.clip(euler_zyx[1], -0.52, 0.52))  # ±30°
+        return np.array([yaw, pitch], dtype=np.float32), neutral_rot
+    except Exception:
+        return np.zeros(2, dtype=np.float32), neutral_rot
+
+
+# =====================================================================
+# Main Execution Loop
+# =====================================================================
+
+def main() -> None:
+    """Run teleoperation with PICO XR Controller against Isaac Lab environment."""
+
+    env_cfg = parse_env_cfg(args_cli.task, num_envs=args_cli.num_envs)
+    env_cfg.env_name = args_cli.task
+
+    if not isinstance(env_cfg, ManagerBasedRLEnvCfg):
+        raise ValueError(f"Teleoperation requires ManagerBasedRLEnvCfg. Got: {type(env_cfg)}")
+    if "Abs" not in args_cli.task:
+        raise ValueError(
+            f"Task '{args_cli.task}' is not an absolute IK task. "
+            "Only Abs tasks are supported (e.g. Isaac-MindRobot-LeftArm-IK-Abs-v0)."
+        )
+
+    env_cfg.terminations.time_out = None
+
+    try:
+        env = gym.make(args_cli.task, cfg=env_cfg).unwrapped
+    except Exception as e:
+        logger.error(f"Failed to create environment '{args_cli.task}': {e}")
+        simulation_app.close()
+        return
+
+    # Detect single-arm vs dual-arm based on action manager term names
+    is_dual_arm = "left_arm_action" in env.action_manager._terms
+    has_head = "head_action" in env.action_manager._terms
+
+    # -- Sim-to-VR stereo streaming setup --
+    streamer = None
+    if args_cli.stream_to:
+        try:
+            from mindbot.utils.sim_video_streamer import SimVideoStreamer
+            # Get camera resolution from scene config (vr_left_eye)
+            scene = env.unwrapped.scene
+            if "vr_left_eye" not in scene.sensors:
+                logger.warning("[Stream] vr_left_eye camera not found in scene. Streaming disabled.")
+            else:
+                cam = scene["vr_left_eye"]
+                cam_w = cam.cfg.width
+                cam_h = cam.cfg.height
+                streamer = SimVideoStreamer(
+                    host=args_cli.stream_to,
+                    port=args_cli.stream_port,
+                    width=cam_w,
+                    height=cam_h,
+                    fps=30,
+                    bitrate=args_cli.stream_bitrate,
+                )
+                if streamer.connect():
+                    print(f"[INFO] Sim stereo streaming to {args_cli.stream_to}:{args_cli.stream_port} "
+                          f"({cam_w*2}x{cam_h} SBS @ 30fps)")
+                else:
+                    logger.error("[Stream] Failed to connect. Streaming disabled.")
+                    streamer = None
+        except ImportError:
+            logger.error("[Stream] mindbot.utils.sim_video_streamer not found. pip install av?")
+        except Exception as e:
+            logger.error(f"[Stream] Init failed: {e}")
+
+    # Sensitivity: explicit args override --sensitivity, which overrides defaults
+    pos_sens = 1.0
+    rot_sens = 0.3
+    if args_cli.sensitivity is not None:
+        pos_sens = args_cli.sensitivity
+        rot_sens = args_cli.sensitivity
+    if args_cli.pos_sensitivity is not None:
+        pos_sens = args_cli.pos_sensitivity
+    if args_cli.rot_sensitivity is not None:
+        rot_sens = args_cli.rot_sensitivity
+
+    if is_dual_arm:
+        print("\n[INFO] Dual-arm mode detected. Using both controllers.")
+        print("[INFO] IK Mode: ABSOLUTE")
+        print(f"[INFO] Sensitivity: pos={pos_sens:.3f} rot={rot_sens:.3f}")
+        shared_client = XrClient()
+        teleop_left = XrTeleopController(arm="left", pos_sensitivity=pos_sens, rot_sensitivity=rot_sens, xr_client=shared_client)
+        teleop_right = XrTeleopController(arm="right", pos_sensitivity=pos_sens, rot_sensitivity=rot_sens, xr_client=shared_client)
+        teleop_interface = teleop_left  # primary interface for callbacks/reset
+        teleop_right_ref = teleop_right
+    else:
+        print(f"\n[INFO] Connecting to PICO XR Headset using {args_cli.arm} controller...")
+        print("[INFO] IK Mode: ABSOLUTE")
+        print(f"[INFO] Sensitivity: pos={pos_sens:.3f} rot={rot_sens:.3f}")
+        teleop_interface = XrTeleopController(
+            arm=args_cli.arm,
+            pos_sensitivity=pos_sens,
+            rot_sensitivity=rot_sens,
+        )
+
+    should_reset = False
+
+    def request_reset():
+        nonlocal should_reset
+        should_reset = True
+        print("[INFO] Reset requested via XR button.")
+
+    teleop_interface.add_callback("RESET", request_reset)
+
+    def clear_ik_target_state():
+        """Sync IK controller's internal target to current EEF pose to avoid jumps after reset."""
+        if is_dual_arm:
+            for term_key in ["left_arm_action", "right_arm_action"]:
+                arm_action_term = env.action_manager._terms[term_key]
+                ee_pos_b, ee_quat_b = arm_action_term._compute_frame_pose()
+                arm_action_term._raw_actions.zero_()
+                arm_action_term._processed_actions.zero_()
+                arm_action_term._ik_controller._command.zero_()
+                arm_action_term._ik_controller.ee_pos_des[:] = ee_pos_b
+                arm_action_term._ik_controller.ee_quat_des[:] = ee_quat_b
+        else:
+            arm_action_term = env.action_manager._terms["arm_action"]
+            ee_pos_b, ee_quat_b = arm_action_term._compute_frame_pose()
+            arm_action_term._raw_actions.zero_()
+            arm_action_term._processed_actions.zero_()
+            arm_action_term._ik_controller._command.zero_()
+            arm_action_term._ik_controller.ee_pos_des[:] = ee_pos_b
+            arm_action_term._ik_controller.ee_quat_des[:] = ee_quat_b
+
+    def convert_action_world_to_root(action_tensor: torch.Tensor) -> torch.Tensor:
+        """Convert absolute IK action from world frame to robot root frame."""
+        robot = env.unwrapped.scene["robot"]
+        root_pos_w = robot.data.root_pos_w[0].detach().cpu().numpy()
+        root_quat_w = robot.data.root_quat_w[0].detach().cpu().numpy()
+        pos_root, quat_root = world_pose_to_root_frame(
+            action_tensor[:3].numpy(), action_tensor[3:7].numpy(),
+            root_pos_w, root_quat_w,
+        )
+        out = action_tensor.clone()
+        out[:3] = torch.tensor(pos_root, dtype=torch.float32)
+        out[3:7] = torch.tensor(quat_root, dtype=torch.float32)
+        return out
+
+    def get_chassis_commands() -> tuple[torch.Tensor, float, float]:
+        """Read left joystick → (wheel_cmd 4D, v_fwd m/s, omega rad/s).
+
+        wheel_cmd : differential wheel velocities for the action vector (training data).
+        v_fwd     : desired robot forward speed in m/s (for direct base control).
+        omega     : desired robot yaw rate in rad/s (for direct base control).
+        """
+        try:
+            joy = teleop_interface.xr_client.get_joystick("left")
+            jy = float(joy[1])
+            jx = float(joy[0])
+        except Exception:
+            return torch.zeros(4), 0.0, 0.0
+        # Wheel velocity commands for action vector (rad/s)
+        v_w = jy * args_cli.base_speed
+        omega_w = jx * args_cli.base_turn
+        right_vel = v_w - omega_w
+        left_vel = v_w + omega_w
+        wheel_cmd = torch.tensor([right_vel, left_vel, left_vel, right_vel], dtype=torch.float32)
+        if abs(v_w) > 0.1 or abs(omega_w) > 0.1:
+            print(f"[WHEEL] joy=({jx:+.2f},{jy:+.2f}) v_wheel={v_w:.2f} ω_wheel={omega_w:.2f} → {wheel_cmd.tolist()}")
+        # Direct base velocity commands (m/s, rad/s)
+        v_fwd = jy * args_cli.drive_speed
+        omega = -jx * args_cli.drive_turn  # jx<0 (left push) → positive yaw = left turn
+        return wheel_cmd, v_fwd, omega
+
+    def apply_base_velocity(v_fwd: float, omega: float) -> None:
+        """Directly set robot root linear+angular velocity to bypass isotropic friction.
+
+        PhysX uses isotropic friction, so skid-steer lateral slip is impossible through
+        wheel torques alone. This function overrides the base velocity each step to give
+        clean translational + rotational motion regardless of friction.
+        The write is buffered and applied at the start of the next env.step().
+        """
+        if abs(v_fwd) < 1e-4 and abs(omega) < 1e-4:
+            return
+        robot = env.unwrapped.scene["robot"]
+        rq = robot.data.root_quat_w  # [N, 4] wxyz
+        w_q, x_q, y_q, z_q = rq[:, 0], rq[:, 1], rq[:, 2], rq[:, 3]
+        # Robot forward direction in world frame (rotate [1,0,0] by root quaternion)
+        fwd_x = 1.0 - 2.0 * (y_q * y_q + z_q * z_q)
+        fwd_y = 2.0 * (x_q * y_q + w_q * z_q)
+        vel = torch.zeros(env.num_envs, 6, device=device)
+        vel[:, 0] = v_fwd * fwd_x  # world x linear
+        vel[:, 1] = v_fwd * fwd_y  # world y linear
+        # vel[:, 2] = 0  # z: let physics handle gravity/contact
+        vel[:, 5] = omega  # world z angular (yaw)
+        robot.write_root_velocity_to_sim(vel)
+
+    obs, _ = env.reset()
+    clear_ik_target_state()
+    teleop_interface.reset()
+
+    # Read contact sensor data directly from OmniGraph node USD attributes.
+    # The USD has action graphs that execute each physics step and write outputs to:
+    #   outputs:inContact (bool)   outputs:value (float, Newtons)
+    # /root in USD → /World/envs/env_0/Robot in the loaded scene.
+    _CONTACT_OG_NODES = {
+        "left_r": (
+            "/World/envs/env_0/Robot"
+            "/l_zhuajia___m2_1_cs/isaac_read_contact_sensor_node"
+        ),
+        "left_l": (
+            "/World/envs/env_0/Robot"
+            "/l_zhuajia___m3_1_cs/isaac_read_contact_sensor_node"
+        ),
+    }
+
+    def _og_contact_reading(stage, node_path: str) -> tuple[bool, float]:
+        """Read one OmniGraph contact node output via USD attribute API."""
+        prim = stage.GetPrimAtPath(node_path)
+        if not prim.IsValid():
+            return False, float("nan")
+        in_contact_attr = prim.GetAttribute("outputs:inContact")
+        force_attr = prim.GetAttribute("outputs:value")
+        in_contact = in_contact_attr.Get() if in_contact_attr.IsValid() else False
+        force = force_attr.Get() if force_attr.IsValid() else 0.0
+        return bool(in_contact), float(force or 0.0)
+
+    # One-time diagnostic: run after first env.reset() so all prims exist
+    def _diag_contact_nodes() -> None:
+        import omni.usd
+        stage = omni.usd.get_context().get_stage()
+        print("\n[DIAG] OmniGraph contact node check:")
+        for name, path in _CONTACT_OG_NODES.items():
+            prim = stage.GetPrimAtPath(path)
+            if prim.IsValid():
+                has_in = prim.GetAttribute("outputs:inContact").IsValid()
+                has_val = prim.GetAttribute("outputs:value").IsValid()
+                print(f"  [{name}] ✓ prim valid | outputs:inContact={has_in} outputs:value={has_val}")
+            else:
+                # Try to find similar prims to hint correct path
+                parent_path = "/".join(path.split("/")[:-1])
+                parent = stage.GetPrimAtPath(parent_path)
+                print(f"  [{name}] ✗ NOT FOUND at: {path}")
+                if parent.IsValid():
+                    children = [c.GetName() for c in parent.GetChildren()]
+                    print(f"      parent exists, children: {children}")
+                else:
+                    gp_path = "/".join(path.split("/")[:-2])
+                    gp = stage.GetPrimAtPath(gp_path)
+                    print(f"      grandparent '{gp_path}' valid={gp.IsValid()}")
+
+    def read_gripper_contact() -> str:
+        """Return contact status string by reading OmniGraph node USD attributes."""
+        try:
+            import omni.usd
+            stage = omni.usd.get_context().get_stage()
+            c_r, f_r = _og_contact_reading(stage, _CONTACT_OG_NODES["left_r"])
+            c_l, f_l = _og_contact_reading(stage, _CONTACT_OG_NODES["left_l"])
+            if f_r != f_r:  # nan → prim not found
+                return "prim not found (run with diag)"
+            sym_r = "✓" if c_r else "✗"
+            sym_l = "✓" if c_l else "✗"
+            return f"{sym_r}R={f_r:.1f}N {sym_l}L={f_l:.1f}N"
+        except Exception as _e:
+            return f"N/A ({_e})"
+
+    # One-time diagnostic: verify OmniGraph contact sensor node paths
+    _diag_contact_nodes()
+
+    # Create debug viewport windows (always on)
+    create_robot_viewports()
+
+    if is_dual_arm:
+        teleop_right_ref.reset()
+
+    print("\n" + "=" * 50)
+    print("  Teleoperation Started!")
+    if is_dual_arm:
+        print("  LEFT controller  → left arm")
+        print("  RIGHT controller → right arm")
+    print("  TRIGGER: open/close gripper")
+    print("  GRIP (hold): move the arm")
+    print("  Left joystick: Y=forward/back, X=turn")
+    print("  Y (left controller): reset environment")
+    print("=" * 50 + "\n")
+
+    device = env.unwrapped.device
+    sim_frame = 0
+    obs, _ = env.reset()
+    clear_ik_target_state()
+    last_root_left = None
+    last_root_right = None
+    neutral_head_rot = None  # calibrated on first headset sample after reset
+
+    # trunk_action uses use_default_offset=False, so action = absolute position target.
+ trunk_cmd = torch.tensor([0.1], dtype=torch.float32) + + while simulation_app.is_running(): + try: + with torch.inference_mode(): + if should_reset: + obs, _ = env.reset() + clear_ik_target_state() + teleop_interface.reset() + if is_dual_arm: + teleop_right_ref.reset() + rebind_robot_viewports() + should_reset = False + sim_frame = 0 + last_root_left = None + last_root_right = None + neutral_head_rot = None # re-calibrate on next headset sample + continue + + policy_obs = obs["policy"] + wheel_np, v_fwd, omega_base = get_chassis_commands() + + # Head tracking: HMD yaw/pitch relative to neutral pose + head_cmd_np, neutral_head_rot = get_head_joint_targets( + teleop_interface.xr_client, neutral_head_rot + ) + head_cmd = torch.tensor(head_cmd_np, dtype=torch.float32) + + if is_dual_arm: + eef_pos_left = policy_obs["eef_pos_left"][0].cpu().numpy() + eef_quat_left = policy_obs["eef_quat_left"][0].cpu().numpy() + eef_pos_right = policy_obs["eef_pos_right"][0].cpu().numpy() + eef_quat_right = policy_obs["eef_quat_right"][0].cpu().numpy() + + left_action = teleop_left.advance(current_eef_pos=eef_pos_left, current_eef_quat=eef_quat_left) + right_action = teleop_right_ref.advance(current_eef_pos=eef_pos_right, current_eef_quat=eef_quat_right) + + # Only recompute root-frame IK target when grip is active; otherwise freeze joints + if teleop_left.grip_active or last_root_left is None: + last_root_left = convert_action_world_to_root(left_action)[:7].clone() + if teleop_right_ref.grip_active or last_root_right is None: + last_root_right = convert_action_world_to_root(right_action)[:7].clone() + + # Action layout depends on task: + # with head: left_arm(7) | wheel(4) | left_gripper(1) | right_arm(7) | right_gripper(1) | head(2) | trunk(1) = 23D + # without head: left_arm(7) | wheel(4) | left_gripper(1) | right_arm(7) | right_gripper(1) = 20D + parts = [ + last_root_left, wheel_np, left_action[7:8], + last_root_right, right_action[7:8], + ] + if has_head: + 
parts.append(head_cmd) + parts.append(trunk_cmd) + action_np = torch.cat(parts) + else: + eef_pos = policy_obs["eef_pos"][0].cpu().numpy() + eef_quat = policy_obs["eef_quat"][0].cpu().numpy() + + action_np = teleop_interface.advance(current_eef_pos=eef_pos, current_eef_quat=eef_quat) + + if teleop_interface.grip_active or last_root_left is None: + last_root_left = convert_action_world_to_root(action_np)[:7].clone() + + # Action layout: arm(7) | wheel(4) | gripper(1) + action_np = torch.cat([last_root_left, wheel_np, action_np[7:8]]) + + actions = action_np.unsqueeze(0).repeat(env.num_envs, 1).to(device) + obs, _, _, _, _ = env.step(actions) + # Direct base velocity override: bypasses isotropic friction limitation + # so skid-steer turning works correctly. + apply_base_velocity(v_fwd, omega_base) + + # Stream sim stereo to VR headset (if enabled) + if streamer is not None: + try: + scene = env.unwrapped.scene + left_rgb = scene["vr_left_eye"].data.output["rgb"][0].cpu().numpy() + right_rgb = scene["vr_right_eye"].data.output["rgb"][0].cpu().numpy() + if not streamer.send_frame(left_rgb, right_rgb): + logger.warning("[Stream] Connection lost. 
Disabling streaming.") + streamer = None + except Exception as e: + if sim_frame % 300 == 0: + logger.warning(f"[Stream] Frame error: {e}") + + sim_frame += 1 + if sim_frame % 30 == 0: + np.set_printoptions(precision=4, suppress=True, floatmode='fixed') + policy_obs = obs["policy"] + joint_pos = policy_obs["joint_pos"][0].cpu().numpy() + last_act = policy_obs["actions"][0].cpu().numpy() + + if sim_frame == 30: + robot = env.unwrapped.scene["robot"] + jnames = robot.joint_names + print(f"\n{'='*70}") + print(f" ALL {len(jnames)} JOINT NAMES AND POSITIONS") + print(f"{'='*70}") + for i, name in enumerate(jnames): + print(f" [{i:2d}] {name:30s} = {joint_pos[i]:+.4f}") + arm_idx = [i for i, n in enumerate(jnames) if n.startswith("l_joint")] + print(f" Deduced left arm indices: {arm_idx}") + print(f"{'='*70}\n") + + if not hasattr(env, '_arm_idx_cache'): + robot = env.unwrapped.scene["robot"] + jnames = robot.joint_names + env._arm_idx_cache = [i for i, n in enumerate(jnames) if n.startswith("l_joint")] + env._right_arm_idx_cache = [i for i, n in enumerate(jnames) if n.startswith("r_joint")] + arm_joints = joint_pos[env._arm_idx_cache] + right_arm_joints = joint_pos[env._right_arm_idx_cache] + + try: + joy_dbg = teleop_interface.xr_client.get_joystick("left") + joy_str = f"[{joy_dbg[0]:+.2f}, {joy_dbg[1]:+.2f}]" + except Exception: + joy_str = "N/A" + + if is_dual_arm: + eef_pos_left = policy_obs["eef_pos_left"][0].cpu().numpy() + eef_quat_left = policy_obs["eef_quat_left"][0].cpu().numpy() + eef_pos_right = policy_obs["eef_pos_right"][0].cpu().numpy() + + print(f"\n---------------- [ROBOT STATE frame={sim_frame}] ----------------") + print(f"| Left Arm Joints (rad): {arm_joints}") + print(f"| Right Arm Joints (rad): {right_arm_joints}") + print(f"| Left EEF Pos (world, m): {eef_pos_left}") + print(f"| Right EEF Pos (world, m): {eef_pos_right}") + print(f"| Left Gripper Contact: {read_gripper_contact()}") + print(f"| Cmd left_arm(abs): {last_act[:7]}") + print(f"| Cmd 
wheel vel (rad/s): {last_act[7:11]}") + print(f"| Cmd left_gripper: {last_act[11:12]} (+1=open -1=close)") + print(f"| Cmd right_arm(abs): {last_act[12:19]}") + print(f"| Cmd right_gripper: {last_act[19:]} (+1=open -1=close)") + print(f"| Joystick (x=turn,y=fwd): {joy_str}") + print(f"----------------------------------------------------------------") + else: + eef_pos = policy_obs["eef_pos"][0].cpu().numpy() + eef_quat = policy_obs["eef_quat"][0].cpu().numpy() + + if not hasattr(env, '_prev_eef_quat'): + env._prev_eef_quat = eef_quat.copy() + prev_q = env._prev_eef_quat + r_prev = R.from_quat([prev_q[1], prev_q[2], prev_q[3], prev_q[0]]) + r_curr = R.from_quat([eef_quat[1], eef_quat[2], eef_quat[3], eef_quat[0]]) + delta_rotvec = (r_curr * r_prev.inv()).as_rotvec() + delta_angle_deg = np.degrees(np.linalg.norm(delta_rotvec)) + delta_axis = delta_rotvec / (np.linalg.norm(delta_rotvec) + 1e-9) + env._prev_eef_quat = eef_quat.copy() + approach_vec = r_curr.apply(np.array([1.0, 0.0, 0.0])) + + print(f"\n---------------- [ROBOT STATE frame={sim_frame}] ----------------") + print(f"| Left Arm Joints (rad): {arm_joints}") + print(f"| EEF Pos (world, m): {eef_pos}") + print(f"| EEF approach vec (world): [{approach_vec[0]:+.3f}, {approach_vec[1]:+.3f}, {approach_vec[2]:+.3f}] (夹爪朝向)") + print(f"| EEF rot since last print: {delta_angle_deg:+.2f}° around [{delta_axis[0]:+.3f},{delta_axis[1]:+.3f},{delta_axis[2]:+.3f}] (world)") + print(f"| Cmd arm(abs pos+quat): {last_act[:7]}") + print(f"| Cmd wheel vel (rad/s): {last_act[7:11]}") + print(f"| Cmd gripper: {last_act[11:]} (+1=open -1=close)") + print(f"| Joystick (x=turn,y=fwd): {joy_str}") + print(f"----------------------------------------------------------------") + + except Exception as e: + logger.error(f"Error during simulation step: {e}") + break + + if streamer is not None: + streamer.close() + teleop_interface.close() + if is_dual_arm: + teleop_right_ref.close() + shared_client.close() + env.close() + + +if __name__ 
== "__main__": + main() + simulation_app.close() diff --git a/scripts/environments/teleoperation/teleop_xr_agent.py b/scripts/environments/teleoperation/teleop_xr_agent.py index e59f0c2..bc24c40 100644 --- a/scripts/environments/teleoperation/teleop_xr_agent.py +++ b/scripts/environments/teleoperation/teleop_xr_agent.py @@ -1,53 +1,49 @@ #!/usr/bin/env python3 -# Copyright (c) 2022-2026, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md). -# All rights reserved. -# +# SPDX-FileCopyrightText: Copyright (c) 2022-2025, The Isaac Lab Project Developers. # SPDX-License-Identifier: BSD-3-Clause -""" -Script to run teleoperation with Isaac Lab manipulation environments using PICO XR Controllers. +"""XR teleoperation entry point for MindBot. -Only absolute IK mode is supported (Isaac-MindRobot-LeftArm-IK-Abs-v0). -The controller computes delta pose in Isaac Sim world frame and accumulates -an absolute EEF target, which is then converted to robot root frame before -being sent to the DifferentialIK action manager. +Selects the appropriate agent class based on the task's action manager: + - DualArmHeadXrAgent (23D) for tasks with head_action (e.g. Isaac-MindRobot-2i-DualArm-IK-Abs-v0) + - DualArmXrAgent (20D) for dual-arm tasks without head + - SingleArmXrAgent (8D) for single-arm tasks (e.g. 
Isaac-MindRobot-LeftArm-IK-Abs-v0)
+
+Usage:
+    ~/IsaacLab/isaaclab.sh -p scripts/environments/teleoperation/teleop_xr_agent.py \\
+        --task Isaac-MindRobot-2i-DualArm-IK-Abs-v0 --stream-to <pico-ip> --enable_cameras
+"""
+
 import argparse
 import logging
 import sys
 import os
-from collections.abc import Callable
 
-# Ensure xr_utils (next to this script) is importable when running directly
+# Ensure xr_utils (next to this script) is importable
 sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
 
 from isaaclab.app import AppLauncher
 
 logger = logging.getLogger(__name__)
 
-# add argparse arguments
+# -- CLI arguments --
 parser = argparse.ArgumentParser(
-    description="Teleoperation for Isaac Lab environments with PICO XR Controller (absolute IK mode only)."
+    description="Teleoperation for Isaac Lab environments with PICO XR Controller (absolute IK mode)."
 )
-parser.add_argument("--num_envs", type=int, default=1, help="Number of environments to simulate.")
-parser.add_argument(
-    "--task",
-    type=str,
-    default="Isaac-MindRobot-LeftArm-IK-Abs-v0",
-    help="Name of the task (must be an Abs IK task).",
-)
-parser.add_argument(
-    "--sensitivity", type=float, default=None,
-    help="Set both pos and rot sensitivity (overridden by --pos_sensitivity/--rot_sensitivity).",
-)
-parser.add_argument("--pos_sensitivity", type=float, default=None, help="Position sensitivity. Default: 1.0.")
-parser.add_argument("--rot_sensitivity", type=float, default=None, help="Rotation sensitivity. Default: 0.3.")
-parser.add_argument("--arm", type=str, default="left", choices=["left", "right"], help="Which arm/controller to use.")
-parser.add_argument("--base_speed", type=float, default=5.0, help="Max wheel speed (rad/s) for joystick full forward.")
-parser.add_argument("--base_turn", type=float, default=2.0, help="Max wheel differential (rad/s) for joystick full left/right.")
-parser.add_argument("--drive_speed", type=float, default=0.5, help="Max robot linear speed (m/s) for direct base control. 
Default: 0.5") -parser.add_argument("--drive_turn", type=float, default=1.5, help="Max robot yaw rate (rad/s) for direct base control. Default: 1.5") +parser.add_argument("--num_envs", type=int, default=1) +parser.add_argument("--task", type=str, default="Isaac-MindRobot-LeftArm-IK-Abs-v0") +parser.add_argument("--sensitivity", type=float, default=None) +parser.add_argument("--pos_sensitivity", type=float, default=None) +parser.add_argument("--rot_sensitivity", type=float, default=None) +parser.add_argument("--arm", type=str, default="left", choices=["left", "right"]) +parser.add_argument("--base_speed", type=float, default=5.0) +parser.add_argument("--base_turn", type=float, default=2.0) +parser.add_argument("--drive_speed", type=float, default=0.5) +parser.add_argument("--drive_turn", type=float, default=1.5) +parser.add_argument("--stream-to", type=str, default=None, dest="stream_to") +parser.add_argument("--stream-port", type=int, default=12345, dest="stream_port") +parser.add_argument("--stream-bitrate", type=int, default=20_000_000, dest="stream_bitrate") +parser.add_argument("--debug-viewports", action="store_true", dest="debug_viewports", default=False) AppLauncher.add_app_launcher_args(parser) args_cli = parser.parse_args() @@ -57,726 +53,69 @@ app_launcher_args["xr"] = False app_launcher = AppLauncher(app_launcher_args) simulation_app = app_launcher.app -"""Rest everything follows.""" - +# -- Post-launch imports (require Isaac Sim runtime) -- import gymnasium as gym -import numpy as np -import torch -from scipy.spatial.transform import Rotation as R - from isaaclab.envs import ManagerBasedRLEnvCfg - import isaaclab_tasks # noqa: F401 import mindbot.tasks # noqa: F401 from isaaclab_tasks.utils import parse_env_cfg -from xr_utils import XrClient, transform_xr_pose, quat_diff_as_rotvec_xyzw, is_valid_quaternion -from xr_utils.geometry import R_HEADSET_TO_WORLD +from xr_teleop import SingleArmXrAgent, DualArmXrAgent, DualArmHeadXrAgent -# 
===================================================================== -# Robot Camera Viewport Utilities -# ===================================================================== - -# Resolved prim paths for env_0 — must match CameraCfg prim_path with -# {ENV_REGEX_NS} → /World/envs/env_0 -_ROBOT_CAM_PRIMS: dict[str, str] = { - "Left Hand": "/World/envs/env_0/Robot/rm_65_fb_left/Link_6/robot___left_8_02/cam_left_hand", - "Right Hand": "/World/envs/env_0/Robot/rm_65_b_right/Link6/robot___right_8_2/cam_right_hand", - "Head": "/World/envs/env_0/Robot/robot_head/ZED_X/base_link/ZED_X/CameraLeft", - "Chest": "/World/envs/env_0/Robot/robot_trunk/cam_chest", -} - -# Stores created viewport window objects -_robot_viewports: dict[str, object] = {} - - -def create_robot_viewports() -> None: - """Create 4 viewport windows and bind each to a robot camera prim. - - Must be called after env.reset() so all prims exist on the USD stage. - """ - try: - import omni.kit.viewport.utility as vp_util - except ImportError: - logger.warning("[Viewport] omni.kit.viewport.utility not available — skipping viewport creation.") - return - - for name, cam_path in _ROBOT_CAM_PRIMS.items(): - vp_win = vp_util.create_viewport_window( - f"Robot {name} View", width=640, height=360 - ) - vp_win.viewport_api.camera_path = cam_path - _robot_viewports[name] = vp_win - print(f"[INFO] Viewport 'Robot {name} View' bound to: {cam_path}") - - -def rebind_robot_viewports() -> None: - """Re-bind all robot viewports to their camera paths. - - Call this after every env reset so viewports stay locked to the cameras. 
- """ - for name, vp_win in _robot_viewports.items(): - cam_path = _ROBOT_CAM_PRIMS[name] - vp_win.viewport_api.camera_path = cam_path - - - -# ===================================================================== -# Helpers -# ===================================================================== - -def _quat_wxyz_to_rotation(quat_wxyz: np.ndarray) -> R: - """Convert Isaac-style wxyz quaternion to scipy Rotation.""" - return R.from_quat([quat_wxyz[1], quat_wxyz[2], quat_wxyz[3], quat_wxyz[0]]) - - -def _rotation_to_quat_wxyz(rot: R) -> np.ndarray: - """Convert scipy Rotation to Isaac-style wxyz quaternion.""" - q = rot.as_quat() - return np.array([q[3], q[0], q[1], q[2]]) - - -def world_pose_to_root_frame( - pos_w: np.ndarray, - quat_wxyz: np.ndarray, - root_pos_w: np.ndarray, - root_quat_wxyz: np.ndarray, -) -> tuple[np.ndarray, np.ndarray]: - """Express a world-frame pose in the robot root frame.""" - root_rot = _quat_wxyz_to_rotation(root_quat_wxyz) - pose_rot = _quat_wxyz_to_rotation(quat_wxyz) - pos_root = root_rot.inv().apply(pos_w - root_pos_w) - quat_root = _rotation_to_quat_wxyz(root_rot.inv() * pose_rot) - return pos_root, quat_root - - -# ===================================================================== -# Teleoperation Controller -# ===================================================================== - -class XrTeleopController: - """Teleop controller for PICO XR headset (absolute IK mode). - - Accumulates an absolute EEF target in Isaac Sim world frame. - Grip button acts as clutch; B/Y buttons trigger environment reset. - """ - - def __init__(self, arm="left", pos_sensitivity=1.0, rot_sensitivity=0.3, - max_pos_per_step=0.05, max_rot_per_step=0.08, xr_client=None): - self.xr_client = xr_client if xr_client is not None else XrClient() - self._owns_client = xr_client is None # only close if we created it - self.pos_sensitivity = pos_sensitivity - self.rot_sensitivity = rot_sensitivity - # Hard caps per physics step to prevent IK divergence. 
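The per-step caps named in the surrounding comments clamp each teleop delta to a maximum Euclidean norm (5 cm of position and 0.08 rad, roughly 4.6 degrees, of rotation per physics step by default), so a single noisy XR sample cannot command a large IK jump. The clamp itself is a norm-preserving rescale, sketched standalone (function name illustrative):

```python
import numpy as np

def clamp_step(delta: np.ndarray, max_norm: float) -> np.ndarray:
    """Rescale `delta` so its Euclidean norm never exceeds `max_norm`.

    Direction is preserved; only the magnitude is capped.
    """
    norm = float(np.linalg.norm(delta))
    if norm > max_norm:
        return delta * (max_norm / norm)
    return delta
```

The same helper serves both the position delta (max_norm=0.05 m) and the rotation-vector delta (max_norm=0.08 rad), since both are plain 3-vectors.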
- # max_rot_per_step ~4.6°, max_pos_per_step 5 cm. - self.max_pos_per_step = max_pos_per_step - self.max_rot_per_step = max_rot_per_step - self.arm = arm - - self.controller_name = "left_controller" if arm == "left" else "right_controller" - self.grip_name = "left_grip" if arm == "left" else "right_grip" - self.trigger_name = "left_trigger" if arm == "left" else "right_trigger" - - self.R_headset_world = R_HEADSET_TO_WORLD - - self.prev_xr_pos = None - self.prev_xr_quat = None - self.target_eef_pos = None # world frame - self.target_eef_quat = None # world frame, wxyz - - self.grip_active = False - self.frame_count = 0 - self.reset_button_latched = False - self.require_grip_reengage = False - self.grip_engage_threshold = 0.8 - self.grip_release_threshold = 0.2 - self.callbacks = {} - - def add_callback(self, name: str, func: Callable): - self.callbacks[name] = func - - def reset(self): - self.prev_xr_pos = None - self.prev_xr_quat = None - self.grip_active = False - self.frame_count = 0 - self.target_eef_pos = None - self.target_eef_quat = None - # Require grip release after reset to avoid driving to stale pose. - self.require_grip_reengage = True - - def close(self): - if self._owns_client: - self.xr_client.close() - - def get_zero_action(self, trigger, current_eef_pos=None, current_eef_quat=None) -> torch.Tensor: - """Return 8D hold action [x, y, z, qw, qx, qy, qz, gripper] at frozen target pose.""" - action = torch.zeros(8) - if self.target_eef_pos is not None and self.target_eef_quat is not None: - # Prefer frozen world-frame target so IK holds a fixed world position - # even if the robot chassis drifts slightly under gravity. 
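Targets are accumulated in the world frame and converted to the robot root frame before being handed to the differential IK controller (the `world_pose_to_root_frame` helper above). That conversion is plain unit-quaternion algebra: subtract the root position, then rotate by the inverse root orientation. A numpy-only sketch with illustrative helper names, assuming wxyz quaternion ordering as in Isaac:

```python
import numpy as np

def quat_mul(a, b):
    # Hamilton product of two (w, x, y, z) quaternions.
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
    ])

def quat_conj(q):
    # For unit quaternions the conjugate equals the inverse.
    return np.array([q[0], -q[1], -q[2], -q[3]])

def quat_apply(q, v):
    # Rotate vector v by quaternion q via q * (0, v) * q^-1.
    qv = np.concatenate(([0.0], v))
    return quat_mul(quat_mul(q, qv), quat_conj(q))[1:]

def world_pose_to_root(pos_w, quat_w, root_pos_w, root_quat_w):
    """Express a world-frame pose in the robot root frame (all wxyz)."""
    inv_root = quat_conj(root_quat_w)
    pos_root = quat_apply(inv_root, np.asarray(pos_w) - np.asarray(root_pos_w))
    quat_root = quat_mul(inv_root, quat_w)
    return pos_root, quat_root
```

For example, with the root at (1, 0, 0) and yawed 90 degrees about +z, a world point at (1, 1, 0) lies at (1, 0, 0) in the root frame.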
- action[:3] = torch.tensor(self.target_eef_pos.copy()) - action[3:7] = torch.tensor(self.target_eef_quat.copy()) - elif current_eef_pos is not None and current_eef_quat is not None: - action[:3] = torch.tensor(current_eef_pos.copy()) - action[3:7] = torch.tensor(current_eef_quat.copy()) - else: - action[3] = 1.0 # identity quaternion - action[7] = 1.0 if trigger > 0.5 else -1.0 - return action - - def advance(self, current_eef_pos=None, current_eef_quat=None) -> torch.Tensor: - """Read XR controller and return 8D absolute action [x, y, z, qw, qx, qy, qz, gripper]. - - Position and quaternion are in Isaac Sim world frame. - Caller must convert to robot root frame before sending to IK. - """ - try: - reset_pressed = self.xr_client.get_button("B") or self.xr_client.get_button("Y") - if reset_pressed and not self.reset_button_latched: - if "RESET" in self.callbacks: - self.callbacks["RESET"]() - self.reset_button_latched = reset_pressed - except Exception: - pass - - try: - raw_pose = self.xr_client.get_pose(self.controller_name) - grip = self.xr_client.get_key_value(self.grip_name) - trigger = self.xr_client.get_key_value(self.trigger_name) - except Exception: - return self.get_zero_action(0.0, current_eef_pos, current_eef_quat) - - if not is_valid_quaternion(raw_pose[3:]): - return self.get_zero_action(trigger, current_eef_pos, current_eef_quat) - - # Wait for grip release after reset - if self.require_grip_reengage: - if grip <= self.grip_release_threshold: - self.require_grip_reengage = False - else: - if current_eef_pos is not None and current_eef_quat is not None: - self.target_eef_pos = current_eef_pos.copy() - self.target_eef_quat = current_eef_quat.copy() - return self.get_zero_action(trigger, current_eef_pos, current_eef_quat) - - # Grip button as clutch with hysteresis - grip_pressed = grip > self.grip_release_threshold if self.grip_active else grip >= self.grip_engage_threshold - - if not grip_pressed: - self.prev_xr_pos = None - self.prev_xr_quat = None - 
# Only snapshot target on the transition frame (grip just released) or first ever frame. - # After that, keep it frozen so IK holds a fixed world-frame position. - if self.grip_active or self.target_eef_pos is None: - if current_eef_pos is not None and current_eef_quat is not None: - self.target_eef_pos = current_eef_pos.copy() - self.target_eef_quat = current_eef_quat.copy() - self.grip_active = False - return self.get_zero_action(trigger, current_eef_pos, current_eef_quat) - - if not self.grip_active: - self.prev_xr_pos = raw_pose[:3].copy() - self.prev_xr_quat = raw_pose[3:].copy() - if current_eef_pos is not None and current_eef_quat is not None: - self.target_eef_pos = current_eef_pos.copy() - self.target_eef_quat = current_eef_quat.copy() - self.grip_active = True - return self.get_zero_action(trigger, current_eef_pos, current_eef_quat) - - # Project current and previous XR poses into Isaac Sim world frame - xr_world_pos, xr_world_quat_xyzw = transform_xr_pose(raw_pose[:3], raw_pose[3:]) - prev_xr_world_pos, prev_xr_world_quat_xyzw = transform_xr_pose(self.prev_xr_pos, self.prev_xr_quat) - - # Delta position (world frame), clamped per step - world_delta_pos = (xr_world_pos - prev_xr_world_pos) * self.pos_sensitivity - pos_norm = np.linalg.norm(world_delta_pos) - if pos_norm > self.max_pos_per_step: - world_delta_pos *= self.max_pos_per_step / pos_norm - - # Delta rotation (world frame), clamped per step - world_delta_rot = quat_diff_as_rotvec_xyzw(prev_xr_world_quat_xyzw, xr_world_quat_xyzw) * self.rot_sensitivity - rot_norm = np.linalg.norm(world_delta_rot) - if rot_norm > self.max_rot_per_step: - world_delta_rot *= self.max_rot_per_step / rot_norm - - gripper_action = 1.0 if trigger > 0.5 else -1.0 - - # Accumulate absolute target - if self.target_eef_pos is None: - self.target_eef_pos = np.zeros(3) - self.target_eef_quat = np.array([1.0, 0.0, 0.0, 0.0]) - - self.target_eef_pos += world_delta_pos - - # Apply rotation delta in world frame so controller 
direction = EEF world direction - target_r = _quat_wxyz_to_rotation(self.target_eef_quat) - self.target_eef_quat = _rotation_to_quat_wxyz(R.from_rotvec(world_delta_rot) * target_r) - - self.prev_xr_pos = raw_pose[:3].copy() - self.prev_xr_quat = raw_pose[3:].copy() - self.frame_count += 1 - - action = torch.tensor([ - *self.target_eef_pos, - *self.target_eef_quat, - gripper_action, - ], dtype=torch.float32) - - if self.frame_count % 30 == 0: - np.set_printoptions(precision=4, suppress=True, floatmode='fixed') - print("\n====================== [VR TELEOP DEBUG] ======================") - print(f"| Raw VR Pos (OpenXR): {np.array(raw_pose[:3])}") - print(f"| Raw VR Quat (xyzw): {np.array(raw_pose[3:])}") - print(f"| XR Delta Pos (world): {world_delta_pos} (norm={np.linalg.norm(world_delta_pos):.4f})") - print(f"| XR Delta Rot (world): {world_delta_rot} (norm={np.linalg.norm(world_delta_rot):.4f})") - print(f"| Targ Pos (world): {action[:3].numpy()}") - print(f"| Targ Quat (world, wxyz): {action[3:7].numpy()}") - print(f"| Gripper & Trigger: Grip={grip:.2f}, Trig={trigger:.2f}") - print("===============================================================") - - return action - - -# ===================================================================== -# Head Tracking Utility -# ===================================================================== - -def get_head_joint_targets( - xr_client, neutral_rot: R | None -) -> tuple[np.ndarray, R | None]: - """Extract head joint targets [yaw, pitch] from HMD pose. - - At first call (neutral_rot is None) the current headset orientation - is recorded as the neutral reference and zeros are returned. - Subsequent calls return yaw/pitch relative to that neutral pose. - - Returns: - (joint_targets, neutral_rot) where joint_targets is shape (2,) - float32 [head_yaw_Joint_rad, head_pitch_Joint_rad]. 
- """ - try: - raw_pose = xr_client.get_pose("headset") - if not is_valid_quaternion(raw_pose[3:]): - return np.zeros(2, dtype=np.float32), neutral_rot - _, head_world_quat_xyzw = transform_xr_pose(raw_pose[:3], raw_pose[3:]) - head_rot = R.from_quat(head_world_quat_xyzw) - if neutral_rot is None: - # First call — record neutral and hold zero - return np.zeros(2, dtype=np.float32), head_rot - # Relative rotation from neutral heading - rel_rot = head_rot * neutral_rot.inv() - # ZYX Euler: [0]=yaw (Z), [1]=pitch (Y) - euler_zyx = rel_rot.as_euler("ZYX") - yaw = float(np.clip(euler_zyx[0], -1.57, 1.57)) # ±90° - pitch = float(np.clip(euler_zyx[1], -0.52, 0.52)) # ±30° - return np.array([yaw, pitch], dtype=np.float32), neutral_rot - except Exception: - return np.zeros(2, dtype=np.float32), neutral_rot - - -# ===================================================================== -# Main Execution Loop -# ===================================================================== - def main() -> None: - """Run teleoperation with PICO XR Controller against Isaac Lab environment.""" + args = args_cli - env_cfg = parse_env_cfg(args_cli.task, num_envs=args_cli.num_envs) - env_cfg.env_name = args_cli.task + # Resolve sensitivity + pos_sens = args.pos_sensitivity or args.sensitivity or 1.0 + rot_sens = args.rot_sensitivity or args.sensitivity or 0.3 + # Create environment + env_cfg = parse_env_cfg(args.task, num_envs=args.num_envs) + env_cfg.env_name = args.task if not isinstance(env_cfg, ManagerBasedRLEnvCfg): raise ValueError(f"Teleoperation requires ManagerBasedRLEnvCfg. Got: {type(env_cfg)}") - if "Abs" not in args_cli.task: - raise ValueError( - f"Task '{args_cli.task}' is not an absolute IK task. " - "Only Abs tasks are supported (e.g. Isaac-MindRobot-LeftArm-IK-Abs-v0)." 
- ) - + if "Abs" not in args.task: + raise ValueError(f"Task '{args.task}' is not an absolute IK task.") env_cfg.terminations.time_out = None - try: - env = gym.make(args_cli.task, cfg=env_cfg).unwrapped - except Exception as e: - logger.error(f"Failed to create environment '{args_cli.task}': {e}") - simulation_app.close() - return + env = gym.make(args.task, cfg=env_cfg).unwrapped - # Detect single-arm vs dual-arm based on action manager term names - is_dual_arm = "left_arm_action" in env.action_manager._terms + # Select agent based on task capabilities + action_terms = env.action_manager._terms + is_dual_arm = "left_arm_action" in action_terms + has_head = "head_action" in action_terms - # Sensitivity: explicit args override --sensitivity, which overrides defaults - pos_sens = 1.0 - rot_sens = 0.3 - if args_cli.sensitivity is not None: - pos_sens = args_cli.sensitivity - rot_sens = args_cli.sensitivity - if args_cli.pos_sensitivity is not None: - pos_sens = args_cli.pos_sensitivity - if args_cli.rot_sensitivity is not None: - rot_sens = args_cli.rot_sensitivity - - if is_dual_arm: - print(f"\n[INFO] Dual-arm mode detected. 
Using both controllers.") - print(f"[INFO] IK Mode: ABSOLUTE") - print(f"[INFO] Sensitivity: pos={pos_sens:.3f} rot={rot_sens:.3f}") - shared_client = XrClient() - teleop_left = XrTeleopController(arm="left", pos_sensitivity=pos_sens, rot_sensitivity=rot_sens, xr_client=shared_client) - teleop_right = XrTeleopController(arm="right", pos_sensitivity=pos_sens, rot_sensitivity=rot_sens, xr_client=shared_client) - teleop_interface = teleop_left # primary interface for callbacks/reset - teleop_right_ref = teleop_right + if is_dual_arm and has_head: + print(f"[INFO] DualArmHeadXrAgent (23D) | pos_sens={pos_sens} rot_sens={rot_sens}") + agent = DualArmHeadXrAgent( + env, simulation_app, + pos_sensitivity=pos_sens, rot_sensitivity=rot_sens, + base_speed=args.base_speed, base_turn=args.base_turn, + drive_speed=args.drive_speed, drive_turn=args.drive_turn, + stream_to=args.stream_to, stream_port=args.stream_port, + stream_bitrate=args.stream_bitrate, + debug_viewports=args.debug_viewports, + ) + elif is_dual_arm: + print(f"[INFO] DualArmXrAgent (20D) | pos_sens={pos_sens} rot_sens={rot_sens}") + agent = DualArmXrAgent( + env, simulation_app, + pos_sensitivity=pos_sens, rot_sensitivity=rot_sens, + base_speed=args.base_speed, base_turn=args.base_turn, + drive_speed=args.drive_speed, drive_turn=args.drive_turn, + debug_viewports=args.debug_viewports, + ) else: - print(f"\n[INFO] Connecting to PICO XR Headset using {args_cli.arm} controller...") - print(f"[INFO] IK Mode: ABSOLUTE") - print(f"[INFO] Sensitivity: pos={pos_sens:.3f} rot={rot_sens:.3f}") - teleop_interface = XrTeleopController( - arm=args_cli.arm, - pos_sensitivity=pos_sens, - rot_sensitivity=rot_sens, + print(f"[INFO] SingleArmXrAgent (8D) | arm={args.arm} pos_sens={pos_sens} rot_sens={rot_sens}") + agent = SingleArmXrAgent( + env, simulation_app, + arm=args.arm, + pos_sensitivity=pos_sens, rot_sensitivity=rot_sens, + debug_viewports=args.debug_viewports, ) - should_reset = False - - def request_reset(): - nonlocal 
should_reset - should_reset = True - print("[INFO] Reset requested via XR button.") - - teleop_interface.add_callback("RESET", request_reset) - - def clear_ik_target_state(): - """Sync IK controller's internal target to current EEF pose to avoid jumps after reset.""" - if is_dual_arm: - for term_key in ["left_arm_action", "right_arm_action"]: - arm_action_term = env.action_manager._terms[term_key] - ee_pos_b, ee_quat_b = arm_action_term._compute_frame_pose() - arm_action_term._raw_actions.zero_() - arm_action_term._processed_actions.zero_() - arm_action_term._ik_controller._command.zero_() - arm_action_term._ik_controller.ee_pos_des[:] = ee_pos_b - arm_action_term._ik_controller.ee_quat_des[:] = ee_quat_b - else: - arm_action_term = env.action_manager._terms["arm_action"] - ee_pos_b, ee_quat_b = arm_action_term._compute_frame_pose() - arm_action_term._raw_actions.zero_() - arm_action_term._processed_actions.zero_() - arm_action_term._ik_controller._command.zero_() - arm_action_term._ik_controller.ee_pos_des[:] = ee_pos_b - arm_action_term._ik_controller.ee_quat_des[:] = ee_quat_b - - def convert_action_world_to_root(action_tensor: torch.Tensor) -> torch.Tensor: - """Convert absolute IK action from world frame to robot root frame.""" - robot = env.unwrapped.scene["robot"] - root_pos_w = robot.data.root_pos_w[0].detach().cpu().numpy() - root_quat_w = robot.data.root_quat_w[0].detach().cpu().numpy() - pos_root, quat_root = world_pose_to_root_frame( - action_tensor[:3].numpy(), action_tensor[3:7].numpy(), - root_pos_w, root_quat_w, - ) - out = action_tensor.clone() - out[:3] = torch.tensor(pos_root, dtype=torch.float32) - out[3:7] = torch.tensor(quat_root, dtype=torch.float32) - return out - - def get_chassis_commands() -> tuple[torch.Tensor, float, float]: - """Read left joystick → (wheel_cmd 4D, v_fwd m/s, omega rad/s). - - wheel_cmd : differential wheel velocities for the action vector (training data). 
- v_fwd : desired robot forward speed in m/s (for direct base control). - omega : desired robot yaw rate in rad/s (for direct base control). - """ - try: - joy = teleop_interface.xr_client.get_joystick("left") - jy = float(joy[1]) - jx = float(joy[0]) - except Exception: - return torch.zeros(4), 0.0, 0.0 - # Wheel velocity commands for action vector (rad/s) - v_w = jy * args_cli.base_speed - omega_w = jx * args_cli.base_turn - right_vel = v_w - omega_w - left_vel = v_w + omega_w - wheel_cmd = torch.tensor([right_vel, left_vel, left_vel, right_vel], dtype=torch.float32) - if abs(v_w) > 0.1 or abs(omega_w) > 0.1: - print(f"[WHEEL] joy=({jx:+.2f},{jy:+.2f}) v_wheel={v_w:.2f} ω_wheel={omega_w:.2f} → {wheel_cmd.tolist()}") - # Direct base velocity commands (m/s, rad/s) - v_fwd = jy * args_cli.drive_speed - omega = -jx * args_cli.drive_turn # jx<0 (left push) → positive yaw = left turn - return wheel_cmd, v_fwd, omega - - def apply_base_velocity(v_fwd: float, omega: float) -> None: - """Directly set robot root linear+angular velocity to bypass isotropic friction. - - PhysX uses isotropic friction, so skid-steer lateral slip is impossible through - wheel torques alone. This function overrides the base velocity each step to give - clean translational + rotational motion regardless of friction. - The write is buffered and applied at the start of the next env.step(). 
- """ - if abs(v_fwd) < 1e-4 and abs(omega) < 1e-4: - return - robot = env.unwrapped.scene["robot"] - rq = robot.data.root_quat_w # [N, 4] wxyz - w_q, x_q, y_q, z_q = rq[:, 0], rq[:, 1], rq[:, 2], rq[:, 3] - # Robot forward direction in world frame (rotate [1,0,0] by root quaternion) - fwd_x = 1.0 - 2.0 * (y_q * y_q + z_q * z_q) - fwd_y = 2.0 * (x_q * y_q + w_q * z_q) - vel = torch.zeros(env.num_envs, 6, device=device) - vel[:, 0] = v_fwd * fwd_x # world x linear - vel[:, 1] = v_fwd * fwd_y # world y linear - # vel[:, 2] = 0 # z: let physics handle gravity/contact - vel[:, 5] = omega # world z angular (yaw) - robot.write_root_velocity_to_sim(vel) - - obs, _ = env.reset() - clear_ik_target_state() - teleop_interface.reset() - - # Read contact sensor data directly from OmniGraph node USD attributes. - # The USD has action graphs that execute each physics step and write outputs to: - # outputs:inContact (bool) outputs:value (float, Newtons) - # /root in USD → /World/envs/env_0/Robot in the loaded scene. 
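The `/root` → `/World/envs/env_0/Robot` remark above means that any prim path authored inside the robot USD (such as the contact sensor OmniGraph nodes) must be re-prefixed once the asset is spawned into the Isaac Lab scene. A tiny illustrative helper for that prefix swap (not part of the codebase; it simply documents how the hard-coded `_CONTACT_OG_NODES` paths relate to the USD-internal ones):

```python
def remap_asset_path(asset_path: str, env_index: int = 0) -> str:
    """Map a prim path authored under /root in the robot USD to its
    location in the loaded scene under /World/envs/env_<i>/Robot."""
    if not asset_path.startswith("/root"):
        raise ValueError(f"expected a /root-relative path, got {asset_path!r}")
    return f"/World/envs/env_{env_index}/Robot" + asset_path[len("/root"):]
```

Applied to the left gripper sensor node, this reproduces the path stored in `_CONTACT_OG_NODES["left_r"]`.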
- _CONTACT_OG_NODES = { - "left_r": ( - "/World/envs/env_0/Robot" - "/l_zhuajia___m2_1_cs/isaac_read_contact_sensor_node" - ), - "left_l": ( - "/World/envs/env_0/Robot" - "/l_zhuajia___m3_1_cs/isaac_read_contact_sensor_node" - ), - } - - def _og_contact_reading(stage, node_path: str) -> tuple[bool, float]: - """Read one OmniGraph contact node output via USD attribute API.""" - prim = stage.GetPrimAtPath(node_path) - if not prim.IsValid(): - return False, float("nan") - in_contact_attr = prim.GetAttribute("outputs:inContact") - force_attr = prim.GetAttribute("outputs:value") - in_contact = in_contact_attr.Get() if in_contact_attr.IsValid() else False - force = force_attr.Get() if force_attr.IsValid() else 0.0 - return bool(in_contact), float(force or 0.0) - - # One-time diagnostic: run after first env.reset() so all prims exist - def _diag_contact_nodes() -> None: - import omni.usd - stage = omni.usd.get_context().get_stage() - print("\n[DIAG] OmniGraph contact node check:") - for name, path in _CONTACT_OG_NODES.items(): - prim = stage.GetPrimAtPath(path) - if prim.IsValid(): - has_in = prim.GetAttribute("outputs:inContact").IsValid() - has_val = prim.GetAttribute("outputs:value").IsValid() - print(f" [{name}] ✓ prim valid | outputs:inContact={has_in} outputs:value={has_val}") - else: - # Try to find similar prims to hint correct path - parent_path = "/".join(path.split("/")[:-1]) - parent = stage.GetPrimAtPath(parent_path) - print(f" [{name}] ✗ NOT FOUND at: {path}") - if parent.IsValid(): - children = [c.GetName() for c in parent.GetChildren()] - print(f" parent exists, children: {children}") - else: - gp_path = "/".join(path.split("/")[:-2]) - gp = stage.GetPrimAtPath(gp_path) - print(f" grandparent '{gp_path}' valid={gp.IsValid()}") - - def read_gripper_contact() -> str: - """Return contact status string by reading OmniGraph node USD attributes.""" - try: - import omni.usd - stage = omni.usd.get_context().get_stage() - c_r, f_r = _og_contact_reading(stage, 
_CONTACT_OG_NODES["left_r"]) - c_l, f_l = _og_contact_reading(stage, _CONTACT_OG_NODES["left_l"]) - if f_r != f_r: # nan → prim not found - return "prim not found (run with diag)" - sym_r = "✓" if c_r else "✗" - sym_l = "✓" if c_l else "✗" - return f"{sym_r}R={f_r:.1f}N {sym_l}L={f_l:.1f}N" - except Exception as _e: - return f"N/A ({_e})" - - # One-time diagnostic: verify OmniGraph contact sensor node paths - _diag_contact_nodes() - - # Create 4 viewport windows bound to the robot cameras - create_robot_viewports() - - - if is_dual_arm: - teleop_right_ref.reset() - - print("\n" + "=" * 50) - print(" Teleoperation Started!") - if is_dual_arm: - print(" LEFT controller → left arm") - print(" RIGHT controller → right arm") - print(" TRIGGER: open/close gripper") - print(" GRIP (hold): move the arm") - print(" Left joystick: Y=forward/back, X=turn") - print(" B / Y: reset environment") - print("=" * 50 + "\n") - - device = env.unwrapped.device - sim_frame = 0 - obs, _ = env.reset() - clear_ik_target_state() - last_root_left = None - last_root_right = None - neutral_head_rot = None # calibrated on first headset sample after reset - - while simulation_app.is_running(): - try: - with torch.inference_mode(): - if should_reset: - obs, _ = env.reset() - clear_ik_target_state() - teleop_interface.reset() - if is_dual_arm: - teleop_right_ref.reset() - rebind_robot_viewports() - should_reset = False - sim_frame = 0 - last_root_left = None - last_root_right = None - neutral_head_rot = None # re-calibrate on next headset sample - continue - - policy_obs = obs["policy"] - wheel_np, v_fwd, omega_base = get_chassis_commands() - - # Head tracking: HMD yaw/pitch relative to neutral pose - head_cmd_np, neutral_head_rot = get_head_joint_targets( - teleop_interface.xr_client, neutral_head_rot - ) - head_cmd = torch.tensor(head_cmd_np, dtype=torch.float32) - - if is_dual_arm: - eef_pos_left = policy_obs["eef_pos_left"][0].cpu().numpy() - eef_quat_left = 
policy_obs["eef_quat_left"][0].cpu().numpy() - eef_pos_right = policy_obs["eef_pos_right"][0].cpu().numpy() - eef_quat_right = policy_obs["eef_quat_right"][0].cpu().numpy() - - left_action = teleop_left.advance(current_eef_pos=eef_pos_left, current_eef_quat=eef_quat_left) - right_action = teleop_right_ref.advance(current_eef_pos=eef_pos_right, current_eef_quat=eef_quat_right) - - # Only recompute root-frame IK target when grip is active; otherwise freeze joints - if teleop_left.grip_active or last_root_left is None: - last_root_left = convert_action_world_to_root(left_action)[:7].clone() - if teleop_right_ref.grip_active or last_root_right is None: - last_root_right = convert_action_world_to_root(right_action)[:7].clone() - - # Action layout: left_arm(7) | wheel(4) | left_gripper(1) | right_arm(7) | right_gripper(1) | head(2) - action_np = torch.cat([ - last_root_left, wheel_np, left_action[7:8], - last_root_right, right_action[7:8], - head_cmd, - ]) - else: - eef_pos = policy_obs["eef_pos"][0].cpu().numpy() - eef_quat = policy_obs["eef_quat"][0].cpu().numpy() - - action_np = teleop_interface.advance(current_eef_pos=eef_pos, current_eef_quat=eef_quat) - - if teleop_interface.grip_active or last_root_left is None: - last_root_left = convert_action_world_to_root(action_np)[:7].clone() - - # Action layout: arm(7) | wheel(4) | gripper(1) - action_np = torch.cat([last_root_left, wheel_np, action_np[7:8]]) - - actions = action_np.unsqueeze(0).repeat(env.num_envs, 1).to(device) - obs, _, _, _, _ = env.step(actions) - # Direct base velocity override: bypasses isotropic friction limitation - # so skid-steer turning works correctly. 
- apply_base_velocity(v_fwd, omega_base) - - sim_frame += 1 - if sim_frame % 30 == 0: - np.set_printoptions(precision=4, suppress=True, floatmode='fixed') - policy_obs = obs["policy"] - joint_pos = policy_obs["joint_pos"][0].cpu().numpy() - last_act = policy_obs["actions"][0].cpu().numpy() - - if sim_frame == 30: - robot = env.unwrapped.scene["robot"] - jnames = robot.joint_names - print(f"\n{'='*70}") - print(f" ALL {len(jnames)} JOINT NAMES AND POSITIONS") - print(f"{'='*70}") - for i, name in enumerate(jnames): - print(f" [{i:2d}] {name:30s} = {joint_pos[i]:+.4f}") - arm_idx = [i for i, n in enumerate(jnames) if n.startswith("l_joint")] - print(f" Deduced left arm indices: {arm_idx}") - print(f"{'='*70}\n") - - if not hasattr(env, '_arm_idx_cache'): - robot = env.unwrapped.scene["robot"] - jnames = robot.joint_names - env._arm_idx_cache = [i for i, n in enumerate(jnames) if n.startswith("l_joint")] - env._right_arm_idx_cache = [i for i, n in enumerate(jnames) if n.startswith("r_joint")] - arm_joints = joint_pos[env._arm_idx_cache] - right_arm_joints = joint_pos[env._right_arm_idx_cache] - - try: - joy_dbg = teleop_interface.xr_client.get_joystick("left") - joy_str = f"[{joy_dbg[0]:+.2f}, {joy_dbg[1]:+.2f}]" - except Exception: - joy_str = "N/A" - - if is_dual_arm: - eef_pos_left = policy_obs["eef_pos_left"][0].cpu().numpy() - eef_quat_left = policy_obs["eef_quat_left"][0].cpu().numpy() - eef_pos_right = policy_obs["eef_pos_right"][0].cpu().numpy() - - print(f"\n---------------- [ROBOT STATE frame={sim_frame}] ----------------") - print(f"| Left Arm Joints (rad): {arm_joints}") - print(f"| Right Arm Joints (rad): {right_arm_joints}") - print(f"| Left EEF Pos (world, m): {eef_pos_left}") - print(f"| Right EEF Pos (world, m): {eef_pos_right}") - print(f"| Left Gripper Contact: {read_gripper_contact()}") - print(f"| Cmd left_arm(abs): {last_act[:7]}") - print(f"| Cmd wheel vel (rad/s): {last_act[7:11]}") - print(f"| Cmd left_gripper: {last_act[11:12]} (+1=open 
-1=close)") - print(f"| Cmd right_arm(abs): {last_act[12:19]}") - print(f"| Cmd right_gripper: {last_act[19:]} (+1=open -1=close)") - print(f"| Joystick (x=turn,y=fwd): {joy_str}") - print(f"----------------------------------------------------------------") - else: - eef_pos = policy_obs["eef_pos"][0].cpu().numpy() - eef_quat = policy_obs["eef_quat"][0].cpu().numpy() - - if not hasattr(env, '_prev_eef_quat'): - env._prev_eef_quat = eef_quat.copy() - prev_q = env._prev_eef_quat - r_prev = R.from_quat([prev_q[1], prev_q[2], prev_q[3], prev_q[0]]) - r_curr = R.from_quat([eef_quat[1], eef_quat[2], eef_quat[3], eef_quat[0]]) - delta_rotvec = (r_curr * r_prev.inv()).as_rotvec() - delta_angle_deg = np.degrees(np.linalg.norm(delta_rotvec)) - delta_axis = delta_rotvec / (np.linalg.norm(delta_rotvec) + 1e-9) - env._prev_eef_quat = eef_quat.copy() - approach_vec = r_curr.apply(np.array([1.0, 0.0, 0.0])) - - print(f"\n---------------- [ROBOT STATE frame={sim_frame}] ----------------") - print(f"| Left Arm Joints (rad): {arm_joints}") - print(f"| EEF Pos (world, m): {eef_pos}") - print(f"| EEF approach vec (world): [{approach_vec[0]:+.3f}, {approach_vec[1]:+.3f}, {approach_vec[2]:+.3f}] (夹爪朝向)") - print(f"| EEF rot since last print: {delta_angle_deg:+.2f}° around [{delta_axis[0]:+.3f},{delta_axis[1]:+.3f},{delta_axis[2]:+.3f}] (world)") - print(f"| Cmd arm(abs pos+quat): {last_act[:7]}") - print(f"| Cmd wheel vel (rad/s): {last_act[7:11]}") - print(f"| Cmd gripper: {last_act[11:]} (+1=open -1=close)") - print(f"| Joystick (x=turn,y=fwd): {joy_str}") - print(f"----------------------------------------------------------------") - - except Exception as e: - logger.error(f"Error during simulation step: {e}") - break - - teleop_interface.close() - if is_dual_arm: - teleop_right_ref.close() - shared_client.close() - env.close() + agent.run() if __name__ == "__main__": diff --git a/scripts/environments/teleoperation/teleop_xr_agent.py.bak 
b/scripts/environments/teleoperation/teleop_xr_agent.py.bak new file mode 100644 index 0000000..1ba5972 --- /dev/null +++ b/scripts/environments/teleoperation/teleop_xr_agent.py.bak @@ -0,0 +1,856 @@ +#!/usr/bin/env python3 +# Copyright (c) 2022-2026, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md). +# All rights reserved. +# +# SPDX-License-Identifier: BSD-3-Clause + +""" +Script to run teleoperation with Isaac Lab manipulation environments using PICO XR Controllers. + +Only absolute IK mode is supported (Isaac-MindRobot-LeftArm-IK-Abs-v0). +The controller computes delta pose in Isaac Sim world frame and accumulates +an absolute EEF target, which is then converted to robot root frame before +being sent to the DifferentialIK action manager. +""" + +import argparse +import logging +import sys +import os +from collections.abc import Callable + +# Ensure xr_utils (next to this script) is importable when running directly +sys.path.insert(0, os.path.dirname(os.path.abspath(__file__))) + +from isaaclab.app import AppLauncher + +logger = logging.getLogger(__name__) + +# add argparse arguments +parser = argparse.ArgumentParser( + description="Teleoperation for Isaac Lab environments with PICO XR Controller (absolute IK mode only)." +) +parser.add_argument("--num_envs", type=int, default=1, help="Number of environments to simulate.") +parser.add_argument( + "--task", + type=str, + default="Isaac-MindRobot-LeftArm-IK-Abs-v0", + help="Name of the task (must be an Abs IK task).", +) +parser.add_argument( + "--sensitivity", type=float, default=None, + help="Set both pos and rot sensitivity (overridden by --pos_sensitivity/--rot_sensitivity).", +) +parser.add_argument("--pos_sensitivity", type=float, default=None, help="Position sensitivity. Default: 1.0.") +parser.add_argument("--rot_sensitivity", type=float, default=None, help="Rotation sensitivity. 
Default: 0.3.") +parser.add_argument("--arm", type=str, default="left", choices=["left", "right"], help="Which arm/controller to use.") +parser.add_argument("--base_speed", type=float, default=5.0, help="Max wheel speed (rad/s) for joystick full forward.") +parser.add_argument("--base_turn", type=float, default=2.0, help="Max wheel differential (rad/s) for joystick full left/right.") +parser.add_argument("--drive_speed", type=float, default=0.5, help="Max robot linear speed (m/s) for direct base control. Default: 0.5") +parser.add_argument("--drive_turn", type=float, default=1.5, help="Max robot yaw rate (rad/s) for direct base control. Default: 1.5") +parser.add_argument( + "--stream-to", type=str, default=None, dest="stream_to", + help="PICO VR headset IP for sim stereo streaming. Streams H264 SBS to the Unity client TCP server. " + "Requires vr_left_eye/vr_right_eye cameras in the env scene config.", +) +parser.add_argument("--stream-port", type=int, default=12345, dest="stream_port", help="TCP port for video streaming.") +parser.add_argument("--stream-bitrate", type=int, default=20_000_000, dest="stream_bitrate", help="H264 bitrate (bps).") +parser.add_argument( + "--debug-viewports", action="store_true", dest="debug_viewports", default=False, + help="Enable debug viewport windows (Left Hand, Right Hand, Head, Chest). 
Off by default when --stream-to is used.", +) + +AppLauncher.add_app_launcher_args(parser) +args_cli = parser.parse_args() +app_launcher_args = vars(args_cli) +app_launcher_args["xr"] = False + +app_launcher = AppLauncher(app_launcher_args) +simulation_app = app_launcher.app + +"""Rest everything follows.""" + +import gymnasium as gym +import numpy as np +import torch +from scipy.spatial.transform import Rotation as R + +from isaaclab.envs import ManagerBasedRLEnvCfg + +import isaaclab_tasks # noqa: F401 +import mindbot.tasks # noqa: F401 +from isaaclab_tasks.utils import parse_env_cfg + +from xr_utils import XrClient, transform_xr_pose, quat_diff_as_rotvec_xyzw, is_valid_quaternion +from xr_utils.geometry import R_HEADSET_TO_WORLD + + +# ===================================================================== +# Robot Camera Viewport Utilities +# ===================================================================== + +# Resolved prim paths for env_0 — must match CameraCfg prim_path with +# {ENV_REGEX_NS} → /World/envs/env_0 +_ROBOT_CAM_PRIMS: dict[str, str] = { + "Left Hand": "/World/envs/env_0/Robot/rm_65_fb_left/Link_6/robot___left_8_02/cam_left_hand", + "Right Hand": "/World/envs/env_0/Robot/rm_65_b_right/Link6/robot___right_8_2/cam_right_hand", + "Head": "/World/envs/env_0/Robot/robot_head/ZED_X/base_link/ZED_X/CameraLeft", + "Chest": "/World/envs/env_0/Robot/robot_trunk/cam_chest", +} + +# Stores created viewport window objects +_robot_viewports: dict[str, object] = {} + + +def create_robot_viewports() -> None: + """Create 4 viewport windows and bind each to a robot camera prim. + + Must be called after env.reset() so all prims exist on the USD stage. 
+ """ + try: + import omni.kit.viewport.utility as vp_util + except ImportError: + logger.warning("[Viewport] omni.kit.viewport.utility not available — skipping viewport creation.") + return + + for name, cam_path in _ROBOT_CAM_PRIMS.items(): + vp_win = vp_util.create_viewport_window( + f"Robot {name} View", width=640, height=360 + ) + vp_win.viewport_api.camera_path = cam_path + _robot_viewports[name] = vp_win + print(f"[INFO] Viewport 'Robot {name} View' bound to: {cam_path}") + + +def rebind_robot_viewports() -> None: + """Re-bind all robot viewports to their camera paths. + + Call this after every env reset so viewports stay locked to the cameras. + """ + for name, vp_win in _robot_viewports.items(): + cam_path = _ROBOT_CAM_PRIMS[name] + vp_win.viewport_api.camera_path = cam_path + + + +# ===================================================================== +# Helpers +# ===================================================================== + +def _quat_wxyz_to_rotation(quat_wxyz: np.ndarray) -> R: + """Convert Isaac-style wxyz quaternion to scipy Rotation.""" + return R.from_quat([quat_wxyz[1], quat_wxyz[2], quat_wxyz[3], quat_wxyz[0]]) + + +def _rotation_to_quat_wxyz(rot: R) -> np.ndarray: + """Convert scipy Rotation to Isaac-style wxyz quaternion.""" + q = rot.as_quat() + return np.array([q[3], q[0], q[1], q[2]]) + + +def world_pose_to_root_frame( + pos_w: np.ndarray, + quat_wxyz: np.ndarray, + root_pos_w: np.ndarray, + root_quat_wxyz: np.ndarray, +) -> tuple[np.ndarray, np.ndarray]: + """Express a world-frame pose in the robot root frame.""" + root_rot = _quat_wxyz_to_rotation(root_quat_wxyz) + pose_rot = _quat_wxyz_to_rotation(quat_wxyz) + pos_root = root_rot.inv().apply(pos_w - root_pos_w) + quat_root = _rotation_to_quat_wxyz(root_rot.inv() * pose_rot) + return pos_root, quat_root + + +# ===================================================================== +# Teleoperation Controller +# ===================================================================== 
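The `world_pose_to_root_frame` helper above is the heart of the world→root conversion applied to every absolute IK action before it reaches the `DifferentialIK` action term. A minimal standalone sketch of that math, with the helpers re-stated so the snippet runs outside Isaac Lab (the root pose below is a made-up example, not taken from the robot config):

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def quat_wxyz_to_rotation(q):
    # Isaac stores quaternions as wxyz; scipy expects xyzw.
    return R.from_quat([q[1], q[2], q[3], q[0]])

def rotation_to_quat_wxyz(rot):
    q = rot.as_quat()  # xyzw
    return np.array([q[3], q[0], q[1], q[2]])

def world_pose_to_root_frame(pos_w, quat_wxyz, root_pos_w, root_quat_wxyz):
    # Same math as the script's helper: p_root = R_root^-1 · (p_w - p_root_w).
    root_rot = quat_wxyz_to_rotation(root_quat_wxyz)
    pose_rot = quat_wxyz_to_rotation(quat_wxyz)
    pos_root = root_rot.inv().apply(pos_w - root_pos_w)
    quat_root = rotation_to_quat_wxyz(root_rot.inv() * pose_rot)
    return pos_root, quat_root

# Hypothetical scenario: robot root at (1, 0, 0), yawed 90° left;
# EEF at (1, 1, 0) in world frame with identity orientation (wxyz).
root_pos = np.array([1.0, 0.0, 0.0])
root_quat = rotation_to_quat_wxyz(R.from_euler("z", np.pi / 2.0))
pos_root, quat_root = world_pose_to_root_frame(
    np.array([1.0, 1.0, 0.0]),
    np.array([1.0, 0.0, 0.0, 0.0]),
    root_pos,
    root_quat,
)
# The EEF sits 1 m directly ahead of the yawed root: pos_root ≈ [1, 0, 0].
print(np.round(pos_root, 6))
```

This is why the teleop loop can accumulate targets in a fixed world frame while the chassis drives around: the conversion is re-evaluated each step against the current `root_pos_w` / `root_quat_w`, so the IK command stays consistent in the robot's own frame.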
+ +class XrTeleopController: + """Teleop controller for PICO XR headset (absolute IK mode). + + Accumulates an absolute EEF target in Isaac Sim world frame. + Grip button acts as clutch; B/Y buttons trigger environment reset. + """ + + def __init__(self, arm="left", pos_sensitivity=1.0, rot_sensitivity=0.3, + max_pos_per_step=0.05, max_rot_per_step=0.08, xr_client=None): + self.xr_client = xr_client if xr_client is not None else XrClient() + self._owns_client = xr_client is None # only close if we created it + self.pos_sensitivity = pos_sensitivity + self.rot_sensitivity = rot_sensitivity + # Hard caps per physics step to prevent IK divergence. + # max_rot_per_step ~4.6°, max_pos_per_step 5 cm. + self.max_pos_per_step = max_pos_per_step + self.max_rot_per_step = max_rot_per_step + self.arm = arm + + self.controller_name = "left_controller" if arm == "left" else "right_controller" + self.grip_name = "left_grip" if arm == "left" else "right_grip" + self.trigger_name = "left_trigger" if arm == "left" else "right_trigger" + + self.R_headset_world = R_HEADSET_TO_WORLD + + self.prev_xr_pos = None + self.prev_xr_quat = None + self.target_eef_pos = None # world frame + self.target_eef_quat = None # world frame, wxyz + + self.grip_active = False + self.frame_count = 0 + self.reset_button_latched = False + self.require_grip_reengage = False + self.grip_engage_threshold = 0.8 + self.grip_release_threshold = 0.2 + self.callbacks = {} + + def add_callback(self, name: str, func: Callable): + self.callbacks[name] = func + + def reset(self): + self.prev_xr_pos = None + self.prev_xr_quat = None + self.grip_active = False + self.frame_count = 0 + self.target_eef_pos = None + self.target_eef_quat = None + # Require grip release after reset to avoid driving to stale pose. 
+ self.require_grip_reengage = True + + def close(self): + if self._owns_client: + self.xr_client.close() + + def get_zero_action(self, trigger, current_eef_pos=None, current_eef_quat=None) -> torch.Tensor: + """Return 8D hold action [x, y, z, qw, qx, qy, qz, gripper] at frozen target pose.""" + action = torch.zeros(8) + if self.target_eef_pos is not None and self.target_eef_quat is not None: + # Prefer frozen world-frame target so IK holds a fixed world position + # even if the robot chassis drifts slightly under gravity. + action[:3] = torch.tensor(self.target_eef_pos.copy()) + action[3:7] = torch.tensor(self.target_eef_quat.copy()) + elif current_eef_pos is not None and current_eef_quat is not None: + action[:3] = torch.tensor(current_eef_pos.copy()) + action[3:7] = torch.tensor(current_eef_quat.copy()) + else: + action[3] = 1.0 # identity quaternion + action[7] = 1.0 if trigger > 0.5 else -1.0 + return action + + def advance(self, current_eef_pos=None, current_eef_quat=None) -> torch.Tensor: + """Read XR controller and return 8D absolute action [x, y, z, qw, qx, qy, qz, gripper]. + + Position and quaternion are in Isaac Sim world frame. + Caller must convert to robot root frame before sending to IK. 
+ """ + try: + # Only left controller Y button resets (not right controller B) + if self.arm == "left": + reset_pressed = self.xr_client.get_button("Y") + else: + reset_pressed = False + if reset_pressed and not self.reset_button_latched: + if "RESET" in self.callbacks: + self.callbacks["RESET"]() + self.reset_button_latched = reset_pressed + except Exception: + pass + + try: + raw_pose = self.xr_client.get_pose(self.controller_name) + grip = self.xr_client.get_key_value(self.grip_name) + trigger = self.xr_client.get_key_value(self.trigger_name) + except Exception: + return self.get_zero_action(0.0, current_eef_pos, current_eef_quat) + + if not is_valid_quaternion(raw_pose[3:]): + return self.get_zero_action(trigger, current_eef_pos, current_eef_quat) + + # Wait for grip release after reset + if self.require_grip_reengage: + if grip <= self.grip_release_threshold: + self.require_grip_reengage = False + else: + if current_eef_pos is not None and current_eef_quat is not None: + self.target_eef_pos = current_eef_pos.copy() + self.target_eef_quat = current_eef_quat.copy() + return self.get_zero_action(trigger, current_eef_pos, current_eef_quat) + + # Grip button as clutch with hysteresis + grip_pressed = grip > self.grip_release_threshold if self.grip_active else grip >= self.grip_engage_threshold + + if not grip_pressed: + self.prev_xr_pos = None + self.prev_xr_quat = None + # Only snapshot target on the transition frame (grip just released) or first ever frame. + # After that, keep it frozen so IK holds a fixed world-frame position. 
+ if self.grip_active or self.target_eef_pos is None: + if current_eef_pos is not None and current_eef_quat is not None: + self.target_eef_pos = current_eef_pos.copy() + self.target_eef_quat = current_eef_quat.copy() + self.grip_active = False + return self.get_zero_action(trigger, current_eef_pos, current_eef_quat) + + if not self.grip_active: + self.prev_xr_pos = raw_pose[:3].copy() + self.prev_xr_quat = raw_pose[3:].copy() + if current_eef_pos is not None and current_eef_quat is not None: + self.target_eef_pos = current_eef_pos.copy() + self.target_eef_quat = current_eef_quat.copy() + self.grip_active = True + return self.get_zero_action(trigger, current_eef_pos, current_eef_quat) + + # Project current and previous XR poses into Isaac Sim world frame + xr_world_pos, xr_world_quat_xyzw = transform_xr_pose(raw_pose[:3], raw_pose[3:]) + prev_xr_world_pos, prev_xr_world_quat_xyzw = transform_xr_pose(self.prev_xr_pos, self.prev_xr_quat) + + # Delta position (world frame), clamped per step + world_delta_pos = (xr_world_pos - prev_xr_world_pos) * self.pos_sensitivity + pos_norm = np.linalg.norm(world_delta_pos) + if pos_norm > self.max_pos_per_step: + world_delta_pos *= self.max_pos_per_step / pos_norm + + # Delta rotation (world frame), clamped per step + world_delta_rot = quat_diff_as_rotvec_xyzw(prev_xr_world_quat_xyzw, xr_world_quat_xyzw) * self.rot_sensitivity + rot_norm = np.linalg.norm(world_delta_rot) + if rot_norm > self.max_rot_per_step: + world_delta_rot *= self.max_rot_per_step / rot_norm + + gripper_action = 1.0 if trigger > 0.5 else -1.0 + + # Accumulate absolute target + if self.target_eef_pos is None: + self.target_eef_pos = np.zeros(3) + self.target_eef_quat = np.array([1.0, 0.0, 0.0, 0.0]) + + self.target_eef_pos += world_delta_pos + + # Apply rotation delta in world frame so controller direction = EEF world direction + target_r = _quat_wxyz_to_rotation(self.target_eef_quat) + self.target_eef_quat = 
_rotation_to_quat_wxyz(R.from_rotvec(world_delta_rot) * target_r) + + self.prev_xr_pos = raw_pose[:3].copy() + self.prev_xr_quat = raw_pose[3:].copy() + self.frame_count += 1 + + action = torch.tensor([ + *self.target_eef_pos, + *self.target_eef_quat, + gripper_action, + ], dtype=torch.float32) + + if self.frame_count % 30 == 0: + np.set_printoptions(precision=4, suppress=True, floatmode='fixed') + print("\n====================== [VR TELEOP DEBUG] ======================") + print(f"| Raw VR Pos (OpenXR): {np.array(raw_pose[:3])}") + print(f"| Raw VR Quat (xyzw): {np.array(raw_pose[3:])}") + print(f"| XR Delta Pos (world): {world_delta_pos} (norm={np.linalg.norm(world_delta_pos):.4f})") + print(f"| XR Delta Rot (world): {world_delta_rot} (norm={np.linalg.norm(world_delta_rot):.4f})") + print(f"| Targ Pos (world): {action[:3].numpy()}") + print(f"| Targ Quat (world, wxyz): {action[3:7].numpy()}") + print(f"| Gripper & Trigger: Grip={grip:.2f}, Trig={trigger:.2f}") + print("===============================================================") + + return action + + +# ===================================================================== +# Head Tracking Utility +# ===================================================================== + +def get_head_joint_targets( + xr_client, neutral_rot: R | None +) -> tuple[np.ndarray, R | None]: + """Extract head joint targets [yaw, pitch] from HMD pose. + + At first call (neutral_rot is None) the current headset orientation + is recorded as the neutral reference and zeros are returned. + Subsequent calls return yaw/pitch relative to that neutral pose. + + Returns: + (joint_targets, neutral_rot) where joint_targets is shape (2,) + float32 [head_yaw_Joint_rad, head_pitch_Joint_rad]. 
+ """ + try: + raw_pose = xr_client.get_pose("headset") + if not is_valid_quaternion(raw_pose[3:]): + return np.zeros(2, dtype=np.float32), neutral_rot + _, head_world_quat_xyzw = transform_xr_pose(raw_pose[:3], raw_pose[3:]) + head_rot = R.from_quat(head_world_quat_xyzw) + if neutral_rot is None: + # First call — record neutral and hold zero + return np.zeros(2, dtype=np.float32), head_rot + # Relative rotation from neutral heading + rel_rot = head_rot * neutral_rot.inv() + # ZYX Euler: [0]=yaw (Z), [1]=pitch (Y) + euler_zyx = rel_rot.as_euler("ZYX") + yaw = float(np.clip(euler_zyx[0], -1.57, 1.57)) # ±90° + pitch = float(np.clip(euler_zyx[1], -0.52, 0.52)) # ±30° + return np.array([yaw, pitch], dtype=np.float32), neutral_rot + except Exception: + return np.zeros(2, dtype=np.float32), neutral_rot + + +# ===================================================================== +# Main Execution Loop +# ===================================================================== + +def main() -> None: + """Run teleoperation with PICO XR Controller against Isaac Lab environment.""" + + env_cfg = parse_env_cfg(args_cli.task, num_envs=args_cli.num_envs) + env_cfg.env_name = args_cli.task + + if not isinstance(env_cfg, ManagerBasedRLEnvCfg): + raise ValueError(f"Teleoperation requires ManagerBasedRLEnvCfg. Got: {type(env_cfg)}") + if "Abs" not in args_cli.task: + raise ValueError( + f"Task '{args_cli.task}' is not an absolute IK task. " + "Only Abs tasks are supported (e.g. Isaac-MindRobot-LeftArm-IK-Abs-v0)." 
+ ) + + env_cfg.terminations.time_out = None + + try: + env = gym.make(args_cli.task, cfg=env_cfg).unwrapped + except Exception as e: + logger.error(f"Failed to create environment '{args_cli.task}': {e}") + simulation_app.close() + return + + # Detect single-arm vs dual-arm based on action manager term names + is_dual_arm = "left_arm_action" in env.action_manager._terms + has_head = "head_action" in env.action_manager._terms + + # -- Sim-to-VR stereo streaming setup -- + streamer = None + if args_cli.stream_to: + try: + from mindbot.utils.sim_video_streamer import SimVideoStreamer + # Get camera resolution from scene config (vr_left_eye) + scene = env.unwrapped.scene + if "vr_left_eye" not in scene.sensors: + logger.warning("[Stream] vr_left_eye camera not found in scene. Streaming disabled.") + else: + cam = scene["vr_left_eye"] + cam_w = cam.cfg.width + cam_h = cam.cfg.height + streamer = SimVideoStreamer( + host=args_cli.stream_to, + port=args_cli.stream_port, + width=cam_w, + height=cam_h, + fps=30, + bitrate=args_cli.stream_bitrate, + ) + if streamer.connect(): + print(f"[INFO] Sim stereo streaming to {args_cli.stream_to}:{args_cli.stream_port} " + f"({cam_w*2}x{cam_h} SBS @ 30fps)") + else: + logger.error("[Stream] Failed to connect. Streaming disabled.") + streamer = None + except ImportError: + logger.error("[Stream] mindbot.utils.sim_video_streamer not found. pip install av?") + except Exception as e: + logger.error(f"[Stream] Init failed: {e}") + + # Sensitivity: explicit args override --sensitivity, which overrides defaults + pos_sens = 1.0 + rot_sens = 0.3 + if args_cli.sensitivity is not None: + pos_sens = args_cli.sensitivity + rot_sens = args_cli.sensitivity + if args_cli.pos_sensitivity is not None: + pos_sens = args_cli.pos_sensitivity + if args_cli.rot_sensitivity is not None: + rot_sens = args_cli.rot_sensitivity + + if is_dual_arm: + print(f"\n[INFO] Dual-arm mode detected. 
Using both controllers.") + print(f"[INFO] IK Mode: ABSOLUTE") + print(f"[INFO] Sensitivity: pos={pos_sens:.3f} rot={rot_sens:.3f}") + shared_client = XrClient() + teleop_left = XrTeleopController(arm="left", pos_sensitivity=pos_sens, rot_sensitivity=rot_sens, xr_client=shared_client) + teleop_right = XrTeleopController(arm="right", pos_sensitivity=pos_sens, rot_sensitivity=rot_sens, xr_client=shared_client) + teleop_interface = teleop_left # primary interface for callbacks/reset + teleop_right_ref = teleop_right + else: + print(f"\n[INFO] Connecting to PICO XR Headset using {args_cli.arm} controller...") + print(f"[INFO] IK Mode: ABSOLUTE") + print(f"[INFO] Sensitivity: pos={pos_sens:.3f} rot={rot_sens:.3f}") + teleop_interface = XrTeleopController( + arm=args_cli.arm, + pos_sensitivity=pos_sens, + rot_sensitivity=rot_sens, + ) + + should_reset = False + + def request_reset(): + nonlocal should_reset + should_reset = True + print("[INFO] Reset requested via XR button.") + + teleop_interface.add_callback("RESET", request_reset) + + def clear_ik_target_state(): + """Sync IK controller's internal target to current EEF pose to avoid jumps after reset.""" + if is_dual_arm: + for term_key in ["left_arm_action", "right_arm_action"]: + arm_action_term = env.action_manager._terms[term_key] + ee_pos_b, ee_quat_b = arm_action_term._compute_frame_pose() + arm_action_term._raw_actions.zero_() + arm_action_term._processed_actions.zero_() + arm_action_term._ik_controller._command.zero_() + arm_action_term._ik_controller.ee_pos_des[:] = ee_pos_b + arm_action_term._ik_controller.ee_quat_des[:] = ee_quat_b + else: + arm_action_term = env.action_manager._terms["arm_action"] + ee_pos_b, ee_quat_b = arm_action_term._compute_frame_pose() + arm_action_term._raw_actions.zero_() + arm_action_term._processed_actions.zero_() + arm_action_term._ik_controller._command.zero_() + arm_action_term._ik_controller.ee_pos_des[:] = ee_pos_b + arm_action_term._ik_controller.ee_quat_des[:] = ee_quat_b 
+ + def convert_action_world_to_root(action_tensor: torch.Tensor) -> torch.Tensor: + """Convert absolute IK action from world frame to robot root frame.""" + robot = env.unwrapped.scene["robot"] + root_pos_w = robot.data.root_pos_w[0].detach().cpu().numpy() + root_quat_w = robot.data.root_quat_w[0].detach().cpu().numpy() + pos_root, quat_root = world_pose_to_root_frame( + action_tensor[:3].numpy(), action_tensor[3:7].numpy(), + root_pos_w, root_quat_w, + ) + out = action_tensor.clone() + out[:3] = torch.tensor(pos_root, dtype=torch.float32) + out[3:7] = torch.tensor(quat_root, dtype=torch.float32) + return out + + def get_chassis_commands() -> tuple[torch.Tensor, float, float]: + """Read left joystick → (wheel_cmd 4D, v_fwd m/s, omega rad/s). + + wheel_cmd : differential wheel velocities for the action vector (training data). + v_fwd : desired robot forward speed in m/s (for direct base control). + omega : desired robot yaw rate in rad/s (for direct base control). + """ + try: + joy = teleop_interface.xr_client.get_joystick("left") + jy = float(joy[1]) + jx = float(joy[0]) + except Exception: + return torch.zeros(4), 0.0, 0.0 + # Wheel velocity commands for action vector (rad/s) + v_w = jy * args_cli.base_speed + omega_w = jx * args_cli.base_turn + right_vel = v_w - omega_w + left_vel = v_w + omega_w + wheel_cmd = torch.tensor([right_vel, left_vel, left_vel, right_vel], dtype=torch.float32) + if abs(v_w) > 0.1 or abs(omega_w) > 0.1: + print(f"[WHEEL] joy=({jx:+.2f},{jy:+.2f}) v_wheel={v_w:.2f} ω_wheel={omega_w:.2f} → {wheel_cmd.tolist()}") + # Direct base velocity commands (m/s, rad/s) + v_fwd = jy * args_cli.drive_speed + omega = -jx * args_cli.drive_turn # jx<0 (left push) → positive yaw = left turn + return wheel_cmd, v_fwd, omega + + def apply_base_velocity(v_fwd: float, omega: float) -> None: + """Directly set robot root linear+angular velocity to bypass isotropic friction. 
+ + PhysX uses isotropic friction, so skid-steer lateral slip is impossible through + wheel torques alone. This function overrides the base velocity each step to give + clean translational + rotational motion regardless of friction. + The write is buffered and applied at the start of the next env.step(). + """ + if abs(v_fwd) < 1e-4 and abs(omega) < 1e-4: + return + robot = env.unwrapped.scene["robot"] + rq = robot.data.root_quat_w # [N, 4] wxyz + w_q, x_q, y_q, z_q = rq[:, 0], rq[:, 1], rq[:, 2], rq[:, 3] + # Robot forward direction in world frame (rotate [1,0,0] by root quaternion) + fwd_x = 1.0 - 2.0 * (y_q * y_q + z_q * z_q) + fwd_y = 2.0 * (x_q * y_q + w_q * z_q) + vel = torch.zeros(env.num_envs, 6, device=device) + vel[:, 0] = v_fwd * fwd_x # world x linear + vel[:, 1] = v_fwd * fwd_y # world y linear + # vel[:, 2] = 0 # z: let physics handle gravity/contact + vel[:, 5] = omega # world z angular (yaw) + robot.write_root_velocity_to_sim(vel) + + obs, _ = env.reset() + clear_ik_target_state() + teleop_interface.reset() + + # Read contact sensor data directly from OmniGraph node USD attributes. + # The USD has action graphs that execute each physics step and write outputs to: + # outputs:inContact (bool) outputs:value (float, Newtons) + # /root in USD → /World/envs/env_0/Robot in the loaded scene. 
+ _CONTACT_OG_NODES = { + "left_r": ( + "/World/envs/env_0/Robot" + "/l_zhuajia___m2_1_cs/isaac_read_contact_sensor_node" + ), + "left_l": ( + "/World/envs/env_0/Robot" + "/l_zhuajia___m3_1_cs/isaac_read_contact_sensor_node" + ), + } + + def _og_contact_reading(stage, node_path: str) -> tuple[bool, float]: + """Read one OmniGraph contact node output via USD attribute API.""" + prim = stage.GetPrimAtPath(node_path) + if not prim.IsValid(): + return False, float("nan") + in_contact_attr = prim.GetAttribute("outputs:inContact") + force_attr = prim.GetAttribute("outputs:value") + in_contact = in_contact_attr.Get() if in_contact_attr.IsValid() else False + force = force_attr.Get() if force_attr.IsValid() else 0.0 + return bool(in_contact), float(force or 0.0) + + # One-time diagnostic: run after first env.reset() so all prims exist + def _diag_contact_nodes() -> None: + import omni.usd + stage = omni.usd.get_context().get_stage() + print("\n[DIAG] OmniGraph contact node check:") + for name, path in _CONTACT_OG_NODES.items(): + prim = stage.GetPrimAtPath(path) + if prim.IsValid(): + has_in = prim.GetAttribute("outputs:inContact").IsValid() + has_val = prim.GetAttribute("outputs:value").IsValid() + print(f" [{name}] ✓ prim valid | outputs:inContact={has_in} outputs:value={has_val}") + else: + # Try to find similar prims to hint correct path + parent_path = "/".join(path.split("/")[:-1]) + parent = stage.GetPrimAtPath(parent_path) + print(f" [{name}] ✗ NOT FOUND at: {path}") + if parent.IsValid(): + children = [c.GetName() for c in parent.GetChildren()] + print(f" parent exists, children: {children}") + else: + gp_path = "/".join(path.split("/")[:-2]) + gp = stage.GetPrimAtPath(gp_path) + print(f" grandparent '{gp_path}' valid={gp.IsValid()}") + + def read_gripper_contact() -> str: + """Return contact status string by reading OmniGraph node USD attributes.""" + try: + import omni.usd + stage = omni.usd.get_context().get_stage() + c_r, f_r = _og_contact_reading(stage, 
_CONTACT_OG_NODES["left_r"]) + c_l, f_l = _og_contact_reading(stage, _CONTACT_OG_NODES["left_l"]) + if f_r != f_r: # nan → prim not found + return "prim not found (run with diag)" + sym_r = "✓" if c_r else "✗" + sym_l = "✓" if c_l else "✗" + return f"{sym_r}R={f_r:.1f}N {sym_l}L={f_l:.1f}N" + except Exception as _e: + return f"N/A ({_e})" + + # One-time diagnostic: verify OmniGraph contact sensor node paths + _diag_contact_nodes() + + # Create debug viewport windows (always on) + create_robot_viewports() + + + + if is_dual_arm: + teleop_right_ref.reset() + + print("\n" + "=" * 50) + print(" Teleoperation Started!") + if is_dual_arm: + print(" LEFT controller → left arm") + print(" RIGHT controller → right arm") + print(" TRIGGER: open/close gripper") + print(" GRIP (hold): move the arm") + print(" Left joystick: Y=forward/back, X=turn") + print(" Y (left controller): reset environment") + print("=" * 50 + "\n") + + device = env.unwrapped.device + sim_frame = 0 + obs, _ = env.reset() + clear_ik_target_state() + last_root_left = None + last_root_right = None + neutral_head_rot = None # calibrated on first headset sample after reset + + # trunk_action uses use_default_offset=False, so action = absolute position target. 
+ trunk_cmd = torch.tensor([0.1], dtype=torch.float32) + + while simulation_app.is_running(): + try: + with torch.inference_mode(): + if should_reset: + obs, _ = env.reset() + clear_ik_target_state() + teleop_interface.reset() + if is_dual_arm: + teleop_right_ref.reset() + rebind_robot_viewports() + should_reset = False + sim_frame = 0 + last_root_left = None + last_root_right = None + neutral_head_rot = None # re-calibrate on next headset sample + continue + + policy_obs = obs["policy"] + wheel_np, v_fwd, omega_base = get_chassis_commands() + + # Head tracking: HMD yaw/pitch relative to neutral pose + head_cmd_np, neutral_head_rot = get_head_joint_targets( + teleop_interface.xr_client, neutral_head_rot + ) + head_cmd = torch.tensor(head_cmd_np, dtype=torch.float32) + + if is_dual_arm: + eef_pos_left = policy_obs["eef_pos_left"][0].cpu().numpy() + eef_quat_left = policy_obs["eef_quat_left"][0].cpu().numpy() + eef_pos_right = policy_obs["eef_pos_right"][0].cpu().numpy() + eef_quat_right = policy_obs["eef_quat_right"][0].cpu().numpy() + + left_action = teleop_left.advance(current_eef_pos=eef_pos_left, current_eef_quat=eef_quat_left) + right_action = teleop_right_ref.advance(current_eef_pos=eef_pos_right, current_eef_quat=eef_quat_right) + + # Only recompute root-frame IK target when grip is active; otherwise freeze joints + if teleop_left.grip_active or last_root_left is None: + last_root_left = convert_action_world_to_root(left_action)[:7].clone() + if teleop_right_ref.grip_active or last_root_right is None: + last_root_right = convert_action_world_to_root(right_action)[:7].clone() + + # Action layout depends on task: + # with head: left_arm(7) | wheel(4) | left_gripper(1) | right_arm(7) | right_gripper(1) | head(2) | trunk(1) = 23D + # without head: left_arm(7) | wheel(4) | left_gripper(1) | right_arm(7) | right_gripper(1) = 20D + parts = [ + last_root_left, wheel_np, left_action[7:8], + last_root_right, right_action[7:8], + ] + if has_head: + 
parts.append(head_cmd) + parts.append(trunk_cmd) + action_np = torch.cat(parts) + else: + eef_pos = policy_obs["eef_pos"][0].cpu().numpy() + eef_quat = policy_obs["eef_quat"][0].cpu().numpy() + + action_np = teleop_interface.advance(current_eef_pos=eef_pos, current_eef_quat=eef_quat) + + if teleop_interface.grip_active or last_root_left is None: + last_root_left = convert_action_world_to_root(action_np)[:7].clone() + + # Action layout: arm(7) | wheel(4) | gripper(1) + action_np = torch.cat([last_root_left, wheel_np, action_np[7:8]]) + + actions = action_np.unsqueeze(0).repeat(env.num_envs, 1).to(device) + obs, _, _, _, _ = env.step(actions) + # Direct base velocity override: bypasses isotropic friction limitation + # so skid-steer turning works correctly. + apply_base_velocity(v_fwd, omega_base) + + # Stream sim stereo to VR headset (if enabled) + if streamer is not None: + try: + scene = env.unwrapped.scene + left_rgb = scene["vr_left_eye"].data.output["rgb"][0].cpu().numpy() + right_rgb = scene["vr_right_eye"].data.output["rgb"][0].cpu().numpy() + if not streamer.send_frame(left_rgb, right_rgb): + logger.warning("[Stream] Connection lost. 
Disabling streaming.") + streamer = None + except Exception as e: + if sim_frame % 300 == 0: + logger.warning(f"[Stream] Frame error: {e}") + + sim_frame += 1 + if sim_frame % 30 == 0: + np.set_printoptions(precision=4, suppress=True, floatmode='fixed') + policy_obs = obs["policy"] + joint_pos = policy_obs["joint_pos"][0].cpu().numpy() + last_act = policy_obs["actions"][0].cpu().numpy() + + if sim_frame == 30: + robot = env.unwrapped.scene["robot"] + jnames = robot.joint_names + print(f"\n{'='*70}") + print(f" ALL {len(jnames)} JOINT NAMES AND POSITIONS") + print(f"{'='*70}") + for i, name in enumerate(jnames): + print(f" [{i:2d}] {name:30s} = {joint_pos[i]:+.4f}") + arm_idx = [i for i, n in enumerate(jnames) if n.startswith("l_joint")] + print(f" Deduced left arm indices: {arm_idx}") + print(f"{'='*70}\n") + + if not hasattr(env, '_arm_idx_cache'): + robot = env.unwrapped.scene["robot"] + jnames = robot.joint_names + env._arm_idx_cache = [i for i, n in enumerate(jnames) if n.startswith("l_joint")] + env._right_arm_idx_cache = [i for i, n in enumerate(jnames) if n.startswith("r_joint")] + arm_joints = joint_pos[env._arm_idx_cache] + right_arm_joints = joint_pos[env._right_arm_idx_cache] + + try: + joy_dbg = teleop_interface.xr_client.get_joystick("left") + joy_str = f"[{joy_dbg[0]:+.2f}, {joy_dbg[1]:+.2f}]" + except Exception: + joy_str = "N/A" + + if is_dual_arm: + eef_pos_left = policy_obs["eef_pos_left"][0].cpu().numpy() + eef_quat_left = policy_obs["eef_quat_left"][0].cpu().numpy() + eef_pos_right = policy_obs["eef_pos_right"][0].cpu().numpy() + + print(f"\n---------------- [ROBOT STATE frame={sim_frame}] ----------------") + print(f"| Left Arm Joints (rad): {arm_joints}") + print(f"| Right Arm Joints (rad): {right_arm_joints}") + print(f"| Left EEF Pos (world, m): {eef_pos_left}") + print(f"| Right EEF Pos (world, m): {eef_pos_right}") + print(f"| Left Gripper Contact: {read_gripper_contact()}") + print(f"| Cmd left_arm(abs): {last_act[:7]}") + print(f"| Cmd 
wheel vel (rad/s): {last_act[7:11]}") + print(f"| Cmd left_gripper: {last_act[11:12]} (+1=open -1=close)") + print(f"| Cmd right_arm(abs): {last_act[12:19]}") + print(f"| Cmd right_gripper: {last_act[19:]} (+1=open -1=close)") + print(f"| Joystick (x=turn,y=fwd): {joy_str}") + print(f"----------------------------------------------------------------") + else: + eef_pos = policy_obs["eef_pos"][0].cpu().numpy() + eef_quat = policy_obs["eef_quat"][0].cpu().numpy() + + if not hasattr(env, '_prev_eef_quat'): + env._prev_eef_quat = eef_quat.copy() + prev_q = env._prev_eef_quat + r_prev = R.from_quat([prev_q[1], prev_q[2], prev_q[3], prev_q[0]]) + r_curr = R.from_quat([eef_quat[1], eef_quat[2], eef_quat[3], eef_quat[0]]) + delta_rotvec = (r_curr * r_prev.inv()).as_rotvec() + delta_angle_deg = np.degrees(np.linalg.norm(delta_rotvec)) + delta_axis = delta_rotvec / (np.linalg.norm(delta_rotvec) + 1e-9) + env._prev_eef_quat = eef_quat.copy() + approach_vec = r_curr.apply(np.array([1.0, 0.0, 0.0])) + + print(f"\n---------------- [ROBOT STATE frame={sim_frame}] ----------------") + print(f"| Left Arm Joints (rad): {arm_joints}") + print(f"| EEF Pos (world, m): {eef_pos}") + print(f"| EEF approach vec (world): [{approach_vec[0]:+.3f}, {approach_vec[1]:+.3f}, {approach_vec[2]:+.3f}] (夹爪朝向)") + print(f"| EEF rot since last print: {delta_angle_deg:+.2f}° around [{delta_axis[0]:+.3f},{delta_axis[1]:+.3f},{delta_axis[2]:+.3f}] (world)") + print(f"| Cmd arm(abs pos+quat): {last_act[:7]}") + print(f"| Cmd wheel vel (rad/s): {last_act[7:11]}") + print(f"| Cmd gripper: {last_act[11:]} (+1=open -1=close)") + print(f"| Joystick (x=turn,y=fwd): {joy_str}") + print(f"----------------------------------------------------------------") + + except Exception as e: + logger.error(f"Error during simulation step: {e}") + break + + if streamer is not None: + streamer.close() + teleop_interface.close() + if is_dual_arm: + teleop_right_ref.close() + shared_client.close() + env.close() + + +if __name__ 
== "__main__": + main() + simulation_app.close() diff --git a/scripts/environments/teleoperation/xr_teleop/__init__.py b/scripts/environments/teleoperation/xr_teleop/__init__.py new file mode 100644 index 0000000..0200e2c --- /dev/null +++ b/scripts/environments/teleoperation/xr_teleop/__init__.py @@ -0,0 +1,31 @@ +# SPDX-FileCopyrightText: Copyright (c) 2022-2025, The Isaac Lab Project Developers. +# SPDX-License-Identifier: BSD-3-Clause + +"""XR teleoperation package for MindBot. + +Agent hierarchy: + BaseTeleopAgent → SingleArmXrAgent (8D) + BaseTeleopAgent → DualArmXrAgent (20D) → DualArmHeadXrAgent (23D) +""" + +from .base_agent import BaseTeleopAgent +from .single_arm_agent import SingleArmXrAgent +from .dual_arm_agent import DualArmXrAgent +from .dual_arm_head_agent import DualArmHeadXrAgent +from .xr_controller import XrTeleopController +from .chassis import ChassisController +from .head_tracker import HeadTracker +from .streaming import StreamingManager +from .diagnostics import DiagnosticsReporter + +__all__ = [ + "BaseTeleopAgent", + "SingleArmXrAgent", + "DualArmXrAgent", + "DualArmHeadXrAgent", + "XrTeleopController", + "ChassisController", + "HeadTracker", + "StreamingManager", + "DiagnosticsReporter", +] diff --git a/scripts/environments/teleoperation/xr_teleop/base_agent.py b/scripts/environments/teleoperation/xr_teleop/base_agent.py new file mode 100644 index 0000000..0d34868 --- /dev/null +++ b/scripts/environments/teleoperation/xr_teleop/base_agent.py @@ -0,0 +1,136 @@ +# SPDX-FileCopyrightText: Copyright (c) 2022-2025, The Isaac Lab Project Developers. +# SPDX-License-Identifier: BSD-3-Clause + +"""Base teleoperation agent with main loop, env management, and reset handling.""" + +from __future__ import annotations + +import logging +from abc import ABC, abstractmethod + +import torch + +logger = logging.getLogger(__name__) + + +class BaseTeleopAgent(ABC): + """Abstract base for XR teleoperation agents. 
+ + Provides the main simulation loop, env lifecycle, and reset handling. + Subclasses implement assemble_action() to define their control dimensions. + """ + + def __init__(self, env, simulation_app, *, debug_viewports: bool = True): + self.env = env + self.simulation_app = simulation_app + self.device = env.unwrapped.device + self.num_envs = env.num_envs + self._should_reset = False + self.sim_frame = 0 + + # Viewport management + self._robot_viewports: dict[str, object] = {} + self._robot_cam_prims: dict[str, str] = { + "Left Hand": "/World/envs/env_0/Robot/rm_65_fb_left/Link_6/robot___left_8_02/cam_left_hand", + "Right Hand": "/World/envs/env_0/Robot/rm_65_b_right/Link6/robot___right_8_2/cam_right_hand", + "Head": "/World/envs/env_0/Robot/robot_head/ZED_X/base_link/ZED_X/CameraLeft", + "Chest": "/World/envs/env_0/Robot/robot_trunk/cam_chest", + } + if debug_viewports: + self._create_viewports() + + def request_reset(self): + self._should_reset = True + print("[INFO] Reset requested via XR button.") + + def _create_viewports(self): + try: + import omni.kit.viewport.utility as vp_util + except ImportError: + logger.warning("[Viewport] omni.kit.viewport.utility not available.") + return + for name, cam_path in self._robot_cam_prims.items(): + vp_win = vp_util.create_viewport_window(f"Robot {name} View", width=640, height=360) + vp_win.viewport_api.camera_path = cam_path + self._robot_viewports[name] = vp_win + print(f"[INFO] Viewport 'Robot {name} View' bound to: {cam_path}") + + def _rebind_viewports(self): + for name, vp_win in self._robot_viewports.items(): + vp_win.viewport_api.camera_path = self._robot_cam_prims[name] + + def _clear_ik_target_state(self): + """Sync IK controller internals to current EEF pose to prevent jumps after reset.""" + action_terms = self.env.action_manager._terms + for key in self._ik_action_term_names(): + if key in action_terms: + term = action_terms[key] + ee_pos_b, ee_quat_b = term._compute_frame_pose() + term._raw_actions.zero_() 
+ term._processed_actions.zero_() + term._ik_controller._command.zero_() + term._ik_controller.ee_pos_des[:] = ee_pos_b + term._ik_controller.ee_quat_des[:] = ee_quat_b + + def _ik_action_term_names(self) -> list[str]: + """Return IK action term names to clear on reset. Override in subclass.""" + return ["arm_action"] + + def _do_reset(self, obs): + """Execute reset sequence. Returns new obs.""" + obs, _ = self.env.reset() + self._clear_ik_target_state() + self._rebind_viewports() + self._should_reset = False + self.sim_frame = 0 + self.on_reset() + return obs + + def on_reset(self): + """Hook for subclasses to reset their own state.""" + pass + + @abstractmethod + def assemble_action(self, obs) -> torch.Tensor: + """Build the full action tensor for env.step(). Must be implemented by subclass.""" + ... + + def post_step(self, obs): + """Hook called after env.step(). Subclasses can add streaming, diagnostics, etc.""" + pass + + def run(self): + """Main simulation loop.""" + obs, _ = self.env.reset() + self._clear_ik_target_state() + self.on_reset() + + self._print_banner() + + while self.simulation_app.is_running(): + try: + with torch.inference_mode(): + if self._should_reset: + obs = self._do_reset(obs) + continue + + action = self.assemble_action(obs) + actions = action.unsqueeze(0).repeat(self.num_envs, 1).to(self.device) + obs, _, _, _, _ = self.env.step(actions) + + self.sim_frame += 1 + self.post_step(obs) + except Exception as e: + logger.error(f"Error during simulation step: {e}") + break + + self.cleanup() + + def _print_banner(self): + print("\n" + "=" * 50) + print(" Teleoperation Started!") + print("=" * 50 + "\n") + + def cleanup(self): + """Release resources. 
Override in subclass to close controllers, streamers, etc.""" + self.env.close() diff --git a/scripts/environments/teleoperation/xr_teleop/chassis.py b/scripts/environments/teleoperation/xr_teleop/chassis.py new file mode 100644 index 0000000..d977bf5 --- /dev/null +++ b/scripts/environments/teleoperation/xr_teleop/chassis.py @@ -0,0 +1,58 @@ +# SPDX-FileCopyrightText: Copyright (c) 2022-2025, The Isaac Lab Project Developers. +# SPDX-License-Identifier: BSD-3-Clause + +"""Chassis (mobile base) control via XR joystick.""" + +import torch + + +class ChassisController: + """Converts left joystick input into wheel velocities and direct base velocity commands. + + Args: + base_speed: Max wheel speed (rad/s) at full joystick. + base_turn: Max wheel differential (rad/s) at full joystick. + drive_speed: Max robot linear speed (m/s) for direct base control. + drive_turn: Max robot yaw rate (rad/s) for direct base control. + """ + + def __init__(self, base_speed: float = 5.0, base_turn: float = 2.0, + drive_speed: float = 0.5, drive_turn: float = 1.5): + self.base_speed = base_speed + self.base_turn = base_turn + self.drive_speed = drive_speed + self.drive_turn = drive_turn + + def get_commands(self, xr_client) -> tuple[torch.Tensor, float, float]: + """Read left joystick and return (wheel_cmd_4D, v_fwd_m_s, omega_rad_s).""" + try: + joy = xr_client.get_joystick("left") + jy = float(joy[1]) + jx = float(joy[0]) + except Exception: + return torch.zeros(4), 0.0, 0.0 + + v_w = jy * self.base_speed + omega_w = jx * self.base_turn + right_vel = v_w - omega_w + left_vel = v_w + omega_w + wheel_cmd = torch.tensor([right_vel, left_vel, left_vel, right_vel], dtype=torch.float32) + + v_fwd = jy * self.drive_speed + omega = -jx * self.drive_turn # left push → positive yaw = left turn + return wheel_cmd, v_fwd, omega + + @staticmethod + def apply_base_velocity(robot, v_fwd: float, omega: float, num_envs: int, device) -> None: + """Directly set robot root velocity to bypass isotropic 
friction for skid-steer turning.""" + if abs(v_fwd) < 1e-4 and abs(omega) < 1e-4: + return + rq = robot.data.root_quat_w # [N, 4] wxyz + w_q, x_q, y_q, z_q = rq[:, 0], rq[:, 1], rq[:, 2], rq[:, 3] + fwd_x = 1.0 - 2.0 * (y_q * y_q + z_q * z_q) + fwd_y = 2.0 * (x_q * y_q + w_q * z_q) + vel = torch.zeros(num_envs, 6, device=device) + vel[:, 0] = v_fwd * fwd_x + vel[:, 1] = v_fwd * fwd_y + vel[:, 5] = omega + robot.write_root_velocity_to_sim(vel) diff --git a/scripts/environments/teleoperation/xr_teleop/diagnostics.py b/scripts/environments/teleoperation/xr_teleop/diagnostics.py new file mode 100644 index 0000000..7487297 --- /dev/null +++ b/scripts/environments/teleoperation/xr_teleop/diagnostics.py @@ -0,0 +1,91 @@ +# SPDX-FileCopyrightText: Copyright (c) 2022-2025, The Isaac Lab Project Developers. +# SPDX-License-Identifier: BSD-3-Clause + +"""Periodic diagnostics reporter for teleoperation debugging.""" + +import numpy as np + + +class DiagnosticsReporter: + """Prints robot state info every N frames during teleoperation.""" + + def __init__(self, interval: int = 30, is_dual_arm: bool = False): + self.interval = interval + self.is_dual_arm = is_dual_arm + self._left_arm_idx: list[int] | None = None + self._right_arm_idx: list[int] | None = None + self._joint_names_printed = False + + def _ensure_indices(self, robot): + if self._left_arm_idx is None: + jnames = robot.joint_names + self._left_arm_idx = [i for i, n in enumerate(jnames) if n.startswith("l_joint")] + self._right_arm_idx = [i for i, n in enumerate(jnames) if n.startswith("r_joint")] + + def report(self, env, obs, frame: int, last_act: np.ndarray | None = None, + xr_client=None, read_gripper_contact=None): + """Print diagnostics if frame is on interval boundary.""" + if frame % self.interval != 0: + return + + np.set_printoptions(precision=4, suppress=True, floatmode='fixed') + policy_obs = obs["policy"] + joint_pos = policy_obs["joint_pos"][0].cpu().numpy() + if last_act is None: + last_act = 
policy_obs["actions"][0].cpu().numpy() + + robot = env.unwrapped.scene["robot"] + self._ensure_indices(robot) + + # Print full joint table on first report + if not self._joint_names_printed: + jnames = robot.joint_names + print(f"\n{'=' * 70}") + print(f" ALL {len(jnames)} JOINT NAMES AND POSITIONS") + print(f"{'=' * 70}") + for i, name in enumerate(jnames): + print(f" [{i:2d}] {name:30s} = {joint_pos[i]:+.4f}") + print(f" Deduced left arm indices: {self._left_arm_idx}") + print(f"{'=' * 70}\n") + self._joint_names_printed = True + + arm_joints = joint_pos[self._left_arm_idx] + right_arm_joints = joint_pos[self._right_arm_idx] + + joy_str = "N/A" + if xr_client is not None: + try: + joy_dbg = xr_client.get_joystick("left") + joy_str = f"[{joy_dbg[0]:+.2f}, {joy_dbg[1]:+.2f}]" + except Exception: + pass + + if self.is_dual_arm: + eef_pos_left = policy_obs["eef_pos_left"][0].cpu().numpy() + eef_pos_right = policy_obs["eef_pos_right"][0].cpu().numpy() + contact_str = read_gripper_contact() if read_gripper_contact else "N/A" + + print(f"\n---------------- [ROBOT STATE frame={frame}] ----------------") + print(f"| Left Arm Joints (rad): {arm_joints}") + print(f"| Right Arm Joints (rad): {right_arm_joints}") + print(f"| Left EEF Pos (world, m): {eef_pos_left}") + print(f"| Right EEF Pos (world, m): {eef_pos_right}") + print(f"| Left Gripper Contact: {contact_str}") + print(f"| Cmd left_arm(abs): {last_act[:7]}") + print(f"| Cmd wheel vel (rad/s): {last_act[7:11]}") + print(f"| Cmd left_gripper: {last_act[11:12]} (+1=open -1=close)") + print(f"| Cmd right_arm(abs): {last_act[12:19]}") + print(f"| Cmd right_gripper: {last_act[19:]} (+1=open -1=close)") + print(f"| Joystick (x=turn,y=fwd): {joy_str}") + print(f"----------------------------------------------------------------") + else: + eef_pos = policy_obs["eef_pos"][0].cpu().numpy() + + print(f"\n---------------- [ROBOT STATE frame={frame}] ----------------") + print(f"| Left Arm Joints (rad): {arm_joints}") + print(f"| 
EEF Pos (world, m): {eef_pos}") + print(f"| Cmd arm(abs pos+quat): {last_act[:7]}") + print(f"| Cmd wheel vel (rad/s): {last_act[7:11]}") + print(f"| Cmd gripper: {last_act[11:]} (+1=open -1=close)") + print(f"| Joystick (x=turn,y=fwd): {joy_str}") + print(f"----------------------------------------------------------------") diff --git a/scripts/environments/teleoperation/xr_teleop/dual_arm_agent.py b/scripts/environments/teleoperation/xr_teleop/dual_arm_agent.py new file mode 100644 index 0000000..b0010ab --- /dev/null +++ b/scripts/environments/teleoperation/xr_teleop/dual_arm_agent.py @@ -0,0 +1,115 @@ +# SPDX-FileCopyrightText: Copyright (c) 2022-2025, The Isaac Lab Project Developers. +# SPDX-License-Identifier: BSD-3-Clause + +"""Dual-arm XR teleoperation agent with chassis control.""" + +from __future__ import annotations + +import torch + +from xr_utils import XrClient + +from .base_agent import BaseTeleopAgent +from .xr_controller import XrTeleopController +from .chassis import ChassisController +from .frame_utils import convert_action_world_to_root + + +class DualArmXrAgent(BaseTeleopAgent): + """Dual-arm teleoperation with chassis joystick control. 
+ + Action: left_arm(7) | wheel(4) | left_grip(1) | right_arm(7) | right_grip(1) = 20D + """ + + def __init__(self, env, simulation_app, *, + pos_sensitivity: float = 1.0, + rot_sensitivity: float = 0.3, + base_speed: float = 5.0, + base_turn: float = 2.0, + drive_speed: float = 0.5, + drive_turn: float = 1.5, + debug_viewports: bool = True): + super().__init__(env, simulation_app, debug_viewports=debug_viewports) + + self.shared_client = XrClient() + self.teleop_left = XrTeleopController( + arm="left", pos_sensitivity=pos_sensitivity, + rot_sensitivity=rot_sensitivity, xr_client=self.shared_client, + ) + self.teleop_right = XrTeleopController( + arm="right", pos_sensitivity=pos_sensitivity, + rot_sensitivity=rot_sensitivity, xr_client=self.shared_client, + ) + self.teleop_left.add_callback("RESET", self.request_reset) + + self.chassis = ChassisController( + base_speed=base_speed, base_turn=base_turn, + drive_speed=drive_speed, drive_turn=drive_turn, + ) + + self._last_root_left = None + self._last_root_right = None + + @property + def xr_client(self): + return self.shared_client + + def _ik_action_term_names(self) -> list[str]: + return ["left_arm_action", "right_arm_action"] + + def on_reset(self): + self.teleop_left.reset() + self.teleop_right.reset() + self._last_root_left = None + self._last_root_right = None + + def assemble_action(self, obs) -> torch.Tensor: + policy_obs = obs["policy"] + robot = self.env.unwrapped.scene["robot"] + + # Read chassis + wheel_cmd, self._v_fwd, self._omega = self.chassis.get_commands(self.shared_client) + + # Left arm + eef_pos_left = policy_obs["eef_pos_left"][0].cpu().numpy() + eef_quat_left = policy_obs["eef_quat_left"][0].cpu().numpy() + left_action = self.teleop_left.advance(current_eef_pos=eef_pos_left, current_eef_quat=eef_quat_left) + + # Right arm + eef_pos_right = policy_obs["eef_pos_right"][0].cpu().numpy() + eef_quat_right = policy_obs["eef_quat_right"][0].cpu().numpy() + right_action = 
self.teleop_right.advance(current_eef_pos=eef_pos_right, current_eef_quat=eef_quat_right) + + # Joint-locking: only convert when grip active + if self.teleop_left.grip_active or self._last_root_left is None: + self._last_root_left = convert_action_world_to_root(left_action, robot)[:7].clone() + if self.teleop_right.grip_active or self._last_root_right is None: + self._last_root_right = convert_action_world_to_root(right_action, robot)[:7].clone() + + # left_arm(7) | wheel(4) | left_grip(1) | right_arm(7) | right_grip(1) + return torch.cat([ + self._last_root_left, wheel_cmd, left_action[7:8], + self._last_root_right, right_action[7:8], + ]) + + def post_step(self, obs): + # Apply direct base velocity override for skid-steer + robot = self.env.unwrapped.scene["robot"] + self.chassis.apply_base_velocity(robot, self._v_fwd, self._omega, self.num_envs, self.device) + + def cleanup(self): + self.teleop_left.close() + self.teleop_right.close() + self.shared_client.close() + super().cleanup() + + def _print_banner(self): + print("\n" + "=" * 50) + print(" Teleoperation Started!") + print(" LEFT controller -> left arm") + print(" RIGHT controller -> right arm") + print(" TRIGGER: open/close gripper") + print(" GRIP (hold): move the arm") + print(" Left joystick: Y=forward/back, X=turn") + print(" Y (left controller): reset environment") + print("=" * 50 + "\n") diff --git a/scripts/environments/teleoperation/xr_teleop/dual_arm_head_agent.py b/scripts/environments/teleoperation/xr_teleop/dual_arm_head_agent.py new file mode 100644 index 0000000..95ecc77 --- /dev/null +++ b/scripts/environments/teleoperation/xr_teleop/dual_arm_head_agent.py @@ -0,0 +1,85 @@ +# SPDX-FileCopyrightText: Copyright (c) 2022-2025, The Isaac Lab Project Developers. 
+# SPDX-License-Identifier: BSD-3-Clause + +"""Dual-arm + head tracking + trunk + stereo streaming XR agent.""" + +from __future__ import annotations + +import torch + +from .dual_arm_agent import DualArmXrAgent +from .head_tracker import HeadTracker +from .streaming import StreamingManager +from .diagnostics import DiagnosticsReporter + + +class DualArmHeadXrAgent(DualArmXrAgent): + """Extends DualArmXrAgent with head tracking, trunk hold, and VR stereo streaming. + + Action: left_arm(7) | wheel(4) | left_grip(1) | right_arm(7) | right_grip(1) | head(2) | trunk(1) = 23D + """ + + def __init__(self, env, simulation_app, *, + pos_sensitivity: float = 1.0, + rot_sensitivity: float = 0.3, + base_speed: float = 5.0, + base_turn: float = 2.0, + drive_speed: float = 0.5, + drive_turn: float = 1.5, + stream_to: str | None = None, + stream_port: int = 12345, + stream_bitrate: int = 20_000_000, + trunk_target: float = 0.1, + debug_viewports: bool = True): + super().__init__( + env, simulation_app, + pos_sensitivity=pos_sensitivity, rot_sensitivity=rot_sensitivity, + base_speed=base_speed, base_turn=base_turn, + drive_speed=drive_speed, drive_turn=drive_turn, + debug_viewports=debug_viewports, + ) + + self.head_tracker = HeadTracker() + self.trunk_cmd = torch.tensor([trunk_target], dtype=torch.float32) + + # Streaming + self.streamer: StreamingManager | None = None + if stream_to: + scene = env.unwrapped.scene + self.streamer = StreamingManager(stream_to, stream_port, scene, bitrate=stream_bitrate) + + # Diagnostics + self.diagnostics = DiagnosticsReporter(interval=30, is_dual_arm=True) + + def on_reset(self): + super().on_reset() + self.head_tracker.reset() + + def assemble_action(self, obs) -> torch.Tensor: + base_action = super().assemble_action(obs) + + # Head tracking + head_targets = self.head_tracker.get_targets(self.shared_client) + head_cmd = torch.tensor(head_targets, dtype=torch.float32) + + # base(20) | head(2) | trunk(1) + return torch.cat([base_action, 
head_cmd, self.trunk_cmd]) + + def post_step(self, obs): + super().post_step(obs) + + # Stereo streaming + if self.streamer is not None: + scene = self.env.unwrapped.scene + self.streamer.send(scene, self.sim_frame) + + # Diagnostics + self.diagnostics.report( + self.env, obs, self.sim_frame, + xr_client=self.shared_client, + ) + + def cleanup(self): + if self.streamer is not None: + self.streamer.close() + super().cleanup() diff --git a/scripts/environments/teleoperation/xr_teleop/frame_utils.py b/scripts/environments/teleoperation/xr_teleop/frame_utils.py new file mode 100644 index 0000000..beafe45 --- /dev/null +++ b/scripts/environments/teleoperation/xr_teleop/frame_utils.py @@ -0,0 +1,55 @@ +# SPDX-FileCopyrightText: Copyright (c) 2022-2025, The Isaac Lab Project Developers. +# SPDX-License-Identifier: BSD-3-Clause + +"""Coordinate frame utilities for XR teleoperation.""" + +import numpy as np +import torch +from scipy.spatial.transform import Rotation as R + + +def quat_wxyz_to_rotation(quat_wxyz: np.ndarray) -> R: + """Convert Isaac-style wxyz quaternion to scipy Rotation.""" + return R.from_quat([quat_wxyz[1], quat_wxyz[2], quat_wxyz[3], quat_wxyz[0]]) + + +def rotation_to_quat_wxyz(rot: R) -> np.ndarray: + """Convert scipy Rotation to Isaac-style wxyz quaternion.""" + q = rot.as_quat() + return np.array([q[3], q[0], q[1], q[2]]) + + +def world_pose_to_root_frame( + pos_w: np.ndarray, + quat_wxyz: np.ndarray, + root_pos_w: np.ndarray, + root_quat_wxyz: np.ndarray, +) -> tuple[np.ndarray, np.ndarray]: + """Express a world-frame pose in the robot root frame.""" + root_rot = quat_wxyz_to_rotation(root_quat_wxyz) + pose_rot = quat_wxyz_to_rotation(quat_wxyz) + pos_root = root_rot.inv().apply(pos_w - root_pos_w) + quat_root = rotation_to_quat_wxyz(root_rot.inv() * pose_rot) + return pos_root, quat_root + + +def convert_action_world_to_root(action_tensor: torch.Tensor, robot) -> torch.Tensor: + """Convert absolute IK action from world frame to robot root frame. 
+ + Args: + action_tensor: 8D tensor [x, y, z, qw, qx, qy, qz, gripper] in world frame. + robot: Isaac Lab Articulation object (env.unwrapped.scene["robot"]). + + Returns: + 8D tensor with pos/quat converted to robot root frame. + """ + root_pos_w = robot.data.root_pos_w[0].detach().cpu().numpy() + root_quat_w = robot.data.root_quat_w[0].detach().cpu().numpy() + pos_root, quat_root = world_pose_to_root_frame( + action_tensor[:3].numpy(), action_tensor[3:7].numpy(), + root_pos_w, root_quat_w, + ) + out = action_tensor.clone() + out[:3] = torch.tensor(pos_root, dtype=torch.float32) + out[3:7] = torch.tensor(quat_root, dtype=torch.float32) + return out diff --git a/scripts/environments/teleoperation/xr_teleop/head_tracker.py b/scripts/environments/teleoperation/xr_teleop/head_tracker.py new file mode 100644 index 0000000..51b5f7d --- /dev/null +++ b/scripts/environments/teleoperation/xr_teleop/head_tracker.py @@ -0,0 +1,44 @@ +# SPDX-FileCopyrightText: Copyright (c) 2022-2025, The Isaac Lab Project Developers. +# SPDX-License-Identifier: BSD-3-Clause + +"""Head tracking via XR headset (HMD) for robot head joint control.""" + +import numpy as np +from scipy.spatial.transform import Rotation as R + +from xr_utils import transform_xr_pose, is_valid_quaternion + + +class HeadTracker: + """Extracts yaw/pitch from HMD pose relative to a calibrated neutral orientation.""" + + def __init__(self, yaw_limit: float = 1.57, pitch_limit: float = 0.52): + self.yaw_limit = yaw_limit # ±90° + self.pitch_limit = pitch_limit # ±30° + self.neutral_rot: R | None = None + + def reset(self): + """Clear neutral calibration; next call to get_targets() will re-calibrate.""" + self.neutral_rot = None + + def get_targets(self, xr_client) -> np.ndarray: + """Return [yaw, pitch] in radians relative to neutral HMD pose. + + First call after reset() records neutral orientation and returns zeros. 
+ """ + try: + raw_pose = xr_client.get_pose("headset") + if not is_valid_quaternion(raw_pose[3:]): + return np.zeros(2, dtype=np.float32) + _, head_world_quat_xyzw = transform_xr_pose(raw_pose[:3], raw_pose[3:]) + head_rot = R.from_quat(head_world_quat_xyzw) + if self.neutral_rot is None: + self.neutral_rot = head_rot + return np.zeros(2, dtype=np.float32) + rel_rot = head_rot * self.neutral_rot.inv() + euler_zyx = rel_rot.as_euler("ZYX") + yaw = float(np.clip(euler_zyx[0], -self.yaw_limit, self.yaw_limit)) + pitch = float(np.clip(euler_zyx[1], -self.pitch_limit, self.pitch_limit)) + return np.array([yaw, pitch], dtype=np.float32) + except Exception: + return np.zeros(2, dtype=np.float32) diff --git a/scripts/environments/teleoperation/xr_teleop/single_arm_agent.py b/scripts/environments/teleoperation/xr_teleop/single_arm_agent.py new file mode 100644 index 0000000..5e8bb84 --- /dev/null +++ b/scripts/environments/teleoperation/xr_teleop/single_arm_agent.py @@ -0,0 +1,60 @@ +# SPDX-FileCopyrightText: Copyright (c) 2022-2025, The Isaac Lab Project Developers. +# SPDX-License-Identifier: BSD-3-Clause + +"""Single-arm XR teleoperation agent.""" + +from __future__ import annotations + +import torch + +from .base_agent import BaseTeleopAgent +from .xr_controller import XrTeleopController +from .frame_utils import convert_action_world_to_root + + +class SingleArmXrAgent(BaseTeleopAgent): + """Single-arm teleoperation: one XR controller -> one robot arm + gripper. 
+ + Action: arm(7) | gripper(1) = 8D + """ + + def __init__(self, env, simulation_app, *, + arm: str = "left", + pos_sensitivity: float = 1.0, + rot_sensitivity: float = 0.3, + xr_client=None, + debug_viewports: bool = True): + super().__init__(env, simulation_app, debug_viewports=debug_viewports) + + self.xr_client = xr_client + self.teleop = XrTeleopController( + arm=arm, pos_sensitivity=pos_sensitivity, + rot_sensitivity=rot_sensitivity, xr_client=xr_client, + ) + self.teleop.add_callback("RESET", self.request_reset) + self._last_root = None + + def _ik_action_term_names(self) -> list[str]: + return ["arm_action"] + + def on_reset(self): + self.teleop.reset() + self._last_root = None + + def assemble_action(self, obs) -> torch.Tensor: + policy_obs = obs["policy"] + eef_pos = policy_obs["eef_pos"][0].cpu().numpy() + eef_quat = policy_obs["eef_quat"][0].cpu().numpy() + + action = self.teleop.advance(current_eef_pos=eef_pos, current_eef_quat=eef_quat) + + robot = self.env.unwrapped.scene["robot"] + if self.teleop.grip_active or self._last_root is None: + self._last_root = convert_action_world_to_root(action, robot)[:7].clone() + + # arm(7) | gripper(1) + return torch.cat([self._last_root, action[7:8]]) + + def cleanup(self): + self.teleop.close() + super().cleanup() diff --git a/scripts/environments/teleoperation/xr_teleop/streaming.py b/scripts/environments/teleoperation/xr_teleop/streaming.py new file mode 100644 index 0000000..efc7df5 --- /dev/null +++ b/scripts/environments/teleoperation/xr_teleop/streaming.py @@ -0,0 +1,65 @@ +# SPDX-FileCopyrightText: Copyright (c) 2022-2025, The Isaac Lab Project Developers. +# SPDX-License-Identifier: BSD-3-Clause + +"""Sim-to-VR H264 stereo streaming manager.""" + +import logging + +logger = logging.getLogger(__name__) + + +class StreamingManager: + """Manages stereo camera → H264 → TCP streaming to VR headset. + + Wraps SimVideoStreamer with graceful error handling and auto-disable on failure. 
+ """ + + def __init__(self, host: str, port: int, scene, bitrate: int = 20_000_000): + self._streamer = None + self._enabled = False + + if "vr_left_eye" not in scene.sensors: + logger.warning("[Stream] vr_left_eye camera not found. Streaming disabled.") + return + + try: + from mindbot.utils.sim_video_streamer import SimVideoStreamer + cam = scene["vr_left_eye"] + self._streamer = SimVideoStreamer( + host=host, port=port, + width=cam.cfg.width, height=cam.cfg.height, + fps=30, bitrate=bitrate, + ) + if self._streamer.connect(): + self._enabled = True + print(f"[INFO] Sim stereo streaming to {host}:{port} " + f"({cam.cfg.width * 2}x{cam.cfg.height} SBS @ 30fps)") + else: + logger.error("[Stream] Failed to connect. Streaming disabled.") + self._streamer = None + except ImportError: + logger.error("[Stream] mindbot.utils.sim_video_streamer not found. pip install av?") + except Exception as e: + logger.error(f"[Stream] Init failed: {e}") + + @property + def enabled(self) -> bool: + return self._enabled + + def send(self, scene, frame_count: int = 0) -> None: + """Send one stereo frame from scene cameras. Auto-disables on failure.""" + if not self._enabled: + return + try: + left_rgb = scene["vr_left_eye"].data.output["rgb"][0].cpu().numpy() + right_rgb = scene["vr_right_eye"].data.output["rgb"][0].cpu().numpy() + if not self._streamer.send_frame(left_rgb, right_rgb): + logger.warning("[Stream] Connection lost. 
Disabling streaming.")
+                self._enabled = False
+        except Exception as e:
+            if frame_count % 300 == 0:
+                logger.warning(f"[Stream] Frame error: {e}")
+
+    def close(self):
+        if self._streamer is not None:
+            self._streamer.close()
diff --git a/scripts/environments/teleoperation/xr_teleop/xr_controller.py b/scripts/environments/teleoperation/xr_teleop/xr_controller.py
new file mode 100644
index 0000000..72c4f81
--- /dev/null
+++ b/scripts/environments/teleoperation/xr_teleop/xr_controller.py
@@ -0,0 +1,176 @@
+# SPDX-FileCopyrightText: Copyright (c) 2022-2025, The Isaac Lab Project Developers.
+# SPDX-License-Identifier: BSD-3-Clause
+
+"""XR teleoperation controller with absolute IK target accumulation and grip clutch."""
+
+from collections.abc import Callable
+
+import numpy as np
+import torch
+from scipy.spatial.transform import Rotation as R
+
+from xr_utils import XrClient, transform_xr_pose, quat_diff_as_rotvec_xyzw, is_valid_quaternion
+
+from .frame_utils import quat_wxyz_to_rotation, rotation_to_quat_wxyz
+
+
+class XrTeleopController:
+    """Teleop controller for PICO XR headset (absolute IK mode).
+
+    Accumulates an absolute EEF target in Isaac Sim world frame.
+    Grip button acts as clutch; the left controller's Y button triggers environment reset.
+    
+ """ + + def __init__(self, arm="left", pos_sensitivity=1.0, rot_sensitivity=0.3, + max_pos_per_step=0.05, max_rot_per_step=0.08, xr_client=None): + self.xr_client = xr_client if xr_client is not None else XrClient() + self._owns_client = xr_client is None + self.pos_sensitivity = pos_sensitivity + self.rot_sensitivity = rot_sensitivity + self.max_pos_per_step = max_pos_per_step + self.max_rot_per_step = max_rot_per_step + self.arm = arm + + self.controller_name = "left_controller" if arm == "left" else "right_controller" + self.grip_name = "left_grip" if arm == "left" else "right_grip" + self.trigger_name = "left_trigger" if arm == "left" else "right_trigger" + + from xr_utils.geometry import R_HEADSET_TO_WORLD + self.R_headset_world = R_HEADSET_TO_WORLD + + self.prev_xr_pos = None + self.prev_xr_quat = None + self.target_eef_pos = None + self.target_eef_quat = None + + self.grip_active = False + self.frame_count = 0 + self.reset_button_latched = False + self.require_grip_reengage = False + self.grip_engage_threshold = 0.8 + self.grip_release_threshold = 0.2 + self.callbacks = {} + + def add_callback(self, name: str, func: Callable): + self.callbacks[name] = func + + def reset(self): + self.prev_xr_pos = None + self.prev_xr_quat = None + self.grip_active = False + self.frame_count = 0 + self.target_eef_pos = None + self.target_eef_quat = None + self.require_grip_reengage = True + + def close(self): + if self._owns_client: + self.xr_client.close() + + def get_zero_action(self, trigger, current_eef_pos=None, current_eef_quat=None) -> torch.Tensor: + """Return 8D hold action [x, y, z, qw, qx, qy, qz, gripper] at frozen target pose.""" + action = torch.zeros(8) + if self.target_eef_pos is not None and self.target_eef_quat is not None: + action[:3] = torch.tensor(self.target_eef_pos.copy()) + action[3:7] = torch.tensor(self.target_eef_quat.copy()) + elif current_eef_pos is not None and current_eef_quat is not None: + action[:3] = torch.tensor(current_eef_pos.copy()) + 
action[3:7] = torch.tensor(current_eef_quat.copy()) + else: + action[3] = 1.0 + action[7] = 1.0 if trigger > 0.5 else -1.0 + return action + + def advance(self, current_eef_pos=None, current_eef_quat=None) -> torch.Tensor: + """Read XR controller and return 8D absolute action [x, y, z, qw, qx, qy, qz, gripper]. + + Position and quaternion are in Isaac Sim world frame. + Caller must convert to robot root frame before sending to IK. + """ + try: + if self.arm == "left": + reset_pressed = self.xr_client.get_button("Y") + else: + reset_pressed = False + if reset_pressed and not self.reset_button_latched: + if "RESET" in self.callbacks: + self.callbacks["RESET"]() + self.reset_button_latched = reset_pressed + except Exception: + pass + + try: + raw_pose = self.xr_client.get_pose(self.controller_name) + grip = self.xr_client.get_key_value(self.grip_name) + trigger = self.xr_client.get_key_value(self.trigger_name) + except Exception: + return self.get_zero_action(0.0, current_eef_pos, current_eef_quat) + + if not is_valid_quaternion(raw_pose[3:]): + return self.get_zero_action(trigger, current_eef_pos, current_eef_quat) + + if self.require_grip_reengage: + if grip <= self.grip_release_threshold: + self.require_grip_reengage = False + else: + if current_eef_pos is not None and current_eef_quat is not None: + self.target_eef_pos = current_eef_pos.copy() + self.target_eef_quat = current_eef_quat.copy() + return self.get_zero_action(trigger, current_eef_pos, current_eef_quat) + + grip_pressed = grip > self.grip_release_threshold if self.grip_active else grip >= self.grip_engage_threshold + + if not grip_pressed: + self.prev_xr_pos = None + self.prev_xr_quat = None + if self.grip_active or self.target_eef_pos is None: + if current_eef_pos is not None and current_eef_quat is not None: + self.target_eef_pos = current_eef_pos.copy() + self.target_eef_quat = current_eef_quat.copy() + self.grip_active = False + return self.get_zero_action(trigger, current_eef_pos, current_eef_quat) 
+ + if not self.grip_active: + self.prev_xr_pos = raw_pose[:3].copy() + self.prev_xr_quat = raw_pose[3:].copy() + if current_eef_pos is not None and current_eef_quat is not None: + self.target_eef_pos = current_eef_pos.copy() + self.target_eef_quat = current_eef_quat.copy() + self.grip_active = True + return self.get_zero_action(trigger, current_eef_pos, current_eef_quat) + + xr_world_pos, xr_world_quat_xyzw = transform_xr_pose(raw_pose[:3], raw_pose[3:]) + prev_xr_world_pos, prev_xr_world_quat_xyzw = transform_xr_pose(self.prev_xr_pos, self.prev_xr_quat) + + world_delta_pos = (xr_world_pos - prev_xr_world_pos) * self.pos_sensitivity + pos_norm = np.linalg.norm(world_delta_pos) + if pos_norm > self.max_pos_per_step: + world_delta_pos *= self.max_pos_per_step / pos_norm + + world_delta_rot = quat_diff_as_rotvec_xyzw(prev_xr_world_quat_xyzw, xr_world_quat_xyzw) * self.rot_sensitivity + rot_norm = np.linalg.norm(world_delta_rot) + if rot_norm > self.max_rot_per_step: + world_delta_rot *= self.max_rot_per_step / rot_norm + + gripper_action = 1.0 if trigger > 0.5 else -1.0 + + if self.target_eef_pos is None: + self.target_eef_pos = np.zeros(3) + self.target_eef_quat = np.array([1.0, 0.0, 0.0, 0.0]) + + self.target_eef_pos += world_delta_pos + + target_r = quat_wxyz_to_rotation(self.target_eef_quat) + self.target_eef_quat = rotation_to_quat_wxyz(R.from_rotvec(world_delta_rot) * target_r) + + self.prev_xr_pos = raw_pose[:3].copy() + self.prev_xr_quat = raw_pose[3:].copy() + self.frame_count += 1 + + action = torch.tensor([ + *self.target_eef_pos, + *self.target_eef_quat, + gripper_action, + ], dtype=torch.float32) + + return action diff --git a/scripts/test_sim_streamer.py b/scripts/test_sim_streamer.py new file mode 100644 index 0000000..39ea005 --- /dev/null +++ b/scripts/test_sim_streamer.py @@ -0,0 +1,97 @@ +#!/usr/bin/env python3 +"""Standalone test for SimVideoStreamer: generates a color-cycling SBS test pattern +and streams it to a TCP server (e.g. 
PICO VR headset). + +Usage: + # 1. On PICO: select SIM-Stereo, click Listen + # 2. On PC: + python scripts/test_sim_streamer.py --host 172.20.103.171 --port 12345 + + The PICO should display a side-by-side test pattern: + left half = cycling red, right half = cycling blue. +""" + +import argparse +import time + +import numpy as np + + +def main(): + parser = argparse.ArgumentParser(description="Test SimVideoStreamer with synthetic frames") + parser.add_argument("--host", type=str, required=True, help="PICO IP address (TCP server)") + parser.add_argument("--port", type=int, default=12345, help="TCP port") + parser.add_argument("--width", type=int, default=960, help="Eye width") + parser.add_argument("--height", type=int, default=540, help="Eye height") + parser.add_argument("--fps", type=int, default=30, help="Target FPS") + parser.add_argument("--duration", type=int, default=30, help="Stream duration in seconds") + args = parser.parse_args() + + # Direct import to avoid pulling in the full mindbot/isaaclab dependency chain + import importlib.util, pathlib + _spec = importlib.util.spec_from_file_location( + "sim_video_streamer", + pathlib.Path(__file__).resolve().parents[1] / "source/mindbot/mindbot/utils/sim_video_streamer.py", + ) + _mod = importlib.util.module_from_spec(_spec) + _spec.loader.exec_module(_mod) + SimVideoStreamer = _mod.SimVideoStreamer + + streamer = SimVideoStreamer( + host=args.host, + port=args.port, + width=args.width, + height=args.height, + fps=args.fps, + ) + + print(f"Connecting to {args.host}:{args.port}...") + if not streamer.connect(): + print("Failed to connect. 
Is PICO listening?")
+        return
+
+    print(f"Streaming {args.width*2}x{args.height} SBS test pattern for {args.duration}s...")
+    frame_interval = 1.0 / args.fps
+    start_time = time.time()
+    frame_count = 0
+
+    try:
+        while time.time() - start_time < args.duration:
+            t = time.time() - start_time
+
+            # Generate test pattern: cycling colors
+            # Left eye: solid red whose intensity cycles over time
+            left = np.zeros((args.height, args.width, 3), dtype=np.uint8)
+            red_val = int(128 + 127 * np.sin(t * 2.0))
+            left[:, :, 0] = red_val  # R channel
+            # White horizontal bar whose vertical position sweeps up and down
+            bar_y = int((args.height - 20) * (0.5 + 0.5 * np.sin(t * 1.5)))
+            left[bar_y:bar_y+20, :, :] = 255
+
+            # Right eye: solid blue whose intensity cycles (phase-shifted from red)
+            right = np.zeros((args.height, args.width, 3), dtype=np.uint8)
+            blue_val = int(128 + 127 * np.sin(t * 2.0 + 1.0))
+            right[:, :, 2] = blue_val  # B channel
+            right[bar_y:bar_y+20, :, :] = 255  # same bar position for stereo reference
+
+            if not streamer.send_frame(left, right):
+                print("Connection lost.")
+                break
+
+            frame_count += 1
+            # Maintain frame rate
+            elapsed = time.time() - start_time
+            expected = frame_count * frame_interval
+            if expected > elapsed:
+                time.sleep(expected - elapsed)
+
+    except KeyboardInterrupt:
+        print("\nStopped by user.")
+
+    elapsed = max(time.time() - start_time, 1e-6)  # guard against div-by-zero on immediate exit
+    print(f"Sent {frame_count} frames in {elapsed:.1f}s ({frame_count/elapsed:.1f} fps)")
+    streamer.close()
+
+
+if __name__ == "__main__":
+    main()
diff --git a/source/mindbot/mindbot/tasks/manager_based/il/open_drybox/mindrobot_2i_cfg.py b/source/mindbot/mindbot/tasks/manager_based/il/open_drybox/mindrobot_2i_cfg.py
index 2c1cad6..f0b18fd 100644
--- a/source/mindbot/mindbot/tasks/manager_based/il/open_drybox/mindrobot_2i_cfg.py
+++ b/source/mindbot/mindbot/tasks/manager_based/il/open_drybox/mindrobot_2i_cfg.py
@@ -13,7 +13,7 @@ The following configurations are available:
 """
 
 import isaaclab.sim as sim_utils
-from isaaclab.actuators import ImplicitActuatorCfg
+from 
isaaclab.actuators import IdealPDActuatorCfg, ImplicitActuatorCfg
 from isaaclab.assets import ArticulationCfg
 
 from mindbot.utils.assets import MINDBOT_ASSETS_DIR
@@ -21,7 +21,7 @@
 MINDBOT_CFG = ArticulationCfg(
     spawn=sim_utils.UsdFileCfg(
         # usd_path=f"{MINDBOT_ASSETS_DIR}/robots/mindrobot_2i/mindrobot_2i.usd",
-        usd_path=f"{MINDBOT_ASSETS_DIR}/robots/mindrobot_2i/contact_sensor.usd",
+        usd_path=f"{MINDBOT_ASSETS_DIR}/robots/mindbot_zed/mindbot.usd",
         activate_contact_sensors=True,
         rigid_props=sim_utils.RigidBodyPropertiesCfg(
             disable_gravity=False,
@@ -29,7 +29,7 @@ MINDBOT_CFG = ArticulationCfg(
         ),
         articulation_props=sim_utils.ArticulationRootPropertiesCfg(
             fix_root_link=False,
-            enabled_self_collisions=True,
+            enabled_self_collisions=False,  # TODO: re-enable after fixing trunk self-collision
             solver_position_iteration_count=8,
             solver_velocity_iteration_count=0,
         ),
@@ -58,11 +58,14 @@ MINDBOT_CFG = ArticulationCfg(
         "right_hand_joint_left": 0.0,
         "right_hand_joint_right": 0.0,
         # ====== Trunk & Head ======
+        # NOTE: pos z = 0.44 (base clearance) + PrismaticJoint value.
+        # If you change PrismaticJoint, update pos z accordingly to avoid
+        # geometry clipping that causes the robot to fall on spawn.
         "PrismaticJoint": 0.26,
         "head_pitch_Joint": 0.0,
         "head_yaw_Joint": 0.0,
     },
-    pos=(0.0, 0.0, 0.7),
+    pos=(0.001, 0.001, 0.74),  # 0.44 (base clearance) + 0.26 (PrismaticJoint) = 0.70, plus 0.04 spawn margin
     ),
     actuators={
         # RM-65 J1-J3: shoulder/elbow — max 180°/s ≈ 3.14 rad/s
@@ -104,12 +107,16 @@ MINDBOT_CFG = ArticulationCfg(
             stiffness=80.0,
             damping=4.0,
         ),
+        # IdealPDActuator: explicitly computes F = k_p*(target-pos) + k_d*(target_vel-vel)
+        # and sends via set_dof_actuation_forces(), bypassing PhysX drive which
+        # doesn't work correctly for prismatic joints with set_dof_position_targets().
+        # NOTE: trunk gravity is disabled in USD, so moderate stiffness is sufficient.
        
"trunk": IdealPDActuatorCfg(
             joint_names_expr=["PrismaticJoint"],
-            effort_limit_sim=200.0,
-            velocity_limit_sim=0.2,
-            stiffness=1000.0,
-            damping=100.0,
+            effort_limit=100.0,
+            velocity_limit=0.2,
+            stiffness=10000.0,
+            damping=1000.0,
         ),
         "wheels": ImplicitActuatorCfg(
             # joint_names_expr=[".*_revolute_Joint"],
@@ -168,6 +175,8 @@
 MINDBOT_HEAD_JOIONTS =[
     "head_pitch_Joint"
 ]
+MINDBOT_TRUNK_JOINT = "PrismaticJoint"
+
 # EEF body names (as defined in the USD asset)
 MINDBOT_LEFT_EEF_BODY = "left_hand_body"
 MINDBOT_RIGHT_EEF_BODY = "right_hand_body"
diff --git a/source/mindbot/mindbot/tasks/manager_based/il/open_drybox/mindrobot_2i_dual_arm_ik_env_cfg.py b/source/mindbot/mindbot/tasks/manager_based/il/open_drybox/mindrobot_2i_dual_arm_ik_env_cfg.py
index ef4e93a..295f9cb 100644
--- a/source/mindbot/mindbot/tasks/manager_based/il/open_drybox/mindrobot_2i_dual_arm_ik_env_cfg.py
+++ b/source/mindbot/mindbot/tasks/manager_based/il/open_drybox/mindrobot_2i_dual_arm_ik_env_cfg.py
@@ -30,6 +30,7 @@ from isaaclab.managers import TerminationTermCfg as DoneTerm
 from isaaclab.markers.config import FRAME_MARKER_CFG
 from isaaclab.scene import InteractiveSceneCfg
 from isaaclab.sensors.frame_transformer.frame_transformer_cfg import FrameTransformerCfg, OffsetCfg
+from isaaclab.sensors import CameraCfg
 from isaaclab.sim.schemas.schemas_cfg import RigidBodyPropertiesCfg
 from isaaclab.sim.spawners.from_files.from_files_cfg import GroundPlaneCfg
 from isaaclab.sim.spawners.materials import RigidBodyMaterialCfg
@@ -48,6 +49,7 @@ from .mindrobot_2i_cfg import (
     MINDBOT_RIGHT_GRIPPER_JOINTS,
     MINDBOT_WHEEL_JOINTS,
     MINDBOT_HEAD_JOIONTS,
+    MINDBOT_TRUNK_JOINT,
     MINDBOT_LEFT_ARM_PRIM_ROOT,
     MINDBOT_RIGHT_ARM_PRIM_ROOT,
     MINDBOT_LEFT_EEF_PRIM,
@@ -96,6 +98,28 @@ class MindRobotDualArmSceneCfg(InteractiveSceneCfg):
     ee_frame: FrameTransformerCfg = None
     ee_frame_right: FrameTransformerCfg = None
+
+    # VR stereo cameras — reuse existing ZED_X CameraLeft/CameraRight prims from USD
+    # (spawn=None means 
use existing prim, don't create a new one) + # The ZED_X baseline in USD is 120mm; for perfect VR IPD we'd ideally set 65mm, + # but reusing existing prims avoids spawn issues. stereoOffset=0 in video_source.yml + # since sim rendering gives correct perspective regardless of physical baseline. + vr_left_eye: CameraCfg = CameraCfg( + prim_path="{ENV_REGEX_NS}/Robot/robot_head/ZED_X/base_link/ZED_X/CameraLeft", + update_period=1 / 30, + height=1440, # 2560x1440 per eye → SBS 5120x1440; effective ~1920x1440 after shader crop + width=2560, + data_types=["rgb"], + spawn=None, + ) + vr_right_eye: CameraCfg = CameraCfg( + prim_path="{ENV_REGEX_NS}/Robot/robot_head/ZED_X/base_link/ZED_X/CameraRight", + update_period=1 / 30, + height=1440, + width=2560, + data_types=["rgb"], + spawn=None, + ) + # ===================================================================== # Actions Configuration @@ -166,6 +190,13 @@ class MindRobotDualArmActionsCfg: scale=1.0, ) + trunk_action = JointPositionActionCfg( + asset_name="robot", + joint_names=[MINDBOT_TRUNK_JOINT], + scale=1.0, + use_default_offset=False, # action = absolute target, not offset from default + ) + # ===================================================================== # Observations Configuration @@ -236,12 +267,29 @@ def _disable_arm_gravity(env, env_ids: torch.Tensor): schemas.modify_rigid_body_properties(arm_path, arm_cfg) +def _disable_trunk_head_gravity(env, env_ids: torch.Tensor): + """Disable gravity for trunk and head at startup.""" + import isaaclab.sim.schemas as schemas + + cfg = RigidBodyPropertiesCfg(disable_gravity=True) + for env_id in range(env.num_envs): + for prim_path in [ + f"/World/envs/env_{env_id}/Robot/robot_trunk", + f"/World/envs/env_{env_id}/Robot/robot_head", + ]: + schemas.modify_rigid_body_properties(prim_path, cfg) + + @configclass class MindRobotDualArmEventsCfg: disable_arm_gravity = EventTerm( func=_disable_arm_gravity, mode="startup", ) + disable_trunk_head_gravity = EventTerm( + 
func=_disable_trunk_head_gravity,
+        mode="startup",
+    )
     reset_scene = EventTerm(
         func=mdp.reset_scene_to_default,
         mode="reset",
@@ -292,7 +340,7 @@ class MindRobot2iDualArmIKAbsEnvCfg(ManagerBasedRLEnvCfg):
             lookat=(0.0, 0.0, 0.8),  # look-at point (around the robot's waist height)
             origin_type="world",
         )
-        self.decimation = 2
+        self.decimation = 4  # 25 Hz control (was 2 = 50 Hz, unnecessary for teleop)
         self.episode_length_s = 50.0
         self.sim.dt = 0.01  # 100 Hz
         self.sim.render_interval = 2
diff --git a/source/mindbot/mindbot/tasks/manager_based/il/open_drybox/mindrobot_cfg.py b/source/mindbot/mindbot/tasks/manager_based/il/open_drybox/mindrobot_cfg.py
index 74f7899..837bf78 100644
--- a/source/mindbot/mindbot/tasks/manager_based/il/open_drybox/mindrobot_cfg.py
+++ b/source/mindbot/mindbot/tasks/manager_based/il/open_drybox/mindrobot_cfg.py
@@ -56,10 +56,13 @@ MINDBOT_CFG = ArticulationCfg(
         "right_hand_joint_left": 0.0,
         "right_hand_joint_right": 0.0,
         # ====== Trunk & Head ======
+        # NOTE: pos z = 0.44 (base clearance) + PrismaticJoint value.
+        # If you change PrismaticJoint, update pos z accordingly to avoid
+        # geometry clipping that causes the robot to fall on spawn.
        
"PrismaticJoint": 0.26,
         "head_revoluteJoint": 0.0,
     },
-    pos=(0.0, 0.0, 0.7),
+    pos=(0.0, 0.0, 0.7),  # z = 0.44 (base clearance) + 0.26 (PrismaticJoint)
     ),
     actuators={
         # RM-65 J1-J3: shoulder/elbow — max 180°/s ≈ 3.14 rad/s
diff --git a/source/mindbot/mindbot/tasks/manager_based/il/open_drybox/mindrobot_dual_arm_ik_env_cfg.py b/source/mindbot/mindbot/tasks/manager_based/il/open_drybox/mindrobot_dual_arm_ik_env_cfg.py
index 44503ba..31d2319 100644
--- a/source/mindbot/mindbot/tasks/manager_based/il/open_drybox/mindrobot_dual_arm_ik_env_cfg.py
+++ b/source/mindbot/mindbot/tasks/manager_based/il/open_drybox/mindrobot_dual_arm_ik_env_cfg.py
@@ -283,7 +283,7 @@ class MindRobotDualArmIKAbsEnvCfg(ManagerBasedRLEnvCfg):
             lookat=(0.0, 0.0, 0.8),  # look-at point (around the robot's waist height)
             origin_type="world",
         )
-        self.decimation = 2
+        self.decimation = 4  # 25 Hz control (was 2 = 50 Hz, unnecessary for teleop)
         self.episode_length_s = 50.0
         self.sim.dt = 0.01  # 100 Hz
         self.sim.render_interval = 2
diff --git a/source/mindbot/mindbot/utils/assets.py b/source/mindbot/mindbot/utils/assets.py
index e1d6eab..b8cd9f6 100644
--- a/source/mindbot/mindbot/utils/assets.py
+++ b/source/mindbot/mindbot/utils/assets.py
@@ -20,7 +20,7 @@ import os
 ##
 MINDBOT_ASSETS_DIR: str = os.environ.get(
     "MINDBOT_ASSETS_DIR",
-    # "/home/tangger/LYT/maic_usd_assets_moudle",
-    "/home/maic/xh/maic_usd_assets_moudle"
+    "/home/tangger/LYT/maic_usd_assets_moudle",
+    # "/home/maic/xh/maic_usd_assets_moudle"
     # "/home/maic/LYT/maic_usd_assets_moudle",
 )
diff --git a/source/mindbot/mindbot/utils/sim_video_streamer.py b/source/mindbot/mindbot/utils/sim_video_streamer.py
new file mode 100644
index 0000000..212e88b
--- /dev/null
+++ b/source/mindbot/mindbot/utils/sim_video_streamer.py
@@ -0,0 +1,238 @@
+# SPDX-FileCopyrightText: Copyright (c) 2022-2025, The Isaac Lab Project Developers.
+# SPDX-License-Identifier: BSD-3-Clause
+
+"""Simulation stereo video streamer for VR headset.
+ +Encodes stereo RGB frames (left + right eye) as H264 side-by-side video +and streams to the Unity VR client over TCP, using the same 4-byte +big-endian length-prefixed NALU protocol as RobotVisionConsole. + +Usage: + streamer = SimVideoStreamer("172.20.103.171", 12345, width=960, height=540) + streamer.connect() + # In the sim loop: + streamer.send_frame(left_rgb_np, right_rgb_np) + # Cleanup: + streamer.close() + +The Unity client (PICO) acts as TCP SERVER on the specified port. +This streamer connects as TCP CLIENT — same role as RobotVisionConsole. +""" + +from __future__ import annotations + +import logging +import socket +import struct +import threading +import time +from typing import Optional + +import av +import numpy as np + +logger = logging.getLogger(__name__) + + +class SimVideoStreamer: + """H264 stereo video streamer over TCP for VR headset display. + + Accepts left/right RGB numpy arrays, assembles a side-by-side frame, + encodes to H264 with PyAV, and sends each NALU packet prefixed with + a 4-byte big-endian length header over a TCP socket. + """ + + def __init__( + self, + host: str, + port: int = 12345, + width: int = 960, + height: int = 540, + fps: int = 30, + bitrate: int = 20_000_000, + ): + """Initialize the streamer. + + Args: + host: IP address of the PICO VR headset (TCP server). + port: TCP port the PICO is listening on. + width: Width of EACH eye image (SBS output = width*2 x height). + height: Height of each eye image. + fps: Target frame rate for H264 encoding. + bitrate: H264 encoding bitrate in bits per second. 
+ """ + self.host = host + self.port = port + self.eye_width = width + self.eye_height = height + self.sbs_width = width * 2 + self.sbs_height = height + self.fps = fps + self.bitrate = bitrate + + self._sock: Optional[socket.socket] = None + self._codec_ctx: Optional[av.CodecContext] = None + self._frame_idx = 0 + self._last_frame_time = 0.0 + self._min_frame_interval = 1.0 / fps + + # Async encoding: background thread picks up frames from _pending_frame + self._pending_frame: Optional[np.ndarray] = None + self._frame_lock = threading.Lock() + self._encode_thread: Optional[threading.Thread] = None + self._running = False + + # ------------------------------------------------------------------ + # Public API + # ------------------------------------------------------------------ + + def connect(self) -> bool: + """Connect to the PICO VR headset TCP server. + + Returns: + True if connection succeeded. + """ + try: + self._sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM) + self._sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1) + self._sock.connect((self.host, self.port)) + logger.info(f"[SimVideoStreamer] Connected to {self.host}:{self.port}") + except OSError as e: + logger.error(f"[SimVideoStreamer] TCP connect failed: {e}") + self._sock = None + return False + + self._init_encoder() + self._frame_idx = 0 + self._last_frame_time = 0.0 + + # Start background encode+send thread + self._running = True + self._encode_thread = threading.Thread(target=self._encode_loop, daemon=True) + self._encode_thread.start() + return True + + def send_frame(self, left_rgb: np.ndarray, right_rgb: np.ndarray) -> bool: + """Submit a stereo frame for async encoding and sending. + + This method is NON-BLOCKING: it copies the frame data and returns + immediately. The background thread handles encoding and TCP send. + If the previous frame hasn't been encoded yet, this frame replaces it + (latest-frame-wins, no queue buildup). 
+ + Args: + left_rgb: Left eye image, shape (H, W, 3), dtype uint8, RGB order. + right_rgb: Right eye image, shape (H, W, 3), dtype uint8, RGB order. + + Returns: + True if frame was accepted. False if connection is lost. + """ + if not self._running: + return False + + # Assemble SBS and hand off to background thread (latest-frame-wins) + sbs = np.concatenate([left_rgb, right_rgb], axis=1) # (H, W*2, 3) + with self._frame_lock: + self._pending_frame = sbs + + return True + + def close(self): + """Stop background thread, flush encoder, and close connection.""" + self._running = False + if self._encode_thread is not None: + self._encode_thread.join(timeout=2.0) + self._encode_thread = None + + if self._codec_ctx is not None: + try: + packets = self._codec_ctx.encode(None) + if self._sock is not None: + for pkt in packets: + nalu_data = bytes(pkt) + header = struct.pack(">I", len(nalu_data)) + self._sock.sendall(header + nalu_data) + except Exception: + pass + self._codec_ctx = None + + if self._sock is not None: + try: + self._sock.close() + except Exception: + pass + self._sock = None + logger.info("[SimVideoStreamer] Connection closed.") + + # ------------------------------------------------------------------ + # Private + # ------------------------------------------------------------------ + + def _init_encoder(self): + """Initialize H264 encoder via PyAV.""" + codec = av.Codec("libx264", "w") + self._codec_ctx = codec.create() + self._codec_ctx.width = self.sbs_width + self._codec_ctx.height = self.sbs_height + self._codec_ctx.pix_fmt = "yuv420p" + from fractions import Fraction + self._codec_ctx.time_base = Fraction(1, self.fps) + self._codec_ctx.framerate = Fraction(self.fps, 1) + self._codec_ctx.bit_rate = self.bitrate + self._codec_ctx.gop_size = self.fps # keyframe every 1 second + self._codec_ctx.max_b_frames = 0 # no B-frames for low latency + self._codec_ctx.thread_count = 2 + + # Low-latency encoding options + self._codec_ctx.options = { + "preset": 
"ultrafast", + "tune": "zerolatency", + "profile": "baseline", # widest decoder compatibility + } + + self._codec_ctx.open() + logger.info( + f"[SimVideoStreamer] H264 encoder initialized: " + f"{self.sbs_width}x{self.sbs_height} @ {self.fps}fps, " + f"bitrate={self.bitrate/1e6:.1f}Mbps" + ) + + def _encode_loop(self): + """Background thread: encode and send frames at target fps.""" + while self._running: + # Grab the latest pending frame (if any) + with self._frame_lock: + sbs = self._pending_frame + self._pending_frame = None + + if sbs is None: + time.sleep(0.001) # 1ms idle wait + continue + + # Rate limit encoding to target fps + now = time.monotonic() + if now - self._last_frame_time < self._min_frame_interval: + continue + self._last_frame_time = now + + # Encode + frame = av.VideoFrame.from_ndarray(sbs, format="rgb24") + frame.pts = self._frame_idx + self._frame_idx += 1 + + try: + packets = self._codec_ctx.encode(frame) + except av.error.FFmpegError as e: + logger.warning(f"[SimVideoStreamer] Encode error: {e}") + continue + + # Send each packet + for pkt in packets: + nalu_data = bytes(pkt) + header = struct.pack(">I", len(nalu_data)) + try: + self._sock.sendall(header + nalu_data) + except (BrokenPipeError, ConnectionResetError, OSError) as e: + logger.error(f"[SimVideoStreamer] Send failed: {e}") + self._running = False + return