Compare commits
3 Commits: pro6000_xh ... sim_stereo

| Author | SHA1 | Date |
|---|---|---|
| | eef7ff838d | |
| | bfe28b188a | |
| | 05a146bca6 | |

CLAUDE.md
@@ -1,15 +1,16 @@
 # MindBot Project
 
-A simulation learning framework for the mobile manipulation robot (MindRobot) based on Isaac Lab, supporting reinforcement learning (RL) and imitation learning (IL).
+A simulation learning framework for the mobile manipulation robot (MindRobot) based on Isaac Lab, supporting reinforcement learning (RL), imitation learning (IL), and XR teleoperation.
 
 ## Requirements
 
 - **Isaac Sim**: 5.1.0
 - **Python**: 3.11
+- **GPU**: NVIDIA RTX PRO 6000 96GB (development machine); enabling DLSS is recommended
 - **Dependencies**: isaaclab, isaaclab_assets, isaaclab_mimic, isaaclab_rl, isaaclab_tasks
 - **Conda environments**:
   - `env_isaaclab` — main development environment
-  - `xr` — a second environment that also contains isaaclab/isaacsim; both can run Isaac Lab scripts
+  - `xr` — XR teleoperation + sim-streaming environment, containing isaaclab/isaacsim + PyAV
 
 **IMPORTANT**: All Isaac Lab scripts must be launched via `~/IsaacLab/isaaclab.sh -p`; running them with plain `python` fails with `KeyError: 'EXP_PATH'` (only `isaaclab.sh` sets that variable at startup).
 
@@ -43,8 +44,19 @@ python scripts/rsl_rl/play.py --task Template-Mindbot-v0
 # Teleoperation (Spacemouse / keyboard)
 python scripts/environments/teleoperation/teleop_se3_agent.py --task Isaac-MindRobot-LeftArm-IK-Rel-v0
 
-# Teleoperation (XR device)
-python scripts/environments/teleoperation/teleop_xr_agent.py --task Isaac-MindRobot-LeftArm-IK-Rel-v0
+# Teleoperation (XR dual-arm + head control)
+python scripts/environments/teleoperation/teleop_xr_agent.py --task Isaac-MindRobot-2i-DualArm-IK-Abs-v0
 
+# Teleoperation (XR dual-arm, no head)
+python scripts/environments/teleoperation/teleop_xr_agent.py --task Isaac-MindRobot-DualArm-IK-Abs-v0
+
+# Teleoperation + streaming the sim view to PICO VR (requires --enable_cameras)
+python scripts/environments/teleoperation/teleop_xr_agent.py \
+    --task Isaac-MindRobot-2i-DualArm-IK-Abs-v0 \
+    --stream-to <PICO_IP> --enable_cameras
+
+# Streaming test (no Isaac Lab required; runs directly in the xr environment)
+conda run -n xr python scripts/test_sim_streamer.py --host <PICO_IP>
+
 # Code formatting
 pre-commit run --all-files
@@ -55,7 +67,9 @@ pre-commit run --all-files
 ```
 source/mindbot/mindbot/
 ├── robot/mindbot.py              # MindRobot joint/actuator config (MINDBOT_CFG)
-├── utils/assets.py               # USD asset paths (MINDBOT_ASSETS_DIR)
+├── utils/
+│   ├── assets.py                 # USD asset paths (MINDBOT_ASSETS_DIR)
+│   └── sim_video_streamer.py     # sim→VR H264 streaming module (PyAV encoding + TCP)
 └── tasks/manager_based/
     ├── rl/                       # reinforcement learning tasks
     │   ├── mindbot/              # Template-Mindbot-v0 (grasping)
@@ -64,19 +78,75 @@ source/mindbot/mindbot/
     │   └── pullUltrasoundLidUp/  # Pull-Ultrasound-Lid-Up-v0
     └── il/                       # imitation learning tasks
         ├── demo/                 # demo tasks (H1, Franka)
-        └── open_drybox/          # Isaac-MindRobot-LeftArm-IK-{Rel,Abs}-v0
+        └── open_drybox/          # dual-arm teleoperation tasks
+
+scripts/environments/teleoperation/
+├── teleop_xr_agent.py            # PICO XR teleoperation (dual/single arm, supports sim streaming)
+├── teleop_se3_agent.py           # Spacemouse/keyboard teleoperation
+└── test_sim_streamer.py          # standalone streaming test (red/blue test pattern)
+
+deps/
+├── XRoboToolkit-Teleop-Sample-Python/  # XR SDK Python wrapper
+├── XRoboToolkit-RobotVision-PC/        # physical ZED camera capture + H264 streaming (C++)
+└── zed-isaac-sim/                      # ZED simulation extension (Isaac Sim OGN)
 ```
 
 ## Registered tasks
 
-| Task ID | Type | Description |
-|---|---|---|
-| `Template-Mindbot-v0` | RL | MindRobot grasping |
-| `Template-centrifuge-lidup-v0` | RL | centrifuge rotation |
-| `Template-Pull-v0` | RL | object pulling |
-| `Pull-Ultrasound-Lid-Up-v0` | RL | ultrasound machine lid pulling |
-| `Isaac-MindRobot-LeftArm-IK-Rel-v0` | IL | left-arm IK, relative control |
-| `Isaac-MindRobot-LeftArm-IK-Abs-v0` | IL | left-arm IK, absolute control |
+| Task ID | Type | Action dim | Description |
+|---|---|---|---|
+| `Template-Mindbot-v0` | RL | - | MindRobot grasping |
+| `Template-centrifuge-lidup-v0` | RL | - | centrifuge rotation |
+| `Template-Pull-v0` | RL | - | object pulling |
+| `Pull-Ultrasound-Lid-Up-v0` | RL | - | ultrasound machine lid pulling |
+| `Isaac-MindRobot-LeftArm-IK-Rel-v0` | IL | 12D | left-arm IK, relative control |
+| `Isaac-MindRobot-LeftArm-IK-Abs-v0` | IL | 12D | left-arm IK, absolute control |
+| `Isaac-MindRobot-DualArm-IK-Abs-v0` | IL | 20D | dual-arm IK (no head) |
+| `Isaac-MindRobot-2i-DualArm-IK-Abs-v0` | IL | 22D | dual-arm IK + head (with VR streaming cameras) |
 
+## XR teleoperation
+
+### teleop_xr_agent.py features
+
+- **Dual-arm absolute IK control**: the left/right PICO controllers drive the left/right arms
+- **Grip button**: hold to move the arm, release to lock the joints (caches the root-frame IK command)
+- **Left thumbstick**: base forward/backward and turning (direct velocity control, bypassing friction limits)
+- **Left Y button**: resets the environment (the right B button does not trigger a reset)
+- **Head tracking**: HMD yaw/pitch are mapped to the robot head joints (2i tasks only)
+- **`--stream-to`**: streams the simulated stereo camera views to the PICO VR display in real time
+
+### Sim→VR streaming architecture
+
+```
+Isaac Lab env.step()
+  → vr_left_eye / vr_right_eye (CameraCfg, 1280x720)
+  → .cpu().numpy() (GPU→CPU)
+  → SimVideoStreamer (background thread: np.concatenate SBS → PyAV H264 → TCP)
+  → PICO Unity Client (TCP server :12345, H264 decode + per-eye VR display)
+```
+
+- TCP protocol: 4-byte big-endian length header + H264 NALU (same as RobotVisionConsole)
+- The Unity client needs zero changes; either the SIM-Stereo or the ZEDMINI config works
+- The PICO-side `video_source.yml` can be hot-updated via adb push:
+  ```bash
+  adb push Assets/StreamingAssets/video_source.yml \
+      /sdcard/Android/data/com.xrobotoolkit.client/files/video_source.yml
+  adb shell am force-stop com.xrobotoolkit.client
+  ```
+
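The length-prefixed framing noted above (4-byte big-endian length header, then the raw H264 NALU) is simple enough to sketch. The function names below are illustrative, not the actual SimVideoStreamer API:

```python
import struct

def frame_nalu(payload: bytes) -> bytes:
    # Sender side: 4-byte big-endian length header followed by the raw NALU bytes.
    return struct.pack(">I", len(payload)) + payload

def split_frames(stream: bytes) -> list[bytes]:
    # Receiver side (in-memory sketch): peel off length-prefixed NALUs one by one.
    frames, off = [], 0
    while off + 4 <= len(stream):
        (n,) = struct.unpack_from(">I", stream, off)
        frames.append(stream[off + 4 : off + 4 + n])
        off += 4 + n
    return frames
```

Over a real socket the receiver would instead read exactly 4 bytes, decode the length, then read exactly that many payload bytes, in a loop.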
+### Performance tuning
+
+- **decimation=4** (25 Hz control rate, balancing FPS and responsiveness)
+- Debug viewports can be disabled while streaming to save GPU (off by default with `--stream-to`, enabled via `--debug-viewports`)
+- Enable **DLSS + Frame Generation** in Isaac Sim to raise the render frame rate
+- Encoding runs asynchronously on a background thread and does not block the main simulation loop
+
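The "encode on a background thread" point can be illustrated with a minimal producer/consumer sketch (a hypothetical class, not the actual SimVideoStreamer implementation). A depth-1 queue means a busy encoder causes stale frames to be dropped rather than stalling the sim loop:

```python
import queue
import threading

class AsyncEncoder:
    """Hand frames to a background thread so the caller never blocks on encoding."""

    def __init__(self, encode):
        self._q = queue.Queue(maxsize=1)   # keep only the newest frame
        self._encode = encode              # e.g. PyAV encode + socket send
        self._t = threading.Thread(target=self._run, daemon=True)
        self._t.start()

    def submit(self, frame):
        try:
            self._q.put_nowait(frame)
        except queue.Full:
            # Encoder still busy: drop the stale frame, keep the newest one.
            try:
                self._q.get_nowait()
            except queue.Empty:
                pass
            try:
                self._q.put_nowait(frame)
            except queue.Full:
                pass  # consumer grabbed the slot concurrently; drop this frame

    def _run(self):
        while True:
            frame = self._q.get()
            if frame is None:  # sentinel: shut down
                break
            self._encode(frame)

    def close(self):
        self._q.put(None)
        self._t.join()
```

The sim loop calls `submit()` once per rendered frame; whether a frame is encoded or dropped depends only on how far behind the encoder is.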
+### IPD problem with the physical ZED 2i camera
+
+The ZED 2i baseline of 120 mm is far larger than the human IPD of 65 mm; displaying its feed directly in VR produces excessive disparity at every distance, so the eyes cannot fuse a 3D image.
+
+- The shader `stereoOffset` is only a toe-in correction and is valid for a single fixed distance
+- PC-side depth-based correction (DIBR) is theoretically correct but produces severe artifacts
+- **Proper fix**: use virtual cameras with a 65 mm baseline in simulation, or a ZED Mini (63 mm ≈ human IPD)
+
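The mismatch can be quantified with the pinhole stereo model d = f·B/Z: disparity scales linearly with baseline, so the 120 mm rig produces 120/65 ≈ 1.85× the disparity a 65 mm-IPD viewer expects, at every depth. The focal length below is an illustrative value, not a ZED calibration constant:

```python
def disparity_px(baseline_m: float, focal_px: float, depth_m: float) -> float:
    # Pinhole stereo disparity in pixels: d = f * B / Z.
    return focal_px * baseline_m / depth_m

FOCAL_PX = 700.0  # assumed focal length; the real value depends on the camera/lens
for depth in (0.5, 1.0, 2.0):
    zed2i = disparity_px(0.120, FOCAL_PX, depth)   # 120 mm baseline
    human = disparity_px(0.065, FOCAL_PX, depth)   # 65 mm human IPD
    print(f"Z={depth} m: ZED 2i {zed2i:.0f} px vs human-IPD {human:.0f} px")
```

Because the ratio is depth-independent, no single toe-in offset can fix it, which is why the section above recommends matching the baseline itself.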
 ## Conventions for adding new tasks
 
@@ -119,12 +189,13 @@ tasks/manager_based/il/<task_name>/
 | Left gripper | left_hand_joint_left/right |
 | Right gripper | right_hand_joint_left/right |
 | Torso lift | PrismaticJoint |
-| Head | head_revoluteJoint |
+| Head | head_revoluteJoint, head_pitch_Joint |
 | Base wheels | front/back revolute joints |
 
 ## Teleoperation data collection
 
-- Teleoperation scripts inside Isaac Lab use the `env_isaaclab` environment
-- XR device access depends on `deps/XRoboToolkit-Teleop-Sample-Python/` and requires switching to the `xr` environment
+- Teleoperation scripts inside Isaac Lab use the `xr` environment (contains isaaclab + the XR SDK)
+- XR device access depends on `deps/XRoboToolkit-Teleop-Sample-Python/`
 - Demonstration data format is compatible with LeRobot / RoboMimic
 - Multi-camera config: cam_head, cam_chest, cam_left_hand, cam_right_hand, cam_top, cam_side
+- VR stereo cameras: vr_left_eye, vr_right_eye (ZED_X CameraLeft/CameraRight)

deps/XRoboToolkit-Unity-Client (vendored submodule)
Submodule deps/XRoboToolkit-Unity-Client added at c9326092ff
scripts/environments/teleoperation/teleop_xr_agent.py
@@ -1,53 +1,49 @@
 #!/usr/bin/env python3
-# Copyright (c) 2022-2026, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
-# All rights reserved.
-#
+# SPDX-FileCopyrightText: Copyright (c) 2022-2025, The Isaac Lab Project Developers.
 # SPDX-License-Identifier: BSD-3-Clause
 
-"""
-Script to run teleoperation with Isaac Lab manipulation environments using PICO XR Controllers.
-
-Only absolute IK mode is supported (Isaac-MindRobot-LeftArm-IK-Abs-v0).
-The controller computes delta pose in Isaac Sim world frame and accumulates
-an absolute EEF target, which is then converted to robot root frame before
-being sent to the DifferentialIK action manager.
+"""XR teleoperation entry point for MindBot.
+
+Selects the appropriate agent class based on the task's action manager:
+
+- DualArmHeadXrAgent (23D) for tasks with head_action (e.g. Isaac-MindRobot-2i-DualArm-IK-Abs-v0)
+- DualArmXrAgent (20D) for dual-arm tasks without head
+- SingleArmXrAgent (8D) for single-arm tasks (e.g. Isaac-MindRobot-LeftArm-IK-Abs-v0)
+
+Usage:
+    ~/IsaacLab/isaaclab.sh -p scripts/environments/teleoperation/teleop_xr_agent.py \\
+        --task Isaac-MindRobot-2i-DualArm-IK-Abs-v0 --stream-to <PICO_IP>
 """
 
 import argparse
 import logging
 import sys
 import os
-from collections.abc import Callable
 
-# Ensure xr_utils (next to this script) is importable when running directly
+# Ensure xr_utils (next to this script) is importable
 sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
 
 from isaaclab.app import AppLauncher
 
 logger = logging.getLogger(__name__)
 
-# add argparse arguments
+# -- CLI arguments --
 parser = argparse.ArgumentParser(
-    description="Teleoperation for Isaac Lab environments with PICO XR Controller (absolute IK mode only)."
+    description="Teleoperation for Isaac Lab environments with PICO XR Controller (absolute IK mode)."
 )
-parser.add_argument("--num_envs", type=int, default=1, help="Number of environments to simulate.")
-parser.add_argument(
-    "--task",
-    type=str,
-    default="Isaac-MindRobot-LeftArm-IK-Abs-v0",
-    help="Name of the task (must be an Abs IK task).",
-)
-parser.add_argument(
-    "--sensitivity", type=float, default=None,
-    help="Set both pos and rot sensitivity (overridden by --pos_sensitivity/--rot_sensitivity).",
-)
-parser.add_argument("--pos_sensitivity", type=float, default=None, help="Position sensitivity. Default: 1.0.")
-parser.add_argument("--rot_sensitivity", type=float, default=None, help="Rotation sensitivity. Default: 0.3.")
-parser.add_argument("--arm", type=str, default="left", choices=["left", "right"], help="Which arm/controller to use.")
-parser.add_argument("--base_speed", type=float, default=5.0, help="Max wheel speed (rad/s) for joystick full forward.")
-parser.add_argument("--base_turn", type=float, default=2.0, help="Max wheel differential (rad/s) for joystick full left/right.")
-parser.add_argument("--drive_speed", type=float, default=0.5, help="Max robot linear speed (m/s) for direct base control. Default: 0.5")
-parser.add_argument("--drive_turn", type=float, default=1.5, help="Max robot yaw rate (rad/s) for direct base control. Default: 1.5")
+parser.add_argument("--num_envs", type=int, default=1)
+parser.add_argument("--task", type=str, default="Isaac-MindRobot-LeftArm-IK-Abs-v0")
+parser.add_argument("--sensitivity", type=float, default=None)
+parser.add_argument("--pos_sensitivity", type=float, default=None)
+parser.add_argument("--rot_sensitivity", type=float, default=None)
+parser.add_argument("--arm", type=str, default="left", choices=["left", "right"])
+parser.add_argument("--base_speed", type=float, default=5.0)
+parser.add_argument("--base_turn", type=float, default=2.0)
+parser.add_argument("--drive_speed", type=float, default=0.5)
+parser.add_argument("--drive_turn", type=float, default=1.5)
+parser.add_argument("--stream-to", type=str, default=None, dest="stream_to")
+parser.add_argument("--stream-port", type=int, default=12345, dest="stream_port")
+parser.add_argument("--stream-bitrate", type=int, default=20_000_000, dest="stream_bitrate")
+parser.add_argument("--debug-viewports", action="store_true", dest="debug_viewports", default=True)
 
 AppLauncher.add_app_launcher_args(parser)
 args_cli = parser.parse_args()
@@ -57,726 +53,69 @@ app_launcher_args["xr"] = False
 app_launcher = AppLauncher(app_launcher_args)
 simulation_app = app_launcher.app
 
-"""Rest everything follows."""
+# -- Post-launch imports (require Isaac Sim runtime) --
 
 import gymnasium as gym
-import numpy as np
-import torch
-from scipy.spatial.transform import Rotation as R
 
 from isaaclab.envs import ManagerBasedRLEnvCfg
 
 import isaaclab_tasks  # noqa: F401
 import mindbot.tasks  # noqa: F401
 from isaaclab_tasks.utils import parse_env_cfg
 
-from xr_utils import XrClient, transform_xr_pose, quat_diff_as_rotvec_xyzw, is_valid_quaternion
-from xr_utils.geometry import R_HEADSET_TO_WORLD
+from xr_teleop import SingleArmXrAgent, DualArmXrAgent, DualArmHeadXrAgent
 
 
-# =====================================================================
-# Robot Camera Viewport Utilities
-# =====================================================================
-
-# Resolved prim paths for env_0 — must match CameraCfg prim_path with
-# {ENV_REGEX_NS} → /World/envs/env_0
-_ROBOT_CAM_PRIMS: dict[str, str] = {
-    "Left Hand": "/World/envs/env_0/Robot/rm_65_fb_left/Link_6/robot___left_8_02/cam_left_hand",
-    "Right Hand": "/World/envs/env_0/Robot/rm_65_b_right/Link6/robot___right_8_2/cam_right_hand",
-    "Head": "/World/envs/env_0/Robot/robot_head/ZED_X/base_link/ZED_X/CameraLeft",
-    "Chest": "/World/envs/env_0/Robot/robot_trunk/cam_chest",
-}
-
-# Stores created viewport window objects
-_robot_viewports: dict[str, object] = {}
-
-
-def create_robot_viewports() -> None:
-    """Create 4 viewport windows and bind each to a robot camera prim.
-
-    Must be called after env.reset() so all prims exist on the USD stage.
-    """
-    try:
-        import omni.kit.viewport.utility as vp_util
-    except ImportError:
-        logger.warning("[Viewport] omni.kit.viewport.utility not available — skipping viewport creation.")
-        return
-
-    for name, cam_path in _ROBOT_CAM_PRIMS.items():
-        vp_win = vp_util.create_viewport_window(
-            f"Robot {name} View", width=640, height=360
-        )
-        vp_win.viewport_api.camera_path = cam_path
-        _robot_viewports[name] = vp_win
-        print(f"[INFO] Viewport 'Robot {name} View' bound to: {cam_path}")
-
-
-def rebind_robot_viewports() -> None:
-    """Re-bind all robot viewports to their camera paths.
-
-    Call this after every env reset so viewports stay locked to the cameras.
-    """
-    for name, vp_win in _robot_viewports.items():
-        cam_path = _ROBOT_CAM_PRIMS[name]
-        vp_win.viewport_api.camera_path = cam_path
-
-
-# =====================================================================
-# Helpers
-# =====================================================================
-
-def _quat_wxyz_to_rotation(quat_wxyz: np.ndarray) -> R:
-    """Convert Isaac-style wxyz quaternion to scipy Rotation."""
-    return R.from_quat([quat_wxyz[1], quat_wxyz[2], quat_wxyz[3], quat_wxyz[0]])
-
-
-def _rotation_to_quat_wxyz(rot: R) -> np.ndarray:
-    """Convert scipy Rotation to Isaac-style wxyz quaternion."""
-    q = rot.as_quat()
-    return np.array([q[3], q[0], q[1], q[2]])
-
-
-def world_pose_to_root_frame(
-    pos_w: np.ndarray,
-    quat_wxyz: np.ndarray,
-    root_pos_w: np.ndarray,
-    root_quat_wxyz: np.ndarray,
-) -> tuple[np.ndarray, np.ndarray]:
-    """Express a world-frame pose in the robot root frame."""
-    root_rot = _quat_wxyz_to_rotation(root_quat_wxyz)
-    pose_rot = _quat_wxyz_to_rotation(quat_wxyz)
-    pos_root = root_rot.inv().apply(pos_w - root_pos_w)
-    quat_root = _rotation_to_quat_wxyz(root_rot.inv() * pose_rot)
-    return pos_root, quat_root
-
-
-# =====================================================================
-# Teleoperation Controller
-# =====================================================================
-
-class XrTeleopController:
-    """Teleop controller for PICO XR headset (absolute IK mode).
-
-    Accumulates an absolute EEF target in Isaac Sim world frame.
-    Grip button acts as clutch; B/Y buttons trigger environment reset.
-    """
-
-    def __init__(self, arm="left", pos_sensitivity=1.0, rot_sensitivity=0.3,
-                 max_pos_per_step=0.05, max_rot_per_step=0.08, xr_client=None):
-        self.xr_client = xr_client if xr_client is not None else XrClient()
-        self._owns_client = xr_client is None  # only close if we created it
-        self.pos_sensitivity = pos_sensitivity
-        self.rot_sensitivity = rot_sensitivity
-        # Hard caps per physics step to prevent IK divergence.
-        # max_rot_per_step ~4.6°, max_pos_per_step 5 cm.
-        self.max_pos_per_step = max_pos_per_step
-        self.max_rot_per_step = max_rot_per_step
-        self.arm = arm
-
-        self.controller_name = "left_controller" if arm == "left" else "right_controller"
-        self.grip_name = "left_grip" if arm == "left" else "right_grip"
-        self.trigger_name = "left_trigger" if arm == "left" else "right_trigger"
-
-        self.R_headset_world = R_HEADSET_TO_WORLD
-
-        self.prev_xr_pos = None
-        self.prev_xr_quat = None
-        self.target_eef_pos = None   # world frame
-        self.target_eef_quat = None  # world frame, wxyz
-
-        self.grip_active = False
-        self.frame_count = 0
-        self.reset_button_latched = False
-        self.require_grip_reengage = False
-        self.grip_engage_threshold = 0.8
-        self.grip_release_threshold = 0.2
-        self.callbacks = {}
-
-    def add_callback(self, name: str, func: Callable):
-        self.callbacks[name] = func
-
-    def reset(self):
-        self.prev_xr_pos = None
-        self.prev_xr_quat = None
-        self.grip_active = False
-        self.frame_count = 0
-        self.target_eef_pos = None
-        self.target_eef_quat = None
-        # Require grip release after reset to avoid driving to stale pose.
-        self.require_grip_reengage = True
-
-    def close(self):
-        if self._owns_client:
-            self.xr_client.close()
-
-    def get_zero_action(self, trigger, current_eef_pos=None, current_eef_quat=None) -> torch.Tensor:
-        """Return 8D hold action [x, y, z, qw, qx, qy, qz, gripper] at frozen target pose."""
-        action = torch.zeros(8)
-        if self.target_eef_pos is not None and self.target_eef_quat is not None:
-            # Prefer frozen world-frame target so IK holds a fixed world position
-            # even if the robot chassis drifts slightly under gravity.
-            action[:3] = torch.tensor(self.target_eef_pos.copy())
-            action[3:7] = torch.tensor(self.target_eef_quat.copy())
-        elif current_eef_pos is not None and current_eef_quat is not None:
-            action[:3] = torch.tensor(current_eef_pos.copy())
-            action[3:7] = torch.tensor(current_eef_quat.copy())
-        else:
-            action[3] = 1.0  # identity quaternion
-        action[7] = 1.0 if trigger > 0.5 else -1.0
-        return action
-
-    def advance(self, current_eef_pos=None, current_eef_quat=None) -> torch.Tensor:
-        """Read XR controller and return 8D absolute action [x, y, z, qw, qx, qy, qz, gripper].
-
-        Position and quaternion are in Isaac Sim world frame.
-        Caller must convert to robot root frame before sending to IK.
-        """
-        try:
-            reset_pressed = self.xr_client.get_button("B") or self.xr_client.get_button("Y")
-            if reset_pressed and not self.reset_button_latched:
-                if "RESET" in self.callbacks:
-                    self.callbacks["RESET"]()
-            self.reset_button_latched = reset_pressed
-        except Exception:
-            pass
-
-        try:
-            raw_pose = self.xr_client.get_pose(self.controller_name)
-            grip = self.xr_client.get_key_value(self.grip_name)
-            trigger = self.xr_client.get_key_value(self.trigger_name)
-        except Exception:
-            return self.get_zero_action(0.0, current_eef_pos, current_eef_quat)
-
-        if not is_valid_quaternion(raw_pose[3:]):
-            return self.get_zero_action(trigger, current_eef_pos, current_eef_quat)
-
-        # Wait for grip release after reset
-        if self.require_grip_reengage:
-            if grip <= self.grip_release_threshold:
-                self.require_grip_reengage = False
-            else:
-                if current_eef_pos is not None and current_eef_quat is not None:
-                    self.target_eef_pos = current_eef_pos.copy()
-                    self.target_eef_quat = current_eef_quat.copy()
-                return self.get_zero_action(trigger, current_eef_pos, current_eef_quat)
-
-        # Grip button as clutch with hysteresis
-        grip_pressed = grip > self.grip_release_threshold if self.grip_active else grip >= self.grip_engage_threshold
-
-        if not grip_pressed:
-            self.prev_xr_pos = None
-            self.prev_xr_quat = None
-            # Only snapshot target on the transition frame (grip just released) or first ever frame.
-            # After that, keep it frozen so IK holds a fixed world-frame position.
-            if self.grip_active or self.target_eef_pos is None:
-                if current_eef_pos is not None and current_eef_quat is not None:
-                    self.target_eef_pos = current_eef_pos.copy()
-                    self.target_eef_quat = current_eef_quat.copy()
-            self.grip_active = False
-            return self.get_zero_action(trigger, current_eef_pos, current_eef_quat)
-
-        if not self.grip_active:
-            self.prev_xr_pos = raw_pose[:3].copy()
-            self.prev_xr_quat = raw_pose[3:].copy()
-            if current_eef_pos is not None and current_eef_quat is not None:
-                self.target_eef_pos = current_eef_pos.copy()
-                self.target_eef_quat = current_eef_quat.copy()
-            self.grip_active = True
-            return self.get_zero_action(trigger, current_eef_pos, current_eef_quat)
-
-        # Project current and previous XR poses into Isaac Sim world frame
-        xr_world_pos, xr_world_quat_xyzw = transform_xr_pose(raw_pose[:3], raw_pose[3:])
-        prev_xr_world_pos, prev_xr_world_quat_xyzw = transform_xr_pose(self.prev_xr_pos, self.prev_xr_quat)
-
-        # Delta position (world frame), clamped per step
-        world_delta_pos = (xr_world_pos - prev_xr_world_pos) * self.pos_sensitivity
-        pos_norm = np.linalg.norm(world_delta_pos)
-        if pos_norm > self.max_pos_per_step:
-            world_delta_pos *= self.max_pos_per_step / pos_norm
-
-        # Delta rotation (world frame), clamped per step
-        world_delta_rot = quat_diff_as_rotvec_xyzw(prev_xr_world_quat_xyzw, xr_world_quat_xyzw) * self.rot_sensitivity
-        rot_norm = np.linalg.norm(world_delta_rot)
-        if rot_norm > self.max_rot_per_step:
-            world_delta_rot *= self.max_rot_per_step / rot_norm
-
-        gripper_action = 1.0 if trigger > 0.5 else -1.0
-
-        # Accumulate absolute target
-        if self.target_eef_pos is None:
-            self.target_eef_pos = np.zeros(3)
-            self.target_eef_quat = np.array([1.0, 0.0, 0.0, 0.0])
-
-        self.target_eef_pos += world_delta_pos
-
-        # Apply rotation delta in world frame so controller direction = EEF world direction
-        target_r = _quat_wxyz_to_rotation(self.target_eef_quat)
-        self.target_eef_quat = _rotation_to_quat_wxyz(R.from_rotvec(world_delta_rot) * target_r)
-
-        self.prev_xr_pos = raw_pose[:3].copy()
-        self.prev_xr_quat = raw_pose[3:].copy()
-        self.frame_count += 1
-
-        action = torch.tensor([
-            *self.target_eef_pos,
-            *self.target_eef_quat,
-            gripper_action,
-        ], dtype=torch.float32)
-
-        if self.frame_count % 30 == 0:
-            np.set_printoptions(precision=4, suppress=True, floatmode='fixed')
-            print("\n====================== [VR TELEOP DEBUG] ======================")
-            print(f"| Raw VR Pos (OpenXR):     {np.array(raw_pose[:3])}")
-            print(f"| Raw VR Quat (xyzw):      {np.array(raw_pose[3:])}")
-            print(f"| XR Delta Pos (world):    {world_delta_pos} (norm={np.linalg.norm(world_delta_pos):.4f})")
-            print(f"| XR Delta Rot (world):    {world_delta_rot} (norm={np.linalg.norm(world_delta_rot):.4f})")
-            print(f"| Targ Pos (world):        {action[:3].numpy()}")
-            print(f"| Targ Quat (world, wxyz): {action[3:7].numpy()}")
-            print(f"| Gripper & Trigger:       Grip={grip:.2f}, Trig={trigger:.2f}")
-            print("===============================================================")
-
-        return action
-
-
-# =====================================================================
-# Head Tracking Utility
-# =====================================================================
-
-def get_head_joint_targets(
-    xr_client, neutral_rot: R | None
-) -> tuple[np.ndarray, R | None]:
-    """Extract head joint targets [yaw, pitch] from HMD pose.
-
-    At first call (neutral_rot is None) the current headset orientation
-    is recorded as the neutral reference and zeros are returned.
-    Subsequent calls return yaw/pitch relative to that neutral pose.
-
-    Returns:
-        (joint_targets, neutral_rot) where joint_targets is shape (2,)
-        float32 [head_yaw_Joint_rad, head_pitch_Joint_rad].
-    """
-    try:
-        raw_pose = xr_client.get_pose("headset")
-        if not is_valid_quaternion(raw_pose[3:]):
-            return np.zeros(2, dtype=np.float32), neutral_rot
-        _, head_world_quat_xyzw = transform_xr_pose(raw_pose[:3], raw_pose[3:])
-        head_rot = R.from_quat(head_world_quat_xyzw)
-        if neutral_rot is None:
-            # First call — record neutral and hold zero
-            return np.zeros(2, dtype=np.float32), head_rot
-        # Relative rotation from neutral heading
-        rel_rot = head_rot * neutral_rot.inv()
-        # ZYX Euler: [0]=yaw (Z), [1]=pitch (Y)
-        euler_zyx = rel_rot.as_euler("ZYX")
-        yaw = float(np.clip(euler_zyx[0], -1.57, 1.57))    # ±90°
-        pitch = float(np.clip(euler_zyx[1], -0.52, 0.52))  # ±30°
-        return np.array([yaw, pitch], dtype=np.float32), neutral_rot
-    except Exception:
-        return np.zeros(2, dtype=np.float32), neutral_rot
-
-
-# =====================================================================
-# Main Execution Loop
-# =====================================================================
-
def main() -> None:
|
def main() -> None:
|
||||||
"""Run teleoperation with PICO XR Controller against Isaac Lab environment."""
|
args = args_cli
|
||||||
|
|
||||||
env_cfg = parse_env_cfg(args_cli.task, num_envs=args_cli.num_envs)
|
# Resolve sensitivity
|
||||||
env_cfg.env_name = args_cli.task
|
pos_sens = args.pos_sensitivity or args.sensitivity or 1.0
|
||||||
|
rot_sens = args.rot_sensitivity or args.sensitivity or 0.3
|
||||||
|
|
||||||
|
# Create environment
|
||||||
|
env_cfg = parse_env_cfg(args.task, num_envs=args.num_envs)
|
||||||
|
env_cfg.env_name = args.task
|
||||||
if not isinstance(env_cfg, ManagerBasedRLEnvCfg):
|
if not isinstance(env_cfg, ManagerBasedRLEnvCfg):
|
||||||
raise ValueError(f"Teleoperation requires ManagerBasedRLEnvCfg. Got: {type(env_cfg)}")
|
raise ValueError(f"Teleoperation requires ManagerBasedRLEnvCfg. Got: {type(env_cfg)}")
|
||||||
if "Abs" not in args_cli.task:
|
if "Abs" not in args.task:
|
||||||
raise ValueError(
|
raise ValueError(f"Task '{args.task}' is not an absolute IK task.")
|
||||||
f"Task '{args_cli.task}' is not an absolute IK task. "
|
|
||||||
"Only Abs tasks are supported (e.g. Isaac-MindRobot-LeftArm-IK-Abs-v0)."
|
|
||||||
)
|
|
||||||
|
|
||||||
env_cfg.terminations.time_out = None
|
env_cfg.terminations.time_out = None
|
||||||
|
|
||||||
try:
|
env = gym.make(args.task, cfg=env_cfg).unwrapped
|
||||||
env = gym.make(args_cli.task, cfg=env_cfg).unwrapped
|
|
||||||
except Exception as e:
|
|
||||||
logger.error(f"Failed to create environment '{args_cli.task}': {e}")
|
|
||||||
simulation_app.close()
|
|
||||||
return
|
|
||||||
|
|
||||||
-# Detect single-arm vs dual-arm based on action manager term names
-is_dual_arm = "left_arm_action" in env.action_manager._terms
+# Select agent based on task capabilities
+action_terms = env.action_manager._terms
+is_dual_arm = "left_arm_action" in action_terms
+has_head = "head_action" in action_terms
 
-# Sensitivity: explicit args override --sensitivity, which overrides defaults
-pos_sens = 1.0
-rot_sens = 0.3
-if args_cli.sensitivity is not None:
-    pos_sens = args_cli.sensitivity
-    rot_sens = args_cli.sensitivity
-if args_cli.pos_sensitivity is not None:
-    pos_sens = args_cli.pos_sensitivity
-if args_cli.rot_sensitivity is not None:
-    rot_sens = args_cli.rot_sensitivity
-
-if is_dual_arm:
-    print(f"\n[INFO] Dual-arm mode detected. Using both controllers.")
-    print(f"[INFO] IK Mode: ABSOLUTE")
-    print(f"[INFO] Sensitivity: pos={pos_sens:.3f} rot={rot_sens:.3f}")
-    shared_client = XrClient()
-    teleop_left = XrTeleopController(arm="left", pos_sensitivity=pos_sens, rot_sensitivity=rot_sens, xr_client=shared_client)
-    teleop_right = XrTeleopController(arm="right", pos_sensitivity=pos_sens, rot_sensitivity=rot_sens, xr_client=shared_client)
-    teleop_interface = teleop_left  # primary interface for callbacks/reset
-    teleop_right_ref = teleop_right
-else:
-    print(f"\n[INFO] Connecting to PICO XR Headset using {args_cli.arm} controller...")
-    print(f"[INFO] IK Mode: ABSOLUTE")
-    print(f"[INFO] Sensitivity: pos={pos_sens:.3f} rot={rot_sens:.3f}")
-    teleop_interface = XrTeleopController(
-        arm=args_cli.arm,
-        pos_sensitivity=pos_sens,
-        rot_sensitivity=rot_sens,
-    )
-
-should_reset = False
+if is_dual_arm and has_head:
+    print(f"[INFO] DualArmHeadXrAgent (23D) | pos_sens={pos_sens} rot_sens={rot_sens}")
+    agent = DualArmHeadXrAgent(
+        env, simulation_app,
+        pos_sensitivity=pos_sens, rot_sensitivity=rot_sens,
+        base_speed=args.base_speed, base_turn=args.base_turn,
+        drive_speed=args.drive_speed, drive_turn=args.drive_turn,
+        stream_to=args.stream_to, stream_port=args.stream_port,
+        stream_bitrate=args.stream_bitrate,
+        debug_viewports=args.debug_viewports or bool(args.stream_to),
+    )
+elif is_dual_arm:
+    print(f"[INFO] DualArmXrAgent (20D) | pos_sens={pos_sens} rot_sens={rot_sens}")
+    agent = DualArmXrAgent(
+        env, simulation_app,
+        pos_sensitivity=pos_sens, rot_sensitivity=rot_sens,
+        base_speed=args.base_speed, base_turn=args.base_turn,
+        drive_speed=args.drive_speed, drive_turn=args.drive_turn,
+        debug_viewports=args.debug_viewports,
+    )
+else:
+    print(f"[INFO] SingleArmXrAgent (8D) | arm={args.arm} pos_sens={pos_sens} rot_sens={rot_sens}")
+    agent = SingleArmXrAgent(
+        env, simulation_app,
+        arm=args.arm,
+        pos_sensitivity=pos_sens, rot_sensitivity=rot_sens,
+        debug_viewports=args.debug_viewports,
+    )
+
+agent.run()
 
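The new dispatch keys on which action terms the environment registers. A standalone restatement of that capability test for illustration (term names come from the diff; the string return values stand in for the agent classes):

```python
def pick_agent(action_terms: dict) -> str:
    # Mirrors the dispatch above: dual-arm envs register "left_arm_action",
    # and head-tracking envs additionally register "head_action".
    is_dual_arm = "left_arm_action" in action_terms
    has_head = "head_action" in action_terms
    if is_dual_arm and has_head:
        return "DualArmHeadXrAgent"  # 23D action space
    if is_dual_arm:
        return "DualArmXrAgent"      # 20D
    return "SingleArmXrAgent"        # 8D
```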
-def request_reset():
-    nonlocal should_reset
-    should_reset = True
-    print("[INFO] Reset requested via XR button.")
-
-teleop_interface.add_callback("RESET", request_reset)
-
-def clear_ik_target_state():
-    """Sync IK controller's internal target to current EEF pose to avoid jumps after reset."""
-    if is_dual_arm:
-        for term_key in ["left_arm_action", "right_arm_action"]:
-            arm_action_term = env.action_manager._terms[term_key]
-            ee_pos_b, ee_quat_b = arm_action_term._compute_frame_pose()
-            arm_action_term._raw_actions.zero_()
-            arm_action_term._processed_actions.zero_()
-            arm_action_term._ik_controller._command.zero_()
-            arm_action_term._ik_controller.ee_pos_des[:] = ee_pos_b
-            arm_action_term._ik_controller.ee_quat_des[:] = ee_quat_b
-    else:
-        arm_action_term = env.action_manager._terms["arm_action"]
-        ee_pos_b, ee_quat_b = arm_action_term._compute_frame_pose()
-        arm_action_term._raw_actions.zero_()
-        arm_action_term._processed_actions.zero_()
-        arm_action_term._ik_controller._command.zero_()
-        arm_action_term._ik_controller.ee_pos_des[:] = ee_pos_b
-        arm_action_term._ik_controller.ee_quat_des[:] = ee_quat_b
-
-def convert_action_world_to_root(action_tensor: torch.Tensor) -> torch.Tensor:
-    """Convert absolute IK action from world frame to robot root frame."""
-    robot = env.unwrapped.scene["robot"]
-    root_pos_w = robot.data.root_pos_w[0].detach().cpu().numpy()
-    root_quat_w = robot.data.root_quat_w[0].detach().cpu().numpy()
-    pos_root, quat_root = world_pose_to_root_frame(
-        action_tensor[:3].numpy(), action_tensor[3:7].numpy(),
-        root_pos_w, root_quat_w,
-    )
-    out = action_tensor.clone()
-    out[:3] = torch.tensor(pos_root, dtype=torch.float32)
-    out[3:7] = torch.tensor(quat_root, dtype=torch.float32)
-    return out
-
-def get_chassis_commands() -> tuple[torch.Tensor, float, float]:
-    """Read left joystick → (wheel_cmd 4D, v_fwd m/s, omega rad/s).
-
-    wheel_cmd : differential wheel velocities for the action vector (training data).
-    v_fwd : desired robot forward speed in m/s (for direct base control).
-    omega : desired robot yaw rate in rad/s (for direct base control).
-    """
-    try:
-        joy = teleop_interface.xr_client.get_joystick("left")
-        jy = float(joy[1])
-        jx = float(joy[0])
-    except Exception:
-        return torch.zeros(4), 0.0, 0.0
-    # Wheel velocity commands for action vector (rad/s)
-    v_w = jy * args_cli.base_speed
-    omega_w = jx * args_cli.base_turn
-    right_vel = v_w - omega_w
-    left_vel = v_w + omega_w
-    wheel_cmd = torch.tensor([right_vel, left_vel, left_vel, right_vel], dtype=torch.float32)
-    if abs(v_w) > 0.1 or abs(omega_w) > 0.1:
-        print(f"[WHEEL] joy=({jx:+.2f},{jy:+.2f}) v_wheel={v_w:.2f} ω_wheel={omega_w:.2f} → {wheel_cmd.tolist()}")
-    # Direct base velocity commands (m/s, rad/s)
-    v_fwd = jy * args_cli.drive_speed
-    omega = -jx * args_cli.drive_turn  # jx<0 (left push) → positive yaw = left turn
-    return wheel_cmd, v_fwd, omega
-
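The wheel command above is a plain differential-drive mix: a common forward component plus a turn differential, fanned out to four wheels as [right, left, left, right]. A minimal sketch isolating just that arithmetic (default gains copied from the script's `--base_speed 5.0` / `--base_turn 2.0` CLI defaults):

```python
import torch

def mix_wheels(jx: float, jy: float, base_speed: float = 5.0, base_turn: float = 2.0) -> torch.Tensor:
    """Joystick (x=turn, y=forward) -> 4 wheel velocities in [right, left, left, right] order."""
    v = jy * base_speed  # common forward component (rad/s)
    w = jx * base_turn   # differential component (rad/s)
    return torch.tensor([v - w, v + w, v + w, v - w], dtype=torch.float32)
```

Full forward gives four equal velocities; a pure turn gives opposite-sign velocities on the two sides.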
-def apply_base_velocity(v_fwd: float, omega: float) -> None:
-    """Directly set robot root linear+angular velocity to bypass isotropic friction.
-
-    PhysX uses isotropic friction, so skid-steer lateral slip is impossible through
-    wheel torques alone. This function overrides the base velocity each step to give
-    clean translational + rotational motion regardless of friction.
-    The write is buffered and applied at the start of the next env.step().
-    """
-    if abs(v_fwd) < 1e-4 and abs(omega) < 1e-4:
-        return
-    robot = env.unwrapped.scene["robot"]
-    rq = robot.data.root_quat_w  # [N, 4] wxyz
-    w_q, x_q, y_q, z_q = rq[:, 0], rq[:, 1], rq[:, 2], rq[:, 3]
-    # Robot forward direction in world frame (rotate [1,0,0] by root quaternion)
-    fwd_x = 1.0 - 2.0 * (y_q * y_q + z_q * z_q)
-    fwd_y = 2.0 * (x_q * y_q + w_q * z_q)
-    vel = torch.zeros(env.num_envs, 6, device=device)
-    vel[:, 0] = v_fwd * fwd_x  # world x linear
-    vel[:, 1] = v_fwd * fwd_y  # world y linear
-    # vel[:, 2] = 0  # z: let physics handle gravity/contact
-    vel[:, 5] = omega  # world z angular (yaw)
-    robot.write_root_velocity_to_sim(vel)
-
-obs, _ = env.reset()
-clear_ik_target_state()
-teleop_interface.reset()
-
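`apply_base_velocity` extracts the base's forward direction from the root quaternion using two entries of the rotation matrix (the first column, i.e. the body +X axis in world coordinates). A quick standalone check of that formula: a pure yaw of 90° should point the forward vector along world +Y.

```python
import math

def forward_xy_from_quat_wxyz(w: float, x: float, y: float, z: float) -> tuple[float, float]:
    """First column of the rotation matrix: body +X axis expressed in world frame (x, y components)."""
    return 1.0 - 2.0 * (y * y + z * z), 2.0 * (x * y + w * z)

# Pure yaw quaternion uses half-angle: q = (cos(theta/2), 0, 0, sin(theta/2)).
h = math.pi / 4  # half of 90 degrees
fx, fy = forward_xy_from_quat_wxyz(math.cos(h), 0.0, 0.0, math.sin(h))
```

For a yaw of theta this reduces to (cos theta, sin theta), matching the usual heading vector.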
-# Read contact sensor data directly from OmniGraph node USD attributes.
-# The USD has action graphs that execute each physics step and write outputs to:
-#   outputs:inContact (bool)   outputs:value (float, Newtons)
-# /root in USD → /World/envs/env_0/Robot in the loaded scene.
-_CONTACT_OG_NODES = {
-    "left_r": (
-        "/World/envs/env_0/Robot"
-        "/l_zhuajia___m2_1_cs/isaac_read_contact_sensor_node"
-    ),
-    "left_l": (
-        "/World/envs/env_0/Robot"
-        "/l_zhuajia___m3_1_cs/isaac_read_contact_sensor_node"
-    ),
-}
-
-def _og_contact_reading(stage, node_path: str) -> tuple[bool, float]:
-    """Read one OmniGraph contact node output via USD attribute API."""
-    prim = stage.GetPrimAtPath(node_path)
-    if not prim.IsValid():
-        return False, float("nan")
-    in_contact_attr = prim.GetAttribute("outputs:inContact")
-    force_attr = prim.GetAttribute("outputs:value")
-    in_contact = in_contact_attr.Get() if in_contact_attr.IsValid() else False
-    force = force_attr.Get() if force_attr.IsValid() else 0.0
-    return bool(in_contact), float(force or 0.0)
-
-# One-time diagnostic: run after first env.reset() so all prims exist
-def _diag_contact_nodes() -> None:
-    import omni.usd
-    stage = omni.usd.get_context().get_stage()
-    print("\n[DIAG] OmniGraph contact node check:")
-    for name, path in _CONTACT_OG_NODES.items():
-        prim = stage.GetPrimAtPath(path)
-        if prim.IsValid():
-            has_in = prim.GetAttribute("outputs:inContact").IsValid()
-            has_val = prim.GetAttribute("outputs:value").IsValid()
-            print(f"  [{name}] ✓ prim valid | outputs:inContact={has_in} outputs:value={has_val}")
-        else:
-            # Try to find similar prims to hint correct path
-            parent_path = "/".join(path.split("/")[:-1])
-            parent = stage.GetPrimAtPath(parent_path)
-            print(f"  [{name}] ✗ NOT FOUND at: {path}")
-            if parent.IsValid():
-                children = [c.GetName() for c in parent.GetChildren()]
-                print(f"    parent exists, children: {children}")
-            else:
-                gp_path = "/".join(path.split("/")[:-2])
-                gp = stage.GetPrimAtPath(gp_path)
-                print(f"    grandparent '{gp_path}' valid={gp.IsValid()}")
-
-def read_gripper_contact() -> str:
-    """Return contact status string by reading OmniGraph node USD attributes."""
-    try:
-        import omni.usd
-        stage = omni.usd.get_context().get_stage()
-        c_r, f_r = _og_contact_reading(stage, _CONTACT_OG_NODES["left_r"])
-        c_l, f_l = _og_contact_reading(stage, _CONTACT_OG_NODES["left_l"])
-        if f_r != f_r:  # nan → prim not found
-            return "prim not found (run with diag)"
-        sym_r = "✓" if c_r else "✗"
-        sym_l = "✓" if c_l else "✗"
-        return f"{sym_r}R={f_r:.1f}N {sym_l}L={f_l:.1f}N"
-    except Exception as _e:
-        return f"N/A ({_e})"
-
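`read_gripper_contact` uses the `f_r != f_r` comparison to detect the `float("nan")` sentinel that `_og_contact_reading` returns when a prim is missing. That idiom in isolation:

```python
def is_nan_sentinel(x: float) -> bool:
    # IEEE 754: NaN is the only float that compares unequal to itself,
    # so `x != x` is True exactly when x is NaN (no math.isnan needed).
    return x != x
```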
-# One-time diagnostic: verify OmniGraph contact sensor node paths
-_diag_contact_nodes()
-
-# Create 4 viewport windows bound to the robot cameras
-create_robot_viewports()
-
-
-if is_dual_arm:
-    teleop_right_ref.reset()
-
-print("\n" + "=" * 50)
-print(" Teleoperation Started!")
-if is_dual_arm:
-    print(" LEFT controller → left arm")
-    print(" RIGHT controller → right arm")
-print(" TRIGGER: open/close gripper")
-print(" GRIP (hold): move the arm")
-print(" Left joystick: Y=forward/back, X=turn")
-print(" B / Y: reset environment")
-print("=" * 50 + "\n")
-
-device = env.unwrapped.device
-sim_frame = 0
-obs, _ = env.reset()
-clear_ik_target_state()
-last_root_left = None
-last_root_right = None
-neutral_head_rot = None  # calibrated on first headset sample after reset
-
-while simulation_app.is_running():
-    try:
-        with torch.inference_mode():
-            if should_reset:
-                obs, _ = env.reset()
-                clear_ik_target_state()
-                teleop_interface.reset()
-                if is_dual_arm:
-                    teleop_right_ref.reset()
-                rebind_robot_viewports()
-                should_reset = False
-                sim_frame = 0
-                last_root_left = None
-                last_root_right = None
-                neutral_head_rot = None  # re-calibrate on next headset sample
-                continue
-
-            policy_obs = obs["policy"]
-            wheel_np, v_fwd, omega_base = get_chassis_commands()
-
-            # Head tracking: HMD yaw/pitch relative to neutral pose
-            head_cmd_np, neutral_head_rot = get_head_joint_targets(
-                teleop_interface.xr_client, neutral_head_rot
-            )
-            head_cmd = torch.tensor(head_cmd_np, dtype=torch.float32)
-
-            if is_dual_arm:
-                eef_pos_left = policy_obs["eef_pos_left"][0].cpu().numpy()
-                eef_quat_left = policy_obs["eef_quat_left"][0].cpu().numpy()
-                eef_pos_right = policy_obs["eef_pos_right"][0].cpu().numpy()
-                eef_quat_right = policy_obs["eef_quat_right"][0].cpu().numpy()
-
-                left_action = teleop_left.advance(current_eef_pos=eef_pos_left, current_eef_quat=eef_quat_left)
-                right_action = teleop_right_ref.advance(current_eef_pos=eef_pos_right, current_eef_quat=eef_quat_right)
-
-                # Only recompute root-frame IK target when grip is active; otherwise freeze joints
-                if teleop_left.grip_active or last_root_left is None:
-                    last_root_left = convert_action_world_to_root(left_action)[:7].clone()
-                if teleop_right_ref.grip_active or last_root_right is None:
-                    last_root_right = convert_action_world_to_root(right_action)[:7].clone()
-
-                # Action layout: left_arm(7) | wheel(4) | left_gripper(1) | right_arm(7) | right_gripper(1) | head(2)
-                action_np = torch.cat([
-                    last_root_left, wheel_np, left_action[7:8],
-                    last_root_right, right_action[7:8],
-                    head_cmd,
-                ])
-            else:
-                eef_pos = policy_obs["eef_pos"][0].cpu().numpy()
-                eef_quat = policy_obs["eef_quat"][0].cpu().numpy()
-
-                action_np = teleop_interface.advance(current_eef_pos=eef_pos, current_eef_quat=eef_quat)
-
-                if teleop_interface.grip_active or last_root_left is None:
-                    last_root_left = convert_action_world_to_root(action_np)[:7].clone()
-
-                # Action layout: arm(7) | wheel(4) | gripper(1)
-                action_np = torch.cat([last_root_left, wheel_np, action_np[7:8]])
-
-            actions = action_np.unsqueeze(0).repeat(env.num_envs, 1).to(device)
-            obs, _, _, _, _ = env.step(actions)
-            # Direct base velocity override: bypasses isotropic friction limitation
-            # so skid-steer turning works correctly.
-            apply_base_velocity(v_fwd, omega_base)
-
-            sim_frame += 1
-            if sim_frame % 30 == 0:
-                np.set_printoptions(precision=4, suppress=True, floatmode='fixed')
-                policy_obs = obs["policy"]
-                joint_pos = policy_obs["joint_pos"][0].cpu().numpy()
-                last_act = policy_obs["actions"][0].cpu().numpy()
-
-                if sim_frame == 30:
-                    robot = env.unwrapped.scene["robot"]
-                    jnames = robot.joint_names
-                    print(f"\n{'='*70}")
-                    print(f" ALL {len(jnames)} JOINT NAMES AND POSITIONS")
-                    print(f"{'='*70}")
-                    for i, name in enumerate(jnames):
-                        print(f" [{i:2d}] {name:30s} = {joint_pos[i]:+.4f}")
-                    arm_idx = [i for i, n in enumerate(jnames) if n.startswith("l_joint")]
-                    print(f" Deduced left arm indices: {arm_idx}")
-                    print(f"{'='*70}\n")
-
-                if not hasattr(env, '_arm_idx_cache'):
-                    robot = env.unwrapped.scene["robot"]
-                    jnames = robot.joint_names
-                    env._arm_idx_cache = [i for i, n in enumerate(jnames) if n.startswith("l_joint")]
-                    env._right_arm_idx_cache = [i for i, n in enumerate(jnames) if n.startswith("r_joint")]
-                arm_joints = joint_pos[env._arm_idx_cache]
-                right_arm_joints = joint_pos[env._right_arm_idx_cache]
-
-                try:
-                    joy_dbg = teleop_interface.xr_client.get_joystick("left")
-                    joy_str = f"[{joy_dbg[0]:+.2f}, {joy_dbg[1]:+.2f}]"
-                except Exception:
-                    joy_str = "N/A"
-
-                if is_dual_arm:
-                    eef_pos_left = policy_obs["eef_pos_left"][0].cpu().numpy()
-                    eef_quat_left = policy_obs["eef_quat_left"][0].cpu().numpy()
-                    eef_pos_right = policy_obs["eef_pos_right"][0].cpu().numpy()
-
-                    print(f"\n---------------- [ROBOT STATE frame={sim_frame}] ----------------")
-                    print(f"| Left Arm Joints (rad): {arm_joints}")
-                    print(f"| Right Arm Joints (rad): {right_arm_joints}")
-                    print(f"| Left EEF Pos (world, m): {eef_pos_left}")
-                    print(f"| Right EEF Pos (world, m): {eef_pos_right}")
-                    print(f"| Left Gripper Contact: {read_gripper_contact()}")
-                    print(f"| Cmd left_arm(abs): {last_act[:7]}")
-                    print(f"| Cmd wheel vel (rad/s): {last_act[7:11]}")
-                    print(f"| Cmd left_gripper: {last_act[11:12]} (+1=open -1=close)")
-                    print(f"| Cmd right_arm(abs): {last_act[12:19]}")
-                    print(f"| Cmd right_gripper: {last_act[19:]} (+1=open -1=close)")
-                    print(f"| Joystick (x=turn,y=fwd): {joy_str}")
-                    print(f"----------------------------------------------------------------")
-                else:
-                    eef_pos = policy_obs["eef_pos"][0].cpu().numpy()
-                    eef_quat = policy_obs["eef_quat"][0].cpu().numpy()
-
-                    if not hasattr(env, '_prev_eef_quat'):
-                        env._prev_eef_quat = eef_quat.copy()
-                    prev_q = env._prev_eef_quat
-                    r_prev = R.from_quat([prev_q[1], prev_q[2], prev_q[3], prev_q[0]])
-                    r_curr = R.from_quat([eef_quat[1], eef_quat[2], eef_quat[3], eef_quat[0]])
-                    delta_rotvec = (r_curr * r_prev.inv()).as_rotvec()
-                    delta_angle_deg = np.degrees(np.linalg.norm(delta_rotvec))
-                    delta_axis = delta_rotvec / (np.linalg.norm(delta_rotvec) + 1e-9)
-                    env._prev_eef_quat = eef_quat.copy()
-                    approach_vec = r_curr.apply(np.array([1.0, 0.0, 0.0]))
-
-                    print(f"\n---------------- [ROBOT STATE frame={sim_frame}] ----------------")
-                    print(f"| Left Arm Joints (rad): {arm_joints}")
-                    print(f"| EEF Pos (world, m): {eef_pos}")
-                    print(f"| EEF approach vec (world): [{approach_vec[0]:+.3f}, {approach_vec[1]:+.3f}, {approach_vec[2]:+.3f}] (gripper heading)")
-                    print(f"| EEF rot since last print: {delta_angle_deg:+.2f}° around [{delta_axis[0]:+.3f},{delta_axis[1]:+.3f},{delta_axis[2]:+.3f}] (world)")
-                    print(f"| Cmd arm(abs pos+quat): {last_act[:7]}")
-                    print(f"| Cmd wheel vel (rad/s): {last_act[7:11]}")
-                    print(f"| Cmd gripper: {last_act[11:]} (+1=open -1=close)")
-                    print(f"| Joystick (x=turn,y=fwd): {joy_str}")
-                    print(f"----------------------------------------------------------------")
-
-    except Exception as e:
-        logger.error(f"Error during simulation step: {e}")
-        break
-
-teleop_interface.close()
-if is_dual_arm:
-    teleop_right_ref.close()
-    shared_client.close()
-env.close()
-
-
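The single-arm debug branch measures how far the EEF rotated between prints via a quaternion difference: compose the current rotation with the inverse of the previous one and read off the rotation vector. The same computation as a standalone helper (wxyz quaternions in, SciPy's xyzw internally):

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def delta_rotation_deg(q_prev_wxyz, q_curr_wxyz):
    """Angle (deg) and unit axis of the rotation taking the previous orientation to the current one."""
    r_prev = R.from_quat([q_prev_wxyz[1], q_prev_wxyz[2], q_prev_wxyz[3], q_prev_wxyz[0]])
    r_curr = R.from_quat([q_curr_wxyz[1], q_curr_wxyz[2], q_curr_wxyz[3], q_curr_wxyz[0]])
    rotvec = (r_curr * r_prev.inv()).as_rotvec()
    angle = np.linalg.norm(rotvec)
    # Small epsilon mirrors the script's guard against dividing by a zero-angle norm.
    return np.degrees(angle), rotvec / (angle + 1e-9)
```

Rotating from identity to a 30° yaw should report 30° about the world z axis.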
 if __name__ == "__main__":
scripts/environments/teleoperation/teleop_xr_agent.py.bak (new file, 856 lines)
@@ -0,0 +1,856 @@
+#!/usr/bin/env python3
+# Copyright (c) 2022-2026, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
+# All rights reserved.
+#
+# SPDX-License-Identifier: BSD-3-Clause
+
+"""
+Script to run teleoperation with Isaac Lab manipulation environments using PICO XR Controllers.
+
+Only absolute IK mode is supported (Isaac-MindRobot-LeftArm-IK-Abs-v0).
+The controller computes delta pose in Isaac Sim world frame and accumulates
+an absolute EEF target, which is then converted to robot root frame before
+being sent to the DifferentialIK action manager.
+"""
+
+import argparse
+import logging
+import sys
+import os
+from collections.abc import Callable
+
+# Ensure xr_utils (next to this script) is importable when running directly
+sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
+
+from isaaclab.app import AppLauncher
+
+logger = logging.getLogger(__name__)
+
+# add argparse arguments
+parser = argparse.ArgumentParser(
+    description="Teleoperation for Isaac Lab environments with PICO XR Controller (absolute IK mode only)."
+)
+parser.add_argument("--num_envs", type=int, default=1, help="Number of environments to simulate.")
+parser.add_argument(
+    "--task",
+    type=str,
+    default="Isaac-MindRobot-LeftArm-IK-Abs-v0",
+    help="Name of the task (must be an Abs IK task).",
+)
+parser.add_argument(
+    "--sensitivity", type=float, default=None,
+    help="Set both pos and rot sensitivity (overridden by --pos_sensitivity/--rot_sensitivity).",
+)
+parser.add_argument("--pos_sensitivity", type=float, default=None, help="Position sensitivity. Default: 1.0.")
+parser.add_argument("--rot_sensitivity", type=float, default=None, help="Rotation sensitivity. Default: 0.3.")
+parser.add_argument("--arm", type=str, default="left", choices=["left", "right"], help="Which arm/controller to use.")
+parser.add_argument("--base_speed", type=float, default=5.0, help="Max wheel speed (rad/s) for joystick full forward.")
+parser.add_argument("--base_turn", type=float, default=2.0, help="Max wheel differential (rad/s) for joystick full left/right.")
+parser.add_argument("--drive_speed", type=float, default=0.5, help="Max robot linear speed (m/s) for direct base control. Default: 0.5")
+parser.add_argument("--drive_turn", type=float, default=1.5, help="Max robot yaw rate (rad/s) for direct base control. Default: 1.5")
+parser.add_argument(
+    "--stream-to", type=str, default=None, dest="stream_to",
+    help="PICO VR headset IP for sim stereo streaming. Streams H264 SBS to the Unity client TCP server. "
+    "Requires vr_left_eye/vr_right_eye cameras in the env scene config.",
+)
+parser.add_argument("--stream-port", type=int, default=12345, dest="stream_port", help="TCP port for video streaming.")
+parser.add_argument("--stream-bitrate", type=int, default=20_000_000, dest="stream_bitrate", help="H264 bitrate (bps).")
+parser.add_argument(
+    "--debug-viewports", action="store_true", dest="debug_viewports", default=False,
+    help="Enable debug viewport windows (Left Hand, Right Hand, Head, Chest). Off by default when --stream-to is used.",
+)
+
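Per the help strings above, explicit `--pos_sensitivity`/`--rot_sensitivity` override `--sensitivity`, which in turn overrides the defaults (1.0 position, 0.3 rotation). That precedence, restated as a standalone function for illustration (mirroring the resolution block the old script used):

```python
def resolve_sensitivity(sensitivity=None, pos_sensitivity=None, rot_sensitivity=None):
    """Defaults (1.0 / 0.3) < --sensitivity < per-axis --pos_sensitivity / --rot_sensitivity."""
    pos, rot = 1.0, 0.3
    if sensitivity is not None:
        pos = rot = sensitivity          # broad knob sets both axes
    if pos_sensitivity is not None:
        pos = pos_sensitivity            # per-axis flags win last
    if rot_sensitivity is not None:
        rot = rot_sensitivity
    return pos, rot
```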
+AppLauncher.add_app_launcher_args(parser)
+args_cli = parser.parse_args()
+app_launcher_args = vars(args_cli)
+app_launcher_args["xr"] = False
+
+app_launcher = AppLauncher(app_launcher_args)
+simulation_app = app_launcher.app
+
+"""Rest everything follows."""
+
+import gymnasium as gym
+import numpy as np
+import torch
+from scipy.spatial.transform import Rotation as R
+
+from isaaclab.envs import ManagerBasedRLEnvCfg
+
+import isaaclab_tasks  # noqa: F401
+import mindbot.tasks  # noqa: F401
+from isaaclab_tasks.utils import parse_env_cfg
+
+from xr_utils import XrClient, transform_xr_pose, quat_diff_as_rotvec_xyzw, is_valid_quaternion
+from xr_utils.geometry import R_HEADSET_TO_WORLD
+
+
+# =====================================================================
+# Robot Camera Viewport Utilities
+# =====================================================================
+
+# Resolved prim paths for env_0 — must match CameraCfg prim_path with
+# {ENV_REGEX_NS} → /World/envs/env_0
+_ROBOT_CAM_PRIMS: dict[str, str] = {
+    "Left Hand": "/World/envs/env_0/Robot/rm_65_fb_left/Link_6/robot___left_8_02/cam_left_hand",
+    "Right Hand": "/World/envs/env_0/Robot/rm_65_b_right/Link6/robot___right_8_2/cam_right_hand",
+    "Head": "/World/envs/env_0/Robot/robot_head/ZED_X/base_link/ZED_X/CameraLeft",
+    "Chest": "/World/envs/env_0/Robot/robot_trunk/cam_chest",
+}
+
+# Stores created viewport window objects
+_robot_viewports: dict[str, object] = {}
+
+
+def create_robot_viewports() -> None:
+    """Create 4 viewport windows and bind each to a robot camera prim.
+
+    Must be called after env.reset() so all prims exist on the USD stage.
+    """
+    try:
+        import omni.kit.viewport.utility as vp_util
+    except ImportError:
+        logger.warning("[Viewport] omni.kit.viewport.utility not available — skipping viewport creation.")
+        return
+
+    for name, cam_path in _ROBOT_CAM_PRIMS.items():
+        vp_win = vp_util.create_viewport_window(
+            f"Robot {name} View", width=640, height=360
+        )
+        vp_win.viewport_api.camera_path = cam_path
+        _robot_viewports[name] = vp_win
+        print(f"[INFO] Viewport 'Robot {name} View' bound to: {cam_path}")
+
+
+def rebind_robot_viewports() -> None:
+    """Re-bind all robot viewports to their camera paths.
+
+    Call this after every env reset so viewports stay locked to the cameras.
+    """
+    for name, vp_win in _robot_viewports.items():
+        cam_path = _ROBOT_CAM_PRIMS[name]
+        vp_win.viewport_api.camera_path = cam_path
+
+
+# =====================================================================
+# Helpers
+# =====================================================================
+
+def _quat_wxyz_to_rotation(quat_wxyz: np.ndarray) -> R:
+    """Convert Isaac-style wxyz quaternion to scipy Rotation."""
+    return R.from_quat([quat_wxyz[1], quat_wxyz[2], quat_wxyz[3], quat_wxyz[0]])
+
+
+def _rotation_to_quat_wxyz(rot: R) -> np.ndarray:
+    """Convert scipy Rotation to Isaac-style wxyz quaternion."""
+    q = rot.as_quat()
+    return np.array([q[3], q[0], q[1], q[2]])
+
+
+def world_pose_to_root_frame(
+    pos_w: np.ndarray,
+    quat_wxyz: np.ndarray,
+    root_pos_w: np.ndarray,
+    root_quat_wxyz: np.ndarray,
+) -> tuple[np.ndarray, np.ndarray]:
+    """Express a world-frame pose in the robot root frame."""
+    root_rot = _quat_wxyz_to_rotation(root_quat_wxyz)
+    pose_rot = _quat_wxyz_to_rotation(quat_wxyz)
+    pos_root = root_rot.inv().apply(pos_w - root_pos_w)
+    quat_root = _rotation_to_quat_wxyz(root_rot.inv() * pose_rot)
+    return pos_root, quat_root
+
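The `_quat_wxyz_to_rotation` / `_rotation_to_quat_wxyz` pair exists because SciPy's `Rotation.from_quat` expects scalar-last (x, y, z, w) while Isaac Sim/USD use scalar-first (w, x, y, z). The reindexing round-trips cleanly, as a quick standalone check shows:

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def wxyz_to_xyzw(q: np.ndarray) -> np.ndarray:
    """Isaac/USD order (w, x, y, z) -> SciPy order (x, y, z, w)."""
    return np.array([q[1], q[2], q[3], q[0]])

def xyzw_to_wxyz(q: np.ndarray) -> np.ndarray:
    """SciPy order (x, y, z, w) -> Isaac/USD order (w, x, y, z)."""
    return np.array([q[3], q[0], q[1], q[2]])
```

Mixing up the two orders silently produces a different rotation, which is why the script funnels every conversion through these helpers.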
|
# =====================================================================
|
||||||
|
# Teleoperation Controller
|
||||||
|
# =====================================================================
|
||||||
|
|
||||||
|
class XrTeleopController:
|
||||||
|
"""Teleop controller for PICO XR headset (absolute IK mode).
|
||||||
|
|
||||||
|
Accumulates an absolute EEF target in Isaac Sim world frame.
|
||||||
|
Grip button acts as clutch; B/Y buttons trigger environment reset.
|
||||||
|
"""
|
||||||
|
|
||||||
|
def __init__(self, arm="left", pos_sensitivity=1.0, rot_sensitivity=0.3,
|
||||||
|
max_pos_per_step=0.05, max_rot_per_step=0.08, xr_client=None):
|
||||||
|
self.xr_client = xr_client if xr_client is not None else XrClient()
|
||||||
|
self._owns_client = xr_client is None # only close if we created it
|
||||||
|
self.pos_sensitivity = pos_sensitivity
|
||||||
|
self.rot_sensitivity = rot_sensitivity
|
||||||
|
# Hard caps per physics step to prevent IK divergence.
|
||||||
|
# max_rot_per_step ~4.6°, max_pos_per_step 5 cm.
|
||||||
|
self.max_pos_per_step = max_pos_per_step
|
||||||
|
self.max_rot_per_step = max_rot_per_step
|
||||||
|
self.arm = arm
|
||||||
|
|
||||||
|
self.controller_name = "left_controller" if arm == "left" else "right_controller"
|
||||||
|
self.grip_name = "left_grip" if arm == "left" else "right_grip"
|
||||||
|
self.trigger_name = "left_trigger" if arm == "left" else "right_trigger"
|
||||||
|
|
||||||
|
self.R_headset_world = R_HEADSET_TO_WORLD
|
||||||
|
|
||||||
|
self.prev_xr_pos = None
|
||||||
|
self.prev_xr_quat = None
|
||||||
|
self.target_eef_pos = None # world frame
|
||||||
|
self.target_eef_quat = None # world frame, wxyz
|
||||||
|
|
||||||
|
self.grip_active = False
|
||||||
|
self.frame_count = 0
|
||||||
|
self.reset_button_latched = False
|
||||||
|
self.require_grip_reengage = False
|
||||||
|
self.grip_engage_threshold = 0.8
|
||||||
|
self.grip_release_threshold = 0.2
|
||||||
|
self.callbacks = {}
|
||||||
|
|
||||||
|
def add_callback(self, name: str, func: Callable):
|
||||||
|
self.callbacks[name] = func
|
||||||
|
|
||||||
|
def reset(self):
|
||||||
|
self.prev_xr_pos = None
|
||||||
|
self.prev_xr_quat = None
|
||||||
|
self.grip_active = False
|
||||||
|
self.frame_count = 0
|
||||||
|
self.target_eef_pos = None
|
||||||
|
self.target_eef_quat = None
|
||||||
|
# Require grip release after reset to avoid driving to stale pose.
|
||||||
|
self.require_grip_reengage = True
|
||||||
|
|
||||||
|
def close(self):
|
||||||
|
if self._owns_client:
|
||||||
|
self.xr_client.close()
|
||||||
|
|
||||||
|
def get_zero_action(self, trigger, current_eef_pos=None, current_eef_quat=None) -> torch.Tensor:
|
||||||
|
"""Return 8D hold action [x, y, z, qw, qx, qy, qz, gripper] at frozen target pose."""
|
||||||
|
action = torch.zeros(8)
|
||||||
|
if self.target_eef_pos is not None and self.target_eef_quat is not None:
|
||||||
|
# Prefer frozen world-frame target so IK holds a fixed world position
|
||||||
|
# even if the robot chassis drifts slightly under gravity.
|
||||||
|
action[:3] = torch.tensor(self.target_eef_pos.copy())
|
||||||
|
action[3:7] = torch.tensor(self.target_eef_quat.copy())
|
||||||
|
elif current_eef_pos is not None and current_eef_quat is not None:
|
||||||
|
action[:3] = torch.tensor(current_eef_pos.copy())
|
||||||
|
action[3:7] = torch.tensor(current_eef_quat.copy())
|
||||||
|
else:
|
||||||
|
action[3] = 1.0 # identity quaternion
|
||||||
|
action[7] = 1.0 if trigger > 0.5 else -1.0
|
||||||
|
return action
|
||||||
|
|
||||||
|
    def advance(self, current_eef_pos=None, current_eef_quat=None) -> torch.Tensor:
        """Read the XR controller and return an 8D absolute action [x, y, z, qw, qx, qy, qz, gripper].

        Position and quaternion are in the Isaac Sim world frame.
        The caller must convert them to the robot root frame before sending to IK.
        """
        try:
            # Only the left controller's Y button resets (not the right controller's B)
            if self.arm == "left":
                reset_pressed = self.xr_client.get_button("Y")
            else:
                reset_pressed = False
            if reset_pressed and not self.reset_button_latched:
                if "RESET" in self.callbacks:
                    self.callbacks["RESET"]()
            self.reset_button_latched = reset_pressed
        except Exception:
            pass

        try:
            raw_pose = self.xr_client.get_pose(self.controller_name)
            grip = self.xr_client.get_key_value(self.grip_name)
            trigger = self.xr_client.get_key_value(self.trigger_name)
        except Exception:
            return self.get_zero_action(0.0, current_eef_pos, current_eef_quat)

        if not is_valid_quaternion(raw_pose[3:]):
            return self.get_zero_action(trigger, current_eef_pos, current_eef_quat)

        # Wait for grip release after a reset
        if self.require_grip_reengage:
            if grip <= self.grip_release_threshold:
                self.require_grip_reengage = False
            else:
                if current_eef_pos is not None and current_eef_quat is not None:
                    self.target_eef_pos = current_eef_pos.copy()
                    self.target_eef_quat = current_eef_quat.copy()
                return self.get_zero_action(trigger, current_eef_pos, current_eef_quat)

        # Grip button acts as a clutch with hysteresis
        grip_pressed = grip > self.grip_release_threshold if self.grip_active else grip >= self.grip_engage_threshold

        if not grip_pressed:
            self.prev_xr_pos = None
            self.prev_xr_quat = None
            # Only snapshot the target on the transition frame (grip just released) or the first ever frame.
            # After that, keep it frozen so IK holds a fixed world-frame position.
            if self.grip_active or self.target_eef_pos is None:
                if current_eef_pos is not None and current_eef_quat is not None:
                    self.target_eef_pos = current_eef_pos.copy()
                    self.target_eef_quat = current_eef_quat.copy()
            self.grip_active = False
            return self.get_zero_action(trigger, current_eef_pos, current_eef_quat)

        if not self.grip_active:
            self.prev_xr_pos = raw_pose[:3].copy()
            self.prev_xr_quat = raw_pose[3:].copy()
            if current_eef_pos is not None and current_eef_quat is not None:
                self.target_eef_pos = current_eef_pos.copy()
                self.target_eef_quat = current_eef_quat.copy()
            self.grip_active = True
            return self.get_zero_action(trigger, current_eef_pos, current_eef_quat)

        # Project the current and previous XR poses into the Isaac Sim world frame
        xr_world_pos, xr_world_quat_xyzw = transform_xr_pose(raw_pose[:3], raw_pose[3:])
        prev_xr_world_pos, prev_xr_world_quat_xyzw = transform_xr_pose(self.prev_xr_pos, self.prev_xr_quat)

        # Delta position (world frame), clamped per step
        world_delta_pos = (xr_world_pos - prev_xr_world_pos) * self.pos_sensitivity
        pos_norm = np.linalg.norm(world_delta_pos)
        if pos_norm > self.max_pos_per_step:
            world_delta_pos *= self.max_pos_per_step / pos_norm

        # Delta rotation (world frame), clamped per step
        world_delta_rot = quat_diff_as_rotvec_xyzw(prev_xr_world_quat_xyzw, xr_world_quat_xyzw) * self.rot_sensitivity
        rot_norm = np.linalg.norm(world_delta_rot)
        if rot_norm > self.max_rot_per_step:
            world_delta_rot *= self.max_rot_per_step / rot_norm

        gripper_action = 1.0 if trigger > 0.5 else -1.0

        # Accumulate the absolute target
        if self.target_eef_pos is None:
            self.target_eef_pos = np.zeros(3)
            self.target_eef_quat = np.array([1.0, 0.0, 0.0, 0.0])

        self.target_eef_pos += world_delta_pos

        # Apply the rotation delta in the world frame so controller direction = EEF world direction
        target_r = _quat_wxyz_to_rotation(self.target_eef_quat)
        self.target_eef_quat = _rotation_to_quat_wxyz(R.from_rotvec(world_delta_rot) * target_r)

        self.prev_xr_pos = raw_pose[:3].copy()
        self.prev_xr_quat = raw_pose[3:].copy()
        self.frame_count += 1

        action = torch.tensor([
            *self.target_eef_pos,
            *self.target_eef_quat,
            gripper_action,
        ], dtype=torch.float32)

        if self.frame_count % 30 == 0:
            np.set_printoptions(precision=4, suppress=True, floatmode='fixed')
            print("\n====================== [VR TELEOP DEBUG] ======================")
            print(f"| Raw VR Pos (OpenXR): {np.array(raw_pose[:3])}")
            print(f"| Raw VR Quat (xyzw): {np.array(raw_pose[3:])}")
            print(f"| XR Delta Pos (world): {world_delta_pos} (norm={np.linalg.norm(world_delta_pos):.4f})")
            print(f"| XR Delta Rot (world): {world_delta_rot} (norm={np.linalg.norm(world_delta_rot):.4f})")
            print(f"| Targ Pos (world): {action[:3].numpy()}")
            print(f"| Targ Quat (world, wxyz): {action[3:7].numpy()}")
            print(f"| Gripper & Trigger: Grip={grip:.2f}, Trig={trigger:.2f}")
            print("===============================================================")

        return action

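The grip clutch in `advance()` uses hysteresis: once engaged it stays engaged until `grip` drops below `grip_release_threshold`, and once released it re-engages only at or above `grip_engage_threshold`, so a noisy analog reading near a single threshold cannot flap the clutch on and off. A minimal standalone sketch of that state machine (the 0.7/0.3 thresholds here are illustrative, not the script's actual defaults):

```python
class GripClutch:
    """Hysteresis clutch: engage at >= 0.7, release at <= 0.3 (illustrative values)."""

    def __init__(self, engage: float = 0.7, release: float = 0.3):
        self.engage = engage
        self.release = release
        self.active = False

    def update(self, grip: float) -> bool:
        # While active, only a drop to/below the lower threshold releases;
        # while inactive, only a rise to/above the upper threshold engages.
        self.active = grip > self.release if self.active else grip >= self.engage
        return self.active


clutch = GripClutch()
readings = [0.0, 0.5, 0.8, 0.5, 0.4, 0.2, 0.5]
states = [clutch.update(g) for g in readings]
print(states)  # [False, False, True, True, True, False, False]
```

Note how the 0.5 readings map to different states depending on history; a single-threshold clutch would toggle on every frame that hovered near the cutoff.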
# =====================================================================
# Head Tracking Utility
# =====================================================================


def get_head_joint_targets(
    xr_client, neutral_rot: R | None
) -> tuple[np.ndarray, R | None]:
    """Extract head joint targets [yaw, pitch] from the HMD pose.

    On the first call (neutral_rot is None) the current headset orientation
    is recorded as the neutral reference and zeros are returned.
    Subsequent calls return yaw/pitch relative to that neutral pose.

    Returns:
        (joint_targets, neutral_rot) where joint_targets is a shape-(2,)
        float32 array [head_yaw_Joint_rad, head_pitch_Joint_rad].
    """
    try:
        raw_pose = xr_client.get_pose("headset")
        if not is_valid_quaternion(raw_pose[3:]):
            return np.zeros(2, dtype=np.float32), neutral_rot
        _, head_world_quat_xyzw = transform_xr_pose(raw_pose[:3], raw_pose[3:])
        head_rot = R.from_quat(head_world_quat_xyzw)
        if neutral_rot is None:
            # First call: record the neutral pose and hold zero
            return np.zeros(2, dtype=np.float32), head_rot
        # Relative rotation from the neutral heading
        rel_rot = head_rot * neutral_rot.inv()
        # ZYX Euler: [0]=yaw (Z), [1]=pitch (Y)
        euler_zyx = rel_rot.as_euler("ZYX")
        yaw = float(np.clip(euler_zyx[0], -1.57, 1.57))  # ±90°
        pitch = float(np.clip(euler_zyx[1], -0.52, 0.52))  # ±30°
        return np.array([yaw, pitch], dtype=np.float32), neutral_rot
    except Exception:
        return np.zeros(2, dtype=np.float32), neutral_rot

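The yaw/pitch extraction can be exercised in isolation. The sketch below builds a known head rotation with SciPy (the same `Rotation` API the function uses), takes the rotation relative to a neutral reference, and reads intrinsic ZYX Euler angles, where index 0 is yaw about Z and index 1 is pitch about Y; the angles and the identity neutral pose are illustrative:

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

# Neutral reference: looking straight ahead.
neutral_rot = R.identity()
# Head turned 0.3 rad in yaw (about Z) and 0.1 rad in pitch (about Y).
head_rot = R.from_euler("ZYX", [0.3, 0.1, 0.0])

rel_rot = head_rot * neutral_rot.inv()
euler_zyx = rel_rot.as_euler("ZYX")
yaw = float(np.clip(euler_zyx[0], -1.57, 1.57))    # clamp to the ±90° joint limit
pitch = float(np.clip(euler_zyx[1], -0.52, 0.52))  # clamp to the ±30° joint limit
print(round(yaw, 3), round(pitch, 3))  # 0.3 0.1
```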
# =====================================================================
# Main Execution Loop
# =====================================================================


def main() -> None:
    """Run teleoperation with a PICO XR controller against an Isaac Lab environment."""

    env_cfg = parse_env_cfg(args_cli.task, num_envs=args_cli.num_envs)
    env_cfg.env_name = args_cli.task

    if not isinstance(env_cfg, ManagerBasedRLEnvCfg):
        raise ValueError(f"Teleoperation requires ManagerBasedRLEnvCfg. Got: {type(env_cfg)}")
    if "Abs" not in args_cli.task:
        raise ValueError(
            f"Task '{args_cli.task}' is not an absolute IK task. "
            "Only Abs tasks are supported (e.g. Isaac-MindRobot-LeftArm-IK-Abs-v0)."
        )

    env_cfg.terminations.time_out = None

    try:
        env = gym.make(args_cli.task, cfg=env_cfg).unwrapped
    except Exception as e:
        logger.error(f"Failed to create environment '{args_cli.task}': {e}")
        simulation_app.close()
        return

    # Detect single-arm vs dual-arm based on action manager term names
    is_dual_arm = "left_arm_action" in env.action_manager._terms
    has_head = "head_action" in env.action_manager._terms

    # -- Sim-to-VR stereo streaming setup --
    streamer = None
    if args_cli.stream_to:
        try:
            from mindbot.utils.sim_video_streamer import SimVideoStreamer

            # Get the camera resolution from the scene config (vr_left_eye)
            scene = env.unwrapped.scene
            if "vr_left_eye" not in scene.sensors:
                logger.warning("[Stream] vr_left_eye camera not found in scene. Streaming disabled.")
            else:
                cam = scene["vr_left_eye"]
                cam_w = cam.cfg.width
                cam_h = cam.cfg.height
                streamer = SimVideoStreamer(
                    host=args_cli.stream_to,
                    port=args_cli.stream_port,
                    width=cam_w,
                    height=cam_h,
                    fps=30,
                    bitrate=args_cli.stream_bitrate,
                )
                if streamer.connect():
                    print(f"[INFO] Sim stereo streaming to {args_cli.stream_to}:{args_cli.stream_port} "
                          f"({cam_w*2}x{cam_h} SBS @ 30fps)")
                else:
                    logger.error("[Stream] Failed to connect. Streaming disabled.")
                    streamer = None
        except ImportError:
            logger.error("[Stream] mindbot.utils.sim_video_streamer not found. pip install av?")
        except Exception as e:
            logger.error(f"[Stream] Init failed: {e}")

    # Sensitivity: explicit args override --sensitivity, which overrides the defaults
    pos_sens = 1.0
    rot_sens = 0.3
    if args_cli.sensitivity is not None:
        pos_sens = args_cli.sensitivity
        rot_sens = args_cli.sensitivity
    if args_cli.pos_sensitivity is not None:
        pos_sens = args_cli.pos_sensitivity
    if args_cli.rot_sensitivity is not None:
        rot_sens = args_cli.rot_sensitivity

    if is_dual_arm:
        print("\n[INFO] Dual-arm mode detected. Using both controllers.")
        print("[INFO] IK Mode: ABSOLUTE")
        print(f"[INFO] Sensitivity: pos={pos_sens:.3f} rot={rot_sens:.3f}")
        shared_client = XrClient()
        teleop_left = XrTeleopController(arm="left", pos_sensitivity=pos_sens, rot_sensitivity=rot_sens, xr_client=shared_client)
        teleop_right = XrTeleopController(arm="right", pos_sensitivity=pos_sens, rot_sensitivity=rot_sens, xr_client=shared_client)
        teleop_interface = teleop_left  # primary interface for callbacks/reset
        teleop_right_ref = teleop_right
    else:
        print(f"\n[INFO] Connecting to PICO XR headset using the {args_cli.arm} controller...")
        print("[INFO] IK Mode: ABSOLUTE")
        print(f"[INFO] Sensitivity: pos={pos_sens:.3f} rot={rot_sens:.3f}")
        teleop_interface = XrTeleopController(
            arm=args_cli.arm,
            pos_sensitivity=pos_sens,
            rot_sensitivity=rot_sens,
        )

    should_reset = False

    def request_reset():
        nonlocal should_reset
        should_reset = True
        print("[INFO] Reset requested via XR button.")

    teleop_interface.add_callback("RESET", request_reset)

    def clear_ik_target_state():
        """Sync the IK controller's internal target to the current EEF pose to avoid jumps after a reset."""
        if is_dual_arm:
            for term_key in ["left_arm_action", "right_arm_action"]:
                arm_action_term = env.action_manager._terms[term_key]
                ee_pos_b, ee_quat_b = arm_action_term._compute_frame_pose()
                arm_action_term._raw_actions.zero_()
                arm_action_term._processed_actions.zero_()
                arm_action_term._ik_controller._command.zero_()
                arm_action_term._ik_controller.ee_pos_des[:] = ee_pos_b
                arm_action_term._ik_controller.ee_quat_des[:] = ee_quat_b
        else:
            arm_action_term = env.action_manager._terms["arm_action"]
            ee_pos_b, ee_quat_b = arm_action_term._compute_frame_pose()
            arm_action_term._raw_actions.zero_()
            arm_action_term._processed_actions.zero_()
            arm_action_term._ik_controller._command.zero_()
            arm_action_term._ik_controller.ee_pos_des[:] = ee_pos_b
            arm_action_term._ik_controller.ee_quat_des[:] = ee_quat_b

    def convert_action_world_to_root(action_tensor: torch.Tensor) -> torch.Tensor:
        """Convert an absolute IK action from the world frame to the robot root frame."""
        robot = env.unwrapped.scene["robot"]
        root_pos_w = robot.data.root_pos_w[0].detach().cpu().numpy()
        root_quat_w = robot.data.root_quat_w[0].detach().cpu().numpy()
        pos_root, quat_root = world_pose_to_root_frame(
            action_tensor[:3].numpy(), action_tensor[3:7].numpy(),
            root_pos_w, root_quat_w,
        )
        out = action_tensor.clone()
        out[:3] = torch.tensor(pos_root, dtype=torch.float32)
        out[3:7] = torch.tensor(quat_root, dtype=torch.float32)
        return out

    def get_chassis_commands() -> tuple[torch.Tensor, float, float]:
        """Read the left joystick → (wheel_cmd 4D, v_fwd m/s, omega rad/s).

        wheel_cmd : differential wheel velocities for the action vector (training data).
        v_fwd     : desired robot forward speed in m/s (for direct base control).
        omega     : desired robot yaw rate in rad/s (for direct base control).
        """
        try:
            joy = teleop_interface.xr_client.get_joystick("left")
            jy = float(joy[1])
            jx = float(joy[0])
        except Exception:
            return torch.zeros(4), 0.0, 0.0
        # Wheel velocity commands for the action vector (rad/s)
        v_w = jy * args_cli.base_speed
        omega_w = jx * args_cli.base_turn
        right_vel = v_w - omega_w
        left_vel = v_w + omega_w
        wheel_cmd = torch.tensor([right_vel, left_vel, left_vel, right_vel], dtype=torch.float32)
        if abs(v_w) > 0.1 or abs(omega_w) > 0.1:
            print(f"[WHEEL] joy=({jx:+.2f},{jy:+.2f}) v_wheel={v_w:.2f} ω_wheel={omega_w:.2f} → {wheel_cmd.tolist()}")
        # Direct base velocity commands (m/s, rad/s)
        v_fwd = jy * args_cli.drive_speed
        omega = -jx * args_cli.drive_turn  # jx<0 (left push) → positive yaw = left turn
        return wheel_cmd, v_fwd, omega

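The wheel mixing in `get_chassis_commands()` is plain differential drive: forward speed plus the turn term on one side, minus it on the other, with each side duplicated across its two wheels in the action vector's [right, left, left, right] order. A standalone sketch with illustrative inputs:

```python
def mix_skid_steer(v: float, omega: float) -> list[float]:
    """Blend a forward command and a turn command into four wheel velocities,
    in the script's [right, left, left, right] wheel order."""
    right = v - omega
    left = v + omega
    return [right, left, left, right]

print(mix_skid_steer(1.0, 0.0))  # [1.0, 1.0, 1.0, 1.0] -> pure forward
print(mix_skid_steer(0.0, 0.5))  # [-0.5, 0.5, 0.5, -0.5] -> spin in place
```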
    def apply_base_velocity(v_fwd: float, omega: float) -> None:
        """Directly set the robot root linear+angular velocity to bypass isotropic friction.

        PhysX uses isotropic friction, so skid-steer lateral slip is impossible through
        wheel torques alone. This function overrides the base velocity each step to give
        clean translational + rotational motion regardless of friction.
        The write is buffered and applied at the start of the next env.step().
        """
        if abs(v_fwd) < 1e-4 and abs(omega) < 1e-4:
            return
        robot = env.unwrapped.scene["robot"]
        rq = robot.data.root_quat_w  # [N, 4] wxyz
        w_q, x_q, y_q, z_q = rq[:, 0], rq[:, 1], rq[:, 2], rq[:, 3]
        # Robot forward direction in the world frame (rotate [1, 0, 0] by the root quaternion)
        fwd_x = 1.0 - 2.0 * (y_q * y_q + z_q * z_q)
        fwd_y = 2.0 * (x_q * y_q + w_q * z_q)
        vel = torch.zeros(env.num_envs, 6, device=device)
        vel[:, 0] = v_fwd * fwd_x  # world x linear
        vel[:, 1] = v_fwd * fwd_y  # world y linear
        # vel[:, 2] = 0  # z: let physics handle gravity/contact
        vel[:, 5] = omega  # world z angular (yaw)
        robot.write_root_velocity_to_sim(vel)

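The `fwd_x`/`fwd_y` expressions above are the first column of the quaternion's rotation matrix, i.e. the chassis +X axis expressed in the world frame. A NumPy-only sanity check with a known 90° yaw (the angle is chosen purely for illustration):

```python
import numpy as np

def forward_xy_from_quat_wxyz(q):
    """First column of the rotation matrix: the body +X axis in the world frame."""
    w, x, y, z = q
    return np.array([1.0 - 2.0 * (y * y + z * z), 2.0 * (x * y + w * z)])

# A pure yaw of 90 deg about Z: q = [cos(t/2), 0, 0, sin(t/2)] in wxyz order.
theta = np.pi / 2
q_wxyz = [np.cos(theta / 2), 0.0, 0.0, np.sin(theta / 2)]
# Forward should now point along world +Y.
print(np.allclose(forward_xy_from_quat_wxyz(q_wxyz), [0.0, 1.0]))  # True
```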
    obs, _ = env.reset()
    clear_ik_target_state()
    teleop_interface.reset()

    # Read contact sensor data directly from OmniGraph node USD attributes.
    # The USD has action graphs that execute each physics step and write outputs to:
    #   outputs:inContact (bool)   outputs:value (float, Newtons)
    # /root in USD → /World/envs/env_0/Robot in the loaded scene.
    _CONTACT_OG_NODES = {
        "left_r": (
            "/World/envs/env_0/Robot"
            "/l_zhuajia___m2_1_cs/isaac_read_contact_sensor_node"
        ),
        "left_l": (
            "/World/envs/env_0/Robot"
            "/l_zhuajia___m3_1_cs/isaac_read_contact_sensor_node"
        ),
    }

    def _og_contact_reading(stage, node_path: str) -> tuple[bool, float]:
        """Read one OmniGraph contact node output via the USD attribute API."""
        prim = stage.GetPrimAtPath(node_path)
        if not prim.IsValid():
            return False, float("nan")
        in_contact_attr = prim.GetAttribute("outputs:inContact")
        force_attr = prim.GetAttribute("outputs:value")
        in_contact = in_contact_attr.Get() if in_contact_attr.IsValid() else False
        force = force_attr.Get() if force_attr.IsValid() else 0.0
        return bool(in_contact), float(force or 0.0)

    # One-time diagnostic: run after the first env.reset() so all prims exist
    def _diag_contact_nodes() -> None:
        import omni.usd
        stage = omni.usd.get_context().get_stage()
        print("\n[DIAG] OmniGraph contact node check:")
        for name, path in _CONTACT_OG_NODES.items():
            prim = stage.GetPrimAtPath(path)
            if prim.IsValid():
                has_in = prim.GetAttribute("outputs:inContact").IsValid()
                has_val = prim.GetAttribute("outputs:value").IsValid()
                print(f"  [{name}] ✓ prim valid | outputs:inContact={has_in} outputs:value={has_val}")
            else:
                # Try to find similar prims to hint at the correct path
                parent_path = "/".join(path.split("/")[:-1])
                parent = stage.GetPrimAtPath(parent_path)
                print(f"  [{name}] ✗ NOT FOUND at: {path}")
                if parent.IsValid():
                    children = [c.GetName() for c in parent.GetChildren()]
                    print(f"    parent exists, children: {children}")
                else:
                    gp_path = "/".join(path.split("/")[:-2])
                    gp = stage.GetPrimAtPath(gp_path)
                    print(f"    grandparent '{gp_path}' valid={gp.IsValid()}")

    def read_gripper_contact() -> str:
        """Return a contact status string by reading OmniGraph node USD attributes."""
        try:
            import omni.usd
            stage = omni.usd.get_context().get_stage()
            c_r, f_r = _og_contact_reading(stage, _CONTACT_OG_NODES["left_r"])
            c_l, f_l = _og_contact_reading(stage, _CONTACT_OG_NODES["left_l"])
            if f_r != f_r:  # NaN → prim not found
                return "prim not found (run with diag)"
            sym_r = "✓" if c_r else "✗"
            sym_l = "✓" if c_l else "✗"
            return f"{sym_r}R={f_r:.1f}N {sym_l}L={f_l:.1f}N"
        except Exception as _e:
            return f"N/A ({_e})"

    # One-time diagnostic: verify the OmniGraph contact sensor node paths
    _diag_contact_nodes()

    # Create debug viewport windows (always on)
    create_robot_viewports()

    if is_dual_arm:
        teleop_right_ref.reset()

    print("\n" + "=" * 50)
    print("  Teleoperation Started!")
    if is_dual_arm:
        print("  LEFT controller  → left arm")
        print("  RIGHT controller → right arm")
    print("  TRIGGER: open/close gripper")
    print("  GRIP (hold): move the arm")
    print("  Left joystick: Y=forward/back, X=turn")
    print("  Y (left controller): reset environment")
    print("=" * 50 + "\n")

    device = env.unwrapped.device
    sim_frame = 0
    obs, _ = env.reset()
    clear_ik_target_state()
    last_root_left = None
    last_root_right = None
    neutral_head_rot = None  # calibrated on the first headset sample after reset

    # trunk_action uses use_default_offset=False, so the action is an absolute position target.
    trunk_cmd = torch.tensor([0.1], dtype=torch.float32)

    while simulation_app.is_running():
        try:
            with torch.inference_mode():
                if should_reset:
                    obs, _ = env.reset()
                    clear_ik_target_state()
                    teleop_interface.reset()
                    if is_dual_arm:
                        teleop_right_ref.reset()
                    rebind_robot_viewports()
                    should_reset = False
                    sim_frame = 0
                    last_root_left = None
                    last_root_right = None
                    neutral_head_rot = None  # re-calibrate on the next headset sample
                    continue

                policy_obs = obs["policy"]
                wheel_np, v_fwd, omega_base = get_chassis_commands()

                # Head tracking: HMD yaw/pitch relative to the neutral pose
                head_cmd_np, neutral_head_rot = get_head_joint_targets(
                    teleop_interface.xr_client, neutral_head_rot
                )
                head_cmd = torch.tensor(head_cmd_np, dtype=torch.float32)

                if is_dual_arm:
                    eef_pos_left = policy_obs["eef_pos_left"][0].cpu().numpy()
                    eef_quat_left = policy_obs["eef_quat_left"][0].cpu().numpy()
                    eef_pos_right = policy_obs["eef_pos_right"][0].cpu().numpy()
                    eef_quat_right = policy_obs["eef_quat_right"][0].cpu().numpy()

                    left_action = teleop_left.advance(current_eef_pos=eef_pos_left, current_eef_quat=eef_quat_left)
                    right_action = teleop_right_ref.advance(current_eef_pos=eef_pos_right, current_eef_quat=eef_quat_right)

                    # Only recompute the root-frame IK target while grip is active; otherwise freeze the joints
                    if teleop_left.grip_active or last_root_left is None:
                        last_root_left = convert_action_world_to_root(left_action)[:7].clone()
                    if teleop_right_ref.grip_active or last_root_right is None:
                        last_root_right = convert_action_world_to_root(right_action)[:7].clone()

                    # Action layout depends on the task:
                    #   with head:    left_arm(7) | wheel(4) | left_gripper(1) | right_arm(7) | right_gripper(1) | head(2) | trunk(1) = 23D
                    #   without head: left_arm(7) | wheel(4) | left_gripper(1) | right_arm(7) | right_gripper(1) = 20D
                    parts = [
                        last_root_left, wheel_np, left_action[7:8],
                        last_root_right, right_action[7:8],
                    ]
                    if has_head:
                        parts.append(head_cmd)
                        parts.append(trunk_cmd)
                    action_np = torch.cat(parts)
                else:
                    eef_pos = policy_obs["eef_pos"][0].cpu().numpy()
                    eef_quat = policy_obs["eef_quat"][0].cpu().numpy()

                    action_np = teleop_interface.advance(current_eef_pos=eef_pos, current_eef_quat=eef_quat)

                    if teleop_interface.grip_active or last_root_left is None:
                        last_root_left = convert_action_world_to_root(action_np)[:7].clone()

                    # Action layout: arm(7) | wheel(4) | gripper(1)
                    action_np = torch.cat([last_root_left, wheel_np, action_np[7:8]])

                actions = action_np.unsqueeze(0).repeat(env.num_envs, 1).to(device)
                obs, _, _, _, _ = env.step(actions)
                # Direct base velocity override: bypasses the isotropic friction limitation
                # so skid-steer turning works correctly.
                apply_base_velocity(v_fwd, omega_base)

                # Stream sim stereo to the VR headset (if enabled)
                if streamer is not None:
                    try:
                        scene = env.unwrapped.scene
                        left_rgb = scene["vr_left_eye"].data.output["rgb"][0].cpu().numpy()
                        right_rgb = scene["vr_right_eye"].data.output["rgb"][0].cpu().numpy()
                        if not streamer.send_frame(left_rgb, right_rgb):
                            logger.warning("[Stream] Connection lost. Disabling streaming.")
                            streamer = None
                    except Exception as e:
                        if sim_frame % 300 == 0:
                            logger.warning(f"[Stream] Frame error: {e}")

                sim_frame += 1
                if sim_frame % 30 == 0:
                    np.set_printoptions(precision=4, suppress=True, floatmode='fixed')
                    policy_obs = obs["policy"]
                    joint_pos = policy_obs["joint_pos"][0].cpu().numpy()
                    last_act = policy_obs["actions"][0].cpu().numpy()

                    if sim_frame == 30:
                        robot = env.unwrapped.scene["robot"]
                        jnames = robot.joint_names
                        print(f"\n{'='*70}")
                        print(f"  ALL {len(jnames)} JOINT NAMES AND POSITIONS")
                        print(f"{'='*70}")
                        for i, name in enumerate(jnames):
                            print(f"  [{i:2d}] {name:30s} = {joint_pos[i]:+.4f}")
                        arm_idx = [i for i, n in enumerate(jnames) if n.startswith("l_joint")]
                        print(f"  Deduced left arm indices: {arm_idx}")
                        print(f"{'='*70}\n")

                    if not hasattr(env, '_arm_idx_cache'):
                        robot = env.unwrapped.scene["robot"]
                        jnames = robot.joint_names
                        env._arm_idx_cache = [i for i, n in enumerate(jnames) if n.startswith("l_joint")]
                        env._right_arm_idx_cache = [i for i, n in enumerate(jnames) if n.startswith("r_joint")]
                    arm_joints = joint_pos[env._arm_idx_cache]
                    right_arm_joints = joint_pos[env._right_arm_idx_cache]

                    try:
                        joy_dbg = teleop_interface.xr_client.get_joystick("left")
                        joy_str = f"[{joy_dbg[0]:+.2f}, {joy_dbg[1]:+.2f}]"
                    except Exception:
                        joy_str = "N/A"

                    if is_dual_arm:
                        eef_pos_left = policy_obs["eef_pos_left"][0].cpu().numpy()
                        eef_quat_left = policy_obs["eef_quat_left"][0].cpu().numpy()
                        eef_pos_right = policy_obs["eef_pos_right"][0].cpu().numpy()

                        print(f"\n---------------- [ROBOT STATE frame={sim_frame}] ----------------")
                        print(f"| Left Arm Joints (rad):  {arm_joints}")
                        print(f"| Right Arm Joints (rad): {right_arm_joints}")
                        print(f"| Left EEF Pos (world, m):  {eef_pos_left}")
                        print(f"| Right EEF Pos (world, m): {eef_pos_right}")
                        print(f"| Left Gripper Contact: {read_gripper_contact()}")
                        print(f"| Cmd left_arm(abs):  {last_act[:7]}")
                        print(f"| Cmd wheel vel (rad/s): {last_act[7:11]}")
                        print(f"| Cmd left_gripper:  {last_act[11:12]} (+1=open -1=close)")
                        print(f"| Cmd right_arm(abs): {last_act[12:19]}")
                        print(f"| Cmd right_gripper: {last_act[19:]} (+1=open -1=close)")
                        print(f"| Joystick (x=turn,y=fwd): {joy_str}")
                        print("----------------------------------------------------------------")
                    else:
                        eef_pos = policy_obs["eef_pos"][0].cpu().numpy()
                        eef_quat = policy_obs["eef_quat"][0].cpu().numpy()

                        if not hasattr(env, '_prev_eef_quat'):
                            env._prev_eef_quat = eef_quat.copy()
                        prev_q = env._prev_eef_quat
                        r_prev = R.from_quat([prev_q[1], prev_q[2], prev_q[3], prev_q[0]])
                        r_curr = R.from_quat([eef_quat[1], eef_quat[2], eef_quat[3], eef_quat[0]])
                        delta_rotvec = (r_curr * r_prev.inv()).as_rotvec()
                        delta_angle_deg = np.degrees(np.linalg.norm(delta_rotvec))
                        delta_axis = delta_rotvec / (np.linalg.norm(delta_rotvec) + 1e-9)
                        env._prev_eef_quat = eef_quat.copy()
                        approach_vec = r_curr.apply(np.array([1.0, 0.0, 0.0]))

                        print(f"\n---------------- [ROBOT STATE frame={sim_frame}] ----------------")
                        print(f"| Left Arm Joints (rad): {arm_joints}")
                        print(f"| EEF Pos (world, m): {eef_pos}")
                        print(f"| EEF approach vec (world): [{approach_vec[0]:+.3f}, {approach_vec[1]:+.3f}, {approach_vec[2]:+.3f}] (gripper facing direction)")
                        print(f"| EEF rot since last print: {delta_angle_deg:+.2f}° around [{delta_axis[0]:+.3f},{delta_axis[1]:+.3f},{delta_axis[2]:+.3f}] (world)")
                        print(f"| Cmd arm(abs pos+quat): {last_act[:7]}")
                        print(f"| Cmd wheel vel (rad/s): {last_act[7:11]}")
                        print(f"| Cmd gripper: {last_act[11:]} (+1=open -1=close)")
                        print(f"| Joystick (x=turn,y=fwd): {joy_str}")
                        print("----------------------------------------------------------------")

        except Exception as e:
            logger.error(f"Error during simulation step: {e}")
            break

    if streamer is not None:
        streamer.close()
    teleop_interface.close()
    if is_dual_arm:
        teleop_right_ref.close()
        shared_client.close()
    env.close()


if __name__ == "__main__":
    main()
    simulation_app.close()
31 scripts/environments/teleoperation/xr_teleop/__init__.py (new file)
@@ -0,0 +1,31 @@
# SPDX-FileCopyrightText: Copyright (c) 2022-2025, The Isaac Lab Project Developers.
# SPDX-License-Identifier: BSD-3-Clause

"""XR teleoperation package for MindBot.

Agent hierarchy:
    BaseTeleopAgent → SingleArmXrAgent (8D)
    BaseTeleopAgent → DualArmXrAgent (20D) → DualArmHeadXrAgent (23D)
"""

from .base_agent import BaseTeleopAgent
from .single_arm_agent import SingleArmXrAgent
from .dual_arm_agent import DualArmXrAgent
from .dual_arm_head_agent import DualArmHeadXrAgent
from .xr_controller import XrTeleopController
from .chassis import ChassisController
from .head_tracker import HeadTracker
from .streaming import StreamingManager
from .diagnostics import DiagnosticsReporter

__all__ = [
    "BaseTeleopAgent",
    "SingleArmXrAgent",
    "DualArmXrAgent",
    "DualArmHeadXrAgent",
    "XrTeleopController",
    "ChassisController",
    "HeadTracker",
    "StreamingManager",
    "DiagnosticsReporter",
]
136 scripts/environments/teleoperation/xr_teleop/base_agent.py (new file)
@@ -0,0 +1,136 @@
# Copyright (c) 2025, Yutang Li, SIAT (yt.li2@siat.ac.cn)
# SPDX-License-Identifier: BSD-3-Clause

"""Base teleoperation agent with main loop, env management, and reset handling."""

from __future__ import annotations

import logging
from abc import ABC, abstractmethod

import torch

logger = logging.getLogger(__name__)


class BaseTeleopAgent(ABC):
    """Abstract base for XR teleoperation agents.

    Provides the main simulation loop, env lifecycle, and reset handling.
    Subclasses implement assemble_action() to define their control dimensions.
    """

    def __init__(self, env, simulation_app, *, debug_viewports: bool = True):
        self.env = env
        self.simulation_app = simulation_app
        self.device = env.unwrapped.device
        self.num_envs = env.num_envs
        self._should_reset = False
        self.sim_frame = 0

        # Viewport management
        self._robot_viewports: dict[str, object] = {}
        self._robot_cam_prims: dict[str, str] = {
            "Left Hand": "/World/envs/env_0/Robot/rm_65_fb_left/Link_6/robot___left_8_02/cam_left_hand",
            "Right Hand": "/World/envs/env_0/Robot/rm_65_b_right/Link6/robot___right_8_2/cam_right_hand",
            "Head": "/World/envs/env_0/Robot/robot_head/ZED_X/base_link/ZED_X/CameraLeft",
            "Chest": "/World/envs/env_0/Robot/robot_trunk/cam_chest",
        }
        if debug_viewports:
            self._create_viewports()

    def request_reset(self):
        self._should_reset = True
        print("[INFO] Reset requested via XR button.")

    def _create_viewports(self):
        try:
            import omni.kit.viewport.utility as vp_util
        except ImportError:
            logger.warning("[Viewport] omni.kit.viewport.utility not available.")
            return
        for name, cam_path in self._robot_cam_prims.items():
            vp_win = vp_util.create_viewport_window(f"Robot {name} View", width=640, height=360)
            vp_win.viewport_api.camera_path = cam_path
            self._robot_viewports[name] = vp_win
            print(f"[INFO] Viewport 'Robot {name} View' bound to: {cam_path}")

    def _rebind_viewports(self):
        for name, vp_win in self._robot_viewports.items():
            vp_win.viewport_api.camera_path = self._robot_cam_prims[name]

    def _clear_ik_target_state(self):
        """Sync IK controller internals to current EEF pose to prevent jumps after reset."""
        action_terms = self.env.action_manager._terms
        for key in self._ik_action_term_names():
            if key in action_terms:
                term = action_terms[key]
                ee_pos_b, ee_quat_b = term._compute_frame_pose()
                term._raw_actions.zero_()
                term._processed_actions.zero_()
                term._ik_controller._command.zero_()
                term._ik_controller.ee_pos_des[:] = ee_pos_b
                term._ik_controller.ee_quat_des[:] = ee_quat_b

    def _ik_action_term_names(self) -> list[str]:
        """Return IK action term names to clear on reset. Override in subclass."""
        return ["arm_action"]

    def _do_reset(self, obs):
        """Execute reset sequence. Returns new obs."""
        obs, _ = self.env.reset()
        self._clear_ik_target_state()
        self._rebind_viewports()
        self._should_reset = False
        self.sim_frame = 0
        self.on_reset()
        return obs

    def on_reset(self):
        """Hook for subclasses to reset their own state."""
        pass

    @abstractmethod
    def assemble_action(self, obs) -> torch.Tensor:
        """Build the full action tensor for env.step(). Must be implemented by subclass."""
        ...

    def post_step(self, obs):
        """Hook called after env.step(). Subclasses can add streaming, diagnostics, etc."""
        pass

    def run(self):
        """Main simulation loop."""
        obs, _ = self.env.reset()
        self._clear_ik_target_state()
        self.on_reset()

        self._print_banner()

        while self.simulation_app.is_running():
            try:
                with torch.inference_mode():
                    if self._should_reset:
                        obs = self._do_reset(obs)
                        continue

                    action = self.assemble_action(obs)
                    actions = action.unsqueeze(0).repeat(self.num_envs, 1).to(self.device)
                    obs, _, _, _, _ = self.env.step(actions)

                    self.sim_frame += 1
                    self.post_step(obs)
            except Exception as e:
                logger.error(f"Error during simulation step: {e}")
                break

        self.cleanup()

    def _print_banner(self):
        print("\n" + "=" * 50)
        print(" Teleoperation Started!")
        print("=" * 50 + "\n")

    def cleanup(self):
        """Release resources. Override in subclass to close controllers, streamers, etc."""
        self.env.close()
scripts/environments/teleoperation/xr_teleop/chassis.py (new file, 58 lines)
@@ -0,0 +1,58 @@
# Copyright (c) 2025, Yutang Li, SIAT (yt.li2@siat.ac.cn)
# SPDX-License-Identifier: BSD-3-Clause

"""Chassis (mobile base) control via XR joystick."""

import torch


class ChassisController:
    """Converts left joystick input into wheel velocities and direct base velocity commands.

    Args:
        base_speed: Max wheel speed (rad/s) at full joystick.
        base_turn: Max wheel differential (rad/s) at full joystick.
        drive_speed: Max robot linear speed (m/s) for direct base control.
        drive_turn: Max robot yaw rate (rad/s) for direct base control.
    """

    def __init__(self, base_speed: float = 5.0, base_turn: float = 2.0,
                 drive_speed: float = 0.5, drive_turn: float = 1.5):
        self.base_speed = base_speed
        self.base_turn = base_turn
        self.drive_speed = drive_speed
        self.drive_turn = drive_turn

    def get_commands(self, xr_client) -> tuple[torch.Tensor, float, float]:
        """Read left joystick and return (wheel_cmd_4D, v_fwd_m_s, omega_rad_s)."""
        try:
            joy = xr_client.get_joystick("left")
            jy = float(joy[1])
            jx = float(joy[0])
        except Exception:
            return torch.zeros(4), 0.0, 0.0

        v_w = jy * self.base_speed
        omega_w = jx * self.base_turn
        right_vel = v_w - omega_w
        left_vel = v_w + omega_w
        wheel_cmd = torch.tensor([right_vel, left_vel, left_vel, right_vel], dtype=torch.float32)

        v_fwd = jy * self.drive_speed
        omega = -jx * self.drive_turn  # left push → positive yaw = left turn
        return wheel_cmd, v_fwd, omega

    @staticmethod
    def apply_base_velocity(robot, v_fwd: float, omega: float, num_envs: int, device) -> None:
        """Directly set robot root velocity to bypass isotropic friction for skid-steer turning."""
        if abs(v_fwd) < 1e-4 and abs(omega) < 1e-4:
            return
        rq = robot.data.root_quat_w  # [N, 4] wxyz
        w_q, x_q, y_q, z_q = rq[:, 0], rq[:, 1], rq[:, 2], rq[:, 3]
        fwd_x = 1.0 - 2.0 * (y_q * y_q + z_q * z_q)
        fwd_y = 2.0 * (x_q * y_q + w_q * z_q)
        vel = torch.zeros(num_envs, 6, device=device)
        vel[:, 0] = v_fwd * fwd_x
        vel[:, 1] = v_fwd * fwd_y
        vel[:, 5] = omega
        robot.write_root_velocity_to_sim(vel)
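The `fwd_x`/`fwd_y` terms in `apply_base_velocity` are the first column of the rotation matrix of the wxyz root quaternion, i.e. the base's +X axis expressed in world coordinates. A standalone sketch (not part of the repo) that checks the closed form against scipy:

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def forward_axis_from_wxyz(q):
    """Planar forward axis (world frame) from a wxyz quaternion, as in apply_base_velocity."""
    w, x, y, z = q
    return np.array([1.0 - 2.0 * (y * y + z * z), 2.0 * (x * y + w * z)])

# Compare against rotating [1, 0, 0] with scipy (scipy uses xyzw order).
q_wxyz = R.from_euler("z", 0.7).as_quat()[[3, 0, 1, 2]]  # reorder xyzw -> wxyz
fwd = forward_axis_from_wxyz(q_wxyz)
ref = R.from_euler("z", 0.7).apply([1.0, 0.0, 0.0])[:2]  # [cos 0.7, sin 0.7]
assert np.allclose(fwd, ref)
```

For a pure yaw rotation this reduces to `[cos(yaw), sin(yaw)]`, which is why the resulting root velocity always points along the chassis heading.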
scripts/environments/teleoperation/xr_teleop/diagnostics.py (new file, 98 lines)
@@ -0,0 +1,98 @@
# Copyright (c) 2025, Yutang Li, SIAT (yt.li2@siat.ac.cn)
# SPDX-License-Identifier: BSD-3-Clause

"""Periodic diagnostics reporter for teleoperation debugging."""

import numpy as np


class DiagnosticsReporter:
    """Prints robot state info every N frames during teleoperation."""

    def __init__(self, interval: int = 30, is_dual_arm: bool = False):
        self.interval = interval
        self.is_dual_arm = is_dual_arm
        self._left_arm_idx: list[int] | None = None
        self._right_arm_idx: list[int] | None = None
        self._trunk_idx: int | None = None
        self._joint_names_printed = False

    def _ensure_indices(self, robot):
        if self._left_arm_idx is None:
            jnames = robot.joint_names
            self._left_arm_idx = [i for i, n in enumerate(jnames) if n.startswith("l_joint")]
            self._right_arm_idx = [i for i, n in enumerate(jnames) if n.startswith("r_joint")]
            self._trunk_idx = jnames.index("PrismaticJoint") if "PrismaticJoint" in jnames else None

    def report(self, env, obs, frame: int, last_act: np.ndarray | None = None,
               xr_client=None, read_gripper_contact=None, trunk_target: float | None = None):
        """Print diagnostics if frame is on an interval boundary."""
        if frame % self.interval != 0:
            return

        np.set_printoptions(precision=4, suppress=True, floatmode='fixed')
        policy_obs = obs["policy"]
        joint_pos = policy_obs["joint_pos"][0].cpu().numpy()
        if last_act is None:
            last_act = policy_obs["actions"][0].cpu().numpy()

        robot = env.unwrapped.scene["robot"]
        self._ensure_indices(robot)

        # Print full joint table on first report
        if not self._joint_names_printed:
            jnames = robot.joint_names
            print(f"\n{'=' * 70}")
            print(f" ALL {len(jnames)} JOINT NAMES AND POSITIONS")
            print(f"{'=' * 70}")
            for i, name in enumerate(jnames):
                print(f" [{i:2d}] {name:30s} = {joint_pos[i]:+.4f}")
            print(f" Deduced left arm indices: {self._left_arm_idx}")
            print(f"{'=' * 70}\n")
            self._joint_names_printed = True

        arm_joints = joint_pos[self._left_arm_idx]
        right_arm_joints = joint_pos[self._right_arm_idx]

        joy_str = "N/A"
        if xr_client is not None:
            try:
                joy_dbg = xr_client.get_joystick("left")
                joy_str = f"[{joy_dbg[0]:+.2f}, {joy_dbg[1]:+.2f}]"
            except Exception:
                pass

        if self.is_dual_arm:
            eef_pos_left = policy_obs["eef_pos_left"][0].cpu().numpy()
            eef_pos_right = policy_obs["eef_pos_right"][0].cpu().numpy()
            contact_str = read_gripper_contact() if read_gripper_contact else "N/A"

            print(f"\n---------------- [ROBOT STATE frame={frame}] ----------------")
            print(f"| Left Arm Joints (rad): {arm_joints}")
            print(f"| Right Arm Joints (rad): {right_arm_joints}")
            print(f"| Left EEF Pos (world, m): {eef_pos_left}")
            print(f"| Right EEF Pos (world, m): {eef_pos_right}")
            print(f"| Left Gripper Contact: {contact_str}")
            print(f"| Cmd left_arm(abs): {last_act[:7]}")
            print(f"| Cmd wheel vel (rad/s): {last_act[7:11]}")
            print(f"| Cmd left_gripper: {last_act[11:12]} (+1=open -1=close)")
            print(f"| Cmd right_arm(abs): {last_act[12:19]}")
            print(f"| Cmd right_gripper: {last_act[19:]} (+1=open -1=close)")
            print(f"| Joystick (x=turn,y=fwd): {joy_str}")
            if self._trunk_idx is not None:
                trunk_pos = robot.data.joint_pos[0, self._trunk_idx].item()
                trunk_vel = robot.data.joint_vel[0, self._trunk_idx].item()
                trunk_force = robot.data.applied_torque[0, self._trunk_idx].item() if hasattr(robot.data, 'applied_torque') else float('nan')
                trunk_tgt_str = f"{trunk_target:.4f}" if trunk_target is not None else "N/A"
                print(f"| Trunk pos={trunk_pos:.4f} vel={trunk_vel:.6f} target={trunk_tgt_str} force={trunk_force:.2f}")
            print(f"----------------------------------------------------------------")
        else:
            eef_pos = policy_obs["eef_pos"][0].cpu().numpy()

            print(f"\n---------------- [ROBOT STATE frame={frame}] ----------------")
            print(f"| Left Arm Joints (rad): {arm_joints}")
            print(f"| EEF Pos (world, m): {eef_pos}")
            print(f"| Cmd arm(abs pos+quat): {last_act[:7]}")
            print(f"| Cmd wheel vel (rad/s): {last_act[7:11]}")
            print(f"| Cmd gripper: {last_act[11:]} (+1=open -1=close)")
            print(f"| Joystick (x=turn,y=fwd): {joy_str}")
            print(f"----------------------------------------------------------------")
scripts/environments/teleoperation/xr_teleop/dual_arm_agent.py (new file, 139 lines)
@@ -0,0 +1,139 @@
# Copyright (c) 2025, Yutang Li, SIAT (yt.li2@siat.ac.cn)
# SPDX-License-Identifier: BSD-3-Clause

"""Dual-arm XR teleoperation agent with chassis control."""

from __future__ import annotations

import torch

from xr_utils import XrClient

from .base_agent import BaseTeleopAgent
from .xr_controller import XrTeleopController
from .chassis import ChassisController
from .frame_utils import world_pose_to_root_frame


class DualArmXrAgent(BaseTeleopAgent):
    """Dual-arm teleoperation with chassis joystick control.

    Action: left_arm(7) | wheel(4) | left_grip(1) | right_arm(7) | right_grip(1) = 20D
    """

    def __init__(self, env, simulation_app, *,
                 pos_sensitivity: float = 1.0,
                 rot_sensitivity: float = 0.3,
                 base_speed: float = 5.0,
                 base_turn: float = 2.0,
                 drive_speed: float = 0.5,
                 drive_turn: float = 1.5,
                 debug_viewports: bool = True):
        super().__init__(env, simulation_app, debug_viewports=debug_viewports)

        self.shared_client = XrClient()
        self.teleop_left = XrTeleopController(
            arm="left", pos_sensitivity=pos_sensitivity,
            rot_sensitivity=rot_sensitivity, xr_client=self.shared_client,
        )
        self.teleop_right = XrTeleopController(
            arm="right", pos_sensitivity=pos_sensitivity,
            rot_sensitivity=rot_sensitivity, xr_client=self.shared_client,
        )
        self.teleop_left.add_callback("RESET", self.request_reset)

        self.chassis = ChassisController(
            base_speed=base_speed, base_turn=base_turn,
            drive_speed=drive_speed, drive_turn=drive_turn,
        )

        self._last_root_left = None
        self._last_root_right = None
        self._v_fwd = 0.0
        self._omega = 0.0

    @property
    def xr_client(self):
        return self.shared_client

    def _ik_action_term_names(self) -> list[str]:
        return ["left_arm_action", "right_arm_action"]

    def on_reset(self):
        self.teleop_left.reset()
        self.teleop_right.reset()
        self._last_root_left = None
        self._last_root_right = None

    def assemble_action(self, obs) -> torch.Tensor:
        policy_obs = obs["policy"]
        robot = self.env.unwrapped.scene["robot"]

        # Get robot root pose for world→body EEF conversion
        root_pos_w = robot.data.root_pos_w[0].cpu().numpy()
        root_quat_w = robot.data.root_quat_w[0].cpu().numpy()

        # Read chassis
        wheel_cmd, self._v_fwd, self._omega = self.chassis.get_commands(self.shared_client)

        # Convert EEF observations from world frame to body frame
        eef_pos_left_w = policy_obs["eef_pos_left"][0].cpu().numpy()
        eef_quat_left_w = policy_obs["eef_quat_left"][0].cpu().numpy()
        eef_pos_left_b, eef_quat_left_b = world_pose_to_root_frame(
            eef_pos_left_w, eef_quat_left_w, root_pos_w, root_quat_w)

        eef_pos_right_w = policy_obs["eef_pos_right"][0].cpu().numpy()
        eef_quat_right_w = policy_obs["eef_quat_right"][0].cpu().numpy()
        eef_pos_right_b, eef_quat_right_b = world_pose_to_root_frame(
            eef_pos_right_w, eef_quat_right_w, root_pos_w, root_quat_w)

        # XR controller works in body frame:
        # - current_eef is body frame (converted above)
        # - XR delta is treated as a body-frame delta directly (no rotation)
        #   because in VR teleop the user's visual frame tracks the robot heading
        # - output is body frame → send directly to IK
        left_action = self.teleop_left.advance(current_eef_pos=eef_pos_left_b, current_eef_quat=eef_quat_left_b)
        right_action = self.teleop_right.advance(current_eef_pos=eef_pos_right_b, current_eef_quat=eef_quat_right_b)

        # Grip clutch: target is already in body frame, use directly
        if self.teleop_left.grip_active or self._last_root_left is None:
            self._last_root_left = left_action[:7].clone()
        if self.teleop_right.grip_active or self._last_root_right is None:
            self._last_root_right = right_action[:7].clone()

        # Diagnostic: print once per second when left grip active
        _cnt = getattr(self, '_diag_cnt', 0) + 1
        self._diag_cnt = _cnt
        if self.teleop_left.grip_active and _cnt % 30 == 0:
            print(f"[FRAME DIAG] root_quat_w={root_quat_w}")
            print(f"  eef_left_world=({eef_pos_left_w[0]:.3f},{eef_pos_left_w[1]:.3f},{eef_pos_left_w[2]:.3f})")
            print(f"  eef_left_body =({eef_pos_left_b[0]:.3f},{eef_pos_left_b[1]:.3f},{eef_pos_left_b[2]:.3f})")
            print(f"  ik_cmd_body   =({self._last_root_left[0]:.3f},{self._last_root_left[1]:.3f},{self._last_root_left[2]:.3f})")

        # left_arm(7) | wheel(4) | left_grip(1) | right_arm(7) | right_grip(1)
        return torch.cat([
            self._last_root_left, wheel_cmd, left_action[7:8],
            self._last_root_right, right_action[7:8],
        ])

    def post_step(self, obs):
        # Apply direct base velocity override for skid-steer
        robot = self.env.unwrapped.scene["robot"]
        self.chassis.apply_base_velocity(robot, self._v_fwd, self._omega, self.num_envs, self.device)

    def cleanup(self):
        self.teleop_left.close()
        self.teleop_right.close()
        self.shared_client.close()
        super().cleanup()

    def _print_banner(self):
        print("\n" + "=" * 50)
        print(" Teleoperation Started!")
        print(" LEFT controller  -> left arm")
        print(" RIGHT controller -> right arm")
        print(" TRIGGER: open/close gripper")
        print(" GRIP (hold): move the arm")
        print(" Left joystick: Y=forward/back, X=turn")
        print(" Y (left controller): reset environment")
        print("=" * 50 + "\n")
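The 20D action layout documented in the class docstring also determines the slice boundaries used by the diagnostics printout. A hypothetical standalone helper (not part of the repo) that makes the decomposition explicit:

```python
import torch

# Layout: left_arm(7) | wheel(4) | left_grip(1) | right_arm(7) | right_grip(1) = 20D
SLICES = {
    "left_arm": slice(0, 7),
    "wheel": slice(7, 11),
    "left_grip": slice(11, 12),
    "right_arm": slice(12, 19),
    "right_grip": slice(19, 20),
}

def split_dual_arm_action(action: torch.Tensor) -> dict[str, torch.Tensor]:
    """Split a 20D DualArmXrAgent action into named components."""
    assert action.shape[-1] == 20, "expected a 20D dual-arm action"
    return {name: action[..., s] for name, s in SLICES.items()}

parts = split_dual_arm_action(torch.arange(20.0))
# parts["wheel"] → tensor([7., 8., 9., 10.])
```

Keeping the slices in one table like this avoids the magic indices (`last_act[:7]`, `last_act[7:11]`, ...) drifting out of sync when the action space changes.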
@@ -0,0 +1,91 @@
# Copyright (c) 2025, Yutang Li, SIAT (yt.li2@siat.ac.cn)
# SPDX-License-Identifier: BSD-3-Clause

"""Dual-arm + head tracking + trunk + stereo streaming XR agent."""

from __future__ import annotations

import torch

from .dual_arm_agent import DualArmXrAgent
from .head_tracker import HeadTracker
from .streaming import StreamingManager
from .diagnostics import DiagnosticsReporter


class DualArmHeadXrAgent(DualArmXrAgent):
    """Extends DualArmXrAgent with head tracking, trunk hold, and VR stereo streaming.

    Action: left_arm(7) | wheel(4) | left_grip(1) | right_arm(7) | right_grip(1) | head(2) | trunk(1) = 23D
    """

    def __init__(self, env, simulation_app, *,
                 pos_sensitivity: float = 1.0,
                 rot_sensitivity: float = 0.3,
                 base_speed: float = 5.0,
                 base_turn: float = 2.0,
                 drive_speed: float = 0.5,
                 drive_turn: float = 1.5,
                 stream_to: str | None = None,
                 stream_port: int = 12345,
                 stream_bitrate: int = 20_000_000,
                 debug_viewports: bool = True):
        super().__init__(
            env, simulation_app,
            pos_sensitivity=pos_sensitivity, rot_sensitivity=rot_sensitivity,
            base_speed=base_speed, base_turn=base_turn,
            drive_speed=drive_speed, drive_turn=drive_turn,
            debug_viewports=debug_viewports,
        )

        self.head_tracker = HeadTracker()

        # Read trunk initial position from ArticulationCfg (single source of truth)
        robot = env.unwrapped.scene["robot"]
        joint_names = robot.data.joint_names
        trunk_idx = joint_names.index("PrismaticJoint")
        trunk_init = robot.data.default_joint_pos[0, trunk_idx].item()
        self.trunk_cmd = torch.tensor([trunk_init], dtype=torch.float32)

        # Streaming
        self.streamer: StreamingManager | None = None
        if stream_to:
            scene = env.unwrapped.scene
            self.streamer = StreamingManager(stream_to, stream_port, scene, bitrate=stream_bitrate)

        # Diagnostics
        self.diagnostics = DiagnosticsReporter(interval=30, is_dual_arm=True)

    def on_reset(self):
        super().on_reset()
        self.head_tracker.reset()

    def assemble_action(self, obs) -> torch.Tensor:
        base_action = super().assemble_action(obs)

        # Head tracking
        head_targets = self.head_tracker.get_targets(self.shared_client)
        head_cmd = torch.tensor(head_targets, dtype=torch.float32)

        # base(20) | head(2) | trunk(1)
        return torch.cat([base_action, head_cmd, self.trunk_cmd])

    def post_step(self, obs):
        super().post_step(obs)

        # Stereo streaming
        if self.streamer is not None:
            scene = self.env.unwrapped.scene
            self.streamer.send(scene, self.sim_frame)

        # Diagnostics
        self.diagnostics.report(
            self.env, obs, self.sim_frame,
            xr_client=self.shared_client,
            trunk_target=self.trunk_cmd[0].item(),
        )

    def cleanup(self):
        if self.streamer is not None:
            self.streamer.close()
        super().cleanup()
scripts/environments/teleoperation/xr_teleop/frame_utils.py (new file, 55 lines)
@@ -0,0 +1,55 @@
# Copyright (c) 2025, Yutang Li, SIAT (yt.li2@siat.ac.cn)
# SPDX-License-Identifier: BSD-3-Clause

"""Coordinate frame utilities for XR teleoperation."""

import numpy as np
import torch
from scipy.spatial.transform import Rotation as R


def quat_wxyz_to_rotation(quat_wxyz: np.ndarray) -> R:
    """Convert Isaac-style wxyz quaternion to scipy Rotation."""
    return R.from_quat([quat_wxyz[1], quat_wxyz[2], quat_wxyz[3], quat_wxyz[0]])


def rotation_to_quat_wxyz(rot: R) -> np.ndarray:
    """Convert scipy Rotation to Isaac-style wxyz quaternion."""
    q = rot.as_quat()
    return np.array([q[3], q[0], q[1], q[2]])


def world_pose_to_root_frame(
    pos_w: np.ndarray,
    quat_wxyz: np.ndarray,
    root_pos_w: np.ndarray,
    root_quat_wxyz: np.ndarray,
) -> tuple[np.ndarray, np.ndarray]:
    """Express a world-frame pose in the robot root frame."""
    root_rot = quat_wxyz_to_rotation(root_quat_wxyz)
    pose_rot = quat_wxyz_to_rotation(quat_wxyz)
    pos_root = root_rot.inv().apply(pos_w - root_pos_w)
    quat_root = rotation_to_quat_wxyz(root_rot.inv() * pose_rot)
    return pos_root, quat_root


def convert_action_world_to_root(action_tensor: torch.Tensor, robot) -> torch.Tensor:
    """Convert absolute IK action from world frame to robot root frame.

    Args:
        action_tensor: 8D tensor [x, y, z, qw, qx, qy, qz, gripper] in world frame.
        robot: Isaac Lab Articulation object (env.unwrapped.scene["robot"]).

    Returns:
        8D tensor with pos/quat converted to robot root frame.
    """
    root_pos_w = robot.data.root_pos_w[0].detach().cpu().numpy()
    root_quat_w = robot.data.root_quat_w[0].detach().cpu().numpy()
    pos_root, quat_root = world_pose_to_root_frame(
        action_tensor[:3].numpy(), action_tensor[3:7].numpy(),
        root_pos_w, root_quat_w,
    )
    out = action_tensor.clone()
    out[:3] = torch.tensor(pos_root, dtype=torch.float32)
    out[3:7] = torch.tensor(quat_root, dtype=torch.float32)
    return out
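`world_pose_to_root_frame` applies the inverse root transform, so composing the result with the root pose in the other order must recover the input. A self-contained round-trip sanity check (the `_wxyz` helpers below re-implement the conversions from frame_utils for illustration):

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def _wxyz_to_rot(q):
    return R.from_quat([q[1], q[2], q[3], q[0]])  # scipy expects xyzw

def _rot_to_wxyz(rot):
    x, y, z, w = rot.as_quat()
    return np.array([w, x, y, z])

def world_pose_to_root_frame(pos_w, quat_wxyz, root_pos_w, root_quat_wxyz):
    root_rot = _wxyz_to_rot(root_quat_wxyz)
    pos_root = root_rot.inv().apply(pos_w - root_pos_w)
    quat_root = _rot_to_wxyz(root_rot.inv() * _wxyz_to_rot(quat_wxyz))
    return pos_root, quat_root

# Round trip: p_w = R_root p_b + t_root, q_w = q_root * q_b (up to quaternion sign).
pos_w = np.array([1.0, 2.0, 0.5])
quat_w = np.array([0.9238795, 0.0, 0.0, 0.3826834])       # 45 deg about Z (wxyz)
root_pos_w = np.array([0.5, 0.0, 0.0])
root_quat_w = np.array([0.7071068, 0.0, 0.0, 0.7071068])  # 90 deg about Z (wxyz)

pos_b, quat_b = world_pose_to_root_frame(pos_w, quat_w, root_pos_w, root_quat_w)
root_rot = _wxyz_to_rot(root_quat_w)
q_back = _rot_to_wxyz(root_rot * _wxyz_to_rot(quat_b))
assert np.allclose(root_rot.apply(pos_b) + root_pos_w, pos_w)
assert min(np.linalg.norm(q_back - quat_w), np.linalg.norm(q_back + quat_w)) < 1e-6
```

The sign-ambiguous quaternion comparison matters because q and -q encode the same rotation; downstream IK is unaffected either way.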
scripts/environments/teleoperation/xr_teleop/head_tracker.py (new file, 44 lines)
@@ -0,0 +1,44 @@
# Copyright (c) 2025, Yutang Li, SIAT (yt.li2@siat.ac.cn)
# SPDX-License-Identifier: BSD-3-Clause

"""Head tracking via XR headset (HMD) for robot head joint control."""

import numpy as np
from scipy.spatial.transform import Rotation as R

from xr_utils import transform_xr_pose, is_valid_quaternion


class HeadTracker:
    """Extracts yaw/pitch from HMD pose relative to a calibrated neutral orientation."""

    def __init__(self, yaw_limit: float = 1.57, pitch_limit: float = 0.52):
        self.yaw_limit = yaw_limit      # ±90°
        self.pitch_limit = pitch_limit  # ±30°
        self.neutral_rot: R | None = None

    def reset(self):
        """Clear neutral calibration; next call to get_targets() will re-calibrate."""
        self.neutral_rot = None

    def get_targets(self, xr_client) -> np.ndarray:
        """Return [yaw, pitch] in radians relative to the neutral HMD pose.

        The first call after reset() records the neutral orientation and returns zeros.
        """
        try:
            raw_pose = xr_client.get_pose("headset")
            if not is_valid_quaternion(raw_pose[3:]):
                return np.zeros(2, dtype=np.float32)
            _, head_world_quat_xyzw = transform_xr_pose(raw_pose[:3], raw_pose[3:])
            head_rot = R.from_quat(head_world_quat_xyzw)
            if self.neutral_rot is None:
                self.neutral_rot = head_rot
                return np.zeros(2, dtype=np.float32)
            rel_rot = head_rot * self.neutral_rot.inv()
            euler_zyx = rel_rot.as_euler("ZYX")
            yaw = float(np.clip(euler_zyx[0], -self.yaw_limit, self.yaw_limit))
            pitch = float(np.clip(euler_zyx[1], -self.pitch_limit, self.pitch_limit))
            return np.array([yaw, pitch], dtype=np.float32)
        except Exception:
            return np.zeros(2, dtype=np.float32)
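The yaw/pitch extraction in `get_targets()` is just an intrinsic ZYX Euler decomposition of the rotation relative to the calibrated neutral, followed by clamping to joint limits. A standalone sketch (hypothetical, not repo code) using the same limits:

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

YAW_LIMIT, PITCH_LIMIT = 1.57, 0.52  # rad, matching HeadTracker defaults

def yaw_pitch(neutral: R, head: R) -> np.ndarray:
    """Clamped [yaw, pitch] of `head` relative to `neutral`, as in HeadTracker.get_targets()."""
    euler_zyx = (head * neutral.inv()).as_euler("ZYX")  # intrinsic Z-Y-X: [yaw, pitch, roll]
    return np.array([
        np.clip(euler_zyx[0], -YAW_LIMIT, YAW_LIMIT),
        np.clip(euler_zyx[1], -PITCH_LIMIT, PITCH_LIMIT),
    ], dtype=np.float32)

neutral = R.identity()
head = R.from_euler("ZYX", [0.4, -1.0, 0.0])  # 0.4 rad left, 1.0 rad down
cmd = yaw_pitch(neutral, head)
# yaw passes through (0.4); pitch saturates at -PITCH_LIMIT (-0.52)
```

Clamping before sending the command means an exaggerated head motion never drives the robot head joints past their mechanical range.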
@@ -0,0 +1,60 @@
# SPDX-FileCopyrightText: Copyright (c) 2022-2025, The Isaac Lab Project Developers.
# SPDX-License-Identifier: BSD-3-Clause

"""Single-arm XR teleoperation agent."""

from __future__ import annotations

import torch

from .base_agent import BaseTeleopAgent
from .xr_controller import XrTeleopController
from .frame_utils import convert_action_world_to_root


class SingleArmXrAgent(BaseTeleopAgent):
    """Single-arm teleoperation: one XR controller -> one robot arm + gripper.

    Action: arm(7) | gripper(1) = 8D
    """

    def __init__(self, env, simulation_app, *,
                 arm: str = "left",
                 pos_sensitivity: float = 1.0,
                 rot_sensitivity: float = 0.3,
                 xr_client=None,
                 debug_viewports: bool = True):
        super().__init__(env, simulation_app, debug_viewports=debug_viewports)

        self.xr_client = xr_client
        self.teleop = XrTeleopController(
            arm=arm, pos_sensitivity=pos_sensitivity,
            rot_sensitivity=rot_sensitivity, xr_client=xr_client,
        )
        self.teleop.add_callback("RESET", self.request_reset)
        self._last_root = None

    def _ik_action_term_names(self) -> list[str]:
        return ["arm_action"]

    def on_reset(self):
        self.teleop.reset()
        self._last_root = None

    def assemble_action(self, obs) -> torch.Tensor:
        policy_obs = obs["policy"]
        eef_pos = policy_obs["eef_pos"][0].cpu().numpy()
        eef_quat = policy_obs["eef_quat"][0].cpu().numpy()

        action = self.teleop.advance(current_eef_pos=eef_pos, current_eef_quat=eef_quat)

        robot = self.env.unwrapped.scene["robot"]
        if self.teleop.grip_active or self._last_root is None:
            self._last_root = convert_action_world_to_root(action, robot)[:7].clone()

        # arm(7) | gripper(1)
        return torch.cat([self._last_root, action[7:8]])

    def cleanup(self):
        self.teleop.close()
        super().cleanup()
scripts/environments/teleoperation/xr_teleop/streaming.py (new file, 82 lines)
@@ -0,0 +1,82 @@
|
|||||||
|
# Copyright (c) 2025, Yutang Li, SIAT (yt.li2@siat.ac.cn)
|
||||||
|
# SPDX-License-Identifier: BSD-3-Clause
|
||||||
|
|
||||||
|
"""Sim-to-VR H264 stereo streaming manager."""
|
||||||
|
|
||||||
|
import logging
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
|
class StreamingManager:
|
||||||
|
"""Manages stereo camera → H264 → TCP streaming to VR headset.
|
||||||
|
|
||||||
|
    Wraps SimVideoStreamer with graceful error handling and auto-disable on failure.
    """

    def __init__(self, host: str, port: int, scene, bitrate: int = 20_000_000):
        self._streamer = None
        self._enabled = False

        if "vr_left_eye" not in scene.sensors:
            logger.warning("[Stream] vr_left_eye camera not found. Streaming disabled.")
            return

        try:
            from mindbot.utils.sim_video_streamer import SimVideoStreamer

            cam = scene["vr_left_eye"]
            self._streamer = SimVideoStreamer(
                host=host, port=port,
                width=cam.cfg.width, height=cam.cfg.height,
                fps=30, bitrate=bitrate,
            )
            if self._streamer.connect():
                self._enabled = True
                print(f"[INFO] Sim stereo streaming to {host}:{port} "
                      f"({cam.cfg.width * 2}x{cam.cfg.height} SBS @ 30fps)")
            else:
                logger.error("[Stream] Failed to connect. Streaming disabled.")
                self._streamer = None
        except ImportError:
            logger.error("[Stream] mindbot.utils.sim_video_streamer not found. pip install av?")
        except Exception as e:
            logger.error(f"[Stream] Init failed: {e}")

    @property
    def enabled(self) -> bool:
        return self._enabled

    _debug_saved = False

    def send(self, scene, frame_count: int = 0) -> None:
        """Send one stereo frame from scene cameras. Auto-disables on failure."""
        if not self._enabled:
            return
        try:
            left_rgb = scene["vr_left_eye"].data.output["rgb"][0].cpu().numpy()
            right_rgb = scene["vr_right_eye"].data.output["rgb"][0].cpu().numpy()

            # Save debug images once (frame 150, so rendering has had time to stabilize)
            if frame_count == 150 and not self._debug_saved:
                self._debug_saved = True
                try:
                    from PIL import Image
                    import numpy as np
                    Image.fromarray(left_rgb[..., :3]).save("/tmp/vr_left_debug.png")
                    Image.fromarray(right_rgb[..., :3]).save("/tmp/vr_right_debug.png")
                    sbs = np.concatenate([left_rgb, right_rgb], axis=1)
                    Image.fromarray(sbs[..., :3]).save("/tmp/vr_sbs_debug.png")
                    print(f"[STREAM DIAG] Saved debug frames: left={left_rgb.shape}, right={right_rgb.shape}")
                except Exception as e:
                    print(f"[STREAM DIAG] Save failed: {e}")

            if not self._streamer.send_frame(left_rgb, right_rgb):
                logger.warning("[Stream] Connection lost. Disabling streaming.")
                self._enabled = False
        except Exception as e:
            if frame_count % 300 == 0:
                logger.warning(f"[Stream] Frame error: {e}")

    def close(self):
        if self._streamer is not None:
            self._streamer.close()
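The side-by-side assembly done in `send()` above (and again inside `SimVideoStreamer.send_frame`) is a plain horizontal concatenation. A minimal sketch with toy resolutions (the real cameras in this branch render 2560x1440 per eye):

```python
import numpy as np

# Toy per-eye resolution for illustration only.
H, W = 4, 6
left = np.zeros((H, W, 3), dtype=np.uint8)        # left eye: all black
right = np.full((H, W, 3), 255, dtype=np.uint8)   # right eye: all white

# Side-by-side (SBS) assembly: concatenate along the width axis.
sbs = np.concatenate([left, right], axis=1)  # shape (H, 2*W, 3)
print(sbs.shape)  # → (4, 12, 3)
```

The left half of `sbs` is the left-eye image and the right half the right-eye image, which is exactly the layout the VR client expects for an SBS stream.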
191
scripts/environments/teleoperation/xr_teleop/xr_controller.py
Normal file
@@ -0,0 +1,191 @@
# Copyright (c) 2025, Yutang Li, SIAT (yt.li2@siat.ac.cn)
# SPDX-License-Identifier: BSD-3-Clause

"""XR teleoperation controller with absolute IK target accumulation and grip clutch."""

from collections.abc import Callable

import numpy as np
import torch
from scipy.spatial.transform import Rotation as R

from xr_utils import XrClient, transform_xr_pose, quat_diff_as_rotvec_xyzw, is_valid_quaternion

from .frame_utils import quat_wxyz_to_rotation, rotation_to_quat_wxyz


class XrTeleopController:
    """Teleop controller for PICO XR headset (absolute IK mode).

    Accumulates an absolute EEF target in Isaac Sim world frame.
    Grip button acts as clutch; B/Y buttons trigger environment reset.
    """

    def __init__(self, arm="left", pos_sensitivity=1.0, rot_sensitivity=0.3,
                 max_pos_per_step=0.05, max_rot_per_step=0.08, xr_client=None):
        self.xr_client = xr_client if xr_client is not None else XrClient()
        self._owns_client = xr_client is None
        self.pos_sensitivity = pos_sensitivity
        self.rot_sensitivity = rot_sensitivity
        self.max_pos_per_step = max_pos_per_step
        self.max_rot_per_step = max_rot_per_step
        self.arm = arm

        self.controller_name = "left_controller" if arm == "left" else "right_controller"
        self.grip_name = "left_grip" if arm == "left" else "right_grip"
        self.trigger_name = "left_trigger" if arm == "left" else "right_trigger"

        from xr_utils.geometry import R_HEADSET_TO_WORLD
        self.R_headset_world = R_HEADSET_TO_WORLD

        self.prev_xr_pos = None
        self.prev_xr_quat = None
        self.target_eef_pos = None
        self.target_eef_quat = None

        self.grip_active = False
        self.frame_count = 0
        self.reset_button_latched = False
        self.require_grip_reengage = False
        self.grip_engage_threshold = 0.8
        self.grip_release_threshold = 0.2
        self.callbacks = {}
        self._root_yaw_rad = 0.0  # current chassis heading, updated each frame

    def add_callback(self, name: str, func: Callable):
        self.callbacks[name] = func

    def reset(self):
        self.prev_xr_pos = None
        self.prev_xr_quat = None
        self.grip_active = False
        self.frame_count = 0
        self.target_eef_pos = None
        self.target_eef_quat = None
        self.require_grip_reengage = True

    def set_root_yaw(self, yaw_rad: float):
        """Update the chassis heading so XR deltas are in body frame."""
        self._root_yaw_rad = yaw_rad

    def close(self):
        if self._owns_client:
            self.xr_client.close()

    def get_zero_action(self, trigger, current_eef_pos=None, current_eef_quat=None) -> torch.Tensor:
        """Return 8D hold action [x, y, z, qw, qx, qy, qz, gripper] at frozen target pose."""
        action = torch.zeros(8)
        if self.target_eef_pos is not None and self.target_eef_quat is not None:
            action[:3] = torch.tensor(self.target_eef_pos.copy())
            action[3:7] = torch.tensor(self.target_eef_quat.copy())
        elif current_eef_pos is not None and current_eef_quat is not None:
            action[:3] = torch.tensor(current_eef_pos.copy())
            action[3:7] = torch.tensor(current_eef_quat.copy())
        else:
            action[3] = 1.0
        action[7] = 1.0 if trigger > 0.5 else -1.0
        return action

    def advance(self, current_eef_pos=None, current_eef_quat=None) -> torch.Tensor:
        """Read XR controller and return 8D absolute action [x, y, z, qw, qx, qy, qz, gripper].

        Position and quaternion are in Isaac Sim world frame.
        Caller must convert to robot root frame before sending to IK.
        """
        try:
            if self.arm == "left":
                reset_pressed = self.xr_client.get_button("Y")
            else:
                reset_pressed = False
            if reset_pressed and not self.reset_button_latched:
                if "RESET" in self.callbacks:
                    self.callbacks["RESET"]()
            self.reset_button_latched = reset_pressed
        except Exception:
            pass

        try:
            raw_pose = self.xr_client.get_pose(self.controller_name)
            grip = self.xr_client.get_key_value(self.grip_name)
            trigger = self.xr_client.get_key_value(self.trigger_name)
        except Exception:
            return self.get_zero_action(0.0, current_eef_pos, current_eef_quat)

        if not is_valid_quaternion(raw_pose[3:]):
            return self.get_zero_action(trigger, current_eef_pos, current_eef_quat)

        if self.require_grip_reengage:
            if grip <= self.grip_release_threshold:
                self.require_grip_reengage = False
            else:
                if current_eef_pos is not None and current_eef_quat is not None:
                    self.target_eef_pos = current_eef_pos.copy()
                    self.target_eef_quat = current_eef_quat.copy()
                return self.get_zero_action(trigger, current_eef_pos, current_eef_quat)

        grip_pressed = grip > self.grip_release_threshold if self.grip_active else grip >= self.grip_engage_threshold

        if not grip_pressed:
            self.prev_xr_pos = None
            self.prev_xr_quat = None
            if self.grip_active or self.target_eef_pos is None:
                if current_eef_pos is not None and current_eef_quat is not None:
                    self.target_eef_pos = current_eef_pos.copy()
                    self.target_eef_quat = current_eef_quat.copy()
            self.grip_active = False
            return self.get_zero_action(trigger, current_eef_pos, current_eef_quat)

        if not self.grip_active:
            self.prev_xr_pos = raw_pose[:3].copy()
            self.prev_xr_quat = raw_pose[3:].copy()
            if current_eef_pos is not None and current_eef_quat is not None:
                self.target_eef_pos = current_eef_pos.copy()
                self.target_eef_quat = current_eef_quat.copy()
            self.grip_active = True
            return self.get_zero_action(trigger, current_eef_pos, current_eef_quat)

        xr_world_pos, xr_world_quat_xyzw = transform_xr_pose(raw_pose[:3], raw_pose[3:])
        prev_xr_world_pos, prev_xr_world_quat_xyzw = transform_xr_pose(self.prev_xr_pos, self.prev_xr_quat)

        world_delta_pos = (xr_world_pos - prev_xr_world_pos) * self.pos_sensitivity

        # In VR teleop, the user "inhabits" the robot: their physical forward (XR +X)
        # maps directly to robot forward (body +X). No rotation needed — the VR view
        # already rotates with the robot, so XR deltas naturally align with body frame.

        # Diagnostic
        if self.frame_count % 30 == 0 and self.target_eef_pos is not None:
            print(f"[XR {self.arm}] delta=({world_delta_pos[0]:+.4f},{world_delta_pos[1]:+.4f},{world_delta_pos[2]:+.4f}) "
                  f"target=({self.target_eef_pos[0]:.3f},{self.target_eef_pos[1]:.3f},{self.target_eef_pos[2]:.3f})")

        pos_norm = np.linalg.norm(world_delta_pos)
        if pos_norm > self.max_pos_per_step:
            world_delta_pos *= self.max_pos_per_step / pos_norm

        world_delta_rot = quat_diff_as_rotvec_xyzw(prev_xr_world_quat_xyzw, xr_world_quat_xyzw) * self.rot_sensitivity
        rot_norm = np.linalg.norm(world_delta_rot)
        if rot_norm > self.max_rot_per_step:
            world_delta_rot *= self.max_rot_per_step / rot_norm

        gripper_action = 1.0 if trigger > 0.5 else -1.0

        if self.target_eef_pos is None:
            self.target_eef_pos = np.zeros(3)
            self.target_eef_quat = np.array([1.0, 0.0, 0.0, 0.0])

        self.target_eef_pos += world_delta_pos

        target_r = quat_wxyz_to_rotation(self.target_eef_quat)
        self.target_eef_quat = rotation_to_quat_wxyz(R.from_rotvec(world_delta_rot) * target_r)

        self.prev_xr_pos = raw_pose[:3].copy()
        self.prev_xr_quat = raw_pose[3:].copy()
        self.frame_count += 1

        action = torch.tensor([
            *self.target_eef_pos,
            *self.target_eef_quat,
            gripper_action,
        ], dtype=torch.float32)

        return action
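The grip clutch in `XrTeleopController` uses two thresholds (`grip_engage_threshold = 0.8`, `grip_release_threshold = 0.2`) rather than one, which gives hysteresis: the analog grip value must climb past 0.8 to engage, but once engaged it stays engaged until it drops below 0.2, so jitter near either threshold cannot toggle the clutch every frame. The `step` function below is a distilled sketch of the `grip_pressed` expression in `advance()`, not code from the file:

```python
ENGAGE, RELEASE = 0.8, 0.2  # thresholds from XrTeleopController

def step(active: bool, grip: float) -> bool:
    # Same one-liner as in advance(): release threshold while engaged,
    # engage threshold while disengaged.
    return grip > RELEASE if active else grip >= ENGAGE

active = False
trace = []
for grip in [0.5, 0.85, 0.5, 0.25, 0.1, 0.5]:
    active = step(active, grip)
    trace.append(active)
print(trace)  # → [False, True, True, True, False, False]
```

Note how the same grip value 0.5 keeps the clutch engaged mid-sequence but does not re-engage it after release.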
97
scripts/test_sim_streamer.py
Normal file
@@ -0,0 +1,97 @@
#!/usr/bin/env python3
"""Standalone test for SimVideoStreamer: generates a color-cycling SBS test pattern
and streams it to a TCP server (e.g. PICO VR headset).

Usage:
    # 1. On PICO: select SIM-Stereo, click Listen
    # 2. On PC:
    python scripts/test_sim_streamer.py --host 172.20.103.171 --port 12345

The PICO should display a side-by-side test pattern:
left half = cycling red, right half = cycling blue.
"""

import argparse
import time

import numpy as np


def main():
    parser = argparse.ArgumentParser(description="Test SimVideoStreamer with synthetic frames")
    parser.add_argument("--host", type=str, required=True, help="PICO IP address (TCP server)")
    parser.add_argument("--port", type=int, default=12345, help="TCP port")
    parser.add_argument("--width", type=int, default=960, help="Eye width")
    parser.add_argument("--height", type=int, default=540, help="Eye height")
    parser.add_argument("--fps", type=int, default=30, help="Target FPS")
    parser.add_argument("--duration", type=int, default=30, help="Stream duration in seconds")
    args = parser.parse_args()

    # Direct import to avoid pulling in the full mindbot/isaaclab dependency chain
    import importlib.util, pathlib
    _spec = importlib.util.spec_from_file_location(
        "sim_video_streamer",
        pathlib.Path(__file__).resolve().parents[1] / "source/mindbot/mindbot/utils/sim_video_streamer.py",
    )
    _mod = importlib.util.module_from_spec(_spec)
    _spec.loader.exec_module(_mod)
    SimVideoStreamer = _mod.SimVideoStreamer

    streamer = SimVideoStreamer(
        host=args.host,
        port=args.port,
        width=args.width,
        height=args.height,
        fps=args.fps,
    )

    print(f"Connecting to {args.host}:{args.port}...")
    if not streamer.connect():
        print("Failed to connect. Is PICO listening?")
        return

    print(f"Streaming {args.width*2}x{args.height} SBS test pattern for {args.duration}s...")
    frame_interval = 1.0 / args.fps
    start_time = time.time()
    frame_count = 0

    try:
        while time.time() - start_time < args.duration:
            t = time.time() - start_time

            # Generate test pattern: cycling colors
            # Left eye: red gradient with horizontal bars
            left = np.zeros((args.height, args.width, 3), dtype=np.uint8)
            red_val = int(128 + 127 * np.sin(t * 2.0))
            left[:, :, 0] = red_val  # R channel
            # Add horizontal position indicator
            bar_y = int((args.height - 20) * (0.5 + 0.5 * np.sin(t * 1.5)))
            left[bar_y:bar_y+20, :, :] = 255

            # Right eye: blue gradient with horizontal bars
            right = np.zeros((args.height, args.width, 3), dtype=np.uint8)
            blue_val = int(128 + 127 * np.sin(t * 2.0 + 1.0))
            right[:, :, 2] = blue_val  # B channel
            right[bar_y:bar_y+20, :, :] = 255  # same bar position for stereo reference

            if not streamer.send_frame(left, right):
                print("Connection lost.")
                break

            frame_count += 1
            # Maintain frame rate
            elapsed = time.time() - start_time
            expected = frame_count * frame_interval
            if expected > elapsed:
                time.sleep(expected - elapsed)

    except KeyboardInterrupt:
        print("\nStopped by user.")

    elapsed = time.time() - start_time
    print(f"Sent {frame_count} frames in {elapsed:.1f}s ({frame_count/elapsed:.1f} fps)")
    streamer.close()


if __name__ == "__main__":
    main()
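The wire format this test exercises is described in the `sim_video_streamer.py` docstring further down: each H264 NALU is sent with a 4-byte big-endian length prefix. The sketch below (my own illustration, not code from the repo) shows the framing in isolation, using an in-memory byte string in place of the TCP socket:

```python
import struct

def pack_nalu(nalu: bytes) -> bytes:
    # 4-byte big-endian length header, then the NALU payload —
    # the same struct.pack(">I", ...) framing the streamer uses.
    return struct.pack(">I", len(nalu)) + nalu

def unpack_stream(buf: bytes) -> list:
    # Receiver side: walk the stream, reading a length header then
    # that many payload bytes, until the buffer is exhausted.
    out, off = [], 0
    while off + 4 <= len(buf):
        (length,) = struct.unpack_from(">I", buf, off)
        off += 4
        out.append(buf[off:off + length])
        off += length
    return out

stream = pack_nalu(b"\x00\x00\x00\x01key") + pack_nalu(b"\x00\x00\x00\x01delta")
print(unpack_stream(stream))  # → [b'\x00\x00\x00\x01key', b'\x00\x00\x00\x01delta']
```

A real receiver would read from the socket until each full payload arrives, but the header arithmetic is the same.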
@@ -13,7 +13,7 @@ The following configurations are available:
 """

 import isaaclab.sim as sim_utils
-from isaaclab.actuators import ImplicitActuatorCfg
+from isaaclab.actuators import IdealPDActuatorCfg, ImplicitActuatorCfg
 from isaaclab.assets import ArticulationCfg

 from mindbot.utils.assets import MINDBOT_ASSETS_DIR
@@ -21,7 +21,7 @@ from mindbot.utils.assets import MINDBOT_ASSETS_DIR
 MINDBOT_CFG = ArticulationCfg(
     spawn=sim_utils.UsdFileCfg(
         # usd_path=f"{MINDBOT_ASSETS_DIR}/robots/mindrobot_2i/mindrobot_2i.usd",
-        usd_path=f"{MINDBOT_ASSETS_DIR}/robots/mindrobot_2i/contact_sensor.usd",
+        usd_path=f"{MINDBOT_ASSETS_DIR}/robots/mindbot_zed/mindbot_mini.usd",
         activate_contact_sensors=True,
         rigid_props=sim_utils.RigidBodyPropertiesCfg(
             disable_gravity=False,
@@ -29,7 +29,7 @@ MINDBOT_CFG = ArticulationCfg(
         ),
         articulation_props=sim_utils.ArticulationRootPropertiesCfg(
             fix_root_link=False,
-            enabled_self_collisions=True,
+            enabled_self_collisions=False,  # TODO: re-enable after fixing trunk self-collision
             solver_position_iteration_count=8,
             solver_velocity_iteration_count=0,
         ),
@@ -58,11 +58,14 @@ MINDBOT_CFG = ArticulationCfg(
             "right_hand_joint_left": 0.0,
             "right_hand_joint_right": 0.0,
             # ====== Trunk & Head ======
-            "PrismaticJoint": 0.26,
+            # NOTE: pos z = 0.44 (base clearance) + PrismaticJoint value.
+            # If you change PrismaticJoint, update pos z accordingly to avoid
+            # geometry clipping that causes the robot to fall on spawn.
+            "PrismaticJoint": 0.2,
             "head_pitch_Joint": 0.0,
             "head_yaw_Joint": 0.0,
         },
-        pos=(0.0, 0.0, 0.7),
+        pos=(0.001, 0.001, 0.7),  # z = 0.44 (base clearance) + 0.26 (PrismaticJoint)
     ),
     actuators={
         # RM-65 J1-J3: shoulder/elbow — max 180°/s ≈ 3.14 rad/s
@@ -104,12 +107,17 @@ MINDBOT_CFG = ArticulationCfg(
             stiffness=80.0,
             damping=4.0,
         ),
-        "trunk": ImplicitActuatorCfg(
+        # IdealPDActuator: explicitly computes F = k_p*(target-pos) + k_d*(target_vel-vel)
+        # and sends via set_dof_actuation_forces(), bypassing PhysX drive which
+        # doesn't work correctly for prismatic joints with set_dof_position_targets().
+        # NOTE: trunk gravity disabled by startup event. High stiffness to resist
+        # arm reaction forces (~150N) — deflection < 2mm at 100k stiffness.
+        "trunk": IdealPDActuatorCfg(
             joint_names_expr=["PrismaticJoint"],
-            effort_limit_sim=200.0,
-            velocity_limit_sim=0.2,
-            stiffness=1000.0,
-            damping=100.0,
+            effort_limit=2000.0,
+            velocity_limit=0.5,
+            stiffness=100000.0,
+            damping=5000.0,
         ),
         "wheels": ImplicitActuatorCfg(
             # joint_names_expr=[".*_revolute_Joint"],
@@ -168,6 +176,8 @@ MINDBOT_HEAD_JOIONTS =[
     "head_pitch_Joint"
 ]

+MINDBOT_TRUNK_JOINT = "PrismaticJoint"
+
 # EEF body names (as defined in the USD asset)
 MINDBOT_LEFT_EEF_BODY = "left_hand_body"
 MINDBOT_RIGHT_EEF_BODY = "right_hand_body"
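The stiffness change in the trunk hunk above can be sanity-checked with the PD law quoted in the diff comment, F = k_p*(target - pos) + k_d*(target_vel - vel): at steady state the spring term must balance the external load, so the deflection under the quoted ~150 N arm reaction is load / k_p. The function below is my own arithmetic sketch of that law (not the Isaac Lab `IdealPDActuator` implementation), using the gains and effort limit from the diff:

```python
def ideal_pd_force(kp, kd, target, pos, target_vel=0.0, vel=0.0, effort_limit=None):
    # F = k_p*(target - pos) + k_d*(target_vel - vel), clipped to the effort limit
    f = kp * (target - pos) + kd * (target_vel - vel)
    if effort_limit is not None:
        f = max(-effort_limit, min(effort_limit, f))
    return f

KP, KD, LIMIT = 100_000.0, 5_000.0, 2_000.0  # new trunk gains from the diff

# Steady-state deflection under a 150 N arm reaction force: load / k_p
deflection = 150.0 / KP          # 0.0015 m = 1.5 mm, consistent with the "< 2mm" comment
old_deflection = 150.0 / 1000.0  # the old stiffness=1000 would have sagged 150 mm

# Large position errors saturate at the effort limit rather than producing huge forces:
f_saturated = ideal_pd_force(KP, KD, 1.0, 0.0, effort_limit=LIMIT)
print(deflection, old_deflection, f_saturated)
```

This makes the 100x stiffness increase concrete: it is what shrinks the trunk sag under arm reaction forces from centimeters to millimeters.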
@@ -30,6 +30,7 @@ from isaaclab.managers import TerminationTermCfg as DoneTerm
 from isaaclab.markers.config import FRAME_MARKER_CFG
 from isaaclab.scene import InteractiveSceneCfg
 from isaaclab.sensors.frame_transformer.frame_transformer_cfg import FrameTransformerCfg, OffsetCfg
+from isaaclab.sensors import CameraCfg
 from isaaclab.sim.schemas.schemas_cfg import RigidBodyPropertiesCfg
 from isaaclab.sim.spawners.from_files.from_files_cfg import GroundPlaneCfg
 from isaaclab.sim.spawners.materials import RigidBodyMaterialCfg
@@ -48,6 +49,7 @@ from .mindrobot_2i_cfg import (
     MINDBOT_RIGHT_GRIPPER_JOINTS,
     MINDBOT_WHEEL_JOINTS,
     MINDBOT_HEAD_JOIONTS,
+    MINDBOT_TRUNK_JOINT,
     MINDBOT_LEFT_ARM_PRIM_ROOT,
     MINDBOT_RIGHT_ARM_PRIM_ROOT,
     MINDBOT_LEFT_EEF_PRIM,
@@ -96,6 +98,28 @@ class MindRobotDualArmSceneCfg(InteractiveSceneCfg):
     ee_frame: FrameTransformerCfg = None
     ee_frame_right: FrameTransformerCfg = None

+    # VR stereo cameras — reuse existing ZED_X CameraLeft/CameraRight prims from USD
+    # (spawn=None means use existing prim, don't create a new one)
+    # The ZED_X baseline in USD is 120mm; for perfect VR IPD we'd ideally set 65mm,
+    # but reusing existing prims avoids spawn issues. stereoOffset=0 in video_source.yml
+    # since sim rendering gives correct perspective regardless of physical baseline.
+    vr_left_eye: CameraCfg = CameraCfg(
+        prim_path="{ENV_REGEX_NS}/Robot/robot_head/ZED_X/base_link/ZED_X/CameraLeft",
+        update_period=1 / 30,
+        height=1440,  # 2560x1440 per eye → SBS 5120x1440; effective ~1920x1440 after shader crop
+        width=2560,
+        data_types=["rgb"],
+        spawn=None,
+    )
+    vr_right_eye: CameraCfg = CameraCfg(
+        prim_path="{ENV_REGEX_NS}/Robot/robot_head/ZED_X/base_link/ZED_X/CameraRight",
+        update_period=1 / 30,
+        height=1440,
+        width=2560,
+        data_types=["rgb"],
+        spawn=None,
+    )


 # =====================================================================
 # Actions Configuration
@@ -166,6 +190,13 @@ class MindRobotDualArmActionsCfg:
         scale=1.0,
     )

+    trunk_action = JointPositionActionCfg(
+        asset_name="robot",
+        joint_names=[MINDBOT_TRUNK_JOINT],
+        scale=1.0,
+        use_default_offset=False,  # action = absolute target, not offset from default
+    )


 # =====================================================================
 # Observations Configuration
@@ -236,12 +267,29 @@ def _disable_arm_gravity(env, env_ids: torch.Tensor):
         schemas.modify_rigid_body_properties(arm_path, arm_cfg)


+def _disable_trunk_head_gravity(env, env_ids: torch.Tensor):
+    """Disable gravity for trunk and head at startup."""
+    import isaaclab.sim.schemas as schemas
+
+    cfg = RigidBodyPropertiesCfg(disable_gravity=True)
+    for env_id in range(env.num_envs):
+        for prim_path in [
+            f"/World/envs/env_{env_id}/Robot/robot_trunk",
+            f"/World/envs/env_{env_id}/Robot/robot_head",
+        ]:
+            schemas.modify_rigid_body_properties(prim_path, cfg)
+
+
 @configclass
 class MindRobotDualArmEventsCfg:
     disable_arm_gravity = EventTerm(
         func=_disable_arm_gravity,
         mode="startup",
     )
+    disable_trunk_head_gravity = EventTerm(
+        func=_disable_trunk_head_gravity,
+        mode="startup",
+    )
     reset_scene = EventTerm(
         func=mdp.reset_scene_to_default,
         mode="reset",
@@ -292,7 +340,7 @@ class MindRobot2iDualArmIKAbsEnvCfg(ManagerBasedRLEnvCfg):
             lookat=(0.0, 0.0, 0.8),  # aim point (near the robot's waist height)
             origin_type="world",
         )
-        self.decimation = 2
+        self.decimation = 2  # 25 Hz control (was 2 = 50 Hz, unnecessary for teleop)
         self.episode_length_s = 50.0
         self.sim.dt = 0.01  # 100 Hz
         self.sim.render_interval = 2
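The timing knobs in the env-config hunks interact: physics steps at `sim.dt`, control acts every `decimation` physics steps, and rendering happens every `render_interval` steps. A quick arithmetic check of the resulting rates — note that with `sim.dt = 0.01` and `decimation = 2` the control rate works out to 50 Hz, so the "25 Hz" in the new diff comment would only hold for `decimation = 4`:

```python
sim_dt = 0.01          # physics step: 100 Hz, per the diff
decimation = 2         # control acts every N physics steps
render_interval = 2    # render every N physics steps

control_hz = 1.0 / (sim_dt * decimation)
render_hz = 1.0 / (sim_dt * render_interval)
print(round(control_hz), round(render_hz))  # → 50 50

# The 25 Hz mentioned in the comment would correspond to decimation = 4:
print(round(1.0 / (sim_dt * 4)))  # → 25
```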
@@ -56,10 +56,13 @@ MINDBOT_CFG = ArticulationCfg(
             "right_hand_joint_left": 0.0,
             "right_hand_joint_right": 0.0,
             # ====== Trunk & Head ======
+            # NOTE: pos z = 0.44 (base clearance) + PrismaticJoint value.
+            # If you change PrismaticJoint, update pos z accordingly to avoid
+            # geometry clipping that causes the robot to fall on spawn.
             "PrismaticJoint": 0.26,
             "head_revoluteJoint": 0.0,
         },
-        pos=(0.0, 0.0, 0.7),
+        pos=(0.0, 0.0, 0.7),  # z = 0.44 (base clearance) + 0.26 (PrismaticJoint)
     ),
     actuators={
         # RM-65 J1-J3: shoulder/elbow — max 180°/s ≈ 3.14 rad/s
@@ -283,7 +283,7 @@ class MindRobotDualArmIKAbsEnvCfg(ManagerBasedRLEnvCfg):
             lookat=(0.0, 0.0, 0.8),  # aim point (near the robot's waist height)
             origin_type="world",
         )
-        self.decimation = 2
+        self.decimation = 2  # 25 Hz control (was 2 = 50 Hz, unnecessary for teleop)
         self.episode_length_s = 50.0
         self.sim.dt = 0.01  # 100 Hz
         self.sim.render_interval = 2
@@ -20,7 +20,7 @@ import os
 ##
 MINDBOT_ASSETS_DIR: str = os.environ.get(
     "MINDBOT_ASSETS_DIR",
-    # "/home/tangger/LYT/maic_usd_assets_moudle",
+    "/home/tangger/LYT/maic_usd_assets_moudle",
-    "/home/maic/xh/maic_usd_assets_moudle"
+    # "/home/maic/xh/maic_usd_assets_moudle"
     # "/home/maic/LYT/maic_usd_assets_moudle",
 )
238
source/mindbot/mindbot/utils/sim_video_streamer.py
Normal file
@@ -0,0 +1,238 @@
|
|||||||
|
# SPDX-FileCopyrightText: Copyright (c) 2022-2025, The Isaac Lab Project Developers.
|
||||||
|
# SPDX-License-Identifier: BSD-3-Clause
|
||||||
|
|
||||||
|
"""Simulation stereo video streamer for VR headset.
|
||||||
|
|
||||||
|
Encodes stereo RGB frames (left + right eye) as H264 side-by-side video
|
||||||
|
and streams to the Unity VR client over TCP, using the same 4-byte
|
||||||
|
big-endian length-prefixed NALU protocol as RobotVisionConsole.
|
||||||
|
|
||||||
|
Usage:
|
||||||
|
streamer = SimVideoStreamer("172.20.103.171", 12345, width=960, height=540)
|
||||||
|
streamer.connect()
|
||||||
|
# In the sim loop:
|
||||||
|
streamer.send_frame(left_rgb_np, right_rgb_np)
|
||||||
|
# Cleanup:
|
||||||
|
streamer.close()
|
||||||
|
|
||||||
|
The Unity client (PICO) acts as TCP SERVER on the specified port.
|
||||||
|
This streamer connects as TCP CLIENT — same role as RobotVisionConsole.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import logging
|
||||||
|
import socket
|
||||||
|
import struct
|
||||||
|
import threading
|
||||||
|
import time
|
||||||
|
from typing import Optional
|
||||||
|
|
||||||
|
import av
|
||||||
|
import numpy as np
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
|
class SimVideoStreamer:
|
||||||
|
"""H264 stereo video streamer over TCP for VR headset display.
|
||||||
|
|
||||||
|
Accepts left/right RGB numpy arrays, assembles a side-by-side frame,
|
||||||
|
encodes to H264 with PyAV, and sends each NALU packet prefixed with
|
||||||
|
a 4-byte big-endian length header over a TCP socket.
|
||||||
|
"""
|
||||||
|
|
||||||
|
def __init__(
|
||||||
|
self,
|
||||||
|
host: str,
|
||||||
|
port: int = 12345,
|
||||||
|
width: int = 960,
|
||||||
|
height: int = 540,
|
||||||
|
fps: int = 30,
|
||||||
|
bitrate: int = 20_000_000,
|
||||||
|
):
|
||||||
|
"""Initialize the streamer.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
host: IP address of the PICO VR headset (TCP server).
|
||||||
|
port: TCP port the PICO is listening on.
|
||||||
|
width: Width of EACH eye image (SBS output = width*2 x height).
|
||||||
|
height: Height of each eye image.
|
||||||
|
fps: Target frame rate for H264 encoding.
|
||||||
|
bitrate: H264 encoding bitrate in bits per second.
|
||||||
|
"""
|
||||||
|
self.host = host
|
||||||
|
self.port = port
|
||||||
|
self.eye_width = width
|
||||||
|
self.eye_height = height
|
||||||
|
self.sbs_width = width * 2
|
||||||
|
self.sbs_height = height
|
||||||
|
self.fps = fps
|
||||||
|
self.bitrate = bitrate
|
||||||
|
|
||||||
|
self._sock: Optional[socket.socket] = None
|
||||||
|
self._codec_ctx: Optional[av.CodecContext] = None
|
||||||
|
self._frame_idx = 0
|
||||||
|
self._last_frame_time = 0.0
|
||||||
|
self._min_frame_interval = 1.0 / fps
|
||||||
|
|
||||||
|
# Async encoding: background thread picks up frames from _pending_frame
|
||||||
|
self._pending_frame: Optional[np.ndarray] = None
|
||||||
|
self._frame_lock = threading.Lock()
|
||||||
|
self._encode_thread: Optional[threading.Thread] = None
|
||||||
|
self._running = False
|
||||||
|
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
# Public API
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
|
||||||
|
def connect(self) -> bool:
|
||||||
|
"""Connect to the PICO VR headset TCP server.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
True if connection succeeded.
|
||||||
|
"""
|
||||||
|
try:
|
||||||
|
self._sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
|
||||||
|
self._sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
|
||||||
|
self._sock.connect((self.host, self.port))
|
||||||
|
logger.info(f"[SimVideoStreamer] Connected to {self.host}:{self.port}")
|
||||||
|
except OSError as e:
|
||||||
|
logger.error(f"[SimVideoStreamer] TCP connect failed: {e}")
|
||||||
|
self._sock = None
|
||||||
|
return False
|
||||||
|
|
||||||
|
self._init_encoder()
|
||||||
|
self._frame_idx = 0
|
||||||
|
self._last_frame_time = 0.0
|
||||||
|
|
||||||
|
# Start background encode+send thread
|
||||||
|
self._running = True
|
||||||
|
self._encode_thread = threading.Thread(target=self._encode_loop, daemon=True)
|
||||||
|
self._encode_thread.start()
|
||||||
|
return True

    def send_frame(self, left_rgb: np.ndarray, right_rgb: np.ndarray) -> bool:
        """Submit a stereo frame for async encoding and sending.

        This method is NON-BLOCKING: it copies the frame data and returns
        immediately. The background thread handles encoding and TCP send.
        If the previous frame hasn't been encoded yet, this frame replaces it
        (latest-frame-wins, no queue buildup).

        Args:
            left_rgb: Left eye image, shape (H, W, 3), dtype uint8, RGB order.
            right_rgb: Right eye image, shape (H, W, 3), dtype uint8, RGB order.

        Returns:
            True if the frame was accepted. False if the streamer is no longer
            running (e.g. the connection was lost or `close()` was called).
        """
        if not self._running:
            return False

        # Assemble SBS and hand off to background thread (latest-frame-wins)
        sbs = np.concatenate([left_rgb, right_rgb], axis=1)  # (H, W*2, 3)
        with self._frame_lock:
            self._pending_frame = sbs

        return True
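A minimal caller sketch for `send_frame()`; the per-eye resolution (1280x720) is an assumption for illustration, not taken from this file. It also checks the side-by-side (SBS) layout that the method assembles internally:

```python
import numpy as np

H, W = 720, 1280  # assumed per-eye resolution
left = np.zeros((H, W, 3), dtype=np.uint8)    # black left-eye frame
right = np.full((H, W, 3), 255, dtype=np.uint8)  # white right-eye frame

# send_frame() concatenates the eyes along the width axis exactly like this:
sbs = np.concatenate([left, right], axis=1)
assert sbs.shape == (H, 2 * W, 3)
assert sbs[0, 0, 0] == 0 and sbs[0, W, 0] == 255  # left half black, right half white
```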

    def close(self):
        """Stop background thread, flush encoder, and close connection."""
        self._running = False
        if self._encode_thread is not None:
            self._encode_thread.join(timeout=2.0)
            self._encode_thread = None

        if self._codec_ctx is not None:
            try:
                packets = self._codec_ctx.encode(None)
                if self._sock is not None:
                    for pkt in packets:
                        nalu_data = bytes(pkt)
                        header = struct.pack(">I", len(nalu_data))
                        self._sock.sendall(header + nalu_data)
            except Exception:
                pass
            self._codec_ctx = None

        if self._sock is not None:
            try:
                self._sock.close()
            except Exception:
                pass
            self._sock = None
        logger.info("[SimVideoStreamer] Connection closed.")

    # ------------------------------------------------------------------
    # Private
    # ------------------------------------------------------------------

    def _init_encoder(self):
        """Initialize the H.264 encoder via PyAV."""
        codec = av.Codec("libx264", "w")
        self._codec_ctx = codec.create()
        self._codec_ctx.width = self.sbs_width
        self._codec_ctx.height = self.sbs_height
        self._codec_ctx.pix_fmt = "yuv420p"
        from fractions import Fraction

        self._codec_ctx.time_base = Fraction(1, self.fps)
        self._codec_ctx.framerate = Fraction(self.fps, 1)
        self._codec_ctx.bit_rate = self.bitrate
        self._codec_ctx.gop_size = self.fps  # keyframe every 1 second
        self._codec_ctx.max_b_frames = 0  # no B-frames for low latency
        self._codec_ctx.thread_count = 2

        # Low-latency encoding options
        self._codec_ctx.options = {
            "preset": "ultrafast",
            "tune": "zerolatency",
            "profile": "baseline",  # widest decoder compatibility
        }

        self._codec_ctx.open()
        logger.info(
            f"[SimVideoStreamer] H264 encoder initialized: "
            f"{self.sbs_width}x{self.sbs_height} @ {self.fps}fps, "
            f"bitrate={self.bitrate/1e6:.1f}Mbps"
        )

    def _encode_loop(self):
        """Background thread: encode and send frames at target fps."""
        while self._running:
            # Grab the latest pending frame (if any)
            with self._frame_lock:
                sbs = self._pending_frame
                self._pending_frame = None

            if sbs is None:
                time.sleep(0.001)  # 1ms idle wait
                continue

            # Rate limit encoding to target fps (frames arriving early are dropped,
            # consistent with latest-frame-wins)
            now = time.monotonic()
            if now - self._last_frame_time < self._min_frame_interval:
                continue
            self._last_frame_time = now

            # Encode
            frame = av.VideoFrame.from_ndarray(sbs, format="rgb24")
            frame.pts = self._frame_idx
            self._frame_idx += 1

            try:
                packets = self._codec_ctx.encode(frame)
            except av.error.FFmpegError as e:
                logger.warning(f"[SimVideoStreamer] Encode error: {e}")
                continue

            # Send each packet: 4-byte big-endian length prefix + NALU payload
            for pkt in packets:
                nalu_data = bytes(pkt)
                header = struct.pack(">I", len(nalu_data))
                try:
                    self._sock.sendall(header + nalu_data)
                except (BrokenPipeError, ConnectionResetError, OSError) as e:
                    logger.error(f"[SimVideoStreamer] Send failed: {e}")
                    self._running = False
                    return
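The wire format used here (a 4-byte big-endian length prefix followed by the encoded packet) is all a receiver needs to reframe the stream before handing packets to a decoder. A minimal parser sketch; the function name and the buffer-at-once strategy are illustrative, not part of this codebase:

```python
import struct

def parse_nalu_stream(buf: bytes) -> list[bytes]:
    """Split a length-prefixed byte stream (">I" size + payload) into packets."""
    packets = []
    offset = 0
    while offset + 4 <= len(buf):
        (size,) = struct.unpack_from(">I", buf, offset)
        if offset + 4 + size > len(buf):
            break  # incomplete trailing packet; wait for more data
        packets.append(buf[offset + 4 : offset + 4 + size])
        offset += 4 + size
    return packets

# Round-trip check against the sender's framing:
frames = [b"\x00\x00\x00\x01keyframe", b"\x00\x00\x00\x01delta"]
wire = b"".join(struct.pack(">I", len(f)) + f for f in frames)
assert parse_nalu_stream(wire) == frames
assert parse_nalu_stream(wire[:5]) == []  # partial packet is held back
```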