[RSS 2025] Reactive Diffusion Policy: Slow-Fast Visual-Tactile Policy Learning for Contact-Rich Manipulation


Reactive Diffusion Policy:

Slow-Fast Visual-Tactile Policy Learning for Contact-Rich Manipulation

RSS 2025

Han Xue1*, Jieji Ren1*, Wendi Chen1*,
Gu Zhang234†, Yuan Fang1†, Guoying Gu, Huazhe Xu234‡, Cewu Lu15‡
1Shanghai Jiao Tong University 2Tsinghua University, IIIS 3Shanghai Qi Zhi Institute
4Shanghai AI Lab 5Shanghai Innovation Institute
*Equal contribution †Equal contribution ‡Equal advising

arXiv | project website | data | checkpoints | powered by PyTorch


TODO

⚙️ Environment Setup

📝 Use Customized Tactile / Force Sensors, Robots and Customized Tasks

Please refer to docs/customized_deployment_guide.md.

Hardware

Device List

  • Meta Quest 3 VR headset.
  • Workstation with Ubuntu 22.04 for compatibility with ROS2 Humble.

    A workstation with a GPU (e.g., NVIDIA RTX 3090) is required. If GelSight Mini is used, a high-performance CPU (e.g., Core i9-13900K) is required to ensure 24 FPS tactile sensing.

  • 2 robot arms with (optional) joint torque sensors.

    We use the Flexiv Rizon 4 with the GRAV gripper by default and will support single-arm setups and other robots soon.

  • 1-3 RealSense cameras.

    We use a D435 for the wrist camera and D415s for the external cameras. Follow the official document to install librealsense2.

  • (Optional) 1-2 GelSight Mini tactile sensors with tracking marker gel.

    We use 1 sensor for each robot arm. Download the CAD model and 3D print the mount to attach the sensor to the GRAV gripper.

Software

Quest 3 Setup

Build and install the TactAR APP on the Quest 3 according to the README in our Unity Repo.

You can also adopt other teleoperation systems with force feedback.

Workstation Setup

  1. Follow the official document to install ROS2 Humble.
  2. Since ROS2 has some compatibility issues with Conda, we recommend using a virtual environment with venv.
    python3 -m venv rdp_venv
    source rdp_venv/bin/activate
    pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117
    pip install -r requirements.txt
  3. (Optional) Follow third_party/mvsdk/README.md to install MindVision SDK. This package is needed only if you want to record experiment videos with MindVision cameras.
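Since the pip commands above must run inside the venv rather than Conda, a quick stdlib check can confirm the environment is active before installing. This helper is a hypothetical convenience, not part of the repo:

```python
import sys


def in_venv() -> bool:
    """Return True when running inside a venv/virtualenv.

    In a venv created with `python3 -m venv`, sys.prefix points at the
    venv directory while sys.base_prefix points at the base interpreter;
    they are equal only outside a virtual environment.
    """
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("venv active:", in_venv())
```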

📦 Data Collection

TactAR Setup

  1. For the workstation, configure the environment and the task first, then start several services for teleoperating the robots, publishing sensor data, and recording data.
    1. Environment and Task Configuration.
    2. Start services. Run each command in a separate terminal. You can use tmux to split the terminal.
      # start teleoperation server
      python teleop.py task=[task_config_file_name]
      # start camera node launcher
      python camera_node_launcher.py task=[task_config_file_name]
      # start data recorder
      python record_data.py --save_to_disk --save_file_dir [task_data_dir] --save_file_name [record_seq_file_name]
  2. For Quest 3, follow the user guide in our Unity Repo to run the TactAR APP.
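The three workstation services above can also be launched from one script so every run uses consistent arguments. The sketch below is a hypothetical launcher: the script names and flags come from the steps above, while the helper names and example values are made up:

```python
import shlex
import subprocess


def build_service_commands(task: str, data_dir: str, seq_name: str) -> list:
    """Build the three data-collection commands from the steps above."""
    return [
        f"python teleop.py task={task}",
        f"python camera_node_launcher.py task={task}",
        f"python record_data.py --save_to_disk --save_file_dir {data_dir} "
        f"--save_file_name {seq_name}",
    ]


def launch(commands: list) -> list:
    """Start each service as a separate process (one per terminal in the docs)."""
    return [subprocess.Popen(shlex.split(cmd)) for cmd in commands]


if __name__ == "__main__":
    # Dry run: print the commands instead of launching them.
    for cmd in build_service_commands("real_peel", "data/peel", "seq_01"):
        print(cmd)
```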

(Important) Data Collection Tips

Please refer to docs/data_collection_tips.md.

Example Data

We provide the data we collected via the data link above.

Generate your own tactile embedding

Please refer to the tactile embedding guide for collecting your own tactile dataset and generating tactile embeddings based on it. You can also directly use our pre-calculated PCA transformation matrix for tactile embedding in data/PCA_Transform_GelSight.
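Applying a precomputed PCA transform reduces the flattened marker-offset field to a low-dimensional embedding (e.g., (63, 2) offsets become 15 dims in the dataset below). The sketch here is a pure-Python illustration with toy dimensions; the mean and component values are made up, not taken from data/PCA_Transform_GelSight:

```python
def apply_pca(offsets_flat, mean, components):
    """Project a flattened marker-offset vector onto PCA components.

    offsets_flat: length-D list (marker offsets flattened, e.g. 63*2 = 126)
    mean:         length-D list (per-dimension mean from the fitted PCA)
    components:   K lists of length D (principal axes), giving a K-dim embedding
    """
    centered = [x - m for x, m in zip(offsets_flat, mean)]
    return [sum(c * x for c, x in zip(comp, centered)) for comp in components]


# Toy example: 2 markers (4 dims) projected to a 2-dim embedding.
offsets = [0.5, -0.5, 1.0, 0.0]
mean = [0.0, 0.0, 0.0, 0.0]
components = [
    [1.0, 0.0, 0.0, 0.0],  # first principal axis (made up)
    [0.0, 0.0, 1.0, 0.0],  # second principal axis (made up)
]
print(apply_pca(offsets, mean, components))  # → [0.5, 1.0]
```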

Data Postprocessing

Change the config in post_process_data.py (including TAG, ACTION_DIM, TEMPORAL_DOWNSAMPLE_RATIO and SENSOR_MODE) and execute the command

python post_process_data.py

We use Zarr to store the data. After postprocessing, you may see the following structure:

 ├── action (25710, 4) float32
 ├── external_img (25710, 240, 320, 3) uint8
 ├── left_gripper1_img (25710, 240, 320, 3) uint8
 ├── left_gripper1_initial_marker (25710, 63, 2) float32
 ├── left_gripper1_marker_offset (25710, 63, 2) float32
 ├── left_gripper1_marker_offset_emb (25710, 15) float32
 ├── left_gripper2_img (25710, 240, 320, 3) uint8
 ├── left_gripper2_initial_marker (25710, 25, 2) float32
 ├── left_gripper2_marker_offset (25710, 25, 2) float32
 ├── left_gripper2_marker_offset_emb (25710, 15) float32
 ├── left_robot_gripper_force (25710, 1) float32
 ├── left_robot_gripper_width (25710, 1) float32
 ├── left_robot_tcp_pose (25710, 9) float32
 ├── left_robot_tcp_vel (25710, 6) float32
 ├── left_robot_tcp_wrench (25710, 6) float32
 ├── left_wrist_img (25710, 240, 320, 3) uint8
 ├── right_robot_gripper_force (25710, 1) float32
 ├── right_robot_gripper_width (25710, 1) float32
 ├── right_robot_tcp_pose (25710, 9) float32
 ├── right_robot_tcp_vel (25710, 6) float32
 ├── right_robot_tcp_wrench (25710, 6) float32
 ├── target (25710, 4) float32
 └── timestamp (25710,) float32
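Every array in the store shares the same leading dimension (the number of synchronized timesteps, 25710 above). A simple sanity check after postprocessing could look like the sketch below; it operates on shape tuples (e.g., as reported when opening the store with zarr), and the helper name and toy shapes are hypothetical:

```python
def check_aligned(shapes: dict) -> int:
    """Verify every array shares one leading length; return that length.

    shapes: mapping of array name -> shape tuple, as reported by the store.
    """
    lengths = {name: shape[0] for name, shape in shapes.items()}
    unique = set(lengths.values())
    if len(unique) != 1:
        raise ValueError(f"misaligned arrays: {lengths}")
    return unique.pop()


# Toy shapes mirroring the structure above (lengths shortened).
shapes = {
    "action": (100, 4),
    "external_img": (100, 240, 320, 3),
    "timestamp": (100,),
}
print(check_aligned(shapes))  # → 100
```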

📚 Training

  1. Task Configuration. In addition to the task config file used in data collection, another file is needed to configure dataset, runner, and model-related parameters such as obs and action. You can take reactive_diffusion_policy/config/task/real_peel_image_dp_absolute_12fps.yaml as an example. Refer to docs/customized_deployment_guide.md for more details.

    The dp, at, and ldp in the config file name indicate Diffusion Policy, Asymmetric Tokenizer, and Latent Diffusion Policy, respectively.

  2. Generate Dataset with Correct Frequency. The fps at the end of the config file name indicates the control frequency. Make sure the control_fps in the task config file is consistent with the dataset. For instance, if we record 24fps data and want to train a model with a 12fps control frequency, we have to set the TEMPORAL_DOWNSAMPLE_RATIO in post_process_data.py to 2 to generate the correct dataset.

    We record 24fps data for experiments. We train DP with 12fps control frequency and RDP (AT + LDP) with 24fps control frequency.

  3. Run the Training Script. We provide training scripts for Diffusion Policy and Reactive Diffusion Policy. You can modify the training script to train the desired task and model.
    # config multi-gpu training
    accelerate config
    # Diffusion Policy
    ./train_dp.sh
    # Reactive Diffusion Policy
    ./train_rdp.sh
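The relationship between recording and control frequency in step 2 can be sketched as a simple index downsampler. This is a pure-Python illustration with hypothetical helper names, mirroring what TEMPORAL_DOWNSAMPLE_RATIO does in post_process_data.py:

```python
def downsample_indices(n_frames: int, record_fps: int, control_fps: int) -> list:
    """Pick every k-th frame so the kept frames run at control_fps.

    The ratio record_fps / control_fps must be a whole number,
    e.g. 24 fps recording -> 12 fps control keeps every 2nd frame.
    """
    if record_fps % control_fps != 0:
        raise ValueError("record_fps must be an integer multiple of control_fps")
    ratio = record_fps // control_fps
    return list(range(0, n_frames, ratio))


# 24 fps recording downsampled to 12 fps control frequency.
print(downsample_indices(8, 24, 12))  # → [0, 2, 4, 6]
```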

🚀 Inference

  1. (Optional) Refer to vcamera_server_ip and vcamera_server_port in the task config file and start the corresponding vcamera server.
    # run vcamera server
    python vcamera_server.py --host_ip [host_ip] --port [port] --camera_id [camera_id]
  2. Modify eval.sh to set the task and model you want to evaluate, then run the following commands in separate terminals.
    # start teleoperation server
    python teleop.py task=[task_config_file_name]
    # start camera node launcher
    python camera_node_launcher.py task=[task_config_file_name]
    # start inference
    ./eval.sh

Checkpoints

We provide the checkpoints used in our experiments via the checkpoints link above.

❔ Q&A

Please refer to docs/Q&A.md.

🙏 Acknowledgement

Our work is built upon Diffusion Policy, VQ-BeT, Stable Diffusion, UMI and Data Scaling Laws. Thanks for their great work!

🔗 Citation

If you find our work useful, please consider citing:

@article{xue2025reactive,
  title     = {Reactive Diffusion Policy: Slow-Fast Visual-Tactile Policy Learning for Contact-Rich Manipulation},
  author    = {Xue, Han and Ren, Jieji and Chen, Wendi and Zhang, Gu and Fang, Yuan and Gu, Guoying and Xu, Huazhe and Lu, Cewu},
  journal   = {arXiv preprint arXiv:2503.02881},
  year      = {2025}
}
