CADGrasp: Learning Contact and Collision Aware General Dexterous Grasping in Cluttered Scenes

Peking University
🏆 NeurIPS 2025

CADGrasp learns a contact- and collision-aware intermediate representation and uses it as a constraint when optimizing the dexterous grasp pose, enabling single-view dexterous grasping in cluttered scenes.

Abstract

Dexterous grasping in cluttered environments presents substantial challenges due to the high degrees of freedom of dexterous hands, occlusion, and potential collisions arising from diverse object geometries and complex layouts. To address these challenges, we propose CADGrasp, a two-stage algorithm for general dexterous grasping using single-view point cloud inputs. In the first stage, we predict a scene-decoupled, contact- and collision-aware representation—sparse IBS—as the optimization target. Sparse IBS compactly encodes the geometric and contact relationships between the dexterous hand and the scene, enabling stable and collision-free dexterous grasp pose optimization. To enhance the prediction of this high-dimensional representation, we introduce an occupancy-diffusion model with voxel-level conditional guidance and force closure score filtering. In the second stage, we develop several energy functions and ranking strategies for optimization based on sparse IBS to generate high-quality dexterous grasp poses. Extensive experiments in both simulated and real-world settings validate the effectiveness of our approach, demonstrating its capability to mitigate collisions while maintaining a high grasp success rate across diverse objects and complex scenes.

Method Highlights

🎯

Sparse IBS Representation

A scene-decoupled, contact- and collision-aware intermediate representation that compactly encodes the geometric and contact relationships between the dexterous hand and the scene.
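
To make the idea of a bisector surface concrete, here is a minimal NumPy sketch that approximates the set of points equidistant from a hand point set and a scene point set on a voxel grid. It only illustrates the general geometric notion behind an IBS. The function names, the brute-force nearest-neighbour search, the grid bounds, and the tolerance are all assumptions made for this sketch, and it does not reproduce how CADGrasp sparsifies the surface or attaches contact information.

```python
import numpy as np

def nearest_distance(queries, points):
    """Distance from each query (Q, 3) to its nearest neighbour in points (N, 3)."""
    diff = queries[:, None, :] - points[None, :, :]
    return np.linalg.norm(diff, axis=-1).min(axis=1)

def approximate_bisector(hand_pts, scene_pts, lo, hi, resolution, tol):
    """Return grid points whose distances to hand and scene differ by less than tol.

    Hypothetical helper for illustration only; not the paper's implementation.
    """
    axes = [np.linspace(lo[i], hi[i], resolution) for i in range(3)]
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, 3)
    d_hand = nearest_distance(grid, hand_pts)
    d_scene = nearest_distance(grid, scene_pts)
    return grid[np.abs(d_hand - d_scene) < tol]

# Toy input: one "hand" point and one "scene" point, mirrored about the plane x = 0.
hand_pts = np.array([[-0.5, 0.0, 0.0]])
scene_pts = np.array([[0.5, 0.0, 0.0]])

# Grid spacing is 0.1; the tolerance is half a voxel, so only the x = 0 slice survives.
ibs_pts = approximate_bisector(
    hand_pts, scene_pts, lo=(-1, -1, -1), hi=(1, 1, 1), resolution=21, tol=0.05
)
```

For this symmetric toy input the recovered points all lie on the plane x = 0, which is the exact bisector of the two points. Real hand and scene point clouds would need a spatial index (for example a k-d tree) instead of the brute-force distance matrix.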

🔮

Occupancy-Diffusion Model

A diffusion model with voxel-level conditional guidance that predicts the high-dimensional sparse IBS from single-view point clouds.
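
As a rough illustration of what diffusing over an occupancy grid means, the sketch below applies the standard DDPM forward (noising) step to a binary voxel grid. This page does not state the noise schedule, the grid resolution, or the conditioning mechanism, so the linear schedule, the 16×16×16 grid, and the names `make_alpha_bar` and `q_sample` are assumptions for illustration only. The voxel-level conditional guidance and the force closure score filtering are not modeled here.

```python
import numpy as np

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Cumulative product of (1 - beta_t) for a linear beta schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def q_sample(x0, t, alpha_bar, rng):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(a_bar_t) * x_0, (1 - a_bar_t) * I)."""
    noise = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
    return xt, noise

rng = np.random.default_rng(0)

# Hypothetical sparse occupancy grid: roughly 5% of voxels are occupied.
occupancy = (rng.random((16, 16, 16)) < 0.05).astype(np.float32)
x0 = 2.0 * occupancy - 1.0  # map {0, 1} to {-1, +1} before adding Gaussian noise

alpha_bar = make_alpha_bar()
x_early, _ = q_sample(x0, 10, alpha_bar, rng)   # still close to the clean grid
x_late, _ = q_sample(x0, 999, alpha_bar, rng)   # almost pure Gaussian noise
```

A denoising network would be trained to recover `noise` (or `x0`) from `x_t`, the timestep, and the conditioning inputs. Sampling then runs the learned reverse process and thresholds the result to obtain occupied voxels.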

🔄

Cross-Embodiment Potential

Since IBS is independent of hand morphology and kinematics, our method enables zero-shot optimization-based grasping for unseen robotic hands.

Pipeline Overview

Overview of CADGrasp, a two-stage framework for dexterous grasping in cluttered scenes:

  • I. Conditional IBS Generation: A diffusion model is trained to model the conditional probability distribution \( p(\mathcal{I}|\mathcal{P}, \mathbf{T}) \).
  • II. Grasp Pose Optimization: We optimize the grasp poses \( \mathcal{G} \) using the predicted sparse IBS \( \hat{\mathcal{I}} \) as a constraint.
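
The second stage can be pictured with a deliberately simplified example: gradient descent on a translation-only hand pose that pulls a few hand contact points onto target points of the predicted surface. The energy functions and ranking strategies used by CADGrasp are not listed on this page, so the single squared-distance term, the translation-only pose, the step size, and the names `contact_energy` and `optimize_translation` are placeholders chosen for this sketch. It also assumes that a subset of predicted IBS points is flagged as contact targets.

```python
import numpy as np

def contact_energy(hand_pts, contact_ibs_pts):
    """Mean squared distance from each hand point to its nearest target point.

    Also returns the gradient with respect to a shared translation of the
    hand points, holding the nearest-neighbour assignment fixed.
    """
    diff = hand_pts[:, None, :] - contact_ibs_pts[None, :, :]
    sq_dist = (diff ** 2).sum(axis=-1)
    nearest = sq_dist.argmin(axis=1)
    energy = sq_dist[np.arange(len(hand_pts)), nearest].mean()
    grad = 2.0 * (hand_pts - contact_ibs_pts[nearest]).mean(axis=0)
    return energy, grad

def optimize_translation(hand_pts, contact_ibs_pts, steps=200, lr=0.1):
    """Plain gradient descent on a 3D translation (toy stand-in for a full hand pose)."""
    t = np.zeros(3)
    history = []
    for _ in range(steps):
        energy, grad = contact_energy(hand_pts + t, contact_ibs_pts)
        history.append(energy)
        t -= lr * grad
    return t, history

# Hypothetical targets: a planar patch at x = 0 sampled on a 0.1 grid.
ys, zs = np.meshgrid(np.linspace(-1, 1, 21), np.linspace(-1, 1, 21), indexing="ij")
contact_ibs_pts = np.stack([np.zeros_like(ys), ys, zs], axis=-1).reshape(-1, 3)

# Three hand contact points that start 0.3 away from the patch along x.
hand_pts = np.array([[0.3, 0.0, 0.0], [0.3, 0.2, 0.1], [0.3, -0.2, 0.1]])

t_opt, history = optimize_translation(hand_pts, contact_ibs_pts)
```

In this toy setup the translation converges toward (-0.3, 0, 0) and the energy decreases toward zero. A full dexterous hand would add wrist rotation and joint angles as optimization variables, together with collision-related terms, which this sketch omits.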

Predicted IBS Visualization

Interactive visualization of the predicted sparse IBS and the optimization trajectory.

🎯 Predicted Sparse IBS

⚙️ Optimization Progress

BibTeX

@inproceedings{zhang2025cadgrasp,
  title={CADGrasp: Learning Contact and Collision Aware General Dexterous Grasping in Cluttered Scenes},
  author={Zhang, Jiyao and Ma, Zhiyuan and Wu, Tianhao and Chen, Zeyuan and Dong, Hao},
  booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
  year={2025}
}