Abstract:
Accurately estimating the six-degrees-of-freedom (DoF) pose of unseen objects is crucial for successful robotic manipulation in industrial automation. Some existing methods for this task rely on prior knowledge of individual objects, i.e., the model must be trained on the exact object instance or object category. Others perform unseen object pose estimation but are limited in their feature learning and pose refinement ability. To address these problems, we propose an unseen object pose estimation method that follows a coarse-to-fine framework and leverages the powerful learning ability of diffusion models. We introduce a diffusion model, PoseDiffusion, that generates object poses, and we compare the generated poses with the original pose to determine the optimal one. We design a novel pose estimation module to provide coarse poses for PoseDiffusion. This module comprises two feature extraction modules that extract global and masked features, respectively. In addition, we propose a strategy to estimate the pose by comparing the similarity between rendered and query poses. The renderings of an unseen object from various viewpoints are generated from its computer-aided design (CAD) model. Our method requires a CAD model of the unseen object only during inference, a scenario well suited to industrial applications. Experimental evaluation on benchmark datasets demonstrates that the proposed framework outperforms existing approaches, achieving state-of-the-art performance in six-DoF object pose estimation.
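The similarity-based coarse stage described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `coarse_pose_from_templates`, the use of cosine similarity, the feature dimension, and the toy data are all assumptions made for illustration. In practice the feature vectors would come from the feature extraction modules applied to the query image and to the CAD renderings.

```python
import numpy as np


def coarse_pose_from_templates(query_feat, template_feats, template_poses):
    """Pick the pose of the rendered template most similar to the query.

    Hypothetical helper: cosine similarity is an assumed metric, not
    necessarily the one used in the paper.
    query_feat:      (D,) feature of the query image
    template_feats:  (N, D) features of N renderings of the CAD model
    template_poses:  (N, 4, 4) homogeneous poses used to render each template
    """
    # L2-normalise so the dot product equals cosine similarity.
    q = query_feat / np.linalg.norm(query_feat)
    t = template_feats / np.linalg.norm(template_feats, axis=1, keepdims=True)
    sims = t @ q                      # (N,) similarity per rendered viewpoint
    best = int(np.argmax(sims))       # index of the most similar rendering
    return template_poses[best], float(sims[best])


# Toy data (assumed): 8 rendered viewpoints with 16-dimensional features.
rng = np.random.default_rng(0)
template_feats = rng.normal(size=(8, 16))
template_poses = np.stack([np.eye(4) for _ in range(8)])
for i in range(8):
    template_poses[i, 0, 3] = float(i)  # tag each pose with its index

# A query whose feature is a slightly perturbed copy of template 5.
query_feat = template_feats[5] + 0.01 * rng.normal(size=16)
coarse_pose, score = coarse_pose_from_templates(
    query_feat, template_feats, template_poses
)
```

In a coarse-to-fine pipeline of the kind the abstract outlines, the returned pose would serve only as the starting point for the refinement stage.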
Published in: IEEE Transactions on Industrial Informatics (Volume: 20, Issue: 9, September 2024)