Abstract:
6D object pose estimation for texture-less objects from RGB images remains challenging, especially under occlusion. Instead of localizing sparse keypoints by regressing their image coordinates or heatmaps, which is sensitive to occlusion, we introduce GeoPose, a novel reconstruction-guided pose estimation pipeline that predicts dense correspondences and exploits geometric consistency. We first design a dense reconstruction network (ReconNet) to reconstruct pixel-wise object coordinates in a normalized space. Dense 2D-3D correspondences are then obtained directly from our explicit parameterization of 3D object models, which removes the need for keypoint selection. These 2D-3D correspondences are used to estimate 6D poses with the PnP algorithm inside a RANSAC loop. Furthermore, a novel Cycle Loss is proposed to provide 3D prior supervision; it is closely tied to the pose estimation task because it enforces geometric consistency between reconstruction (pixel to 3D) and reprojection (3D to pixel). In addition, a training data augmentation method is proposed to address the scarcity of 6D pose datasets, whose acquisition is error-prone and time-consuming. Extensive experiments demonstrate that, compared with existing RGB-based methods, GeoPose achieves state-of-the-art (SOTA) 6D pose estimation performance on the LINEMOD, Occlusion LINEMOD and T-LESS datasets.
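The consistency between reconstruction (pixel to 3D) and reprojection (3D to pixel) can be illustrated with a small sketch. This is not the paper's Cycle Loss, whose exact formulation the abstract does not give. It only measures how far each reconstructed 3D point lands from its source pixel when projected under a given pose. The function names, the [0, 1] normalization against a model bounding box, and the pinhole intrinsics (fx, fy, cx, cy) are all assumptions made for illustration.

```python
from math import hypot


def denormalize(coord, bbox_min, bbox_max):
    """Map a normalized coordinate (assumed to lie in [0, 1]^3) to model units."""
    return tuple(lo + c * (hi - lo) for c, lo, hi in zip(coord, bbox_min, bbox_max))


def project(point, R, t, fx, fy, cx, cy):
    """Pinhole projection of a 3D model point under pose (R, t)."""
    # Transform the model point into the camera frame: X_cam = R @ point + t
    X = sum(R[0][j] * point[j] for j in range(3)) + t[0]
    Y = sum(R[1][j] * point[j] for j in range(3)) + t[1]
    Z = sum(R[2][j] * point[j] for j in range(3)) + t[2]
    # Perspective division followed by the intrinsics
    return (fx * X / Z + cx, fy * Y / Z + cy)


def mean_reprojection_error(pixels, coords, bbox_min, bbox_max, R, t, fx, fy, cx, cy):
    """Average pixel distance between each source pixel and the reprojection
    of its predicted (normalized) object coordinate."""
    total = 0.0
    for (u, v), coord in zip(pixels, coords):
        model_point = denormalize(coord, bbox_min, bbox_max)
        u_proj, v_proj = project(model_point, R, t, fx, fy, cx, cy)
        total += hypot(u_proj - u, v_proj - v)
    return total / len(pixels)


# Hypothetical toy setup: identity rotation, object 2 units in front of the camera
R_identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
translation = (0.0, 0.0, 2.0)
bbox_min, bbox_max = (-1.0, -1.0, -1.0), (1.0, 1.0, 1.0)
coords = [(0.5, 0.5, 0.5), (1.0, 0.5, 0.5)]

consistent = mean_reprojection_error(
    [(50.0, 50.0), (100.0, 50.0)], coords, bbox_min, bbox_max,
    R_identity, translation, 100.0, 100.0, 50.0, 50.0)
inconsistent = mean_reprojection_error(
    [(50.0, 50.0), (103.0, 54.0)], coords, bbox_min, bbox_max,
    R_identity, translation, 100.0, 100.0, 50.0, 50.0)
```

In this toy setup the error is zero when the predicted coordinates and the pose agree with the pixels, and it grows as they diverge. In a full pipeline the pose would come from a PnP solver run inside RANSAC over the dense correspondences, which this sketch does not implement.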
Published in: IEEE Transactions on Multimedia (Volume: 24)