Shape Descriptor Guided Learning for Category-Level Object Pose Estimation

Liu, Yun; Wang, Weiming; Wang, Fu Lee; Xie, Haoran; Chen, Honghua; Wei, Mingqiang; Qin, Jing

doi:10.1007/978-3-031-82024-3_4


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15340)

Included in the following conference series:

Computer Graphics International Conference


Abstract

Category-level object pose estimation predicts the poses and sizes of unseen objects within a known category and underpins a wide range of practical applications. Accurate estimation remains challenging, however, because shapes vary substantially within the same category. To address this issue, this paper introduces a learning network for object pose estimation that is guided by a shape descriptor. By capturing the geometric information of an object’s shape, the shape descriptor provides input for subsequent feature learning that helps handle shape variations. Moreover, our framework incorporates a confidence-based pose estimator, which assigns a confidence score to each pose prediction. Penalizing poses with low confidence yields more accurate, higher-confidence pose estimates. Experimental results on the CAMERA25 and REAL275 datasets show that our approach outperforms state-of-the-art methods.
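
The abstract does not spell out how the confidence-based pose estimator is trained, so the following is only an illustrative sketch of one common way to couple per-prediction confidence with a pose error. It is not the authors' formulation. Each hypothesis i has a pose error e_i and a confidence c_i in (0, 1], and the loss averages e_i * c_i - w * log(c_i). The function names, the weight w, and its default value are assumptions made for this example.

```python
import math


def confidence_weighted_loss(errors, confidences, w=0.01):
    """Average of e_i * c_i - w * log(c_i) over all pose hypotheses.

    Hypothetical helper, not taken from the paper. The log term grows as
    a confidence approaches zero, so low-confidence predictions are
    penalized; the product term lets large errors be down-weighted.
    """
    if not errors or len(errors) != len(confidences):
        raise ValueError("errors and confidences must be non-empty and equal in length")
    total = 0.0
    for e, c in zip(errors, confidences):
        if not 0.0 < c <= 1.0:
            raise ValueError("each confidence must lie in (0, 1]")
        total += e * c - w * math.log(c)
    return total / len(errors)


def select_pose(poses, confidences):
    """Return the pose hypothesis with the highest confidence score."""
    if not poses or len(poses) != len(confidences):
        raise ValueError("poses and confidences must be non-empty and equal in length")
    best = max(range(len(poses)), key=lambda i: confidences[i])
    return poses[best]
```

Under these assumptions, a prediction with zero error but confidence 0.5 still incurs a positive loss from the log term, which is the sense in which low confidence is penalized. At inference time, select_pose simply keeps the hypothesis with the highest score.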



Acknowledgements

The work described in this paper was fully supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (No. UGC/FDS16/E14/21).

Author information

Authors and Affiliations

Hong Kong Metropolitan University, Hong Kong, China
Yun Liu, Weiming Wang & Fu Lee Wang
Lingnan University, Hong Kong, China
Haoran Xie
Nanjing University of Aeronautics and Astronautics, Nanjing, China
Honghua Chen & Mingqiang Wei
The Hong Kong Polytechnic University, Hong Kong, China
Jing Qin


Corresponding author

Correspondence to Weiming Wang.

Editor information

Editors and Affiliations

University of Geneva, Geneva, Switzerland
Nadia Magnenat-Thalmann
The University of Sydney, Sydney, NSW, Australia
Jinman Kim
Shanghai Jiao Tong University, Shanghai, China
Bin Sheng
University of Houston, Houston, TX, USA
Zhigang Deng
EPFL, Lausanne, Switzerland
Daniel Thalmann
The Hong Kong Polytechnic University, Kowloon, Hong Kong
Ping Li


About this paper

Cite this paper

Liu, Y. et al. (2025). Shape Descriptor Guided Learning for Category-Level Object Pose Estimation. In: Magnenat-Thalmann, N., Kim, J., Sheng, B., Deng, Z., Thalmann, D., Li, P. (eds) Advances in Computer Graphics. CGI 2024. Lecture Notes in Computer Science, vol 15340. Springer, Cham. https://doi.org/10.1007/978-3-031-82024-3_4


DOI: https://doi.org/10.1007/978-3-031-82024-3_4
Published: 25 February 2025
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-82023-6
Online ISBN: 978-3-031-82024-3
eBook Packages: Computer Science, Computer Science (R0)
