DFW-PVNet: data field weighting based pixel-wise voting network for effective 6D pose estimation

Lu, Yinning; Pei, Songwei

doi:10.1007/s10489-024-05942-9

DFW-PVNet: data field weighting based pixel-wise voting network for effective 6D pose estimation

Published: 30 December 2024

Volume 55, article number 240, (2025)
Cite this article

Applied Intelligence Aims and scope Submit manuscript

Yinning Lu¹ &
Songwei Pei¹

77 Accesses
Explore all metrics

Abstract

With the benefit of reduced memory and computational overhead, the sparse-based 6 degrees-of-freedom (6D) pose estimation method leverages the creation of sparse two-dimensional (2D) to three-dimensional (3D) correspondences to estimate the pose of objects in an RGB image. However, this method often leads to accuracy degradation. In this paper, we propose a data field weighting based pixel-wise voting network (DFW-PVNet), aiming at improving the accuracy of the 6D pose estimation while keeping excellent memory and computational overheads. The proposed DFW-PVNet first assigns potential weights to pixels at different positions by utilizing data field theory and then selects the pixels with higher potential weights to participate in the voting and locating of 2D keypoints. By building accurate sparse 2D-3D correspondences between the located 2D keypoints and the corresponding predefined 3D keypoints, the 6D pose of the object can be calculated through a perspective-n-point (PnP) solver. Experiments are conducted based on the LINEMOD and the Occlusion LINEMOD datasets, and the results show that the accuracy of the proposed method surpasses the state-of-the-art sparse-based methods and is comparable to dense-based methods but with significantly lower memory and computational overheads.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Real-time 3D reconstruction using point-dependent pose graph optimization framework

Article 29 January 2022

CMT-6D: a lightweight iterative 6DoF pose estimation network based on cross-modal Transformer

Article 17 June 2024

An Efficient and Consistent Solution to the PnP Problem

Discover the latest articles, news and stories from top researchers in related subjects.

Artificial Intelligence

Data availability

The datasets for this study can be found in the online repository [https://bop.felk.cvut.cz/datasets/].

References

Fan Z, Zhu Y, He Y et al (2022) Deep learning on monocular object pose detection and tracking: A comprehensive overview. ACM Comput Surv 55(4):1–40
Article MATH Google Scholar
Wang Y, Xie J, Cheng J, Dou L (2023) Review of object pose estimation in RGB images based on deep learning. J Comp Appl 43(8):2546
MATH Google Scholar
Gorschlüter F, Rojtberg P, Pöllabauer T (2022) A survey of 6d object detection based on 3d models for industrial applications. J Imaging 8(3):53
Article Google Scholar
Wang C, Xu D, Zhu Y, Martín-Martín R, Lu C, Fei-Fei L, Savarese S (2019) Densefusion: 6D object pose estimation by iterative dense fusion. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 3343–3352
He Y, Sun W, Huang H, Liu J, Fan H, Sun J (2020) PVN3D: A deep point-wise 3D keypoints voting network for 6DoF pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 11632–11641
He Y, Wang Y, Fan H, Sun J, Chen Q (2022) FS6D: Few-shot 6D pose estimation of novel objects. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 6814–6824
Brachmann E, Michel F, Krull A, Yang MY, Gumhold S (2016) Uncertainty-driven 6D pose estimation of objects and scenes from a single RGB image. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 3364–3372
Rad M, Lepetit V (2017) Bb8: A scalable, accurate, robust to partial occlusion method for predicting the 3D poses of challenging objects without using depth. In: Proceedings of the IEEE International Conference on Computer Vision, pp 3828–3836
Tekin B, Sinha SN, Fua P (2018) Real-time seamless single shot 6D object pose prediction. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 292–301
Zhou Y, Liu S (2022) Object pose estimation based on improved YOLOX algorithm. In: 2022 IEEE 11th Data Driven Control and Learning Systems Conference (DDCLS), pp 699–705. IEEE
Zhang S, Zhao W, Guan Z, Peng X, Peng J (2021) Keypoint-graph-driven learning framework for object pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 1065–1073
Oberweger M, Rad M, Lepetit V (2018) Making deep heatmaps robust to partial occlusions for 3D object pose estimation. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 119–134
Park K, Patten T, Vincze M (2019) Pix2pose: Pixel-wise coordinate regression of objects for 6D pose estimation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp 7668–7677s
Peng S, Liu Y, Huang Q, Zhou X, Bao H (2019) PVNet: Pixel-wise voting network for 6DoF object pose estimation. IEEE Trans Pattern Anal Mach Intell 14(8)
Li Z, Wang G, Ji X (2019) CDPN: Coordinates-based disentangled pose network for real-time RGB-based 6-DoF object pose estimation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp 7678–7687
Wang G, Manhardt F, Tombari F, Ji X (2021) GDR-Net: Geometry-guided direct regression network for monocular 6D object pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 16611–16621
Li D, Du Y (2017) Artificial intelligence with uncertainty. CRC Press
Book MATH Google Scholar
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 770–778
Xiao J, Hays J, Ehinger KA, Oliva A, Torralba A (2010) SUN database: Large-scale scene recognition from abbey to zoo. In: 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp 3485–3492. IEEE
Hinterstoisser S, Cagniart C, Ilic S, Sturm P, Navab N, Fua P, Lepetit V (2011) Gradient response maps for real-time detection of textureless objects. IEEE Trans Pattern Anal Mach Intell 34(5):876–888
Article Google Scholar
Hu Y, Hugonot J, Fua P, Salzmann M (2019) Segmentation-driven 6D object pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 3385–3394
Lepetit V, Moreno-Noguer F, Fua P (2009) EPnP: An accurate O(n) solution to the PnP problem. Int J Comput Vis 81:155–166
Article MATH Google Scholar
Zakharov S, Shugurov I, Ilic S (2019) DPOD: 6D pose object detector and refiner. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp 1941–1950
Hodan T, Barath D, Matas J (2020) EPOS: Estimating 6D pose of objects with symmetries. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 11703–11712
Redmon J, Farhadi A (2017) YOLO9000: Better, faster, stronger. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 7263–7271
Ge Z, Liu S, Wang F, Li Z, Sun J (2021) YOLOX: Exceeding YOLO series in 2021. arXiv preprint arXiv:2107.08430
Sener O, Koltun V (2018) Multi-task learning as multi-objective optimization. Advances in Neural Information Processing Systems 31
Guo, S., Hu, Y., Alvarez, J. M., & Salzmann, M. (2023). Knowledge distillation for 6d pose estimation by aligning distributions of local predictions. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 18633–18642
Xu, Y., Lin, K. Y., Zhang, G., Wang, X., & Li, H. (2024). Rnnpose: 6-dof object pose estimation via recurrent correspondence field estimation and pose optimization. IEEE Transactions on Pattern Analysis and Machine Intelligence
Li, F., Vutukur, S. R., Yu, H., Shugurov, I., Busam, B., Yang, S., & Ilic, S. (2023). Nerf-pose: A first-reconstruct-then-regress approach for weakly-supervised 6d object pose estimation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp 2123–2133
Liu P, Zhang Q, Cheng J (2024) Bdr6d: Bidirectional deep residual fusion network for 6d pose estimation. IEEE Trans Autom Sci Eng 21(2):1793–1804
Article MATH Google Scholar

Download references

Acknowledgements

This work was supported in part by National Natural Science Foundation of China (NSFC) under Grant No.61772061.

Author information

Authors and Affiliations

School of Computer Science (National Pilot Software Engineering School), Beijing University of Posts and Telecommunications, Beijing, 100876, China
Yinning Lu & Songwei Pei

Authors

Yinning Lu
View author publications
You can also search for this author in PubMed Google Scholar
Songwei Pei
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

Yinning Lu proposed the initial research idea, conducted the experiments, and wrote initial manuscript draft. Songwei Pei supervised the work and revised the manuscript.

Corresponding author

Correspondence to Songwei Pei.

Ethics declarations

Ethics approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Competing interests

The authors declare that there is no conflict of interest regarding the publication of this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article

Lu, Y., Pei, S. DFW-PVNet: data field weighting based pixel-wise voting network for effective 6D pose estimation. Appl Intell 55, 240 (2025). https://doi.org/10.1007/s10489-024-05942-9

Download citation

Accepted: 11 October 2024
Published: 30 December 2024
DOI: https://doi.org/10.1007/s10489-024-05942-9

Keywords

Access this article

Log in via an institution

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

DFW-PVNet: data field weighting based pixel-wise voting network for effective 6D pose estimation

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Real-time 3D reconstruction using point-dependent pose graph optimization framework

CMT-6D: a lightweight iterative 6DoF pose estimation network based on cross-modal Transformer

An Efficient and Consistent Solution to the PnP Problem

Data availability

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Ethics approval

Competing interests

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Keywords

Subscribe and save

Buy Now

Navigation

DFW-PVNet: data field weighting based pixel-wise voting network for effective 6D pose estimation

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Real-time 3D reconstruction using point-dependent pose graph optimization framework

CMT-6D: a lightweight iterative 6DoF pose estimation network based on cross-modal Transformer

An Efficient and Consistent Solution to the PnP Problem

Explore related subjects

Data availability

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Ethics approval

Competing interests

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Subscribe and save

Buy Now

Search

Navigation