6D Object Pose Estimation with Attention Aware Bi-gated Fusion

  • Conference paper
Neural Information Processing (ICONIP 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14448)

Abstract

Accurate object pose estimation is a prerequisite for successful robotic grasping. Keypoint-based pose estimation methods that use RGB-D data have shown promising results in simple environments, but fusing the complementary features of RGB-D data remains challenging. To this end, this paper proposes A2BNet, a two-branch network with an attention-aware bi-gated fusion (A2BF) module for keypoint-based 6D object pose estimation. The A2BF module consists of two key components, a bidirectional gated fusion module and an attention mechanism module, which together extract information from both RGB and point cloud data, prioritizing crucial details while disregarding irrelevant information. Several A2BF modules can be embedded in the network to generate complementary texture and geometric information. Extensive experiments are conducted on the public LineMOD and Occlusion LineMOD datasets. The proposed method reaches an average accuracy of 99.8% and 67.6% on the two datasets, respectively, outperforming state-of-the-art methods.
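The abstract does not spell out how the gates or the attention are computed, so the following is only a minimal sketch of one common way to combine bidirectional gating with channel attention, not the authors' implementation. It assumes per-point feature vectors of equal width from the RGB and point cloud branches, sigmoid gates computed from the concatenated features, and a squeeze-and-excitation style channel attention. All names (`bi_gated_fusion`, `channel_attention`, `init_params`, and the parameter keys) are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def linear(vec, weight):
    """Multiply a length-I vector by an I x O weight matrix (list of rows)."""
    return [sum(v * row[j] for v, row in zip(vec, weight))
            for j in range(len(weight[0]))]


def channel_attention(feats, w_down, w_up):
    """Assumed SE-style attention: reweight channels using a global summary."""
    n, c = len(feats), len(feats[0])
    squeezed = [sum(f[k] for f in feats) / n for k in range(c)]  # mean over points
    hidden = [max(h, 0.0) for h in linear(squeezed, w_down)]     # ReLU bottleneck
    scale = [sigmoid(s) for s in linear(hidden, w_up)]           # weights in (0, 1)
    return [[x * s for x, s in zip(f, scale)] for f in feats]


def bi_gated_fusion(rgb, geo, p):
    """Exchange features in both directions through learned gates (assumed form)."""
    rgb_fused, geo_fused = [], []
    for r, g in zip(rgb, geo):
        joint = r + g  # list concatenation: both modalities drive each gate
        gate_r = [sigmoid(z) for z in linear(joint, p["gate_rgb"])]
        gate_g = [sigmoid(z) for z in linear(joint, p["gate_geo"])]
        # Each branch keeps its own features and admits a gated share of the other.
        rgb_fused.append([ri + a * gi for ri, gi, a in zip(r, g, gate_r)])
        geo_fused.append([gi + b * ri for ri, gi, b in zip(r, g, gate_g)])
    return (channel_attention(rgb_fused, p["att_rgb_down"], p["att_rgb_up"]),
            channel_attention(geo_fused, p["att_geo_down"], p["att_geo_up"]))


def init_params(c, reduction=2, seed=0, scale=0.1):
    """Random illustrative weights; a real network would learn these."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-scale, scale) for _ in range(cols)]
                for _ in range(rows)]

    h = max(c // reduction, 1)
    return {"gate_rgb": mat(2 * c, c), "gate_geo": mat(2 * c, c),
            "att_rgb_down": mat(c, h), "att_rgb_up": mat(h, c),
            "att_geo_down": mat(c, h), "att_geo_up": mat(h, c)}


# Three points with four channels per branch (toy values).
rgb = [[1.0] * 4 for _ in range(3)]
geo = [[2.0] * 4 for _ in range(3)]
rgb_out, geo_out = bi_gated_fusion(rgb, geo, init_params(4))
```

Both outputs keep the input shape, so a block of this kind could in principle be stacked several times between the two branches, which is consistent with the abstract's remark that several A2BF modules can be embedded in the network.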

Supported by the Natural Science Foundation of China (62272322, 62002246, 62272323), the Project of Beijing Municipal Education Commission (KM202010028010), and the Applied Basic Research Project of Liaoning Province (2022JH2/101300279).

Author information

Corresponding author

Correspondence to Zhenzhou Shao.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Wang, L., Lu, W., Tian, Y., Guan, Y., Shao, Z., Shi, Z. (2024). 6D Object Pose Estimation with Attention Aware Bi-gated Fusion. In: Luo, B., Cheng, L., Wu, ZG., Li, H., Li, C. (eds) Neural Information Processing. ICONIP 2023. Lecture Notes in Computer Science, vol 14448. Springer, Singapore. https://doi.org/10.1007/978-981-99-8082-6_44

  • DOI: https://doi.org/10.1007/978-981-99-8082-6_44

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8081-9

  • Online ISBN: 978-981-99-8082-6

  • eBook Packages: Computer Science, Computer Science (R0)
