Vision-Based Categorical Object Pose Estimation and Manipulation

  • Conference paper
Intelligent Robotics and Applications (ICIRA 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14267)

Abstract

Object manipulation and environment interaction are of great significance for intelligent robots, especially service robots working in unstructured household and office scenarios. This paper proposes a novel approach for grasping and manipulating unseen objects at the category level. Unlike recently popular end-to-end reinforcement learning methods, we develop models that abstract target objects into geometric primitives, and use these primitives to estimate object pose and generate task-oriented grasp points. This design emphasizes visual perception in guiding robotic manipulation, thereby enhancing model interpretability and reliability during implementation. We also conduct object grasping experiments in both simulation and real-world settings, which further verify the effectiveness and superiority of our method.

Acknowledgment

Supported by the Key Research Project of Zhejiang Lab (No. G2021NB0AL03).

Author information

Corresponding author

Correspondence to Wei Song.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Meng, Q. et al. (2023). Vision-Based Categorical Object Pose Estimation and Manipulation. In: Yang, H., et al. Intelligent Robotics and Applications. ICIRA 2023. Lecture Notes in Computer Science, vol 14267. Springer, Singapore. https://doi.org/10.1007/978-981-99-6483-3_13

  • DOI: https://doi.org/10.1007/978-981-99-6483-3_13

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-6482-6

  • Online ISBN: 978-981-99-6483-3

  • eBook Packages: Computer Science, Computer Science (R0)
