Research article · DOI: 10.1145/3638884.3638912

Embodied Visual Navigation for Grasping

Published: 23 April 2024

Abstract

This paper presents a novel approach to robotic grasping that integrates embodied visual navigation with reinforcement learning. The primary objective is to determine the optimal location at which a robot should stand to grasp an object successfully. The work is motivated by a gap in the literature: navigation and grasping are often treated as separate problems, which leads to suboptimal performance. Our approach leverages multimodal sensory data, including RGB images, depth images, and semantic information, to guide the robot's navigation, and uses deep reinforcement learning so the robot can learn optimal navigation strategies from visual input. The effectiveness of this approach is demonstrated through experiments in simple and complex scenes with varying numbers of obstacles. The results show that our method achieves a high success rate and fast grasping speed across these scenarios, outperforming other methods. This work contributes to the field of robotic grasping by unifying embodied visual navigation with deep reinforcement learning and validating the combination through rigorous experiments.
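The core idea described above — using reinforcement learning to find a good standing position for grasping — can be illustrated with a minimal sketch. The example below uses tabular Q-learning on a toy grid world in which the agent must reach a cell adjacent to a target object while avoiding obstacles. The grid layout, reward values, and goal cells are invented for illustration; they are not the paper's actual environment, which uses deep RL on multimodal image input.

```python
import random

# Hypothetical 5x5 grid world: the robot must navigate to a cell adjacent to
# the target object (a good "standing position" for grasping), avoiding
# obstacles. All coordinates and rewards here are illustrative assumptions.
GRID = 5
OBSTACLES = {(1, 1), (2, 3), (3, 1)}
GOALS = {(3, 4), (4, 3)}  # cells from which the grasp is assumed reachable
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def step(state, action):
    """Apply an action; bumping into walls or obstacles leaves the state unchanged."""
    nxt = (state[0] + action[0], state[1] + action[1])
    if not (0 <= nxt[0] < GRID and 0 <= nxt[1] < GRID) or nxt in OBSTACLES:
        nxt = state
    if nxt in GOALS:
        return nxt, 10.0, True   # reached a graspable standing position
    return nxt, -0.1, False      # small step penalty encourages short paths

def train(episodes=2000, alpha=0.5, gamma=0.95, eps=0.1):
    """Tabular Q-learning with epsilon-greedy exploration."""
    q = {}  # Q-table: (state, action index) -> value
    for _ in range(episodes):
        state, done = (0, 0), False
        for _ in range(100):
            if done:
                break
            a = (random.randrange(4) if random.random() < eps
                 else max(range(4), key=lambda i: q.get((state, i), 0.0)))
            nxt, r, done = step(state, ACTIONS[a])
            best_next = max(q.get((nxt, i), 0.0) for i in range(4))
            q[(state, a)] = (1 - alpha) * q.get((state, a), 0.0) \
                + alpha * (r + gamma * best_next)
            state = nxt
    return q

def greedy_path(q, start=(0, 0), limit=50):
    """Roll out the learned greedy policy and return the visited states."""
    state, path = start, [start]
    for _ in range(limit):
        if state in GOALS:
            break
        a = max(range(4), key=lambda i: q.get((state, i), 0.0))
        state, _, _ = step(state, ACTIONS[a])
        path.append(state)
    return path

random.seed(0)
path = greedy_path(train())
print(path[-1] in GOALS)  # the learned policy should end on a graspable cell
```

In the paper's setting the Q-table would be replaced by a deep network consuming RGB, depth, and semantic observations, but the learning objective — reward the agent for reaching a pose from which the grasp succeeds — is the same in spirit.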



Published In

ICCIP '23: Proceedings of the 2023 9th International Conference on Communication and Information Processing
December 2023 · 648 pages
ISBN: 9798400708909
DOI: 10.1145/3638884

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

  1. artificial intelligence
  2. deep reinforcement learning
  3. robotics


Conference

ICCIP 2023 · Overall acceptance rate: 61 of 301 submissions, 20%
