DOI: 10.1145/3342999.3343007

End-to-End Vision-to-Motion Model with Auxiliary Segmentation Module for Indoor Navigation

Published: 05 July 2019

Abstract

Most vision-based navigation algorithms consist of two independent sub-algorithms: an environment perception algorithm and a motion planning algorithm. Environment perception involves positioning the navigating robot and constructing obstacle-avoidance maps, from which the planning algorithm derives motion decisions. This type of algorithm requires special equipment such as a laser camera, and the hierarchical design tends to amplify errors passed from upstream to downstream algorithms. To tackle these problems, an end-to-end deep learning model is proposed that maps vision directly to motion in unconstructed indoor environments. To implicitly guide the model to learn an internal obstacle-avoidance capability from prior map information, an auxiliary semantic segmentation module is added to the backbone network, yielding a multi-task loss for optimization. Finally, transfer learning is applied to extend the training data with an open-source dataset and strengthen the generalization capability of the model. Experimental results demonstrate that the proposed framework is accurate, efficient, and feasible in real applications.
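
As a concrete illustration of the multi-task design described in the abstract, the minimal Keras sketch below builds a shared convolutional backbone with two heads: a motion head that predicts a discrete motion command directly from the input image, and an auxiliary segmentation head whose loss is down-weighted during training. The VGG16 backbone, the three-way motion command set, the two segmentation classes, and the 0.4 auxiliary loss weight are illustrative assumptions, not the configuration reported in the paper.

```python
# Hypothetical sketch of a two-head vision-to-motion network with an
# auxiliary segmentation branch and a weighted multi-task loss.
# Backbone choice, head sizes, class counts and loss weights are assumed.
import tensorflow as tf
from tensorflow.keras import layers, Model

NUM_MOTION_CLASSES = 3   # e.g. turn-left / go-straight / turn-right (assumed)
NUM_SEG_CLASSES = 2      # e.g. free space vs. obstacle (assumed)

def build_model(input_shape=(224, 224, 3)):
    inputs = layers.Input(shape=input_shape)

    # Shared VGG16 backbone pre-trained on ImageNet (transfer learning).
    backbone = tf.keras.applications.VGG16(
        include_top=False, weights="imagenet", input_tensor=inputs)
    features = backbone.output                     # 7x7x512 feature map

    # Motion head: maps shared features directly to a motion command.
    x = layers.GlobalAveragePooling2D()(features)
    x = layers.Dense(256, activation="relu")(x)
    motion = layers.Dense(NUM_MOTION_CLASSES, activation="softmax",
                          name="motion")(x)

    # Auxiliary segmentation head: FCN-style 1x1 convolution plus upsampling
    # back to the input resolution; it only shapes the shared features.
    s = layers.Conv2D(NUM_SEG_CLASSES, 1)(features)
    s = layers.UpSampling2D(size=32, interpolation="bilinear")(s)
    seg = layers.Softmax(name="segmentation")(s)

    return Model(inputs, [motion, seg])

model = build_model()
# Multi-task loss: motion loss plus a down-weighted auxiliary segmentation loss.
model.compile(
    optimizer="adam",
    loss={"motion": "categorical_crossentropy",
          "segmentation": "categorical_crossentropy"},
    loss_weights={"motion": 1.0, "segmentation": 0.4})  # 0.4 is an assumed weight
```

At inference time only the motion output would be used; in this reading of the abstract, the segmentation head acts purely as a training-time regularizer that injects map-like prior knowledge into the shared features.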


Cited By

  • (2020) Semantic Segmentation to Develop an Indoor Navigation System for an Autonomous Mobile Robot. Mathematics, 8(5): 855. DOI: 10.3390/math8050855. Online publication date: 25-May-2020.

Published In

ICDLT '19: Proceedings of the 2019 3rd International Conference on Deep Learning Technologies
July 2019
106 pages
ISBN:9781450371605
DOI:10.1145/3342999

In-Cooperation

  • Nanyang Technological University
  • Chongqing University of Posts and Telecommunications

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Convolutional neural networks
  2. Indoor navigation
  3. Semantic segmentation
  4. Vision-based navigation

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICDLT 2019
