
A Semi-supervised Learning Based on Variational Autoencoder for Visual-Based Robot Localization

  • Conference paper
Computer Supported Cooperative Work and Social Computing (ChineseCSCW 2021)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1491)


Abstract

Robot localization, the task of determining a robot's current pose, is a crucial problem in mobile robotics. Visual-based robot localization, which uses only cameras as exteroceptive sensors, has become extremely popular because cameras are relatively inexpensive. However, current approaches such as Bayes-filter-based methods and visual odometry require knowledge of the prior location and also rely on feature points in images. This paper presents a novel semi-supervised learning method based on the Variational Autoencoder (VAE) for visual-based robot localization that depends on neither a prior location nor feature points. Because it needs no prior knowledge, the method can also serve as a correction for dead reckoning. We adopt a VAE as an unsupervised learning method to preprocess the environment images, followed by a supervised learning model that learns the mapping between the processed images and the robot's location. One merit of the proposed approach is therefore that it can adopt any state-of-the-art supervised learning model. Furthermore, this semi-supervised scheme is also suitable for improving other supervised learning problems: extra unlabeled data are added to the training set and the problem is solved in a semi-supervised manner. We show that this scheme achieves high pose-prediction accuracy from a surprisingly small number of labeled images compared with other machine learning methods.
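The two-stage pipeline the abstract describes — unsupervised representation learning on many unlabeled images, then supervised regression from latent codes to pose using only a few labeled pairs — can be illustrated with a toy sketch. This is not the paper's implementation: a PCA projection stands in for the VAE encoder, the "images" are synthetic 1-D intensity profiles, and all dimensions and sample counts are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def render(pose):
    """Toy 'camera image': 64 pixel intensities that vary with a 2-D pose."""
    xs = np.linspace(0.0, 1.0, 64)
    return pose[0] * np.sin(6.0 * xs) + pose[1] * np.cos(4.0 * xs)

# Stage 1 (unsupervised): learn a low-dimensional code from unlabeled images.
# PCA via SVD stands in here for the VAE encoder of the paper.
unlabeled_poses = rng.uniform(0.0, 1.0, size=(500, 2))   # never shown to stage 2
X_u = np.stack([render(p) for p in unlabeled_poses])
mean = X_u.mean(axis=0)
_, _, Vt = np.linalg.svd(X_u - mean, full_matrices=False)
encode = lambda X: (X - mean) @ Vt[:8].T                 # 8-dim latent code

# Stage 2 (supervised): regress pose from latent codes with few labeled pairs.
labeled_poses = rng.uniform(0.0, 1.0, size=(30, 2))
X_l = np.stack([render(p) for p in labeled_poses])
Z_l = encode(X_l)
A = np.hstack([Z_l, np.ones((len(Z_l), 1))])             # append bias column
W, *_ = np.linalg.lstsq(A, labeled_poses, rcond=None)

def predict_pose(image):
    z = encode(image[None])
    return (np.hstack([z, np.ones((1, 1))]) @ W)[0]

# Localize from a single unseen image.
true_pose = np.array([0.4, 0.7])
err = float(np.linalg.norm(predict_pose(render(true_pose)) - true_pose))
print(f"pose error: {err:.2e}")
```

Because the toy image model is linear in the pose, the linear regressor recovers the pose almost exactly; the point of the sketch is the division of labor — 500 unlabeled images train the encoder, while only 30 labeled pairs train the pose regressor, mirroring the semi-supervised claim in the abstract.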



Author information

Correspondence to Fazhi He.


Copyright information

© 2022 Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Liang, K., He, F., Zhu, Y., Gao, X. (2022). A Semi-supervised Learning Based on Variational Autoencoder for Visual-Based Robot Localization. In: Sun, Y., et al. Computer Supported Cooperative Work and Social Computing. ChineseCSCW 2021. Communications in Computer and Information Science, vol 1491. Springer, Singapore. https://doi.org/10.1007/978-981-19-4546-5_48

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-19-4545-8

  • Online ISBN: 978-981-19-4546-5

