Abstract
Robot localization, the task of determining a robot's current pose, is a crucial problem in mobile robotics. Visual-based robot localization, which uses only cameras as exteroceptive sensors, has become extremely popular due to the relatively low cost of cameras. However, current approaches such as Bayes-filter-based methods and visual odometry require knowledge of the prior location and also rely on feature points in images. This paper presents a novel semi-supervised learning method based on the Variational Autoencoder (VAE) for visual-based robot localization that relies on neither the prior location nor feature points. Because our method needs no prior knowledge, it can also be used to correct dead reckoning. We adopt a VAE as an unsupervised learning method to preprocess the environment images, followed by a supervised learning model that learns the mapping between the processed images and the robot's location. One merit of the proposed approach is therefore that it can adopt any state-of-the-art supervised learning model. Furthermore, this semi-supervised learning scheme is also suitable for improving other supervised learning problems: extra unlabeled data can be added to the training set and the problem solved in a semi-supervised manner. We show that this scheme achieves high accuracy for pose prediction using a surprisingly small number of labeled images compared to other machine learning methods.
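The two-stage scheme described in the abstract can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's implementation: the VAE encoder is stood in for by random linear maps (in the actual method it would be trained on unlabeled images), the poses are placeholder labels, and the supervised stage is a simple ridge regression from latent codes to pose, chosen only because any supervised model can be plugged in here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a VAE encoder trained on unlabeled images:
# maps a flattened image to the mean and log-variance of a latent Gaussian.
def vae_encode(images, W_mu, W_logvar):
    mu = images @ W_mu          # latent means
    logvar = images @ W_logvar  # latent log-variances
    return mu, logvar

def reparameterize(mu, logvar, rng):
    # z = mu + sigma * eps, the standard VAE reparameterization trick
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

# Toy data: 200 unlabeled "images", of which only 20 carry pose labels.
n_unlabeled, n_labeled, img_dim, latent_dim = 200, 20, 64, 8
images = rng.standard_normal((n_unlabeled, img_dim))
W_mu = rng.standard_normal((img_dim, latent_dim)) * 0.1
W_logvar = rng.standard_normal((img_dim, latent_dim)) * 0.01

# Stage 1 (unsupervised): encode all images into the latent space.
mu, logvar = vae_encode(images, W_mu, W_logvar)
z = reparameterize(mu, logvar, rng)

# Stage 2 (supervised): fit a regressor from latent codes to (x, y, theta)
# poses using only the small labeled subset. Here: ridge regression.
poses = rng.standard_normal((n_labeled, 3))   # placeholder pose labels
Z_l = z[:n_labeled]
lam = 1e-3
W_pose = np.linalg.solve(Z_l.T @ Z_l + lam * np.eye(latent_dim),
                         Z_l.T @ poses)

# Predict poses from the encoder mean (no sampling needed at test time).
pred = mu[:n_labeled] @ W_pose
print(pred.shape)
```

The key point the sketch illustrates is the division of labor: the encoder is trained without pose labels, so only the final, low-dimensional regression consumes the scarce labeled images.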
Copyright information
© 2022 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Liang, K., He, F., Zhu, Y., Gao, X. (2022). A Semi-supervised Learning Based on Variational Autoencoder for Visual-Based Robot Localization. In: Sun, Y., et al. Computer Supported Cooperative Work and Social Computing. ChineseCSCW 2021. Communications in Computer and Information Science, vol 1491. Springer, Singapore. https://doi.org/10.1007/978-981-19-4546-5_48
DOI: https://doi.org/10.1007/978-981-19-4546-5_48
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-4545-8
Online ISBN: 978-981-19-4546-5