Abstract
This work presents a study on the use of a deep learning tool to tackle visual localization. The proposed approach consists in developing a Convolutional Neural Network (CNN) to address the room retrieval task. Additionally, the network can be used to extract holistic descriptors from its intermediate layers. Localization can then be carried out either by means of comparing the holistic descriptor obtained during the localization process with the descriptors obtained during the mapping process, or by using a hierarchical strategy. Concerning the hierarchical localization approach, previous works have addressed it by means of a nearest-neighbour search using different layers of information. In the present work, a coarse step is addressed first, which consists in solving the room retrieval with the CNN; after that, a nearest-neighbour search is carried out using the holistic descriptors contained in the selected room. Hence, this work evaluates, first, the validity of the holistic descriptors extracted from the CNN and, second, the hierarchical method based on the CNN tool. The experiments to evaluate the proposed methods are carried out with an indoor dataset captured under real-operation conditions. The results show that the proposed approach based on deep learning is a robust solution to tackle the visual localization task.
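The two-step hierarchical strategy described above (coarse room retrieval, then a fine nearest-neighbour search restricted to the selected room) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the descriptors are random placeholders standing in for CNN intermediate-layer features, and `predict_room` substitutes a nearest-centroid rule for the CNN classifier; all names (`map_descriptors`, `hierarchical_localize`, ...) are hypothetical.

```python
import numpy as np

# Toy map: holistic descriptors grouped by room. In the paper these would be
# extracted from intermediate CNN layers during the mapping process; here they
# are random placeholders (room_B is offset so the rooms are separable).
rng = np.random.default_rng(0)
map_descriptors = {
    "room_A": rng.normal(size=(50, 128)),
    "room_B": rng.normal(size=(50, 128)) + 5.0,
}

def predict_room(query, rooms):
    """Coarse step stand-in: pick the room with the closest mean descriptor.
    In the actual approach this decision is made by the CNN room classifier."""
    centroids = {r: d.mean(axis=0) for r, d in rooms.items()}
    return min(centroids, key=lambda r: np.linalg.norm(query - centroids[r]))

def hierarchical_localize(query, rooms):
    # Coarse step: room retrieval.
    room = predict_room(query, rooms)
    # Fine step: nearest-neighbour search only among that room's descriptors,
    # instead of over the whole map.
    dists = np.linalg.norm(rooms[room] - query, axis=1)
    return room, int(np.argmin(dists))

# Query image descriptor: a slightly perturbed copy of map descriptor 3 of room_A.
query = map_descriptors["room_A"][3] + 0.01 * rng.normal(size=128)
room, idx = hierarchical_localize(query, map_descriptors)
print(room, idx)
```

Restricting the fine search to one room is what makes the hierarchical scheme cheaper than a flat nearest-neighbour search over every descriptor in the map.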
This work has been supported by the Generalitat Valenciana and the FSE through the grants ACIF/2017/146 and ACIF/2018/224, by the Spanish government through the project DPI 2016-78361-R (AEI/FEDER, UE): “Creación de mapas mediante métodos de apariencia visual para la navegación de robots.” and by Generalitat Valenciana through the project AICO/2019/031: “Creación de modelos jerárquicos y localización robusta de robots móviles en entornos sociales”.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Cebollada, S., Payá, L., Flores, M., Román, V., Peidró, A., Reinoso, O. (2022). A Localization Approach Based on Omnidirectional Vision and Deep Learning. In: Gusikhin, O., Madani, K., Zaytoon, J. (eds) Informatics in Control, Automation and Robotics. ICINCO 2020. Lecture Notes in Electrical Engineering, vol 793. Springer, Cham. https://doi.org/10.1007/978-3-030-92442-3_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-92441-6
Online ISBN: 978-3-030-92442-3
eBook Packages: Intelligent Technologies and Robotics (R0)