Abstract
In this paper we propose a deep neural network based algorithm for indoor place recognition. It uses transfer learning to retrain VGG-F, a pretrained convolutional neural network, to classify places in images acquired by a humanoid robot. The network was trained and evaluated on a dataset of 8000 images recorded in sixteen rooms. The dataset is freely available from our website. We demonstrate experimentally that the proposed algorithm considerably outperforms BoW (bag-of-words) algorithms, which are frequently used in loop closure. It also outperforms an algorithm in which features extracted by the FC-6 layer of VGG-F are classified by a linear SVM.
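The transfer-learning idea in the abstract (keep the pretrained layers, retrain a new classification layer for the sixteen rooms) can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: a fixed random projection with ReLU stands in for the pretrained VGG-F layers, the "images" are synthetic vectors, and all names and sizes (`INPUT_DIM`, `FEATURE_DIM`, `make_frozen_layer`, `train_head`, the learning rate and epoch count) are assumptions chosen only for illustration.

```python
import math
import random

random.seed(0)

NUM_ROOMS = 16    # from the abstract: images were recorded in sixteen rooms
INPUT_DIM = 20    # assumption: size of a toy "image" vector
FEATURE_DIM = 32  # assumption: toy stand-in for the size of a real FC layer


def make_frozen_layer(input_dim, feature_dim):
    """Fixed random weights standing in for pretrained layers (never updated)."""
    scale = 1.0 / math.sqrt(input_dim)
    return [[random.gauss(0.0, scale) for _ in range(input_dim)]
            for _ in range(feature_dim)]


def extract_features(frozen, image):
    """Forward pass through the frozen layer: linear map followed by ReLU."""
    return [max(0.0, sum(w * x for w, x in zip(row, image))) for row in frozen]


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def scores(W, b, feat):
    """Class scores of the new (trainable) classification layer."""
    return [sum(w * f for w, f in zip(W[c], feat)) + b[c] for c in range(len(W))]


def train_head(feats, labels, num_classes, epochs=25, lr=0.05):
    """Train only the new layer with SGD on the cross-entropy loss."""
    W = [[0.0] * len(feats[0]) for _ in range(num_classes)]
    b = [0.0] * num_classes
    for _ in range(epochs):
        for feat, label in zip(feats, labels):
            probs = softmax(scores(W, b, feat))
            for c in range(num_classes):
                grad = probs[c] - (1.0 if c == label else 0.0)
                b[c] -= lr * grad
                W[c] = [w - lr * grad * f for w, f in zip(W[c], feat)]
    return W, b


def predict(W, b, feat):
    """Index of the highest-scoring room."""
    s = scores(W, b, feat)
    return max(range(len(s)), key=s.__getitem__)


def sample_image(prototype, noise=0.1):
    """Synthetic 'image': a room prototype plus Gaussian noise (assumption)."""
    return [p + random.gauss(0.0, noise) for p in prototype]


prototypes = [[random.gauss(0.0, 1.0) for _ in range(INPUT_DIM)]
              for _ in range(NUM_ROOMS)]
frozen = make_frozen_layer(INPUT_DIM, FEATURE_DIM)

train_set = [(extract_features(frozen, sample_image(p)), room)
             for room, p in enumerate(prototypes) for _ in range(8)]
held_out = [(extract_features(frozen, sample_image(p)), room)
            for room, p in enumerate(prototypes) for _ in range(4)]
random.shuffle(train_set)

W, b = train_head([f for f, _ in train_set], [y for _, y in train_set], NUM_ROOMS)
accuracy = sum(predict(W, b, f) == y for f, y in held_out) / len(held_out)
print(f"held-out accuracy on synthetic rooms: {accuracy:.2f}")
```

Only `W` and `b` change during training; `frozen` is never touched, which is the essential property of this form of transfer learning. In a real setting the frozen part would be the pretrained network and some of its layers could also be fine-tuned, which this sketch does not attempt.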
Keywords
- Visual Place Recognition
- Deep Neural Networks
- Humanoid Robot
- Simultaneous Localization And Mapping (SLAM)
- Visual SLAM
Acknowledgment
This work was supported by the Polish National Science Center (NCN) under research grant 2014/15/B/ST6/02808.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Wozniak, P., Afrisal, H., Esparza, R.G., Kwolek, B. (2018). Scene Recognition for Indoor Localization of Mobile Robots Using Deep CNN. In: Chmielewski, L., Kozera, R., Orłowski, A., Wojciechowski, K., Bruckstein, A., Petkov, N. (eds) Computer Vision and Graphics. ICCVG 2018. Lecture Notes in Computer Science, vol 11114. Springer, Cham. https://doi.org/10.1007/978-3-030-00692-1_13
DOI: https://doi.org/10.1007/978-3-030-00692-1_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00691-4
Online ISBN: 978-3-030-00692-1
eBook Packages: Computer Science, Computer Science (R0)