Abstract
Visual Simultaneous Localization and Mapping (vSLAM) is expected to advance Smart City initiatives such as driverless cars and intelligent robots. Loop closure detection (LCD) is an important module in a vSLAM system. Existing approaches based on convolutional neural networks improve feature extraction, but this alone is not enough: given the characteristics of LCD, a customized loss function and a method for constructing suitable training image sets are of great significance. Motivated by this, we propose a novel framework for LCD. Through a deep analysis of the distance relationships in the LCD problem, we propose a multi-tuplet clusters loss function together with a mini-batch construction scheme. The proposed framework maps images to a low-dimensional space and extracts more discriminative image features, which capture the essential distance relationships of the LCD problem. Extensive evaluations demonstrate that our method outperforms many state-of-the-art approaches, even in complex environments with strong appearance changes. Importantly, although the training process is computationally demanding, its online application is very efficient.
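The multi-tuplet clusters loss itself is defined in the full paper; as a rough illustration of the loss family it builds on, below is a minimal NumPy sketch of an N-pair-style tuplet loss, which pulls an anchor embedding toward one positive (a revisited place) while simultaneously pushing it away from several negatives (distinct places). The function name and vector shapes are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def tuplet_loss(anchor, positive, negatives):
    """N-pair-style tuplet loss over L2-normalized embedding vectors.

    anchor, positive: shape (d,) embeddings of the same place.
    negatives: shape (k, d) embeddings of k different places.
    The loss log(1 + sum_i exp(s_neg_i - s_pos)) is small when the
    anchor is more similar to the positive than to every negative.
    """
    pos_sim = anchor @ positive      # similarity to the positive
    neg_sims = negatives @ anchor    # similarity to each of the k negatives
    return np.log1p(np.sum(np.exp(neg_sims - pos_sim)))
```

In this formulation each mini-batch tuple contributes one anchor, one positive, and multiple negatives at once, which is what motivates a dedicated mini-batch construction scheme: batches must be assembled so that every image has both a same-place partner and several different-place images available.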
Additional information
Guest Editors: Mohamed Elhoseny, Xiaohui Yuan, and Saru Kumari
This article is part of the Topical Collection: Special Issue on Future Networking Applications Plethora for Smart Cities
Cite this article
Jin, S., Gao, Y. & Chen, L. Improved Deep Distance Learning for Visual Loop Closure Detection in Smart City. Peer-to-Peer Netw. Appl. 13, 1260–1271 (2020). https://doi.org/10.1007/s12083-019-00861-w