Loop closure detection using CNN words

  • Original Research Paper
  • Published in Intelligent Service Robotics

Abstract

Loop closure detection (LCD) is crucial for the simultaneous localization and mapping (SLAM) system of an autonomous robot. Image features from a convolutional neural network (CNN) have been widely used for LCD in recent years. Instead of directly using the feature vectors to compute image similarity, we propose a novel, easy-to-implement method that organizes the CNN features to improve performance. In this method, the elements of feature maps from a higher layer of the CNN are clustered to generate CNN words (CNNW). To encode the spatial information of these words, we create word pairs (CNNWP) from single words, which further improves performance. In addition, traditional techniques from bag-of-words (BoW) methods are integrated into our approach. We also demonstrate that feature maps from lower layers can be used as descriptors for local region matching between images, which enables geometric verification of candidate loop closures, as in BoW methods. The experimental results demonstrate that our method substantially outperforms state-of-the-art methods that directly use CNN features for LCD.
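To make the pipeline in the abstract concrete, the Python sketch below shows one plausible reading of the CNN-word idea: the spatial cells of a higher-layer feature map are clustered into words, each image becomes a grid of word IDs, and adjacent words are counted as pairs to encode spatial layout. The (C, H, W) layout, the vocabulary size k, the adjacency-based pairing scheme, and all function names are illustrative assumptions, not the paper's exact formulation.

import numpy as np
from sklearn.cluster import MiniBatchKMeans

def feature_map_to_descriptors(fmap):
    # Treat each spatial cell of a (C, H, W) feature map as a C-dim descriptor.
    c, h, w = fmap.shape
    return fmap.reshape(c, h * w).T  # (H*W, C)

def build_vocabulary(fmaps, k=64):
    # Cluster descriptors pooled from many training images into k CNN words.
    # k is an illustrative choice, not a value from the paper.
    descs = np.vstack([feature_map_to_descriptors(f) for f in fmaps])
    return MiniBatchKMeans(n_clusters=k, random_state=0).fit(descs)

def image_to_word_grid(fmap, vocab):
    # Assign every spatial cell of one image's feature map to its nearest word.
    _, h, w = fmap.shape
    labels = vocab.predict(feature_map_to_descriptors(fmap))
    return labels.reshape(h, w)

def word_pair_histogram(word_grid, k):
    # Encode coarse spatial layout by counting horizontally and vertically
    # adjacent word pairs (an assumed pairing scheme).
    hist = np.zeros(k * k)
    h, w = word_grid.shape
    for y in range(h):
        for x in range(w):
            if x + 1 < w:
                hist[word_grid[y, x] * k + word_grid[y, x + 1]] += 1
            if y + 1 < h:
                hist[word_grid[y, x] * k + word_grid[y + 1, x]] += 1
    return hist / max(hist.sum(), 1.0)  # normalized CNNWP histogram

def cosine_similarity(a, b):
    # Image similarity between two word-pair histograms.
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

In such a scheme, a query image's word-pair histogram would be compared against the histograms of previously visited places, and high-scoring candidates would then undergo geometric verification using lower-layer feature maps as local region descriptors.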



Acknowledgments

The authors express their sincere appreciation to the editors and reviewers for their efforts to improve this paper. We also thank Arren Glover, Mark Cummins, and José-Luis Blanco for providing the Garden Point, City Center, New College, and Malaga Parking 6L datasets.

Author information

Corresponding author

Correspondence to Fuhai Duan.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Liu, Q., Duan, F. Loop closure detection using CNN words. Intel Serv Robotics 12, 303–318 (2019). https://doi.org/10.1007/s11370-019-00284-9

