Graph-Based Place Recognition in Image Sequences with CNN Features

Journal of Intelligent & Robotic Systems

Abstract

Visual place recognition is a critical and challenging problem in both the robotics and computer vision communities. In this paper, we focus on place recognition for visual Simultaneous Localization and Mapping (vSLAM) systems. These systems have long been limited to paradigms based on handcrafted features, which normally use only local visual information and are not sufficiently robust to variations in the images. In this work, we address place recognition with features learned automatically from data. First, we propose a graph-based visual place recognition method. The graph is constructed by combining visual features extracted from convolutional neural networks (CNNs) with the temporal information of the images in a sequence. Second, we propose employing a diffusion process to enhance data association in the graph and achieve more accurate recognition results. Finally, we evaluate the proposed method on four commonly used datasets. Experimental results indicate that the proposed method obtains significantly better performance (e.g., 95.37% recall at 100% precision) than FAB-MAP (47.16% recall at 100% precision), a commonly used place recognition method based on handcrafted features, especially on some challenging datasets.
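The headline numbers above are reported as recall at 100% precision. As a minimal sketch of how such a figure is commonly computed (not the authors' evaluation code), the function below takes a similarity score and a ground-truth label for each candidate match and returns the largest recall reachable while every accepted match is correct. The function name, argument names, and the toy data are all illustrative assumptions.

```python
def recall_at_full_precision(scores, is_true_match):
    """Highest recall reachable while every accepted match is correct.

    scores        -- similarity score per candidate match (higher = more confident)
    is_true_match -- ground-truth flag per candidate (True if it is a real revisit)
    Both names are hypothetical; the source does not define an interface.
    """
    total_true = sum(1 for ok in is_true_match if ok)
    if total_true == 0:
        raise ValueError("no ground-truth matches, recall is undefined")

    # The best-scoring wrong candidate caps the acceptance threshold:
    # accepting anything at or below its score would admit a false positive.
    false_scores = [s for s, ok in zip(scores, is_true_match) if not ok]
    best_false = max(false_scores, default=float("-inf"))

    # Count true matches that score strictly above every wrong candidate.
    accepted = sum(
        1 for s, ok in zip(scores, is_true_match) if ok and s > best_false
    )
    return accepted / total_true


# Toy example: three true matches, two of which outrank every false one.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
toy_labels = [True, True, False, True, False]
toy_recall = recall_at_full_precision(toy_scores, toy_labels)
```

Ties are resolved conservatively here: a true match that shares its score with a false one is not counted, since no threshold could accept one without the other.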

Author information

Correspondence to Lei Wang or Yan Su.

Additional information

The work in this paper was conducted during Xiwu Zhang’s visit to the University of Wollongong, Australia.

About this article

Cite this article

Zhang, X., Wang, L., Zhao, Y. et al. Graph-Based Place Recognition in Image Sequences with CNN Features. J Intell Robot Syst 95, 389–403 (2019). https://doi.org/10.1007/s10846-018-0917-2

  • DOI: https://doi.org/10.1007/s10846-018-0917-2
