
VGG-CAE: Unsupervised Visual Place Recognition Using VGG16-Based Convolutional Autoencoder

  • Conference paper

Pattern Recognition and Computer Vision (PRCV 2021)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 13020)


Abstract

Visual Place Recognition (VPR) is a challenging task in Visual Simultaneous Localization and Mapping (VSLAM): it aims to match pairs of images that depict the same place under different conditions. Although most methods based on Convolutional Neural Networks (CNNs) perform well, they require a large number of annotated images for supervised training, which is time- and labor-consuming. To train a CNN in an unsupervised way and achieve better performance, we propose a new place recognition method in this paper. We design a VGG16-based Convolutional Autoencoder (VGG-CAE) that uses the features output by VGG16 as the labels of the images. VGG-CAE thus learns a latent representation from these labels, improving robustness against appearance and viewpoint variation. When VGG-CAE is deployed, features are extracted from the query and reference images and post-processed, the cosine similarities between the features are computed, and a matrix for feature matching is formed accordingly. To verify the performance of our method, we conducted experiments on several public datasets; our method achieves competitive results compared with existing approaches.
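To make the training scheme concrete, below is a minimal Keras-style sketch of the VGG-CAE idea. The layer sizes, the choice of VGG16 output layer, and all names are illustrative assumptions rather than the authors' exact architecture: a frozen VGG16 produces the feature "labels", and a convolutional autoencoder is trained to regress them, so no human annotation is required.

```python
import numpy as np
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import VGG16

IMG_SHAPE = (224, 224, 3)  # assumed input size

# Frozen VGG16 supplies the training targets ("labels"). We assume the
# final conv-block output (7x7x512); the paper's layer choice may differ.
vgg = VGG16(include_top=False, weights="imagenet", input_shape=IMG_SHAPE)
vgg.trainable = False

# Convolutional encoder -> latent code -> decoder regressing the VGG16
# feature map. The latent code serves as the compact place descriptor.
inp = layers.Input(shape=IMG_SHAPE)
x = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(inp)
x = layers.Conv2D(64, 3, strides=2, padding="same", activation="relu")(x)
x = layers.Conv2D(128, 3, strides=2, padding="same", activation="relu")(x)
latent = layers.Conv2D(256, 3, strides=2, padding="same",
                       activation="relu", name="latent")(x)   # 14x14x256
y = layers.Conv2D(512, 3, padding="same", activation="relu")(latent)
y = layers.AveragePooling2D(2)(y)                             # 7x7x512
out = layers.Conv2D(512, 3, padding="same")(y)                # match target shape

cae = Model(inp, out, name="vgg_cae")
cae.compile(optimizer="adam", loss="mse")  # regress the VGG16 features

# Unsupervised training: targets come from the frozen VGG16, so no labels
# are hand-annotated. Real images should be preprocessed with
# tensorflow.keras.applications.vgg16.preprocess_input; random data is a stand-in.
images = np.random.rand(8, *IMG_SHAPE).astype("float32")
targets = vgg.predict(images)
cae.fit(images, targets, epochs=1, batch_size=4)
```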

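The matching step can likewise be sketched in plain NumPy. The row-wise L2 normalisation is an assumed post-processing step (it makes dot products equal cosine similarities); the paper's exact post-processing may differ.

```python
import numpy as np

def l2_normalize(feats: np.ndarray) -> np.ndarray:
    """Row-wise L2 normalisation so dot products equal cosine similarity."""
    norms = np.linalg.norm(feats, axis=1, keepdims=True)
    return feats / np.maximum(norms, 1e-12)

def similarity_matrix(query_feats: np.ndarray, ref_feats: np.ndarray) -> np.ndarray:
    """Return the (n_query, n_ref) matrix of cosine similarities."""
    return l2_normalize(query_feats) @ l2_normalize(ref_feats).T

# Usage: flatten each image's latent map into one descriptor vector, then
# match each query to the reference place with the highest similarity.
queries = np.random.rand(5, 1024)    # stand-in descriptors
refs = np.random.rand(20, 1024)
S = similarity_matrix(queries, refs)
best_match = S.argmax(axis=1)        # best reference index per query
```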


Acknowledgments

This work was supported by the National Natural Science Foundation of China (nos. U1913202, U1813205, U1713213, 61772508), the CAS Key Technology Talent Program, and Shenzhen Technology Projects (nos. JCYJ20180507182610734, JSGG20191129094012321).

Author information

Corresponding author

Correspondence to Qieshi Zhang.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Xu, Z., Zhang, Q., Hao, F., Ren, Z., Kang, Y., Cheng, J. (2021). VGG-CAE: Unsupervised Visual Place Recognition Using VGG16-Based Convolutional Autoencoder. In: Ma, H., et al. (eds.) Pattern Recognition and Computer Vision. PRCV 2021. Lecture Notes in Computer Science, vol. 13020. Springer, Cham. https://doi.org/10.1007/978-3-030-88007-1_8


  • DOI: https://doi.org/10.1007/978-3-030-88007-1_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-88006-4

  • Online ISBN: 978-3-030-88007-1

  • eBook Packages: Computer Science; Computer Science (R0)
