Abstract
Visual Place Recognition is a vital part of image localization and loop closure detection systems, and it has attracted widespread interest in domains such as computer vision, robotics, and AR/VR. In this work, we propose a faster, lighter, and stronger approach that yields models with fewer parameters and shorter inference time. We design RepVGG-lite as the backbone network of our architecture; it is more discriminative than other general-purpose networks in the place recognition task, and it is faster while achieving higher performance. In the feature extraction stage, we extract patch-level descriptors at only a single scale from global descriptors. We then design a trainable, attention-based feature matcher that exploits both the spatial relationships and the visual appearance of the features. Extensive experiments on difficult datasets show that the proposed approach outperforms previous advanced learning-based approaches while achieving higher inference speed. Compared with Patch-NetVLAD, our system has 14 times fewer parameters and 6.8 times lower theoretical FLOPs, and it runs 21 and 33 times faster in feature extraction and feature matching, respectively. Moreover, its Recall@1 is 0.5% higher than that of Patch-NetVLAD. We used subsets of the Mapillary Street Level Sequences dataset to conduct experiments under all other challenging conditions.
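The Recall@1 figure quoted in the abstract belongs to the Recall@N family of retrieval metrics. The following is a minimal sketch of how such a metric is commonly computed, assuming a query counts as correct when at least one of its top-N retrieved database images is a ground-truth match. The function name `recall_at_n`, its parameters, and the toy data are hypothetical and are not taken from the paper; the authors' evaluation protocol may differ in its details.

```python
from typing import Sequence, Set


def recall_at_n(
    ranked: Sequence[Sequence[int]],
    positives: Sequence[Set[int]],
    n: int = 1,
) -> float:
    """Fraction of queries with at least one true match among the top-n results.

    ranked[q]    : database indices retrieved for query q, best match first.
    positives[q] : set of database indices that are ground-truth matches for q.
    (Both structures are illustrative assumptions, not the paper's data format.)
    """
    if len(ranked) != len(positives):
        raise ValueError("ranked and positives must have one entry per query")
    if not ranked:
        return 0.0
    hits = sum(
        1
        for retrieved, truth in zip(ranked, positives)
        if any(idx in truth for idx in retrieved[:n])
    )
    return hits / len(ranked)


# Toy example with three queries (made-up indices).
ranked = [[3, 1, 2], [0, 2, 1], [2, 0, 3]]
positives = [{3}, {1}, {0, 3}]

r1 = recall_at_n(ranked, positives, n=1)
r2 = recall_at_n(ranked, positives, n=2)
r3 = recall_at_n(ranked, positives, n=3)
```

In this toy data only the first query has a correct top-1 result, so Recall@1 is one third; widening the cutoff to N = 3 recovers a match for every query.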
References
Arandjelovic, R., et al.: NetVLAD: CNN architecture for weakly supervised place recognition. In: CVPR, pp. 5297–5307 (2016)
DeTone, D., et al.: Superpoint: self-supervised interest point detection and description. In: CVPR, pp. 224–236 (2018)
Ding, X., et al.: RepVGG: making VGG-style convnets great again. In: CVPR, pp. 13733–13742 (2021)
Durrant-Whyte, H., Bailey, T.: Simultaneous localization and mapping: Part I. IEEE Robot. Autom. Mag. 13(2), 99–110 (2006)
Dusmanu, M., et al.: D2-net: a trainable CNN for joint description and detection of local features. In: CVPR, pp. 8092–8101 (2019)
Hausler, S., et al.: Patch-NetVLAD: multi-scale fusion of locally-global descriptors for place recognition. In: CVPR, pp. 14141–14152 (2021)
Newman, P., Ho, K.: SLAM-loop closing with visually salient features. In: Proceedings of the 2005 IEEE International Conference on Robotics and Automation, pp. 635–642. IEEE (2005)
Peyré, G., Cuturi, M., et al.: Computational optimal transport: with applications to data science. Found. Trends® Mach. Learn. 11(5–6), 355–607 (2019)
Revaud, J., Almazán, J., Rezende, R.S., de Souza, C.R.: Learning with average precision: training image retrieval with a listwise loss. In: CVPR, pp. 5107–5116 (2019)
Rublee, E., Rabaud, V., Konolige, K., Bradski, G.: ORB: an efficient alternative to SIFT or SURF. In: 2011 International Conference on Computer Vision, pp. 2564–2571. IEEE (2011)
Sarlin, P.-E., DeTone, D., Malisiewicz, T., Rabinovich, A.: Superglue: learning feature matching with graph neural networks. In: CVPR, pp. 4938–4947 (2020)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
Torii, A., et al.: 24/7 place recognition by view synthesis. In: CVPR, pp. 1808–1817 (2015)
Torii, A., Sivic, J., Pajdla, T., Okutomi, M.: Visual place recognition with repetitive structures. In: CVPR, pp. 883–890 (2013)
Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, vol. 30 (2017)
Warburg, F., et al.: Mapillary street-level sequences: a dataset for lifelong place recognition. In: CVPR, pp. 2626–2635 (2020)
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Huang, R., Huang, Z., Su, S. (2023). A Faster, Lighter and Stronger Deep Learning-Based Approach for Place Recognition. In: Sun, Y., et al. Computer Supported Cooperative Work and Social Computing. ChineseCSCW 2022. Communications in Computer and Information Science, vol 1682. Springer, Singapore. https://doi.org/10.1007/978-981-99-2385-4_34
DOI: https://doi.org/10.1007/978-981-99-2385-4_34
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-2384-7
Online ISBN: 978-981-99-2385-4
eBook Packages: Computer Science (R0)