DOI: 10.1145/3512527.3531427

DMPCANet: A Low Dimensional Aggregation Network for Visual Place Recognition

Published: 27 June 2022

Abstract

Visual place recognition (VPR) aims to estimate the geographical location of a query image by finding its nearest reference images in a large geo-tagged database. Most existing methods use convolutional neural networks to extract feature maps from images. However, these feature maps are high-dimensional tensors, and aggregating them effectively into a compact vector representation for efficient retrieval is challenging. To tackle this challenge, we develop an end-to-end convolutional neural network architecture named DMPCANet. The network uses a regional pooling module to generate feature tensors of the same size from images of different sizes. The core component of our network, the Differentiable Multilinear Principal Component Analysis (DMPCA) module, acts directly on tensor data and uses convolution operations to generate projection matrices for dimensionality reduction, reducing the dimensionality to one-sixteenth of the original while preserving crucial information. Experiments on two widely used place recognition datasets demonstrate that DMPCANet generates low-dimensional, discriminative global descriptors and achieves state-of-the-art results.
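The multilinear projection idea behind the DMPCA module, and the nearest-neighbour retrieval it feeds, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract states that DMPCANet generates its projection matrices with convolution operations inside an end-to-end network, whereas the sketch below substitutes a one-pass eigendecomposition in the spirit of classical MPCA. The function names, the random data, the input tensor shape (64, 8, 8), and the target shape (16, 4, 4) are all assumptions, chosen only so that the reduction ratio matches the one-sixteenth figure.

```python
import numpy as np


def fit_projections(samples, out_dims):
    """Estimate one projection matrix per tensor mode (illustrative, one pass).

    samples:  array of shape (N, I1, I2, I3), one feature tensor per image.
    out_dims: target sizes (P1, P2, P3) for the three modes.
    Returns the sample mean and a list of matrices U_n of shape (I_n, P_n).
    """
    mean = samples.mean(axis=0)
    centered = samples - mean
    projections = []
    for mode, p in enumerate(out_dims):
        axis = mode + 1  # axis 0 indexes the samples
        # Mode-n unfolding of the whole sample set: (I_n, everything else).
        unfolded = np.moveaxis(centered, axis, 0).reshape(centered.shape[axis], -1)
        scatter = unfolded @ unfolded.T
        # eigh returns eigenvalues in ascending order, so reverse the columns
        # to keep the eigenvectors with the largest eigenvalues.
        _, eigvecs = np.linalg.eigh(scatter)
        projections.append(eigvecs[:, ::-1][:, :p])
    return mean, projections


def project(x, mean, projections):
    """Multiply a single tensor by U_n^T along every mode, then flatten."""
    y = x - mean
    for mode, u in enumerate(projections):
        # Contract mode `mode` of y with U_n^T and put the new axis back in place.
        y = np.moveaxis(np.tensordot(u.T, y, axes=(1, mode)), 0, mode)
    flat = y.ravel()
    return flat / np.linalg.norm(flat)  # unit length, so dot product = cosine


# Hypothetical data: 32 random "feature tensors" standing in for CNN outputs.
rng = np.random.default_rng(0)
features = rng.standard_normal((32, 64, 8, 8))

mean, projections = fit_projections(features, out_dims=(16, 4, 4))
database = np.stack([project(f, mean, projections) for f in features])

# A query that is a slightly perturbed copy of reference image 5.
query_tensor = features[5] + 0.01 * rng.standard_normal((64, 8, 8))
query = project(query_tensor, mean, projections)

# Retrieval: rank reference descriptors by cosine similarity to the query.
best_match = int(np.argmax(database @ query))

print(features[0].size, "->", query.size)  # 4096 -> 256
```

In this toy setting each descriptor has 256 entries instead of 4096, and retrieval reduces to a matrix-vector product over the reference descriptors. A learned, differentiable module such as the one the abstract describes would replace `fit_projections` with trainable layers.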

Published In

ICMR '22: Proceedings of the 2022 International Conference on Multimedia Retrieval
June 2022
714 pages
ISBN:9781450392389
DOI:10.1145/3512527

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. image representation
  2. image retrieval
  3. place recognition

Qualifiers

  • Short-paper

Conference

ICMR '22

Acceptance Rates

Overall Acceptance Rate 254 of 830 submissions, 31%
