
DaCo: domain-agnostic contrastive learning for visual place recognition


Abstract

Visual place recognition is a core component of visual information analysis and underpins position and orientation perception for autonomous driving and robotics. Current place recognition methods usually rely on image retrieval techniques to measure the visual similarity between query and gallery images. However, state-of-the-art image retrieval methods often depend on extensive labels, such as matched pairs (e.g., image correspondences). Moreover, image retrieval methods suffer heavily from environmental condition changes (i.e., large variations in illumination and weather). To alleviate the annotation cost, we introduce contrastive learning to perform feature extraction and feature similarity measurement in a self-supervised manner. Since the heavy data augmentations used by existing contrastive learning approaches cannot effectively simulate domain disparities, we design a generative adversarial model to promote the extraction of domain-agnostic features. To tightly integrate the domain-agnostic representations with self-supervision, we design a self-generated soft constraint, yielding domain-agnostic contrastive learning (termed "DaCo"). Extensive experiments and analyses under cross-illumination and cross-weather settings are conducted on three challenging datasets. The proposed DaCo outperforms current contrastive-learning-based image retrieval methods by a large margin.
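
The abstract outlines the core recipe: instead of relying on heavy photometric augmentations, an image-to-image GAN translates each image into the other domain (e.g., day to night), and the translated image serves as the positive view in a contrastive objective, with a soft constraint tying the two representations together. Below is a minimal PyTorch sketch of that idea; the names `Encoder`, `daco_style_loss`, `translator`, `tau`, and `soft_weight` are illustrative assumptions, and the soft-alignment term is a simplified stand-in for the paper's self-generated soft constraint, not its exact formulation.

```python
import torch
import torch.nn.functional as F
from torch import nn
from torchvision import models


class Encoder(nn.Module):
    """ResNet-50 backbone with a small projection head, as is common
    in contrastive pipelines (SimCLR/MoCo-style)."""

    def __init__(self, dim=128):
        super().__init__()
        backbone = models.resnet50(weights=None)
        backbone.fc = nn.Identity()  # keep the 2048-d pooled feature
        self.backbone = backbone
        self.proj = nn.Sequential(
            nn.Linear(2048, 512), nn.ReLU(inplace=True), nn.Linear(512, dim)
        )

    def forward(self, x):
        # L2-normalised embeddings so dot products are cosine similarities
        return F.normalize(self.proj(self.backbone(x)), dim=1)


def daco_style_loss(encoder, images, translator, tau=0.07, soft_weight=0.5):
    """Contrastive loss whose positive view is a GAN-translated image
    (e.g., day -> night) rather than a heavy photometric augmentation.

    `translator` is a frozen image-to-image generator; `soft_weight`
    scales an illustrative alignment term that pulls original and
    translated embeddings together (a stand-in for the paper's
    self-generated soft constraint).
    """
    with torch.no_grad():
        fake = translator(images)  # simulate the other domain

    z = encoder(images)        # (B, d) embeddings of original views
    z_fake = encoder(fake)     # (B, d) embeddings of translated views

    # InfoNCE: each image is matched to its own translated counterpart,
    # with every other translated image in the batch as a negative.
    logits = z @ z_fake.t() / tau
    targets = torch.arange(z.size(0), device=z.device)
    nce = F.cross_entropy(logits, targets)

    # Soft constraint (illustrative): directly align the two embeddings
    # so the encoder discards domain-specific appearance.
    soft = (1 - (z * z_fake).sum(dim=1)).mean()

    return nce + soft_weight * soft
```

In practice, `translator` would be a pretrained unpaired image-to-image generator (e.g., a CycleGAN-style network) kept frozen during contrastive training, so the encoder, rather than the generator, absorbs the burden of domain invariance.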



Funding

This work was supported by the Scientific and Technological Innovation Action Plan of the Shanghai Science and Technology Committee (No. 22511102202) and the Fudan University Double First-Class Construction Fund (No. XM03211178).

Author information


Contributions

All authors contributed to the study’s conception and design. Material preparation, data collection and analysis were performed by Hao Ren. The first draft of the manuscript was written by Ziqiang Zheng and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Hong Lu.

Ethics declarations

Competing interests

The authors have no competing interests that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Ren, H., Zheng, Z., Wu, Y. et al. DaCo: domain-agnostic contrastive learning for visual place recognition. Appl Intell 53, 21827–21840 (2023). https://doi.org/10.1007/s10489-023-04629-x
