Hierarchical loop closure detection with weighted local patch features and global descriptors

Applied Intelligence

Abstract

Maintaining high-precision localization and ensuring map consistency are crucial objectives for mobile robots. However, loop closure detection remains challenging because of viewpoint and appearance changes. To address this issue, this paper proposes WP-VLAD, a novel hierarchical loop closure detection method that tightly couples global features with weighted local patch-level features (WPs). WP-VLAD employs MobileNetV3 as the backbone network for feature extraction and integrates a trainable vector of locally aggregated descriptors (VLAD) for compact global and local feature representation. A hierarchical navigable small world (HNSW) method retrieves loop candidate frames based on the global features, whereas a weighted map prediction module based on multiscale feature fusion assigns weights to the local patches during mutual nearest neighbour matching. The proposed weight allocation strategy emphasizes salient regions, reducing interference from dynamic objects. Experimental results on benchmark datasets demonstrate that WP-VLAD significantly improves matching performance while maintaining efficient computation, and that it generalizes well and remains robust across various complex environments.
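
The two-stage structure described in the abstract (candidate retrieval on global descriptors, then weighted mutual nearest neighbour matching on patch descriptors) can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the brute-force cosine search that stands in for the hierarchical navigable small world index, the use of the product of the two patch weights as the pair weight, and the summed score. The descriptors and weights are assumed to be produced elsewhere by the network.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale vectors along the last axis to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def retrieve_candidates(query_global, db_globals, k=2):
    """Return indices and cosine similarities of the k most similar frames.

    Brute-force search, used here only as a stand-in for an approximate
    nearest neighbour index such as HNSW (assumption for illustration).
    """
    sims = l2_normalize(db_globals) @ l2_normalize(query_global)
    order = np.argsort(-sims)[:k]
    return order, sims[order]


def weighted_mutual_nn_score(q_patches, q_weights, c_patches, c_weights):
    """Score a query/candidate pair from mutually nearest patch descriptors.

    q_patches: (Nq, D) query patch descriptors, q_weights: (Nq,) weights.
    c_patches: (Nc, D) candidate patch descriptors, c_weights: (Nc,) weights.
    The pair weight (product of both weights) and the summed score are
    hypothetical choices, not the scoring rule of the paper.
    """
    sim = l2_normalize(q_patches) @ l2_normalize(c_patches).T  # (Nq, Nc)
    nn_q = sim.argmax(axis=1)  # best candidate patch for each query patch
    nn_c = sim.argmax(axis=0)  # best query patch for each candidate patch
    q_idx = np.arange(sim.shape[0])
    mutual = nn_c[nn_q] == q_idx  # keep pairs that choose each other
    i, j = q_idx[mutual], nn_q[mutual]
    return float(np.sum(q_weights[i] * c_weights[j] * sim[i, j]))


def detect_loop(query, database, k=2):
    """Retrieve k candidates by global descriptor, then rerank by patches.

    `query` and each entry of `database` are dicts with the hypothetical
    keys "global", "patches" and "weights".
    """
    db_globals = np.stack([frame["global"] for frame in database])
    candidates, _ = retrieve_candidates(query["global"], db_globals, k)
    scores = [
        weighted_mutual_nn_score(
            query["patches"], query["weights"],
            database[c]["patches"], database[c]["weights"],
        )
        for c in candidates
    ]
    best = int(np.argmax(scores))
    return int(candidates[best]), scores[best]
```

In this sketch a patch with weight zero contributes nothing to the score, which mirrors the stated aim of suppressing patches that fall on dynamic objects; in practice a threshold on the returned score would decide whether the best candidate is accepted as a loop closure.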

Data availability

This study has associated data in data repositories.

Acknowledgements

We thank the reviewers for their thorough review; their comments and suggestions have significantly improved the quality of the publication.

Funding

This work is partly supported by the National Natural Science Foundation of China (62373017).

Author information

Contributions

Mingrong Ren: study conception and design, methodology development, manuscript revision. Xiurui Zhang: manuscript preparation. Bin Liu: data analysis. Yuehui Zhu: experiments.

Corresponding author

Correspondence to Mingrong Ren.

Ethics declarations

Conflicts of interest

The authors declare no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ren, M., Zhang, X., Liu, B. et al. Hierarchical loop closure detection with weighted local patch features and global descriptors. Appl Intell 55, 266 (2025). https://doi.org/10.1007/s10489-024-06135-0

  • DOI: https://doi.org/10.1007/s10489-024-06135-0