Abstract:
Visual Place Recognition (VPR) is crucial for various computer vision and robotics applications. Traditional VPR techniques that rely on handcrafted features have been enhanced by Convolutional Neural Networks (CNNs). Recently, MixVPR has set new benchmarks in VPR by using advanced feature aggregation techniques. However, MixVPR's full-image feature mixing can overlook critical local detail and regional saliency information in large-scale images. To overcome this, we propose MIXVPR++, which integrates an Adaptive Gabor Texture Fuser with a Learnable Gabor Filter to enrich semantic context with texture detail, and a Hierarchical-Region Feature-Mixer to better capture spatial hierarchy and regional saliency, thereby enhancing robustness and accuracy. Extensive experiments demonstrate that MIXVPR++ outperforms state-of-the-art methods on most of the challenging benchmarks evaluated. Despite its strong performance, MIXVPR++ shows limitations in handling severe viewpoint changes, indicating an area for future improvement.
Published in: IEEE Robotics and Automation Letters (Volume: 10, Issue: 1, January 2025)