
EAAINet: An Element-Wise Attention Network With Global Affinity Information for Accurate Indoor Visual Localization


Abstract:

Visual localization, a vital component of many visual applications, has been tackled by scene coordinate regression (SCoRe) methods that leverage neural networks to predict scene coordinates, followed by a PnP algorithm to recover the camera pose. However, these methods do not consider the relationships between image patches, known as relative features or affinity information, which are instrumental for the network to perform complete scene parsing. Moreover, owing to the visual similarity between image patches, these methods struggle to extract reliable absolute features that represent the context information of the image patches, resulting in inferior localization performance. In response, we propose EAAINet, which builds on classical SCoRe approaches and consists of two novel modules: the Global Affinity Aggregation Module (GAAM) and the Element-wise Attention Module (EAM). Specifically, GAAM employs an interval sampling strategy to sample image patches and construct sparse graph neural networks (GNNs), from which global affinity information between image patches is retrieved, ensuring precise scene parsing. EAM integrates multi-level features to generate reliable absolute features for regressing accurate scene coordinates, with the key insight that structure information is essential to differentiate similar image patches while semantic information assists in modeling the regression problem. Technically, EAM predicts element-wise soft attention masks to reconcile multi-level feature maps, enabling efficient feature fusion. Positional encoding and uncertainty modeling are also employed to enhance visual localization performance. Experimental results show that EAAINet significantly outperforms state-of-the-art methods on multiple benchmarks with faster speed and fewer model parameters.
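The element-wise attention fusion described above can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a single pair of feature maps (a structure-level and a semantic-level map), replaces the learned mask-prediction layer with a hypothetical 1x1 linear map `w, b` on the concatenated features, and uses NumPy instead of a deep learning framework. The key idea it shows is that a sigmoid mask in (0, 1) blends the two feature maps element by element.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def elementwise_attention_fusion(low_feat, high_feat, w, b):
    """Fuse two feature maps with an element-wise soft attention mask.

    low_feat, high_feat: arrays of shape (C, H, W) -- structure-level and
        semantic-level features (hypothetical shapes for illustration).
    w: (C, 2C) weights and b: (C,) bias of a 1x1 linear map that predicts
        the mask logits from the concatenated features (stand-in for the
        learned mask-prediction layer).
    """
    x = np.concatenate([low_feat, high_feat], axis=0)                 # (2C, H, W)
    logits = np.tensordot(w, x, axes=([1], [0])) + b[:, None, None]   # (C, H, W)
    mask = sigmoid(logits)                                            # soft mask in (0, 1)
    # Convex combination per element: each output value lies between the
    # corresponding low-level and high-level feature values.
    return mask * low_feat + (1.0 - mask) * high_feat
```

Because the mask is applied element-wise rather than channel-wise or globally, each spatial location and channel can independently favor structural or semantic evidence, which is what allows visually similar patches to be disambiguated.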
Published in: IEEE Robotics and Automation Letters ( Volume: 8, Issue: 6, June 2023)
Page(s): 3166 - 3173
Date of Publication: 24 March 2023

