
AAM-ORB: affine attention module on ORB for conditioned feature matching

  • Original Paper
  • Published in: Signal, Image and Video Processing

Abstract

Feature matching determines true correspondences between image pairs and is important for many computer vision applications. Determining true correspondences quickly is challenging under scene changes in viewpoint, rotation, scaling and illumination, so higher accuracy and efficiency are required. While other methods determine correspondences by treating the two images independently, we instead condition on the image pair to take account of the affine information between the images. To achieve this, we propose AAM-ORB, an efficient and robust algorithm for feature matching under scene shifts. The key to our approach is an affine attention module (AAM), which conditions the affine features on both images to boost robustness. AAM is integrated into the well-known ORB feature matching pipeline, resulting in a significant improvement. Although AAM markedly improves matching accuracy, it reduces computational efficiency. To overcome this, we adopt grid-based motion statistics to separate true correspondences from false ones at high speed. Extensive experiments show that AAM-ORB surpasses state-of-the-art approaches for feature matching on benchmark datasets while consuming less time. Overall, AAM-ORB achieves better feature matching performance and efficiency under scene changes.
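As a rough illustration of the pipeline the abstract describes, the sketch below runs ORB detection and description, brute-force Hamming matching, and grid-based motion statistics filtering with OpenCV. It is a minimal approximation, not the authors' implementation: the affine attention module is not publicly specified and appears only as a hypothetical placeholder (affine_attention_module), and the GMS step assumes the cv2.xfeatures2d.matchGMS function provided by opencv-contrib-python.

```python
# Minimal sketch of an ORB + grid-based-motion-statistics matching pipeline.
# The affine attention module (AAM) from the paper is NOT reproduced here;
# it is marked below as a hypothetical hook. Requires opencv-contrib-python.
import cv2


def match_pair(img1_path, img2_path, n_features=5000):
    img1 = cv2.imread(img1_path, cv2.IMREAD_GRAYSCALE)
    img2 = cv2.imread(img2_path, cv2.IMREAD_GRAYSCALE)

    # 1. Detect and describe ORB keypoints in both images.
    orb = cv2.ORB_create(nfeatures=n_features)
    kps1, des1 = orb.detectAndCompute(img1, None)
    kps2, des2 = orb.detectAndCompute(img2, None)

    # 2. (Hypothetical) condition the features on the image pair with the
    #    affine attention module described in the paper; not implemented here.
    # des1, des2 = affine_attention_module(des1, des2)

    # 3. Putative matches by brute-force Hamming distance on binary descriptors.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    putative = matcher.match(des1, des2)

    # 4. Separate true correspondences from false ones at high speed using
    #    grid-based motion statistics (sizes are passed as (width, height)).
    size1 = (img1.shape[1], img1.shape[0])
    size2 = (img2.shape[1], img2.shape[0])
    good = cv2.xfeatures2d.matchGMS(size1, size2, kps1, kps2, putative,
                                    withRotation=True, withScale=True)
    return kps1, kps2, good
```

Feeding a large set of cheap putative matches into the statistics-based filter, rather than pre-filtering them aggressively, reflects the high-speed true/false separation strategy the abstract describes.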




Data availability

Not applicable.


Funding

This work was supported by the Shanghai Intelligent Manufacturing Collaborative Logistics Equipment Engineering Technology Research Center under Grant No. A10GY21H004-18 and the Collaborative Innovation Platform of Electronic Information Master under Grant No. A10GY21F015 of Shanghai Polytechnic University.

Author information


Contributions

SS and LA took part in conceptualization; LA was involved in methodology, software, writing and preparing the original draft, and visualization; SS, LA, PT, ZM, YG and YC contributed to validation and formal analysis; PT and ZM took part in investigation, resources and data curation; SS and LA carried out writing, reviewing, editing, supervision and project administration. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Shaojing Song.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Ethical approval

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Song, S., Ai, L., Tang, P. et al. AAM-ORB: affine attention module on ORB for conditioned feature matching. SIViP 17, 2351–2358 (2023). https://doi.org/10.1007/s11760-022-02452-4

