Graph attention information fusion for Siamese adaptive attention tracking

Wei, Lixin; Xi, Zeyu; Hu, Ziyu; Sun, Hao

doi:10.1007/s10489-022-03502-7

Published: 05 May 2022

Applied Intelligence, volume 53, pages 2068–2087 (2023)

Lixin Wei^1,2,
Zeyu Xi^1,2 (ORCID: orcid.org/0000-0002-3181-6761),
Ziyu Hu^1,2 &
Hao Sun^1,2

Abstract

Single-target trackers based on Siamese networks treat tracking as similarity matching: the convolutional features of the template branch and the search-area branch are matched and fused by a correlation operation. However, correlation is a local linear matching, which limits the tracker's ability to capture the complex nonlinear relationship between the two branches and tends to lose useful information. Moreover, most trackers do not update the template, and the two branches compute their convolutional features independently, without exchanging information. To address these problems, a graph attention information fusion network for Siamese adaptive attention tracking (GIFT) is proposed. A Siamese adaptive attention (SAA) module connects the information flow between the template branch and the search-area branch, thereby updating the template information indirectly. A graph attention information fusion (GAIF) module fuses the information of the two branches and performs similarity matching between their corresponding parts. Layerwise aggregation makes full use of both the shallow and the deep features of the network, which further improves tracking performance. Experiments on six challenging benchmarks (GOT-10k, OTB100, VOT2018, VOT2019, UAV123 and LaSOT) demonstrate that GIFT achieves leading performance while running at 28.34 FPS, above the 25 FPS real-time threshold.
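
The contrast the abstract draws between correlation and attention-based fusion can be illustrated with a small sketch. The code below is not the authors' GAIF module; it is a generic, hypothetical example that assumes 1-D sequences of feature vectors in place of 2-D feature maps, scaled dot-product scores, and concatenation as the fusion step. The function names `cross_correlation` and `attention_fusion` are invented for illustration only.

```python
import math


def dot(u, v):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(u, v))


def cross_correlation(template, search):
    """Slide the whole template over the search sequence.

    Each output value is a sum of dot products, so the response is
    linear in the template and treats the template as one rigid block.
    """
    n = len(search) - len(template) + 1
    return [
        sum(dot(t, search[i + k]) for k, t in enumerate(template))
        for i in range(n)
    ]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fusion(template, search):
    """Assumed part-to-part fusion (illustrative, not the paper's design).

    Every search node attends over all template nodes; the softmax makes
    the matching nonlinear. The aggregated template message is then
    concatenated onto the search feature.
    """
    scale = math.sqrt(len(template[0]))
    fused, weights = [], []
    for s in search:
        w = softmax([dot(s, t) / scale for t in template])
        message = [
            sum(wk * t[c] for wk, t in zip(w, template))
            for c in range(len(s))
        ]
        fused.append(list(s) + message)
        weights.append(w)
    return fused, weights


# Toy inputs: two template nodes, four search nodes, two channels each.
template = [[1.0, 0.0], [0.0, 1.0]]
search = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
response = cross_correlation(template, search)
fused, weights = attention_fusion(template, search)
```

With these toy inputs the correlation response peaks at offset 1, where the search window coincides with the template. The attention variant instead returns one fused feature per search node together with a weight over each individual template node, which is the sense in which such a scheme matches corresponding parts rather than the template as a whole.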

Acknowledgements

This work was supported by the National Key Research and Development Program of China (No. 2018YFB1702300), the National Natural Science Foundation of China (No. 62003296), and the Hebei Youth Fund (No. E2018203162).

Author information

Authors and Affiliations

Engineering Research Center of the Ministry of Education for Intelligent Control System and Intelligent Equipment, Yanshan University, Qinhuangdao, Hebei, China
Lixin Wei, Zeyu Xi, Ziyu Hu & Hao Sun
Key Lab of Industrial Computer Control Engineering of Hebei Province, Yanshan University, Qinhuangdao, Hebei, China
Lixin Wei, Zeyu Xi, Ziyu Hu & Hao Sun

Corresponding author

Correspondence to Zeyu Xi.

Ethics declarations

Conflict of Interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wei, L., Xi, Z., Hu, Z. et al. Graph attention information fusion for Siamese adaptive attention tracking. Appl Intell 53, 2068–2087 (2023). https://doi.org/10.1007/s10489-022-03502-7

Accepted: 10 March 2022
Published: 05 May 2022
Issue Date: January 2023
DOI: https://doi.org/10.1007/s10489-022-03502-7
