
Spatial graph attention network-based object tracking with adaptive cosine window

Published in: Applied Intelligence

Abstract

Most popular Siamese trackers refine the classification map from the tracking head with a fixed cosine window penalty. However, this fixed operation, which keeps both the weight and the center of the cosine window constant, can cause tracking errors when similar distractors appear or the target moves out of view. In addition, traditional graph attention networks compute attention weights from the cosine similarity between nodes alone, ignoring where those nodes lie in the template and search region. To address these issues, this paper proposes a spatial graph attention network-based tracker with an adaptive cosine window in the tracking head. The adaptive cosine window combines spatial-temporal information and adjusts the cosine window, using a positional-bias Kalman filter to predict the offset of the target within the search region. The location-based attention mask module considers both the similarity between nodes and their positions in the template and search region, rather than node similarity alone, which reduces the influence of similar surroundings. The attention weights between nodes are constrained by a position matrix built from Gaussian functions. Extensive experiments on four challenging public datasets (GOT-10k, UAV123, OTB-100, and LaSOT) show that our tracker outperforms other state-of-the-art trackers.
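To make the adaptive-window idea concrete, the following is a minimal sketch, not the paper's implementation: it assumes a Kalman filter (not shown) has already produced a predicted target offset, and shifts a standard Hanning (cosine) window toward that prediction before blending it with the classification map. The function names, the blending weight, and the use of `np.roll` for the shift are all illustrative assumptions.

```python
import numpy as np

def shifted_cosine_window(size, offset):
    """Build a 2-D Hanning (cosine) window whose peak is moved by the
    predicted target offset (dy, dx) instead of staying at the center.
    Here `offset` stands in for the Kalman filter's prediction."""
    hann = np.outer(np.hanning(size), np.hanning(size))
    dy, dx = int(round(offset[0])), int(round(offset[1]))
    # np.roll shifts the window peak toward the predicted target position
    return np.roll(np.roll(hann, dy, axis=0), dx, axis=1)

def penalize_scores(score_map, window, weight=0.4):
    """Blend the raw classification map with the window penalty, as
    Siamese trackers conventionally do; `weight` is a hypothetical value."""
    return (1 - weight) * score_map + weight * window
```

With a fixed window the peak always sits at the map center; shifting it by the predicted offset keeps the penalty from suppressing a target that is drifting away from the center of the search region.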
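The location-based attention mask can likewise be sketched, under stated assumptions: node features are rows of a matrix, node positions are 2-D grid coordinates, and the Gaussian position matrix multiplicatively down-weights attention between spatially distant node pairs. The shapes, `sigma`, and the exact way the mask enters the softmax are assumptions for illustration, not the paper's definitions.

```python
import numpy as np

def gaussian_position_mask(coords_t, coords_s, sigma=2.0):
    """Position matrix from a Gaussian of the squared distance between
    template-node and search-node grid coordinates (hypothetical layout).
    coords_t: (Nt, 2), coords_s: (Ns, 2)."""
    d2 = ((coords_t[:, None, :] - coords_s[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def masked_attention(feat_t, feat_s, pos_mask):
    """Cosine-similarity attention reweighted by the position mask,
    then row-normalized with a softmax."""
    ft = feat_t / np.linalg.norm(feat_t, axis=1, keepdims=True)
    fs = feat_s / np.linalg.norm(feat_s, axis=1, keepdims=True)
    sim = ft @ fs.T                  # (Nt, Ns) cosine similarity
    logits = sim * pos_mask          # suppress spatially implausible pairs
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)
```

The effect is that a search-region node that merely looks similar to a template node, but sits far from where the target plausibly is, receives a small position weight and therefore little attention.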




Funding

This work was supported by the Joint Funds of the National Natural Science Foundation of China (Grant No. U2033218).


Corresponding author

Correspondence to Xiao-Yan Jiang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Fan, LY., Jiang, XY., Huang, B. et al. Spatial graph attention network-based object tracking with adaptive cosine window. Appl Intell 53, 26439–26453 (2023). https://doi.org/10.1007/s10489-023-04839-3

