
Spatial graph attention network-based object tracking with adaptive cosine window

Published in: Applied Intelligence

Abstract

Most popular Siamese trackers refine the classification map from the tracking head with a fixed cosine window penalty. However, this fixed operation, which keeps both the weight and the center of the cosine window constant, can cause tracking errors when similar distractors appear or the target moves out of view. In addition, traditional graph attention networks compute attention weights from the cosine similarity between nodes alone, ignoring where those nodes lie in the template and search region. To address these issues, this paper proposes a spatial graph attention network-based tracker with an adaptive cosine window in the tracking head. The adaptive cosine window combines spatial-temporal information and adjusts the cosine window, using a positional-bias Kalman filter to predict the offset of the target within the search region. The location-based attention mask module considers both the similarity between nodes and their positions in the template and search region, rather than node similarity alone, which reduces the influence of similar surroundings. The attention weights between nodes are constrained by a position matrix built from Gaussian functions. Extensive experiments on four challenging public datasets (GOT-10k, UAV123, OTB-100, and LaSOT) show that our tracker outperforms other state-of-the-art trackers.
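To make the adaptive-window idea concrete, the following is a minimal sketch, not the paper's implementation: it assumes a Kalman filter (not shown) has already produced a predicted target offset, and shifts a standard Hanning (cosine) window toward that prediction before blending it with the classification map. The function names, the blending weight, and the use of `np.roll` for the shift are all illustrative assumptions.

```python
import numpy as np

def shifted_cosine_window(size, offset):
    """Build a 2-D Hanning (cosine) window whose peak is moved by the
    predicted target offset (dy, dx) instead of staying at the center.
    Here `offset` stands in for the Kalman filter's prediction."""
    hann = np.outer(np.hanning(size), np.hanning(size))
    dy, dx = int(round(offset[0])), int(round(offset[1]))
    # np.roll shifts the window peak toward the predicted target position
    return np.roll(np.roll(hann, dy, axis=0), dx, axis=1)

def penalize_scores(score_map, window, weight=0.4):
    """Blend the raw classification map with the window penalty, as
    Siamese trackers conventionally do; `weight` is a hypothetical value."""
    return (1 - weight) * score_map + weight * window
```

With a fixed window the peak always sits at the map center; shifting it by the predicted offset keeps the penalty from suppressing a target that is drifting away from the center of the search region.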
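The location-based attention mask can likewise be sketched, under stated assumptions: node features are rows of a matrix, node positions are 2-D grid coordinates, and the Gaussian position matrix multiplicatively down-weights attention between spatially distant node pairs. The shapes, `sigma`, and the exact way the mask enters the softmax are assumptions for illustration, not the paper's definitions.

```python
import numpy as np

def gaussian_position_mask(coords_t, coords_s, sigma=2.0):
    """Position matrix from a Gaussian of the squared distance between
    template-node and search-node grid coordinates (hypothetical layout).
    coords_t: (Nt, 2), coords_s: (Ns, 2)."""
    d2 = ((coords_t[:, None, :] - coords_s[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def masked_attention(feat_t, feat_s, pos_mask):
    """Cosine-similarity attention reweighted by the position mask,
    then row-normalized with a softmax."""
    ft = feat_t / np.linalg.norm(feat_t, axis=1, keepdims=True)
    fs = feat_s / np.linalg.norm(feat_s, axis=1, keepdims=True)
    sim = ft @ fs.T                  # (Nt, Ns) cosine similarity
    logits = sim * pos_mask          # suppress spatially implausible pairs
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)
```

The effect is that a search-region node that merely looks similar to a template node, but sits far from where the target plausibly is, receives a small position weight and therefore little attention.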




Funding

This work was supported by the Joint Funds of the National Natural Science Foundation of China (Grant No. U2033218).


Corresponding author

Correspondence to Xiao-Yan Jiang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Fan, LY., Jiang, XY., Huang, B. et al. Spatial graph attention network-based object tracking with adaptive cosine window. Appl Intell 53, 26439–26453 (2023). https://doi.org/10.1007/s10489-023-04839-3

