Online visual tracking via background-aware Siamese networks

  • Original Article
  • Published in: International Journal of Machine Learning and Cybernetics

Abstract

With the rapid development of Siamese-network-based trackers, a family of related methods has delivered considerable performance improvements. However, tracking results are often disturbed by background noise from the template image and by background distractor objects in the search image. In this paper, we present an elegant background-aware Siamese tracker for online single-object visual tracking. Specifically, a new basic tracking framework is first proposed that performs target localization, bounding-box regression, and IoU prediction via offline multi-task learning. For the online tracking stage, we design a novel background-aware tracker with two strategies. First, a spatial mask is introduced to reduce the impact of background noise from the template image. Second, we predict a background-aware saliency map to discover and suppress distractor features in the search image. To validate its effectiveness, we conduct extensive experiments and exhaustive comparisons on the OTB2013, OTB2015, VOT2019, UAV123, and GOT-10k tracking datasets. Experimental results demonstrate that the proposed tracker, dubbed BaSiamIoU, achieves state-of-the-art performance while running at over 50 FPS.
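
As a rough illustration of the two online strategies summarized above, the following PyTorch-style sketch masks template feature cells that fall outside the target box and down-weights low-saliency (likely background or distractor) positions in the search feature map. This is a minimal sketch under assumed shapes and names: the function names, the box-shaped hard mask, the sigmoid-based saliency, and the weighting rule are illustrative assumptions, not the paper's actual BaSiamIoU implementation.

```python
# Illustrative sketch only; not the authors' BaSiamIoU implementation.
# Assumes PyTorch and (B, C, H, W) backbone feature maps.
import torch


def spatial_template_mask(template_feat, bbox, feat_stride=8):
    """Soft-mask template feature cells outside the annotated target box,
    reducing background noise carried by the template branch.

    template_feat: (B, C, H, W) features of the template image.
    bbox: (B, 4) target box (x1, y1, x2, y2) in template-image pixels.
    """
    _, _, H, W = template_feat.shape
    device = template_feat.device
    ys = torch.arange(H, device=device).view(1, H, 1) * feat_stride  # cell y-coords
    xs = torch.arange(W, device=device).view(1, 1, W) * feat_stride  # cell x-coords
    x1, y1 = bbox[:, 0:1, None], bbox[:, 1:2, None]
    x2, y2 = bbox[:, 2:3, None], bbox[:, 3:4, None]
    inside = (xs >= x1) & (xs <= x2) & (ys >= y1) & (ys <= y2)       # (B, H, W)
    return template_feat * inside.unsqueeze(1).float()


def suppress_distractors(search_feat, response, alpha=0.5):
    """Turn a correlation response into a background-aware saliency map and
    use it to attenuate likely distractor positions in the search features.

    search_feat: (B, C, H, W) features of the search image.
    response: (B, 1, H, W) target-similarity (correlation) map.
    alpha: suppression strength in [0, 1]; 0.5 is a hypothetical default.
    """
    saliency = torch.sigmoid(response)            # high where target-like
    weight = 1.0 - alpha * (1.0 - saliency)       # keep target, damp background
    return search_feat * weight
```

In the actual method, the saliency map is predicted by the network and works together with the target-localization, bounding-box-regression, and IoU-prediction heads trained offline; the sketch only conveys the general masking-and-suppression idea.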

References

  1. Bertinetto L, Valmadre J, Golodetz S, Miksik O, Torr PH (2016) Staple: complementary learners for real-time tracking. In: Proceedings of the IEEE conference on computer vision and pattern recognition. IEEE, pp 1401–1409

  2. Bertinetto L, Valmadre J, Henriques JF, Vedaldi A, Torr PH (2016) Fully-convolutional siamese networks for object tracking. In: European conference on computer vision. Springer, pp 850–865

  3. Bhat G, Danelljan M, Gool LV, Timofte R (2019) Learning discriminative model prediction for tracking. In: Proceedings of the IEEE international conference on computer vision. IEEE, pp 6182–6191

  4. Bhat G, Danelljan M, Van Gool L, Timofte R (2020) Know your surroundings: exploiting scene information for object tracking. In: European conference on computer vision. Springer, pp 205–221

  5. Bhat G, Johnander J, Danelljan M, Shahbaz KF, Felsberg M (2018) Unveiling the power of deep tracking. In: European conference on computer vision. Springer, pp 483–498

  6. Bolme DS, Beveridge JR, Draper BA, Lui YM (2010) Visual object tracking using adaptive correlation filters. In: Proceedings of the IEEE conference on computer vision and pattern recognition. IEEE, pp 2544–2550

  7. Chen Z, Zhong B, Li G, Zhang S, Ji R (2020) Siamese box adaptive network for visual tracking. In: Proceedings of the IEEE conference on computer vision and pattern recognition. IEEE, pp 6667–6676

  8. Danelljan M, Bhat G, Khan FS, Felsberg M (2019) Atom: accurate tracking by overlap maximization. In: Proceedings of the IEEE conference on computer vision and pattern recognition. IEEE, pp 4660–4669

  9. Danelljan M, Bhat G, Shahbaz Khan F, Felsberg M (2017) Eco: efficient convolution operators for tracking. In: Proceedings of the IEEE conference on computer vision and pattern recognition. IEEE, pp 6638–6646

  10. Danelljan M, Hager G, Khan FS, Felsberg M (2017) Discriminative scale space tracking. IEEE Trans Pattern Anal Mach Intell 39(8):1561–1575. https://doi.org/10.1109/TPAMI.2016.2609928

  11. Danelljan M, Hager G, Shahbaz KF, Felsberg M (2015) Convolutional features for correlation filter based visual tracking. In: Proceedings of the IEEE international conference on computer vision workshops. IEEE, pp 58–66

  12. Danelljan M, Hager G, Shahbaz KF, Felsberg M (2015) Learning spatially regularized correlation filters for visual tracking. In: Proceedings of the IEEE international conference on computer vision. IEEE, pp 4310–4318

  13. Danelljan M, Robinson A, Khan FS, Felsberg M (2016) Beyond correlation filters: learning continuous convolution operators for visual tracking. In: European conference on computer vision. Springer, pp 472–488

  14. Danelljan M, Shahbaz KF, Felsberg M, Van de Weijer J (2014) Adaptive color attributes for real-time visual tracking. In: Proceedings of the IEEE conference on computer vision and pattern recognition. IEEE, pp 1090–1097

  15. Deng J, Dong W, Socher R, Li LJ, Li K, Fei-Fei L (2009) Imagenet: a large-scale hierarchical image database. In: Proceedings of the IEEE conference on computer vision and pattern recognition. IEEE, pp 248–255

  16. Dong X, Shen J (2018) Triplet loss in siamese network for object tracking. In: European conference on computer vision. Springer, pp 459–474

  17. Fan H, Bai H, Lin L, Yang F, Chu P, Deng G, Yu S, Harshit, Huang M, Liu J, Xu Y, Liao C, Yuan L, Ling H (2021) Lasot: a high-quality large-scale single object tracking benchmark. Int J Comput Vis 129(2):439–461

  18. Guo Q, Feng W, Zhou C, Huang R, Wan L, Wang S (2017) Learning dynamic siamese network for visual object tracking. In: Proceedings of the IEEE international conference on computer vision. IEEE, pp 1763–1771

  19. He A, Luo C, Tian X, Zeng W (2018) A twofold siamese network for real-time object tracking. In: Proceedings of the IEEE conference on computer vision and pattern recognition. IEEE, pp 4834–4843

  20. He K, Gkioxari G, Dollár P, Girshick R (2020) Mask r-cnn. IEEE Trans Pattern Anal Mach Intell 42(2):386–397

  21. He K, Zhang X, Ren S, Sun J (2015) Delving deep into rectifiers: surpassing human-level performance on imagenet classification. In: Proceedings of the IEEE international conference on computer vision. IEEE, pp 1026–1034

  22. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. IEEE, pp 770–778

  23. Held D, Thrun S, Savarese S (2016) Learning to track at 100 fps with deep regression networks. In: European conference on computer vision. Springer, pp 749–765

  24. Henriques JF, Caseiro R, Martins P, Batista J (2014) High-speed tracking with kernelized correlation filters. IEEE Trans Pattern Anal Mach Intell 37(3):583–596

  25. Huang L, Zhao X, Huang K (2019) Got-10k: a large high-diversity benchmark for generic object tracking in the wild. IEEE Trans Pattern Anal Mach Intell 43:1562–1577

  26. Jiang B, Luo R, Mao J, Xiao T, Jiang Y (2018) Acquisition of localization confidence for accurate object detection. In: European conference on computer vision. Springer, pp 784–799

  27. Kendall A, Gal Y, Cipolla R (2018) Multi-task learning using uncertainty to weigh losses for scene geometry and semantics. In: Proceedings of the IEEE conference on computer vision and pattern recognition. IEEE, pp 7482–7491

  28. Kiani GH, Fagg A, Lucey S (2017) Learning background-aware correlation filters for visual tracking. In: Proceedings of the IEEE international conference on computer vision. IEEE, pp 1135–1143

  29. Kristan M, Matas J, Leonardis A, Felsberg M, Pflugfelder R, Kamarainen JK, Cehovin ZL, Drbohlav O, Lukezic A, Berg A et al (2019) The seventh visual object tracking vot2019 challenge results. In: Proceedings of the IEEE international conference on computer vision workshops. IEEE, pp 2206–2241

  30. Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. In: Proceedings of the 25th international conference on neural information processing systems-volume 1, NIPS’12. Curran Associates Inc., Red Hook, NY, USA, pp 1097–1105

  31. Lee H, Choi S, Kim Y, Kim C (2019) Bilinear siamese networks with background suppression for visual object tracking. In: BMVC, pp 8

  32. Li B, Wu W, Wang Q, Zhang F, Xing J, Yan J (2019) Siamrpn++: evolution of siamese visual tracking with very deep networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition. IEEE, pp 4282–4291

  33. Li B, Yan J, Wu W, Zhu Z, Hu X (2018) High performance visual tracking with siamese region proposal network. In: Proceedings of the IEEE conference on computer vision and pattern recognition. IEEE, pp 8971–8980

  34. Li D, Porikli F, Wen G, Kuai Y (2019) When correlation filters meet siamese networks for real-time complementary tracking. IEEE Trans Circuits Syst Video Technol 30(2):509–519

  35. Li P, Chen B, Ouyang W, Wang D, Yang X, Lu H (2019) Gradnet: Gradient-guided network for visual object tracking. In: Proceedings of the IEEE international conference on computer vision. IEEE, pp 6162–6171

  36. Li Y, Zhu J (2014) A scale adaptive kernel correlation filter tracker with feature integration. In: European conference on computer vision. Springer, pp 254–265

  37. Lin TY, Dollár P, Girshick R, He K, Hariharan B, Belongie S (2017) Feature pyramid networks for object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition. IEEE, pp 2117–2125

  38. Lin TY, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Dollár P, Zitnick CL (2014) Microsoft coco: common objects in context. In: European conference on computer vision. Springer, pp 740–755

  39. Liu T, Kong J, Jiang M, Liu C, Gu X, Wang X (2019) Collaborative model with adaptive selection scheme for visual tracking. Int J Mach Learn Cybern 10(2):215–228

  40. Lukezic A, Vojir T, Cehovin ZL, Matas J, Kristan M (2017) Discriminative correlation filter with channel and spatial reliability. In: Proceedings of the IEEE conference on computer vision and pattern recognition. IEEE, pp 6309–6318

  41. Ma C, Huang JB, Yang X, Yang MH (2015) Hierarchical convolutional features for visual tracking. In: Proceedings of the IEEE international conference on computer vision. IEEE, pp 3074–3082

  42. Ma C, Yang X, Zhang C, Yang MH (2015) Long-term correlation tracking. In: Proceedings of the IEEE conference on computer vision and pattern recognition. IEEE, pp 5388–5396

  43. Mueller M, Smith N, Ghanem B (2016) A benchmark and simulator for uav tracking. In: European conference on computer vision. Springer, pp 445–461

  44. Muller M, Bibi A, Giancola S, Alsubaihi S, Ghanem B (2018) Trackingnet: a large-scale dataset and benchmark for object tracking in the wild. In: European conference on computer vision. Springer, pp 300–317

  45. Nam H, Han B (2016) Learning multi-domain convolutional neural networks for visual tracking. In: Proceedings of the IEEE conference on computer vision and pattern recognition. IEEE, pp 4293–4302

  46. Tao R, Gavves E, Smeulders AW (2016) Siamese instance search for tracking. In: Proceedings of the IEEE conference on computer vision and pattern recognition. IEEE, pp 1420–1429

  47. Tian Z, Shen C, Chen H, He T (2019) Fcos: fully convolutional one-stage object detection. In: Proceedings of the IEEE international conference on computer vision. IEEE, pp 9627–9636

  48. Valmadre J, Bertinetto L, Henriques J, Vedaldi A, Torr PH (2017) End-to-end representation learning for correlation filter based tracking. In: Proceedings of the IEEE conference on computer vision and pattern recognition. IEEE, pp 2805–2813

  49. Wang G, Luo C, Xiong Z, Zeng W (2019) Spm-tracker: series-parallel matching for real-time visual object tracking. In: Proceedings of the IEEE conference on computer vision and pattern recognition. IEEE, pp 3643–3652

  50. Wang Q, Teng Z, Xing J, Gao J, Hu W, Maybank S (2018) Learning attentions: residual attentional siamese network for high performance online visual tracking. In: Proceedings of the IEEE conference on computer vision and pattern recognition. IEEE, pp 4854–4863

  51. Wang Q, Zhang L, Bertinetto L, Hu W, Torr PH (2019) Fast online object tracking and segmentation: a unifying approach. In: Proceedings of the IEEE conference on computer vision and pattern recognition. IEEE, pp 1328–1338

  52. Wu Y, Lim J, Yang MH (2013) Online object tracking: a benchmark. In: Proceedings of the IEEE conference on computer vision and pattern recognition. IEEE, pp 2411–2418

  53. Wu Y, Lim J, Yang MH (2015) Object tracking benchmark. IEEE Trans Pattern Anal Mach Intell 37(9):1834–1848. https://doi.org/10.1109/TPAMI.2014.2388226

  54. Xu Y, Wang Z, Li Z, Yuan Y, Yu G (2020) Siamfc++: towards robust and accurate visual tracking with target estimation guidelines. In: AAAI, pp 12549–12556

  55. Zhang K, Zhang L, Liu Q, Zhang D, Yang MH (2014) Fast visual tracking via dense spatio-temporal context learning. In: European conference on computer vision. Springer, pp 127–141

  56. Zhang L, Gonzalez-Garcia A, Weijer J, Danelljan M, Khan FS (2019) Learning the model update for siamese trackers. In: Proceedings of the IEEE international conference on computer vision. IEEE, pp 4010–4019

  57. Zhang Z, Peng H (2019) Deeper and wider siamese networks for real-time visual tracking. In: Proceedings of the IEEE conference on computer vision and pattern recognition. IEEE, pp 4591–4600

  58. Zhou B, Khosla A, Lapedriza A, Oliva A, Torralba A (2016) Learning deep features for discriminative localization. In: Proceedings of the IEEE conference on computer vision and pattern recognition. IEEE, pp 2921–2929

  59. Zhu Z, Wang Q, Li B, Wu W, Yan J, Hu W (2018) Distractor-aware siamese networks for visual object tracking. In: European conference on computer vision. Springer, pp 101–117

Acknowledgements

This work was supported by the National Natural Science Foundation of China (NSFC) under Grants 52127809 and 51625501.

Author information

Corresponding author

Correspondence to Zhenzhong Wei.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Tan, K., Xu, TB. & Wei, Z. Online visual tracking via background-aware Siamese networks. Int. J. Mach. Learn. & Cyber. 13, 2825–2842 (2022). https://doi.org/10.1007/s13042-022-01564-0

  • DOI: https://doi.org/10.1007/s13042-022-01564-0
