Dual Siamese Channel Attention Networks for Visual Object Tracking

Gao, Wenxing; Tian, Xiaolin; Zhang, Yifan; Jia, Nan; Yang, Ting; Jiao, Licheng

doi:10.1007/978-3-031-14903-0_28


Conference paper
First Online: 19 October 2022


Part of the book series: IFIP Advances in Information and Communication Technology (IFIPAICT, volume 659)

Abstract

Siamese-network-based trackers have achieved remarkable performance in visual object tracking. The target position is determined from the similarity map produced by cross-correlating the features generated by the template branch and the search branch. Interaction between the template and search branches is essential for high-performance object tracking, yet previous works neglect it because the features of the two branches are computed separately. In this paper, we propose Dual Siamese Channel Attention Networks, referred to as SiamDCA, which exploit channel attention to further improve tracking robustness. First, a convolutional version of Squeeze-and-Excitation Networks (CSENet) is embedded in the backbone to explicitly model interdependencies between channels and adaptively recalibrate channel-wise feature responses. In addition, we propose a novel Global Channel Enhancement (GCE) module, which captures the attention weight of each channel in the template branch and uses these weights to normalize the channel characteristics in the search branch. We evaluate on the OTB2015, VOT2016 and UAV123 benchmarks, where our algorithm demonstrates competitive performance against other state-of-the-art trackers.
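
The pipeline outlined in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract gives no layer details, so every function name below is hypothetical, the channel gate is a bare sigmoid over the pooled channel mean (a real Squeeze-and-Excitation block inserts learned layers between pooling and gating), and the features are tiny hand-made arrays rather than backbone outputs. It only shows the three ideas in order: derive per-channel weights from the template, apply them to the search features, and locate the target at the peak of the cross-correlation map.

```python
import math

def channel_weights(feat):
    """Squeeze: global average pool per channel; gate: sigmoid.
    ASSUMPTION: the learned excitation layers of a real SE block are omitted."""
    weights = []
    for ch in feat:
        mean = sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        weights.append(1.0 / (1.0 + math.exp(-mean)))
    return weights

def scale_channels(feat, weights):
    """Multiply every value of channel c by weights[c]."""
    return [[[v * w for v in row] for row in ch] for ch, w in zip(feat, weights)]

def cross_correlate(search, template):
    """Slide the template over the search features (valid positions only),
    summing the element-wise products over all channels."""
    th, tw = len(template[0]), len(template[0][0])
    sh, sw = len(search[0]), len(search[0][0])
    out = []
    for i in range(sh - th + 1):
        row = []
        for j in range(sw - tw + 1):
            s = 0.0
            for c in range(len(template)):
                for u in range(th):
                    for v in range(tw):
                        s += search[c][i + u][j + v] * template[c][u][v]
            row.append(s)
        out.append(row)
    return out

def argmax2d(score_map):
    """Return the (row, col) index of the largest value in a 2-D list."""
    best = max((v, i, j) for i, row in enumerate(score_map) for j, v in enumerate(row))
    return best[1], best[2]

# Hypothetical 2-channel, 2x2 template features.
template = [[[1.0, 2.0], [3.0, 4.0]],
            [[0.0, 1.0], [1.0, 0.0]]]

# Hypothetical 2-channel, 4x4 search features: zeros with the template
# pattern pasted at row 1, column 2.
search = [[[0.0] * 4 for _ in range(4)] for _ in template]
for c, ch in enumerate(template):
    for u, row in enumerate(ch):
        for v, val in enumerate(row):
            search[c][1 + u][2 + v] = val

# Template-derived channel weights modulate the search branch.
weights = channel_weights(template)
enhanced_search = scale_channels(search, weights)

similarity = cross_correlate(enhanced_search, template)
peak = argmax2d(similarity)
print(peak)  # (1, 2): the location where the template pattern was pasted
```

In this toy setting the weights only rescale each channel's contribution to the similarity map; the point of the sketch is the data flow from the template branch into the search branch, not the tracking accuracy.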

Supported by the National Natural Science Foundation of China (No. 61977052).



Author information

Authors and Affiliations

School of Artificial Intelligence, Xidian University, Xi’an, 710071, China
Wenxing Gao, Xiaolin Tian, Yifan Zhang, Nan Jia, Ting Yang & Licheng Jiao


Corresponding author

Correspondence to Xiaolin Tian.

Editor information

Editors and Affiliations

Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Zhongzhi Shi
Department of Computer Science, University of Surrey, Guildford, UK
Yaochu Jin
College of Artificial Intelligence, Xidian University, Xi’an, China
Xiangrong Zhang

About this paper

Cite this paper

Gao, W., Tian, X., Zhang, Y., Jia, N., Yang, T., Jiao, L. (2022). Dual Siamese Channel Attention Networks for Visual Object Tracking. In: Shi, Z., Jin, Y., Zhang, X. (eds) Intelligence Science IV. ICIS 2022. IFIP Advances in Information and Communication Technology, vol 659. Springer, Cham. https://doi.org/10.1007/978-3-031-14903-0_28


DOI: https://doi.org/10.1007/978-3-031-14903-0_28
Published: 19 October 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-14902-3
Online ISBN: 978-3-031-14903-0
eBook Packages: Computer Science, Computer Science (R0)
