
End-to-end multitask Siamese network with residual hierarchical attention for real-time object tracking

Published in Applied Intelligence.

Abstract

Object tracking with deep networks has recently achieved substantial improvement in terms of tracking performance. In this paper, we propose a multitask Siamese neural network that uses a residual hierarchical attention mechanism to achieve high-performance object tracking. This network is trained offline in an end-to-end manner, and it is capable of real-time tracking. To produce more efficient and generative attention-aware features, we propose residual hierarchical attention learning using residual skip connections in the attention module to receive hierarchical attention. Moreover, we formulate a multitask correlation filter layer to exploit the missing link between context awareness and regression target adaptation, and we insert this differentiable layer into a neural network to improve the discriminatory capability of the network. The results of experimental analyses conducted on the OTB, VOT and TColor-128 datasets, which contain various tracking scenarios, demonstrate the efficiency of our proposed real-time object-tracking network.



Acknowledgments

This work was supported by the National High-tech Research and Development Program 863, China [2015AA042307]; the National Key Research and Development Program, China [2018YFB1305803]; the Joint Fund of National Natural Science Foundation and Shandong Province, China [U1706228]; the National Natural Science Foundation of China [61673245]; the Program for Outstanding PhD Candidate of Shandong University; the Natural Sciences and Engineering Research Council of Canada; the National Natural Science Foundation of China [61572300; 81871508; 61773246]; the Taishan Scholar Program of Shandong Province, China [TSHW201502038]; and the Major Program of Shandong Province Natural Science Foundation, China [ZR2018ZB0419].

Author information


Corresponding authors

Correspondence to Jason Gu or Xin Ma.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Using the convexity of the objective function and the properties of circulant matrices, a closed-form solution for the filter parameter w in the proposed objective function can be derived. This appendix presents the detailed derivation.

The objective function in (5) can be rewritten as follows using the circulant matrix \(\mathbf{B}= \left[\begin{array}{ccccc} \mathbf{X}_{0} & \sqrt{\theta_{3}}\mathbf{X}_{1} & \sqrt{\theta_{3}}\mathbf{X}_{2} & \ldots & \sqrt{\theta_{3}}\mathbf{X}_{k} \end{array}\right]^{T}\):

$$ \begin{array}{rcl} \arg\underset{\mathbf{w},\mathbf{y}}{\min} \mathbb{O} &=& \left\| \left[\begin{array}{c} \mathbf{X}_{0}\\ \sqrt{\theta_{3}}\mathbf{X}_{1}\\ \vdots\\ \sqrt{\theta_{3}}\mathbf{X}_{k} \end{array}\right] \mathbf{w}- \left[\begin{array}{c} \mathbf{y}\\ 0\\ \vdots\\ 0 \end{array}\right] \right\|^{2}_{2}+\theta_{1}\|\mathbf{w}\|^{2}_{2} +\theta_{2} \left\| \left[\begin{array}{c} \mathbf{y}\\ 0\\ \vdots\\ 0 \end{array}\right] -\left[\begin{array}{c} \mathbf{y}_{0}\\ 0\\ \vdots\\ 0 \end{array}\right] \right\|^{2}_{2}\\ &=&\|\mathbf{B}\mathbf{w}-\mathbf{y}^{\prime}\|^{2}_{2}+\theta_{1}\|\mathbf{w}\|^{2}_{2}+\theta_{2}\|\mathbf{y}^{\prime}-\mathbf{y}^{\prime}_{0}\|^{2}_{2}. \end{array} $$
(10)

We define \(\mathbf{z}=\left[\begin{array}{c}\mathbf{w}\\ \mathbf{y}^{\prime} \end{array}\right]\) and then seek the z that minimizes (10):

$$ \arg\underset{\mathbf{z}}{\min}\, \mathbb{O}(\mathbf{z}) = \left\|\left[\begin{array}{cc}\mathbf{B} & -\mathbf{I} \end{array}\right]\mathbf{z}\right\|^{2}_{2}+\theta_{1}\left\|\left[\begin{array}{cc}\mathbf{I} & 0 \end{array}\right]\mathbf{z}\right\|^{2}_{2} +\theta_{2}\left\|\left[\begin{array}{cc}0 & \mathbf{I} \end{array}\right]\mathbf{z}-\mathbf{y}^{\prime}_{0}\right\|^{2}_{2}. $$
(11)

Because the proposed objective function is convex, (11) can be minimized by setting its first derivative to zero:

$$ \begin{array}{rcl} \nabla_{\mathbf{z}}\mathbb{O}(\mathbf{z}) &=& \left[\begin{array}{cc}\mathbf{B}^{T}\mathbf{B} & -\mathbf{B}^{T}\\ -\mathbf{B} & \mathbf{I} \end{array}\right]\mathbf{z}+\theta_{1}\left[\begin{array}{cc}\mathbf{I} & 0\\ 0 & 0 \end{array}\right]\mathbf{z} +\theta_{2}\left[\begin{array}{cc}0 & 0\\ 0 & \mathbf{I} \end{array}\right]\mathbf{z} -\theta_{2}\left[\begin{array}{c}0\\ \mathbf{I} \end{array}\right]\mathbf{y}^{\prime}_{0}=0\\ &\Rightarrow&\left[\begin{array}{cc}\mathbf{B}^{T}\mathbf{B}+\theta_{1}\mathbf{I} & -\mathbf{B}^{T}\\ -\mathbf{B} & (1+\theta_{2})\mathbf{I} \end{array}\right]\mathbf{z}=\theta_{2}\left[\begin{array}{c}0\\ \mathbf{I} \end{array}\right]\mathbf{y}^{\prime}_{0}. \end{array} $$
(12)
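As a sanity check, the stationarity system in (12) can be verified numerically: solve the linear system and confirm that the gradient of the objective vanishes at the solution. The sketch below uses a generic random matrix as a stand-in for the stacked data matrix B and illustrative values for θ1 and θ2 (all concrete values are assumptions, not the paper's settings):

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 12, 5
th1, th2 = 0.3, 0.7                      # theta_1, theta_2 (illustrative)
B = rng.standard_normal((m, n))          # stand-in for the stacked data matrix
y0p = rng.standard_normal(m)             # y'_0

# Assemble and solve the stationarity system (12)
A = np.block([[B.T @ B + th1 * np.eye(n), -B.T],
              [-B, (1 + th2) * np.eye(m)]])
rhs = th2 * np.concatenate([np.zeros(n), y0p])
z = np.linalg.solve(A, rhs)
w, yp = z[:n], z[n:]

# Gradient of O(w, y') = ||Bw - y'||^2 + th1||w||^2 + th2||y' - y'_0||^2
grad_w = 2 * B.T @ (B @ w - yp) + 2 * th1 * w
grad_yp = -2 * (B @ w - yp) + 2 * th2 * (yp - y0p)
assert np.allclose(grad_w, 0) and np.allclose(grad_yp, 0)
```

The two gradient blocks are exactly the two block rows of (12), so a zero gradient confirms the system was assembled correctly.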

B is built from circulant blocks, so each block can be diagonalized by the DFT matrix as follows:

$$ \begin{array}{rcl} \mathbf{B}&=&\mathbf{F}\left[\begin{array}{cccc}diag(\hat{\mathbf{x}}_{0}) & \sqrt{\theta_{3}}diag(\hat{\mathbf{x}}_{1}) & \ldots & \sqrt{\theta_{3}}diag(\hat{\mathbf{x}}_{k}) \end{array}\right]^{T}\mathbf{F}^{H}=-\mathbf{F}\mathbf{D}\mathbf{F}^{H},\\ \mathbf{B}^{T}&=&\mathbf{F}\left[\begin{array}{cccc}diag(\hat{\mathbf{x}}^{*}_{0}) & \sqrt{\theta_{3}}diag(\hat{\mathbf{x}}^{*}_{1}) & \ldots & \sqrt{\theta_{3}}diag(\hat{\mathbf{x}}^{*}_{k}) \end{array}\right]\mathbf{F}^{H} =-\mathbf{F}\mathbf{C}\mathbf{F}^{H}, \end{array} $$
(13)

where F is the DFT matrix.
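The diagonalization in (13) rests on the standard fact that a circulant matrix is diagonalized by the DFT matrix. A minimal numerical check, assuming the common convention in which row i of the circulant data matrix is the base vector x cyclically shifted by i positions (the convention itself is an assumption, not stated in the paper):

```python
import numpy as np

n = 8
rng = np.random.default_rng(0)
x = rng.standard_normal(n)

# Circulant data matrix: row i is x cyclically shifted by i, X[i, j] = x[(j - i) % n]
X = np.array([np.roll(x, i) for i in range(n)])

# Unitary DFT matrix: U[j, k] = exp(-2*pi*1j*j*k/n) / sqrt(n)
idx = np.arange(n)
U = np.exp(-2j * np.pi * np.outer(idx, idx) / n) / np.sqrt(n)

# Diagonalization: X = U diag(fft(x)) U^H (real up to round-off for real x)
X_rebuilt = (U @ np.diag(np.fft.fft(x)) @ U.conj().T).real
assert np.allclose(X, X_rebuilt)
```

Because every block of B has this form, multiplying by F and F^H on either side reduces all the block products in the derivation to elementwise operations on the DFT coefficients.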

Substituting (13) into (12) transforms the system into (14) and (15):

$$ \left[\begin{array}{cc} \mathbf{F} & 0\\ 0 & \mathbf{F} \end{array}\right] \left[\begin{array}{cc} diag(\hat{\mathbf{x}}^{*}_{0}\odot\hat{\mathbf{x}}_{0}+\theta_{3}{\sum}^{k}_{i=1}\hat{\mathbf{x}}^{*}_{i}\odot\hat{\mathbf{x}}_{i}+\theta_{1}) & \mathbf{C}\\ \mathbf{D} & diag(1+\theta_{2}) \end{array}\right] \left[\begin{array}{cc} \mathbf{F}^{H} & 0\\ 0 & \mathbf{F}^{H} \end{array}\right] \mathbf{z}=\theta_{2} \left[\begin{array}{c}0\\ \mathbf{I} \end{array}\right]\mathbf{y}^{\prime}_{0}, $$
(14)
$$ \left[\begin{array}{cc} \mathbf{N} & \mathbf{C}\\ \mathbf{D} & \mathbf{V} \end{array}\right] \left[\begin{array}{c} \hat{\mathbf{w}}^{*}\\ \hat{\mathbf{y}^{\prime}}^{*} \end{array}\right] =\theta_{2} \left[\begin{array}{c}0\\ \mathbf{F}^{H} \end{array}\right]\mathbf{y}^{\prime}_{0}, $$
(15)

where \(\mathbf{N}=diag(\hat{\mathbf{x}}^{*}_{0}\odot\hat{\mathbf{x}}_{0}+\theta_{3}{\sum}^{k}_{i=1}\hat{\mathbf{x}}^{*}_{i}\odot\hat{\mathbf{x}}_{i}+\theta_{1})\) and \(\mathbf{V}=diag(1+\theta_{2})\).

Thus, \(\left[\begin{array}{c} \hat{\mathbf{w}}^{*}\\ \hat{\mathbf{y}^{\prime}}^{*} \end{array}\right]\) can be rewritten as follows:

$$ \left[\begin{array}{c} \hat{\mathbf{w}}^{*}\\ \hat{\mathbf{y}^{\prime}}^{*} \end{array}\right] =\theta_{2} \left[\begin{array}{cc} \mathbf{N} & \mathbf{C}\\ \mathbf{D} & \mathbf{V} \end{array}\right]^{-1} \left[\begin{array}{c}0\\ \mathbf{F}^{H} \end{array}\right]\mathbf{y}^{\prime}_{0}. $$
(16)

In (16), the block inverse \(\left[\begin{array}{cc} \mathbf{N} & \mathbf{C}\\ \mathbf{D} & \mathbf{V} \end{array}\right]^{-1}\) is given by:

$$ \left[\begin{array}{cc} \mathbf{N} & \mathbf{C}\\ \mathbf{D} & \mathbf{V} \end{array}\right]^{-1} = \left[\begin{array}{cc} (\mathbf{N}- \mathbf{C}\mathbf{V}^{-1}\mathbf{D})^{-1} & -(\mathbf{N}- \mathbf{C}\mathbf{V}^{-1}\mathbf{D})^{-1}\mathbf{C}\mathbf{V}^{-1}\\ -\mathbf{V}^{-1}\mathbf{D}(\mathbf{N}- \mathbf{C}\mathbf{V}^{-1}\mathbf{D})^{-1} & \mathbf{V}^{-1}\mathbf{D}(\mathbf{N}- \mathbf{C}\mathbf{V}^{-1}\mathbf{D})^{-1}\mathbf{C}\mathbf{V}^{-1}+\mathbf{V}^{-1} \end{array}\right]. $$
(17)
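Equation (17) is the standard block-matrix inversion identity built on the Schur complement of V. It can be checked numerically with arbitrary blocks, as long as V and the Schur complement are invertible; the block names below mirror (17), but their values are random stand-ins rather than the quantities from the derivation:

```python
import numpy as np

rng = np.random.default_rng(1)
p = 4
# Random stand-ins for the blocks; a shifted N keeps everything well conditioned
N = rng.standard_normal((p, p)) + 6 * np.eye(p)
C = rng.standard_normal((p, p))
D = rng.standard_normal((p, p))
V = 2.5 * np.eye(p)                     # V = diag(1 + theta_2) is a scaled identity

M = np.block([[N, C], [D, V]])
Vi = np.linalg.inv(V)
S = np.linalg.inv(N - C @ Vi @ D)       # inverse Schur complement of V

# Block inverse per (17)
Mi = np.block([[S, -S @ C @ Vi],
               [-Vi @ D @ S, Vi @ D @ S @ C @ Vi + Vi]])
assert np.allclose(Mi, np.linalg.inv(M))
```

Only the top-left and top-right blocks are needed for ŵ*, which is why the remaining steps track just the first block row.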

The solution for \(\hat{\mathbf{w}}^{*}\) then follows from (16) and (17):

$$ \hat{\mathbf{w}}^{*}=-\theta_{2}(\mathbf{N}- \mathbf{C}\mathbf{V}^{-1}\mathbf{D})^{-1}\mathbf{C}\mathbf{V}^{-1}\mathbf{F}^{H}\mathbf{y}^{\prime}_{0}, $$
(18)

where \((\mathbf{N}- \mathbf{C}\mathbf{V}^{-1}\mathbf{D})^{-1}\) is:

$$ \begin{array}{rcl} (\mathbf{N}- \mathbf{C}\mathbf{V}^{-1}\mathbf{D})^{-1}&=&\left[diag(\hat{\mathbf{x}}^{*}_{0}\odot\hat{\mathbf{x}}_{0}+\theta_{3}{\sum}^{k}_{i=1}\hat{\mathbf{x}}^{*}_{i}\odot\hat{\mathbf{x}}_{i}+\theta_{1}) -\frac{diag(\hat{\mathbf{x}}^{*}_{0}\odot\hat{\mathbf{x}}_{0}+\theta_{3}{\sum}^{k}_{i=1}\hat{\mathbf{x}}^{*}_{i}\odot\hat{\mathbf{x}}_{i})}{diag(1+\theta_{2})}\right]^{-1}\\ &=&diag\left(\frac{1 + \theta_{2}}{\theta_{2}(\hat{\mathbf{x}}^{*}_{0}\odot\hat{\mathbf{x}}_{0} + \theta_{3}{\sum}^{k}_{i=1}\hat{\mathbf{x}}^{*}_{i}\odot\hat{\mathbf{x}}_{i})+\theta_{1}(1 + \theta_{2})}\right). \end{array} $$
(19)

Finally, we obtain the solution for the filter parameter w:

$$ \begin{array}{rcl} \hat{\mathbf{w}}^{*}&=&\frac{\theta_{2}(\hat{\mathbf{x}}^{*}_{0}\odot\hat{\mathbf{y}}^{*}_{0})}{\theta_{2}(\hat{\mathbf{x}}^{*}_{0}\odot\hat{\mathbf{x}}_{0}+\theta_{3}{\sum}^{k}_{i=1}\hat{\mathbf{x}}^{*}_{i}\odot\hat{\mathbf{x}}_{i})+\theta_{1}(1+\theta_{2})}\\ &\Rightarrow&\hat{\mathbf{w}}=\frac{\theta_{2}(\hat{\mathbf{x}}_{0}\odot\hat{\mathbf{y}}_{0})}{\theta_{2}(\hat{\mathbf{x}}^{*}_{0}\odot\hat{\mathbf{x}}_{0}+\theta_{3}{\sum}^{k}_{i=1}\hat{\mathbf{x}}^{*}_{i}\odot\hat{\mathbf{x}}_{i})+\theta_{1}(1+\theta_{2})}. \end{array} $$
(20)
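The closed form in (20) can be cross-checked against a direct linear-algebra solve of the joint objective. The sketch below builds B from circulant blocks (rows as cyclic shifts, the same assumed convention as above), eliminates y′ from the stationarity conditions to get a spatial-domain linear system in w, and compares the result with the inverse FFT of (20). The sizes n, k and the values of θ1–θ3 are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 16, 3
th1, th2, th3 = 0.1, 0.5, 0.2           # theta_1..theta_3 (illustrative)

def circulant(v):
    # Rows are cyclic shifts of v, so the block is DFT-diagonalizable
    return np.array([np.roll(v, i) for i in range(len(v))])

x0 = rng.standard_normal(n)             # target sample
xs = rng.standard_normal((k, n))        # k context samples
y0 = rng.standard_normal(n)             # regression target y_0

# Spatial-domain solve: eliminating y' from the stationarity conditions gives
# [th2 B^T B + th1 (1 + th2) I] w = th2 B^T y'_0
B = np.vstack([circulant(x0)] + [np.sqrt(th3) * circulant(x) for x in xs])
w_direct = np.linalg.solve(th2 * B.T @ B + th1 * (1 + th2) * np.eye(n),
                           th2 * B.T @ np.concatenate([y0, np.zeros(k * n)]))

# Frequency-domain closed form, eq. (20)
x0h, y0h = np.fft.fft(x0), np.fft.fft(y0)
denom = th2 * (np.abs(x0h) ** 2 + th3 * (np.abs(np.fft.fft(xs, axis=1)) ** 2).sum(0)) \
        + th1 * (1 + th2)
w_freq = np.fft.ifft(th2 * x0h * y0h / denom).real

assert np.allclose(w_direct, w_freq)
```

The agreement shows why the filter can be computed with elementwise FFT operations instead of inverting an n×n system, which is what makes the correlation filter layer cheap enough for real-time tracking.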


Cite this article

Huang, W., Gu, J., Ma, X. et al. End-to-end multitask Siamese network with residual hierarchical attention for real-time object tracking. Appl Intell 50, 1908–1921 (2020). https://doi.org/10.1007/s10489-019-01605-2

