Abstract
Recently, deep learning has achieved competitive accuracy and robustness and has dramatically improved target scale estimation through pre-trained, dedicated network branches. Yet fast and robust scale estimation remains a challenging problem in visual object tracking. Early correlation filter trackers estimate scale with a multiscale search that uses a fixed number of scale factors and an invariant aspect ratio, which is redundant for video frames with little or no scale change. Alternatively, an independent network branch can be trained for the target scale state, but such a branch requires abundant training data and its performance is not very stable on unseen target objects. To address these problems of existing scale estimation solutions, we propose several variable scale learning methods to explore the scale change of the target. First, we propose a variable scale factor learning method, which frees us from the commonly used multiscale search and its fixed scale factors. Second, we use a multiscale aspect ratio solution to compensate for the invariant aspect ratio. Third, we combine the first and second methods into a variable scale aspect ratio estimation method. Finally, the proposed scale estimation methods are embedded into the state-of-the-art ECO (Efficient Convolution Operators) and ATOM (Accurate Tracking by Overlap Maximization) trackers, replacing their original scale methods, to verify the effectiveness of our approach. Extensive experiments on the OTB100, UAV123, TC128 and LaSOT datasets demonstrate that the proposed scale methods effectively improve tracking performance.
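To make the contrast concrete, the following is a minimal Python sketch, not the authors' implementation: `fixed_scale_candidates` reproduces the conventional multiscale search criticized above, where the same geometrically spaced scale factors are tested in every frame with an unchanged aspect ratio, while `variable_scale_candidates` only illustrates the general idea of adapting the scale step per frame. The step base, the number of scales and the adaptation rule are assumptions chosen for illustration.

```python
import numpy as np

def fixed_scale_candidates(w, h, S=5, a=1.02):
    # Conventional multiscale search: the same S scale factors a^n are
    # tested in every frame and the aspect ratio w/h never changes.
    exps = np.arange(S) - (S - 1) / 2
    return [(w * a ** e, h * a ** e) for e in exps]

def variable_scale_candidates(w, h, prev_change, S=5):
    # Illustrative "variable" alternative (hypothetical rule): widen or
    # narrow the scale step according to how much the scale changed in
    # recent frames, so near-static frames waste fewer candidates on
    # large scale jumps.
    a = 1.0 + 0.02 * max(0.25, min(4.0, abs(prev_change)))
    exps = np.arange(S) - (S - 1) / 2
    return [(w * a ** e, h * a ** e) for e in exps]

if __name__ == "__main__":
    print(fixed_scale_candidates(64, 48))
    print(variable_scale_candidates(64, 48, prev_change=0.1))
```

In the fixed scheme the candidate set is identical whether the target is shrinking rapidly or perfectly static; removing this redundancy is the motivation behind the variable scale factor learning proposed in the paper.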
Availability of data and materials
We used four publicly available datasets to illustrate and test our methods. The OTB dataset can be found at http://cvlab.hanyang.ac.kr/tracker_benchmark/index.html. The UAV123 dataset can be found at https://ivul.kaust.edu.sa/Pages/pub-benchmark-simulator-uav.aspx. The TC128 dataset can be found at http://www.dabi.temple.edu/~hbling/data/TColor-128/TColor-128.html. The LaSOT dataset can be found at https://cis.temple.edu/lasot/download.html.
References
Bertinetto L, Valmadre J, Golodetz S, Miksik O, Torr PHS (2016a) Staple: Complementary learners for real-time tracking. In: Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 1401–1409
Bertinetto L, Valmadre J, Henriques JF, Vedaldi A, Torr PHS (2016b) Fully-convolutional siamese networks for object tracking. In: Proceedings of the European Conference on Computer Vision, pp 850–865
Bhat G, Johnander J, Danelljan M, Khan FS, Felsberg M (2018) Unveiling the power of deep tracking. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 493–509
Bhat G, Danelljan M, Van Gool L, Timofte R (2019) Learning discriminative model prediction for tracking. In: Proceedings of the IEEE International Conference on Computer Vision, pp 6182–6191
Bhat G, Danelljan M, Van Gool L, Timofte R (2020) Know your surroundings: Exploiting scene information for object tracking. In: Proceedings of the European Conference on Computer Vision, pp 205–221
Bolme DS, Beveridge JR, Draper BA, Lui YM (2010) Visual object tracking using adaptive correlation filters. In: Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp 2544–2550
Chen X, Yan B, Zhu J, Wang D, Yang X, Lu H (2021) Transformer tracking. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 8126–8135
Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In: Proceedings of the 2005 IEEE computer society conference on computer vision and pattern recognition (CVPR'05), pp 886–893
Danelljan M, Häger G, Khan FS, Felsberg M (2014a) Accurate scale estimation for robust visual tracking. In: Proceedings of the British Machine Vision Conference, pp 1–11
Danelljan M, Khan FS, Felsberg M, Weijer JVD (2014b) Adaptive color attributes for real-time visual tracking. In: Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp 1090–1097
Danelljan M, Häger G, Khan FS, Felsberg M (2015) Learning spatially regularized correlation filters for visual tracking. In: Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), pp 4310–4318
Danelljan M, Robinson A, Shahbaz Khan F, Felsberg M (2016) Beyond correlation filters: Learning continuous convolution operators for visual tracking. In: Proceedings of the European Conference on Computer Vision, pp 472–488
Danelljan M, Bhat G, Khan FS, Felsberg M (2017a) ECO: Efficient convolution operators for tracking. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 6638–6646
Danelljan M, Häger G, Khan FS, Felsberg M (2017b) Discriminative scale space tracking. IEEE Trans Pattern Anal Mach Intell 39(8):1561–1575
Danelljan M, Bhat G, Khan FS, Felsberg M (2019) ATOM: Accurate tracking by overlap maximization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 4660–4669
Danelljan M, Van Gool L, Timofte R (2020) Probabilistic regression for visual tracking. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 7183–7192
Fan H, Lin L, Yang F, Chu P, Deng G, Yu S, Bai H, Xu Y, Liao C, Ling H (2019) LaSOT: A high-quality benchmark for large-scale single object tracking. In: Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 5369–5378
Henriques JF, Caseiro R, Martins P, Batista J (2012) Exploiting the circulant structure of tracking-by-detection with kernels. In: Proceedings of the European conference on computer vision, pp 702–715
Henriques JF, Caseiro R, Martins P, Batista J (2015) High-speed tracking with kernelized correlation filters. IEEE Trans Pattern Anal Mach Intell 37(3):583–596
Hu B, Zhao H, Yang Y, Zhou B, Raj ANJ (2020) Multiple faces tracking using feature fusion and neural network in video. Intell Autom Soft Comput 26(6):1549–1560
Huang D, Gu P, Feng H-M, Lin Y, Zheng L (2020) Robust visual tracking models designs through kernelized correlation filters. Intell Autom Soft Comput 26(2):313–322
Jiang B, Luo R, Mao J, Xiao T, Jiang Y (2018) Acquisition of localization confidence for accurate object detection. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 784–799
Kristan M, Leonardis A, Matas J, Felsberg M, Pflugfelder R, Čehovin Zajc L, Vojir T, Bhat G, Lukezic A, Eldesokey A (2018) The sixth visual object tracking VOT2018 challenge results. In: Proceedings of the European Conference on Computer Vision (ECCV) Workshops, pp 0–0
Li Y, Zhu J (2014) A scale adaptive kernel correlation filter tracker with feature integration. In: Proceedings of the European conference on computer vision, pp 254–265
Li B, Yan J, Wu W, Zhu Z, Hu X (2018) High performance visual tracking with siamese region proposal network. In: Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 8971–8980
Li B, Wu W, Wang Q, Zhang F, Xing J, Yan J (2019) SiamRPN++: Evolution of siamese visual tracking with very deep networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 4282–4291
Liang P, Blasch E, Ling H (2015) Encoding color information for visual tracking: Algorithms and benchmark. IEEE Trans Image Process 24(12):5630–5644
Liu Z, Wang XA, Sun C, Lu K (2019) Implementation system of human eye tracking algorithm based on FPGA. CMC-Comput Mat Contin 58(3):653–664
Ma H, Lin Z, Acton ST (2020) FAST: Fast and accurate scale estimation for tracking. IEEE Signal Process Lett 27:161–165
Ma C, Huang J, Yang X, Yang M (2015) Hierarchical convolutional features for visual tracking. In: Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), pp 3074–3082
Marvasti-Zadeh SM, Cheng L, Ghanei-Yakhdan H, Kasaei S (2021) Deep learning for visual tracking: A comprehensive survey. IEEE Trans Intell Transp Syst. https://doi.org/10.1109/TITS.2020.3046478
Mueller M, Smith N, Ghanem B (2016) A benchmark and simulator for UAV tracking. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 445–461
Nam H, Han B (2016) Learning multi-domain convolutional neural networks for visual tracking. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4293–4302
Ren S, He K, Girshick R, Sun J (2016) Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell 39(6):1137–1149
Santhosh P, Kaarthick B (2019) An automated player detection and tracking in basketball game. CMC-Comput Mat Contin 58(3):625–639
Simonyan K, Zisserman A (2015) Very deep convolutional networks for large-scale image recognition. In: Proceedings of the International Conference on Learning Representations, pp 1–14
Valmadre J, Bertinetto L, Henriques J, Vedaldi A, Torr PH (2017) End-to-end representation learning for correlation filter based tracking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 2805–2813
Voigtlaender P, Luiten J, Torr PHS, Leibe B (2020) Siam R-CNN: Visual tracking by re-detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 6578–6588
Wu Y, Lim J, Yang M (2015) Object tracking benchmark. IEEE Trans Pattern Anal Mach Intell 37(9):1834–1848
Xu Y, Wang Z, Li Z, Ye Y, Yu G (2020) SiamFC++: Towards robust and accurate visual tracking with target estimation guidelines. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp 12549–12556
Zhang J, Jin X, Sun J, Wang J, Li K (2019a) Dual model learning combined with multiple feature selection for accurate visual tracking. IEEE Access 7:43956–43969
Zhang J, Wu Y, Feng W, Wang J (2019b) Spatially attentive visual tracking using multi-model adaptive response fusion. IEEE Access 7:83873–83887
Zhang L, Gonzalez-Garcia A, Weijer JVD, Danelljan M, Khan FS (2019c) Learning the model update for siamese trackers. In: Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pp 4009–4018
Zhang J, Jin X, Sun J, Wang J, Sangaiah AK (2020a) Spatial and semantic convolutional features for robust visual object tracking. Multimed Tools Appl 79(21):15095–15115
Zhang J, Sun J, Wang J, Yue X-G (2020b) Visual object tracking based on residual network and cascaded correlation filters. J Ambient Intell Humaniz Comput 12:8427–8440
Zhang J, Liu Y, Liu H, Wang J (2021) Learning local–global multiple correlation filters for robust visual tracking with kalman filter redetection. Sensors 21(4):1129
Zhao S, Xu T, Wu X-J, Zhu X-F (2021) Adaptive feature fusion for visual object tracking. Pattern Recognit 111:107679
Zhao H, Yang G, Wang D, Lu H (2021) Deep mutual learning for visual object tracking. Pattern Recognit 112:107796
Zhu Z, Wang Q, Li B, Wu W, Yan J, Hu W (2018) Distractor-aware siamese networks for visual object tracking. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 101–117
Acknowledgements
The authors would like to thank the anonymous reviewers for their useful and constructive comments, which helped improve the quality of this paper.
Funding
This work was supported by the National Natural Science Foundation of China (Grant No. 62176272), Guangzhou Science and Technology Fund (Grant No. 201803010072), Science, Technology & Innovation Commission of Shenzhen Municipality (JCYL 20170818165305521), and China Medical University Hospital (DMR-107-067, DMR-108-132, DMR-110-097). We also acknowledge the start-up funding from SYSU “Hundred Talent Program”.
Author information
Authors and Affiliations
Contributions
Xuedong He completed the experiments and analyzed the results. Xuedong He, Lu Zhao and Calvin Yu-Chian Chen wrote the manuscript together.
Corresponding author
Ethics declarations
Conflicts of interest
The authors report no conflicts of interest in this work.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
He, X., Zhao, L. & Chen, C.YC. Variable scale learning for visual object tracking. J Ambient Intell Human Comput 14, 3315–3330 (2023). https://doi.org/10.1007/s12652-021-03469-2