Abstract
Most existing tracking methods based on convolutional neural network (CNN) models are too slow for real-time applications despite their excellent tracking accuracy compared with traditional methods. Moreover, CNN tracking solutions are memory intensive and require considerable computational resources. In this paper, we propose a time-efficient and accurate tracking scheme: a feature selection accelerated CNN (FSNet) tracker built on MDNet (Multi-Domain Network). The large number of convolutional operations is a major contributor to the high computational cost of MDNet. To reduce this complexity, we incorporate an efficient mutual-information-based feature selection over the convolutional layers that reduces redundancy in the feature maps. Because tracking is a typical binary classification problem, the redundant feature maps can simply be pruned, with negligible influence on tracking performance. To further accelerate the tracker, a RoIAlign layer is added so that convolution is applied to the entire image rather than separately to each RoI (Region of Interest); the bilinear interpolation used in RoIAlign also reduces misalignment errors for the tracked target. In addition, a new fine-tuning strategy for the fully-connected layers accelerates the online updating process. By combining these strategies, the accelerated network runs at 60 FPS (frames per second) on a GPU, compared with the original MDNet, which runs at 1 FPS, with only a very small loss in tracking accuracy. We evaluated the proposed solution on four benchmarks: OTB50, OTB100, VOT2016 and UAV123. Extensive comparison results verify the superior performance of FSNet.
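To make the two acceleration ideas summarized above concrete, the sketch below illustrates (i) scoring convolutional feature-map channels by their mutual information with the binary target/background label and keeping only the most informative ones, and (ii) extracting per-candidate features with a RoIAlign layer so the backbone convolution runs only once on the whole frame. This is a minimal sketch under stated assumptions, not the paper's implementation: the global-average-pooling summary, the keep ratio, the 3x3 output size, and the function names are illustrative choices.

```python
# Minimal sketch (not the authors' code) of mutual-information-based
# channel pruning and RoIAlign-based candidate feature extraction.
import numpy as np
import torch
from sklearn.feature_selection import mutual_info_classif
from torchvision.ops import roi_align


def select_channels(feature_maps: torch.Tensor,
                    labels: np.ndarray,
                    keep_ratio: float = 0.5) -> np.ndarray:
    """Return indices of the most informative feature-map channels.

    feature_maps: (N, C, H, W) conv features of N candidate samples.
    labels:       (N,) binary labels, 1 = target, 0 = background.
    """
    # Summarize each channel with global average pooling -> (N, C).
    responses = feature_maps.mean(dim=(2, 3)).cpu().numpy()
    # Mutual information between each channel response and the label.
    mi = mutual_info_classif(responses, labels)
    k = max(1, int(keep_ratio * feature_maps.shape[1]))
    # Indices of the k highest-scoring channels (copy() makes them contiguous).
    return np.argsort(mi)[::-1][:k].copy()


def candidate_features(frame_features: torch.Tensor,
                       boxes: torch.Tensor,
                       spatial_scale: float) -> torch.Tensor:
    """Pool features for all candidate boxes from one whole-frame pass.

    frame_features: (1, C, H, W) features of the entire frame.
    boxes:          (K, 5) rows of (batch_index, x1, y1, x2, y2) in image coordinates.
    """
    # Bilinear sampling inside each RoI avoids the quantization
    # misalignment of hard RoI pooling.
    return roi_align(frame_features, boxes, output_size=(3, 3),
                     spatial_scale=spatial_scale, sampling_ratio=2)
```

In such a scheme, pruning would simply index the kept channels (e.g., `feature_maps[:, kept_idx]`) before the fully-connected classifier, which is where the saving in convolutional and fully-connected computation comes from.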
Acknowledgements
This work is supported by the National Key R&D Program of China (2018AAA0101501) and the Science and Technology Project of SGCC (State Grid Corporation of China): Fundamental Theory of Human-in-the-Loop Hybrid-Augmented Intelligence for Power Grid Dispatch and Control.
Cite this article
Cui, Z., Lu, N. Feature selection accelerated convolutional neural networks for visual tracking. Appl Intell 51, 8230–8244 (2021). https://doi.org/10.1007/s10489-021-02234-4