Feature selection accelerated convolutional neural networks for visual tracking

Published in Applied Intelligence.

Abstract

Most existing tracking methods based on convolutional neural network (CNN) models are too slow for real-time applications, despite their excellent tracking accuracy compared with traditional methods. CNN tracking solutions are also memory intensive and require considerable computational resources. In this paper, we propose a time-efficient and accurate tracking scheme, a feature selection accelerated CNN (FSNet), built on MDNet (Multi-Domain Network). The large number of convolutional operations is a major contributor to MDNet's high computational cost. To reduce this complexity, we incorporate an efficient mutual-information-based feature selection over the convolutional layers that reduces redundancy in the feature maps. Because tracking is essentially a binary classification problem, redundant feature maps can simply be pruned with negligible effect on tracking performance. To further accelerate the tracker, an RoIAlign layer is added so that convolution is applied to the entire image rather than to each RoI (Region of Interest) separately; the bilinear interpolation in RoIAlign also reduces misalignment errors for the tracked target. In addition, a new fine-tuning strategy is applied to the fully connected layers to accelerate online updating. Combining these strategies, the accelerated CNN reaches 60 FPS (frames per second) on a GPU, compared with roughly 1 FPS for the original MDNet, with very little loss in tracking accuracy. We evaluated the proposed solution on four benchmarks: OTB50, OTB100, VOT2016 and UAV123. Extensive comparisons verify the superior performance of FSNet.
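The abstract's core idea, ranking convolutional channels by their mutual information with the target/background label and pruning the low-scoring ones, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the simple equal-width histogram estimator, the per-channel mean-activation summary, and the `keep_ratio` parameter are all assumptions made for the sketch.

```python
import numpy as np

def channel_mutual_information(activations, labels, n_bins=8):
    """Estimate mutual information (in nats) between each channel's
    mean activation and a binary target/background label.

    activations: (n_samples, n_channels) per-channel mean responses.
    labels:      (n_samples,) array of 0/1 class labels.
    """
    n_samples, n_channels = activations.shape
    scores = np.zeros(n_channels)
    for c in range(n_channels):
        # Discretise the channel response into equal-width bins.
        edges = np.histogram_bin_edges(activations[:, c], bins=n_bins)
        binned = np.clip(np.digitize(activations[:, c], edges) - 1,
                         0, n_bins - 1)
        # Joint distribution over (activation bin, label).
        joint = np.zeros((n_bins, 2))
        for b, y in zip(binned, labels):
            joint[b, y] += 1
        joint /= n_samples
        px = joint.sum(axis=1, keepdims=True)   # marginal over bins
        py = joint.sum(axis=0, keepdims=True)   # marginal over labels
        nz = joint > 0
        scores[c] = np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz]))
    return scores

def select_channels(activations, labels, keep_ratio=0.25):
    """Indices of the top-scoring channels; the rest would be pruned."""
    scores = channel_mutual_information(activations, labels)
    k = max(1, int(keep_ratio * activations.shape[1]))
    return np.argsort(scores)[::-1][:k]
```

A channel whose response perfectly separates target from background scores MI = ln 2, while a constant (redundant) channel scores 0, so sorting by this score and keeping the top fraction implements the pruning step described above.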
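The RoIAlign step mentioned in the abstract pools each region of interest from a feature map computed once over the whole image, using bilinear interpolation at fractional coordinates instead of quantised indices. The sketch below shows that sampling idea with a single sample per output bin; the function names and the one-sample-per-bin choice are illustrative assumptions (production implementations typically average several samples per bin).

```python
import numpy as np

def bilinear_sample(feature, y, x):
    """Bilinearly interpolate a 2-D feature map at fractional (y, x)."""
    h, w = feature.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = (1 - dx) * feature[y0, x0] + dx * feature[y0, x1]
    bottom = (1 - dx) * feature[y1, x0] + dx * feature[y1, x1]
    return (1 - dy) * top + dy * bottom

def roi_align_pool(feature, roi, output_size=3):
    """Pool an RoI (y0, x0, y1, x1), given in feature-map coordinates,
    into an output_size x output_size grid by taking one bilinear
    sample at the centre of each grid cell."""
    y0, x0, y1, x1 = roi
    bin_h = (y1 - y0) / output_size
    bin_w = (x1 - x0) / output_size
    out = np.zeros((output_size, output_size))
    for i in range(output_size):
        for j in range(output_size):
            cy = y0 + (i + 0.5) * bin_h
            cx = x0 + (j + 0.5) * bin_w
            out[i, j] = bilinear_sample(feature, cy, cx)
    return out
```

Because no coordinate is rounded to an integer grid cell, candidate boxes that differ by a fraction of a pixel produce smoothly varying pooled features, which is the misalignment reduction the abstract attributes to bilinear interpolation.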


[Figures 1–13 appear in the full text.]



Acknowledgements

This work is supported by the National Key R&D Program of China (2018AAA0101501) and by the Science and Technology Project of SGCC (State Grid Corporation of China): Fundamental Theory of Human-in-the-loop Hybrid-Augmented Intelligence for Power Grid Dispatch and Control.

Author information

Corresponding author: Na Lu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Cui, Z., Lu, N. Feature selection accelerated convolutional neural networks for visual tracking. Appl Intell 51, 8230–8244 (2021). https://doi.org/10.1007/s10489-021-02234-4
