How do deep convolutional features affect tracking performance: an experimental study

  • Original Article
  • The Visual Computer

Abstract

Visual tracking is an important topic in computer vision with many practical applications. Recently, deep learning methods have been introduced into the tracking community to improve tracking performance. How deep features affect tracking performance, however, has not been studied thoroughly. In this paper, we carry out an experimental study to investigate in depth the impact of convolutional features on tracking performance. We equip our baseline tracking framework with the most influential and representative Convolutional Neural Network (CNN) models that are widely used in computer vision. First, we carry out experiments on each CNN to reveal the relationship between tracking performance and different CNN layers. Second, we compare tracking performance across different CNN models. In addition, we explore the effect of ensemble strategies for CNN features on tracking performance. Our work yields several valuable findings on the relationship between tracking performance and convolutional features. Based on these findings, we derive a few useful guidelines for designing trackers with better performance. We also develop a simple baseline tracker following the guidelines, and it easily outperforms several state-of-the-art trackers on challenging benchmark video sequences.

Acknowledgements

This work was supported by the Fundamental Research Funds for the Central Universities (Grant No. 2017RC54).

Author information

Corresponding author

Correspondence to Hao Guan.

About this article

Cite this article

Guan, H., Cheng, B. How do deep convolutional features affect tracking performance: an experimental study. Vis Comput 34, 1701–1711 (2018). https://doi.org/10.1007/s00371-017-1445-y

  • DOI: https://doi.org/10.1007/s00371-017-1445-y
