Visual tracking based on stacked Denoising Autoencoder network with genetic algorithm optimization

Published in Multimedia Tools and Applications

Abstract

Visual object tracking in dynamic environments with severe appearance variations is a significant problem in computer vision. This paper proposes a novel visual tracking algorithm that exploits the multi-level feature learning ability of a stacked denoising autoencoder (SDAE). The SDAE network is trained in two stages: layer-wise pre-training and fine-tuning. In the pre-training stage, a two-layer sparse coding method is used to represent the input image, yielding a multi-level image feature descriptor. In the fine-tuning stage, the connection weights and bias terms used for back-propagation are obtained via a genetic algorithm. A logistic classification layer is added on top of the encoder network to enable tracking within the well-established particle filter framework. Experimental results confirm, both qualitatively and quantitatively, that the proposed method performs well against eight state-of-the-art methods.
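
The following is a minimal, hypothetical NumPy sketch of the pipeline outlined above, not the authors' implementation. For brevity it substitutes plain denoising-autoencoder reconstruction for the paper's two-layer sparse-coding pre-training, uses a toy real-coded genetic algorithm only to choose the initial weights and bias of the logistic layer before gradient-based fine-tuning, fine-tunes the top layer only, and feeds random vectors in place of image patches and particle-filter candidates; all function names and hyper-parameters are illustrative assumptions.

```python
# Minimal NumPy sketch of the pipeline described in the abstract (not the authors' code).
# Assumptions: plain denoising-autoencoder reconstruction stands in for the paper's
# two-layer sparse-coding pre-training; the genetic algorithm is a toy real-coded GA
# used only to pick the initial weights/bias of the logistic layer before back-propagation;
# fine-tuning touches the top layer only; random vectors stand in for image patches
# and particle-filter candidates. All names and hyper-parameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def pretrain_dae_layer(X, n_hidden, noise=0.3, lr=0.5, epochs=100):
    """Greedy layer-wise pre-training of one denoising autoencoder layer."""
    n_in = X.shape[1]
    We = rng.normal(0, 0.1, (n_in, n_hidden)); be = np.zeros(n_hidden)   # encoder
    Wd = rng.normal(0, 0.1, (n_hidden, n_in)); bd = np.zeros(n_in)       # decoder
    for _ in range(epochs):
        Xn = X * (rng.random(X.shape) > noise)     # masking corruption of the input
        H = sigmoid(Xn @ We + be)                  # encode the corrupted input
        R = sigmoid(H @ Wd + bd)                   # reconstruct the clean input
        dR = (R - X) * R * (1 - R)                 # squared-error delta at the decoder
        dH = (dR @ Wd.T) * H * (1 - H)             # back-propagated delta at the encoder
        Wd -= lr * H.T @ dR / len(X);  bd -= lr * dR.mean(0)
        We -= lr * Xn.T @ dH / len(X); be -= lr * dH.mean(0)
    return We, be

def encode(X, layers):
    """Pass data through the stacked encoder."""
    for W, b in layers:
        X = sigmoid(X @ W + b)
    return X

def ga_init_logistic(F, y, pop=30, gens=40, sigma=0.1):
    """Toy GA: evolve initial weights/bias of the logistic layer (last entry = bias)."""
    d = F.shape[1]
    def fitness(w):
        return -np.mean((y - sigmoid(F @ w[:d] + w[d])) ** 2)   # higher is better
    pop_w = rng.normal(0, 0.5, (pop, d + 1))
    for _ in range(gens):
        fit = np.array([fitness(w) for w in pop_w])
        parents = pop_w[np.argsort(fit)[-pop // 2:]]             # truncation selection
        children = parents[rng.integers(0, len(parents), pop - len(parents))]
        pop_w = np.vstack([parents, children + rng.normal(0, sigma, children.shape)])
    return pop_w[np.argmax([fitness(w) for w in pop_w])]

def finetune_logistic(F, y, w, lr=0.5, epochs=200):
    """Fine-tune the GA-initialised logistic layer by gradient descent."""
    d = F.shape[1]
    for _ in range(epochs):
        g = sigmoid(F @ w[:d] + w[d]) - y                        # logistic-loss gradient
        w[:d] -= lr * F.T @ g / len(F); w[d] -= lr * g.mean()
    return w

# Toy usage: pre-train on random "patches", then score particle-filter candidates.
X = rng.random((200, 64))                                        # flattened 8x8 patches
layers, H = [], X
for n_h in (32, 16):                                             # greedy stacking
    W, b = pretrain_dae_layer(H, n_h)
    layers.append((W, b)); H = sigmoid(H @ W + b)
y = (rng.random(200) > 0.5).astype(float)                        # fake target/background labels
w = finetune_logistic(H, y, ga_init_logistic(H, y))              # GA init, then back-prop
candidates = rng.random((50, 64))                                # particle-filter samples (stub)
scores = sigmoid(encode(candidates, layers) @ w[:-1] + w[-1])    # confidence per candidate
print("highest-confidence candidate:", int(np.argmax(scores)))
```

In the full tracker, the candidate patches would instead be drawn by the particle filter around the previous target state, and the patch with the highest logistic confidence would be taken as the new target location.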

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Grant No. 61672433). The authors would like to thank the reviewers and editors for their valuable comments.

Author information

Corresponding author

Correspondence to Dawei Guo.

About this article

Cite this article

Hua, W., Mu, D., Guo, D. et al. Visual tracking based on stacked Denoising Autoencoder network with genetic algorithm optimization. Multimed Tools Appl 77, 4253–4269 (2018). https://doi.org/10.1007/s11042-017-4702-1

