Tucker decomposition-based tensor learning for human action recognition

  • Regular Paper
  • Multimedia Systems

Abstract

Spatial information is an important cue for human action recognition. Unlike a vector representation, a tensor representation preserves the spatial structure of a human action in a still image. This paper proposes a robust human action recognition algorithm based on tensor representation and Tucker decomposition. Each still image containing a human action is represented by a tensor descriptor built from Histograms of Oriented Gradients (HOG), which preserves the spatial information of the action. Given this representation, the unknown parameter tensor is first decomposed according to the Tucker decomposition, and the resulting optimization problem is then solved by alternating optimization: at each iteration, the tensor descriptor is projected along one order (mode), and the parameter along the corresponding order is estimated by solving a ridge regression problem. The estimated parameter tensor is more discriminative because it exploits the spatial information along each order. Experiments are conducted on action images from three publicly available databases, and the results show that the proposed method outperforms the compared methods.
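The alternating scheme described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes third-order descriptors, a plain squared loss with no bias term, and one ridge penalty shared by all blocks. The names `fit_tucker_ridge`, `predict`, `ridge`, `ranks`, and `lam` are hypothetical, and random tensors stand in for HOG descriptors. With the other blocks fixed, the score ⟨W, X⟩ is linear in the block being updated, so each update reduces to an ordinary ridge regression on the descriptor projected along the remaining orders.

```python
import numpy as np


def ridge(Z, y, lam):
    """Closed-form ridge regression: argmin_w ||Z w - y||^2 + lam * ||w||^2."""
    d = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(d), Z.T @ y)


def predict(X, G, U):
    """Score <W, X_n> per sample, with W = G x_1 U[0] x_2 U[1] x_3 U[2]."""
    W = np.einsum('abc,ia,jb,kc->ijk', G, U[0], U[1], U[2])
    return np.einsum('nijk,ijk->n', X, W)


def fit_tucker_ridge(X, y, ranks, lam=1e-2, n_iter=20, seed=0):
    """Alternating ridge updates of the Tucker factors and core (illustrative)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    dims = X.shape[1:]
    # Random initialisation of the factor matrices and the core tensor.
    U = [rng.standard_normal((d, r)) / np.sqrt(d) for d, r in zip(dims, ranks)]
    G = rng.standard_normal(ranks)
    # Each spec contracts X with every block except the factor being updated.
    specs = ['nijk,abc,jb,kc->nia',
             'nijk,abc,ia,kc->njb',
             'nijk,abc,ia,jb->nkc']
    for _ in range(n_iter):
        for m in range(3):
            others = [U[k] for k in range(3) if k != m]
            # Project the descriptors so the score is linear in vec(U[m]).
            Z = np.einsum(specs[m], X, G, *others).reshape(n, -1)
            U[m] = ridge(Z, y, lam).reshape(dims[m], ranks[m])
        # Core update: project the descriptors onto all three factors.
        Z = np.einsum('nijk,ia,jb,kc->nabc', X, *U).reshape(n, -1)
        G = ridge(Z, y, lam).reshape(ranks)
    return G, U


# Synthetic usage: targets generated from a low-rank Tucker parameter tensor.
demo_rng = np.random.default_rng(42)
X_demo = demo_rng.standard_normal((200, 6, 5, 4))
G_true = demo_rng.standard_normal((2, 2, 2))
U_true = [demo_rng.standard_normal((d, 2)) for d in (6, 5, 4)]
y_demo = predict(X_demo, G_true, U_true)

G_hat, U_hat = fit_tucker_ridge(X_demo, y_demo, ranks=(2, 2, 2), lam=1e-3)
print("training MSE:", np.mean((predict(X_demo, G_hat, U_hat) - y_demo) ** 2))
```

Because every block update exactly minimises the shared regularised objective with the other blocks held fixed, that objective cannot increase from one update to the next. For multi-class action recognition, one such regressor per class (one-vs-rest on label indicators) would be a natural extension, though that choice is an assumption here and is not taken from the abstract.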

Acknowledgments

This work was partly supported by the National Program on the Key Basic Research Project (under Grant 2013CB329301), NSFC (under Grants 61202166 and 61472276), the Major Project of National Social Science Fund (under Grant 14ZDB153), and the Doctoral Fund of the Ministry of Education of China (under Grant 20120032120042).

Author information

Corresponding author

Correspondence to Yahong Han.

Additional information

Communicated by B. Prabhakaran.

About this article

Cite this article

Zhang, J., Han, Y. & Jiang, J. Tucker decomposition-based tensor learning for human action recognition. Multimedia Systems 22, 343–353 (2016). https://doi.org/10.1007/s00530-015-0464-7

  • DOI: https://doi.org/10.1007/s00530-015-0464-7
