Abstract
Spatial information is an important cue for human action recognition. Unlike a vector representation, a tensor representation preserves the spatial structure of human actions in still images. This paper proposes a robust human action recognition algorithm based on tensor representation and Tucker decomposition. In this method, a still image containing a human action is represented by a tensor descriptor (Histograms of Oriented Gradients), which preserves the spatial information of the action. Based on this representation, the unknown tensor parameter is first decomposed according to the Tucker tensor decomposition, and the resulting optimization problem is then solved by alternating optimization: at each iteration, the tensor descriptor is projected along one order, and the parameter along the corresponding order is estimated by solving a ridge regression problem. The estimated tensor parameter is more discriminative because it effectively uses the spatial information along each order. Experiments are conducted on action images from three publicly available databases. The experimental results demonstrate that our method outperforms the compared methods.
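The alternating scheme described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not state the loss, the regularizer, or how multiple classes are handled, so the sketch assumes a squared loss with a Frobenius-norm penalty on every Tucker factor, third-order descriptors, a single real-valued target per image, and no bias term. The function names (`ridge`, `predict`, `objective`, `fit_tucker_ridge`), the ranks, and the random data standing in for HOG tensors are all hypothetical. The core-tensor update is also an assumption added to complete the scheme.

```python
import numpy as np

def ridge(Phi, y, lam):
    """Closed-form ridge regression: argmin_w ||y - Phi w||^2 + lam * ||w||^2."""
    d = Phi.shape[1]
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(d), Phi.T @ y)

def predict(X, G, U):
    """Score <W, X_n> for each sample, with W = G x1 U[0] x2 U[1] x3 U[2]."""
    W = np.einsum('abc,ia,jb,kc->ijk', G, U[0], U[1], U[2])
    return np.einsum('nijk,ijk->n', X, W)

def objective(X, y, G, U, lam):
    """Assumed objective: squared loss plus Frobenius penalties on all factors."""
    reg = np.sum(G ** 2) + sum(np.sum(Un ** 2) for Un in U)
    return np.sum((y - predict(X, G, U)) ** 2) + lam * reg

def fit_tucker_ridge(X, y, ranks, lam=1.0, n_iter=10, seed=0):
    """Alternate ridge updates over the three factor matrices and the core."""
    rng = np.random.default_rng(seed)
    ranks = tuple(ranks)
    U = [rng.standard_normal((d, r)) for d, r in zip(X.shape[1:], ranks)]
    G = rng.standard_normal(ranks)
    # Each pattern projects the data along every mode except the one being
    # updated, so the score becomes linear in that mode's factor matrix.
    proj = ['nijk,abc,jb,kc->nia', 'nijk,abc,ia,kc->njb', 'nijk,abc,ia,jb->nkc']
    history = [objective(X, y, G, U, lam)]
    for _ in range(n_iter):
        for m in range(3):
            others = [U[k] for k in range(3) if k != m]
            Phi = np.einsum(proj[m], X, G, *others)
            w = ridge(Phi.reshape(len(y), -1), y, lam)
            U[m] = w.reshape(U[m].shape)
        # Core update: project the data onto all three factor matrices.
        Phi = np.einsum('nijk,ia,jb,kc->nabc', X, *U)
        G = ridge(Phi.reshape(len(y), -1), y, lam).reshape(ranks)
        history.append(objective(X, y, G, U, lam))
    return G, U, history

# Synthetic stand-in for tensor descriptors: 60 samples of shape 6 x 5 x 4.
rng = np.random.default_rng(1)
X = rng.standard_normal((60, 6, 5, 4))
y = np.sign(rng.standard_normal(60))  # hypothetical +/-1 labels
G, U, history = fit_tucker_ridge(X, y, ranks=(2, 2, 2), lam=1.0, n_iter=10)
```

Because each block update is the exact minimizer of the assumed objective with the other blocks held fixed, the values in `history` should not increase from one sweep to the next, which gives a simple sanity check on the implementation.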
Acknowledgments
This work was partly supported by the National Program on the Key Basic Research Project (under Grant 2013CB329301), the NSFC (under Grants 61202166 and 61472276), the Major Project of the National Social Science Fund (under Grant 14ZDB153), and the Doctoral Fund of the Ministry of Education of China (under Grant 20120032120042).
Additional information
Communicated by B. Prabhakaran.
About this article
Cite this article
Zhang, J., Han, Y. & Jiang, J. Tucker decomposition-based tensor learning for human action recognition. Multimedia Systems 22, 343–353 (2016). https://doi.org/10.1007/s00530-015-0464-7
DOI: https://doi.org/10.1007/s00530-015-0464-7