Linearized kernel dictionary learning with group sparse priors for action recognition

  • Original Article
  • The Visual Computer

Abstract

Classification-driven dictionary learning has been used successfully in pattern recognition and computer vision in recent years. In this paper, a discriminative dictionary is constructed by concatenating all class-specific sub-dictionaries with one sub-dictionary that contains the patterns common to all classes. To further enhance the discriminative power, we propose using group sparse priors in the coding stage of the dictionary learning process. A kernel dictionary is learned to address the same direction distribution problem that exists in the traditional dictionary learning framework. The kernel dictionary is learned in a linearized manner using virtual features. We evaluate our method on three public action datasets: Facial Expression, Hand Gesture, and UCF Sports. Experimental results demonstrate that our method achieves better, or at least competitive, performance compared with other action recognition methods.
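
The two ingredients named in the abstract, virtual features and group sparse coding, can be illustrated with a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the virtual features are assumed to come from a Nyström-style approximation of a Gaussian kernel, and the group sparse code is assumed to be computed by proximal gradient descent with group soft-thresholding. All function names, the kernel choice, and the parameters `gamma`, `lam`, and `n_iter` are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian kernel between the columns of A (d x m) and B (d x n)."""
    sq = (A * A).sum(0)[:, None] + (B * B).sum(0)[None, :] - 2.0 * A.T @ B
    return np.exp(-gamma * np.maximum(sq, 0.0))

def virtual_features(X, landmarks, gamma=1.0, eps=1e-10):
    """Assumed Nystrom-style map: returns F (k x n) with F.T @ F ~= K(X, X)."""
    W = rbf_kernel(landmarks, landmarks, gamma)   # c x c landmark kernel
    C = rbf_kernel(landmarks, X, gamma)           # c x n cross kernel
    vals, vecs = np.linalg.eigh(W)
    keep = vals > eps                             # drop near-null directions
    return (vecs[:, keep] / np.sqrt(vals[keep])).T @ C

def group_soft_threshold(a, groups, tau):
    """Proximal operator of tau * sum_g ||a_g||_2 (shrinks whole groups)."""
    out = np.zeros_like(a)
    for idx in groups:
        norm = np.linalg.norm(a[idx])
        if norm > tau:
            out[idx] = (1.0 - tau / norm) * a[idx]
    return out

def group_sparse_code(D, y, groups, lam=0.1, n_iter=200):
    """Proximal gradient for 0.5 * ||y - D a||^2 + lam * sum_g ||a_g||_2."""
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)      # 1 / Lipschitz constant
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - y)
        a = group_soft_threshold(a - step * grad, groups, step * lam)
    return a
```

In this sketch, each entry of `groups` would hold the atom indices of one sub-dictionary (one per class, plus the shared one), so the penalty switches whole sub-dictionaries on or off. Because the virtual features live in an ordinary Euclidean space, any linear dictionary learning routine can be run on them directly, which is the sense in which the kernel problem is linearized here.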

Author information

Corresponding author

Correspondence to Chunhai Hu.

Ethics declarations

Funding

This study was funded by the Hebei Province Science and Technology Support Program (No. 15220324).

Conflict of Interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Fan, C., Hu, C. & Liu, B. Linearized kernel dictionary learning with group sparse priors for action recognition. Vis Comput 35, 1797–1807 (2019). https://doi.org/10.1007/s00371-018-1603-x

  • DOI: https://doi.org/10.1007/s00371-018-1603-x
