Human action recognition based on the Grassmann multi-graph embedding

  • Original Paper
Signal, Image and Video Processing

Abstract

This paper introduces a human action recognition method based on kernelized Grassmann manifold learning. The goal is to find a map that transfers high-dimensional data to a discriminative low-dimensional space while taking the geometry of the manifold into account. To this end, we propose a multi-graph embedding method that uses three graphs: the center-class, within-class and between-class similarity graphs. An advantage of the proposed method is that these graphs capture both local and semi-global information in the data. Graphs play an important role in subspace learning methods, but most graph-based methods construct their graphs with the Euclidean distance and therefore ignore the geometry of manifold-valued data, which makes them sensitive to noise and outliers. To address these problems, the geodesic distance is used to build the neighborhood graphs. We analyze the performance of the proposed method on both noisy (complex) and less noisy (simple) datasets; accordingly, two geodesic distance metrics are used to compute the geodesic distances for these datasets. Experimental results demonstrate the performance of the proposed method.
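To make the graph-construction idea concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not say which two geodesic metrics are used, so the arc-length distance and the projection metric below are assumptions chosen because both are standard distances on the Grassmann manifold derived from principal angles. The helper names (`orthonormal_basis`, `class_graphs`) and the neighborhood size `k` are hypothetical, and the center-class graph and the kernelized embedding step are not shown because the abstract does not define them.

```python
import numpy as np

def orthonormal_basis(X):
    """Hypothetical helper: orthonormal basis (D x p) for the column span of a
    full-column-rank D x p matrix, i.e. one point on the Grassmann manifold."""
    Q, _ = np.linalg.qr(X)
    return Q

def principal_angles(Y1, Y2):
    """Principal angles between the subspaces spanned by orthonormal Y1 and Y2."""
    s = np.linalg.svd(Y1.T @ Y2, compute_uv=False)
    return np.arccos(np.clip(s, -1.0, 1.0))

def geodesic_distance(Y1, Y2):
    """Arc-length geodesic distance: 2-norm of the principal angles."""
    return np.linalg.norm(principal_angles(Y1, Y2))

def projection_distance(Y1, Y2):
    """Projection metric: 2-norm of the sines of the principal angles."""
    return np.linalg.norm(np.sin(principal_angles(Y1, Y2)))

def class_graphs(points, labels, k, dist=geodesic_distance):
    """Build symmetric 0/1 within-class and between-class k-NN graphs.

    Assumption: each sample is linked to its k nearest same-class samples
    (within-class graph) and its k nearest other-class samples
    (between-class graph), with nearness measured by `dist`.
    """
    labels = np.asarray(labels)
    n = len(points)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = dist(points[i], points[j])

    W_within = np.zeros((n, n), dtype=int)
    W_between = np.zeros((n, n), dtype=int)
    idx = np.arange(n)
    for i in range(n):
        same = np.flatnonzero((labels == labels[i]) & (idx != i))
        diff = np.flatnonzero(labels != labels[i])
        for j in same[np.argsort(D[i, same])[:k]]:
            W_within[i, j] = W_within[j, i] = 1
        for j in diff[np.argsort(D[i, diff])[:k]]:
            W_between[i, j] = W_between[j, i] = 1
    return D, W_within, W_between
```

Passing `dist=projection_distance` to `class_graphs` swaps the metric without changing the graph logic, which mirrors the abstract's idea of using a different distance for noisy and less noisy datasets. Which metric suits which dataset is not stated in the abstract.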

Author information

Corresponding author

Correspondence to Ali Aghagolzadeh.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (DOCX 216 kb)

About this article

Cite this article

Rahimi, S., Aghagolzadeh, A. & Ezoji, M. Human action recognition based on the Grassmann multi-graph embedding. SIViP 13, 271–279 (2019). https://doi.org/10.1007/s11760-018-1354-1

  • DOI: https://doi.org/10.1007/s11760-018-1354-1
