Human action recognition based on the Grassmann multi-graph embedding

Rahimi, Sahere; Aghagolzadeh, Ali; Ezoji, Mehdi

doi:10.1007/s11760-018-1354-1

Original Paper
Published: 06 September 2018

Signal, Image and Video Processing, Volume 13, pages 271–279 (2019)

Abstract

This paper introduces a human action recognition method based on kernelized Grassmann manifold learning. The goal is to find a map that transfers high-dimensional data to a discriminative low-dimensional space while taking the geometry of the manifold into account. To this end, a multi-graph embedding method is proposed that uses three graphs: the center-class, within-class and between-class similarity graphs. Together, these graphs capture both local and semi-global information in the data, which is the main benefit of the proposed method. Graphs play an important role in subspace learning methods, but most graph-based methods construct their graphs with the Euclidean distance and therefore ignore the geometry of manifold-valued data, which makes them sensitive to noise and outliers. To handle these problems, the geodesic distance is used to build the neighborhood graphs. The proposed method is evaluated on both noisy (complex) and less noisy (simple) datasets; accordingly, two geodesic distance metrics are used to compute the distances on these datasets. Experimental results on these datasets demonstrate the performance of the proposed method.
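
To illustrate the idea of building neighborhood graphs from manifold distances rather than the Euclidean distance, the following is a minimal sketch and not the authors' implementation. The abstract does not state which two distance metrics are used, so the arc-length distance and the projection metric (two commonly used Grassmann distances based on principal angles) are assumed here. The function names, the subspace dimension `p`, the neighborhood size `k` and the binary edge weights are all hypothetical choices made for the example. The center-class graph is omitted because the abstract does not define it.

```python
import numpy as np

def orthonormal_basis(X, p):
    """Represent a data matrix (features x frames) as a point on the
    Grassmann manifold: the span of its top-p left singular vectors."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :p]

def principal_angles(A, B):
    """Principal angles between the subspaces spanned by the
    orthonormal bases A and B (both d x p)."""
    s = np.linalg.svd(A.T @ B, compute_uv=False)
    return np.arccos(np.clip(s, -1.0, 1.0))

def arc_length_distance(A, B):
    """Assumed metric 1: 2-norm of the principal angles."""
    return np.linalg.norm(principal_angles(A, B))

def projection_distance(A, B):
    """Assumed metric 2: 2-norm of the sines of the principal angles."""
    return np.linalg.norm(np.sin(principal_angles(A, B)))

def class_aware_knn_graphs(bases, labels, k, dist=arc_length_distance):
    """Build within-class and between-class k-nearest-neighbor graphs
    from pairwise subspace distances (binary, symmetric weights)."""
    labels = np.asarray(labels)
    n = len(bases)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = dist(bases[i], bases[j])

    W_within = np.zeros((n, n))
    W_between = np.zeros((n, n))
    others = np.arange(n)
    for i in range(n):
        same = np.flatnonzero((labels == labels[i]) & (others != i))
        diff = np.flatnonzero(labels != labels[i])
        for candidates, W in ((same, W_within), (diff, W_between)):
            # Keep the k candidates closest to sample i.
            nearest = candidates[np.argsort(D[i, candidates])[:k]]
            W[i, nearest] = W[nearest, i] = 1.0
    return D, W_within, W_between

# Usage with synthetic data standing in for per-video feature matrices.
rng = np.random.default_rng(0)
videos = [rng.standard_normal((6, 5)) for _ in range(4)]
bases = [orthonormal_basis(X, p=2) for X in videos]
D, W_within, W_between = class_aware_knn_graphs(bases, [0, 0, 1, 1], k=1)
```

Swapping `dist=projection_distance` into `class_aware_knn_graphs` changes only how the neighbors are ranked; the graph construction itself is unchanged.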

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, Babol Noshirvani University of Technology, Babol, Iran
Sahere Rahimi, Ali Aghagolzadeh & Mehdi Ezoji

Corresponding author

Correspondence to Ali Aghagolzadeh.

Electronic supplementary material

Supplementary material 1 (DOCX 216 kb)

About this article

Cite this article

Rahimi, S., Aghagolzadeh, A. & Ezoji, M. Human action recognition based on the Grassmann multi-graph embedding. SIViP 13, 271–279 (2019). https://doi.org/10.1007/s11760-018-1354-1

Received: 06 February 2018
Revised: 13 August 2018
Accepted: 17 August 2018
Published: 06 September 2018
Issue Date: 12 March 2019
DOI: https://doi.org/10.1007/s11760-018-1354-1
