Hybrid two-stream dynamic CNN for view adaptive human action recognition using ensemble learning

Javed, Muhammad Hafeez; Yu, Zeng; Li, Tianrui; Rajeh, Taha M.; Rafique, Fahad; Waqar, Syed

doi:10.1007/s13042-021-01441-2

Original Article
Published: 02 November 2021

International Journal of Machine Learning and Cybernetics, volume 13, pages 1157–1166 (2022)

Muhammad Hafeez Javed¹,², Zeng Yu¹, Tianrui Li¹ (ORCID: orcid.org/0000-0001-7780-104X), Taha M. Rajeh¹, Fahad Rafique³ & Syed Waqar¹

Abstract

Human actions are sequential, structured patterns of body parts and their movements. In this paper, we present a hybrid two-stream convolutional neural network (H2SCNN) that recognizes actions from sequences by exploring statistical information such as skeleton data. The aim is to exploit the skeletons fully and to identify actions accurately by merging different motion-related features, namely motion features and joint features. The framework calculates the distance between consecutive sequences to form the temporal information required for recognition. The proposed H2SCNN works in two stages. In the first stage, a neighbourhood feature model processes each of the two inputs individually. In the second stage, the network performs ensemble learning and takes advantage of the diversity of the features by fusing them. The multi-task ensemble learning model improves the prediction ability of H2SCNN. Experiments on a benchmark dataset show that the proposed model outperforms other recent approaches.
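The two ideas in the abstract, temporal information derived from distances and late fusion of two streams, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a skeleton sequence stored as an array of shape (frames, joints, 3), takes "distance" to mean the Euclidean displacement of each joint between consecutive frames, and stands in for the ensemble stage with a simple weighted average of per-stream class probabilities. The function names and the `weight` parameter are hypothetical.

```python
import numpy as np


def motion_features(skeleton):
    """Per-joint displacement between consecutive frames.

    skeleton: array of shape (T, J, 3) with 3D joint coordinates (assumed layout).
    Returns an array of shape (T - 1, J) of Euclidean distances.
    """
    skeleton = np.asarray(skeleton, dtype=float)
    diffs = skeleton[1:] - skeleton[:-1]   # frame-to-frame differences
    return np.linalg.norm(diffs, axis=-1)  # distance moved by each joint


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    logits = np.asarray(logits, dtype=float)
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def fuse_streams(joint_logits, motion_logits, weight=0.5):
    """Late fusion (assumed): weighted average of the two streams' probabilities."""
    return weight * softmax(joint_logits) + (1.0 - weight) * softmax(motion_logits)
```

In this sketch each stream would be a separate network producing class scores, one fed with joint coordinates and the other with the output of `motion_features`; `fuse_streams` then combines their predictions into a single probability vector.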

Acknowledgements

This research was supported by the National Key R&D Program of China (no. 2019YFB2101802) and the National Natural Science Foundation of China (no. 61773324).

Author information

Authors and Affiliations

School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu, 611756, China
Muhammad Hafeez Javed, Zeng Yu, Tianrui Li, Taha M. Rajeh & Syed Waqar
Department of Software Engineering, Foundation University Islamabad, Rawalpindi, Pakistan
Muhammad Hafeez Javed
School of Computer Science and Technology, Harbin Engineering University, Harbin, China
Fahad Rafique

Corresponding authors

Correspondence to Zeng Yu or Tianrui Li.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Javed, M.H., Yu, Z., Li, T. et al. Hybrid two-stream dynamic CNN for view adaptive human action recognition using ensemble learning. Int. J. Mach. Learn. & Cyber. 13, 1157–1166 (2022). https://doi.org/10.1007/s13042-021-01441-2

Received: 16 August 2020
Accepted: 30 September 2021
Published: 02 November 2021
Issue Date: April 2022
DOI: https://doi.org/10.1007/s13042-021-01441-2
