
Skeleton joint trajectories based human activity recognition using deep RNN

Published in: Multimedia Tools and Applications

Abstract

Human activity recognition is the task of identifying activities performed by humans, often in real time. It can be performed on video data or on richer modalities such as inertial measurements, depth maps, or human skeletal joint trajectories. In this work, we recognize human actions from skeletal joint trajectories of the body using a deep recurrent neural network. The proposed method was evaluated on two standard benchmarks, the UTD-MHAD and MSR Daily Activity 3D datasets, and compared against a range of recently published state-of-the-art (SOTA) methods. Our model performs well on both datasets, achieving accuracies of 99.07% on UTD-MHAD and 91% on MSR Daily Activity 3D, and can recognize human activities from a variety of domains.
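To make the pipeline concrete, the sketch below shows how a deep RNN can classify skeleton joint trajectories: each frame's joint coordinates are flattened into a feature vector, a stacked LSTM consumes the frame sequence, and a linear head predicts the action class from the final hidden state. This is a minimal illustrative sketch, not the authors' implementation: the joint count (20 Kinect joints, as in UTD-MHAD), the hidden size, the network depth, and the class count (27) are assumptions chosen for illustration.

```python
# Minimal sketch of a deep RNN over skeleton joint trajectories.
# Assumed shapes: 20 joints x 3 coordinates per frame, 27 action
# classes (UTD-MHAD-style); the paper's exact architecture and
# preprocessing are not reproduced here.
import torch
import torch.nn as nn

class SkeletonRNN(nn.Module):
    def __init__(self, num_joints=20, coords=3, hidden=128,
                 layers=2, num_classes=27):
        super().__init__()
        # Each frame is a flattened vector of joint coordinates.
        self.rnn = nn.LSTM(input_size=num_joints * coords,
                           hidden_size=hidden,
                           num_layers=layers,
                           batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, x):
        # x: (batch, frames, num_joints * coords)
        out, _ = self.rnn(x)
        # Classify from the hidden state at the final time step.
        return self.head(out[:, -1, :])

model = SkeletonRNN()
clips = torch.randn(4, 60, 20 * 3)  # 4 clips, 60 frames each
logits = model(clips)               # (4, 27) class scores
```

In practice, skeleton sequences of varying length would be padded or resampled to a fixed number of frames before batching, and the per-frame joint coordinates are typically normalized (e.g., centered on a root joint) so that the network learns motion patterns rather than absolute positions.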




Data Availability

Data sharing is not applicable to this article, as no datasets were generated or analysed during the current study.


Acknowledgements

None.

Funding

The authors received no funding for this work. There are no financial or non-financial interests, direct or indirect, related to the work submitted for publication, and all authors declare no conflicts of interest.

Author information


Contributions

Atiya Usmani: conceived the design of the proposed architecture, collected the data and discussed it with all authors, analyzed and verified the computational results, contributed to the literature review, and drafted the article. Nadia Siddiqui: conducted an extensive literature review and helped draft the article. Atiya Usmani and Nadia Siddiqui: revised the draft critically for important intellectual content. Saiful Islam: revised the article critically for important intellectual content and gave final approval of the version to be submitted.

Corresponding author

Correspondence to Nadia Siddiqui.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Atiya Usmani, Nadia Siddiqui and Saiful Islam contributed equally to this work.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Usmani, A., Siddiqui, N. & Islam, S. Skeleton joint trajectories based human activity recognition using deep RNN. Multimed Tools Appl 82, 46845–46869 (2023). https://doi.org/10.1007/s11042-023-15024-6

