SMDF: Spatial Mass Distribution Features and Deep Learning-Based Technique for Human Activity Recognition

  • Original Research
  • Published:
SN Computer Science

Abstract

Automatic detection of human activity is a growing research area owing to its wide range of applications, such as elderly and patient monitoring for ambient assisted living and visual surveillance. This paper presents a novel bi-channel deep learning model to recognize several activities of daily living. One channel classifies activity using key poses based on the spatial mass distribution feature (SMDF). The key poses of all training classes are estimated by applying the K-means algorithm to the SMDFs of the corresponding silhouette images. The key poses of a test video are extracted in the same way, and the k-nearest neighbor (KNN) technique then determines the activity class. The other channel uses VGG-16 to classify human activity from the preprocessed frame sequence of a video. Finally, the results of the two channels are fused to produce the final activity classification. The bi-channel procedure yields better accuracy than either individual channel. The proposed technique is evaluated on the single-view Weizmann and multi-view MuHAVi datasets, achieving an accuracy of 98.8% on Weizmann and 98.5% on both MuHAVi-14 and MuHAVi-8. These accuracies exceed those of other state-of-the-art methods across different parameter settings, establishing the effectiveness of the proposed method.
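The key-pose channel described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the SMDF vectors are replaced by synthetic feature rows, and the number of key poses per class (k = 4) and the KNN neighbourhood size are illustrative choices.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

# Synthetic stand-ins for SMDF vectors (one row per silhouette frame).
# Two toy activity classes drawn from well-separated distributions.
frames_a = rng.normal(0.0, 1.0, size=(60, 8))  # e.g. "walk"
frames_b = rng.normal(3.0, 1.0, size=(60, 8))  # e.g. "wave"

def key_poses(frames, k=4):
    """Estimate k key poses as the K-means cluster centres of SMDF vectors."""
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(frames)
    return km.cluster_centers_

# Training: compute the key poses of every class and label them by class.
train_poses = np.vstack([key_poses(frames_a), key_poses(frames_b)])
train_labels = np.array([0] * 4 + [1] * 4)
knn = KNeighborsClassifier(n_neighbors=3).fit(train_poses, train_labels)

# Testing: extract the key poses of an unseen video, classify each with KNN,
# and take a majority vote over the per-pose labels.
test_video = rng.normal(3.0, 1.0, size=(30, 8))  # resembles class 1
votes = knn.predict(key_poses(test_video))
predicted = int(np.bincount(votes).argmax())
print(predicted)
```

In the paper this vote is then fused with the VGG-16 channel's prediction; the sketch covers only the key-pose channel.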



Data availability

The data used in this work comprise two datasets: the single-view Weizmann dataset and the multi-view MuHAVi dataset.


Author information

Corresponding author

Correspondence to Amlan Raychaudhuri.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the topical collection “Quantum Computing and Emerging Technologies” guest edited by Himanshu Thapliyal and Saraju Mohanty.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Raychaudhuri, A., Maity, S., Chakrabarti, A. et al. SMDF: Spatial Mass Distribution Features and Deep Learning-Based Technique for Human Activity Recognition. SN COMPUT. SCI. 5, 129 (2024). https://doi.org/10.1007/s42979-023-02452-2

