Abstract
Automatic detection of human activity is a growing research area owing to its wide range of applications, such as elderly and patient monitoring for ambient assisted living and visual surveillance. This paper presents a novel bi-channel deep learning model to recognize several human daily living activities. One channel classifies activity using key poses based on the spatial mass distribution feature (SMDF). The key poses of all training classes are estimated by applying the K-means algorithm to the SMDF of the corresponding silhouette images. The key poses of a testing video are extracted in the same way, and the k-nearest neighbor (KNN) technique is then used to determine the activity class. The other channel applies VGG-16 to classify human activity from the preprocessed frame sequence of a video. Finally, the results of the two channels are combined to produce the final activity classification. The bi-channel procedure yields better accuracy than either individual channel. The proposed technique is validated on the single-view Weizmann and multi-view MuHAVi datasets: the accuracy is 98.8% on Weizmann and 98.5% on both MuHAVi-14 and MuHAVi-8. The proposed technique outperforms other state-of-the-art methods across different parameters, which establishes its effectiveness.
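The key-pose channel described above (K-means over per-frame SMDF vectors, followed by KNN classification) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the SMDF extraction step is abstracted away, and the `key_poses` helper, the feature dimensionality, and the synthetic training data are all hypothetical stand-ins.

```python
# Hedged sketch of the abstract's key-pose channel. Assumes each silhouette
# frame has already been reduced to a fixed-length SMDF feature vector; the
# synthetic Gaussian "videos" below stand in for real feature sequences.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

def key_poses(smdf_frames: np.ndarray, n_poses: int = 4) -> np.ndarray:
    """Cluster per-frame SMDF vectors; cluster centres serve as key poses."""
    km = KMeans(n_clusters=n_poses, n_init=10, random_state=0).fit(smdf_frames)
    return km.cluster_centers_

# Training: extract key poses per labelled video, pool them as KNN samples.
rng = np.random.default_rng(0)
X_train, y_train = [], []
for label in (0, 1):                      # two toy activity classes
    for _ in range(5):                    # five synthetic videos per class
        frames = rng.normal(loc=label, scale=0.3, size=(40, 16))
        X_train.extend(key_poses(frames))
        y_train.extend([label] * 4)

knn = KNeighborsClassifier(n_neighbors=3).fit(np.array(X_train), y_train)

# Testing: classify the test video's key poses, then majority-vote over them.
test_frames = rng.normal(loc=1.0, scale=0.3, size=(40, 16))
votes = knn.predict(key_poses(test_frames))
pred = int(np.bincount(votes).argmax())
print(pred)
```

In the full method, this per-video decision would then be fused with the VGG-16 channel's prediction; the fusion rule itself is not specified in the abstract, so it is omitted here.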
Data availability
Both kinds of data refer to the single-view Weizmann dataset and the multi-view MuHAVi dataset.
Ethics declarations
Conflict of interest
We declare that we have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This article is part of the topical collection “Quantum Computing and Emerging Technologies” guest edited by Himanshu Thapliyal and Saraju Mohanty.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Raychaudhuri, A., Maity, S., Chakrabarti, A. et al. SMDF: Spatial Mass Distribution Features and Deep Learning-Based Technique for Human Activity Recognition. SN COMPUT. SCI. 5, 129 (2024). https://doi.org/10.1007/s42979-023-02452-2