On integration of multiple features for human activity recognition in video sequences

Multimedia Tools and Applications

Abstract

Human activity recognition has become one of the most active research areas in computer vision, driven by growing demand from automated monitoring applications such as visual surveillance, human-computer interaction, health care, and security systems. This work introduces an integrated feature descriptor that combines texture and shape features at multiple orientations to construct an efficient and robust feature vector for activity recognition in realistic scenarios. The descriptor integrates the Discrete Wavelet Transform (DWT), multiscale Local Binary Pattern (LBP), and Histogram of Oriented Gradients (HOG). HOG extracts local oriented-gradient histograms from the frame sequences, multiscale LBP captures complex structural information in the frames, and DWT provides directional information at multiple scales. By exploiting these properties, the integrated descriptor yields a feature vector that achieves promising activity recognition results on realistic videos. A multiclass Support Vector Machine (SVM) classifier with a one-vs-one architecture is used for activity recognition. Experiments are performed on five publicly available benchmark video datasets: Weizmann, IXMAS, UT Interaction, HMDB51, and UCF101. The results are compared with those of other state-of-the-art methods, both conventional machine learning and deep learning-based, to show the effectiveness and usefulness of the proposed work. The experimental results demonstrate that the proposed method performs better than the other state-of-the-art methods.
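To make the idea of an integrated descriptor concrete, the following is a minimal NumPy sketch of one way DWT, multiscale LBP, and HOG features could be combined for a single grayscale frame. It is not the authors' implementation: the abstract does not state the order of integration, so applying LBP and HOG to each sub-band of a one-level Haar DWT is an assumption, as are the function names, the radii (1, 2), the square (non-interpolated) LBP neighbourhood, and the single-cell 9-bin HOG.

```python
import numpy as np

def haar_dwt2(img):
    """One level of a 2-D Haar DWT; returns the approximation and three detail sub-bands."""
    img = np.asarray(img, dtype=float)
    img = img[: img.shape[0] // 2 * 2, : img.shape[1] // 2 * 2]  # crop to even size
    # Low-pass / high-pass along each row (pairs of adjacent columns).
    lo = (img[:, 0::2] + img[:, 1::2]) / np.sqrt(2)
    hi = (img[:, 0::2] - img[:, 1::2]) / np.sqrt(2)
    # Low-pass / high-pass along each column (pairs of adjacent rows).
    ll = (lo[0::2] + lo[1::2]) / np.sqrt(2)
    lh = (lo[0::2] - lo[1::2]) / np.sqrt(2)
    hl = (hi[0::2] + hi[1::2]) / np.sqrt(2)
    hh = (hi[0::2] - hi[1::2]) / np.sqrt(2)
    return ll, lh, hl, hh

def lbp_hist(img, radius):
    """Normalized 256-bin histogram of 8-neighbour LBP codes at an integer radius.

    Assumption: neighbours are sampled on a square ring (no circular interpolation).
    """
    r = radius
    h, w = img.shape
    center = img[r:h - r, r:w - r]
    offsets = [(-r, -r), (-r, 0), (-r, r), (0, r), (r, r), (r, 0), (r, -r), (0, -r)]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[r + dy:h - r + dy, r + dx:w - r + dx]
        codes |= (neighbour >= center).astype(np.int64) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(float)
    return hist / max(hist.sum(), 1.0)

def hog_hist(img, bins=9):
    """Single-cell HOG: magnitude-weighted histogram of unsigned gradient orientations."""
    gy, gx = np.gradient(np.asarray(img, dtype=float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    hist, _ = np.histogram(ang, bins=bins, range=(0.0, 180.0), weights=mag)
    return hist / (np.linalg.norm(hist) + 1e-12)  # L2 normalization

def integrated_descriptor(frame, radii=(1, 2), bins=9):
    """Concatenate multiscale LBP and HOG histograms over all DWT sub-bands (assumed layout)."""
    parts = []
    for band in haar_dwt2(frame):
        parts.extend(lbp_hist(band, r) for r in radii)
        parts.append(hog_hist(band, bins))
    return np.concatenate(parts)

# Usage: one descriptor per frame (a random array stands in for a video frame).
frame = np.random.default_rng(0).random((64, 64))
descriptor = integrated_descriptor(frame)
```

With the defaults above, each frame yields a vector of 4 × (2 × 256 + 9) values. Such per-frame vectors could then be passed to a multiclass SVM with a one-vs-one scheme, as the abstract describes; scikit-learn's `SVC`, for example, handles multiclass problems with one-vs-one internally, though the abstract does not say which SVM implementation was used.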

Author information

Correspondence to Arati Kushwaha.

Cite this article

Kushwaha, A., Khare, A. & Srivastava, P. On integration of multiple features for human activity recognition in video sequences. Multimed Tools Appl 80, 32511–32538 (2021). https://doi.org/10.1007/s11042-021-11207-1

  • DOI: https://doi.org/10.1007/s11042-021-11207-1
