Convolutional Autoencoder for Vision-Based Human Activity Recognition

Jain, Surbhi; Garg, Aishvarya; Nigam, Swati; Singh, Rajiv; Shastri, Anshuman; Singh, Irish

doi:10.1007/978-3-031-53830-8_10

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14532)

Included in the following conference series:

International Conference on Intelligent Human Computer Interaction

Abstract

Human activity recognition (HAR) is a crucial component of many current applications, including those in the healthcare, security, and entertainment sectors. At the current state of the art, deep learning outperforms traditional machine learning because it can extract features automatically. Autoencoders (AE) and convolutional neural networks (CNN) are neural network types known for good performance in dimensionality reduction and image classification, respectively. Because most of the methods introduced for classification are limited to sensor-based approaches, this paper focuses on vision-based HAR. We present a combination of AE and CNN for the classification of labeled data, in which a convolutional AE (conv-AE) serves two functions, dimensionality reduction and feature extraction, and a CNN classifies the activities. The proposed model is evaluated on the public benchmark datasets KTH and Weizmann, on which it attains recognition rates of 96.3% and 94.89%, respectively. A comparative analysis of the proposed model on these datasets is also provided.
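The pipeline described in the abstract (a convolutional encoder that compresses each frame, a decoder used for reconstruction, and a CNN classifier applied to the encoded features) can be illustrated with a minimal forward-pass sketch in pure Python. This is not the authors' implementation: the layer sizes, the random weights, the 16x16 input, and all helper names are hypothetical and chosen only for illustration, and no training is performed. The six output classes are an assumption based on the six action categories of the KTH dataset.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is repeatable


def pad2d(img, p):
    """Zero-pad a square 2-D map by p pixels on every side."""
    n = len(img)
    blank = [0.0] * (n + 2 * p)
    rows = [[0.0] * p + list(r) + [0.0] * p for r in img]
    return [blank[:] for _ in range(p)] + rows + [blank[:] for _ in range(p)]


def conv_layer(maps, filters, stride=1, pad=0):
    """Multi-channel 2-D convolution (cross-correlation) followed by ReLU.

    maps:    list of C square 2-D inputs
    filters: list of F filters, each a list of C k-by-k kernels
    """
    maps = [pad2d(m, pad) for m in maps]
    k = len(filters[0][0])
    size = (len(maps[0]) - k) // stride + 1
    out = []
    for f in filters:
        fmap = [[0.0] * size for _ in range(size)]
        for i in range(size):
            for j in range(size):
                s = 0.0
                for c, kern in enumerate(f):
                    for a in range(k):
                        for b in range(k):
                            s += maps[c][i * stride + a][j * stride + b] * kern[a][b]
                fmap[i][j] = max(0.0, s)  # ReLU
        out.append(fmap)
    return out


def upsample2x(m):
    """Nearest-neighbour upsampling by a factor of two."""
    out = []
    for row in m:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(wide[:])
    return out


def rand_filters(n_out, n_in, k):
    """Random (untrained) filters: n_out filters, each with n_in k-by-k kernels."""
    return [[[[random.uniform(-0.5, 0.5) for _ in range(k)] for _ in range(k)]
             for _ in range(n_in)] for _ in range(n_out)]


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    t = sum(e)
    return [v / t for v in e]


# Hypothetical sizes, not taken from the paper.
SIZE, CODE_CH, CLS_CH, NUM_CLASSES = 16, 2, 8, 6

enc_w = rand_filters(CODE_CH, 1, 3)       # encoder: 1 -> 2 channels, stride 2
dec_w = rand_filters(1, CODE_CH, 3)       # decoder: 2 -> 1 channel
cls_w = rand_filters(CLS_CH, CODE_CH, 3)  # classifier conv: 2 -> 8 channels
fc_w = [[random.uniform(-0.5, 0.5) for _ in range(CLS_CH)]
        for _ in range(NUM_CLASSES)]

# Stand-in for one grayscale video frame.
frame = [[random.random() for _ in range(SIZE)] for _ in range(SIZE)]

# Encoder: the strided convolution halves each spatial dimension
# (16x16 = 256 values -> 2 maps of 8x8 = 128 values).
code = conv_layer([frame], enc_w, stride=2, pad=1)

# Decoder: upsample back to 16x16 and convolve; an autoencoder would be
# trained to minimise this reconstruction error.
recon = conv_layer([upsample2x(m) for m in code], dec_w, stride=1, pad=1)[0]
mse = sum((recon[i][j] - frame[i][j]) ** 2
          for i in range(SIZE) for j in range(SIZE)) / SIZE ** 2

# Classifier: a conv layer on the code, global average pooling, a dense
# layer, and softmax over the activity classes.
feats = conv_layer(code, cls_w, stride=1, pad=1)
pooled = [sum(map(sum, m)) / (len(m) ** 2) for m in feats]
probs = softmax([sum(w * x for w, x in zip(row, pooled)) for row in fc_w])
predicted = probs.index(max(probs))
```

With untrained weights the predicted class is meaningless; the sketch only shows how the encoded representation (here half the size of the input) feeds both the reconstruction path and the classification path.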

Author information

Authors and Affiliations

Department of Computer Science, Banasthali Vidyapith, Rajasthan, 304022, India
Surbhi Jain, Swati Nigam & Rajiv Singh
Department of Physical Science, Banasthali Vidyapith, Rajasthan, 304022, India
Aishvarya Garg
Centre for Artificial Intelligence, Banasthali Vidyapith, Rajasthan, 304022, India
Surbhi Jain, Aishvarya Garg, Swati Nigam, Rajiv Singh & Anshuman Shastri
Department of Computer Science & Engineering, Oregon Institute of Technology, Oregon, USA
Irish Singh

Corresponding author

Correspondence to Rajiv Singh.

Editor information

Editors and Affiliations

Soongsil University, Seoul, Korea (Republic of)
Bong Jun Choi
Saint Louis University, St. Louis, MO, USA
Dhananjay Singh
Indian Institute of Information Technology, Allahabad, India
Uma Shanker Tiwary
Pukyong National University, Busan, Korea (Republic of)
Wan-Young Chung

About this paper

Cite this paper

Jain, S., Garg, A., Nigam, S., Singh, R., Shastri, A., Singh, I. (2024). Convolutional Autoencoder for Vision-Based Human Activity Recognition. In: Choi, B.J., Singh, D., Tiwary, U.S., Chung, WY. (eds) Intelligent Human Computer Interaction. IHCI 2023. Lecture Notes in Computer Science, vol 14532. Springer, Cham. https://doi.org/10.1007/978-3-031-53830-8_10

DOI: https://doi.org/10.1007/978-3-031-53830-8_10
Published: 29 February 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-53829-2
Online ISBN: 978-3-031-53830-8
eBook Packages: Computer Science, Computer Science (R0)
