Multi-modal hybrid hierarchical classification approach with transformers to enhance complex human activity recognition

  • Original Paper
  • Published:
Signal, Image and Video Processing

Abstract

Human Activity Recognition (HAR) plays a crucial role in computer vision and signal processing, with extensive applications in domains such as security, surveillance, and healthcare. Traditional machine learning (ML) approaches for HAR have achieved commendable success but face limitations such as reliance on handcrafted features, sensitivity to noise, and difficulty in modeling complex temporal dependencies. These limitations have spurred interest in deep learning (DL) and hybrid models, and DL has emerged as a powerful approach for HAR that surpasses traditional methods. In this paper, a multi-modal hybrid hierarchical classification approach is proposed that combines DL transformers with traditional ML techniques to improve both the accuracy and efficiency of HAR. The proposed classifier is evaluated on the activities of four widely used benchmark datasets: PAMAP2, CASAS, UCI HAR, and UCI HAPT. The experimental results demonstrate that the hybrid hierarchical classifier achieves accuracy rates of 99.69%, 97.4%, 98.7%, and 98.6%, respectively, outperforming traditional classification methods and significantly reducing training time compared to sequential LSTM models.
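The full text is not reproduced on this page, so the exact architecture is not shown here. The following is only a minimal sketch, under stated assumptions, of how a hybrid hierarchical HAR pipeline of the kind the abstract describes could be wired: a transformer encoder assigns each sensor window to a coarse activity group, and a per-group traditional ML classifier (a random forest in this sketch) resolves the fine-grained activity. The class names, layer sizes, three-group split, and synthetic data below are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a hybrid hierarchical HAR pipeline (not the paper's code):
# a transformer encoder predicts a coarse activity group, then a per-group
# classical ML model resolves the fine-grained activity within that group.
import numpy as np
import torch
import torch.nn as nn
from sklearn.ensemble import RandomForestClassifier

class CoarseTransformer(nn.Module):
    """Transformer encoder over windows of raw sensor readings -> coarse group logits."""
    def __init__(self, n_channels=6, d_model=64, n_groups=3):
        super().__init__()
        self.proj = nn.Linear(n_channels, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, n_groups)

    def forward(self, x):                      # x: (batch, window, channels)
        h = self.encoder(self.proj(x))
        return self.head(h.mean(dim=1))        # pool over time, classify the coarse group

# Toy stand-in data: 200 windows of 128 samples x 6 inertial channels (synthetic).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 128, 6)).astype(np.float32)
y_fine = rng.integers(0, 4, size=200)          # fine-grained activity labels (illustrative)

coarse = CoarseTransformer()
with torch.no_grad():                          # untrained forward pass, illustration only
    pred_group = coarse(torch.from_numpy(X)).argmax(dim=1).numpy()

# Leaf level: one classical ML model per coarse group, fitted on flattened window features.
leaves = {}
for g in range(3):
    mask = pred_group == g
    if mask.sum() > 1:
        leaves[g] = RandomForestClassifier(n_estimators=50, random_state=0).fit(
            X[mask].reshape(mask.sum(), -1), y_fine[mask])

# Inference: route a new window through the coarse model, then its group's leaf model.
x_new = torch.from_numpy(X[:1])
with torch.no_grad():
    g = int(coarse(x_new).argmax(dim=1))
fine_pred = leaves[g].predict(X[:1].reshape(1, -1))[0] if g in leaves else None
```

The hierarchical split is the design point this sketch illustrates: the expensive sequence model only decides among a few coarse groups, while the cheap per-group classifiers handle the many fine-grained classes, which is consistent with the training-time reduction over sequential LSTM models reported in the abstract.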


Data availability

The datasets used in this research are available at the UCI Machine Learning Repository: https://archive.ics.uci.edu.

Funding

Not applicable.

Author information

Contributions

ME: writing. AG: supervision and writing. LA: supervision and writing. AA: supervision and methodology.

Corresponding author

Correspondence to Mustafa Ezzeldin.

Ethics declarations

Conflict of interest

Not applicable.

Ethical/Informed consent for data

The datasets used in this research are publicly available.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ezzeldin, M., Ghoneim, A.S., Abdelhamid, L. et al. Multi-modal hybrid hierarchical classification approach with transformers to enhance complex human activity recognition. SIViP 18, 9375–9385 (2024). https://doi.org/10.1007/s11760-024-03552-z

