
A spatiotemporal multi-feature extraction framework with space and channel based squeeze-and-excitation blocks for human activity recognition

  • Original Research
  • Published in the Journal of Ambient Intelligence and Humanized Computing

Abstract

Human activity recognition (HAR) is an active field in ubiquitous computing and body area networks (BAN), with wide applications in medical care, sports and smart homes. In recent years, many deep learning based methods have shown strong performance on HAR. However, because traditional methods do not fully account for the temporal and spatial dependencies of time series, the features they extract are not comprehensive. In this paper, we propose a new activity recognition framework based on spatiotemporal multi-feature extraction with space and channel based squeeze-and-excitation blocks (SCbSE-SMFE). The framework consists of a temporal feature extraction layer composed of gated recurrent unit (GRU) blocks, a spatial feature extraction layer composed of convolutional neural network (CNN) blocks with SCbSE blocks, a statistical feature extraction layer and an output layer. Meanwhile, to meet the practical need of recognizing aggressive activities, we simulate a prison environment and collect an aggressive activity dataset (AAD). Moreover, exploiting the characteristics of aggressive activities, we propose a threshold-based aggressive activity detection method to reduce computational complexity. The proposed framework is evaluated on the public WISDM dataset and the collected AAD dataset. The results show that SCbSE-SMFE effectively improves accuracy and distinguishes similar activities better, and that the threshold-based detection method simplifies the model and improves recognition speed while preserving recognition accuracy.
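The abstract describes recalibrating CNN feature maps along both the channel and the spatial (time) dimension with squeeze-and-excitation blocks. As a rough illustration of that idea only (not the authors' implementation; the weight shapes, the FC-ReLU-FC-sigmoid excitation, and the element-wise max combination are assumptions), here is a minimal NumPy sketch for a single windowed feature map of shape (time steps × channels):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_se(x, w1, w2):
    """Channel squeeze-and-excitation on x of shape (T, C)."""
    z = x.mean(axis=0)                          # squeeze: average over time, shape (C,)
    s = sigmoid(w2 @ np.maximum(w1 @ z, 0.0))   # excitation: FC-ReLU-FC-sigmoid, shape (C,)
    return x * s                                # reweight each channel

def spatial_se(x, w):
    """Spatial squeeze-and-excitation; w of shape (C,) mimics a 1x1 conv over channels."""
    q = sigmoid(x @ w)                          # per-time-step importance, shape (T,)
    return x * q[:, None]                       # reweight each time step

def scbse(x, w1, w2, w):
    """Combine the two recalibrated maps element-wise (max is one common choice)."""
    return np.maximum(channel_se(x, w1, w2), spatial_se(x, w))
```

Because both gates lie in (0, 1), each output element is at most the magnitude of the corresponding input element, so the block only re-scales features rather than creating new ones; in the full framework these recalibrated maps would feed the subsequent CNN and output layers.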



Acknowledgements

This work was financially supported by the National Key Research and Development Program of China (2017YFC0803403, 2018YFC0831001), the National Natural Science Foundation of China (61771292, 61401253), the Natural Science Foundation of Shandong Province of China (ZR2016FM29), and the Key Research and Development Program of Shandong Province of China (2017GGX201003).

Author information


Corresponding authors

Correspondence to Hongji Xu or Hailiang Xiong.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Zhang, B., Xu, H., Xiong, H. et al. A spatiotemporal multi-feature extraction framework with space and channel based squeeze-and-excitation blocks for human activity recognition. J Ambient Intell Human Comput 12, 7983–7995 (2021). https://doi.org/10.1007/s12652-020-02526-6

