Abstract
In the safety monitoring of workers in manufacturing enterprises, recognition models face a heavy computational burden when many targets act concurrently. From the perspective of model structure design, this paper proposes a human behavior recognition model that combines multi-dimensional convolution with a gated recurrent neural network. A single-target human behavior dataset for the factory was constructed using YOLOv7 object detection and BoT-SORT multi-target tracking. In the mixed spatio-temporal feature extraction layer of the proposed 3-2DCNN-BIGRU model, a 3DCNN is used to exploit its strength in extracting spatio-temporal features; the 3-2DCNN then extracts spatial features after dimensionality reduction, which lowers the computational cost and the complexity of the model; and, borrowing the idea of dilated convolution from temporal convolutional networks, the receptive fields of the 3DCNN and 3-2DCNN are enlarged, strengthening the model's spatio-temporal feature extraction. In the temporal feature enhancement layer, a bidirectional gated recurrent unit network is fused in to strengthen the model's extraction of temporal features and thereby improve its overall performance. With fewer parameters, the model reaches an accuracy of 98.65% on the Fall Dataset and can effectively recognize human behaviors such as walking, sitting, and falling in factories, helping to ensure worker safety.
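To make the dataset-construction step concrete (detect each worker, associate detections across frames, then cut out one clip per person), the following minimal sketch groups tracked boxes into single-target crops. The `crop_single_person_clips` helper and its input format are hypothetical stand-ins for the output of a YOLOv7 + BoT-SORT pipeline; they do not reflect either repository's real API.

```python
from collections import defaultdict

import numpy as np


def crop_single_person_clips(frames, tracked_boxes):
    """Group per-frame tracked detections into one clip per person.

    frames: list of H x W x 3 arrays (video frames).
    tracked_boxes: per-frame lists of (track_id, x1, y1, x2, y2) tuples,
    as a tracker such as BoT-SORT would emit after associating YOLOv7
    detections across frames.
    """
    clips = defaultdict(list)
    for frame, boxes in zip(frames, tracked_boxes):
        for track_id, x1, y1, x2, y2 in boxes:
            clips[track_id].append(frame[y1:y2, x1:x2])  # per-person crop
    return clips  # {track_id: [crop, ...]}: single-target behavior samples


if __name__ == "__main__":
    frames = [np.zeros((480, 640, 3), dtype=np.uint8) for _ in range(4)]
    # Two workers (track ids 0 and 1) appearing in every frame.
    boxes = [[(0, 10, 20, 110, 220), (1, 300, 40, 400, 240)]] * 4
    clips = crop_single_person_clips(frames, boxes)
    print({tid: len(c) for tid, c in clips.items()})  # {0: 4, 1: 4}
```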
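The described architecture can likewise be sketched. The PyTorch model below is a minimal illustration of the design the abstract outlines: a dilated 3D convolution block, per-frame 2D convolutions on the reduced features, and a BiGRU temporal enhancement stage. All channel counts, kernel sizes, dilation rates, and the 16-frame clip length are illustrative assumptions, since the abstract does not give the paper's exact configuration.

```python
import torch
import torch.nn as nn


class HybridSpatioTemporalNet(nn.Module):
    """Minimal sketch of a 3-2DCNN-BiGRU-style recognizer (sizes assumed)."""

    def __init__(self, num_classes: int = 3, hidden: int = 128):
        super().__init__()
        # Dilated 3D convolution: joint spatio-temporal features with an
        # enlarged temporal receptive field (TCN-style dilation).
        self.conv3d = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=(3, 3, 3), dilation=(2, 1, 1),
                      padding=(2, 1, 1)),
            nn.BatchNorm3d(16),
            nn.ReLU(),
            nn.MaxPool3d(kernel_size=(1, 2, 2)),
        )
        # "3-2D" stage: per-frame dilated 2D convolutions extract spatial
        # features at lower cost than stacking further 3D layers.
        self.conv2d = nn.Sequential(
            nn.Conv2d(16, 32, kernel_size=3, dilation=2, padding=2),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        # Temporal feature enhancement: bidirectional GRU over the
        # sequence of per-frame feature vectors.
        self.bigru = nn.GRU(input_size=32 * 4 * 4, hidden_size=hidden,
                            batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels=3, frames, height, width)
        b = x.size(0)
        x = self.conv3d(x)                 # (b, 16, T, H', W')
        t = x.size(2)
        # Fold frames into the batch so the 2D convs run per frame.
        x = x.transpose(1, 2).reshape(b * t, 16, x.size(3), x.size(4))
        x = self.conv2d(x)                 # (b*t, 32, 4, 4)
        x = x.reshape(b, t, -1)            # (b, T, 512) frame descriptors
        out, _ = self.bigru(x)             # (b, T, 2*hidden)
        return self.fc(out[:, -1])         # classify from the last step


if __name__ == "__main__":
    clip = torch.randn(2, 3, 16, 64, 64)   # 2 clips of 16 RGB frames
    model = HybridSpatioTemporalNet(num_classes=3)
    print(model(clip).shape)               # torch.Size([2, 3])
```

Folding the frame axis into the batch so the 2D convolutions run once per frame is what keeps this spatial stage cheaper than a deeper 3D stack, which is the trade-off the abstract attributes to the 3-2DCNN.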
Data availability
No datasets were generated or analysed during the current study.
References
Yue, R., Tian, Z., Du, S.: Action recognition based on RGB and skeleton data sets: A survey. Neurocomputing. 512, 287–306 (2022). https://doi.org/10.1016/j.neucom.2022.09.071
Nunez-Marcos, A., Azkune, G., Arganda-Carreras, I.: Egocentric vision-based action recognition: A survey. Neurocomputing. 472, 175–197 (2022). https://doi.org/10.1016/j.neucom.2021.11.081
Zhang, H., Liu, X., Yu, D., Guan, L., Wang, D., Ma, C., Hu, Z.: Skeleton-based action recognition with multi-stream, multi-scale dilated spatial-temporal graph convolution network. Appl. Intell. 53, 17629–17643 (2023). https://doi.org/10.1007/s10489-022-04365-8
Wu, N., Kera, H., Kawamoto, K.: Improving zero-shot action recognition using human instruction with text description. Appl. Intell. 53, 24142–24156 (2023). https://doi.org/10.1007/s10489-023-04808-w
Qi, Y., Hu, J., Zhuang, L., Pei, X.: Semantic-guided multi-scale human skeleton action recognition. Appl. Intell. 53, 9763–9778 (2023). https://doi.org/10.1007/s10489-022-03968-5
Tran, D., Bourdev, L., Fergus, R., Torresani, L., Paluri, M.: Learning Spatiotemporal Features with 3D Convolutional Networks. In: 2015 IEEE International Conference on Computer Vision (ICCV). pp. 4489–4497 (2015)
Hara, K., Kataoka, H., Satoh, Y.: Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet? In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 6546–6555 (2018)
Muhammad, K., Mustaqeem, Ullah, A., Imran, A.S., Sajjad, M., Kiran, M.S., Sannino, G., de Albuquerque, V.H.C.: Human action recognition using attention based LSTM network with dilated CNN features. Future Gener. Comput. Syst. 125, 820–830 (2021). https://doi.org/10.1016/j.future.2021.06.045
Tan, K.S., Lim, K.M., Lee, C.P., Kwek, L.C.: Bidirectional long short-term memory with temporal dense sampling for human action recognition. Expert Syst. Appl. 210, 118484 (2022). https://doi.org/10.1016/j.eswa.2022.118484
Afza, F., Khan, M.A., Sharif, M., Kadry, S., Manogaran, G., Saba, T., Ashraf, I., Damaševičius, R.: A framework of human action recognition using length control features fusion and weighted entropy-variances based feature selection. Image Vis. Comput. 106, 104090 (2021). https://doi.org/10.1016/j.imavis.2020.104090
Carreira, J., Zisserman, A.: Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp. 4724–4733. IEEE, Honolulu, HI (2017)
Dai, C., Liu, X., Lai, J.: Human action recognition using two-stream attention based LSTM networks. Appl. Soft Comput. 86, 105820 (2020). https://doi.org/10.1016/j.asoc.2019.105820
Hu, W., Fu, C., Cao, R., Zang, Y., Wu, X.-J., Shen, S., Gao, X.-Z.: Joint dual-stream interaction and multi-scale feature extraction network for multi-spectral pedestrian detection. Appl. Soft Comput. 147, 110768 (2023). https://doi.org/10.1016/j.asoc.2023.110768
Senthilkumar, N., Manimegalai, M., Karpakam, S., Ashokkumar, S.R., Premkumar, M.: Human action recognition based on spatial–temporal relational model and LSTM-CNN framework. Mater. Today Proc. 57, 2087–2091 (2022). https://doi.org/10.1016/j.matpr.2021.12.004
Jaouedi, N., Boujnah, N., Bouhlel, M.S.: A new hybrid deep learning model for human action recognition. J. King Saud Univ. - Comput. Inf. Sci. 32, 447–453 (2020). https://doi.org/10.1016/j.jksuci.2019.09.004
Zhang, Z., Lv, Z., Gan, C., Zhu, Q.: Human action recognition using convolutional LSTM and fully-connected LSTM with different attentions. Neurocomputing. 410, 304–316 (2020). https://doi.org/10.1016/j.neucom.2020.06.032
Tran, D., Wang, H., Torresani, L., Ray, J., LeCun, Y., Paluri, M.: A Closer Look at Spatiotemporal Convolutions for Action Recognition. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 6450–6459 (2018)
Liu, X., Xiong, S., Wang, X., Liang, T., Wang, H., Liu, X.: A compact multi-branch 1D convolutional neural network for EEG-based motor imagery classification. Biomed. Signal. Process. Control. 81, 104456 (2023). https://doi.org/10.1016/j.bspc.2022.104456
Cui, J., Lan, Z., Liu, Y., Li, R., Li, F., Sourina, O., Müller-Wittig, W.: A compact and interpretable convolutional neural network for cross-subject driver drowsiness detection from single-channel EEG. Methods. 202, 173–184 (2022). https://doi.org/10.1016/j.ymeth.2021.04.017
Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild, (2012). http://arxiv.org/abs/1212.0402
Adhikari, K., Bouchachia, H., Nait-Charif, H.: Activity recognition for indoor fall detection using convolutional neural network. In: 2017 Fifteenth IAPR International Conference on Machine Vision Applications (MVA). pp. 81–84 (2017)
Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: HMDB: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision. pp. 2556–2563 (2011)
Wang, C.-Y., Bochkovskiy, A., Liao, H.-Y.M.: YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors. In: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 7464–7475 (2023)
Aharon, N., Orfaig, R., Bobrovsky, B.-Z.: BoT-SORT: Robust Associations Multi-Pedestrian Tracking, (2022). http://arxiv.org/abs/2206.14651
Funding
The research was supported by the Natural Science Foundation of Shaanxi Province (Nos. 2021SF-422 and 2024GX-YBXM-190).
Author information
Contributions
This paper was written by Wang Zhenyu, who independently completed the research design, data analysis, and interpretation of the results, and provided in-depth insight into the findings. Zheng Jianming provided guidance on the overall structure of the paper. Yang Mingshun provided guidance on the experimental design and data analysis. Shi Weichao was responsible for reviewing the literature and provided the knowledge framework for the background section. Su Yulong was responsible for revising and proofreading the paper. Chen Ting provided experiment-related resources and gave key support during the design and implementation of the experiments. Peng Chao polished the language of the paper and refined its figures.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, Z., Zheng, J., Yang, M. et al. Research on human behavior recognition in factory environment based on 3-2DCNN-BIGRU fusion network. SIViP 19, 102 (2025). https://doi.org/10.1007/s11760-024-03613-3
DOI: https://doi.org/10.1007/s11760-024-03613-3