
Human risky behaviour recognition during ladder climbing based on multi-modal feature fusion and adaptive graph convolutional network

  • Original Paper
  • Published in: Signal, Image and Video Processing

Abstract

Human falls during ladder climbing are typically instantaneous, making the timely and accurate detection of safety risks during ladder climbing a challenging engineering problem. This paper proposes a skeleton-based behaviour recognition method for the real-time detection of human risky behaviour during ladder climbing in construction scenes. The method uses a multi-modal feature fusion strategy to enrich the semantic information of the skeletal data, and performs behaviour recognition with an adaptive graph convolutional network improved by partial dense connections. It was evaluated through ablation and comparative experiments on a public behaviour dataset, and the results demonstrate its advantage in balancing accuracy against model complexity. Experiments on a ladder climbing behaviour dataset further validate its effectiveness in practical applications. The proposed method can help safeguard the personal safety of construction workers and provide information-based advance warning of safety risks during ladder climbing.
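The pipeline the abstract describes — fusing several skeletal modalities and feeding them to an adaptive graph convolutional network — can be sketched in broad strokes. The following is a minimal NumPy illustration of the general idea only (a fixed skeleton adjacency combined with a learned global adjacency and a data-dependent one, applied to concatenated joint and bone features); every name, shape, and the chain-shaped parent array are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def adaptive_graph_conv(x, A, W, B, We1, We2):
    """One adaptive graph-convolution step (illustrative).

    x   : (C, V) node features over V skeleton joints
    A   : (V, V) fixed skeleton adjacency (physical bones)
    B   : (V, V) learned global adjacency (free parameter)
    The data-dependent adjacency is an embedded dot-product
    similarity between joints, normalized with softmax.
    """
    theta = We1 @ x                              # (Ce, V) joint embeddings
    phi   = We2 @ x                              # (Ce, V)
    C_adj = softmax(theta.T @ phi, axis=-1)      # (V, V) sample-specific graph
    return W @ x @ (A + B + C_adj)               # aggregate over the fused graph

V, Cin, Cout, Ce = 18, 3, 8, 4                   # e.g. 18 joints with 3-D coords
x_joint = rng.normal(size=(Cin, V))

# Bone modality: difference between each joint and a (hypothetical,
# chain-shaped) parent joint; real skeletons use the kinematic tree.
parents = np.arange(V) - 1
parents[0] = 0
x_bone = x_joint - x_joint[:, parents]

# Early multi-modal fusion: concatenate modalities along the channel axis.
x = np.concatenate([x_joint, x_bone], axis=0)    # (2*Cin, V)

# Row-normalized skeleton adjacency with self-loops.
A = np.eye(V)
A[np.arange(1, V), parents[1:]] = 1
A[parents[1:], np.arange(1, V)] = 1
A /= A.sum(axis=1, keepdims=True)

W   = rng.normal(size=(Cout, 2 * Cin))           # output projection
B   = np.zeros((V, V))                           # learned adjacency (init 0)
We1 = rng.normal(size=(Ce, 2 * Cin))
We2 = rng.normal(size=(Ce, 2 * Cin))

y = adaptive_graph_conv(x, A, W, B, We1, We2)
print(y.shape)                                   # (8, 18)
```

In a trained network, `B`, `W`, `We1`, and `We2` would be learned parameters, the layer would be stacked with temporal convolutions over frame sequences, and the paper's partial dense connections would additionally route earlier layer outputs into later layers.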




Data availability

All data are available from the corresponding author on reasonable request.


Funding

This research was supported by the Humanities and Social Science Project of the Anhui Provincial Department of Education under Grant 2022AH050224.

Author information

Authors and Affiliations

Authors

Contributions

WZ contributed to conceptualization and writing; WZ and RC were involved in methodology; DS and TH contributed to formal analysis; WZ and RH were involved in investigation; WZ, TH, RC, JW, and RH contributed to data curation; and DS was involved in supervision. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Donghui Shi.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Zhu, W., Shi, D., Cheng, R. et al. Human risky behaviour recognition during ladder climbing based on multi-modal feature fusion and adaptive graph convolutional network. SIViP 18, 2473–2483 (2024). https://doi.org/10.1007/s11760-023-02923-2

