Dual-module spatial temporal information enhancement graph convolutional network for recognizing traffic police command gestures

Shi, Peicheng; Zhang, Qing; Yang, Aixi

doi:10.1007/s11760-024-03729-6

Original Paper
Published: 08 December 2024

Signal, Image and Video Processing, volume 19, article number 92 (2025)

Peicheng Shi¹,
Qing Zhang¹ &
Aixi Yang²

Abstract

The rapid and accurate recognition of traffic police hand gestures is important for intelligent vehicles and smart transportation. However, existing algorithms struggle to distinguish traffic police gestures finely in dense crowds, and their recognition speed often fails to meet the demands of practical applications. To address this, we propose a traffic police gesture recognition method based on a dual-module spatial temporal information enhancement graph convolutional network (STIE-GCN). The method first applies Traffic Police Target Detection and Pose Skeleton Extraction (TD-PSE) to eliminate the interference of complex environments with gesture recognition. We then incorporate the Synergy Attention Module (SAM) and the Keyframe Extraction Module (KEM) into the spatial temporal graph convolutional network to enhance the network’s ability to extract synergistic action features and key action frames. The method is evaluated on three different datasets, and the experimental results show that it achieves an accuracy of 98.63% in traffic police gesture recognition, with an average model response time of 1.036 s. These results indicate the method’s precision and efficiency in real-world applications.
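
To make the idea of "key action frames" concrete, the following is a minimal sketch of one simple way to pick informative frames from a skeleton sequence: rank frames by how far the joints moved since the previous frame. This is an illustrative stand-in and not the paper's KEM, whose design is not described in the abstract. The function names `motion_energy` and `select_keyframes`, the 2D joint format, and the toy data are all assumptions made for this example.

```python
from math import hypot


def motion_energy(frames):
    """Sum of joint displacements between each frame and its predecessor.

    frames: list of frames; each frame is a list of (x, y) joint coordinates.
    The first frame has no predecessor, so its energy is 0.0.
    """
    energies = [0.0]
    for prev, curr in zip(frames, frames[1:]):
        energies.append(sum(hypot(cx - px, cy - py)
                            for (px, py), (cx, cy) in zip(prev, curr)))
    return energies


def select_keyframes(frames, k):
    """Return the indices of the k highest-energy frames, in temporal order."""
    energies = motion_energy(frames)
    ranked = sorted(range(len(frames)), key=lambda i: energies[i], reverse=True)
    return sorted(ranked[:k])


# Hypothetical toy sequence: 2 joints, 5 frames.
# The second joint moves sharply at frames 2 and 3 and is nearly still elsewhere.
frames = [
    [(0.0, 0.0), (1.0, 0.0)],
    [(0.0, 0.0), (1.0, 0.1)],
    [(0.0, 0.0), (1.0, 1.0)],
    [(0.0, 0.0), (0.0, 1.0)],
    [(0.0, 0.0), (0.0, 1.0)],
]
keyframes = select_keyframes(frames, k=2)
```

In this toy sequence the selected indices are frames 2 and 3, where the gesture changes most. A learned module would presumably weight frames inside the network rather than use a fixed displacement heuristic; the sketch only shows why a few frames can carry most of a gesture's motion.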

Data availability

No datasets were generated or analysed during the current study.

Funding

This work was supported by the Natural Science Foundation of Anhui Province (2208085MF173) and Anhui Provincial Key Research and Development Plan (202104a05020003).

Author information

Authors and Affiliations

School of Mechanical and Automotive Engineering, Anhui Polytechnic University, Wuhu, 241000, China
Peicheng Shi & Qing Zhang
Department Polytechnic Institute of Zhejiang University, Hangzhou, 310000, China
Aixi Yang

Contributions

S. and Z. wrote the main manuscript text; Z. obtained the data; Z. and Y. carried out the experiments. All authors reviewed the manuscript.

Corresponding author

Correspondence to Peicheng Shi.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (MP4 54884 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Shi, P., Zhang, Q. & Yang, A. Dual-module spatial temporal information enhancement graph convolutional network for recognizing traffic police command gestures. SIViP 19, 92 (2025). https://doi.org/10.1007/s11760-024-03729-6

Received: 25 January 2024
Revised: 14 October 2024
Accepted: 20 November 2024
Published: 08 December 2024
DOI: https://doi.org/10.1007/s11760-024-03729-6
