Dual-module spatial temporal information enhancement graph convolutional network for recognizing traffic police command gestures

  • Original Paper
Signal, Image and Video Processing

Abstract

Rapid and accurate recognition of traffic police hand gestures is important for intelligent vehicles and smart transportation. However, existing algorithms struggle to distinguish fine-grained traffic police gestures in dense crowds, and their recognition speed often falls short of practical requirements. To address this, we propose a traffic police gesture recognition method based on a dual-module spatial temporal information enhancement graph convolutional network (STIE-GCN). The method first applies Traffic Police Target Detection and Pose Skeleton Extraction (TD-PSE) to eliminate interference from complex environments. It then incorporates a Synergy Attention Module (SAM) and a Keyframe Extraction Module (KEM) into the spatial temporal graph convolutional network to strengthen the extraction of synergistic action features and key action frames. Evaluated on three datasets, the proposed approach achieves an accuracy of 98.63% in traffic police gesture recognition, with an average model response time of 1.036 s. These results indicate the method’s precision and efficiency for real-world applications.
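
To make the skeleton-based part of the pipeline more concrete, the following minimal Python sketch illustrates two generic building blocks that the abstract alludes to: a spatial graph convolution over skeleton joints and a selection of key frames from a pose sequence. It is not the authors' STIE-GCN, SAM, or KEM. The three-joint skeleton, the function names, and the motion-energy heuristic used to pick key frames are all assumptions chosen for illustration, and the learned weights and attention of a real network are omitted.

```python
# Illustrative sketch only; NOT the STIE-GCN implementation from the paper.
# All names, the toy skeleton, and the keyframe heuristic are assumptions.

def normalized_adjacency(num_joints, bones):
    """Return D^-1/2 (A + I) D^-1/2 for an undirected skeleton graph."""
    a = [[0.0] * num_joints for _ in range(num_joints)]
    for i in range(num_joints):
        a[i][i] = 1.0  # self-loops so each joint keeps its own features
    for i, j in bones:
        a[i][j] = a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / (deg[i] * deg[j]) ** 0.5 for j in range(num_joints)]
            for i in range(num_joints)]


def spatial_graph_conv(frame, adj):
    """Aggregate each joint's features from its neighbours (A_hat @ X).
    A trained layer would also multiply by a learned weight matrix."""
    dims = len(frame[0])
    return [[sum(adj[i][k] * frame[k][d] for k in range(len(frame)))
             for d in range(dims)]
            for i in range(len(frame))]


def select_keyframes(sequence, k):
    """Pick the k frames with the largest total joint displacement relative
    to the previous frame (a hand-written stand-in for a learned module)."""
    def motion(t):
        return sum(abs(c - p)
                   for joint, prev in zip(sequence[t], sequence[t - 1])
                   for c, p in zip(joint, prev))
    best = sorted(range(1, len(sequence)), key=motion, reverse=True)[:k]
    return sorted(best)  # restore temporal order


# Hypothetical 3-joint arm: shoulder (0) - elbow (1) - wrist (2).
BONES = [(0, 1), (1, 2)]
ADJ = normalized_adjacency(3, BONES)

# Four frames of (x, y) joint coordinates; the arm moves most in frames 2 and 3.
SEQUENCE = [
    [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]],
    [[0.0, 0.0], [1.0, 0.0], [2.0, 0.1]],
    [[0.0, 0.0], [1.0, 0.5], [2.0, 1.5]],
    [[0.0, 0.0], [1.0, 1.0], [2.0, 3.0]],
]

keyframes = select_keyframes(SEQUENCE, k=2)
features = [spatial_graph_conv(SEQUENCE[t], ADJ) for t in keyframes]
```

In this toy setup the spatial step mixes each joint's coordinates with those of its connected joints, and only the selected frames are passed on. A full model would stack such layers with temporal convolutions and learned parameters.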

Data availability

No datasets were generated or analysed during the current study.

Funding

This work was supported by the Natural Science Foundation of Anhui Province (2208085MF173) and Anhui Provincial Key Research and Development Plan (202104a05020003).

Author information

Contributions

S. and Z. wrote the main manuscript text; Z. obtained the data; Z. and Y. conducted the experiments. All authors reviewed the manuscript.

Corresponding author

Correspondence to Peicheng Shi.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (MP4 54884 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Shi, P., Zhang, Q. & Yang, A. Dual-module spatial temporal information enhancement graph convolutional network for recognizing traffic police command gestures. SIViP 19, 92 (2025). https://doi.org/10.1007/s11760-024-03729-6

  • DOI: https://doi.org/10.1007/s11760-024-03729-6