Lightweight multimodal feature graph convolutional network for dangerous driving behavior detection

Wei, Xing; Yao, Shang; Zhao, Chong; Hu, Di; Luo, Hui; Lu, Yang

doi:10.1007/s11554-023-01277-9

Lightweight multimodal feature graph convolutional network for dangerous driving behavior detection

Original Research Paper
Published: 10 February 2023

Volume 20, article number 15, (2023)
Cite this article

Journal of Real-Time Image Processing Aims and scope Submit manuscript

Xing Wei^1,2,3,
Shang Yao¹,
Chong Zhao^2,4,
Di Hu²,
Hui Luo² &
…
Yang Lu^1,3

335 Accesses
3 Citations
Explore all metrics

Abstract

Real-time detection and identification of dangerous driving behaviors is an effective measure to reduce traffic accidents. Due to the high network delay, limited communication bandwidth, and weak computing power, lightweight detection models that can run on edge devices have been widely investigated and attracted considerable attention. In recent years, the Graph Convolutional Network (GCN), which models the human skeleton as a spatiotemporal graph, has achieved remarkable performance, due to its powerful capability of modeling non-Euclidean structured data. However, there are disadvantages such as the unitary way of extracting information, high model complexity, and inability to integrate environmental information. Therefore, we design a Lightweight Multimodal Feature Graph Convolutional Network (L-MFGCN) model for dangerous driving behavior detection video in an end-to-end manner. First, we propose a Multimodal Feature Graph Convolutional Neural Network (MF-GCN), which captures richer features by extracting critical local spatial and temporal information of joint points, and a multi-information fusion behavior recognition model of “people + objects” by capturing the motion information of related object. Then, the method based on Singular Value Decomposition (SVD) rank reduction is used to compress the model to improve the speed of recognizing an action sample while ensuring sufficient detection accuracy. The proposed model, respectively, achieves 96% and 86.3% accuracy on the x-view benchmark of NTU-RGBD dataset and the homemade Locomotive Driver Dataset, which attains the state-of-the-art performance.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

CBAM: Convolutional Block Attention Module

A review of convolutional neural networks in computer vision

Article Open access 23 March 2024

Xia Zhao, Limin Wang, … Milan Parmar

Human Activity Recognition (HAR) Using Deep Learning: Review, Methodologies, Progress and Future Research Directions

Article 12 August 2023

Pranjal Kumar, Siddhartha Chauhan & Lalit Kumar Awasthi

References

Authors, P.: Paddledetection, object detection and instance segmentation toolkit based on paddlepaddle. https://github.com/PaddlePaddle/PaddleDetection (2019)
Bruna, J., Zaremba, W., Szlam, A., LeCun, Y.: Spectral networks and locally connected networks on graphs. arXiv preprint arXiv:1312.6203 (2013)
Caetano, C., Sena, J., Brémond, F., Dos Santos, J.A., Schwartz, W.R.: Skelemotion: a new representation of skeleton joint sequences based on motion information for 3d action recognition. In: 2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS). pp. 1–8. IEEE (2019)
Cho, S., Maqbool, M., Liu, F., Foroosh, H.: Self-attention network for skeleton-based human action recognition. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. pp. 635–644 (2020)
Defferrard, M., Bresson, X., Vandergheynst, P.: Convolutional neural networks on graphs with fast localized spectral filtering. Adv. Neural Inf. Process. Syst. 29 (2016)
Denton, E.L., Zaremba, W., Bruna, J., LeCun, Y., Fergus, R.: Exploiting linear structure within convolutional networks for efficient evaluation. Adv. Neural Inf. Process. Syst. 27 (2014)
Du, Y., Wang, W., Wang, L.: Hierarchical recurrent neural network for skeleton based action recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 1110–1118 (2015)
Duvenaud, D.K., Maclaurin, D., Iparraguirre, J., Bombarell, R., Hirzel, T., Aspuru-Guzik, A., Adams, R.P.: Convolutional networks on graphs for learning molecular fingerprints. Adv. Neural Inf. Process. Syst. 28 (2015)
Giles, M.: An extended collection of matrix derivative results for forward and reverse mode automatic differentiation (2008)
Hamilton, W., Ying, Z., Leskovec, J.: Inductive representation learning on large graphs. Adv. Neural Inf. Process. Syst. 30 (2017)
Henaff, M., Bruna, J., LeCun, Y.: Deep convolutional networks on graph-structured data. arXiv preprint arXiv:1506.05163 (2015)
Idelbayev, Y., Carreira-Perpinán, M.A.: Low-rank compression of neural nets: learning the rank of each layer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 8049–8059 (2020)
Jaderberg, M., Vedaldi, A., Zisserman, A.: Speeding up convolutional neural networks with low rank expansions. arXiv preprint arXiv:1405.3866 (2014)
Joze, H.R.V., Shaban, A., Iuzzolino, M.L., Koishida, K.: Mmtm: multimodal transfer module for CNN fusion. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 13289–13299 (2020)
Ke, Q., Bennamoun, M., An, S., Sohel, F., Boussaid, F.: A new representation of skeleton sequences for 3d action recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 3288–3297 (2017)
Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016)
Krishnan, D., Tay, T., Fergus, R.: Blind deconvolution using a normalized sparsity measure. In: CVPR 2011. pp. 233–240. IEEE (2011)
Lebedev, V., Ganin, Y., Rakhuba, M., Oseledets, I., Lempitsky, V.: Speeding-up convolutional neural networks using fine-tuned cp-decomposition. arXiv preprint arXiv:1412.6553 (2014)
Li, M., Chen, S., Chen, X., Zhang, Y., Wang, Y., Tian, Q.: Actional-structural graph convolutional networks for skeleton-based action recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 3595–3603 (2019)
Li, C., Xie, C., Zhang, B., Han, J., Zhen, X., Chen, J.: Memory attention networks for skeleton-based action recognition. IEEE Trans Neural Netw Learn Syst (2021)
Li, C., Zhong, Q., Xie, D., Pu, S.: Co-occurrence feature learning from skeleton data for action recognition and detection with hierarchical aggregation. arXiv preprint arXiv:1804.06055 (2018)
Liu, J., Shahroudy, A., Xu, D., Wang, G.: Spatio-temporal lstm with trust gates for 3d human action recognition. In: European Conference on Computer Vision. pp. 816–833. Springer (2016)
Liu, H., Tu, J., Liu, M.: Two-stream 3d convolutional neural network for skeleton-based action recognition. arXiv preprint arXiv:1705.08106 (2017)
Moczulski, M., Denil, M., Appleyard, J., de Freitas, N.: Acdc: a structured efficient linear layer. arXiv preprint arXiv:1511.05946 (2015)
Monti, F., Boscaini, D., Masci, J., Rodola, E., Svoboda, J., Bronstein, M.M.: Geometric deep learning on graphs and manifolds using mixture model cnns. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 5115–5124 (2017)
Niepert, M., Ahmed, M., Kutzkov, K.: Learning convolutional neural networks for graphs. In: International Conference on Machine Learning. pp. 2014–2023. PMLR (2016)
Peng, W., Hong, X., Chen, H., Zhao, G.: Learning graph convolutional network for skeleton-based human action recognition by neural searching. In: Proceedings of the AAAI Conference on Artificial Intelligence. vol. 34, pp. 2669–2676 (2020)
RangiLyu: Nanodet-plus: super fast and high accuracy lightweight anchor-free object detection model. https://github.com/RangiLyu/nanodet (2021)
Shahroudy, A., Liu, J., Ng, T.T., Wang, G.: Ntu rgb+ d: a large scale dataset for 3d human activity analysis. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 1010–1019 (2016)
Shi, L., Zhang, Y., Cheng, J., Lu, H.: Skeleton-based action recognition with directed graph neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 7912–7921 (2019)
Shi, L., Zhang, Y., Cheng, J., Lu, H.: Two-stream adaptive graph convolutional networks for skeleton-based action recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 12026–12035 (2019)
Si, C., Chen, W., Wang, W., Wang, L., Tan, T.: An attention enhanced graph convolutional lstm network for skeleton-based action recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 1227–1236 (2019)
Si, C., Jing, Y., Wang, W., Wang, L., Tan, T.: Skeleton-based action recognition with spatial reasoning and temporal stack learning. In: Proceedings of the European Conference on Computer Vision (ECCV). pp. 103–118 (2018)
Song, S., Lan, C., Xing, J., Zeng, W., Liu, J.: An end-to-end spatio-temporal attention model for human action recognition from skeleton data. In: Proceedings of the AAAI Conference on Artificial Intelligence. vol. 31 (2017)
Song, Y.F., Zhang, Z., Shan, C., Wang, L.: Richly activated graph convolutional network for robust skeleton-based action recognition. IEEE Trans. Circ. Syst. Video Technol. 31(5), 1915–1925 (2020)
Article Google Scholar
Tai, C., Xiao, T., Zhang, Y., Wang, X., et al.: Convolutional neural networks with low-rank regularization. arXiv preprint arXiv:1511.06067 (2015)
Yan, S., Xiong, Y., Lin, D.: Spatial temporal graph convolutional networks for skeleton-based action recognition. In: Thirty-second AAAI conference on artificial intelligence (2018)
Ye, F., Pu, S., Zhong, Q., Li, C., Xie, D., Tang, H.: Dynamic gcn: context-enriched topology learning for skeleton-based action recognition. In: Proceedings of the 28th ACM International Conference on Multimedia. pp. 55–63 (2020)
Zhang, P., Lan, C., Xing, J., Zeng, W., Xue, J., Zheng, N.: View adaptive recurrent neural networks for high performance human action recognition from skeleton data. In: Proceedings of the IEEE International Conference on Computer Vision. pp. 2117–2126 (2017)
Zhang, P., Lan, C., Zeng, W., Xing, J., Xue, J., Zheng, N.: Semantics-guided neural networks for efficient skeleton-based human action recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 1112–1121 (2020)
Zhang, X., Zou, J., He, K., Sun, J.: Accelerating very deep convolutional networks for classification and detection. IEEE Trans. Pattern Anal. Mach. Intell. 38(10), 1943–1955 (2015)
Article Google Scholar

Download references

Acknowledgements

This work was supported by Open fund of Intelligent Interconnected Systems Laboratory of Anhui Province (PA2021AKSK0107), Joint Fund of Natural Science Foundation of Anhui Province in 2020 (2008085UD08), Anhui Provincial Key R &D Program (202004a05020004), Intelligent Networking and New Energy Vehicle Special Project of Intelligent Manufacturing Institute of HFUT (IMIWL2019003, IMIDC2019002).

Author information

Authors and Affiliations

School of Computer and Information, Hefei University of Technology, Hefei, China
Xing Wei, Shang Yao & Yang Lu
Intelligent Manufacturing Institute of Hefei University of Technology, Hefei, China
Xing Wei, Chong Zhao, Di Hu & Hui Luo
Engineering Research Center of Safety Critical Industrial Measurement and Control Technology, Ministry of Education, Hefei, China
Xing Wei & Yang Lu
Engineering Quality Education Center of Undergraduate School, Hefei University of Technology, Hefei, China
Chong Zhao

Authors

Xing Wei
View author publications
You can also search for this author in PubMed Google Scholar
Shang Yao
View author publications
You can also search for this author in PubMed Google Scholar
Chong Zhao
View author publications
You can also search for this author in PubMed Google Scholar
Di Hu
View author publications
You can also search for this author in PubMed Google Scholar
Hui Luo
View author publications
You can also search for this author in PubMed Google Scholar
Yang Lu
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Xing Wei.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article

Wei, X., Yao, S., Zhao, C. et al. Lightweight multimodal feature graph convolutional network for dangerous driving behavior detection. J Real-Time Image Proc 20, 15 (2023). https://doi.org/10.1007/s11554-023-01277-9

Download citation

Received: 04 July 2022
Accepted: 07 November 2022
Published: 10 February 2023
DOI: https://doi.org/10.1007/s11554-023-01277-9

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Lightweight multimodal feature graph convolutional network for dangerous driving behavior detection

Abstract

Access this article

Similar content being viewed by others

CBAM: Convolutional Block Attention Module

A review of convolutional neural networks in computer vision

Human Activity Recognition (HAR) Using Deep Learning: Review, Methodologies, Progress and Future Research Directions

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Abstract

Access this article

Similar content being viewed by others

CBAM: Convolutional Block Attention Module

A review of convolutional neural networks in computer vision

Human Activity Recognition (HAR) Using Deep Learning: Review, Methodologies, Progress and Future Research Directions

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation