Abstract
Radar-based artificial intelligence (AI) applications have recently attracted significant attention, spanning tasks from fall detection to gesture recognition. Work in this field has shifted towards deep convolutional networks, and transformers have since emerged to address the limitations of convolutional neural network (CNN) methods, becoming increasingly popular in the AI community. In this paper, we present a novel hybrid transformer-based approach for radar-based traffic hand gesture classification. Traffic hand gesture recognition (HGR) is an important AI application, and our proposed three-phase approach targets both the efficiency and the effectiveness of traffic HGR. In the first phase, feature vectors are extracted from the input radar images with a pre-trained DenseNet-121 model. These features are then concatenated to combine information from the different radar sensors, followed by a patch extraction operation. In the second phase, the concatenated features from all inputs are processed by a Swin transformer block for further HGR. The final classification stage applies global average pooling, a Dense layer, and a Softmax layer in sequence. We evaluate the method on the Ulm University radar dataset using accuracy, precision, recall, and F1-score, achieving an average accuracy of 90.54%, and we compare this score with existing approaches to demonstrate the competitiveness of the proposed method.
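To make the three-phase pipeline described above concrete, the following is a minimal sketch in TensorFlow/Keras. It is not the authors' released code: the number of radar inputs, the image size, the embedding width, and `NUM_CLASSES` are hypothetical placeholders, and the Swin transformer block is approximated here with a plain multi-head self-attention encoder, since Keras ships no built-in Swin layer.

```python
# Minimal sketch of the described three-phase pipeline (not the authors' code).
# Assumptions: two radar image inputs of 224x224x3, a hypothetical NUM_CLASSES,
# and a standard Transformer encoder standing in for the Swin block.
import tensorflow as tf
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import DenseNet121

NUM_CLASSES = 9  # hypothetical; set to the dataset's gesture count

def build_model(input_shape=(224, 224, 3), num_sensors=2,
                patch_dim=96, num_heads=4):
    backbone = DenseNet121(include_top=False, weights="imagenet")
    inputs = [layers.Input(shape=input_shape) for _ in range(num_sensors)]

    # Phase 1: per-sensor feature extraction with a shared DenseNet-121
    feats = [backbone(x) for x in inputs]           # (7, 7, 1024) each

    # Phase 2: concatenate sensor features, then extract patch tokens
    fused = layers.Concatenate(axis=-1)(feats)      # (7, 7, 1024 * num_sensors)
    tokens = layers.Reshape((49, fused.shape[-1]))(fused)  # 49 = 7x7 patches
    tokens = layers.Dense(patch_dim)(tokens)        # linear patch embedding

    # Phase 3: transformer encoder block (stand-in for the Swin block)
    attn = layers.MultiHeadAttention(num_heads=num_heads,
                                     key_dim=patch_dim)(tokens, tokens)
    x = layers.LayerNormalization()(tokens + attn)
    mlp = layers.Dense(patch_dim * 4, activation="gelu")(x)
    mlp = layers.Dense(patch_dim)(mlp)
    x = layers.LayerNormalization()(x + mlp)

    # Classification head: global average pooling -> Dense -> Softmax
    x = layers.GlobalAveragePooling1D()(x)
    outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)
    return Model(inputs, outputs)

model = build_model()
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```

Sharing a single DenseNet-121 instance across the sensor inputs keeps the parameter count low; one backbone per sensor is an equally plausible reading of the description above.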
Data availability
No datasets were generated or analysed during the current study.
Funding
None.
Author information
Contributions
Hüseyin Fırat: Conceptualization, Discussion of results, Writing (original draft preparation), Validation, Formal analysis. Hüseyin Üzen: Conceptualization, Methodology, Software, Writing (original draft preparation), Visualization. Orhan Atila: Conceptualization, Discussion of results, Writing (original draft preparation), Validation, Formal analysis. Abdulkadir Şengür: Writing (review and editing), Discussion of results, Validation, Supervision.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Fırat, H., Üzen, H., Atila, O. et al. Automated efficient traffic gesture recognition using swin transformer-based multi-input deep network with radar images. SIViP 19, 35 (2025). https://doi.org/10.1007/s11760-024-03664-6