PEA-YOLO: a lightweight network for static gesture recognition combining multiscale and attention mechanisms

Zhou, Weina; Li, Xile

doi:10.1007/s11760-023-02755-0

Original Paper
Published: 05 October 2023

Signal, Image and Video Processing
Volume 18, pages 597–605 (2024)

Abstract

Gesture recognition, one of the most intuitive and natural ways for humans to communicate with computers, is widely used in human–computer interaction applications. However, it remains challenging because of interference such as varied backgrounds, hand-like objects, and lighting changes. This article proposes a lightweight static gesture recognition network named PEA-YOLO. The network adopts the idea of an adaptive spatial feature pyramid and combines an attention mechanism with multi-path feature fusion to improve the localization and recognition of gesture features. First, an Efficient Channel Attention (ECA) module is added after the backbone network to focus the model's attention on the gesture. Second, the Feature Pyramid Network (FPN) is replaced by a Path Aggregation Network (PAN) to localize the gesture better. Finally, an Adaptive Spatial Feature Fusion (ASFF) module is added before the YOLO head to further reduce the false detection rate. Experiments on the OUHANDS and NUSII datasets show that PEA-YOLO achieves favorable performance in static gesture recognition with only 8.57 M parameters. Compared with other state-of-the-art methods, the proposed lightweight network obtains the highest accuracy while running at high speed with few parameters.
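
To make two of the building blocks named in the abstract more concrete, the following is a minimal pure-Python sketch of channel attention in the style of Efficient Channel Attention and of per-pixel adaptive fusion in the style of Adaptive Spatial Feature Fusion. It is not the authors' implementation: the function names `eca` and `asff_fuse`, the nested-list tensor layout, and the externally supplied `kernel` and `logits` arguments (which a real network would learn) are all assumptions made for illustration.

```python
import math


def sigmoid(x):
    """Logistic function used to turn a score into a gate in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def eca(feature, kernel):
    """Channel attention sketch (hypothetical helper, not the paper's code).

    feature: list of C channels, each an H x W nested list of floats.
    kernel:  odd-length list of 1D convolution weights (assumed given here;
             in a trained network these would be learned).
    Returns the rescaled feature and the per-channel gates.
    """
    channels = len(feature)
    # Global average pooling: one scalar descriptor per channel.
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature]
    pad = len(kernel) // 2
    gates = []
    for i in range(channels):
        # 1D convolution across neighbouring channels with zero padding.
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = i + j - pad
            if 0 <= idx < channels:
                acc += weight * pooled[idx]
        gates.append(sigmoid(acc))
    # Rescale every value in a channel by that channel's gate.
    scaled = [[[v * g for v in row] for row in ch] for ch, g in zip(feature, gates)]
    return scaled, gates


def asff_fuse(levels, logits):
    """Adaptive fusion sketch (hypothetical helper, not the paper's code).

    levels: list of L single-channel maps (H x W), assumed already resized
            to a common resolution.
    logits: list of L maps of the same shape holding unnormalised fusion
            scores (assumed given; a real network would predict them).
    Returns the per-pixel softmax-weighted sum of the levels.
    """
    height, width = len(levels[0]), len(levels[0][0])
    fused = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Numerically stable softmax over the pyramid levels at this pixel.
            peak = max(l[y][x] for l in logits)
            exps = [math.exp(l[y][x] - peak) for l in logits]
            total = sum(exps)
            fused[y][x] = sum(e / total * lv[y][x] for e, lv in zip(exps, levels))
    return fused
```

In this sketch, channels with larger pooled responses (after mixing with their neighbours) receive gates closer to 1, and equal fusion scores reduce `asff_fuse` to a plain average of the levels.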

Data availability

The data and materials that support the findings of this study are available on request from the authors.

Funding

This research was funded in part by the State Key Laboratory of ASIC & System (2021KF010) and the National Natural Science Foundation of China (Grant No. 61404083).

Author information

Authors and Affiliations

Information Engineering College, Shanghai Maritime University, Shanghai, 201306, China
Weina Zhou & Xile Li

Contributions

WZ contributed to conceptualization, methodology, resources, supervision, writing (review and editing), and project administration. XL contributed to methodology, software, validation, formal analysis, investigation, data curation, writing (original draft), and visualization.

Corresponding author

Correspondence to Weina Zhou.

Ethics declarations

Conflict of interest

The authors declare no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhou, W., Li, X. PEA-YOLO: a lightweight network for static gesture recognition combining multiscale and attention mechanisms. SIViP 18, 597–605 (2024). https://doi.org/10.1007/s11760-023-02755-0

Received: 02 July 2023
Revised: 10 August 2023
Accepted: 20 August 2023
Published: 05 October 2023
Issue Date: February 2024
DOI: https://doi.org/10.1007/s11760-023-02755-0
