YOLOv5 with Mixed Backbone for Efficient Spatio-Temporal Hand Gesture Localization and Recognition

Abstract:

Spatio-temporal Hand Gesture Localization and Recognition (SHGLR) refers to analyzing the spatial and temporal aspects of hand movements to detect and identify hand gestures in a video. Current state-of-the-art approaches for SHGLR rely on large, complex architectures with a high computational cost. To address this issue, we present an efficient method based on a mixed backbone for YOLOv5, which we chose because it is a lightweight, one-stage framework. The mixed backbone combines 2D and 3D convolutions to obtain temporal information from previous frames. By inflating specific convolutions of the backbone, the proposed method performs SHGLR on videos efficiently while keeping a computational cost similar to that of the conventional YOLOv5. We conduct experiments on the IPN Hand dataset, chosen for its challenging, continuous hand gestures. Our proposed method achieves a frame mAP@0.5 of 66.52% with a 6-frame clip input, outperforming the conventional YOLOv5 by 7.89%, which demonstrates the effectiveness of our approach.
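
The abstract does not say how the backbone convolutions are inflated. As a rough illustration only, the sketch below assumes the common scheme of repeating a 2D kernel along a new temporal axis and scaling it by the temporal depth, so that a static clip yields the same response as the original 2D kernel. The function names, the single-channel kernel, and the temporal depth of 6 (borrowed from the 6-frame clip) are all hypothetical and not taken from the paper.

```python
def inflate_kernel(kernel_2d, depth):
    """Repeat a 2D kernel `depth` times along a new temporal axis.

    Assumption: weights are scaled by 1/depth so the summed temporal
    response on a static clip equals the original 2D response.
    """
    return [[[w / depth for w in row] for row in kernel_2d] for _ in range(depth)]


def conv2d_at(frame, kernel_2d, y, x):
    """Cross-correlation response of one frame with the kernel placed at (y, x)."""
    return sum(
        kernel_2d[i][j] * frame[y + i][x + j]
        for i in range(len(kernel_2d))
        for j in range(len(kernel_2d[0]))
    )


def conv3d_at(clip, kernel_3d, y, x):
    """Response of a clip whose length equals the kernel's temporal depth."""
    return sum(conv2d_at(frame, k2d, y, x) for frame, k2d in zip(clip, kernel_3d))


# Toy single-channel data (hypothetical values).
kernel = [[1.0, 0.0], [0.0, -1.0]]
frame = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
clip = [frame] * 6  # a static 6-frame clip: every frame is identical

kernel_3d = inflate_kernel(kernel, depth=6)
r2d = conv2d_at(frame, kernel, 0, 0)
r3d = conv3d_at(clip, kernel_3d, 0, 0)
print(r2d, r3d)  # the two responses agree up to floating-point error
```

On a real clip the frames differ, so the inflated kernel mixes information from earlier frames into the current response, which is the kind of temporal context the mixed backbone is described as capturing.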

Published in: 2023 18th International Conference on Machine Vision and Applications (MVA)

Date of Conference: 23-25 July 2023

Date Added to IEEE Xplore: 22 August 2023

DOI: 10.23919/MVA57639.2023.10215605

Conference Location: Hamamatsu, Japan
