LGCANet: lightweight hand pose estimation network based on HRNet

The Journal of Supercomputing

Abstract

Hand pose estimation is a fundamental task in computer vision with applications in virtual reality, gesture recognition, autonomous driving, and virtual surgery. Accurate keypoint detection often relies on deep learning methods and high-resolution feature map representations. HRNet serves as the basis for this work, but maintaining high-resolution representations gives it a large parameter count and high computational complexity. To mitigate these challenges, we propose a lightweight keypoint detection network called LGCANet (Lightweight Ghost-Coordinate Attention Network). The network consists primarily of a lightweight feature extraction head for initial feature extraction and multiple lightweight foundational network modules called GCAblocks. GCAblocks use linear transformations to generate redundant feature maps and apply a coordinate attention mechanism to capture inter-channel relationships and long-range positional information. Validation on the RHD dataset and the COCO-WholeBody-Hand dataset shows that LGCANet reduces the number of parameters by 65.9% and GFLOPs by 72.6% while preserving accuracy and improving detection speed.
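
The abstract does not spell out the internals of a GCAblock, so the following is only a rough sketch of the two ideas it names, under stated assumptions. It assumes the GhostNet-style formulation, in which a small primary convolution produces a few intrinsic feature maps and cheap depthwise filters derive the remaining ones, and the directional average pooling used by coordinate attention. The function names, the `ratio` and `cheap_k` parameters, and the layer sizes are illustrative and are not taken from the paper. The savings printed here apply to a single layer and are not the network-level 65.9% and 72.6% figures reported above.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a plain k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_module_params(c_in, c_out, k, ratio=2, cheap_k=3):
    """Weights in a GhostNet-style module (bias ignored).

    Assumption: a primary k x k convolution produces c_out / ratio intrinsic
    maps, and each intrinsic map yields (ratio - 1) extra maps through a
    depthwise cheap_k x cheap_k filter (the cheap linear transformation).
    """
    if c_out % ratio != 0:
        raise ValueError("c_out must be divisible by ratio in this sketch")
    intrinsic = c_out // ratio
    primary = c_in * intrinsic * k * k
    cheap = intrinsic * (ratio - 1) * cheap_k * cheap_k
    return primary + cheap


def directional_pool(channel):
    """Average one H x W channel along each spatial axis.

    Returns (row_means, col_means): H values that retain vertical position
    and W values that retain horizontal position. Coordinate attention
    computes such descriptors before deriving its attention weights.
    """
    h, w = len(channel), len(channel[0])
    row_means = [sum(row) / w for row in channel]
    col_means = [sum(channel[i][j] for i in range(h)) / h for j in range(w)]
    return row_means, col_means


if __name__ == "__main__":
    # Hypothetical layer: 64 input channels, 128 output channels, 3 x 3 kernel.
    std = standard_conv_params(64, 128, 3)
    ghost = ghost_module_params(64, 128, 3)
    print(f"standard: {std}, ghost: {ghost}, saved: {1 - ghost / std:.1%}")
    print(directional_pool([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]))
```

With `ratio=2`, the primary convolution carries roughly half the weights of the plain layer and the depthwise filters add comparatively few, which is where the saving comes from. The pooled row and column descriptors show why coordinate attention can keep positional information that a single global average would discard.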


Author information

Corresponding author

Correspondence to Xiaoying Pan.

About this article

Cite this article

Pan, X., Li, S., Wang, H. et al. LGCANet: lightweight hand pose estimation network based on HRNet. J Supercomput 80, 19351–19373 (2024). https://doi.org/10.1007/s11227-024-06226-2

  • DOI: https://doi.org/10.1007/s11227-024-06226-2