LGCANet: lightweight hand pose estimation network based on HRNet

Pan, Xiaoying; Li, Shoukun; Wang, Hao; Wang, Beibei; Wang, Haoyi

doi:10.1007/s11227-024-06226-2

Published: 26 May 2024

Volume 80, pages 19351–19373 (2024)

The Journal of Supercomputing

Abstract

Hand pose estimation is a fundamental computer vision task with applications in virtual reality, gesture recognition, autonomous driving, and virtual surgery. Accurate keypoint detection typically relies on deep learning methods that maintain high-resolution feature map representations. HRNet serves as the basis for this work, but its high-resolution representations entail a large parameter count and high computational cost. To mitigate these costs, we propose a lightweight keypoint detection network called LGCANet (Lightweight Ghost-Coordinate Attention Network). The network consists primarily of a lightweight feature extraction head for initial feature extraction and multiple lightweight basic modules called GCAblocks. GCAblocks use linear transformations to generate redundant feature maps, while a coordinate attention mechanism captures inter-channel relationships and long-range positional information. Validation on the RHD and COCO-WholeBody-Hand datasets shows that LGCANet reduces the number of parameters by 65.9% and GFLOPs by 72.6% while preserving accuracy and improving detection speed.
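
To give a feel for why generating redundant feature maps through linear transformations saves parameters, the sketch below compares the weight count of a plain convolution with that of a Ghost-style layer, using the general formulation from GhostNet. It is an illustration under assumptions, not the configuration of LGCANet: the function names, the layer sizes, the ratio `s`, and the cheap-kernel size `d` are all hypothetical and chosen only for the example.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a plain k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_conv_params(c_in: int, c_out: int, k: int, s: int = 2, d: int = 3) -> int:
    """Weights in a Ghost-style layer (bias ignored).

    A primary k x k convolution produces c_out // s intrinsic maps; each
    intrinsic map then yields (s - 1) extra maps through a d x d
    depthwise (per-channel linear) transformation.
    """
    if c_out % s != 0:
        raise ValueError("c_out must be divisible by s in this sketch")
    intrinsic = c_out // s
    primary = c_in * intrinsic * k * k      # ordinary convolution part
    cheap = intrinsic * (s - 1) * d * d     # per-channel linear transformations
    return primary + cheap


# Hypothetical layer sizes, chosen only for illustration.
c_in, c_out, k = 64, 64, 3
plain = standard_conv_params(c_in, c_out, k)
ghost = ghost_conv_params(c_in, c_out, k, s=2, d=3)
reduction = 1 - ghost / plain
print(f"plain={plain}, ghost={ghost}, reduction={reduction:.1%}")
```

With these assumed sizes a single layer needs roughly half the weights. That per-layer figure is not comparable to the 65.9% reported above, which refers to the whole network and also reflects the lightweight head and the overall design.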

Author information

Shoukun Li, Hao Wang, Beibei Wang and Haoyi Wang have contributed equally.

Authors and Affiliations

School of Computer Science and Technology, Xi’an University of Posts and Telecommunications, Xi’an, 710121, Shaanxi, China
Xiaoying Pan, Shoukun Li & Beibei Wang
Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi’an University of Posts and Telecommunications, Xi’an, 710121, Shaanxi, China
Xiaoying Pan, Shoukun Li & Beibei Wang
School of Software, Northwestern Polytechnical University, Xi’an, 710121, Shaanxi, China
Hao Wang
National Engineering Laboratory for Air Earth-Sea Integration Big Data Application Technology, Northwestern Polytechnical University, Xi’an, 710121, Shaanxi, China
Hao Wang
Westa College, Southwest University, Chongqing, China
Haoyi Wang

Corresponding author

Correspondence to Xiaoying Pan.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Pan, X., Li, S., Wang, H. et al. LGCANet: lightweight hand pose estimation network based on HRNet. J Supercomput 80, 19351–19373 (2024). https://doi.org/10.1007/s11227-024-06226-2

Accepted: 12 May 2024
Published: 26 May 2024
Issue Date: September 2024
DOI: https://doi.org/10.1007/s11227-024-06226-2
