TBSA-Net: A Temperature-Based Structure-Aware Hand Pose Estimation Model in Infrared Images

Xia, Hongfu; Li, Yang; Liu, Chunyan; Zhao, Yunlong

doi:10.1007/978-981-99-9896-8_16


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14504)

Included in the following conference series:

International Conference on Green, Pervasive, and Cloud Computing

Abstract

In recent years, 2D Hand Pose Estimation (HPE) on RGB images has been studied in depth and has made significant progress. HPE on infrared images, however, has received limited attention. Because infrared images carry limited channel information and are highly correlated with temperature, models designed for RGB images may be insufficiently accurate on them. Our experiments reveal that the temperature distribution of the human hand in infrared images exhibits significant regularity. In this paper, we propose the Temperature-Based Hand Judgement Model (TB-HJM), which leverages this characteristic. During training, a higher penalty is given when the temperature distribution of the predicted pose does not align with the actual temperature distribution, and a lower penalty when it does. During testing, TB-HJM is used to select, as the final output, the hand proposal that most closely matches the temperature distribution. Additionally, to address the lack of visual information in infrared images, we use PBNHead and a GCN Refine Module to merge structural information into the network and ensure model accuracy. Experimental results demonstrate that our model outperforms the benchmark model (HRNet) by 1.72% in AUC and reduces the EPE by 0.6448, from 3.02 to 2.38, achieving state-of-the-art performance on our infrared hand dataset.
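The abstract does not specify how TB-HJM scores a pose, so the following is only a minimal sketch of the general idea, under stated assumptions: a pose is a list of (x, y) keypoints, the "temperature distribution" is reduced to a per-keypoint reference profile, and the mismatch is a mean absolute difference. The function names, the `alpha` weight, and the toy image are all hypothetical and are not taken from the paper.

```python
from statistics import mean


def sample_temperatures(image, pose):
    """Read the pixel value under each (x, y) keypoint, clamping to the image bounds."""
    height, width = len(image), len(image[0])
    temps = []
    for x, y in pose:
        col = min(max(int(round(x)), 0), width - 1)
        row = min(max(int(round(y)), 0), height - 1)
        temps.append(image[row][col])
    return temps


def temperature_mismatch(image, pose, reference_profile):
    """Mean absolute gap between sampled temperatures and a reference profile.

    This score is an assumption made for illustration, not the paper's model.
    """
    temps = sample_temperatures(image, pose)
    return mean(abs(t - r) for t, r in zip(temps, reference_profile))


def penalty_weight(mismatch, alpha=0.1):
    """Training-time idea: scale the pose loss up as the mismatch grows.

    alpha is a hypothetical tuning knob.
    """
    return 1.0 + alpha * mismatch


def select_proposal(image, proposals, reference_profile):
    """Test-time idea: keep the proposal whose temperatures best match the reference."""
    return min(
        proposals,
        key=lambda pose: temperature_mismatch(image, pose, reference_profile),
    )


# Toy 4x4 "infrared image": warm hand pixels (about 33) on a cooler background (20).
image = [
    [20, 20, 20, 20],
    [20, 33, 34, 20],
    [20, 32, 33, 20],
    [20, 20, 20, 20],
]
reference_profile = [33, 34, 32]      # assumed per-keypoint temperatures
on_hand = [(1, 1), (2, 1), (1, 2)]    # keypoints placed on warm pixels
off_hand = [(0, 0), (3, 0), (0, 3)]   # keypoints placed on the background

best = select_proposal(image, [off_hand, on_hand], reference_profile)
```

In this toy setup the on-hand proposal has zero mismatch and is selected, while the off-hand proposal would receive a larger training penalty. A learned judgement model would replace the hand-written mismatch score.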

Acknowledgements

This research was supported by the National Key Research and Development Program of China under Grant No. 2022ZD0115403, National Natural Science Foundation of China under Grant No. 62072236, and the Fundamental Research Funds for the Central Universities under Grant No. 56XCA2205404.

Author information

Authors and Affiliations

College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China
Hongfu Xia, Chunyan Liu & Yunlong Zhao
Unmanned Aerial Vehicles Research Institute, Nanjing University of Aeronautics and Astronautics, Nanjing, China
Yang Li
Key Laboratory of Advanced Technology for Small and Medium-Sized UAV, Ministry of Industry and Information Technology, Nanjing, China
Yang Li

Corresponding author

Correspondence to Yunlong Zhao.

Editor information

Editors and Affiliations

Huazhong University of Science and Technology, Wuhan, China
Hai Jin
Harbin Engineering University, Harbin, China
Zhiwen Yu
Huazhong University of Science and Technology, Wuhan, China
Chen Yu
Shiga University, Shiga, Japan
Xiaokang Zhou
National Academy of Guo Ding Institute of Data Science, Beijing, China
Zeguang Lu
Harbin University of Science and Technology, Harbin, China
Xianhua Song

About this paper

Cite this paper

Xia, H., Li, Y., Liu, C., Zhao, Y. (2024). TBSA-Net: A Temperature-Based Structure-Aware Hand Pose Estimation Model in Infrared Images. In: Jin, H., Yu, Z., Yu, C., Zhou, X., Lu, Z., Song, X. (eds) Green, Pervasive, and Cloud Computing. GPC 2023. Lecture Notes in Computer Science, vol 14504. Springer, Singapore. https://doi.org/10.1007/978-981-99-9896-8_16

DOI: https://doi.org/10.1007/978-981-99-9896-8_16
Published: 23 January 2024
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-9895-1
Online ISBN: 978-981-99-9896-8
eBook Packages: Computer Science, Computer Science (R0)
