Human pose estimation for low-resolution image using 1-D heatmaps and offset regression

Multimedia Tools and Applications

Abstract

Running a reliable network on resource-limited platforms with low-resolution input images is a major challenge for heatmap-based human pose estimation (HPE). Two factors hinder performance at low resolution: the scale mismatch between the input image and the heatmaps, and the intrinsic quantization effect induced by the ‘argmax’ function. In this paper, we propose a coordinate-decoupled and offset-revised module (CDORM) to tackle these challenges. The proposed CDORM uses two coordinate-decoupled 1-D heatmaps to supervise the regression of the horizontal and vertical locations of human joints, and employs offset regression to alleviate the effect of quantization. The CDORM can be integrated with any current heatmap-based HPE network without significantly increasing the size of the network. Experimental results on the COCO and MPII datasets show that CDORM helps heatmap-based regression approaches obtain high estimation accuracy from low-resolution images while only slightly increasing the size and runtime of the network.
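To make the quantization issue concrete, the following Python sketch encodes a joint's x and y coordinates as two separate 1-D heatmaps and then decodes them with and without a sub-bin offset. It is an illustration under stated assumptions, not the CDORM implementation: the function names `encode_1d` and `decode_1d`, the stride of 4, the Gaussian target shape, and the definition of the offset as the fractional bin remainder are all hypothetical choices. In a real network the heatmaps and offsets would be predicted rather than computed from the ground truth.

```python
import math

def encode_1d(coord, stride, num_bins, sigma=1.0):
    """Encode one image coordinate as a 1-D Gaussian heatmap plus an offset.

    Assumed convention: the offset is the fractional part of coord / stride,
    i.e. exactly the information that argmax alone discards.
    """
    pos = coord / stride                          # continuous position in bin units
    k = min(int(math.floor(pos)), num_bins - 1)   # bin containing the joint
    offset = pos - k                              # sub-bin remainder
    heatmap = [math.exp(-((i - k) ** 2) / (2.0 * sigma ** 2))
               for i in range(num_bins)]
    return heatmap, offset

def decode_1d(heatmap, offset, stride):
    """Return (argmax-only estimate, offset-refined estimate) in pixels."""
    k = max(range(len(heatmap)), key=lambda i: heatmap[i])  # argmax over bins
    coarse = float(k * stride)                    # quantized to the bin origin
    refined = (k + offset) * stride               # offset restores the remainder
    return coarse, refined

# Hypothetical 48x64 (width x height) input with heatmap stride 4.
STRIDE = 4
hx, ox = encode_1d(37.5, STRIDE, num_bins=12)     # horizontal 1-D heatmap
hy, oy = encode_1d(22.0, STRIDE, num_bins=16)     # vertical 1-D heatmap

coarse_x, refined_x = decode_1d(hx, ox, STRIDE)
coarse_y, refined_y = decode_1d(hy, oy, STRIDE)

print((coarse_x, coarse_y))     # (36.0, 20.0)
print((refined_x, refined_y))   # (37.5, 22.0)
```

With argmax alone, the decoded joint can be off by up to one stride per axis, which is a large fraction of a low-resolution image. Two 1-D heatmaps of lengths W/s and H/s also require far fewer output values per joint than a single (W/s)×(H/s) 2-D heatmap.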

Code Availability

The code will be published online soon.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (62173353), the Guangzhou Municipal People’s Livelihood Science and Technology Plan (201903010040), and the Science and Technology Program of Guangzhou, China (202007030011).

Author information

Contributions

Cailong Chi: Investigation, Methodology, Data Acquisition & Analysis, Visualization, Writing Original Draft. Dong Zhang: Funding Acquisition, Conceptualization, Data Analysis, Critical Revision, Data Curation. Zhesi Zhu: Methodology, Validation. Xingzhi Wang: Methodology. Dah-Jye Lee: Conceptualization, Critical Revision.

Corresponding author

Correspondence to Dong Zhang.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for Publication

All authors agreed with the content and gave explicit consent to submit.

Conflict of Interests

The authors declare that they have no conflict of interest.

Additional information

Availability of data and materials

The datasets analysed during the current study are available in the Common Objects in Context (COCO) repository, https://cocodataset.org/, and MPII Human Pose Dataset, http://human-pose.mpi-inf.mpg.de/. The data generated during the current study are available from the corresponding author on reasonable request.

About this article

Cite this article

Chi, C., Zhang, D., Zhu, Z. et al. Human pose estimation for low-resolution image using 1-D heatmaps and offset regression. Multimed Tools Appl 82, 6289–6307 (2023). https://doi.org/10.1007/s11042-022-13468-w

  • DOI: https://doi.org/10.1007/s11042-022-13468-w
