DOI: 10.1145/3614008.3614040
research-article

Improved YOLO-Pose Crowd Pose Estimation

Published: 17 October 2023

Abstract

Human pose estimation in crowded scenes suffers from low detection and recognition rates, caused either by occlusion between people or by weak contrast between foreground and background. To address this problem, we propose an improved YOLO-Pose algorithm that incorporates convolutional block attention modules (CBAM). Integrating CBAM into the Bottleneck module lets the network mine feature information along both the channel and spatial dimensions, which strengthens feature extraction for the target objects in the image. To improve the convergence of the model, we replace CIOU, the original regression loss of YOLOv8, with EIOU. This optimizes the regression accuracy of anchor points, reduces the difficulty of network training, and improves the detection rate under occlusion. Compared with the baseline YOLOv8 algorithm, the improved algorithm proposed in this paper achieves higher accuracy when performing pose estimation in crowded scenes.
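The EIOU regression loss mentioned in the abstract can be illustrated with a minimal sketch. It follows the commonly published form of the loss: one minus IoU, plus a center-distance penalty, plus separate width and height penalties, each normalized by the smallest enclosing box. The function name `eiou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` stabilizer are assumptions made for illustration. This is not the authors' implementation and it does not show how the loss is wired into YOLOv8.

```python
def eiou_loss(pred, target, eps=1e-9):
    """Illustrative EIOU loss for two axis-aligned boxes (x1, y1, x2, y2).

    Hypothetical helper: name, box format and eps are assumptions.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Widths and heights of the predicted and target boxes.
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared distance between box centers, normalized by the
    # squared diagonal of the enclosing box.
    dx = (px1 + px2) / 2.0 - (tx1 + tx2) / 2.0
    dy = (py1 + py2) / 2.0 - (ty1 + ty2) / 2.0
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # EIOU penalizes width and height differences directly,
    # where CIOU uses a single aspect-ratio term.
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)

    return 1.0 - iou + center_term + width_term + height_term


if __name__ == "__main__":
    print(eiou_loss((0.0, 0.0, 2.0, 2.0), (0.0, 0.0, 2.0, 2.0)))
    print(eiou_loss((0.0, 0.0, 1.0, 1.0), (2.0, 0.0, 3.0, 1.0)))
```

Because the width and height gaps are penalized separately, the gradient still carries size information when two boxes share the same aspect ratio but differ in scale, which is the usual motivation given for preferring EIOU over CIOU.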


Cited By

  • (2024) Category-Level Pose Estimation and Iterative Refinement for Monocular RGB-D Image. ACM Transactions on Multimedia Computing, Communications, and Applications 20:12 (1-20). DOI: 10.1145/3695877. Online publication date: 11-Sep-2024
  • (2024) MDA-YOLO Person: a 2D human pose estimation model based on YOLO detection framework. Cluster Computing 27:9 (12323-12340). DOI: 10.1007/s10586-024-04608-y. Online publication date: 11-Jun-2024


Information

Published In

SPML '23: Proceedings of the 2023 6th International Conference on Signal Processing and Machine Learning
July 2023
383 pages
ISBN:9798400707575
DOI:10.1145/3614008

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tag

  1. Keywords: Pose estimation, Crowd occlusion, YOLO-Pose, Attention mechanism

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

SPML 2023
