
Improved YOLO-Pose Crowd Pose Estimation

Authors:

Minda Yao

SPML '23: Proceedings of the 2023 6th International Conference on Signal Processing and Machine Learning

Pages 201 - 206

https://doi.org/10.1145/3614008.3614040

Published: 17 October 2023

Abstract

To address the low detection and recognition rates caused by crowd occlusion or by weak contrast between foreground and background when estimating human pose in crowded scenes, we propose an improved YOLO-Pose algorithm that incorporates convolutional block attention modules (CBAM). CBAM is integrated into the Bottleneck module to mine feature information at both the channel and spatial levels and to strengthen feature extraction for the target objects in the image. To improve the model's convergence, the CIoU loss is replaced: EIoU is used instead of the original regression loss of YOLOv8, which improves the regression accuracy of anchor points, reduces the difficulty of network training, and improves the detection rate under occlusion. Compared with the standard YOLOv8 algorithm, the improved algorithm proposed in this paper achieves higher accuracy for pose estimation in crowded scenes.
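
The abstract names EIoU as the replacement for the CIoU box-regression loss but does not give its form. As a rough illustration only, the sketch below follows the commonly cited EIoU definition from Zhang et al. ("Focal and efficient IOU loss for accurate bounding box regression"): one minus IoU, plus normalized penalties on the center distance, the width difference, and the height difference. The function name `eiou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` stabilizer are assumptions made for this example; the paper's actual implementation may differ.

```python
def eiou_loss(box_pred, box_gt, eps=1e-9):
    """Illustrative EIoU loss for two axis-aligned boxes (x1, y1, x2, y2).

    Assumed form (not taken from the paper's code):
    1 - IoU + center_dist^2 / c^2 + (w - w_gt)^2 / Cw^2 + (h - h_gt)^2 / Ch^2,
    where Cw, Ch are the width and height of the smallest enclosing box
    and c is its diagonal.
    """
    px1, py1, px2, py2 = box_pred
    gx1, gy1, gx2, gy2 = box_gt

    # Widths and heights of the predicted and ground-truth boxes.
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union.
    inter_w = max(0.0, min(px2, gx2) - max(px1, gx1))
    inter_h = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = inter_w * inter_h
    union = pw * ph + gw * gh - inter
    iou = inter / (union + eps)

    # Smallest box enclosing both boxes.
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)

    # Squared distance between the two box centers.
    dx = (px1 + px2) / 2.0 - (gx1 + gx2) / 2.0
    dy = (py1 + py2) / 2.0 - (gy1 + gy2) / 2.0

    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)
    width_term = (pw - gw) ** 2 / (cw * cw + eps)
    height_term = (ph - gh) ** 2 / (ch * ch + eps)

    return 1.0 - iou + center_term + width_term + height_term


if __name__ == "__main__":
    # Two 2x2 boxes offset by one unit in each direction.
    print(eiou_loss((0, 0, 2, 2), (1, 1, 3, 3)))
```

Unlike CIoU, which penalizes the aspect-ratio mismatch through a single coupled term, this form penalizes width and height errors separately, which is the property usually credited with faster convergence.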


Cited By

Bao Y, Su C, Qi Y, Geng Y, Li H (2024). Category-Level Pose Estimation and Iterative Refinement for Monocular RGB-D Image. ACM Transactions on Multimedia Computing, Communications, and Applications 20:12 (1-20). Online publication date: 11-Sep-2024.
https://dl.acm.org/doi/10.1145/3695877
Dong C, Tang Y, Zhang L (2024). MDA-YOLO Person: a 2D human pose estimation model based on YOLO detection framework. Cluster Computing 27:9 (12323-12340). Online publication date: 11-Jun-2024.
https://doi.org/10.1007/s10586-024-04608-y

Index Terms

Improved YOLO-Pose Crowd Pose Estimation
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Reconstruction



Published In


SPML '23: Proceedings of the 2023 6th International Conference on Signal Processing and Machine Learning

July 2023

383 pages

ISBN:9798400707575

DOI:10.1145/3614008

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tag

Keywords: Pose estimation, Crowd occlusion, YOLO-Pose, Attention mechanism

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

National Natural Science Foundation of China (NSFC)
Fundamental Research Funds for the Central Universities
Graduate Innovation Program of China University of Mining and Technology

Conference

SPML 2023

SPML 2023: 2023 6th International Conference on Signal Processing and Machine Learning

July 14 - 16, 2023

Tianjin, China
