DOI:10.1145/3569966.3569989
research-article

ConvPose: An efficient human pose estimation method based on ConvNeXt

Published: 20 December 2022

Abstract

Human pose estimation methods have developed rapidly in recent years, and many high-precision models have emerged. However, the computational cost of these methods is often very high, especially for transformer-based models. In this work, we propose ConvPose, an efficient human pose estimation model built on a convolutional neural network architecture. ConvPose uses an efficient single-branch structure that takes the ConvNeXt Block as its baseline and incorporates the Coordinate Attention module. This composition not only provides stronger feature extraction, but also efficiently captures the global dependencies between human keypoints and the surrounding scene. The effective combination of large convolution kernels and the attention module lets our model focus on fine-grained features in complex scenes. In addition, the parameter count and GFLOPs of our model are lower than those of current high-performance models, which makes it easier to deploy on low-end devices. Experiments show that our model achieves 73.6 AP on the MS-COCO dataset with only 6.3M parameters, a very competitive result.
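
To make the described composition concrete, the sketch below (not the authors' released code) pairs a ConvNeXt-style block, with its 7x7 depthwise convolution, with a Coordinate Attention module that pools along the height and width axes separately. Module names, channel sizes, and the reduction ratio are illustrative assumptions; only the overall structure follows the abstract.

# Hedged sketch, assuming a PyTorch implementation; not the ConvPose code.
import torch
import torch.nn as nn


class ConvNeXtBlock(nn.Module):
    """Depthwise 7x7 conv -> LayerNorm -> pointwise expand -> GELU -> pointwise project, with a residual."""

    def __init__(self, dim: int, expansion: int = 4):
        super().__init__()
        self.dwconv = nn.Conv2d(dim, dim, kernel_size=7, padding=3, groups=dim)
        self.norm = nn.LayerNorm(dim)                  # applied over channels (channels-last)
        self.pwconv1 = nn.Linear(dim, expansion * dim)
        self.act = nn.GELU()
        self.pwconv2 = nn.Linear(expansion * dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        residual = x
        x = self.dwconv(x)
        x = x.permute(0, 2, 3, 1)                      # (N, C, H, W) -> (N, H, W, C)
        x = self.pwconv2(self.act(self.pwconv1(self.norm(x))))
        x = x.permute(0, 3, 1, 2)                      # back to (N, C, H, W)
        return residual + x


class CoordinateAttention(nn.Module):
    """Pools along H and W separately so the attention map keeps positional information in both directions."""

    def __init__(self, dim: int, reduction: int = 32):
        super().__init__()
        hidden = max(8, dim // reduction)
        self.conv1 = nn.Conv2d(dim, hidden, kernel_size=1)
        self.bn = nn.BatchNorm2d(hidden)
        self.act = nn.ReLU(inplace=True)
        self.conv_h = nn.Conv2d(hidden, dim, kernel_size=1)
        self.conv_w = nn.Conv2d(hidden, dim, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        x_h = x.mean(dim=3, keepdim=True)                            # (N, C, H, 1): pool over width
        x_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)        # (N, C, W, 1): pool over height
        y = self.act(self.bn(self.conv1(torch.cat([x_h, x_w], dim=2))))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                        # (N, C, H, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))    # (N, C, 1, W)
        return x * a_h * a_w                                         # reweight features with both attention maps


if __name__ == "__main__":
    stage = nn.Sequential(ConvNeXtBlock(dim=64), CoordinateAttention(dim=64))
    features = torch.randn(1, 64, 64, 48)              # an assumed 4x-downsampled pose feature map
    print(stage(features).shape)                       # torch.Size([1, 64, 64, 48])

The channel width (64), expansion ratio, and reduction ratio above are placeholders; the paper's actual stage configuration would determine the real values.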


Cited By

  • (2024) V-LTCS: Backbone exploration for Multimodal Misogynous Meme detection. Natural Language Processing Journal, 100109. DOI: 10.1016/j.nlp.2024.100109. Online publication date: Oct-2024.
  • (2023) ConvCCpose: Learning Coordinate Classification Tokens for Human Pose Estimation Based on ConvNeXt. 2023 7th Asian Conference on Artificial Intelligence Technology (ACAIT), 391-396. DOI: 10.1109/ACAIT60137.2023.10528558. Online publication date: 10-Nov-2023.


Information

Published In

CSSE '22: Proceedings of the 5th International Conference on Computer Science and Software Engineering
October 2022
753 pages
ISBN:9781450397780
DOI:10.1145/3569966
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 20 December 2022


Author Tags

  1. ConvNeXt
  2. Coordinate Attention
  3. Human pose estimation
  4. convolutional neural network

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • Science and Technology Major Project of Guangxi Zhuang Autonomous Region Government

Conference

CSSE 2022

Acceptance Rates

Overall Acceptance Rate 33 of 74 submissions, 45%


Article Metrics

  • Downloads (last 12 months): 20
  • Downloads (last 6 weeks): 1

Reflects downloads up to 05 Mar 2025

