DOI: 10.1145/3460426.3463668
research-article

Human Pose Estimation based on Attention Multi-resolution Network

Published: 01 September 2021

ABSTRACT

Multi-resolution neural networks, which combine features at different resolutions, have recently achieved good results in human pose estimation. In this paper, we propose an attention-based multi-resolution network that adds an attention mechanism to the High-Resolution Network (HRNet) to strengthen its feature representation. The attention mechanism improves the ability of the sub-networks at different resolutions to extract key features from images, so that the output carries more effective multi-resolution representation information and the positions of human joint keypoints can be estimated more accurately. We evaluate the network on the MPII and COCO datasets: on MPII it reaches an average accuracy of 90.3% under the PCKh@0.5 evaluation standard, and on COCO it achieves an AP of 76.5. These results show that the proposed model is effective in improving keypoint estimation accuracy in human pose estimation.
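As an illustration of the PCKh@0.5 standard named in the abstract, the following is a minimal sketch of how such a score is commonly computed: a predicted joint counts as correct when its distance to the ground-truth joint is at most 0.5 times a head-size reference length. The function name `pckh`, its parameters, and the toy coordinates are assumptions made for this example and are not taken from the paper; the official MPII evaluation code may define the head size and joint visibility differently.

```python
import math


def pckh(pred, gt, head_size, alpha=0.5, visible=None):
    """Return the fraction of counted joints predicted within alpha * head_size.

    pred, gt  : lists of (x, y) joint coordinates in the same order
    head_size : reference head length in pixels (assumed given by the caller)
    alpha     : threshold factor; 0.5 corresponds to PCKh@0.5
    visible   : optional list of booleans; joints marked False are skipped
    """
    if visible is None:
        visible = [True] * len(gt)

    correct = 0
    counted = 0
    for (px, py), (gx, gy), vis in zip(pred, gt, visible):
        if not vis:
            continue
        counted += 1
        # Euclidean distance between prediction and ground truth
        if math.hypot(px - gx, py - gy) <= alpha * head_size:
            correct += 1

    return correct / counted if counted else 0.0


# Toy example (hypothetical coordinates): threshold is 0.5 * 10 = 5 pixels.
gt_joints = [(10, 10), (20, 20), (30, 30), (40, 40)]
pred_joints = [(10, 12), (20, 26), (30, 30), (49, 40)]
score = pckh(pred_joints, gt_joints, head_size=10.0)
print(score)  # distances are 2, 6, 0, 9, so two of four joints are correct
```

A dataset-level figure such as the reported 90.3% would come from aggregating this per-joint correctness over all annotated joints in the test images, rather than from a single pose as in the toy example.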


  • Published in

    ICMR '21: Proceedings of the 2021 International Conference on Multimedia Retrieval
    August 2021
    715 pages
    ISBN: 9781450384636
    DOI: 10.1145/3460426

    Copyright © 2021 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Acceptance Rates

    Overall Acceptance Rate: 254 of 830 submissions, 31%
