ABSTRACT
Human pose estimation is a pivotal task in computer vision that aims to accurately predict the spatial locations of key body joints in an image. The task is difficult because a model must cope with complex poses, occlusions, and variations in body configuration, all of which often degrade the accuracy of conventional pose estimation models. To improve the accuracy and robustness of human pose estimation, we introduce an Attention-Augmented HRNet architecture. The proposed model augments the original HRNet with self-attention mechanisms, which capture long-range dependencies among keypoints and focus more effectively on the most informative body regions. Experimental results show that the Attention-Augmented HRNet outperforms the baseline HRNet without attention and attains state-of-the-art performance on the COCO dataset, achieving an Average Precision (AP) of 74.5%.
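The abstract does not say how or where self-attention is inserted into HRNet, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes generic scaled dot-product self-attention with identity query/key/value projections (a real layer would learn these) and a residual addition; the function names `self_attention` and `attention_augmented` are hypothetical. It illustrates how every position of a flattened feature map can attend to every other position, regardless of spatial distance.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Scaled dot-product self-attention with identity projections (assumption).

    features: list of N position vectors (each a list of C floats), e.g. an
    H x W feature map flattened to N = H * W positions.
    Returns (outputs, weights); weights[i][j] is how strongly position i
    attends to position j.
    """
    dim = len(features[0])
    scale = 1.0 / math.sqrt(dim)
    outputs, weights = [], []
    for query in features:
        # Similarity of this position to every position, near or far.
        scores = [scale * sum(q * k for q, k in zip(query, key))
                  for key in features]
        row = softmax(scores)
        weights.append(row)
        # Output is the attention-weighted average of all position vectors.
        outputs.append([sum(w * value[c] for w, value in zip(row, features))
                        for c in range(dim)])
    return outputs, weights


def attention_augmented(features):
    """Add attended features back onto the input (residual form, assumed)."""
    attended, _ = self_attention(features)
    return [[x + a for x, a in zip(feat, att)]
            for feat, att in zip(features, attended)]


# Toy 2 x 2 feature map with 2 channels, flattened to 4 positions.
feature_map = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
attended, attn = self_attention(feature_map)
augmented = attention_augmented(feature_map)
```

In this toy input, positions 0 and 2 carry the same feature vector, so position 0 attends to position 2 exactly as strongly as to itself even though the two are not adjacent. That is the long-range behaviour the abstract refers to.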