Simple Fine-Tuning Attention Modules for Human Pose Estimation

Tran, Tien-Dat; Vo, Xuan-Thuy; Russo, Moahamammad-Ashraf; Jo, Kang-Hyun

doi:10.1007/978-3-030-63119-2_15

Conference paper
First Online: 19 November 2020


Part of the book series: Communications in Computer and Information Science (CCIS, volume 1287)

Abstract

Convolutional neural networks (CNNs) have achieved the best performance not only in human pose estimation but also in other computer vision tasks (e.g., object detection, semantic segmentation, image classification). This paper focuses on an attention module (AM) for feed-forward CNNs. First, the feature map produced by a block of the backbone network is fed into the attention module, which processes it along two separate dimensions, channel and spatial. The AM then combines the two resulting feature maps by multiplication and passes the result to the next block of the backbone. In this way the network captures both the long-range dependencies (channel) and the spatial information, which improves accuracy. The experimental results compare the network with the attention module against existing methods. The predicted joint heatmaps maintain accuracy and are spatially better than those of the simple baseline, and the proposed architecture gains 1.0 point in AP over the baseline. The proposed network is trained on the COCO 2017 benchmark, a widely accessible dataset.
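The data flow described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `attention_module`, the parameter-free average-pooling gates, and the choice to multiply the two gates together before applying them are all assumptions made for the example. The module in the paper presumably uses learned layers in each branch.

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic function, mapping values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def attention_module(x):
    """Hypothetical channel + spatial attention on a (C, H, W) feature map.

    Assumption: each branch is a parameter-free average pool followed by
    a sigmoid; a real module would use learned weights here.
    """
    # Channel branch: pool over the spatial axes, one gate per channel.
    channel_gate = sigmoid(x.mean(axis=(1, 2), keepdims=True))  # (C, 1, 1)
    # Spatial branch: pool over the channel axis, one gate per location.
    spatial_gate = sigmoid(x.mean(axis=0, keepdims=True))       # (1, H, W)
    # Combine the two branches by multiplication (broadcast to (C, H, W))
    # and reweight the input; the result would go to the next backbone block.
    return x * channel_gate * spatial_gate


# Usage: the output keeps the shape of the input feature map.
features = np.random.default_rng(0).normal(size=(4, 8, 6))
refined = attention_module(features)
print(refined.shape)
```

Because both gates lie in (0, 1), this sketch can only attenuate activations; it keeps the feature map shape unchanged, so it can be placed between any two backbone blocks.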

Acknowledgement

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (2020R1A2C2008972).

Author information

Authors and Affiliations

School of Electrical Engineering, University of Ulsan, Ulsan, 44610, South Korea
Tien-Dat Tran, Xuan-Thuy Vo, Moahamammad-Ashraf Russo & Kang-Hyun Jo

Corresponding author

Correspondence to Kang-Hyun Jo.

Editor information

Editors and Affiliations

Wroclaw University of Economics and Business, Wrocław, Poland
Marcin Hernes
Wrocław University of Science and Technology, Wrocław, Poland
Krystian Wojtkiewicz
University of Newcastle, Newcastle, Australia
Edward Szczerbicki

About this paper

Cite this paper

Tran, TD., Vo, XT., Russo, MA., Jo, KH. (2020). Simple Fine-Tuning Attention Modules for Human Pose Estimation. In: Hernes, M., Wojtkiewicz, K., Szczerbicki, E. (eds) Advances in Computational Collective Intelligence. ICCCI 2020. Communications in Computer and Information Science, vol 1287. Springer, Cham. https://doi.org/10.1007/978-3-030-63119-2_15

DOI: https://doi.org/10.1007/978-3-030-63119-2_15
Published: 19 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-63118-5
Online ISBN: 978-3-030-63119-2
eBook Packages: Computer Science, Computer Science (R0)
