Channel sifted model for pose estimation

Zhou, Shuren; Peng, Liang

doi:10.1007/s10489-022-04091-1

Published: 03 September 2022

Volume 53, pages 11373–11388, (2023)

Applied Intelligence

Shuren Zhou¹ &
Liang Peng¹

Abstract

In recent years, human pose estimation, also called keypoint estimation, has been a research hotspot in computer vision. However, most methods focus on achieving high accuracy and ignore computational cost. These methods usually build their networks from deeper layers and more complex structures, which causes a surge in computational cost and makes them difficult to apply in real environments with real-time requirements. Some methods try to reduce the computational cost by decreasing parameter precision or using simpler structures, but they achieve only low accuracy and are therefore also difficult to apply in real environments. Inspired by lightweight network design, this paper proposes a channel sifted method for human pose estimation, called the channel sifted network (CSN). A lightweight ResNet is used as the backbone, and a channel scoring module (CSM) is applied. Both parts aim to balance the computational cost and prediction accuracy of the network; the lower the computational cost, the faster the inference speed. In the experiments, we first train and test the network on the COCO2017 and MPII datasets, and then demonstrate the effectiveness of the lightweight backbone and the CSM through an ablation study. Compared with other lightweight models, CSN achieves higher performance.
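The abstract does not say how the CSM scores channels, so the following is only an illustrative sketch of the general idea of scoring and sifting channels, not the authors' module. It assumes a hypothetical criterion (mean absolute activation per channel) and hypothetical helper names (`channel_scores`, `sift_channels`, `conv_macs`). The last helper uses the standard multiply-accumulate count of a convolution to show why fewer channels mean lower computational cost.

```python
from typing import List

# One channel of a feature map: H rows of W activation values.
Channel = List[List[float]]


def channel_scores(features: List[Channel]) -> List[float]:
    """Score each channel by its mean absolute activation.

    This criterion is an assumption made for illustration; the paper's
    channel scoring module may compute scores differently.
    """
    scores = []
    for channel in features:
        values = [abs(v) for row in channel for v in row]
        scores.append(sum(values) / len(values))
    return scores


def sift_channels(features: List[Channel], keep: int) -> List[int]:
    """Return the indices of the `keep` highest-scoring channels, in original order."""
    scores = channel_scores(features)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:keep])


def conv_macs(height: int, width: int, c_in: int, c_out: int, kernel: int = 3) -> int:
    """Multiply-accumulate count of a standard convolution whose output is height x width."""
    return height * width * c_in * c_out * kernel * kernel


# Toy input: 4 channels of size 2 x 2 (values are made up).
features = [
    [[0.1, 0.2], [0.0, 0.1]],
    [[1.0, -2.0], [0.5, 0.5]],
    [[0.0, 0.0], [0.0, 0.0]],
    [[0.9, 0.8], [0.7, 0.6]],
]
kept = sift_channels(features, keep=2)

# Cost of one 3x3 convolution before and after halving both channel counts.
full_cost = conv_macs(64, 48, 256, 256)
sifted_cost = conv_macs(64, 48, 128, 128)
```

Because the cost of a convolution scales with the product of input and output channel counts, halving both cuts the multiply-accumulate count of that layer to one quarter. This is the kind of trade-off between cost and accuracy that the abstract describes.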

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grant 61972056, in part by the Hunan Provincial Natural Science Foundation of China under Grants 2021JJ30743 and 2021JJ30741, in part by the Degree & Post-graduate Education Reform Project of Hunan Province of China under Grant 2020JGZD043, and in part by the Youth Innovation Project of Guangdong Provincial Department of Education under Grant 2020kqncx205.

Author information

Authors and Affiliations

School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha, 410114, Hunan, China
Shuren Zhou & Liang Peng

Corresponding author

Correspondence to Shuren Zhou.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhou, S., Peng, L. Channel sifted model for pose estimation. Appl Intell 53, 11373–11388 (2023). https://doi.org/10.1007/s10489-022-04091-1

Accepted: 16 August 2022
Published: 03 September 2022
Issue Date: May 2023
DOI: https://doi.org/10.1007/s10489-022-04091-1
