Cascade Error-Correction Mechanism for Human Pose Estimation in Videos

Dai, Huibing; He, Lihuo; Gao, Xinbo; Guo, Zhaoqi; Lu, Wen

doi:10.1007/978-3-319-67777-4_25

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10559)

Included in the following conference series:

International Conference on Intelligent Science and Big Data Engineering

Abstract

This paper addresses the estimation of constantly changing human poses in videos. Traditional methods fail to locate wrists accurately, which remains a highly challenging task. We propose a three-stage framework for human pose estimation that emphasizes improving wrist localization accuracy. The first stage applies the pictorial structure model to localize all joints in each frame and to calculate the posterior edge distribution probability of the wrists. In the second stage, a visual-tracking-based method is fused with the posterior edge distribution probability of the wrists to obtain the wrist location. Instead of directly predicting the wrist location, the third stage introduces a novel cascade error-correction mechanism (CECM) to correct the predicted results. CECM also incorporates a skin-based proposal and multiple reinitialization modes. Experiments on two public datasets demonstrate the superiority of the proposed algorithm over state-of-the-art methods.
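The abstract gives no formulas, so the following is only a rough sketch of how a posterior map, a tracker prediction, and a cascade of fallbacks might fit together. It is not the authors' CECM. The function names, the Gaussian weighting, the `sigma` and `min_score` values, and the order of the fallbacks are all assumptions made for illustration.

```python
import math


def fuse_posterior_with_tracker(posterior, tracker_xy, sigma=2.0):
    """Weight a wrist posterior map by a Gaussian centred on the tracker output.

    posterior[y][x] is a non-negative score for the wrist lying at (x, y).
    The Gaussian weighting is an assumed fusion rule, not taken from the paper.
    Returns the best (x, y) cell and its fused score.
    """
    tx, ty = tracker_xy
    best_xy, best_score = None, -1.0
    for y, row in enumerate(posterior):
        for x, p in enumerate(row):
            d2 = (x - tx) ** 2 + (y - ty) ** 2
            score = p * math.exp(-d2 / (2.0 * sigma ** 2))
            if score > best_score:
                best_xy, best_score = (x, y), score
    return best_xy, best_score


def cascade_correct(posterior, tracker_xy, skin_candidates, min_score=0.05):
    """Hypothetical three-step cascade; returns (location, step that produced it)."""
    # Step 1: accept the fused estimate when it is confident enough.
    xy, score = fuse_posterior_with_tracker(posterior, tracker_xy)
    if score >= min_score:
        return xy, "fused"

    # Step 2: otherwise pick the skin-coloured candidate with the highest posterior.
    best = None
    for (x, y) in skin_candidates:
        p = posterior[y][x]
        if p >= min_score and (best is None or p > best[1]):
            best = ((x, y), p)
    if best is not None:
        return best[0], "skin"

    # Step 3: otherwise reinitialise from the posterior maximum alone.
    cells = [(p, (x, y)) for y, row in enumerate(posterior) for x, p in enumerate(row)]
    _, xy = max(cells)
    return xy, "reinit"
```

In this sketch a tracker that has drifted far from the posterior peak yields a low fused score, so the estimate falls through to the skin candidates and then to reinitialization. The real mechanism may use different cues and thresholds.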

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grants 61432014, 61501349, and U1605252, in part by the National Key Research and Development Program of China under Grant 2016QY01W0204, in part by the Key Industrial Innovation Chain in Industrial Domain under Grant 2016KTZDGY-02, in part by the Fundamental Research Funds for the Central Universities under Grants XJS17074 and JBX170218, in part by the National High-Level Talents Special Support Program of China under Grant CS31117200001, and in part by the Natural Science Basic Research Plan in Shaanxi Province of China under Grant 2017JM6050.

Author information

Authors and Affiliations

School of Electronic Engineering, Xidian University, Xi’an, 710071, China
Huibing Dai, Lihuo He, Xinbo Gao, Zhaoqi Guo & Wen Lu

Corresponding author

Correspondence to Lihuo He.

Editor information

Editors and Affiliations

Dalian University of Technology, Dalian, China
Yi Sun
Dalian University of Technology, Dalian, China
Huchuan Lu
Dalian University of Technology, Dalian, China
Lihe Zhang
Nanjing University of Science and Technology, Nanjing, China
Jian Yang
Beijing Institute of Technology, Beijing, China
Hua Huang

About this paper

Cite this paper

Dai, H., He, L., Gao, X., Guo, Z., Lu, W. (2017). Cascade Error-Correction Mechanism for Human Pose Estimation in Videos. In: Sun, Y., Lu, H., Zhang, L., Yang, J., Huang, H. (eds) Intelligence Science and Big Data Engineering. IScIDE 2017. Lecture Notes in Computer Science, vol 10559. Springer, Cham. https://doi.org/10.1007/978-3-319-67777-4_25

DOI: https://doi.org/10.1007/978-3-319-67777-4_25
Published: 14 September 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67776-7
Online ISBN: 978-3-319-67777-4
eBook Packages: Computer Science, Computer Science (R0)
