Estimating Human Pose from Occluded Images

Huang, Jia-Bin; Yang, Ming-Hsuan

doi:10.1007/978-3-642-12307-8_5

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 5994)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

We address the problem of recovering 3D human pose from a single 2D image, formulating pose estimation as a direct nonlinear regression from image observations to 3D joint positions. One key issue that has not been addressed in the literature is how to estimate 3D pose when humans in the scene are partially or heavily occluded. When occlusions occur, features extracted from image observations (e.g., silhouette-based shape features, histograms of oriented gradients, etc.) are seriously corrupted, and consequently the regressor (trained on un-occluded images) is unable to estimate pose states correctly. In this paper, we present a method that handles occlusions using sparse signal representations, in which each test sample is represented as a compact linear combination of training samples. The sparsest solution can then be obtained efficiently by solving a convex optimization problem with certain norms (such as the ℓ₁-norm). The corrupted test image can be recovered as a sparse linear combination of un-occluded training images, which can then be used to estimate human pose correctly (as if no occlusions existed). We also show that the proposed approach implicitly performs relevant feature selection with un-occluded test images. Experimental results on synthetic and real data sets bear out our theory that, with sparse representation, 3D human pose can be robustly estimated when humans are partially or heavily occluded in the scene.
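
The recovery step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation; it only assumes what the abstract states, namely that a test feature vector y is written as a sparse combination of un-occluded training features (the columns of A) by ℓ₁ minimization. Two choices here are assumptions made for illustration: the occlusion is modeled as a sparse additive error through an identity block appended to A, and the convex problem is solved with plain iterative soft-thresholding (ISTA) rather than a dedicated solver. The names recover_unoccluded, ista_lasso, lam and n_iter are hypothetical, and the data are synthetic.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (element-wise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def ista_lasso(D, y, lam=0.01, n_iter=2000):
    """Minimize 0.5 * ||y - D w||_2^2 + lam * ||w||_1 with ISTA."""
    # Step size 1/L, where L is the squared spectral norm of D.
    step = 1.0 / np.linalg.norm(D, 2) ** 2
    w = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ w - y)
        w = soft_threshold(w - step * grad, step * lam)
    return w


def recover_unoccluded(A, y, lam=0.01, n_iter=2000):
    """Split y into a training-sample part A @ x and a sparse error e.

    Assumption for this sketch: occlusion acts as a sparse additive
    error, so the dictionary is [A, I] and both x and e are sparse.
    """
    d, n = A.shape
    D = np.hstack([A, np.eye(d)])
    w = ista_lasso(D, y, lam, n_iter)
    x, e = w[:n], w[n:]
    return A @ x, x, e


def demo():
    """Synthetic check: corrupt a feature vector, then recover it."""
    rng = np.random.default_rng(0)
    d, n = 100, 40
    A = rng.standard_normal((d, n))
    A /= np.linalg.norm(A, axis=0)           # unit-norm training features
    x_true = np.zeros(n)
    x_true[[3, 17, 29]] = [1.0, -0.8, 0.6]   # sparse ground-truth coefficients
    clean = A @ x_true
    y = clean.copy()
    occluded = rng.choice(d, size=10, replace=False)
    y[occluded] += rng.choice([-1.0, 1.0], size=10)  # simulated occlusion
    recovered, _, _ = recover_unoccluded(A, y)
    return np.linalg.norm(y - clean), np.linalg.norm(recovered - clean)


if __name__ == "__main__":
    err_occluded, err_recovered = demo()
    print("error of occluded features: ", err_occluded)
    print("error of recovered features:", err_recovered)
```

In a full pipeline the recovered vector A @ x, rather than the corrupted y, would be passed to a regressor trained on un-occluded features; that regression stage is left out of the sketch.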

Author information

Authors and Affiliations

Electrical Engineering and Computer Science, University of California at Merced
Jia-Bin Huang & Ming-Hsuan Yang

Editor information

Editors and Affiliations

Department of Machine Intelligence, Peking University, 100871, Beijing, China
Hongbin Zha
Department of Advanced Information Technology, Kyushu University, 819-0395, Fukuoka, Japan
Rin-ichiro Taniguchi
Birkbeck College, Department of Computer Science, University of London, WC1E 7HX, London, UK
Stephen Maybank

About this paper

Cite this paper

Huang, JB., Yang, MH. (2010). Estimating Human Pose from Occluded Images. In: Zha, H., Taniguchi, Ri., Maybank, S. (eds) Computer Vision – ACCV 2009. ACCV 2009. Lecture Notes in Computer Science, vol 5994. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12307-8_5

DOI: https://doi.org/10.1007/978-3-642-12307-8_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12306-1
Online ISBN: 978-3-642-12307-8
eBook Packages: Computer Science, Computer Science (R0)
