A Local Basis Representation for Estimating Human Pose from Cluttered Images

Agarwal, Ankur; Triggs, Bill

doi:10.1007/11612032_6

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 3851)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

Recovering the pose of a person from a single image is a challenging problem. This paper describes a bottom-up approach that uses local image features to estimate human upper body pose from single images with cluttered backgrounds. The method encodes the image window as a dense grid of local gradient orientation histograms, then applies non-negative matrix factorization to learn a set of bases that correspond to local features on the human body, enabling selective encoding of human-like features in the presence of background clutter. Pose is then recovered by direct regression. This approach allows us to key on gradient patterns such as shoulder contours and bent elbows that are characteristic of humans and carry important pose information, unlike current regression-based methods, which either use weak limb detectors or require prior segmentation to work. The system is trained on a database of images with labelled poses. We show that it estimates pose with performance similar to that of current example-based methods but, unlike them, works in the presence of natural backgrounds without any prior segmentation.
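
The three stages named in the abstract (dense gradient orientation histograms, non-negative matrix factorization, direct regression) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration, not the authors' implementation: the 8-pixel cells and 9 orientation bins, the use of Lee-Seung multiplicative updates for the factorization, ridge regression as the regressor, the function names, and the random synthetic "images" and "poses" that stand in for a labelled training set.

```python
import numpy as np

def orientation_histograms(image, cell=8, bins=9):
    """Dense grid of gradient orientation histograms, one per cell (simplified)."""
    gy, gx = np.gradient(image.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)            # unsigned orientation in [0, pi)
    idx = np.minimum((ang / np.pi * bins).astype(int), bins - 1)
    rows, cols = image.shape[0] // cell, image.shape[1] // cell
    feats = np.zeros((rows, cols, bins))
    for r in range(rows):
        for c in range(cols):
            sl = (slice(r * cell, (r + 1) * cell), slice(c * cell, (c + 1) * cell))
            # Magnitude-weighted vote of each pixel into its orientation bin.
            feats[r, c] = np.bincount(idx[sl].ravel(), weights=mag[sl].ravel(),
                                      minlength=bins)
    return feats.reshape(rows * cols, bins)            # non-negative by construction

def nmf(V, k, iters=200, seed=0, eps=1e-9):
    """Multiplicative updates for V ~= W @ H with W >= 0 and H >= 0."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], k)) + eps              # per-sample coefficients
    H = rng.random((k, V.shape[1])) + eps              # learned basis vectors (rows)
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def encode(V, H, iters=100, eps=1e-9):
    """Non-negative coefficients of V on a fixed basis H."""
    W = np.ones((V.shape[0], H.shape[0]))
    for _ in range(iters):
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W

def ridge_fit(X, Y, lam=1e-2):
    """Linear map A minimizing ||X A - Y||^2 + lam ||A||^2."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ Y)

# Toy run on synthetic data (random images and random 3-D "pose" vectors).
rng = np.random.default_rng(0)
images = rng.random((20, 32, 32))
poses = rng.random((20, 3))

cells = [orientation_histograms(im) for im in images]  # each of shape (16, 9)
V = np.vstack(cells)                                    # all cell histograms, (320, 9)
W, H = nmf(V, k=4)                                      # learn 4 local bases

X = np.stack([encode(c, H).ravel() for c in cells])     # image features, (20, 64)
A = ridge_fit(X, poses)                                 # regression matrix, (64, 3)
pred = X @ A                                            # predicted poses, (20, 3)
```

Because both the basis and the coefficients are constrained to be non-negative, a basis vector can only add gradient energy to the reconstruction, never cancel it. This is the property that makes a parts-like, selective encoding plausible in clutter. With random inputs the predicted poses are of course meaningless; the sketch only shows how the stages connect.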

Author information

Authors and Affiliations

GRAVIR-INRIA-CNRS, 655 avenue de l’Europe, Montbonnot, 38330, France
Ankur Agarwal & Bill Triggs

Editor information

Editors and Affiliations

International Institute of Information Technology, Center for Visual Information Technology, Hyderabad, India
P. J. Narayanan
Department of Computer Science, Columbia University, 500 West 120th Street, 10027, New York, NY, USA
Shree K. Nayar
Microsoft Research Asia, Beijing, P.R. China
Heung-Yeung Shum

About this paper

Cite this paper

Agarwal, A., Triggs, B. (2006). A Local Basis Representation for Estimating Human Pose from Cluttered Images. In: Narayanan, P.J., Nayar, S.K., Shum, HY. (eds) Computer Vision – ACCV 2006. ACCV 2006. Lecture Notes in Computer Science, vol 3851. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11612032_6

DOI: https://doi.org/10.1007/11612032_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-31219-2
Online ISBN: 978-3-540-32433-1
eBook Packages: Computer Science (R0)
