Learning Generative Models for Monocular Body Pose Estimation

Jaeggli, Tobias; Koller-Meier, Esther; Van Gool, Luc

doi:10.1007/978-3-540-76386-4_57

Tobias Jaeggli¹, Esther Koller-Meier¹ & Luc Van Gool¹,²

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4843)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

We consider the problem of monocular 3D body pose tracking from video sequences. This task is inherently ambiguous. We propose to learn a generative model of the relationship between body pose and image appearance using a sparse kernel regressor. Within a particle filtering framework, the potentially multimodal posterior probability distributions can then be inferred. The 2D bounding box location of the person in the image is estimated along with the body pose. Body poses are modelled on a low-dimensional manifold obtained by LLE dimensionality reduction. In addition to the appearance model, we learn a prior model of likely body poses and a nonlinear dynamical model, which makes both pose and bounding box estimation more robust. The approach is evaluated on a number of challenging video sequences and is shown to cope with low-resolution images and noise.
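
To make the particle filtering idea in the abstract concrete, the following is a minimal sketch of a generic bootstrap particle filter on a toy one-dimensional state. It is not the authors' model: the random-walk dynamics, the Gaussian observation likelihood, and the names particle_filter_step and run_demo are all assumptions chosen for illustration. In the paper's setting the state would be a low-dimensional pose plus a bounding box, and the likelihood would come from the learned appearance model.

```python
import math
import random


def particle_filter_step(particles, weights, observation, rng,
                         dynamics_noise=0.1, obs_noise=0.5):
    """One resample/predict/update cycle of a bootstrap particle filter.

    Toy assumptions (not from the paper): scalar state, random-walk
    dynamics, Gaussian observation noise.
    """
    n = len(particles)
    # Resample: draw particles in proportion to their current weights.
    resampled = rng.choices(particles, weights=weights, k=n)
    # Predict: propagate each particle through the (toy) dynamical model.
    predicted = [p + rng.gauss(0.0, dynamics_noise) for p in resampled]
    # Update: weight each particle by the observation likelihood,
    # computed in log space and shifted by the maximum for stability.
    log_lik = [-0.5 * ((observation - p) / obs_noise) ** 2 for p in predicted]
    peak = max(log_lik)
    unnormalised = [math.exp(value - peak) for value in log_lik]
    total = sum(unnormalised)
    new_weights = [u / total for u in unnormalised]
    return predicted, new_weights


def run_demo(seed=0, n_particles=500, n_steps=30):
    """Track a slowly drifting scalar from noisy observations."""
    rng = random.Random(seed)
    true_state = 0.0
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    weights = [1.0 / n_particles] * n_particles
    estimates, truths = [], []
    for _ in range(n_steps):
        true_state += 0.05
        observation = true_state + rng.gauss(0.0, 0.5)
        particles, weights = particle_filter_step(
            particles, weights, observation, rng)
        # Posterior mean estimate; a multimodal posterior would need
        # mode finding rather than a single weighted average.
        estimates.append(sum(w * p for w, p in zip(weights, particles)))
        truths.append(true_state)
    return estimates, truths, weights


if __name__ == "__main__":
    estimates, truths, _ = run_demo()
    print(f"final estimate {estimates[-1]:.3f}, true state {truths[-1]:.3f}")
```

Because the posterior is represented by a weighted set of samples rather than a single Gaussian, this kind of filter can carry several competing hypotheses at once, which is the property the abstract relies on for ambiguous monocular poses.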

Author information

Authors and Affiliations

ETH Zurich, D-ITET/BIWI, CH-8092 Zurich
Tobias Jaeggli, Esther Koller-Meier & Luc Van Gool
Katholieke Universiteit Leuven, ESAT/VISICS, B-3001 Leuven
Luc Van Gool

Editor information

Yasushi Yagi, Sing Bing Kang, In So Kweon, Hongbin Zha

About this paper

Cite this paper

Jaeggli, T., Koller-Meier, E., Van Gool, L. (2007). Learning Generative Models for Monocular Body Pose Estimation. In: Yagi, Y., Kang, S.B., Kweon, I.S., Zha, H. (eds) Computer Vision – ACCV 2007. ACCV 2007. Lecture Notes in Computer Science, vol 4843. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76386-4_57

DOI: https://doi.org/10.1007/978-3-540-76386-4_57
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-76385-7
Online ISBN: 978-3-540-76386-4
eBook Packages: Computer Science, Computer Science (R0)