Abstract
We consider the problem of monocular 3d body pose tracking from video sequences. This task is inherently ambiguous. We propose to learn a generative model of the relationship of body pose and image appearance using a sparse kernel regressor. Within a particle filtering framework, the potentially multimodal posterior probability distributions can then be inferred. The 2d bounding box location of the person in the image is estimated along with its body pose. Body poses are modelled on a low-dimensional manifold, obtained by LLE dimensionality reduction. In addition to the appearance model, we learn a prior model of likely body poses and a nonlinear dynamical model, making both pose and bounding box estimation more robust. The approach is evaluated on a number of challenging video sequences, showing the ability of the approach to deal with low-resolution images and noise.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
Rosales, R., Sclaroff, S.: Learning body pose via specialized maps. In: NIPS (2001)
Thayananthan, A., Navaratnam, R., Stenger, B., Torr, P., Cipolla, R.: Multivariate relevance vector machines for tracking. In: Ninth European Conference on Computer Vision (2006)
Sminchisescu, C., Kanaujia, A., Li, Z., Metaxas, D.: Discriminative density propagation for 3d human motion estimation. In: CVPR (2005)
Agarwal, A., Triggs, B.: Monocular human motion capture with a mixture of regressors. In: CVPR. IEEE Workshop on Vision for Human-Computer Interaction, IEEE Computer Society Press, Los Alamitos (2005)
Sidenbladh, H., Black, M., Fleet, D.: Stochastic tracking of 3d human figures using 2d image motion. In: Vernon, D. (ed.) ECCV 2000. LNCS, vol. 1843, pp. 702–718. Springer, Heidelberg (2000)
Forsyth, D.A., Arikan, O., Ikemoto, L., O’Brien, J.D.R.: Computational studies of human motion: Part 1. Computer Graphics and Vision 1(2/3) (2006)
Moeslund, T.B., Hilton, A., Krüger, V.: A survey of advances in vision-based human motion capture and analysis. Comput. Vis. Image Underst. 104(2), 90–126 (2006)
Tipping, M.: The relevance vector machine. In: NIPS (2000)
Roweis, S., Saul, L.: Nonlinear dimensionality reduction by locally linear embedding. Science 290(5500), 2323–2326 (2000)
Agarwal, A., Triggs, B.: A local basis representation for estimating human pose from cluttered images. In: Narayanan, P.J., Nayar, S.K., Shum, H.-Y. (eds.) ACCV 2006. LNCS, vol. 3852, Springer, Heidelberg (2006)
Elgammal, A., Lee, C.S.: Inferring 3d body pose from silhouettes using activity manifold learning. In: CVPR (2004)
Lim, H., Camps, O.I., Sznaier, M., Morariu, V.I.: Dynamic appearance modeling for human tracking. In: Conference on Computer Vision and Pattern Recognition, pp. 751–757 (2006)
Wang, J.M., Fleet, D.J., Hertzmann, A.: Gaussian process dynamical models. Advances in Neural Information Processing Systems 18, 1441–1448 (2006)
Sminchisescu, C., Jepson, A.: Generative modeling for continuous non-linearly embedded visual inference. In: ICML. International Conference on Machine Learning (2004)
Li, R., Yang, M.H., Sclaroff, S., Tian, T.P.: Monocular tracking of 3d human motion with a coordinated mixture of factor analyzers. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006. LNCS, vol. 3952, pp. 137–150. Springer, Heidelberg (2006)
Zivkovic, Z., Verbeek, J.: Transformation invariant component analysis for binary images. In: CVPR, vol. 1, pp. 254–259 (2006)
Bhattacharyya, A.: On a measure of divergence between two statistical populations defined by their probability distributions. Bull. Calcutta Math Soc. (1943)
Isard, M., Blake, A.: Condensation - conditional density propagation for visual tracking. Int. J. Computer Vision (1998)
Author information
Authors and Affiliations
Editor information
Rights and permissions
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jaeggli, T., Koller-Meier, E., Van Gool, L. (2007). Learning Generative Models for Monocular Body Pose Estimation. In: Yagi, Y., Kang, S.B., Kweon, I.S., Zha, H. (eds) Computer Vision – ACCV 2007. ACCV 2007. Lecture Notes in Computer Science, vol 4843. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76386-4_57
Download citation
DOI: https://doi.org/10.1007/978-3-540-76386-4_57
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-76385-7
Online ISBN: 978-3-540-76386-4
eBook Packages: Computer ScienceComputer Science (R0)