Abstract
We present a method to simultaneously estimate 3D body pose and action categories from monocular video sequences. Our approach learns a low-dimensional embedding of the pose manifolds using Locally Linear Embedding (LLE), as well as the statistical relationship between body poses and their image appearance. In addition, the dynamics on these pose manifolds are modelled. Sparse kernel regressors capture the nonlinearities of these mappings efficiently. Body poses are inferred by a recursive Bayesian sampling algorithm with an activity-switching mechanism based on learned transfer functions. Using a rough foreground segmentation, we compare Binary PCA and distance transforms as encodings of the appearance. As a postprocessing step, the globally optimal trajectory through the entire sequence is estimated, yielding a single pose estimate per frame that is consistent throughout the sequence. We evaluate the algorithm on challenging sequences in which subjects alternate between running and walking. Our experiments show that the dynamical model helps to track through poorly segmented, low-resolution image sequences where tracking otherwise fails, while at the same time reliably classifying the activity type.
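The abstract does not say how the globally optimal trajectory is computed. One common way to pick a single consistent sample per frame is Viterbi-style dynamic programming over the per-frame candidates, and the following is a minimal sketch under that assumption, not the authors' implementation. The names `best_trajectory` and `transition_logprob`, the one-dimensional pose coordinate, and the values of `sigma` and `stay_prob` are all hypothetical and chosen only for illustration.

```python
import math


def transition_logprob(prev, nxt, sigma=0.5, stay_prob=0.9):
    """Hypothetical log-probability of moving between two candidate states.

    A state is (activity_label, pose_coordinate). Smooth pose changes are
    favoured by a Gaussian term; changing activity costs log(1 - stay_prob).
    """
    (prev_act, prev_x), (next_act, next_x) = prev, nxt
    smooth = -0.5 * ((next_x - prev_x) / sigma) ** 2
    switch = math.log(stay_prob if prev_act == next_act else 1.0 - stay_prob)
    return smooth + switch


def best_trajectory(candidates, obs_loglik, trans):
    """Return one candidate index per frame maximising the summed log-score.

    candidates[t][j] is the j-th candidate state in frame t,
    obs_loglik[t][j] is its (assumed given) image log-likelihood,
    trans(prev_state, next_state) is a transition log-probability.
    """
    score = list(obs_loglik[0])   # best score of any path ending in each candidate
    backpointers = []             # per frame: best predecessor of each candidate
    for t in range(1, len(candidates)):
        new_score, pointers = [], []
        for j, state in enumerate(candidates[t]):
            # Score of reaching this candidate from every candidate of frame t-1.
            options = [score[i] + trans(prev_state, state)
                       for i, prev_state in enumerate(candidates[t - 1])]
            best_i = max(range(len(options)), key=options.__getitem__)
            new_score.append(options[best_i] + obs_loglik[t][j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)

    # Backtrack from the best final candidate to the first frame.
    idx = max(range(len(score)), key=score.__getitem__)
    path = [idx]
    for pointers in reversed(backpointers):
        idx = pointers[idx]
        path.append(idx)
    path.reverse()
    return path


# Toy data: two candidates per frame, one per activity.
candidates = [
    [("walk", 0.0), ("run", 2.0)],
    [("walk", 0.1), ("run", 2.1)],
    [("walk", 0.2), ("run", 2.2)],
]
obs_loglik = [
    [0.0, -1.0],
    [-1.0, -0.5],  # the middle frame, taken alone, slightly prefers "run"
    [0.0, -1.0],
]
path = best_trajectory(candidates, obs_loglik, transition_logprob)
print([candidates[t][j] for t, j in enumerate(path)])
```

In this toy example, choosing the most likely candidate in each frame independently would switch to "run" in the middle frame, whereas the dynamic program keeps the "walk" candidate throughout because the jump in pose and the activity switch are both penalised.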
Keywords
- Image Descriptor
- Locally Linear Embedding
- Relevance Vector Machine
- Nonlinear Dimensionality Reduction
- Appearance Descriptor
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jaeggli, T., Koller-Meier, E., Van Gool, L. (2007). Multi-activity Tracking in LLE Body Pose Space. In: Elgammal, A., Rosenhahn, B., Klette, R. (eds) Human Motion – Understanding, Modeling, Capture and Animation. HuMo 2007. Lecture Notes in Computer Science, vol 4814. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75703-0_4
DOI: https://doi.org/10.1007/978-3-540-75703-0_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-75702-3
Online ISBN: 978-3-540-75703-0
eBook Packages: Computer Science, Computer Science (R0)