Abstract
In exemplar-based approaches to human pose estimation, it is common to extract multiple features to better describe the visual input. However, simply concatenating multi-view features into one long vector has two shortcomings: (1) it suffers from the “curse of dimensionality”; (2) it is not physically meaningful and may fail to fully exploit the complementary properties of the multi-view features. To address these problems, we present a dimension reduction method based on supervised spectral embedding, followed by an ensemble of nearest-neighbor regressions in the multi-view feature space, to infer 3D human poses from monocular videos. Experiments on the HumanEva dataset show the effectiveness of the proposed method.
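As a rough illustration of the final regression stage only, the sketch below averages k-nearest-neighbor pose predictions computed independently in each feature view. It is not the authors' implementation: the function names, the Euclidean distance, the uniform averaging across views, and the choice of k are all assumptions made for illustration, and the supervised spectral embedding step is omitted entirely.

```python
import numpy as np

def knn_regress(train_x, train_y, query_x, k=3):
    """Predict each query pose as the mean pose of its k nearest training samples."""
    # Pairwise squared Euclidean distances, shape (n_query, n_train).
    d2 = ((query_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    # Indices of the k closest training samples for each query.
    nn = np.argsort(d2, axis=1)[:, :k]
    # Average the neighbours' poses: result has shape (n_query, pose_dim).
    return train_y[nn].mean(axis=1)

def ensemble_knn_regress(train_views, train_y, query_views, k=3):
    """Average per-view k-NN predictions (uniform weights are an assumption)."""
    preds = [
        knn_regress(tv, train_y, qv, k)
        for tv, qv in zip(train_views, query_views)
    ]
    return np.mean(preds, axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    poses = rng.normal(size=(50, 6))            # hypothetical pose vectors
    views = [rng.normal(size=(50, 4)),          # hypothetical feature view 1
             rng.normal(size=(50, 8))]          # hypothetical feature view 2
    queries = [v[:5] for v in views]            # reuse a few samples as queries
    print(ensemble_knn_regress(views, poses, queries, k=3).shape)
```

Keeping each view in its own regressor avoids concatenating the features into one long vector, which is the shortcoming the abstract describes; how the per-view predictions should be weighted is left open here.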
Notes
1. We do not use Poisson features because they are very time-consuming to compute.
Acknowledgments
Zhonggui Chen was partially supported by the Fundamental Research Funds for the Central Universities (No. 20720140520).
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Guo, Y., Chen, Z., Yu, J. (2015). Supervised Spectral Embedding for Human Pose Estimation. In: He, X., et al. Intelligence Science and Big Data Engineering. Image and Video Data Engineering. IScIDE 2015. Lecture Notes in Computer Science, vol 9242. Springer, Cham. https://doi.org/10.1007/978-3-319-23989-7_11
DOI: https://doi.org/10.1007/978-3-319-23989-7_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23987-3
Online ISBN: 978-3-319-23989-7
eBook Packages: Computer Science, Computer Science (R0)