3D human pose estimation from image using couple sparse coding

Zolfaghari, Mohammadreza; Jourabloo, Amin; Gozlou, Samira Ghareh; Pedrood, Bahman; Manzuri-Shalmani, Mohammad T.

doi:10.1007/s00138-014-0613-6

Original Paper
Published: 30 April 2014

Machine Vision and Applications, Volume 25, pages 1489–1499 (2014)

Mohammadreza Zolfaghari¹, Amin Jourabloo¹, Samira Ghareh Gozlou², Bahman Pedrood¹ & Mohammad T. Manzuri-Shalmani¹

Abstract

Recent studies have demonstrated that high-level semantics in data can be captured using sparse representation. In this paper, we propose an approach to human body pose estimation in static images based on sparse representation. Given a visual input, the objective is to estimate 3D human body pose using feature space information and geometrical information of the pose space. On the assumption that each data point and its neighbors are likely to reside on a locally linear patch of the underlying manifold, our method learns the sparse representation of the new input using both feature and pose space information and then estimates the corresponding 3D pose by a linear combination of the bases of the pose dictionary. Two strategies for dictionary construction are presented: (i) constructing the dictionary by randomly selecting the frames of a sequence and (ii) selecting specific frames of a sequence as dictionary atoms. We analyzed the effect of each strategy on the accuracy of pose estimation. Extensive experiments on datasets of various human activities show that our proposed method outperforms state-of-the-art methods.
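
The idea of estimating a pose as a linear combination of pose-dictionary atoms can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract says their coding step uses both feature and pose space information, whereas the sketch below codes the input against the feature dictionary only, using a generic orthogonal matching pursuit as a stand-in solver. The names `select_random_frames`, `omp`, `estimate_pose`, `D_feat`, and `D_pose` are hypothetical, and the sketch assumes that column k of both dictionaries comes from the same training frame.

```python
import numpy as np


def select_random_frames(n_frames, n_atoms, rng):
    """Strategy (i) from the abstract: choose dictionary atoms by sampling
    frame indices of a sequence at random, without replacement."""
    return np.sort(rng.choice(n_frames, size=n_atoms, replace=False))


def omp(D, x, n_nonzero):
    """Generic orthogonal matching pursuit (assumed stand-in solver).

    Approximates x using at most n_nonzero columns of D and returns the
    sparse coefficient vector alpha with D @ alpha ~= x.
    """
    x = np.asarray(x, dtype=float)
    norms = np.linalg.norm(D, axis=0)
    norms[norms == 0] = 1.0          # avoid division by zero for empty atoms
    residual = x.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        corr = np.abs(D.T @ residual) / norms
        if support:
            corr[support] = -np.inf  # never select the same atom twice
        support.append(int(np.argmax(corr)))
        # Refit all selected atoms jointly by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
        if np.linalg.norm(residual) < 1e-10:
            break
    alpha = np.zeros(D.shape[1])
    alpha[support] = coef
    return alpha


def estimate_pose(D_feat, D_pose, x, n_nonzero=5):
    """Code the feature vector x over D_feat, then reuse the same sparse
    coefficients on the paired pose dictionary D_pose."""
    alpha = omp(D_feat, x, n_nonzero)
    return D_pose @ alpha, alpha
```

In this sketch, a dictionary pair would be built by taking the feature and pose vectors of the frames returned by `select_random_frames` as the columns of `D_feat` and `D_pose`. Strategy (ii), selecting specific frames, would only change how the indices are chosen.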

Notes

BVH is a file format created by Biovision for describing 3D pose in animation production. http://www.cs.wisc.edu/graphics/Courses/cs-838-1999/Jeff/BVH.html
http://www.poser.com

Author information

Authors and Affiliations

Sharif University of Technology, Tehran, Iran
Mohammadreza Zolfaghari, Amin Jourabloo, Bahman Pedrood & Mohammad T. Manzuri-Shalmani
Amirkabir University of Technology, Tehran, Iran
Samira Ghareh Gozlou

Corresponding author

Correspondence to Mohammadreza Zolfaghari.

About this article

Cite this article

Zolfaghari, M., Jourabloo, A., Gozlou, S.G. et al. 3D human pose estimation from image using couple sparse coding. Machine Vision and Applications 25, 1489–1499 (2014). https://doi.org/10.1007/s00138-014-0613-6

Received: 26 August 2013
Revised: 26 January 2014
Accepted: 23 March 2014
Published: 30 April 2014
Issue Date: August 2014
DOI: https://doi.org/10.1007/s00138-014-0613-6
