Arm Poses Modeling for Pedestrians with Motion Prior

Journal of Signal Processing Systems

Abstract

Human arm pose plays an important role in understanding human behaviour. It directly conveys information about a person's identity, action style, manner of interaction, personal habits, and so on. However, the highly dynamic motion of the arm, especially of the forearms and hands, makes accurate arm modeling challenging. To address this problem in a specific application, modeling the arm pose of pedestrians, this paper adopts multiple priors to reduce the uncertainty of the arm parts. First, human structure information, i.e. a prior on the sizes of the arm parts, is used to rule out impossible arm configurations. Second, a prior on the arm-part configuration for a specific action is used to constrain the co-occurrence relations among all arm components. A Bayesian approach is then presented that combines these priors with the likelihoods obtained from visual observation. The paper proposes an arm model in which the priors can be represented easily, and describes how the priors are estimated from the CMU motion dataset by kernel density estimation and how the parameters of the arm parts are obtained by maximum a posteriori estimation. Because priors for the walking style are available, the method can be applied directly to arm pose modeling of pedestrians without pre-training. It is found to perform effectively on an HKU campus test dataset and has also been evaluated on different human sizes and lighting conditions.
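
The combination of a kernel density prior with an observation likelihood described in the abstract can be illustrated with a minimal one-dimensional sketch. It is not the paper's arm model: the single elbow-angle parameter, the synthetic prior samples (standing in for motion-capture data), the Gaussian likelihood, the bandwidth, and the grid search are all assumptions made for illustration.

```python
import math
import random


def gaussian_kde(samples, bandwidth):
    """Return a 1-D Gaussian kernel density estimate p(x) built from samples."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def gaussian_likelihood(observed, sigma):
    """Return an unnormalised likelihood p(observation | x) with Gaussian noise."""

    def likelihood(x):
        return math.exp(-0.5 * ((observed - x) / sigma) ** 2)

    return likelihood


def map_estimate(prior, likelihood, candidates):
    """Pick the candidate that maximises prior(x) * likelihood(x)."""
    return max(candidates, key=lambda x: prior(x) * likelihood(x))


# Hypothetical stand-in for elbow angles (degrees) seen in walking sequences.
# Real use would draw these from motion-capture data instead.
rng = random.Random(0)
prior_samples = [rng.gauss(150.0, 10.0) for _ in range(200)]
prior = gaussian_kde(prior_samples, bandwidth=5.0)

# Made-up noisy image measurement of the same angle.
observed_angle = 110.0
likelihood = gaussian_likelihood(observed_angle, sigma=20.0)

# Exhaustive search over a coarse grid of candidate angles.
candidates = [float(a) for a in range(0, 181)]
theta_map = map_estimate(prior, likelihood, candidates)
print(theta_map)
```

In this toy setting the estimate is pulled away from the noisy measurement towards the angles that the prior considers typical, which is the effect the priors are meant to have on uncertain arm parts.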

References

  1. Carnegie Mellon University motion capture database. http://mocap.cs.cmu.edu/.

  2. 2D articulated human pose estimation software v1.21. http://www.vision.ee.ethz.ch/calvin/, 2011.

  3. Full code for training and testing, including buffy, parse, and inria image benchmarks. http://phoenix.ics.uci.edu/software/pose/, 2012.

  4. Balan, A.O., Sigal, L., Black, M.J., Davis, J.E., & Haussecker, H.W. (2007). Detailed human shape and pose from images. In Proc. Comput. Vis. Pattern Recog., pages 1–8. IEEE.

  5. Buehler, P., Everingham, M., Huttenlocher, D.P., & Zisserman, A. (2008). Long term arm and hand tracking for continuous sign language TV broadcasts. In Proc. British Mach. Vis. Conf., volume 1281.

  6. Cherian, A., Mairal, J., Alahari, K., & Schmid, C. (2014). Mixing body-part sequences for human pose estimation. In Proc. IEEE Conference on Computer Vision and Pattern Recognition.

  7. Conaire, C.O., O’Connor, N.E., & Smeaton, A.F. (2007). Detector adaptation by maximising agreement between independent data sources. In Proc. Comput. Vis. Pattern Recog., pages 1–6. IEEE.

  8. Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection. In Proc. Comput. Vis. Pattern Recog., volume 1, pages 886–893. IEEE.

  9. del Rincon, J.M., Makris, D., Uruñuela, C.O., & Nebel, J.-C. (2011). Tracking human position and lower body parts using Kalman and particle filters constrained by human biomechanics. IEEE Transactions on Systems, Man, and Cybernetics - Part B: Cybernetics, 41(1), 26–37.

  10. Deutscher, J., Blake, A., & Reid, I. (2000). Articulated body motion capture by annealed particle filtering. In Proc. Comput. Vis. Pattern Recog., volume 2, pages 126–133. IEEE.

  11. Doucet, A., Godsill, S., & Andrieu, C. (2000). On sequential Monte Carlo sampling methods for Bayesian filtering. Statistics and Computing, 10(3), 197–208.

  12. Efron, B. (2013). Bayes’ theorem in the 21st century. Science, 340(6137), 1177–1178.

  13. Eichner, M., & Ferrari, V. (2009). Better appearance models for pictorial structures. In Proc. British Mach. Vis. Conf.

  14. Eichner, M., Marin-Jimenez, M., Zisserman, A., & Ferrari, V. (2010). Articulated human pose estimation and search in (almost) unconstrained still images. Technical report, ETH Zurich.

  15. Felzenszwalb, P.F., & Huttenlocher, D.P. (2005). Pictorial structures for object recognition. International Journal of Computer Vision, 61(1), 55–79.

  16. Fischler, M.A., & Elschlager, R.A. (1973). The representation and matching of pictorial structures. IEEE Transactions on Computers, 100(1), 67–92.

  17. Forsyth, D.A., & Ponce, J. (2012). Detecting objects in images. In Computer Vision, A Modern Approach, 2nd edition, pages 519–539. Prentice Hall.

  18. Fragkiadaki, K., Hu, H., & Shi, J. (2013). Pose from flow and flow from pose. In Computer Vision and Pattern Recognition (CVPR), 2013 IEEE Conference on, pages 2059–2066. IEEE.

  19. Gall, J., Rosenhahn, B., Brox, T., & Seidel, H.P. (2010). Optimization and filtering for human motion capture. International Journal of Computer Vision, 87(1), 75–92.

  20. Gu, J., Ding, X., Wang, S., & Wu, Y. (2010). Action and gait recognition from recovered 3-d human joints. IEEE Transactions on Systems, Man, and Cybernetics. Part B, Cybernetics, 40(4), 1021–1033.

  21. Hjelmås, E., & Low, B.K. (2001). Face detection: A survey. Computer Vision and Image Understanding, 83(3), 236–274.

  22. Hsu, R., Kageyama, M., Fukui, H., Nakaya, Y., & Harashima, H. (1993). Human arm modeling for analysis/synthesis image coding. In Proc. Robt. Human Commun., pages 352–355. IEEE.

  23. Jiang, F., Zhang, S., Shen, W., Gao, Y., & Zhao, D. (2015). Multi-layered gesture recognition with Kinect. Journal of Machine Learning Research, 16, 227–254.

  24. Ju, S.X., Black, M.J., & Yacoob, Y. (1996). Cardboard people: A parameterized model of articulated image motion. In Proc. Int. Conf. Autom. Face Gesture Recog., pages 38–44. IEEE.

  25. Kersten, D., Mamassian, P., & Yuille, A. (2004). Object perception as Bayesian inference. Annual Review of Psychology, 55, 271–304.

  26. Kreutz-Delgado, K., Long, M., & Seraji, H. (1992). Kinematic analysis of 7-dof manipulators. International Journal of Robotics Research, 11(5), 469–481.

  27. Lawrence, N. (2005). Probabilistic non-linear principal component analysis with Gaussian process latent variable models. Journal of Machine Learning Research, 6, 1783–1816.

  28. Lee, M.W., & Cohen, I. (2004). Proposal maps driven MCMC for estimating human body pose in static images. In Proc. Comput. Vis. Pattern Recog., volume 2, pages II–334. IEEE.

  29. Lewis, J.P., Cordner, M., & Fong, N. (2000). Pose space deformation: a unified approach to shape interpolation and skeleton-driven deformation. In Proc. SIGGRAPH, pages 165–172. ACM/Addison-Wesley.

  30. Martin, D.R., Fowlkes, C.C., & Malik, J. (2004). Learning to detect natural image boundaries using local brightness, color, and texture cues. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(5), 530–549.

  31. Mehrabian, A. (1968). Communication without words. Psychology Today, 2, 53–55.

  32. Mitra, S., & Acharya, T. (2007). Gesture recognition: A survey. IEEE Transactions on Systems, Man, and Cybernetics Part C: Applications and Reviews, 37(3), 311–324.

  33. Moeslund, T.B., & Granum, E. (2003). Modelling and estimating the pose of a human arm. Machine Vision and Applications, 14(4), 237–247.

  34. Moeslund, T.B., Hilton, A., & Kruger, V. (2006). A survey of advances in vision-based human motion capture and analysis. Computer Vision and Image Understanding, 104(2-3), 90–126.

  35. Ouyang, W., Chu, X., & Wang, X. (2014). Multi-source deep learning for human pose estimation.

  36. Pearl, J. (1988). Probabilistic reasoning in intelligent systems: networks of plausible inference. Morgan Kaufmann.

  37. Plankers, R., & Fua, P. (2001). Articulated soft objects for video-based body modeling. In Proc. Int. Conf. Comput. Vis., volume 1, pages 394–401. IEEE.

  38. Ramanan, D. (2007). Learning to parse images of articulated bodies. 19, 1129.

  39. Ramanan, D., & Sminchisescu, C. Training deformable models for localization. In Proc. Comput. Vis. Pattern Recog., volume 1, pages 206–213. IEEE.

  40. Salti, S., Schreer, O., & Di Stefano, L. (2008). Real-time 3d arm pose estimation from monocular video for enhanced hci. In Proc. Vis. Networks Behav. Anal., pages 1–8. ACM.

  41. Sapp, B., & Taskar, B. (2013). Multimodal decomposable models for human pose estimation. In Proc. CVPR.

  42. Shotton, J., Fitzgibbon, A., Cook, M., Sharp, T., Finocchio, M., Moore, R., Kipman, A., & Blake, A. (2011). Real-time human pose recognition in parts from single depth images. In Proc. Comput. Vis. Pattern Recog., 1297–1304.

  43. Sidenbladh, H., Black, M.J., & Fleet, D.J. (2000). Stochastic tracking of 3d human figures using 2d image motion. In Proc. Eur. Conf. Comput. Vis.

  44. Sminchisescu, C., & Triggs, B. (2001). Covariance scaled sampling for monocular 3d body tracking. In Proc. Comput. Vis. Pattern Recog., volume 1, pages I–447. IEEE.

  45. Tian, T.P., Li, R., & Sclaroff, S. (2005). Tracking human body pose on a learned smooth space. Technical report, Boston University Computer Science Department.

  46. Tilley, A.R., & Associates, H.D. (2002). The measure of man and woman: human factors in design. Wiley.

  47. Vaswani, N., Roy-Chowdhury, A.K., & Chellappa, R. (2005). Shape activity: A continuous-state HMM for moving/deforming shapes with application to abnormal activity detection. IEEE Transactions on Image Processing, 14(10), 1603–1616.

  48. Wang, L., & Yung, N.H.C. (2010). Extraction of moving objects from their background based on multiple adaptive thresholds and boundary evaluation. IEEE Transactions on Intelligent Transportation Systems, 11(1), 40–51.

  49. Wang, L., & Yung, N.H.C. (2011). Bayesian 3d model based human detection in crowded scenes using efficient optimization. In Proc. Appl. Comput. Vis., pages 557–563. IEEE.

  50. Yang, Y., & Ramanan, D. (2011). Articulated pose estimation with flexible mixtures-of-parts. In Proc. Comput. Vis. Pattern Recog., pages 1385–1392. IEEE.

  51. Zuffi, S., Romero, J., Schmid, C., & Black, M.J. (2013). Estimating human pose with flowing puppets. In Computer Vision (ICCV), 2013 IEEE International Conference on, pages 3312–3319. IEEE.

Author information

Corresponding author

Correspondence to Chongguo Li.

About this article

Cite this article

Li, C., Yung, N.H.C. Arm Poses Modeling for Pedestrians with Motion Prior. J Sign Process Syst 84, 237–249 (2016). https://doi.org/10.1007/s11265-015-1049-6

  • DOI: https://doi.org/10.1007/s11265-015-1049-6
