
View-independent action recognition: a hybrid approach

Published in Multimedia Tools and Applications

Abstract

In this paper, we propose a new framework for view-independent action recognition that combines a view-dependent representation with a view-independent one. The view-dependent representation reduces the number of candidate action labels before the view-independent representation is applied. We use the entropy of the silhouette's distance transform as the view-dependent representation, and the self-similarity matrix of the trajectories of feature points distributed uniformly over the human body as the view-independent representation. Experimental results show that the proposed method outperforms recent action recognition approaches despite its low computational cost.
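
To make the two descriptors concrete, the sketch below illustrates, in Python, one plausible way to compute (a) the entropy of a silhouette's distance transform and (b) a self-similarity matrix over tracked feature-point trajectories. The use of OpenCV and NumPy, the function names, and the parameter choices (histogram bins, distance metric, mask size) are assumptions for illustration only, not the authors' implementation.

```python
# Hypothetical sketch of the two descriptors mentioned in the abstract.
# Assumes OpenCV (cv2) and NumPy; not the paper's reference implementation.
import numpy as np
import cv2


def distance_transform_entropy(silhouette, num_bins=32):
    """View-dependent cue: Shannon entropy of the silhouette's distance transform.

    `silhouette` is a binary foreground mask; `num_bins` is an assumed
    histogram resolution, not a value taken from the paper.
    """
    # Distance of every foreground pixel to the nearest background pixel.
    mask = (silhouette > 0).astype(np.uint8)
    dist = cv2.distanceTransform(mask, cv2.DIST_L2, 5)
    values = dist[dist > 0]
    if values.size == 0:
        return 0.0
    hist, _ = np.histogram(values, bins=num_bins)
    p = hist.astype(np.float64) / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())


def trajectory_self_similarity(trajectories):
    """View-independent cue: self-similarity matrix of tracked feature points.

    `trajectories` has shape (T, N, 2): T frames, N feature points, (x, y).
    Entry (i, j) is the mean Euclidean distance between the point
    configurations at frames i and j; such temporal self-similarities are
    comparatively stable under viewpoint change.
    """
    T = trajectories.shape[0]
    ssm = np.zeros((T, T), dtype=np.float64)
    for i in range(T):
        for j in range(i + 1, T):
            d = np.linalg.norm(trajectories[i] - trajectories[j], axis=1).mean()
            ssm[i, j] = ssm[j, i] = d
    return ssm
```

In a pipeline following the abstract, the first descriptor would prune the set of candidate action labels, after which the second, viewpoint-stable descriptor would be matched against the remaining classes.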



Author information

Corresponding author

Correspondence to Mohammad Rahmati.


Cite this article

Hashemi, S.M., Rahmati, M. View-independent action recognition: a hybrid approach. Multimed Tools Appl 75, 6755–6775 (2016). https://doi.org/10.1007/s11042-015-2606-5

