DOI: 10.1145/2072572.2072588
research-article

Human action recognition using multiple views: a comparative perspective on recent developments

Published: 01 December 2011

ABSTRACT

This paper presents a review and comparative study of recent multi-view 2D and 3D approaches for human action recognition. The approaches are reviewed and categorized according to their nature. We report a comparison of the most promising methods using two publicly available datasets: the INRIA Xmas Motion Acquisition Sequences (IXMAS) and the i3DPost Multi-View Human Action and Interaction Dataset. Additionally, we discuss some of the shortcomings of multi-view camera setups and outline our thoughts on future directions of 3D human action recognition.


Published in: J-HGBU '11: Proceedings of the 2011 joint ACM workshop on Human gesture and behavior understanding
December 2011, 46 pages
ISBN: 9781450309981
DOI: 10.1145/2072572

Copyright © 2011 ACM


Publisher: Association for Computing Machinery, New York, NY, United States
