ABSTRACT
This paper presents a review and comparative study of recent multi-view 2D and 3D approaches for human action recognition. The approaches are reviewed and categorized according to their nature. We report a comparison of the most promising methods using two publicly available datasets: the INRIA Xmas Motion Acquisition Sequences (IXMAS) and the i3DPost Multi-View Human Action and Interaction Dataset. Additionally, we discuss some of the shortcomings of multi-view camera setups and outline our thoughts on future directions of 3D human action recognition.
Index Terms
- Human action recognition using multiple views: a comparative perspective on recent developments