Human action recognition using multiple views: a comparative perspective on recent developments

Authors:
Michael B. Holte

Aalborg University, Aalborg, Denmark

Cuong Tran

University of California, San Diego, San Diego, CA, USA

Mohan M. Trivedi

University of California, San Diego, San Diego, CA, USA

Thomas B. Moeslund

Aalborg University, Aalborg, Denmark


J-HGBU '11: Proceedings of the 2011 joint ACM workshop on Human gesture and behavior understanding, December 2011, Pages 47–52. https://doi.org/10.1145/2072572.2072588

Published: 01 December 2011

ABSTRACT

This paper presents a review and comparative study of recent multi-view 2D and 3D approaches for human action recognition. The approaches are reviewed and categorized according to their nature. We report a comparison of the most promising methods using two publicly available datasets: the INRIA Xmas Motion Acquisition Sequences (IXMAS) and the i3DPost Multi-View Human Action and Interaction Dataset. Additionally, we discuss some of the shortcomings of multi-view camera setups and outline our thoughts on future directions of 3D human action recognition.


Published in
J-HGBU '11: Proceedings of the 2011 joint ACM workshop on Human gesture and behavior understanding
December 2011
46 pages
ISBN:9781450309981
DOI:10.1145/2072572
General Chairs:
Rita Cucchiara (MA3HO'11: Univ. of Modena and Reggio Emilia, Italy),
Maja Pantic (SSPW'11: Imperial College London, UK / University of Twente, NL)
Program Chairs:
Mohamed Daoudi (MA3HO'11: TELECOM Lille1/LIFL, France),
Alberto Del Bimbo (MA3HO'11: Università di Firenze, Italy),
Alex Pentland (SSPW'11: MIT, USA),
Alessandro Vinciarelli (SSPW'11: University of Glasgow, UK / IDIAP, CH)
Copyright © 2011 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
3-dimensional
IXMAS
comparative study
human action recognition
i3DPost
multi-view
survey
view-invariance
Qualifiers
- research-article