Survey on classifying human actions through visual sensors

Del Rose, Michael S.; Wagner, Christian C.

doi:10.1007/s10462-011-9232-z

Published: 04 May 2011

Volume 37, pages 301–311, (2012)

Artificial Intelligence Review

Michael S. Del Rose¹ &
Christian C. Wagner²

Abstract

The ability to predict the intentions of people based solely on their visual actions is a skill only performed by humans and animals. This requires segmentation of items in the field of view, tracking of moving objects, identifying the importance of each object, determining the current role of each important object individually and in collaboration with other objects, relating these objects into a predefined scenario, assessing the selected scenario with the information retrieved, and finally adjusting the scenario to better fit the data. This is all accomplished with great accuracy in less than a few seconds. The intelligence of current computer algorithms has not reached this level of complexity with the accuracy and time constraints that humans and animals have, but there are several research efforts that are working towards this by identifying new algorithms for solving parts of this problem. This survey paper lists several of these efforts that rely mainly on understanding the image processing and classification of a limited number of actions. It divides the activities up into several groups and ends with a discussion of future needs.

Author information

Authors and Affiliations

US Army Tank Automotive Research, Development, and Engineering Center (TARDEC), Warren, MI, USA
Michael S. Del Rose
Oakland University, Rochester Hills, MI, USA
Christian C. Wagner

Corresponding author

Correspondence to Michael S. Del Rose.

About this article

Cite this article

Del Rose, M.S., Wagner, C.C. Survey on classifying human actions through visual sensors. Artif Intell Rev 37, 301–311 (2012). https://doi.org/10.1007/s10462-011-9232-z

Published: 04 May 2011
Issue Date: April 2012
DOI: https://doi.org/10.1007/s10462-011-9232-z
