Abstract
This work proposes the volume integral (VI) as a new descriptor for three-dimensional action recognition. The descriptor transforms the actor’s volumetric information into a two-dimensional representation by projecting the voxel data onto a set of planes that maximize the discrimination of actions. It significantly reduces the amount of data in the three-dimensional representation while preserving the most important information. As a consequence, the action recognition process is greatly accelerated while still achieving very high success rates. The proposed method is therefore especially appropriate for applications in which limited computing power and storage are significant concerns, such as real-time applications or mobile devices. Additionally, the descriptor is sensitive to reflected actions, i.e., the same action performed with different limbs can be differentiated. This paper tests the VI with several dimensionality reduction techniques (namely PCA, 2D-PCA and LDA) and different machine learning approaches (namely clustering, SVM and HMM) to determine the best combination for the action recognition task. Experiments conducted on the public IXMAS dataset show that the VI compares favorably with state-of-the-art descriptors in terms of both classification rates and computing times.
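The projection idea can be illustrated with a minimal sketch. It assumes a binary voxel occupancy grid, axis-aligned projection planes, and plain summation along each axis. The abstract does not specify how the planes are chosen or how the integral is computed, so the function name `volume_integrals` and the toy grid are hypothetical and are not the authors' implementation.

```python
import numpy as np

def volume_integrals(voxels):
    """Sum an occupancy grid along each axis, giving three 2D maps.

    Assumption: axis-aligned planes and plain summation; the paper's
    actual plane selection is not described in the abstract.
    """
    v = np.asarray(voxels, dtype=np.float64)
    if v.ndim != 3:
        raise ValueError("expected a 3D occupancy grid")
    return tuple(v.sum(axis=a) for a in range(3))

# Toy actor (hypothetical data): a torso column plus one raised arm on the +x side.
grid = np.zeros((8, 8, 8), dtype=np.uint8)
grid[3:5, 3:5, 0:6] = 1   # torso
grid[5:8, 3:5, 4:6] = 1   # arm extending toward +x

proj_x, proj_y, proj_z = volume_integrals(grid)

# Mirror the pose along x, i.e. the same gesture made with the opposite arm.
mirrored = grid[::-1, :, :]
mirrored_x, mirrored_y, _ = volume_integrals(mirrored)

# Data reduction: voxel count versus total size of the three projections.
print(grid.size, sum(p.size for p in (proj_x, proj_y, proj_z)))
# Projection along the mirror axis cannot tell the two poses apart...
print(np.array_equal(proj_x, mirrored_x))
# ...but a projection that keeps the x coordinate can.
print(np.array_equal(proj_y, mirrored_y))
```

In this toy case the three projections together hold fewer values than the full grid, and at least one of them changes when the pose is mirrored, which is the kind of sensitivity to reflected actions the abstract describes.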
Acknowledgments
This work was developed with the support of the Research Project “TIN2010-18119”, financed by the Science and Technology Ministry of Spain.
About this article
Cite this article
Díaz-Más, L., Muñoz-Salinas, R., Madrid-Cuevas, F.J. et al. Three-dimensional action recognition using volume integrals. Pattern Anal Applic 15, 289–298 (2012). https://doi.org/10.1007/s10044-011-0239-5
DOI: https://doi.org/10.1007/s10044-011-0239-5