Abstract
A fast algorithm based on inverted indexes is proposed for multi-class action recognition. The approach represents an action as a sequence of action states, where each state is a cluster center of the extracted shape-motion features. First, the shape-motion features of a tracked actor are computed. Second, a state binary tree is built by hierarchically clustering the extracted features. The training videos are then represented as sequences of action states by searching the state binary tree. From the labeled state sequences, a state inverted index table and a state transition inverted index table are created. During testing, a new action video is first represented as a state sequence; the state and state-transition scores are then computed by querying the two inverted index tables. Combining these scores with weights trained on a validation set yields an action class score vector, and the recognized action class label is the index of the maximum component of that vector. Our key contribution is a fast multi-class action recognition approach based on two inverted index tables. Experiments on several challenging data sets confirm the performance of this approach.
Notes
The center of the action interest region lies on the vertical central axis of the bounding box produced by the automatic pedestrian localization method, and the side length of the region is proportional to the height of the bounding box.
The tree nodes are the cluster centers of the corresponding clusters.
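The state binary tree described above can be sketched with a recursive 2-means split, where every node stores the center of its cluster and a feature vector is mapped to a state by descending toward the closer child. This is a simplified, hypothetical sketch of hierarchical k-means; the paper's splitting criteria and stopping rules may differ.

```python
import numpy as np

def build_state_tree(features, min_size=2, depth=0, max_depth=8):
    """Recursively split a feature matrix with a 2-means step to form a
    binary tree. Each node stores its cluster center; leaves act as the
    action states."""
    center = features.mean(axis=0)
    if len(features) <= min_size or depth >= max_depth:
        return {"center": center, "children": None}
    # Farthest-point initialization for the 2-means split.
    c0 = features[np.linalg.norm(features - features[0], axis=1).argmax()]
    c1 = features[np.linalg.norm(features - c0, axis=1).argmax()]
    for _ in range(10):
        assign = (np.linalg.norm(features - c1, axis=1) <
                  np.linalg.norm(features - c0, axis=1))
        if assign.all() or (~assign).all():  # degenerate split: stop here
            return {"center": center, "children": None}
        c0 = features[~assign].mean(axis=0)
        c1 = features[assign].mean(axis=0)
    left = build_state_tree(features[~assign], min_size, depth + 1, max_depth)
    right = build_state_tree(features[assign], min_size, depth + 1, max_depth)
    return {"center": center, "children": (left, right)}

def nearest_state(tree, x, path=()):
    """Descend toward the child with the closer center; the reached
    leaf's path identifies the action state for feature vector x."""
    if tree["children"] is None:
        return path, tree["center"]
    left, right = tree["children"]
    if np.linalg.norm(x - left["center"]) <= np.linalg.norm(x - right["center"]):
        return nearest_state(left, x, path + (0,))
    return nearest_state(right, x, path + (1,))
```

Mapping a frame's feature vector to a state then costs only one root-to-leaf descent, i.e. a number of distance computations logarithmic in the number of states.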
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China (61375038).
Cite this article
Pei, L., Ye, M., Xu, P. et al. Fast multi-class action recognition by querying inverted index tables. Multimed Tools Appl 74, 10801–10822 (2015). https://doi.org/10.1007/s11042-014-2207-8