Abstract
Prototype-based classification methods have recently become popular in action recognition. However, although such methods show competitive classification results, they are often limited by overly simple and thus insufficient representations, and they require a long-term analysis. To overcome these problems, we propose to use more sophisticated features and an efficient prototype-based representation that allows for a single-frame evaluation. In particular, we apply four feature cues in parallel (two for appearance and two for motion) and use a hierarchical k-means tree, where the obtained leaf nodes represent the prototypes. In addition, to increase the classification power, we introduce a temporal weighting scheme for the different information cues. Thus, in contrast to existing methods, which typically use global weighting strategies (i.e., the same weights are applied to all data), the weights are estimated separately for a specific point in time. We demonstrate our approach on standard benchmark datasets, showing excellent classification results. In particular, we give a detailed study of the applied features, the hierarchical tree representation, and the influence of temporal weighting, as well as a competitive comparison to existing state-of-the-art methods.
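The two ideas named in the abstract, leaf nodes of a hierarchical k-means tree acting as prototypes and cue weights that vary per frame, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes one tree per feature cue, a class histogram stored at every leaf, and a normalized weighted sum for fusion. All function names and the parameters `k`, `depth`, and `min_size` are hypothetical. The abstract does not say how the temporal weights are estimated, so they are simply passed in.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's k-means (illustrative helper); returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(iters):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    # Recompute the assignment so labels match the final centers.
    dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, dist.argmin(axis=1)

def build_tree(X, y, n_classes, k=2, depth=3, min_size=4):
    """Recursively split X with k-means; nodes without children are the prototypes."""
    hist = np.bincount(y, minlength=n_classes).astype(float)
    node = {"center": X.mean(axis=0), "class_prob": hist / hist.sum(), "children": []}
    if depth == 0 or len(X) < max(min_size, k):
        return node
    _, labels = kmeans(X, k)
    if len(np.unique(labels)) < 2:  # degenerate split, keep this node as a leaf
        return node
    for j in range(k):
        mask = labels == j
        if mask.any():
            child = build_tree(X[mask], y[mask], n_classes, k, depth - 1, min_size)
            node["children"].append(child)
    return node

def leaf_class_prob(node, x):
    """Descend greedily to the nearest leaf and return its class distribution."""
    while node["children"]:
        node = min(node["children"], key=lambda c: np.sum((c["center"] - x) ** 2))
    return node["class_prob"]

def classify_frame(trees, frame_features, weights):
    """Fuse per-cue class distributions using weights specific to this frame."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    probs = np.stack([leaf_class_prob(t, f) for t, f in zip(trees, frame_features)])
    fused = (w[:, None] * probs).sum(axis=0)
    return int(fused.argmax()), fused

if __name__ == "__main__":
    # Synthetic stand-in for four cues (two appearance, two motion), two classes.
    rng = np.random.default_rng(0)
    y = np.repeat([0, 1], 30)
    cues = [rng.normal(loc=y[:, None] * 5.0, scale=1.0, size=(60, 8)) for _ in range(4)]
    trees = [build_tree(X, y, n_classes=2) for X in cues]
    frame = [X[0] for X in cues]  # features of one frame, one vector per cue
    print(classify_frame(trees, frame, weights=[0.4, 0.1, 0.3, 0.2]))
```

A global weighting strategy would call `classify_frame` with the same `weights` for every frame, whereas a temporal scheme would supply a different weight vector at each point in time.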
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mauthner, T., Roth, P.M., Bischof, H. (2011). Temporal Feature Weighting for Prototype-Based Action Recognition. In: Kimmel, R., Klette, R., Sugimoto, A. (eds) Computer Vision – ACCV 2010. ACCV 2010. Lecture Notes in Computer Science, vol 6493. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19309-5_44
DOI: https://doi.org/10.1007/978-3-642-19309-5_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19308-8
Online ISBN: 978-3-642-19309-5
eBook Packages: Computer Science, Computer Science (R0)