Abstract
In this paper we address the problem of human activity modelling and recognition by means of a hierarchical representation of mined dense spatiotemporal features. At each level of the hierarchy, the proposed method selects feature constellations that are increasingly discriminative and characteristic of a specific action category, by comparing how frequently they occur in that category with how frequently they occur in the remaining categories of the training set. Each feature constellation consists of an n-tuple of features that were selected at the previous level of the hierarchy and that lie within a small spatiotemporal neighborhood. We use spatiotemporal Local Steering Kernel (LSK) features as the basis of our representation, because they capture the local structure and dynamics of the underlying activities both effectively and efficiently. The proposed method can also detect activities in unconstrained videos by back-projecting activated features to the locations at which they were activated. We evaluate the proposed method on two publicly available datasets of human bodily actions, namely the KTH and YouTube datasets. The results demonstrate the effectiveness of the proposed method in recognising a wide variety of activities.
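The selection criterion described above, weighing how often a constellation occurs in the target action category against how often it occurs in the other categories, can be illustrated with a small sketch. The Python snippet below is a minimal, hypothetical illustration rather than the authors' implementation: it assumes local features have already been quantised into codewords with spatiotemporal coordinates, enumerates codeword pairs that co-occur within a small neighborhood, and keeps those whose relative frequency in the target class exceeds a chosen ratio. All names, thresholds, and the restriction to pairs (n = 2) are assumptions made for the sake of a compact example.

```python
from collections import Counter
from itertools import combinations


def mine_discriminative_constellations(videos, labels, target_class,
                                        tuple_size=2, min_ratio=3.0):
    """Toy sketch of frequency-based constellation selection.

    `videos` is a list; each entry is a list of (codeword, x, y, t) tuples
    describing quantised local features detected in one video. A candidate
    constellation is an n-tuple of codewords that co-occur within a small
    spatiotemporal neighborhood (radii below are illustrative only).
    """
    def constellations(feats, radius=20, t_radius=10):
        # Enumerate codeword n-tuples whose members lie close together in
        # space and time (brute force; adequate for a toy example).
        found = set()
        for group in combinations(feats, tuple_size):
            xs = [f[1] for f in group]
            ys = [f[2] for f in group]
            ts = [f[3] for f in group]
            if (max(xs) - min(xs) <= radius and
                    max(ys) - min(ys) <= radius and
                    max(ts) - min(ts) <= t_radius):
                found.add(tuple(sorted(f[0] for f in group)))
        return found

    pos_support = Counter()  # target-class videos containing each constellation
    neg_support = Counter()  # other-class videos containing it
    n_pos = sum(1 for l in labels if l == target_class)
    n_neg = len(labels) - n_pos

    for feats, label in zip(videos, labels):
        support = pos_support if label == target_class else neg_support
        for c in constellations(feats):
            support[c] += 1

    selected = []
    for c, p in pos_support.items():
        pos_freq = p / max(n_pos, 1)
        neg_freq = neg_support[c] / max(n_neg, 1)
        # Keep constellations that are much more frequent in the target
        # class than in the rest of the classes.
        if pos_freq / (neg_freq + 1e-6) >= min_ratio:
            selected.append((c, pos_freq, neg_freq))
    return sorted(selected, key=lambda s: -s[1])
```

In the hierarchical scheme the abstract describes, constellations that survive this kind of test would themselves serve as the features mined at the next level, so the selected configurations grow larger and more category-specific from one level to the next.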
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Oikonomopoulos, A., Pantic, M. (2013). Human Activity Recognition Using Hierarchically-Mined Feature Constellations. In: Bebis, G., et al. Advances in Visual Computing. ISVC 2013. Lecture Notes in Computer Science, vol 8033. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41914-0_16
DOI: https://doi.org/10.1007/978-3-642-41914-0_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41913-3
Online ISBN: 978-3-642-41914-0