Abstract
This paper focuses on the automatic recognition of human events in static images. Popular techniques infer the action from knowledge of the human pose, and the most recent approaches tend to combine pose information with knowledge of either the scene or the objects with which the human interacts. Our approach takes a step further in this direction by combining the human pose with the scene in which the human is placed, together with the spatial relationships between humans and objects. Using standard, simple descriptors such as HOG and SIFT, we show that recognition performance improves when these three types of knowledge are taken into account. Results obtained on the PASCAL 2010 Action Recognition Dataset demonstrate that our technique reaches state-of-the-art results using simple descriptors and classifiers.
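To make the idea of combining the three cues concrete, the following is a minimal sketch and not the authors' implementation. It assumes that pose and scene descriptors (for example HOG or SIFT histograms) have already been computed elsewhere and are available as plain lists of floats. The function names `spatial_relation` and `fuse`, the choice of relation features (centre offset, relative scale, overlap) and the fusion by concatenation are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

# Axis-aligned bounding box: (x_min, y_min, x_max, y_max).
# Assumption: every box has strictly positive width and height.
Box = Tuple[float, float, float, float]


def spatial_relation(person: Box, obj: Box) -> List[float]:
    """Hypothetical person-object relation: offset, relative scale, overlap."""
    pw, ph = person[2] - person[0], person[3] - person[1]
    ow, oh = obj[2] - obj[0], obj[3] - obj[1]
    # Offset between box centres, normalised by the person box size.
    dx = ((obj[0] + obj[2]) - (person[0] + person[2])) / (2.0 * pw)
    dy = ((obj[1] + obj[3]) - (person[1] + person[3])) / (2.0 * ph)
    # Object area relative to person area.
    scale = (ow * oh) / (pw * ph)
    # Intersection-over-union of the two boxes.
    iw = max(0.0, min(person[2], obj[2]) - max(person[0], obj[0]))
    ih = max(0.0, min(person[3], obj[3]) - max(person[1], obj[1]))
    inter = iw * ih
    iou = inter / (pw * ph + ow * oh - inter)
    return [dx, dy, scale, iou]


def l2_normalise(v: List[float]) -> List[float]:
    """Scale a vector to unit L2 norm; an all-zero vector is returned unchanged."""
    norm = sum(x * x for x in v) ** 0.5
    return [x / norm for x in v] if norm > 0 else list(v)


def fuse(pose: List[float], scene: List[float], relation: List[float]) -> List[float]:
    """Illustrative early fusion: normalise each cue, then concatenate."""
    return l2_normalise(pose) + l2_normalise(scene) + l2_normalise(relation)
```

The fused vector could then be passed to any standard classifier. Normalising each cue before concatenation is one simple way to keep a long descriptor from dominating a short one; score-level fusion of per-cue classifiers would be an equally plausible alternative.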
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Shapovalova, N., Gong, W., Pedersoli, M., Roca, F.X., Gonzàlez, J. (2011). On Importance of Interactions and Context in Human Action Recognition. In: Vitrià, J., Sanches, J.M., Hernández, M. (eds) Pattern Recognition and Image Analysis. IbPRIA 2011. Lecture Notes in Computer Science, vol 6669. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21257-4_8
DOI: https://doi.org/10.1007/978-3-642-21257-4_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21256-7
Online ISBN: 978-3-642-21257-4
eBook Packages: Computer Science, Computer Science (R0)