Abstract
In this paper, we propose a novel Spatiotemporal Interest Point detector (MC-STIP) based on the coherent motion pattern around each voxel in a video. The detector takes the local peaks of optical flow in the motion coherence volume of a video as interest points. Each interest point is described by a concatenated histogram of 2D gradients. Moreover, we introduce a Topic Matrix Video Representation (T-Mat), which not only captures the global hidden topics but also preserves the shared discriminative information among the interest point descriptors. We evaluate the approach on three benchmark datasets for human action recognition, using Support Vector Machines with four different kernels. The experiments demonstrate the effectiveness of the approach.
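To make the idea of taking local peaks in a spatiotemporal volume as interest points concrete, the following is a minimal sketch and not the authors' implementation. It assumes the volume is a nested list indexed as (t, y, x) holding some scalar motion response per voxel. The function name `local_peaks_3d`, the 3x3x3 neighbourhood, and the `threshold` parameter are illustrative assumptions that do not come from the paper.

```python
from itertools import product


def local_peaks_3d(volume, threshold=0.0):
    """Return (t, y, x) indices of strict local maxima in a 3D nested list.

    Assumption for illustration: a voxel counts as a peak if its value
    exceeds `threshold` and is strictly greater than every in-bounds
    voxel in its 3x3x3 neighbourhood.
    """
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    # All 26 neighbour offsets, excluding the voxel itself.
    offsets = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]
    peaks = []
    for t, y, x in product(range(T), range(H), range(W)):
        v = volume[t][y][x]
        if v <= threshold:
            continue
        is_peak = True
        for dt, dy, dx in offsets:
            nt, ny, nx = t + dt, y + dy, x + dx
            if 0 <= nt < T and 0 <= ny < H and 0 <= nx < W:
                if volume[nt][ny][nx] >= v:
                    is_peak = False
                    break
        if is_peak:
            peaks.append((t, y, x))
    return peaks


# Toy 3x4x4 volume (hypothetical values, not real optical flow).
volume = [[[0.0] * 4 for _ in range(4)] for _ in range(3)]
volume[1][1][1] = 5.0  # isolated strong response
volume[1][2][2] = 3.0  # adjacent to the 5.0 voxel, so suppressed
volume[2][3][0] = 2.0  # weaker but isolated response
peaks = local_peaks_3d(volume, threshold=1.0)
print(peaks)
```

In practice the response volume would be derived from optical flow computed on real frames, and the neighbourhood size and threshold would need tuning; both are left here as free parameters.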
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, Z., Huang, J., Li, ZN. (2010). Discovering Motion Patterns for Human Action Recognition. In: Qiu, G., Lam, K.M., Kiya, H., Xue, XY., Kuo, CC.J., Lew, M.S. (eds) Advances in Multimedia Information Processing - PCM 2010. PCM 2010. Lecture Notes in Computer Science, vol 6298. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15696-0_66
DOI: https://doi.org/10.1007/978-3-642-15696-0_66
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15695-3
Online ISBN: 978-3-642-15696-0
eBook Packages: Computer Science, Computer Science (R0)