Abstract
In this paper, we propose a novel motion descriptor, Seg-SIFT-ACC, for human action recognition. The proposed descriptor is based on both the accordion representation of the video and its temporal segmentation into elementary motion segments. The accordion representation aims to place in spatial adjacency the columns of the video frames that have a high temporal correlation. For complex videos containing many different elementary actions, however, the accordion representation may place in spatial adjacency temporally correlated pixels that belong to different elementary actions. To overcome this problem, we divide the video into elementary motion segments and apply the accordion representation to each one separately.
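The idea of applying the accordion representation to each segment can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `accordion` and `segment_accordions`, the grayscale `(T, H, W)` frame layout, the simple column interleaving order, and the `boundaries` list of segment start/end indices are all assumptions made for illustration. The abstract does not specify how segment boundaries are detected or how SIFT is then computed, so neither step is shown.

```python
import numpy as np

def accordion(frames):
    """Interleave frame columns so that temporally adjacent columns become
    spatially adjacent (assumed layout: frames has shape (T, H, W))."""
    frames = np.asarray(frames)
    t, h, w = frames.shape
    # (T, H, W) -> (H, W, T) -> (H, W*T): for each column index x, the T
    # copies of that column (one per frame) end up next to each other.
    return frames.transpose(1, 2, 0).reshape(h, w * t)

def segment_accordions(frames, boundaries):
    """Build one accordion image per temporal segment.

    boundaries is a hypothetical list of frame indices [b0, b1, ..., bk];
    segment i covers frames b_i (inclusive) to b_{i+1} (exclusive).
    """
    frames = np.asarray(frames)
    return [accordion(frames[start:end])
            for start, end in zip(boundaries[:-1], boundaries[1:])]

# Tiny example: 2 frames of size 2x3.
video = np.arange(12).reshape(2, 2, 3)
whole = accordion(video)                        # one image for the whole video
parts = segment_accordions(video, [0, 1, 2])    # one image per segment
```

In this sketch, a local image descriptor computed on each per-segment image would only see pixels from a single elementary motion, which is the motivation stated above for segmenting before building the representation.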
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sekma, M., Mejdoub, M., Ben Amar, C. (2013). Human Action Recognition Using Temporal Segmentation and Accordion Representation. In: Wilson, R., Hancock, E., Bors, A., Smith, W. (eds) Computer Analysis of Images and Patterns. CAIP 2013. Lecture Notes in Computer Science, vol 8048. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40246-3_70
DOI: https://doi.org/10.1007/978-3-642-40246-3_70
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40245-6
Online ISBN: 978-3-642-40246-3
eBook Packages: Computer Science, Computer Science (R0)