Abstract
Detecting events in multimedia video has become a popular research topic. One of the most important cues for determining an event in a video is its motion features. Currently, motion features are often extracted from the whole video using a dense sampling strategy. However, this extraction method is computationally prohibitive for large-scale video datasets. Moreover, video lengths may vary widely, which makes comparing features between videos unreliable. In this paper, we propose a segment-based approach to extracting motion features. Specifically, the original videos are divided into fixed-length segments for both training and testing, while evaluation is still performed at the video level. Our approach achieves promising results when applied to the dense trajectory motion feature on the TRECVID 2010 Multimedia Event Detection (MED) dataset. Combined with global and local features, our event detection system achieves performance comparable to other state-of-the-art MED systems, while significantly reducing the computational cost.
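The segment-based idea in the abstract can be illustrated with a minimal sketch. It assumes that segments are non-overlapping frame ranges and that per-segment detector scores are pooled with a maximum to obtain the video-level score; the abstract does not state the segment length, whether segments overlap, or the pooling rule, so the function names `split_into_segments` and `video_level_score` and these choices are hypothetical and not the authors' implementation.

```python
from typing import List, Sequence, Tuple


def split_into_segments(num_frames: int, segment_len: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) frame ranges of at most segment_len frames.

    Assumption: segments do not overlap; the last one may be shorter.
    """
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    return [
        (start, min(start + segment_len, num_frames))
        for start in range(0, num_frames, segment_len)
    ]


def video_level_score(segment_scores: Sequence[float]) -> float:
    """Pool per-segment scores into one video-level score.

    Assumption: max pooling, i.e. a video is as event-like as its
    most event-like segment. Other pooling rules are equally plausible.
    """
    if not segment_scores:
        raise ValueError("need at least one segment score")
    return max(segment_scores)
```

Under these assumptions, motion features would be extracted and scored per segment, and only the pooled score would be compared across videos, so videos of very different lengths are evaluated on the same footing.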
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Phan, S. et al. (2012). Multimedia Event Detection Using Segment-Based Approach for Motion Feature. In: Lin, W., et al. Advances in Multimedia Information Processing – PCM 2012. PCM 2012. Lecture Notes in Computer Science, vol 7674. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34778-8_4
DOI: https://doi.org/10.1007/978-3-642-34778-8_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34777-1
Online ISBN: 978-3-642-34778-8
eBook Packages: Computer Science, Computer Science (R0)