Abstract
As the volume of publicly available multimedia material grows, automatic indexing becomes increasingly important for accessing multimedia databases. In this paper, we present a novel set of low-level descriptors for content-based video classification. For temporal features, we use a modified PMES descriptor to capture the spatial distribution of local motion, and a Dominant Direction Histogram that we have developed to represent the temporal distribution of camera motion. For color, we present the Weighted Color Histogram, which we designed to model color distribution: the histogram models the H parameter of the HSV color space, and we combine it with weighted means of the S and V parameters. To select the key-frames from which the spatial descriptors are extracted, we use a modified version of a simple, efficient method. We then evaluate our descriptor set on a database of video shots obtained by temporal segmentation of the archive of a real-world TV station. The results demonstrate that our approach can achieve high success rates on a wide range of semantic classes.
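To make the color descriptor more concrete, the following is a minimal sketch of a weighted hue histogram combined with weighted S and V means. The abstract does not state the weighting scheme, so everything specific here is an assumption made for illustration: the function name `weighted_color_descriptor`, the bin count, and the per-pixel weight `s * v` (chosen because hue carries little information for dark or unsaturated pixels). It should not be read as the authors' actual definition.

```python
import colorsys


def weighted_color_descriptor(pixels, bins=16):
    """Illustrative weighted hue histogram plus weighted S and V means.

    pixels: iterable of (r, g, b) tuples with components in [0, 1].
    Returns (hist, mean_s, mean_v), where hist sums to 1 when any
    pixel has non-zero weight.

    ASSUMPTION: each pixel is weighted by s * v. The source does not
    specify the weights; this choice is hypothetical.
    """
    hist = [0.0] * bins
    total_w = 0.0
    s_acc = 0.0
    v_acc = 0.0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r, g, b)
        w = s * v  # hypothetical weight: down-weights gray and dark pixels
        idx = min(int(h * bins), bins - 1)  # hue in [0, 1] -> bin index
        hist[idx] += w
        s_acc += w * s
        v_acc += w * v
        total_w += w
    if total_w == 0.0:
        # No chromatic content at all: return an empty descriptor.
        return hist, 0.0, 0.0
    hist = [x / total_w for x in hist]
    return hist, s_acc / total_w, v_acc / total_w


if __name__ == "__main__":
    # Two red pixels, one blue pixel, one mid-gray pixel (zero weight).
    frame = [(1, 0, 0), (1, 0, 0), (0, 0, 1), (0.5, 0.5, 0.5)]
    hist, mean_s, mean_v = weighted_color_descriptor(frame, bins=4)
    print(hist, mean_s, mean_v)
```

With this weighting, the gray pixel contributes nothing to the histogram, so the descriptor reflects only the chromatic part of the key-frame. A different weight (for example, one based on spatial position) could be substituted without changing the overall structure.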
About this article
Cite this article
Zampoglou, M., Papadimitriou, T. & Diamantaras, K.I. From Low-Level Features to Semantic Classes: Spatial and Temporal Descriptors for Video Indexing. J Sign Process Syst 61, 75–83 (2010). https://doi.org/10.1007/s11265-008-0314-3
DOI: https://doi.org/10.1007/s11265-008-0314-3