Framework for Synthesizing Semantic-Level Indices

Abstract

Extraction of syntactic features is a well-defined problem, which is why such features are employed almost exclusively in most content-based retrieval systems. Semantic-level indices, however, are more appealing to users because they are closer to the user's personal space. Most work at the semantic level has been confined to a limited domain, since the features developed and employed there apply satisfactorily only to that particular domain. Scaling up such systems inevitably results in a large number of features, and there is currently no framework that can effectively integrate these features and furnish semantic-level indices.

The objective of this paper is to highlight some of the issues in the design of such a framework and to report on the status of its development. In our framework, a high-level index is constructed by synthesizing a large set of elemental features. From this large collection of features, an image/video class is characterized by automatically selecting only a few principal features. By properly mapping the constrained multi-dimensional feature space constituted by these principal features to the semantics of the data, it is feasible to construct high-level indices. The problem remains, however, of automatically identifying the principal, or meaningful, subset of features. This is done by means of a Bayesian network that partitions the data into cliques when trained on pre-classified data. During training, the Bayesian network associates each clique of data points in the multi-dimensional feature space with one of the classes; this association can later be used to evaluate the most probable class for the partition of feature space into which a new sample falls. The framework requires neither normalization of the different features nor the aid of an expert knowledge base. It enables a stronger coupling between feature extraction and meaningful high-level indices, yet the coupling remains sufficiently domain independent, as shown by the experiments. The experiments were conducted on real video comprising seven diverse classes, and the results show the framework's superiority over some standard classification tools.
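
To make the classification idea concrete, the following is a minimal sketch, not the authors' implementation, of how partitions of a discretized feature space can be associated with classes from pre-classified training data and then used to return the most probable class for a new sample. The bin count, the two example features, the class labels, and the FeatureSpaceClassifier name are all illustrative assumptions; the paper's actual model is a Bayesian network over automatically selected principal features.

from collections import Counter, defaultdict

class FeatureSpaceClassifier:
    """Illustrative stand-in for the classification step: each cell of a
    discretized feature space is mapped to the class most often observed
    there during training (a hypothetical simplification of the paper's
    Bayesian-network model)."""

    def __init__(self, n_bins=4):
        self.n_bins = n_bins
        self.bounds = None                        # (min, max) per feature, learned from training data
        self.cell_counts = defaultdict(Counter)   # cell (partition) -> class frequencies

    def _cell(self, sample):
        # Map a raw feature vector to a tuple of bin indices, i.e. one partition of the space.
        idx = []
        for value, (lo, hi) in zip(sample, self.bounds):
            span = (hi - lo) or 1.0
            b = int((value - lo) / span * self.n_bins)
            idx.append(min(max(b, 0), self.n_bins - 1))
        return tuple(idx)

    def fit(self, samples, labels):
        # Learn per-feature ranges, then count class occurrences per cell.
        dims = len(samples[0])
        self.bounds = [(min(s[d] for s in samples), max(s[d] for s in samples))
                       for d in range(dims)]
        for sample, label in zip(samples, labels):
            self.cell_counts[self._cell(sample)][label] += 1
        return self

    def predict(self, sample):
        # Return the most probable class for the cell this sample falls into.
        counts = self.cell_counts.get(self._cell(sample))
        if not counts:
            return None  # unseen region of the feature space
        return counts.most_common(1)[0][0]

# Hypothetical usage with two elemental features (e.g. motion energy, colour variance):
train_x = [(0.1, 0.8), (0.2, 0.7), (0.9, 0.1), (0.8, 0.2)]
train_y = ["news", "news", "sports", "sports"]
clf = FeatureSpaceClassifier(n_bins=2).fit(train_x, train_y)
print(clf.predict((0.85, 0.15)))   # -> "sports"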


Cite this article

Mittal, A., Cheong, LF. Framework for Synthesizing Semantic-Level Indices. Multimedia Tools and Applications 20, 135–158 (2003). https://doi.org/10.1023/A:1023627404478
