Semantic representation of multimedia content: Knowledge representation and semantic indexing

Multimedia Tools and Applications

Abstract

In this paper we present a framework for unified, personalized access to heterogeneous multimedia content in distributed repositories. By focusing on the semantic analysis of multimedia documents, metadata, user queries and user profiles, the framework helps bridge the gap between the semantic nature of user queries and raw multimedia documents. The proposed approach takes the results of visual content analysis as input and also analyzes and exploits the associated textual annotation in order to extract the underlying semantics, construct a semantic index and classify documents into topics, based on a unified knowledge and semantics representation model. It can then accept user queries and, after semantic interpretation and expansion, retrieve documents from the index and rank them according to user preferences, much as in text retrieval. All processes are based on a novel semantic processing methodology that employs fuzzy algebra and principles of taxonomic knowledge representation. This paper presents the first part of the work, which deals with data and knowledge models, the manipulation of multimedia content annotations and semantic indexing; the second part will continue with the use of the extracted semantic information for personalized retrieval.
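To make the idea of a fuzzy, taxonomy-driven semantic index more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a fuzzy taxonomic relation stored as nested dictionaries, sup-min composition for the transitive closure, and a document score equal to the best min-combination of annotation degree and taxonomic strength. All names (`sup_min_closure`, `build_index`, `search`) and all data values are hypothetical.

```python
def sup_min_closure(relation, entities):
    """Reflexive sup-min transitive closure of a fuzzy binary relation.

    relation[a][b] is the (assumed) degree to which entity a is a
    specialization of entity b; missing pairs count as 0.0.
    """
    closure = {
        a: {b: relation.get(a, {}).get(b, 0.0) for b in entities}
        for a in entities
    }
    for a in entities:
        closure[a][a] = 1.0  # every entity fully relates to itself
    # Floyd-Warshall pattern with (max, min) in place of (min, +).
    for k in entities:
        for a in entities:
            for b in entities:
                via_k = min(closure[a][k], closure[k][b])
                if via_k > closure[a][b]:
                    closure[a][b] = via_k
    return closure


def build_index(annotations, closure):
    """Map each entity to a fuzzy set of documents (entity -> {doc: degree})."""
    index = {}
    for doc, fuzzy_set in annotations.items():
        for entity, degree in fuzzy_set.items():
            for broader, strength in closure[entity].items():
                score = min(degree, strength)
                if score > 0.0:
                    postings = index.setdefault(broader, {})
                    postings[doc] = max(postings.get(doc, 0.0), score)
    return index


def search(index, concept):
    """Return (document, degree) pairs, highest degree first."""
    return sorted(index.get(concept, {}).items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy data, chosen only for illustration.
entities = ["goal", "football", "sport", "beach"]
taxonomy = {"goal": {"football": 0.9}, "football": {"sport": 0.8}}
annotations = {
    "doc1": {"goal": 0.7},
    "doc2": {"football": 0.6, "beach": 0.4},
    "doc3": {"beach": 0.9},
}

closure = sup_min_closure(taxonomy, entities)
index = build_index(annotations, closure)
print(search(index, "sport"))  # [('doc1', 0.7), ('doc2', 0.6)]
```

In this sketch a query for "sport" retrieves `doc1` even though that document is annotated only with "goal", because the closure propagates the annotation up the taxonomy. Ranking by user preferences, which the abstract assigns to the second part of the work, is not modeled here.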

Author information

Corresponding author

Correspondence to Phivos Mylonas.

About this article

Cite this article

Mylonas, P., Athanasiadis, T., Wallace, M. et al. Semantic representation of multimedia content: Knowledge representation and semantic indexing. Multimed Tools Appl 39, 293–327 (2008). https://doi.org/10.1007/s11042-007-0161-4

  • DOI: https://doi.org/10.1007/s11042-007-0161-4
