Chapter 8: Multimedia and Multimodal Information Retrieval

Search Computing

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5950)

Abstract

The Web is progressively becoming a multimedia content delivery platform. This trend poses severe challenges to information retrieval theories, techniques, and tools. This chapter defines the problem of multimedia information retrieval along with its challenges and application areas, surveys its major technical issues, proposes a reference architecture that unifies content processing and querying, presents an example of a next-generation platform for multimedia search, and concludes by showing the close ties between the multi-domain search investigated in Search Computing and multimodal/multimedia search.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Bozzon, A., Fraternali, P. (2010). Chapter 8: Multimedia and Multimodal Information Retrieval. In: Ceri, S., Brambilla, M. (eds) Search Computing. Lecture Notes in Computer Science, vol 5950. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12310-8_8

  • DOI: https://doi.org/10.1007/978-3-642-12310-8_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12309-2

  • Online ISBN: 978-3-642-12310-8

  • eBook Packages: Computer Science, Computer Science (R0)