A Multimodal Information Collector for Content-Based Image Retrieval System

  • Conference paper
Neural Information Processing (ICONIP 2011)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7064)

Abstract

Explicit relevance feedback requires the user to explicitly refine the search queries for content-based image retrieval. This may become laborious or even impossible due to the ever-increasing volume of digital databases. We present a multimodal information collector that can unobtrusively record and asynchronously transmit the user’s implicit relevance feedback on a displayed image to the remote CBIR server for assisting in retrieving relevant images. The modalities of user interaction include eye movements, pointer tracks and clicks, keyboard strokes, and audio including speech. The client-side information collector has been implemented as a browser extension using the JavaScript programming language and has been integrated with an existing CBIR server. We verify its functionality by evaluating the performance of the gaze-enhanced CBIR system in on-line image tagging tasks.
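
The record-then-transmit behaviour described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `FeedbackCollector`, the event fields, the batch size, and the stand-in transport are all assumptions made for illustration. A real browser extension would presumably attach DOM event listeners and send batches over the network instead.

```javascript
// Hypothetical sketch only: none of these names come from the paper.
// It buffers implicit-feedback events and hands them to a caller-supplied
// asynchronous transport in batches.
class FeedbackCollector {
  constructor(send, batchSize = 3) {
    this.send = send;           // async function(batch), supplied by the caller
    this.batchSize = batchSize; // flush once this many events are buffered
    this.buffer = [];
  }

  // Record one interaction event (gaze sample, click, key stroke, ...)
  // observed while a given image is displayed.
  record(modality, imageId, data) {
    this.buffer.push({ modality, imageId, data, time: Date.now() });
    if (this.buffer.length >= this.batchSize) {
      return this.flush();
    }
    return Promise.resolve(0);
  }

  // Pass the buffered events to the transport and empty the buffer.
  // Callers do not have to wait for the returned promise.
  async flush() {
    if (this.buffer.length === 0) return 0;
    const batch = this.buffer;
    this.buffer = [];
    await this.send(batch);
    return batch.length;
  }
}

// Usage with a stand-in transport that only stores what it receives.
const received = [];
const collector = new FeedbackCollector(async (batch) => {
  received.push(...batch);
}, 2);

collector.record("gaze", "img-1", { x: 120, y: 80 });
collector.record("click", "img-1", { button: 0 }); // second event triggers a flush
```

Batching keeps the number of requests low when high-rate modalities such as eye tracking are recorded, at the cost of a short delay before the server sees each event.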

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhang, H., Sjöberg, M., Laaksonen, J., Oja, E. (2011). A Multimodal Information Collector for Content-Based Image Retrieval System. In: Lu, BL., Zhang, L., Kwok, J. (eds) Neural Information Processing. ICONIP 2011. Lecture Notes in Computer Science, vol 7064. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24965-5_83

  • DOI: https://doi.org/10.1007/978-3-642-24965-5_83

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-24964-8

  • Online ISBN: 978-3-642-24965-5

  • eBook Packages: Computer Science, Computer Science (R0)