Evaluating Web Image Context Extraction

Abstract

Images on the Web appear alongside other textual content, referred to as the Web image context, which provides valuable information about the image semantics. Unfortunately, HTML documents are usually cluttered with content on many different topics, so the right image context must be determined precisely in order to deliver high-quality descriptions. Several methods that automatically determine and extract the Web image context from Web documents have been used in different applications over the years. In these applications, however, context extraction is only a preprocessing step, and the quality of the extraction task has therefore rarely been evaluated on its own. As a result, there is hardly any information about which extraction method to choose in order to get the best results. To address this need, this chapter presents an evaluation framework that objectively measures and compares the quality of different Web Image Context Extraction (WICE) algorithms. The main parts of the framework are a large ground-truth dataset consisting of diverse Web documents from real Web servers and objective quality measures tailored to the special characteristics of the image context extraction task. To demonstrate the capabilities of the framework, common extraction methods from the literature are implemented and integrated into it. Finally, the evaluation results are summarized and discussed.

Author information

Corresponding author

Correspondence to Sadet Alcic.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Alcic, S., Conrad, S. (2015). Evaluating Web Image Context Extraction. In: Baughman, A., Gao, J., Pan, JY., Petrushin, V. (eds) Multimedia Data Mining and Analytics. Springer, Cham. https://doi.org/10.1007/978-3-319-14998-1_10

  • DOI: https://doi.org/10.1007/978-3-319-14998-1_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-14997-4

  • Online ISBN: 978-3-319-14998-1

  • eBook Packages: Computer Science (R0)
