Evaluating Web Image Context Extraction

Alcic, Sadet; Conrad, Stefan

doi:10.1007/978-3-319-14998-1_10

Chapter
First Online: 01 January 2015

Abstract

Images on the Web appear alongside other textual content, referred to as the Web image context, which provides valuable information about the image semantics. Unfortunately, HTML documents are usually cluttered with content on several different topics, so the right image context must be determined precisely in order to deliver high-quality descriptions. Several methods that automatically determine and extract the Web image context from Web documents have been applied in different applications over the years. However, in these applications context extraction is only a preprocessing step, and therefore the quality of the extraction task has rarely been evaluated on its own. As a result, there is hardly any information about which extraction method to choose in order to get the best results. With this need in mind, the main subject of this book chapter is an evaluation framework that objectively measures and compares the quality of different Web Image Context Extraction (WICE) algorithms. The main parts of the framework are a large ground truth dataset consisting of diverse Web documents from real Web servers and objective quality measures tailored to the special characteristics of the image context extraction task. To demonstrate the capabilities of the framework, common extraction methods from the literature are implemented and integrated into it. Finally, the evaluation results are summarized and discussed.

Author information

Authors and Affiliations

Department of Databases and Information Systems, Institute for Computer Science, Heinrich-Heine-University of Duesseldorf, Universitaetsstr. 1, 40225, Duesseldorf, Germany
Sadet Alcic & Stefan Conrad

Corresponding author

Correspondence to Sadet Alcic.

Editor information

Editors and Affiliations

IBM Corp., Durham, North Carolina, USA
Aaron K. Baughman
Nokia Inc., Sunnyvale, California, USA
Jiang Gao
Google Inc., Mountain View, California, USA
Jia-Yu Pan
4i, Inc., Carlsbad, California, USA
Valery A. Petrushin

About this chapter

Cite this chapter

Alcic, S., Conrad, S. (2015). Evaluating Web Image Context Extraction. In: Baughman, A., Gao, J., Pan, JY., Petrushin, V. (eds) Multimedia Data Mining and Analytics. Springer, Cham. https://doi.org/10.1007/978-3-319-14998-1_10

DOI: https://doi.org/10.1007/978-3-319-14998-1_10
Published: 01 April 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-14997-4
Online ISBN: 978-3-319-14998-1
eBook Packages: Computer Science, Computer Science (R0)
