Abstract
Mining information about the images that exist in huge numbers on the web has been a major research interest in recent years. Several methods have been exploited, and web image information is typically extracted from textual sources such as image file names, anchor texts, existing keywords and, of course, surrounding text. However, systems that mine image information from surrounding text suffer from several problems, such as the inability to assign all relevant text to an image while discarding the irrelevant text. This paper discusses a novel method for extracting web image information. The proposed system uses visual cues to cluster a web page into several regions and to assign to each hosted image the text that most probably refers to it. Three different approaches to the problem of text-to-image assignment are discussed and evaluated. The evaluation indicates the advantages of using visual cues and two-dimensional Euclidean measures for extracting information about web images.
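The idea of assigning text to images with a two-dimensional Euclidean measure can be illustrated with a minimal sketch. The abstract does not say which distance the three evaluated approaches use, so the center-to-center distance below is an assumption chosen for simplicity, and the names `Box`, `center_distance`, and `assign_text_to_images` are hypothetical. The sketch also assumes that the bounding boxes of rendered page elements are already available, for example from a browser layout engine.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle of a rendered page element, in pixels (hypothetical type)."""
    x: float
    y: float
    width: float
    height: float

    @property
    def center(self):
        # Geometric center of the rectangle.
        return (self.x + self.width / 2, self.y + self.height / 2)


def center_distance(a, b):
    """Two-dimensional Euclidean distance between the centers of two boxes."""
    (ax, ay), (bx, by) = a.center, b.center
    return hypot(ax - bx, ay - by)


def assign_text_to_images(images, texts):
    """Assign every text block to its nearest image.

    Assumption: nearest means smallest center-to-center distance.
    The paper may define proximity differently.
    """
    if not images:
        return {}
    assignment = {image_id: [] for image_id in images}
    for text_id, text_box in texts.items():
        nearest = min(images, key=lambda i: center_distance(images[i], text_box))
        assignment[nearest].append(text_id)
    return assignment


# Two images side by side, each with a caption directly below it.
images = {"img_a": Box(0, 0, 100, 100), "img_b": Box(500, 0, 100, 100)}
texts = {"caption_a": Box(0, 110, 100, 20), "caption_b": Box(500, 110, 100, 20)}
result = assign_text_to_images(images, texts)
```

In this toy layout each caption is assigned to the image above it. A purely distance-based rule like this one ignores region boundaries, which is presumably why the proposed system first clusters the page into regions using visual cues.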
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tryfou, G., Tsapatsoulis, N. (2012). Using Visual Cues for the Extraction of Web Image Semantic Information. In: Zaphiris, P., Buchanan, G., Rasmussen, E., Loizides, F. (eds) Theory and Practice of Digital Libraries. TPDL 2012. Lecture Notes in Computer Science, vol 7489. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33290-6_42
DOI: https://doi.org/10.1007/978-3-642-33290-6_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33289-0
Online ISBN: 978-3-642-33290-6
eBook Packages: Computer Science, Computer Science (R0)