IEICE Trans - Image Collector II: A System to Gather a Large Number of Images from the Web

Keiji YANAI

Publication
IEICE TRANSACTIONS on Information and Systems Vol.E88-D No.10 pp.2432-2436
Publication Date: 2005/10/01
Online ISSN:
DOI: 10.1093/ietisy/e88-d.10.2432
Print ISSN: 0916-8532
Type of Manuscript: LETTER
Category: Image Processing and Video Processing
Keyword:
Web image gathering, Web image search, information retrieval, image database, Web mining

Summary:
We propose a system, called Image Collector II, that gathers from the World Wide Web hundreds of images related to one set of keywords provided by a user. The Image Collector, which we proposed previously, can gather only one or two hundred images. We propose the following two improvements over our previous system, addressing the number of gathered images and their precision: (1) We extract words that appear with high frequency in all HTML files embedding the output images of an initial image gathering and, using them as keywords, carry out a second image gathering. Through this process, we can obtain hundreds of images for one set of keywords. (2) The more images we gather, the lower the precision of the gathered images becomes. To improve the precision, we introduce word vectors of the HTML files embedding the images into the image selection process, in addition to image feature vectors.
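
The two ideas in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`html_to_words`, `frequent_words`, `cosine`), the tiny stop-word list, the use of raw word counts as word vectors, and cosine similarity as the comparison measure are all assumptions made for illustration. The first part counts words across the HTML pages that embed the initially gathered images and returns the most frequent ones as candidate keywords for a second gathering; the second part compares two pages by their word-count vectors.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "for", "on", "with"}


def html_to_words(html):
    """Return lower-cased alphabetic words of the visible text, minus stop words."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    return [w for w in re.findall(r"[a-z]+", text) if w not in STOP_WORDS]


def frequent_words(html_pages, query_words, top_n=5):
    """Most frequent words over all pages, excluding the original query words."""
    counts = Counter()
    for page in html_pages:
        counts.update(html_to_words(page))
    for q in query_words:
        counts.pop(q.lower(), None)
    return [word for word, _ in counts.most_common(top_n)]


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counter objects)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```

In such a sketch, the words returned by `frequent_words` would be appended to the user's keywords for the second gathering, and `cosine(Counter(html_to_words(p1)), Counter(html_to_words(p2)))` would give a text-side similarity that could be combined with an image-feature similarity when selecting images. How the actual system weights or combines the two is not stated in the summary.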
