Abstract
In this paper, we present ICICLE (Image ChainNet and Incremental Clustering Engine), a prototype system that we have developed to retrieve WWW images efficiently and effectively based on image semantics. ICICLE has two distinguishing features. First, it employs a novel image representation model, called Weight ChainNet, to capture the semantics of the image content. A new formula for computing semantic similarities, called the list space model, is also introduced. Second, to speed up retrieval, ICICLE employs an incremental clustering mechanism, ICC (Incremental Clustering on ChainNet), to group images with similar semantics into the same partition. Each cluster has a summary representative, and the representatives of all clusters are further summarized into a balanced and full binary tree structure. We conducted an extensive performance study to evaluate ICICLE. Our results show that ICICLE provides better recall and precision than several recently proposed methods, and that the ICC clustering technique speeds up image retrieval without significantly sacrificing recall and precision.
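To make the idea of incremental clustering with summary representatives concrete, the following is a minimal sketch and not the ICC algorithm itself. It assumes, purely for illustration, that each image is described by a sparse dictionary of term weights, that similarity is plain cosine similarity (a stand-in for the list space model, which the abstract does not define), and that a cluster's representative is the running sum of its members' weights. The names `Cluster`, `insert_image`, `search`, and the `threshold` parameter are hypothetical, and the sketch scans representatives linearly rather than summarizing them in a binary tree.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class Cluster:
    """A cluster summarized by the running sum of its members' weights."""

    def __init__(self):
        self.members = []
        # Assumption: this sum stands in for the summary representative.
        self.summary = defaultdict(float)

    def add(self, image_id, terms):
        self.members.append(image_id)
        for term, weight in terms.items():
            self.summary[term] += weight


def insert_image(clusters, image_id, terms, threshold=0.5):
    """Assign one image to the most similar cluster, or open a new one."""
    best, best_sim = None, threshold
    for cluster in clusters:
        sim = cosine(terms, cluster.summary)
        if sim >= best_sim:
            best, best_sim = cluster, sim
    if best is None:
        # No representative is similar enough: start a new cluster.
        best = Cluster()
        clusters.append(best)
    best.add(image_id, terms)
    return best


def search(clusters, query, top_clusters=1):
    """Rank clusters by representative, then return members of the best ones."""
    ranked = sorted(clusters, key=lambda c: cosine(query, c.summary), reverse=True)
    return [m for c in ranked[:top_clusters] for m in c.members]


clusters = []
insert_image(clusters, "img1", {"tiger": 1.0, "zoo": 0.5})
insert_image(clusters, "img2", {"tiger": 0.8, "cub": 0.4})
insert_image(clusters, "img3", {"sailboat": 1.0, "harbor": 0.6})
print(search(clusters, {"tiger": 1.0}))
```

In this toy run the two tiger images share a cluster and the sailboat image opens its own, so a query is compared against two representatives instead of three images. That saving is the general motivation for searching summaries first, at the possible cost of missing relevant images that were assigned to a lower-ranked cluster.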
Cite this article
Shen, H.T., Tan, KL., Zhou, X. et al. ICICLE: A semantic-based retrieval system for WWW images. Multimedia Systems 11, 438–454 (2006). https://doi.org/10.1007/s00530-006-0020-6