Abstract
In this paper, we present a novel indexing technique called Multi-scale Similarity Indexing (MSI) that indexes an image’s multiple features in a single one-dimensional structure. In both the text and visual feature spaces, the similarity between a point and the center of its local partition in that space is used as the indexing key, and similarity values from different features are distinguished by different scales. A single indexing tree can then be built on these keys. Because relevant images have similar similarity values with respect to the center of the same local partition in any feature space, a certain number of irrelevant images can be pruned quickly using the triangle inequality on the indexing keys. To remove the “dimensionality curse” that affects high-dimensional structures, we propose a new technique called Local Bit Stream (LBS). LBS transforms an image’s text and visual feature representations into simple, uniform, and effective bit stream (BS) representations based on the local partition’s center. Such BS representations are small and fast to compare, since only bit operations are involved. By comparing the bits that two BSs have in common, most irrelevant images can be filtered out immediately. Our extensive experiments show that a single one-dimensional index on multiple features greatly outperforms multiple indices on multiple features, and that our LBS method outperforms sequential scan in high-dimensional spaces by an order of magnitude.
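The key construction and the triangle-inequality pruning described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: Euclidean distance stands in for the similarity measure, a sorted list searched with `bisect` stands in for the indexing tree, and the names `SCALE`, `MAX_PARTITIONS`, `make_key`, `build_index`, and `range_candidates` are hypothetical. The sketch also assumes that every distance plus the query radius stays below `SCALE`, so that key ranges of different partitions and features never overlap.

```python
import bisect
import math

SCALE = 1000.0        # assumed upper bound on any distance plus query radius
MAX_PARTITIONS = 16   # assumed maximum number of partitions per feature space


def dist(a, b):
    """Euclidean distance, used here as a stand-in for the similarity measure."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def make_key(feature, partition, d):
    """Map (feature space, partition, distance to center) to one scalar key.

    Each feature space and partition gets its own disjoint key interval of
    width SCALE, so keys from different features live at different scales.
    """
    return (feature * MAX_PARTITIONS + partition) * SCALE + d


def build_index(features, centers):
    """Build one sorted key list over all feature spaces.

    features[f][i] is image i's vector in feature space f;
    centers[f] lists the partition centers of feature space f.
    """
    entries = []
    for f, vectors in enumerate(features):
        for image_id, v in enumerate(vectors):
            # Assign the vector to its nearest partition center.
            c = min(range(len(centers[f])), key=lambda i: dist(v, centers[f][i]))
            entries.append((make_key(f, c, dist(v, centers[f][c])), image_id))
    entries.sort()
    return entries


def range_candidates(entries, centers, f, query, radius):
    """Return ids of images that may lie within `radius` of `query` in space f.

    By the triangle inequality, dist(q, p) <= radius implies
    |dist(q, center) - dist(p, center)| <= radius, so only keys inside
    [dist(q, center) - radius, dist(q, center) + radius] need to be inspected.
    """
    keys = [k for k, _ in entries]
    found = set()
    for c, center in enumerate(centers[f]):
        dq = dist(query, center)
        lo = make_key(f, c, max(dq - radius, 0.0))
        hi = make_key(f, c, dq + radius)
        start = bisect.bisect_left(keys, lo)
        stop = bisect.bisect_right(keys, hi)
        found.update(image_id for _, image_id in entries[start:stop])
    return found
```

The returned set is only a candidate set: the triangle inequality rules images out but does not confirm matches, so each candidate would still need an exact comparison against the query.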
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Shen, H.T., Zhou, X., Cui, B. (2005). Indexing Text and Visual Features for WWW Images. In: Zhang, Y., Tanaka, K., Yu, J.X., Wang, S., Li, M. (eds) Web Technologies Research and Development - APWeb 2005. APWeb 2005. Lecture Notes in Computer Science, vol 3399. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31849-1_85
DOI: https://doi.org/10.1007/978-3-540-31849-1_85
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25207-8
Online ISBN: 978-3-540-31849-1
eBook Packages: Computer Science, Computer Science (R0)