ABSTRACT
Content-based similarity search is an important task in multimedia information retrieval (IR). Metric space access methods (MAMs) can be applied to it: they rely solely on a metric distance and make no assumption about the representation of the feature objects. On the one hand, approximate MAMs have been proposed that rely on the inverted file, the de facto standard index structure for text retrieval. On the other hand, there are many exact hierarchical and multi-step MAMs.
We present IF4MI (Inverted Files for Metric Indexing), the first exact MAM based on the inverted file concept. IF4MI can outperform existing MAMs such as the M-tree and the PM-tree. In addition, the pruning power of current state-of-the-art techniques, namely the Metric Index, can be brought to inverted files without an additional mechanism that maps feature objects to one-dimensional values for storage in a data structure such as a B+-tree. IF4MI is conceptually appealing because it can draw on extensive knowledge in the field of inverted file-based indexing. As one example, we show that the efficient processing of textual filter queries, an important task in multimedia IR, is inherently supported.
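To make the general idea concrete, the following is a minimal sketch of how an inverted file could support an exact range query in a metric space. It is not the IF4MI algorithm itself, whose details the abstract does not give. It assumes that each object is posted to the list of its nearest pivot together with its distance to that pivot, and that pruning relies only on the triangle inequality. The names `euclidean`, `build_index`, and `range_query` are hypothetical, and the Euclidean distance stands in for any metric.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Example metric; any function satisfying the metric axioms would do."""
    return math.dist(a, b)


def build_index(objects, pivots, dist=euclidean):
    """Post each object to the list of its nearest pivot (assumed scheme).

    Each posting stores (object id, distance from object to that pivot).
    """
    postings = defaultdict(list)
    for obj_id, obj in enumerate(objects):
        dists = [dist(obj, p) for p in pivots]
        nearest = min(range(len(pivots)), key=dists.__getitem__)
        postings[nearest].append((obj_id, dists[nearest]))
    return postings


def range_query(q, r, objects, pivots, postings, dist=euclidean):
    """Return ids of all objects within distance r of q, plus the number
    of object distances that had to be computed."""
    q_dists = [dist(q, p) for p in pivots]
    d_min = min(q_dists)
    hits, computed = [], 0
    for i, plist in postings.items():
        # List-level pruning: an object whose nearest pivot is i can only
        # lie within r of q if d(q, p_i) <= d(q, p_j) + 2r for every pivot j.
        if q_dists[i] - d_min > 2 * r:
            continue
        for obj_id, d_op in plist:
            # Posting-level pruning: |d(q, p_i) - d(o, p_i)| is a lower
            # bound on d(q, o) by the triangle inequality.
            if abs(q_dists[i] - d_op) > r:
                continue
            computed += 1
            if dist(q, objects[obj_id]) <= r:
                hits.append(obj_id)
    return sorted(hits), computed
```

Because both pruning rules only discard objects that provably lie outside the query ball, the result matches a linear scan, while `computed` shows how many distance computations were actually needed.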
Index Terms
- Inverted file-based indexing for efficient multimedia information retrieval in metric spaces