Abstract
Searching for information on the Internet often requires users to contact several digital libraries, formulate a query representing the information of interest, and manually gather the retrieved results. However, a user may not be aware of the content of each individual library in terms of quantity, quality, information type, provenance and likely relevance, which makes effective retrieval quite difficult.
Searching distributed information in a network of libraries can be simplified by using a centralized server that acts as a gateway between the user and the distributed repositories. To accomplish this task efficiently, the centralized server must perform several major operations: resource selection, query transformation and data fusion. Resource selection is required to forward the user query only to the repositories that are likely to contain relevant documents. Query transformation translates the query into one or more formats so that each library can process it. Finally, data fusion gathers all retrieved documents and arranges them conveniently for presentation to the user.
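The three gateway operations described above can be sketched as a small pipeline. This is only an illustrative outline, not the system described in the paper: the `Library` class, the term-set descriptor, the overlap threshold and the function names are all hypothetical, and the fusion step simply sorts by raw score, which is only meaningful if the libraries' scores are already comparable.

```python
from dataclasses import dataclass
from typing import Callable, List, Set, Tuple

Hit = Tuple[str, float]  # (document id, library-specific score)


@dataclass
class Library:
    """Hypothetical stand-in for a remote repository."""
    name: str
    descriptor: Set[str]                    # assumed summary of the library content
    translate: Callable[[List[str]], str]   # query transformation for this library
    search: Callable[[str], List[Hit]]      # runs the native query


def select_resources(terms: List[str], libraries: List[Library],
                     min_overlap: int = 1) -> List[Library]:
    """Resource selection: keep libraries whose descriptor overlaps the query."""
    wanted = set(terms)
    return [lib for lib in libraries
            if len(lib.descriptor & wanted) >= min_overlap]


def gateway_search(terms: List[str],
                   libraries: List[Library]) -> List[Tuple[str, str, float]]:
    """Select resources, translate the query per library, then fuse the results."""
    merged = []
    for lib in select_resources(terms, libraries):
        native_query = lib.translate(terms)
        for doc_id, score in lib.search(native_query):
            merged.append((lib.name, doc_id, score))
    # Naive fusion: rank by raw score (assumes higher is better and comparable).
    merged.sort(key=lambda hit: hit[2], reverse=True)
    return merged
```

In practice the raw scores returned by different libraries are rarely comparable, which is why the fusion step needs some form of score normalization.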
In this paper, we introduce an original framework for collection fusion in the context of image databases. The continuous nature of the descriptors used to represent image content makes methods developed for text impractical to apply. The proposed approach splits the score normalization process into a learning phase, which takes place off-line, and a normalization phase that rearranges the scores of retrieved images at query time, using the information collected during learning. Fusion examples and results on the accuracy of the solution are reported.
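The split between an off-line learning phase and a query-time normalization phase can be illustrated with a minimal sketch. The abstract does not specify the model that is learned, so the code below assumes, purely for illustration, a per-library linear mapping fitted by least squares from sample pairs of (library score, reference score); the function names and the assumption that higher scores mean better matches are hypothetical as well.

```python
from typing import Dict, List, Tuple


def learn_mapping(pairs: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Off-line phase: fit reference ~= a * local + b by least squares."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    if var_x == 0:
        # All sampled local scores are equal: fall back to a constant mapping.
        return 0.0, mean_y
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    a = cov / var_x
    return a, mean_y - a * mean_x


def fuse(results: Dict[str, List[Tuple[str, float]]],
         mappings: Dict[str, Tuple[float, float]]) -> List[Tuple[str, str, float]]:
    """Query-time phase: normalize each library's scores, then merge and rank."""
    merged = []
    for lib, hits in results.items():
        a, b = mappings[lib]
        for image_id, local_score in hits:
            merged.append((lib, image_id, a * local_score + b))
    # Assumes higher normalized score means a better match.
    merged.sort(key=lambda hit: hit[2], reverse=True)
    return merged


# Example: library "B" scores images on a 0-100 scale, library "A" on 0-1.
mappings = {
    "A": learn_mapping([(0.0, 0.0), (1.0, 1.0)]),
    "B": learn_mapping([(0.0, 0.0), (100.0, 1.0)]),
}
ranked = fuse({"A": [("a1", 0.8), ("a2", 0.3)],
               "B": [("b1", 60.0), ("b2", 90.0)]}, mappings)
```

The point of the design is that the expensive part (collecting sample pairs and fitting the mapping) happens once per library, while the query-time step is a cheap rescaling followed by a sort.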
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Berretti, S., Del Bimbo, A., Pala, P. (2004). Collection Fusion for Distributed Image Retrieval. In: Callan, J., Crestani, F., Sanderson, M. (eds) Distributed Multimedia Information Retrieval. DIR 2003. Lecture Notes in Computer Science, vol 2924. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24610-7_6
DOI: https://doi.org/10.1007/978-3-540-24610-7_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20875-4
Online ISBN: 978-3-540-24610-7
eBook Packages: Springer Book Archive