Abstract
The distributed retrieval process is plagued by uncertainty. Sampling, selection, merging and ranking are all based on very limited information compared to centralized retrieval. In this paper, we focus on reducing the uncertainty within the resource selection phase by obtaining a number of estimates, rather than relying upon only one point estimate. We propose three methods for reducing uncertainty, which are compared against state-of-the-art baselines across three distributed retrieval testbeds. Our results show that the proposed methods significantly improve on the baselines, reduce the uncertainty, and improve the robustness of resource selection.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Markov, I., Azzopardi, L., Crestani, F. (2013). Reducing the Uncertainty in Resource Selection. In: Serdyukov, P., et al. Advances in Information Retrieval. ECIR 2013. Lecture Notes in Computer Science, vol 7814. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36973-5_43
DOI: https://doi.org/10.1007/978-3-642-36973-5_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36972-8
Online ISBN: 978-3-642-36973-5
eBook Packages: Computer Science (R0)