Probability-based fusion of information retrieval result sets

Lillis, D.; Toolan, F.; Mur, A.; Peng, L.; Collier, R.; Dunnion, J.

doi:10.1007/s10462-007-9021-x

Published: 21 August 2007

Artificial Intelligence Review, Volume 25, pages 179–191 (2006)



Abstract

Information Retrieval (IR) forms the basis of many information management tasks. Information management itself has become extremely important as the amount of electronically available information increases dramatically. There are numerous ways of performing the IR task, both by using different techniques and by using different representations of the available information. Some algorithms have been shown to outperform others on certain tasks, and combining the results produced by different algorithms has yielded superior retrieval performance, making result fusion an important research area. This paper introduces a probability-based fusion technique, probFuse, that shows initial promise in addressing this problem, and compares it with the common CombMNZ data fusion technique.
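For readers unfamiliar with the CombMNZ baseline named in the abstract, the following is a minimal sketch of it as commonly described: each document's normalised scores are summed across the input systems, and the sum is multiplied by the number of systems that returned the document. The function names, the use of min-max normalisation, and the choice to count a document as "returned" whenever it appears in a result set are assumptions made for illustration; they are not taken from this article, and the sketch does not represent probFuse.

```python
from collections import defaultdict


def min_max_normalise(scores):
    """Scale one system's scores into [0, 1] (assumed normalisation scheme)."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every document as equally strong.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def comb_mnz(result_sets):
    """Fuse several {doc_id: score} result sets with a CombMNZ-style rule.

    Returns (doc_id, fused_score) pairs, best first; ties are broken by doc_id.
    """
    total = defaultdict(float)  # sum of normalised scores per document
    hits = defaultdict(int)     # number of systems that returned the document
    for scores in result_sets:
        if not scores:
            continue  # an empty result set contributes nothing
        for doc, s in min_max_normalise(scores).items():
            total[doc] += s
            hits[doc] += 1
    fused = {doc: total[doc] * hits[doc] for doc in total}
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical result sets from two retrieval systems.
system_a = {"d1": 3.0, "d2": 2.0, "d3": 1.0}
system_b = {"d2": 10.0, "d4": 5.0}
ranking = comb_mnz([system_a, system_b])
```

In this toy input, only d2 is returned by both systems, so its summed score is doubled and it moves to the top of the fused ranking. That reward for agreement between systems is the defining feature of CombMNZ.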



Author information

Authors and Affiliations

School of Computer Science and Informatics, University College Dublin, Dublin, Ireland
D. Lillis, F. Toolan, A. Mur, L. Peng, R. Collier & J. Dunnion


Corresponding author

Correspondence to D. Lillis.


About this article

Cite this article

Lillis, D., Toolan, F., Mur, A. et al. Probability-based fusion of information retrieval result sets. Artif Intell Rev 25, 179–191 (2006). https://doi.org/10.1007/s10462-007-9021-x


Published: 21 August 2007
Issue Date: April 2006
DOI: https://doi.org/10.1007/s10462-007-9021-x
