Abstract
This paper examines the effect of correlation on data fusion of multiple retrieval results. If some of the retrieval results involved in data fusion correlate more strongly with one another than the others do, their common opinion will dominate the voting process in data fusion. This can degrade the effectiveness of data fusion in many cases, especially when the very good results are in the minority. To address this problem, we assign each result a weight derived from the correlation coefficient of that result with the other results; the linear combination method can then be used for data fusion. We report an evaluation of the proposed method on TREC 5 (ad hoc track) results. Furthermore, we explore the relationship between result correlation and data fusion through experiments, and demonstrate that such a relationship does exist.
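The general idea of down-weighting strongly correlated results before a linear combination can be sketched as follows. This is only an illustration under stated assumptions, not the weighting scheme defined in the paper: the abstract does not give the formula, so the mapping `1 - mean correlation` (clipped at zero), the use of Pearson correlation over normalized scores, and the function names `pearson`, `correlation_weights`, and `linear_combination` are all hypothetical choices made for this example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length score vectors (0.0 if either is constant)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def correlation_weights(runs):
    """Map each run name to a weight that shrinks as the run correlates with the others.

    `runs` maps a run name to a dict of {document id: normalized score}.
    ASSUMPTION: weight = max(0, 1 - mean correlation with the other runs).
    The paper's actual formula is not stated in the abstract.
    """
    docs = sorted(set().union(*(r.keys() for r in runs.values())))
    # Documents missing from a run are treated as score 0.0 (another assumption).
    vectors = {name: [r.get(d, 0.0) for d in docs] for name, r in runs.items()}
    weights = {}
    for name, vec in vectors.items():
        corrs = [pearson(vec, v) for other, v in vectors.items() if other != name]
        mean_corr = sum(corrs) / len(corrs) if corrs else 0.0
        weights[name] = max(0.0, 1.0 - mean_corr)
    return weights


def linear_combination(runs, weights):
    """Fuse runs by summing weight * score per document; return (doc, score) sorted best first."""
    fused = {}
    for name, r in runs.items():
        for doc, score in r.items():
            fused[doc] = fused.get(doc, 0.0) + weights[name] * score
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy data: runs "a" and "b" agree completely, run "c" disagrees with both.
runs = {
    "a": {"d1": 1.0, "d2": 0.5, "d3": 0.0},
    "b": {"d1": 1.0, "d2": 0.5, "d3": 0.0},
    "c": {"d1": 0.0, "d2": 0.5, "d3": 1.0},
}
weights = correlation_weights(runs)
fused = linear_combination(runs, weights)
```

In this toy setup the two agreeing runs receive a smaller weight than the dissenting run, so their shared opinion no longer dominates the fused ranking simply because it is counted twice.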
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wu, S., McClean, S. (2005). Data Fusion with Correlation Weights. In: Losada, D.E., Fernández-Luna, J.M. (eds) Advances in Information Retrieval. ECIR 2005. Lecture Notes in Computer Science, vol 3408. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31865-1_20
DOI: https://doi.org/10.1007/978-3-540-31865-1_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25295-5
Online ISBN: 978-3-540-31865-1
eBook Packages: Computer Science, Computer Science (R0)