Data Fusion with Correlation Weights

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 3408))

Abstract

This paper examines the effect of correlation on data fusion of multiple retrieval results. If some of the retrieval results involved in data fusion correlate more strongly with one another than the others do, their common opinion will dominate the voting process. This may degrade the effectiveness of data fusion in many cases, especially when the very good results are in the minority. To address this problem, we assign each result a weight derived from the correlation coefficient of that result with the other results; the linear combination method can then be used for data fusion. We report an evaluation of the proposed method using TREC 5 (ad hoc track) results. Furthermore, we explore the relationship between results correlation and data fusion experimentally and demonstrate that such a relationship does exist.
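As an illustration of the general idea only, the following minimal Python sketch weights each result by how little it correlates with the others and then fuses scores by linear combination. The abstract does not state the weighting formula, so the choice of Pearson correlation over document scores and the weight `1 - mean correlation` are assumptions made for this sketch, not the authors' method. The function names (`pearson`, `correlation_weights`, `linear_combination`) and the toy scores are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length score lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def correlation_weights(results):
    """One weight per result; results are score lists over the same documents.

    ASSUMPTION: weight = 1 - mean correlation with the other results, so
    results that echo each other are down-weighted. The paper may use a
    different formula.
    """
    weights = []
    for i, r in enumerate(results):
        others = [pearson(r, o) for j, o in enumerate(results) if j != i]
        weights.append(1.0 - sum(others) / len(others))
    return weights


def linear_combination(results, weights):
    """Fused score per document: weighted sum of each result's score."""
    n_docs = len(results[0])
    return [sum(w * r[d] for w, r in zip(weights, results)) for d in range(n_docs)]


# Toy scores (hypothetical): systems a and b agree closely, c disagrees.
a = [0.9, 0.8, 0.1, 0.2]
b = [0.8, 0.9, 0.2, 0.1]
c = [0.1, 0.3, 0.9, 0.8]
weights = correlation_weights([a, b, c])
fused = linear_combination([a, b, c], weights)
```

In this toy setup the two strongly correlated systems each receive a smaller weight than the dissenting one, which is the behaviour the abstract motivates: a correlated majority no longer dominates the vote by default. The sketch also assumes that scores are already normalised to a comparable range.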


Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wu, S., McClean, S. (2005). Data Fusion with Correlation Weights. In: Losada, D.E., Fernández-Luna, J.M. (eds) Advances in Information Retrieval. ECIR 2005. Lecture Notes in Computer Science, vol 3408. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31865-1_20

  • DOI: https://doi.org/10.1007/978-3-540-31865-1_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25295-5

  • Online ISBN: 978-3-540-31865-1

  • eBook Packages: Computer Science, Computer Science (R0)
