Inexpensive and Effective Data Fusion Methods with Performance Weights

  • Conference paper
Information Integration and Web Intelligence (iiWAS 2022)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13635))

Abstract

Data fusion has been widely used in information retrieval for various tasks. Two factors have been found to affect fusion performance significantly: the performance of the component systems and the dissimilarity among them. This leads to a classification of data fusion methods into four categories, depending on which factors are considered, and methods in different categories suit different situations. In this work, we consider data fusion methods with performance weighting. Both proposed methods assign weights based on the performance of each retrieval system as measured by P@10, whereas previous studies used MAP values for such weighting. Our experiment shows that the proposed methods are slightly more effective than the baseline methods in the same category. We also provide some analysis to justify the rationale for the proposed weighting scheme. Because P@10 requires much less human judgment effort than system-oriented measures such as MAP, the proposed methods have higher applicability than all the other methods involved.
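To make the idea of performance weighting concrete, the following is a minimal sketch of a weighted linear combination in which each component system's weight is its P@10 on a training query. It is an illustration under stated assumptions, not the weighting scheme proposed in the paper, whose exact form is not given in the abstract. The function names, the min-max score normalization, and the toy data are all hypothetical.

```python
from collections import defaultdict


def precision_at_k(ranking, relevant, k=10):
    """Fraction of the top-k ranked documents that are judged relevant.

    Only the top k documents need relevance judgments, which is why
    P@k is cheaper to obtain than measures computed over the full list.
    """
    return sum(1 for doc in ranking[:k] if doc in relevant) / k


def min_max_normalize(scores):
    """Map raw scores to [0, 1] (assumed normalization; others are possible)."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def weighted_fusion(runs, weights):
    """Linear combination: fused score = sum of weight * normalized score."""
    fused = defaultdict(float)
    for name, scores in runs.items():
        for doc, s in min_max_normalize(scores).items():
            fused[doc] += weights[name] * s
    # Sort by descending fused score; break ties by document id.
    return sorted(fused, key=lambda d: (-fused[d], d))


# Hypothetical training data: one ranked list per system plus judgments.
train_rankings = {
    "A": ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8", "d9", "d10"],
    "B": ["d9", "d11", "d12", "d13", "d14", "d15", "d16", "d17", "d18", "d1"],
}
train_relevant = {"d1", "d2", "d3", "d5", "d7", "d9"}

# Performance weights taken directly from P@10 (an assumption of this sketch).
weights = {
    name: precision_at_k(ranking, train_relevant)
    for name, ranking in train_rankings.items()
}

# Hypothetical scored results for a new query.
runs = {
    "A": {"d1": 3.0, "d2": 2.0, "d3": 1.0},
    "B": {"d3": 0.9, "d4": 0.5, "d1": 0.1},
}
fused_ranking = weighted_fusion(runs, weights)
```

In this toy setting, system A places six relevant documents in its top 10 and system B two, so A's scores count three times as much as B's in the fused ranking. Using raw P@10 as the weight is only one option; any monotone function of the measured performance could be substituted.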

Notes

  1. TREC (Text REtrieval Conference) is an annual event on information retrieval evaluation. Its web site is located at https://trec.nist.gov/.

Author information

Correspondence to Shengli Wu.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Xu, Q., Huang, Y., Wu, S. (2022). Inexpensive and Effective Data Fusion Methods with Performance Weights. In: Pardede, E., Delir Haghighi, P., Khalil, I., Kotsis, G. (eds) Information Integration and Web Intelligence. iiWAS 2022. Lecture Notes in Computer Science, vol 13635. Springer, Cham. https://doi.org/10.1007/978-3-031-21047-1_30

  • DOI: https://doi.org/10.1007/978-3-031-21047-1_30

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-21046-4

  • Online ISBN: 978-3-031-21047-1

  • eBook Packages: Computer Science, Computer Science (R0)