Integration of data fusion and reinforcement learning techniques for the rank-aggregation problem

  • Original Article
  • International Journal of Machine Learning and Cybernetics

Abstract

Rank aggregation, the task of combining multiple ranked lists, is at the heart of meta-search engines in web information retrieval. In this paper, a novel rank-aggregation method is proposed that uses both data fusion operators and reinforcement learning algorithms. This integration exploits the compactness property of data fusion methods together with the exploration and exploitation capabilities of reinforcement learning techniques. The proposed algorithm is a two-step process. In the first step, the ranked lists of local rankers are combined based on their mean average precisions using a variety of data fusion operators, such as optimistic and pessimistic ordered weighted averaging (OWA) operators. This aggregation provides a compact representation of the benchmark dataset used. In the second step, a Markov decision process (MDP) model is defined for the aggregated data. This MDP makes it possible to apply reinforcement learning techniques such as Q-learning and SARSA to learn the best ranking. Experiments on the LETOR4.0 benchmark dataset demonstrate that the proposed method outperforms baseline rank-aggregation methods, such as Borda Count and the family of coset-permutation distance based stage-wise (CPS) rank-aggregation methods, on the P@n and NDCG@n evaluation criteria. The improvement is especially noticeable at the higher ranks of the final ranked list, which usually attract the most attention from Web users.
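To make the first step more concrete, the following is a minimal sketch of an OWA operator applied to the scores that several local rankers assign to one document. It is an illustration only, not the authors' implementation: the abstract does not specify the weight families or how mean average precision enters the aggregation, so the sketch assumes the exponential optimistic and pessimistic weights commonly attributed to Filev and Yager, and the names `owa`, `optimistic_weights`, `pessimistic_weights` and the parameter `alpha` are hypothetical.

```python
from typing import List, Sequence


def owa(values: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered weighted average: weights apply to values sorted in descending order."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


def optimistic_weights(n: int, alpha: float) -> List[float]:
    """Assumed exponential optimistic family:
    w_1 = alpha, w_i = alpha * (1 - alpha)^(i - 1), w_n = (1 - alpha)^(n - 1)."""
    weights = [alpha * (1 - alpha) ** i for i in range(n - 1)]
    weights.append((1 - alpha) ** (n - 1))
    return weights


def pessimistic_weights(n: int, alpha: float) -> List[float]:
    """Assumed exponential pessimistic family:
    w_1 = alpha^(n - 1), w_i = (1 - alpha) * alpha^(n - i) for i = 2..n."""
    weights = [alpha ** (n - 1)]
    weights += [(1 - alpha) * alpha ** (n - i) for i in range(2, n + 1)]
    return weights


# Hypothetical scores given to one document by three local rankers.
scores = [0.2, 0.9, 0.5]
alpha = 0.8  # assumed degree of optimism, chosen only for illustration

optimistic_score = owa(scores, optimistic_weights(len(scores), alpha))
pessimistic_score = owa(scores, pessimistic_weights(len(scores), alpha))
print(optimistic_score, pessimistic_score)
```

Both weight vectors sum to one, so each aggregated score lies between the smallest and largest input score. For the same `alpha`, the optimistic weights place more mass on the largest scores than the pessimistic ones do.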

Acknowledgments

This work is supported by the University of Tehran (Grant Number 8101004/1/02). The authors thank the Editor-in-Chief and four anonymous reviewers for their comments and suggestions, which were very helpful in improving the paper. We also give special thanks to Ms. Maryam Piroozmand and Dr. Kambiz Badie for their help and inspiring discussions.

Author information

Corresponding author

Correspondence to Amir Hosein Keyhanipour.

Cite this article

Keyhanipour, A.H., Moshiri, B., Rahgozar, M. et al. Integration of data fusion and reinforcement learning techniques for the rank-aggregation problem. Int. J. Mach. Learn. & Cyber. 7, 1131–1145 (2016). https://doi.org/10.1007/s13042-015-0442-6

  • DOI: https://doi.org/10.1007/s13042-015-0442-6
