Abstract
Rank-aggregation, the task of combining multiple ranked lists, is at the heart of meta-search engines in web information retrieval. In this paper, a novel rank-aggregation method is proposed that utilizes both data fusion operators and reinforcement learning algorithms. This integration lets us exploit the compactness property of data fusion methods as well as the exploration and exploitation capabilities of reinforcement learning techniques. The proposed algorithm is a two-step process. In the first step, the ranked lists of local rankers are combined based on their mean average precisions using a variety of data fusion operators, such as optimistic and pessimistic ordered weighted averaging (OWA) operators. This aggregation provides a compact representation of the benchmark dataset used. In the second step, a Markov decision process (MDP) model is defined for the aggregated data. This MDP enables us to apply reinforcement learning techniques such as Q-learning and SARSA to learn the best ranking. Experiments on the LETOR4.0 benchmark dataset demonstrate that the proposed method outperforms baseline rank-aggregation methods, such as Borda Count and the family of coset-permutation distance based stage-wise (CPS) rank-aggregation methods, on the P@n and NDCG@n evaluation criteria. The improvement is most noticeable at the higher ranks of the final ranked list, which are usually the most attractive to Web users.
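To make the first step more concrete, the following is a minimal Python sketch of OWA aggregation over per-document scores. It is an illustration under stated assumptions, not the authors' implementation: the function names, the parameter `alpha`, and the example scores are hypothetical, the weights follow the commonly cited exponential optimistic and pessimistic OWA families, and the sketch does not model how the paper uses the rankers' mean average precisions.

```python
from typing import List, Sequence


def optimistic_weights(n: int, alpha: float) -> List[float]:
    """Exponential 'optimistic' OWA weights (assumed family, 0 <= alpha <= 1).

    w_1 = alpha, w_2 = alpha(1-alpha), ..., w_n = (1-alpha)^(n-1).
    A large alpha puts most of the weight on the largest inputs.
    """
    if n == 1:
        return [1.0]
    weights = [alpha * (1 - alpha) ** i for i in range(n - 1)]
    weights.append((1 - alpha) ** (n - 1))
    return weights


def pessimistic_weights(n: int, alpha: float) -> List[float]:
    """Exponential 'pessimistic' OWA weights (assumed family, 0 <= alpha <= 1).

    w_1 = alpha^(n-1), w_i = (1-alpha) * alpha^(n-i) for i = 2..n.
    A small alpha puts most of the weight on the smallest inputs.
    """
    if n == 1:
        return [1.0]
    weights = [alpha ** (n - 1)]
    weights += [(1 - alpha) * alpha ** (n - 1 - i) for i in range(1, n)]
    return weights


def owa(values: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered weighted average: weights apply to values sorted in descending order."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


def aggregate(scores_per_doc: Sequence[Sequence[float]], weights: Sequence[float]) -> List[int]:
    """Rank documents (by index) by the OWA of their scores, best first."""
    fused = [owa(scores, weights) for scores in scores_per_doc]
    return sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)


if __name__ == "__main__":
    # Hypothetical scores given to three documents by three local rankers.
    scores = [[0.2, 0.9, 0.5], [0.6, 0.6, 0.6], [0.1, 0.3, 0.95]]
    print(aggregate(scores, optimistic_weights(3, 0.8)))
    print(aggregate(scores, pessimistic_weights(3, 0.2)))
```

In this sketch, each document's scores from the local rankers are fused into a single value, so many ranked lists are reduced to one compact score per document. The optimistic weights favor a document's best scores, while the pessimistic weights favor its worst ones.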
Acknowledgments
This work is supported by the University of Tehran (Grant Number 8101004/1/02). The authors thank the Editor-in-Chief and four anonymous reviewers for their comments and suggestions, which were very helpful in improving the paper. We also give special thanks to Ms. Maryam Piroozmand and Dr. Kambiz Badie for their help and inspiring discussions.
Cite this article
Keyhanipour, A.H., Moshiri, B., Rahgozar, M. et al. Integration of data fusion and reinforcement learning techniques for the rank-aggregation problem. Int. J. Mach. Learn. & Cyber. 7, 1131–1145 (2016). https://doi.org/10.1007/s13042-015-0442-6
DOI: https://doi.org/10.1007/s13042-015-0442-6