ABSTRACT
Implicit feedback (e.g., user clicks) is an important source of data for modern search engines. While heavily biased [8, 9, 11, 27], it is cheap to collect and particularly useful for user-centric retrieval applications such as search ranking. To develop an unbiased learning-to-rank system with biased feedback, previous studies have focused on constructing probabilistic graphical models (e.g., click models) with user behavior hypothesis to extract and train ranking systems with unbiased relevance signals. Recently, a novel counterfactual learning framework that estimates and adopts examination propensity for unbiased learning to rank has attracted much attention. Despite its popularity, there is no systematic comparison of the unbiased learning-to-rank frameworks based on counterfactual learning and graphical models. In this tutorial, we aim to provide an overview of the fundamental mechanism for unbiased learning to rank. We will describe the theory behind existing frameworks, and give detailed instructions on how to conduct unbiased learning to rank in practice.
- Qingyao Ai, Liu Yang, Jiafeng Guo, and W. Bruce Croft. 2016. Analysis of the paragraph vector model for information retrieval. In Proceedings of the 2rd ACM ICTIR. ACM, 133--142. Google ScholarDigital Library
- Olivier Chapelle, Thorsten Joachims, Filip Radlinski, and Yisong Yue. 2012. Large-scale validation and analysis of interleaved search evaluation. ACM Transactions on Information Systems, Vol. 30, 1 (2012), 6. Google ScholarDigital Library
- Olivier Chapelle and Ya Zhang. 2009. A dynamic bayesian network click model for web search ranking. In Proceedings of the 18th WWW. ACM, 1--10. Google ScholarDigital Library
- Mostafa Dehghani, Hamed Zamani, Aliaksei Severyn, Jaap Kamps, and W. Bruce Croft. 2017. Neural Ranking Models with Weak Supervision. In Proceedings of the 40th ACM SIGIR (SIGIR '17). ACM, 65--74. Google ScholarDigital Library
- Anhai Doan, Raghu Ramakrishnan, and Alon Y. Halevy. 2011. Crowdsourcing systems on the world-wide web. Commun. ACM, Vol. 54, 4 (2011), 86--96. Google ScholarDigital Library
- Georges E. Dupret and Benjamin Piwowarski. 2008. A user browsing model to predict search engine click data from past observations. In Proceedings of the 31st ACM SIGIR. ACM, 331--338. Google ScholarDigital Library
- Jiafeng Guo, Yixing Fan, Qingyao Ai, and W. Bruce Croft. 2016. A deep relevance matching model for ad-hoc retrieval. In Proceedings of the 25th ACM CIKM. ACM, 55--64. Google ScholarDigital Library
- Thorsten Joachims, Laura Granka, Bing Pan, Helene Hembrooke, and Geri Gay. 2005. Accurately interpreting clickthrough data as implicit feedback. In Proceedings of the 28th annual ACM SIGIR. Acm, 154--161. Google ScholarDigital Library
- Thorsten Joachims, Laura Granka, Bing Pan, Helene Hembrooke, Filip Radlinski, and Geri Gay. 2007. Evaluating the accuracy of implicit feedback from clicks and query reformulations in web search. ACM Transactions on Information Systems (TOIS), Vol. 25, 2 (2007), 7. Google ScholarDigital Library
- Thorsten Joachims, Adith Swaminathan, and Tobias Schnabel. 2017. Unbiased learning-to-rank with biased feedback. In Proceedings of the 10th ACM WSDM. ACM, 781--789. Google ScholarDigital Library
- Mark T. Keane and Maeve O'Brien. 2006. Modeling Result-List Searching in the World Wide Web: The Role of Relevance Topologies and Trust Bias. In Proceedings of the Cognitive Science Society, Vol. 28.Google Scholar
- Aniket Kittur, Ed H. Chi, and Bongwon Suh. 2008. Crowdsourcing user studies with Mechanical Turk. In Proceedings of the SIGCHI. ACM, 453--456. Google ScholarDigital Library
- Tie-Yan Liu. 2009. Learning to rank for information retrieval. Foundations and Trends in Information Retrieval, Vol. 3, 3 (2009), 225--331. Google ScholarDigital Library
- Cheng Luo, Yukun Zheng, Jiaxin Mao, Yiqun Liu, Min Zhang, and Shaoping Ma. 2017. Training deep ranking model with weak relevance labels. In Australasian Database Conference. Springer, 205--216.Google ScholarCross Ref
- Bhaskar Mitra, Fernando Diaz, and Nick Craswell. 2017. Learning to Match Using Local and Distributed Representations of Text for Web Search. In Proceedings of the 26th WWW (WWW '17). International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland, 1291--1299. Google ScholarDigital Library
- Karthik Raman and Thorsten Joachims. 2013. Learning socially optimal information systems from egoistic users. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 128--144.Google ScholarCross Ref
- Paul R. Rosenbaum and Donald B. Rubin. 1983. The central role of the propensity score in observational studies for causal effects. Biometrika, Vol. 70, 1 (1983), 41--55.Google ScholarCross Ref
- Anne Schuth, Harrie Oosterhuis, Shimon Whiteson, and Maarten de Rijke. 2016. Multileave gradient descent for fast online learning to rank. In Proceedings of the 9th ACM WSDM. ACM, 457--466. Google ScholarDigital Library
- Adith Swaminathan and Thorsten Joachims. 2015. Batch learning from logged bandit feedback through counterfactual risk minimization. Journal of Machine Learning Research, Vol. 16 (2015), 1731--1755. Google ScholarDigital Library
- Adith Swaminathan and Thorsten Joachims. 2015. Counterfactual risk minimization: Learning from logged bandit feedback. In ICML. 814--823. Google ScholarDigital Library
- Chao Wang, Yiqun Liu, Meng Wang, Ke Zhou, Jian-yun Nie, and Shaoping Ma. 2015. Incorporating non-sequential behavior into click models. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 283--292. Google ScholarDigital Library
- Hongning Wang, ChengXiang Zhai, Anlei Dong, and Yi Chang. 2013. Content-aware click modeling. In Proceedings of the 22nd international conference on World Wide Web. ACM, 1365--1376. Google ScholarDigital Library
- Xuanhui Wang, Michael Bendersky, Donald Metzler, and Marc Najork. 2016. Learning to rank with selection bias in personal search. In Proceedings of the 39th ACM SIGIR. ACM, 115--124. Google ScholarDigital Library
- Xuanhui Wang, Nadav Golbandi, Michael Bendersky, Donald Metzler, and Marc Najork. 2018. Position Bias Estimation for Unbiased Learning to Rank in Personal Search. In Proceedings of the 11th ACM WSDM (WSDM '18). ACM, New York, NY, USA, 610--618. Google ScholarDigital Library
- Wanhong Xu, Eren Manavoglu, and Erick Cantu-Paz. 2010. Temporal click model for sponsored search. In Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval. ACM, 106--113. Google ScholarDigital Library
- Yisong Yue and Thorsten Joachims. 2009. Interactively optimizing information retrieval systems as a dueling bandits problem. In Proceedings of the 26th ICML. ACM, 1201--1208. Google ScholarDigital Library
- Yisong Yue, Rajan Patel, and Hein Roehrig. 2010. Beyond position bias: Examining result attractiveness as a source of presentation bias in clickthrough data. In Proceedings of the 19th WWW. ACM, 1011--1018. Google ScholarDigital Library
Index Terms
- Unbiased Learning to Rank: Theory and Practice
Recommendations
Unbiased LambdaMART: An Unbiased Pairwise Learning-to-Rank Algorithm
WWW '19: The World Wide Web ConferenceRecently a number of algorithms under the theme of 'unbiased learning-to-rank' have been proposed, which can reduce position bias, the major type of bias in click data, and train a high-performance ranker with click data. Most of the existing algorithms,...
Unbiased Learning to Rank: Theory and Practice
ICTIR '18: Proceedings of the 2018 ACM SIGIR International Conference on Theory of Information RetrievalImplicit user feedback (such as clicks and dwell time) is an important source of data for modern search engines. While heavily biased~\citejoachims2005accurately,keane2006modeling,joachims2007evaluating,yue2010beyond, it is cheap to collect and ...
Whole Page Unbiased Learning to Rank
WWW '24: Proceedings of the ACM on Web Conference 2024The page presentation biases in the information retrieval system, especially on the click behavior, is a well-known challenge that hinders improving ranking models' performance with implicit user feedback. Unbiased Learning to Rank~(ULTR) algorithms are ...
Comments