ABSTRACT
In feeds recommendation, users can continuously browse items generated by never-ending feeds on mobile phones. Implicit feedback from users is an important resource for learning to rank; however, ranking functions built from such observed data are known to be biased. The presentation of the items influences the user's judgements and therefore introduces biases. Most previous work in the unbiased learning to rank literature focuses on position bias (i.e., an item ranked higher has more chances of being examined and interacted with). By analyzing user behaviors in product feeds recommendation, in this paper we identify context bias, which refers to the probability of a user interacting with an item being biased by the item's surroundings, and introduce it to unbiased learning to rank. We propose an Unbiased Learning to Rank with Combinational Propensity (ULTR-CP) framework to remove the inherent biases jointly caused by multiple factors. Under this framework, a context-aware position bias model is instantiated to estimate the unified bias, considering both position and context biases. In addition to evaluating propensity score estimation approaches by ranking metrics, we also discuss evaluating the propensities directly by checking their balancing properties. Extensive experiments performed on a real e-commerce data set collected from JD.com verify the effectiveness of modeling context bias and illustrate the superiority of ULTR-CP over state-of-the-art methods.
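To make the idea of a unified propensity concrete, the following is a minimal sketch of inverse propensity weighting with a combined position and context propensity. It is not the ULTR-CP model itself: the multiplicative combination, the clipping threshold, and the names `combined_propensity`, `ipw_click_weights`, `position_prop`, and `context_factor` are all assumptions introduced for illustration, and the abstract does not specify how the two biases are combined.

```python
from typing import List, Sequence


def combined_propensity(position_prop: Sequence[float],
                        context_factor: Sequence[float]) -> List[float]:
    """Combine a per-item position propensity with a context factor.

    ASSUMPTION: the two factors are simply multiplied and clamped to [0, 1].
    This is an illustrative choice, not the combination used by ULTR-CP.
    """
    if len(position_prop) != len(context_factor):
        raise ValueError("inputs must have the same length")
    return [min(1.0, max(0.0, p * f))
            for p, f in zip(position_prop, context_factor)]


def ipw_click_weights(clicks: Sequence[int],
                      propensities: Sequence[float],
                      clip: float = 0.05) -> List[float]:
    """Return inverse-propensity weights for the observed clicks.

    Each click is divided by its (clipped) propensity, so clicks on items
    that were unlikely to be examined count more. Non-clicked items get 0.
    The clip value is a hypothetical default to bound the variance.
    """
    if len(clicks) != len(propensities):
        raise ValueError("inputs must have the same length")
    return [c / max(p, clip) for c, p in zip(clicks, propensities)]


# Hypothetical feed of three items: lower positions are examined less often,
# and the context factor further scales the chance of interaction.
clicks = [1, 0, 1]
position_prop = [0.8, 0.5, 0.25]
context_factor = [1.0, 0.8, 0.5]

propensities = combined_propensity(position_prop, context_factor)
weights = ipw_click_weights(clicks, propensities)
```

In this sketch the click on the third item receives a much larger weight than the click on the first, because its combined propensity is far lower. A ranking loss would then multiply each clicked item's loss term by its weight.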