Abstract
Recommendation systems increasingly emphasize recommending potentially novel and interesting items in addition to items already confirmed to be attractive. In this paper, we propose a contextual bandit algorithm for web page recommendation in the dependent click model (DCM), which takes user and web page features into consideration and automatically balances exploration and exploitation. In addition, unlike many previous contextual bandit algorithms, which assume that the click-through rate is a linear function of the features, we enhance the model's representational power by adopting generalized linear models, which include both linear and logistic regression and have exhibited stronger performance in many binary-reward applications. We prove an upper bound of \(\tilde{O}(d\sqrt{n})\) on the regret of the proposed algorithm. Experiments on both synthetic and real-world data demonstrate significant advantages of our algorithm.
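To make the two ingredients named in the abstract concrete (a generalized linear click model and an exploration bonus), the following is a minimal sketch of a generic upper-confidence-bound scoring rule with a logistic link. It is not the paper's algorithm: the function names, the exploration weight `beta`, the inverse design matrix `v_inv`, and the rule of ranking pages by optimistic score are all assumptions made for illustration, and the position-dependent stopping behaviour of the DCM is not modelled here.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    """Logistic link, one member of the generalized linear model family."""
    return 1.0 / (1.0 + math.exp(-z))


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def ucb_score(
    x: Sequence[float],
    theta_hat: Sequence[float],
    v_inv: Sequence[Sequence[float]],
    beta: float,
) -> float:
    """Optimistic click-probability estimate for one page (hypothetical helper).

    x         : feature vector of the (user, page) pair
    theta_hat : current parameter estimate (assumed given)
    v_inv     : inverse of the regularized design matrix (assumed given)
    beta      : exploration weight (assumed; larger means more exploration)
    """
    mean = dot(theta_hat, x)
    v_inv_x = [dot(row, x) for row in v_inv]
    # Confidence width sqrt(x^T V^{-1} x); clamp guards against tiny negatives.
    width = math.sqrt(max(dot(x, v_inv_x), 0.0))
    return sigmoid(mean + beta * width)


def recommend(
    items: Sequence[Sequence[float]],
    theta_hat: Sequence[float],
    v_inv: Sequence[Sequence[float]],
    beta: float,
    k: int,
) -> List[int]:
    """Return the indices of the k pages with the highest optimistic scores."""
    scores = [ucb_score(x, theta_hat, v_inv, beta) for x in items]
    ranked = sorted(range(len(items)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]
```

Setting `beta = 0` reduces the rule to pure exploitation of the current estimate, while a larger `beta` favours pages whose features lie in directions the model has observed less often.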
Notes
1. Here and throughout the paper, we use bold letters for random variables.
Acknowledgment
This work is sponsored by the Huawei Innovation Research Program.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Liu, W., Li, S., Zhang, S. (2018). Contextual Dependent Click Bandit Algorithm for Web Recommendation. In: Wang, L., Zhu, D. (eds) Computing and Combinatorics. COCOON 2018. Lecture Notes in Computer Science, vol 10976. Springer, Cham. https://doi.org/10.1007/978-3-319-94776-1_4
DOI: https://doi.org/10.1007/978-3-319-94776-1_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-94775-4
Online ISBN: 978-3-319-94776-1
eBook Packages: Computer Science, Computer Science (R0)