research-article

A Bandit-Based Ensemble Framework for Exploration/Exploitation of Diverse Recommendation Components: An Experimental Study within E-Commerce

Published: 09 August 2019

Abstract

This work presents an extension of the Thompson Sampling bandit policy for orchestrating a collection of base recommendation algorithms in e-commerce. We focus on the problem of item-to-item recommendations, for which multiple behavioral and attribute-based predictors are provided to an ensemble learner. In addition, we detail the construction of a personalized predictor based on k-Nearest Neighbors (kNN), with temporal decay and event weighting. We show how to adapt Thompson Sampling to realistic settings where neither action availability nor reward stationarity is guaranteed. Furthermore, we investigate the effects of priming the sampler with pre-set parameters of the reward probability distributions, derived from the product catalog and/or event history when such information is available. We report experimental results based on the analysis of three real-world e-commerce datasets.
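The core mechanism the abstract describes, Thompson Sampling over a pool of base recommenders with discounting for non-stationary rewards, restriction to currently available arms, and optionally primed Beta priors, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation; the class name, the `gamma` discount factor, and the priming interface are all hypothetical.

```python
import random

class DiscountedThompsonSampler:
    """Thompson Sampling over base recommender 'arms' (e.g., kNN,
    popularity, attribute-based), with exponential discounting of old
    evidence for non-stationary rewards and optionally primed priors.
    A sketch, not the authors' implementation."""

    def __init__(self, arms, gamma=0.99, primed=None):
        self.gamma = gamma  # discount factor applied to past evidence
        # counts: arm -> (successes, failures); optionally primed from
        # catalog/event-history statistics, else start at (0, 0)
        self.counts = {a: (primed or {}).get(a, (0.0, 0.0)) for a in arms}

    def select(self, available):
        """Sample a reward estimate for each *currently available* arm
        from Beta(successes + 1, failures + 1); play the argmax."""
        draws = {a: random.betavariate(self.counts[a][0] + 1.0,
                                       self.counts[a][1] + 1.0)
                 for a in available}
        return max(draws, key=draws.get)

    def update(self, arm, reward):
        """Discount every arm's evidence, then credit the played arm
        with a Bernoulli reward (1 = click/purchase, 0 = no action)."""
        for a, (s, f) in self.counts.items():
            self.counts[a] = (self.gamma * s, self.gamma * f)
        s, f = self.counts[arm]
        self.counts[arm] = (s + reward, f + (1.0 - reward))
```

Restricting `select` to an `available` subset handles "sleeping" arms whose recommendations cannot always be produced, and the discount in `update` keeps the posterior responsive when reward probabilities drift over time.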



• Published in

ACM Transactions on Interactive Intelligent Systems, Volume 10, Issue 1: Special Issue on IUI 2018
March 2020, 347 pages
ISSN: 2160-6455
EISSN: 2160-6463
DOI: 10.1145/3352585

                  Copyright © 2019 ACM


Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Received: 1 May 2018
• Revised: 1 August 2018
• Accepted: 1 September 2018
• Published: 9 August 2019
