Abstract
This work presents an extension of the Thompson Sampling bandit policy for orchestrating a collection of base recommendation algorithms for e-commerce. We focus on the problem of item-to-item recommendations, for which multiple behavioral and attribute-based predictors are provided to an ensemble learner. In addition, we detail the construction of a personalized predictor based on k-Nearest Neighbors (kNN), with temporal decay capabilities and event weighting. We show how to adapt Thompson Sampling to realistic situations in which neither action availability nor reward stationarity is guaranteed. Furthermore, we investigate the effects of priming the sampler with pre-set parameters of the reward probability distributions, derived from the product catalog and/or event history when such information is available. We report our experimental results based on the analysis of three real-world e-commerce datasets.
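The mechanism the abstract describes can be sketched in a few lines: a Beta-Bernoulli Thompson Sampler that treats each base recommender as an arm, supports priming with pre-set prior parameters, restricts sampling to the currently available arms, and discounts past evidence to cope with non-stationary rewards. This is a minimal illustrative sketch, not the paper's implementation; the class name, the `priors` argument, and the geometric `discount` scheme are assumptions made here for exposition.

```python
import random

class ThompsonSamplingEnsemble:
    """Minimal Beta-Bernoulli Thompson Sampling over base recommenders (illustrative)."""

    def __init__(self, n_arms, priors=None, discount=1.0):
        # priors: optional list of (alpha, beta) pairs used to prime the sampler,
        # e.g. estimated from the product catalog or event history.
        # discount < 1.0 geometrically decays past evidence (non-stationary rewards).
        self.params = list(priors) if priors else [(1.0, 1.0)] * n_arms
        self.discount = discount

    def select(self, available=None):
        # Sample a reward estimate for each *available* arm; play the argmax.
        arms = available if available is not None else range(len(self.params))
        return max(arms, key=lambda k: random.betavariate(*self.params[k]))

    def update(self, arm, reward):
        # Decay all posteriors, then credit the played arm with a 0/1 reward.
        self.params = [(a * self.discount, b * self.discount) for a, b in self.params]
        a, b = self.params[arm]
        self.params[arm] = (a + reward, b + (1 - reward))
```

For example, priming with `priors=[(5, 1), (1, 5)]` biases early selections toward the first arm until observed rewards accumulate, while `discount=0.9` lets the sampler track a drifting click-through rate instead of averaging over its whole history.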
Index Terms
- A Bandit-Based Ensemble Framework for Exploration/Exploitation of Diverse Recommendation Components: An Experimental Study within E-Commerce