research-article

Contextual Bandits in a Collaborative Environment

Authors:

Hongning Wang

SIGIR '16: Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval

Pages 529 - 538

https://doi.org/10.1145/2911451.2911528

Published: 07 July 2016

Abstract

Contextual bandit algorithms provide principled online learning solutions for finding optimal trade-offs between exploration and exploitation with companion side-information. They have been used extensively in many important practical scenarios, such as display advertising and content recommendation. A common practice is to estimate the unknown bandit parameters of each user independently. Unfortunately, this ignores dependencies among users and thus leads to suboptimal solutions, especially in applications with strong social components.
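To make the independent-estimation baseline concrete, the following is a minimal sketch of a per-user linear UCB learner. The abstract does not name a specific baseline, so the LinUCB-style ridge regression and the names `IndependentLinUCB`, `alpha`, and `lam` are assumptions chosen for illustration only.

```python
import numpy as np


class IndependentLinUCB:
    """Illustrative sketch: one ridge-regression estimate per user, nothing shared."""

    def __init__(self, n_users, dim, alpha=1.0, lam=1.0):
        self.alpha = alpha  # exploration weight (hypothetical parameter)
        # Per-user statistics: A_u = lam * I + sum(x x^T), b_u = sum(r * x).
        self.A = np.stack([lam * np.eye(dim) for _ in range(n_users)])
        self.b = np.zeros((n_users, dim))

    def select(self, user, contexts):
        """Return the index of the arm with the highest upper confidence bound."""
        A_inv = np.linalg.inv(self.A[user])
        theta = A_inv @ self.b[user]  # ridge estimate for this user only
        mean = contexts @ theta
        # Confidence width sqrt(x^T A^{-1} x) for every candidate arm.
        width = np.sqrt(np.einsum("ij,jk,ik->i", contexts, A_inv, contexts))
        return int(np.argmax(mean + self.alpha * width))

    def update(self, user, x, reward):
        """Only the acting user's statistics change."""
        self.A[user] += np.outer(x, x)
        self.b[user] += reward * x
```

In this sketch, feedback observed for one user never touches another user's estimate, which is the independence the abstract describes as a limitation.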

In this paper, we develop a collaborative contextual bandit algorithm in which the adjacency graph among users is leveraged to share context and payoffs among neighboring users during online updating. We rigorously prove an improved regret upper bound for the proposed collaborative bandit algorithm compared to conventional independent bandit algorithms. Extensive experiments on synthetic data and three large-scale real-world datasets verify the improvement of our proposed algorithm over several state-of-the-art contextual bandit algorithms.
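The idea of sharing context and payoffs along a user graph can be illustrated with a small sketch. This is not the algorithm proposed in the paper, whose details are not given in the abstract; it is a simplified weighted-ridge variant in which each observation also updates the statistics of the acting user's neighbors. The class name `NeighborSharingLinUCB`, the weight matrix `W`, and the convention that `W[u, v]` scales how strongly user `v` absorbs user `u`'s feedback are all assumptions.

```python
import numpy as np


class NeighborSharingLinUCB:
    """Illustrative sketch only: observations are shared with graph neighbors."""

    def __init__(self, W, dim, alpha=1.0, lam=1.0):
        # W[u, v] >= 0: weight with which user v absorbs user u's feedback
        # (hypothetical convention; W[u, u] is the weight on u's own data).
        self.W = np.asarray(W, dtype=float)
        n_users = self.W.shape[0]
        self.alpha = alpha
        self.A = np.stack([lam * np.eye(dim) for _ in range(n_users)])
        self.b = np.zeros((n_users, dim))

    def select(self, user, contexts):
        """Pick the arm with the highest upper confidence bound for this user."""
        A_inv = np.linalg.inv(self.A[user])
        theta = A_inv @ self.b[user]
        width = np.sqrt(np.einsum("ij,jk,ik->i", contexts, A_inv, contexts))
        return int(np.argmax(contexts @ theta + self.alpha * width))

    def update(self, user, x, reward):
        """Weighted ridge update applied to every user, scaled by W[user, v]."""
        w = self.W[user]
        self.A += w[:, None, None] * np.outer(x, x)
        self.b += w[:, None] * reward * x
```

With a weight matrix whose off-diagonal entries are zero, the sketch reduces to independent per-user learners; nonzero off-diagonal weights let a user with little feedback benefit from neighbors' observations.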

Published In

SIGIR '16: Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval

July 2016

1296 pages

ISBN: 9781450340694

DOI: 10.1145/2911451

General Chairs: Raffaele Perego (ISTI-CNR, Italy), Fabrizio Sebastiani (Qatar Computing Research Institute, HBKU, Qatar)

Program Chairs: Javed Aslam (Northeastern University, US), Ian Ruthven (University of Strathclyde, UK), Justin Zobel (University of Melbourne, Australia)

Copyright © 2016 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

National Science Foundation

Conference

SIGIR '16

Sponsor:

SIGIR

SIGIR '16: The 39th International ACM SIGIR conference on research and development in Information Retrieval

July 17 - 21, 2016

Pisa, Italy

Acceptance Rates

SIGIR '16 Paper Acceptance Rate: 62 of 341 submissions, 18%

Overall Acceptance Rate: 792 of 3,983 submissions, 20%
