DOI: 10.1145/2623330.2623672
research-article

Networked bandits with disjoint linear payoffs

Published: 24 August 2014

ABSTRACT

In this paper, we study "networked bandits", a new bandit problem in which a set of interrelated arms varies over time and in which selecting one arm, given contextual information, invokes other correlated arms. This problem remains under-investigated despite its applicability to many practical problems. For instance, in social networks, an arm can obtain payoffs from both the selected user and that user's relations, since they often share content through the network. We examine whether it is possible to obtain multiple payoffs from several correlated arms based on their relationships. In particular, we formalize the networked bandit problem and propose an algorithm that considers not only the selected arm but also the relationships between arms. Our algorithm follows the "optimism in the face of uncertainty" principle, in that it selects an arm based on integrated confidence sets constructed from historical data. We analyze its performance in simulation experiments and on two real-world offline datasets. The experimental results demonstrate our algorithm's effectiveness in the networked bandit setting.
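
To make the setting concrete, here is a minimal Python sketch of one way an optimism-style rule could combine confidence bounds across related arms. It is not the authors' algorithm. It assumes a fixed arm set, a separate ridge-regression model per arm (one reading of "disjoint linear payoffs"), a constant exploration width `alpha`, and a score formed by summing upper confidence bounds over the selected arm and its relations. The class name `NetworkedLinUCB`, the `neighbors` mapping, and all method names are hypothetical.

```python
import numpy as np


class NetworkedLinUCB:
    """Illustrative sketch only: per-arm ridge regression, summed UCB scores."""

    def __init__(self, n_arms, dim, neighbors, alpha=1.0):
        self.neighbors = neighbors  # hypothetical: dict mapping arm -> related arms
        self.alpha = alpha          # assumed constant confidence width
        self.A = [np.eye(dim) for _ in range(n_arms)]    # per-arm design matrices
        self.b = [np.zeros(dim) for _ in range(n_arms)]  # per-arm payoff-weighted sums

    def _ucb(self, arm, x):
        # Ridge estimate plus a confidence bonus for one arm's linear payoff.
        A_inv = np.linalg.inv(self.A[arm])
        theta = A_inv @ self.b[arm]
        return float(theta @ x + self.alpha * np.sqrt(x @ A_inv @ x))

    def invoked(self, arm):
        # The selected arm together with the related arms it invokes.
        return [arm] + [j for j in self.neighbors.get(arm, []) if j != arm]

    def select(self, contexts):
        # Optimistic score: sum of upper bounds over every arm that would be invoked.
        scores = [
            sum(self._ucb(j, contexts[j]) for j in self.invoked(a))
            for a in range(len(self.A))
        ]
        return int(np.argmax(scores))

    def update(self, arm, contexts, payoffs):
        # Every invoked arm reveals a payoff, so every invoked model is updated.
        for j in self.invoked(arm):
            x = contexts[j]
            self.A[j] += np.outer(x, x)
            self.b[j] += payoffs[j] * x
```

In each round one would call `select(contexts)` with one context vector per arm, observe the payoffs of the invoked arms, and pass them to `update`. Before any data is seen, this rule favors arms with more relations, because each invoked arm contributes its own exploration bonus.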

Published in

KDD '14: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining
August 2014
2028 pages
ISBN: 9781450329569
DOI: 10.1145/2623330

Copyright © 2014 ACM


Publisher: Association for Computing Machinery, New York, NY, United States

Acceptance Rates

KDD '14 Paper Acceptance Rate: 151 of 1,036 submissions, 15%
Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%
