
Exploration in Interactive Personalized Music Recommendation: A Reinforcement Learning Approach

Published: 04 September 2014

Abstract

Current music recommender systems typically act in a greedy manner by recommending songs with the highest user ratings. Greedy recommendation, however, is suboptimal over the long term: it does not actively gather information on user preferences and fails to recommend novel songs that are potentially interesting. A successful recommender system must balance the needs to explore user preferences and to exploit this information for recommendation. This article presents a new approach to music recommendation by formulating this exploration-exploitation trade-off as a reinforcement learning task. To learn user preferences, it uses a Bayesian model that accounts for both audio content and the novelty of recommendations. A piecewise-linear approximation to the model and a variational inference algorithm help to speed up Bayesian inference. One additional benefit of our approach is a single unified model for both music recommendation and playlist generation. We demonstrate the strong potential of the proposed approach with simulation results and a user study.
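
To make the exploration-exploitation idea concrete, the sketch below is a minimal illustration, not the authors' implementation: it assumes a Bayesian linear model of user ratings over song features (audio descriptors plus a novelty term that could decay with how recently a song was heard) and uses Thompson sampling to balance trying songs whose appeal is still uncertain against replaying songs already known to be well liked. All class, function, and variable names are hypothetical.

```python
import numpy as np

class ThompsonSamplingRecommender:
    """Sketch of a Bayesian linear rating model with Thompson sampling."""

    def __init__(self, n_features, noise_var=1.0, prior_var=10.0):
        # Gaussian prior over preference weights: w ~ N(0, prior_var * I).
        self.noise_var = noise_var
        self.precision = np.eye(n_features) / prior_var  # posterior precision
        self.b = np.zeros(n_features)                     # precision-weighted mean

    def recommend(self, candidate_features):
        # Draw one plausible preference vector from the posterior, then act
        # greedily with respect to that sample (Thompson sampling).
        cov = np.linalg.inv(self.precision)
        mean = cov @ self.b
        w_sample = np.random.multivariate_normal(mean, cov)
        scores = candidate_features @ w_sample
        return int(np.argmax(scores))

    def update(self, features, rating):
        # Conjugate Bayesian update after observing the user's rating.
        self.precision += np.outer(features, features) / self.noise_var
        self.b += features * rating / self.noise_var


# Hypothetical usage: each song is a feature vector combining audio content
# with a novelty feature; ratings here are random stand-ins for user feedback.
rec = ThompsonSamplingRecommender(n_features=4)
songs = np.random.rand(50, 4)
for _ in range(20):
    idx = rec.recommend(songs)
    rating = np.random.rand()
    rec.update(songs[idx], rating)
```

Because recommendations are drawn from the posterior rather than from point estimates, songs with uncertain predicted ratings are occasionally selected, which is the exploration behavior the abstract argues a purely greedy recommender lacks.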

Supplementary Material

a7-wang-apndx.pdf (wang.zip)
Supplemental movie, appendix, image, and software files for "Exploration in Interactive Personalized Music Recommendation: A Reinforcement Learning Approach".

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 11, Issue 1
August 2014, 151 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/2665935

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 04 September 2014
Accepted: 01 May 2014
Revised: 01 May 2014
Received: 01 November 2013
Published in TOMM Volume 11, Issue 1

Author Tags

  1. Recommender systems
  2. application
  3. machine learning
  4. model
  5. music

Qualifiers

  • Research-article
  • Research
  • Refereed

Cited By

  • (2025) FareIQ: Intelligent Fare Optimization for Cab Drivers Using Reinforcement Learning. Innovations in Electrical and Electronics Engineering, 10.1007/978-981-97-9112-5_34, 573-588. Online publication date: 31-Jan-2025.
  • (2025) Examining Policy Entropy of Reinforcement Learning Agents for Personalization Tasks. Pattern Recognition and Artificial Intelligence, 10.1007/978-981-97-8702-9_33, 493-504. Online publication date: 8-Feb-2025.
  • (2024) Simplifying Digital Streaming: An Innovative Cross-Platform Streaming Solution. 2024 15th International Conference on Computing Communication and Networking Technologies (ICCCNT), 10.1109/ICCCNT61001.2024.10726004, 1-6. Online publication date: 24-Jun-2024.
  • (2024) Reinforcement learning for addressing the cold-user problem in recommender systems. Knowledge-Based Systems, 10.1016/j.knosys.2024.111752, 111752. Online publication date: Apr-2024.
  • (2023) Audio-Based Sequential Music Recommendation. 2023 31st European Signal Processing Conference (EUSIPCO), 10.23919/EUSIPCO58844.2023.10290094, 421-425. Online publication date: 4-Sep-2023.
  • (2023) Efficient Exploration and Exploitation for Sequential Music Recommendation. ACM Transactions on Recommender Systems, 10.1145/3625827. Online publication date: 27-Sep-2023.
  • (2023) Generative Adversarial Reward Learning for Generalized Behavior Tendency Inference. IEEE Transactions on Knowledge and Data Engineering, 10.1109/TKDE.2022.3186920, 35:10, 9878-9889. Online publication date: 1-Oct-2023.
  • (2023) Smart Song Recommendation System using Machine Learning. 2023 9th International Conference on Signal Processing and Communication (ICSC), 10.1109/ICSC60394.2023.10440988, 609-614. Online publication date: 21-Dec-2023.
  • (2023) Increasing Entropy to Boost Policy Gradient Performance on Personalization Tasks. 2023 IEEE International Conference on Data Mining Workshops (ICDMW), 10.1109/ICDMW60847.2023.00197, 1551-1558. Online publication date: 4-Dec-2023.
  • (2023) Understanding users music listening habits for time and activity sensitive customized playlists. 2023 IEEE 20th Consumer Communications & Networking Conference (CCNC), 10.1109/CCNC51644.2023.10060462, 485-488. Online publication date: 8-Jan-2023.
