
Exploration in Interactive Personalized Music Recommendation: A Reinforcement Learning Approach

Published: 04 September 2014

Abstract

Current music recommender systems typically act in a greedy manner by recommending songs with the highest user ratings. Greedy recommendation, however, is suboptimal over the long term: it does not actively gather information on user preferences and fails to recommend novel songs that are potentially interesting. A successful recommender system must balance the needs to explore user preferences and to exploit this information for recommendation. This article presents a new approach to music recommendation by formulating this exploration-exploitation trade-off as a reinforcement learning task. To learn user preferences, it uses a Bayesian model that accounts for both audio content and the novelty of recommendations. A piecewise-linear approximation to the model and a variational inference algorithm help to speed up Bayesian inference. One additional benefit of our approach is a single unified model for both music recommendation and playlist generation. We demonstrate the strong potential of the proposed approach with simulation results and a user study.
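
To make the exploration-exploitation idea concrete, the sketch below is a minimal illustration, not the authors' implementation: it assumes a Bayesian linear model of user ratings over song features (audio descriptors plus a novelty term that could decay with how recently a song was heard) and uses Thompson sampling to balance trying songs whose appeal is still uncertain against replaying songs already known to be well liked. All class, function, and variable names are hypothetical.

```python
import numpy as np

class ThompsonSamplingRecommender:
    """Sketch of a Bayesian linear rating model with Thompson sampling."""

    def __init__(self, n_features, noise_var=1.0, prior_var=10.0):
        # Gaussian prior over preference weights: w ~ N(0, prior_var * I).
        self.noise_var = noise_var
        self.precision = np.eye(n_features) / prior_var  # posterior precision
        self.b = np.zeros(n_features)                     # precision-weighted mean

    def recommend(self, candidate_features):
        # Draw one plausible preference vector from the posterior, then act
        # greedily with respect to that sample (Thompson sampling).
        cov = np.linalg.inv(self.precision)
        mean = cov @ self.b
        w_sample = np.random.multivariate_normal(mean, cov)
        scores = candidate_features @ w_sample
        return int(np.argmax(scores))

    def update(self, features, rating):
        # Conjugate Bayesian update after observing the user's rating.
        self.precision += np.outer(features, features) / self.noise_var
        self.b += features * rating / self.noise_var


# Hypothetical usage: each song is a feature vector combining audio content
# with a novelty feature; ratings here are random stand-ins for user feedback.
rec = ThompsonSamplingRecommender(n_features=4)
songs = np.random.rand(50, 4)
for _ in range(20):
    idx = rec.recommend(songs)
    rating = np.random.rand()
    rec.update(songs[idx], rating)
```

Because recommendations are drawn from the posterior rather than from point estimates, songs with uncertain predicted ratings are occasionally selected, which is the exploration behavior the abstract argues a purely greedy recommender lacks.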

Supplementary Material

a7-wang-apndx.pdf (wang.zip)
Supplemental movie, appendix, image, and software files for "Exploration in Interactive Personalized Music Recommendation: A Reinforcement Learning Approach".

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 11, Issue 1
August 2014, 151 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/2665935

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 04 September 2014
Accepted: 01 May 2014
Revised: 01 May 2014
Received: 01 November 2013
Published in TOMM Volume 11, Issue 1

Author Tags

  1. Recommender systems
  2. application
  3. machine learning
  4. model
  5. music

Qualifiers

  • Research-article
  • Research
  • Refereed

Cited By

  • (2025) FareIQ: Intelligent Fare Optimization for Cab Drivers Using Reinforcement Learning. Innovations in Electrical and Electronics Engineering, 10.1007/978-981-97-9112-5_34, 573-588. Online publication date: 31-Jan-2025.
  • (2025) Examining Policy Entropy of Reinforcement Learning Agents for Personalization Tasks. Pattern Recognition and Artificial Intelligence, 10.1007/978-981-97-8702-9_33, 493-504. Online publication date: 8-Feb-2025.
  • (2024) Simplifying Digital Streaming: An Innovative Cross-Platform Streaming Solution. 2024 15th International Conference on Computing Communication and Networking Technologies (ICCCNT), 10.1109/ICCCNT61001.2024.10726004, 1-6. Online publication date: 24-Jun-2024.
  • (2024) Reinforcement learning for addressing the cold-user problem in recommender systems. Knowledge-Based Systems, 10.1016/j.knosys.2024.111752, 111752. Online publication date: Apr-2024.
  • (2023) Audio-Based Sequential Music Recommendation. 2023 31st European Signal Processing Conference (EUSIPCO), 10.23919/EUSIPCO58844.2023.10290094, 421-425. Online publication date: 4-Sep-2023.
  • (2023) Efficient Exploration and Exploitation for Sequential Music Recommendation. ACM Transactions on Recommender Systems, 10.1145/3625827. Online publication date: 27-Sep-2023.
  • (2023) Generative Adversarial Reward Learning for Generalized Behavior Tendency Inference. IEEE Transactions on Knowledge and Data Engineering, 10.1109/TKDE.2022.3186920, 35:10, 9878-9889. Online publication date: 1-Oct-2023.
  • (2023) Smart Song Recommendation System using Machine Learning. 2023 9th International Conference on Signal Processing and Communication (ICSC), 10.1109/ICSC60394.2023.10440988, 609-614. Online publication date: 21-Dec-2023.
  • (2023) Increasing Entropy to Boost Policy Gradient Performance on Personalization Tasks. 2023 IEEE International Conference on Data Mining Workshops (ICDMW), 10.1109/ICDMW60847.2023.00197, 1551-1558. Online publication date: 4-Dec-2023.
  • (2023) Understanding users music listening habits for time and activity sensitive customized playlists. 2023 IEEE 20th Consumer Communications & Networking Conference (CCNC), 10.1109/CCNC51644.2023.10060462, 485-488. Online publication date: 8-Jan-2023.
