Online recommender system for radio station hosting based on information fusion and adaptive tag-aware profiling
Introduction
Music recommendation is an important and challenging direction in the field of recommender systems. It is a hard problem to extract relevant information for similarity search from large music collections, and sources of rich semantic metadata are often needed (Celma, 2010). Many recent works in this area have appeared at the International Society for Music Information Retrieval Conference (ISMIR) (Müller & Wiering, 2015) and the Recommender Systems conference (RecSys) (Werthner, Zanker, Golbeck, & Semeraro, 2015).
Recently, the focus of computer science research in music information studies has shifted from pure music information retrieval and exploration (Gleich, Zhukov, Rasmussen, & Lang, 2005; Hilliges, Holzer, Klüber, & Butz, 2006) to music recommendations (Brandenburg, Dittmar, Gruhne, Abeßer, Lukashevich, Dunker, et al., 2009; Celma, 2010). It is not a new direction (Avesani, Massa, Nori, & Susi, 2002); however, it is now inspired by new capabilities of large online services that can provide not only millions of tracks for listening but also thousands of radio stations to choose from on a single web site. Moreover, social tagging is an important factor that paves the way to new recommender algorithms based on tag similarity (Nanopoulos, Rafailidis, Symeonidis, & Manolopoulos, 2010; Symeonidis, Ruxanda, Nanopoulos, & Manolopoulos, 2008b; Yang, Bogdanov, Herrera, & Sordo).
Despite many high-quality works on different aspects of music recommendation, there are only a few studies devoted to online radio station recommender systems (Aizenberg, Koren, & Somekh, 2012; Grant, Ekanayake, & Turnbull, 2013). Several radio-like online broadcasting services, including last.fm, Yahoo! LaunchCast, and Pandora, are known for their recommender systems and work on a commercial basis (however, the latter two do not operate in Russia). There is also a Russian counterpart, Yandex.Radio; however, like the aforementioned services, it only provides access to radio stations whose playlists are composed automatically according to a catalogue of selection criteria. Recommending real online radio stations is a different task. It presents difficulties not only because of the musical content but also because a recommender needs to find relevant, dynamically changing objects while usually relying on implicit feedback only.
Thus, in this work we consider the music recommendation problem from a slightly different angle. We consider the Russian online radio hosting service FMhost, in particular, its new hybrid recommender system. First, we recommend radio stations, i.e., sequences or sets of compositions, which are manually composed by real DJs and dynamically change, rather than individual tracks as most other music recommenders do. Second, as we demonstrate below, the FMhost service does not have enough data for reasonable SVD-based recommendations; still, recommendations have to be provided. To overcome these problems, we propose a novel recommender algorithm that combines three data sources: radio station visits, listening events for music with specific tags, and frequency of tags applied to radio stations and their content. We show experimental results on the FMhost dataset and propose a fusion of two different algorithms that can be tuned for specific quality metrics, e.g., precision, recall, and NDCG.
The paper is organized as follows. In Section 2, we briefly survey related work in music recommendation. Section 3 outlines the online radio service FMhost. In Section 4, we propose a novel recommender model, two basic recommender algorithms, and a third algorithm that combines them, and we describe the recommender system architecture. Section 5 provides examples, and Section 6 discusses the basic principles and problems in detail. Quality of service (QoS) measurements, a comparison with an SVD-based approach, and certain insights into FMhost user behaviour are discussed in Section 7. In Section 8, we provide a theoretical justification of the chosen aggregation of rankings. Section 9 concludes the paper.
Section snippets
Related work
Music recommendation becomes especially important because modern systems that provide music to their users aim to take into account infrequently requested musical compositions and/or collections of compositions such as radio stations from the long tail of the distribution. Most music recommender systems work under the general principles of collaborative filtering (Koren & Bell, 2011). For instance, last.fm mines user tastes both explicitly, from likes with which the users mark compositions, and
A concise online broadcasting dictionary
We begin by briefly introducing some basic domain terminology. A chart is a track rating of a particular radio station; for example, a rock chart shows a certain number (e.g., 10) of most popular rock tracks, ranked from the most popular (rank 1) to the least popular (rank 10) according to a survey. A live performance (or just live for short) is a performance with one or several DJs (disk jockeys) assigned to it. They perform using their own PCs, and the audio stream is being redirected from
Input data and general structure
Our model is based on three data matrices. The first matrix tracks the number of times user u visits radio stations with a certain tag t. Each radio station r broadcasts audio tracks with a certain set of tags T_r. The sets of all users, radio stations, and tags are denoted by U, R, and T, respectively. The second matrix contains how many tracks with a tag t a radio station r has
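The first two matrices described above can be sketched as sparse count tables. This is only a minimal illustration under stated assumptions: the log formats (`visits` as user-station pairs, `plays` as station-tag pairs), the function names, and the toy data are hypothetical and not taken from FMhost, and the third matrix is omitted because its definition is not given in this excerpt.

```python
from collections import Counter

def user_tag_counts(visits, station_tags):
    """Count, for each (user, tag) pair, the visits to stations carrying that tag.

    visits: iterable of (user, station) pairs, one per visit (assumed log format).
    station_tags: dict mapping a station r to its tag set T_r.
    """
    counts = Counter()
    for user, station in visits:
        for tag in station_tags.get(station, ()):
            counts[(user, tag)] += 1
    return counts

def station_tag_counts(plays):
    """Count tracks per (station, tag) pair.

    plays: iterable of (station, tag) pairs, one per tagged track (assumed log format).
    """
    return Counter(plays)

# Toy data, purely illustrative.
station_tags = {"r1": {"rock", "indie"}, "r2": {"jazz"}}
visits = [("u1", "r1"), ("u1", "r1"), ("u1", "r2"), ("u2", "r2")]
plays = [("r1", "rock"), ("r1", "rock"), ("r1", "indie"), ("r2", "jazz")]

user_tag = user_tag_counts(visits, station_tags)
station_tag = station_tag_counts(plays)
```

Storing only nonzero counts keeps the tables small when most users never touch most tags, which is the typical situation for sparse listening data.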
How it works: explaining by example
In fact, we deal with a dynamic quadripartite weighted graph where U is a set of users, T is a set of tags, R is a set of radio stations, M is a set of musical tracks, E_UT ⊆ U × T is a set of edges between users and tags (similarly for the remaining edge sets), and w is a weight function that assigns a weight to each edge e of the graph.
Note that this graph can be built from a collection of tuples where each tuple means that user u listened to music track m with tag t played by
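As a rough sketch of how such a weighted graph could be accumulated from listening events, the code below assumes each event is a tuple (u, m, t, r), with r taken to be the radio station that played the track. The tuple layout, the choice of edge sets, and the use of co-occurrence counts as the weight function w are all assumptions made for illustration, not the construction used in the paper.

```python
from collections import Counter

def build_edge_weights(events):
    """Accumulate edge weights of a quadripartite graph from (u, m, t, r) events.

    Returns a dict mapping an edge-set name (e.g. "UT") to a Counter whose keys
    are edges and whose values are co-occurrence counts (assumed weights).
    """
    weights = {"UT": Counter(), "UR": Counter(), "RT": Counter(), "RM": Counter()}
    for u, m, t, r in events:
        weights["UT"][(u, t)] += 1  # user listened to something tagged t
        weights["UR"][(u, r)] += 1  # user listened on station r
        weights["RT"][(r, t)] += 1  # station r played something tagged t
        weights["RM"][(r, m)] += 1  # station r played track m
    return weights

# Toy events, purely illustrative.
events = [
    ("u1", "m1", "rock", "r1"),
    ("u1", "m2", "rock", "r1"),
    ("u2", "m1", "rock", "r2"),
]
graph = build_edge_weights(events)
```

Because the counters are updated one event at a time, the same function body can be applied incrementally as new events arrive, which matches the dynamic nature of the graph.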
Cold start
The cold start problem is a fundamental challenge in the recommender systems domain, and it affects both new users and new items (Lika, Kolomvatsos, & Hadjiefthymiades, 2014).
If a new radio station starts broadcasting, the CBRS component cannot help. However, since the station plays some tracks with thematic tags, IBRS is able to match user profiles with the repertoire of this new station via the adaptive tag-aware representation. For the CBRS component to cope with the problem of a new radio station, one can also use
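To make the idea of matching a user profile against a new station's repertoire concrete, the sketch below compares two tag-count profiles with cosine similarity. Cosine similarity is a stand-in chosen for illustration; the excerpt does not specify the similarity measure that IBRS uses, and the profiles shown are invented.

```python
import math

def cosine(p, q):
    """Cosine similarity between two sparse tag profiles given as dicts tag -> weight."""
    dot = sum(w * q.get(tag, 0.0) for tag, w in p.items())
    norm_p = math.sqrt(sum(w * w for w in p.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0  # an empty profile matches nothing
    return dot / (norm_p * norm_q)

# Hypothetical profiles: tag -> listening or broadcasting frequency.
user_profile = {"rock": 3, "indie": 1}
new_station = {"rock": 10, "metal": 5}    # no visits yet, but has tagged tracks
other_station = {"jazz": 7}

sim_new = cosine(user_profile, new_station)
sim_other = cosine(user_profile, other_station)
```

A station with no visit history still receives a nonzero score here as long as its tags overlap with the user's profile, which is the property the paragraph above relies on.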
Setting and quality metrics
To evaluate the quality of the developed system, we used a variant of the cross-validation technique proposed in Ignatov, Poelmans, Dedene, and Viaene (2012). We represent the dataset as an object-attribute table (binary relation) T ⊆ U × I, where uTi iff user u ∈ U used (purchased, watched, listened to, etc.) item i ∈ I. To evaluate the quality of recommendations in terms of precision and recall, we split the initial user set U into training and test subsets U_train and U_test, where the test set is
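For reference, precision and recall for a single test user can be computed as in the sketch below, using the standard top-N definitions. The function name, the cutoff n, and the toy data are assumptions; the specific cross-validation variant used in the paper is not reproduced here.

```python
def precision_recall_at_n(recommended, relevant, n):
    """Precision and recall of the top-n recommended items against a set of relevant items.

    recommended: ranked list of item ids (best first).
    relevant: set of held-out items the user actually used.
    """
    top_n = recommended[:n]
    hits = sum(1 for item in top_n if item in relevant)
    precision = hits / len(top_n) if top_n else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Toy example: stations recommended to one test user vs. stations they listened to.
recommended = ["r3", "r1", "r7", "r2"]
relevant = {"r1", "r2", "r9"}

p2, r2 = precision_recall_at_n(recommended, relevant, 2)
p4, r4 = precision_recall_at_n(recommended, relevant, 4)
```

Averaging these per-user values over all users in the test subset gives system-level figures.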
Theoretical discussion
A linear combination of rankings seems to be a simple solution for their aggregation. However, so far this effective and efficient way of aggregation has been considered only as a well-chosen heuristic without a proper theoretical discussion of its properties (Celma, 2010; Deng, Wang, Li, & Xu, 2015; Domingues, Gouyon, Jorge, Leal, Vinagre, Lemos, et al., 2013; Kim & Kim, 2014; Wu, Chang, & Liu, 2014). We try to bridge this gap with an order-theoretic treatment.
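A linear combination of two rankings can be sketched as follows, assuming each base recommender returns a score per radio station on a comparable scale. The weight `alpha`, the treatment of missing scores as zero, and the tie-breaking by item id are illustrative assumptions rather than details of the paper's fused algorithm.

```python
def fuse_rankings(scores_a, scores_b, alpha=0.5):
    """Rank items by alpha * score_a + (1 - alpha) * score_b, best first.

    scores_a, scores_b: dicts item -> score from two base recommenders.
    Items missing from one recommender are treated as having score 0 there.
    Ties are broken by item id so that the output is deterministic.
    """
    items = set(scores_a) | set(scores_b)
    fused = {
        i: alpha * scores_a.get(i, 0.0) + (1 - alpha) * scores_b.get(i, 0.0)
        for i in items
    }
    return sorted(items, key=lambda i: (-fused[i], i))

# Toy scores from two hypothetical base recommenders.
scores_a = {"r1": 0.9, "r2": 0.1, "r3": 0.5}
scores_b = {"r1": 0.2, "r2": 0.8, "r3": 0.5}

balanced = fuse_rankings(scores_a, scores_b, alpha=0.5)
only_b = fuse_rankings(scores_a, scores_b, alpha=0.0)
```

Varying `alpha` between 0 and 1 moves the fused ranking from one base ranking to the other, which is what makes this kind of combination tunable for a chosen quality metric.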
First of all we assume that every ranking of
Conclusion and further work
In this work, we have described the underlying models, algorithms, and system architecture of the new, improved FMhost service and tested it on the available real dataset. We hope that the developed algorithms will help users find relevant radio stations to listen to. During future optimization and tuning of the algorithm, special attention should be paid to scalability issues and user-centric quality assessment.
By using bimodal cross-validation, we have built a hybrid algorithm FRS tuned
Acknowledgments
We would like to thank Rustam Tagiew and Mykola Pechenizkiy for their comments and Vasily Zaharchuk (FMhost co-founder), Rimma Ahmetsafina, Natalia Konstantinova and Andrey Konstantinov for their very important work at the previous stages of this project. Our research was done at the National Research University Higher School of Economics in 2014–2015, at the Laboratory of Intelligent Systems and Structural Analysis (Moscow) and Laboratory for Internet Studies (St. Petersburg). Dmitry Ignatov’s
References (75)
- et al. (2010). Music playlist generation by assimilating GMMs into SOMs. Pattern Recognition Letters.
- et al. (2013). Recommender systems survey. Knowledge-Based Systems.
- et al. (2013). Semantic audio content-based music recommendation and visualization based on user preference examples. Information Processing and Management.
- et al. (2015). Exploring user emotion in microblogs for music recommendation. Expert Systems with Applications.
- et al. (2014). How should I explain? A comparison of different explanation types for recommender systems. International Journal of Human-Computer Studies.
- et al. (2014). Leveraging clustering approaches to solve the gray-sheep users problem in recommender systems. Expert Systems with Applications.
- et al. Audioradar: A metaphorical visualization for the navigation of large music collections.
- et al. (2015). Triadic formal concept analysis and triclustering: Searching for optimal patterns. Machine Learning.
- et al. (2014). A framework for tag-aware recommender systems. Expert Systems with Applications.
- et al. (2015). Generating music playlists with hierarchical clustering and Q-learning. Advances in information retrieval – 37th European conference on IR research, ECIR 2015, Vienna, Austria, March 29–April 2, 2015. Proceedings.