Learning to Generate Tips from Song Reviews
Introduction
Online music services such as QQ Music, Pandora, Spotify, Apple Music, and Netease Music currently supply ever-growing catalogs of tens of millions of songs. These services also provide channels for users to exchange opinions about songs. Song reviews have proven valuable for tasks such as music genre classification (Oramas, Anke, Lawlor, Serra, & Saggion, 2016) and music recommendation (Tata & Eugenio, 2010). Beyond a song’s title, attributes, and description, reviews let users grasp the deeper content of a song, find emotional resonance, or discover interesting points. Prior research (Tata & Eugenio, 2010) also shows that reviews facilitate users’ decisions about which songs to listen to. However, reading many long reviews is time-consuming, and the space available for presenting reviews is limited on music platforms, especially on mobile devices (Novgorodov, Elad, Guy, & Radinsky, 2019). One practical strategy is to provide tips: concise, empathetic, and self-contained descriptions of the songs. Fig. 1 illustrates tips for several songs. Specifically, tips are review fragments or sentences that provide insights into the melody, rhythm, context, or affection of the songs. Considering the diverse perceptions of users, i.e., “there are a thousand Hamlets for a thousand audiences”, there generally exist several tips for one song. To the best of our knowledge, no previous study has explored the task of tip generation in the music domain.
Although tip generation has been studied in other domains such as products (Hirsch, Novgorodov, Guy, & Nus, 2021) and travel (Guy et al., 2017, Zhu et al., 2018), tips for songs differ in two respects: (1) they are more diverse in pattern, since users may perceive the characteristics, context, or affection of a song differently; (2) they are expected to be attractive and to benefit from figures of speech such as metaphor, exaggeration, and rhyme, given that no song is inherently good or bad. Thus, experience from tip generation in other domains does not transfer directly to the music domain. In addition, the large number of reviews, especially for popular songs, makes finding useful information more challenging. Moreover, previous studies (Gamzu, Gonen, Kutiel, Levy, & Agichtein, 2021) generally extract short sentences from reviews by splitting at full stops, but song reviews are often wrongly punctuated, e.g., the original review shown in Table 1, which results in overly long sliced sentences. Another splitting strategy is based on a slicing window, which may produce semantically incomplete sentences, e.g., “”.
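The two splitting strategies mentioned above can be contrasted with a minimal sketch. The function names, the punctuation set, and the reading of the slicing window as fixed-size character chunks are illustrative assumptions, not the procedure used in the cited studies.

```python
import re

# Sentence-ending marks considered here (Chinese and ASCII); an assumed set.
_ENDINGS = "。！？.!?"


def split_on_full_stops(review: str) -> list[str]:
    """Split a review at sentence-ending punctuation, keeping the mark."""
    pattern = rf"[^{re.escape(_ENDINGS)}]+[{re.escape(_ENDINGS)}]?"
    return [part.strip() for part in re.findall(pattern, review) if part.strip()]


def split_by_window(review: str, size: int) -> list[str]:
    """Cut a review into consecutive chunks of `size` characters."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [review[i:i + size] for i in range(0, len(review), size)]


# A made-up review that uses commas where full stops belong.
review = "this song is warm, the piano feels like rain, I cried on the train"
print(split_on_full_stops(review))   # one long "sentence"
print(split_by_window(review, 20))   # chunks that cut through words
```

With a review that only uses commas, splitting at full stops returns the whole review as a single candidate, while the window cuts through words and clauses. Both outcomes correspond to the failure modes described above.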
To address the above challenges, we first annotate a high-quality Chinese dataset of song tips, named MTips. To enhance the generalizability of the dataset, we include reviews of songs from five different genres: absolute music, rock and roll, film and television hits, Chinese pop, and Europe & America & Japan & Korea (EAJK) pop. To ensure the quality of the labeled data, we collaborated with one popular music platform and invited five annotators who are responsible for the corresponding business. To correct the wrongly used punctuation in user reviews, we propose a BERT-based punctuation prediction model. With the corrected punctuation, we can directly extract short sentences from reviews by splitting at full stops. While manually labeling the short sentences, we summarize eight characteristics of tips, categorized into two aspects: content relevance to the songs and stylistic pattern. In total, we have labeled 8003 Chinese tips/non-tips, including 3062 tips, from the top 100 reviews of 128 songs.
Based on the labeled dataset MTips, we then propose a learning-to-generate framework, named GenTMS, for automatically generating tips from song reviews. GenTMS is built upon user reviews, and includes two major modules:
(1) Sentence relevance ranking module, which scores the representativeness of sentences for one song according to two aspects: content relevance to the song and stylistic similarity to the annotated tips across songs. Besides textual relevance, the content-based ranking also takes into account the approval counts of the reviews for the song, since this attribute reflects the degree of empathy conveyed by the reviews. The stylistic-based ranking scores whether the sentences share stylistic patterns with the annotated tips across songs.
(2) Diversity-weighted re-ranking module, which increases the diversity of the top-ranked tips for one song. Topic modeling and the combination of the two ranking scores are adopted in this module.
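The interplay of the two modules can be illustrated with a rough sketch. Everything below is an assumption made for illustration: the `Candidate` fields, the log-scaled approval weight, the linear mix controlled by `alpha`, and the greedy per-topic `penalty`. It is not the GenTMS implementation, whose scoring functions are described in Section 4.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    content_score: float  # hypothetical content-relevance score in [0, 1]
    style_score: float    # hypothetical stylistic-similarity score in [0, 1]
    approvals: int        # approval count of the source review
    topic: int            # hypothetical topic id assigned by a topic model


def relevance(c: Candidate, max_approvals: int, alpha: float = 0.5) -> float:
    """Mix content and style scores; approvals scale the content part (assumed form)."""
    approval_weight = math.log1p(c.approvals) / math.log1p(max_approvals)
    content = c.content_score * (0.5 + 0.5 * approval_weight)
    return alpha * content + (1 - alpha) * c.style_score


def rerank(cands: list[Candidate], k: int, penalty: float = 0.5,
           alpha: float = 0.5) -> list[Candidate]:
    """Greedily pick k sentences, discounting topics that were already picked."""
    max_approvals = max((c.approvals for c in cands), default=0) or 1
    remaining = list(cands)
    picked: list[Candidate] = []
    topic_counts: dict[int, int] = {}
    while remaining and len(picked) < k:
        best = max(
            remaining,
            key=lambda c: relevance(c, max_approvals, alpha)
            * penalty ** topic_counts.get(c.topic, 0),
        )
        picked.append(best)
        remaining.remove(best)
        topic_counts[best.topic] = topic_counts.get(best.topic, 0) + 1
    return picked
```

Setting `penalty` to 1.0 disables the diversity term and reduces the procedure to plain relevance ranking, which makes the effect of the re-ranking step easy to inspect in isolation.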
Experiments show that GenTMS generates the top-10 tips with a precision of 85.56%. To simulate practical usage of our framework, we also conduct an experiment with 9 previously unseen songs, achieving a top-10 precision of 78.89% on average.
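For reference, top-k precision can be computed as in the sketch below. The function names and the 0/1 label format are assumptions for illustration. If the reported figure is a plain average over songs that each contribute exactly 10 ranked sentences, then 78.89% over 9 songs corresponds to 71 of 90 sentences judged to be tips.

```python
def precision_at_k(ranked_labels: list[int], k: int) -> float:
    """Fraction of the top-k ranked sentences labeled as tips (1) rather than non-tips (0)."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(ranked_labels[:k]) / k


def mean_precision_at_k(labels_by_song: dict[str, list[int]], k: int) -> float:
    """Average precision@k over songs, each song weighted equally."""
    if not labels_by_song:
        raise ValueError("no songs given")
    scores = [precision_at_k(labels, k) for labels in labels_by_song.values()]
    return sum(scores) / len(scores)
```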
The main contributions of this paper are as follows:
1. To the best of our knowledge, we are the first to introduce and study the tip generation task in the music domain.
2. We summarize the characteristics of tips in the music domain and release the first annotated Chinese dataset for tip generation, named MTips, to facilitate future research. We also provide a detailed analysis of our dataset.
3. We present a learning-to-generate framework named GenTMS for automatically producing tips from song reviews. Extensive evaluation shows the effectiveness of our proposed framework.
Paper structure. The remainder of the paper is organized as follows. Section 2 reviews related work. Section 3 introduces the annotation process and data analysis. Our proposed framework is presented in Section 4. We describe the experiment setup in Section 5 and elaborate on the benchmark results in Section 6. We conclude and discuss future work in Section 7.
Section snippets
Song review mining
Mining song reviews has long been studied. Hu, Downie, West, and Ehmann (2005) demonstrate a system that predicts star ratings by classifying reviews according to the genre of the song. In addition to star ratings, songs can also be annotated with short text messages, which users find more acceptable. Pinter, Paul, Smith, and Brubaker (2020) create a dataset of expert reviews and offer several possible avenues for research. Different from their study in expert reviews, most of the work still focus
Data annotation and analysis
In this section, we describe the workflow of the annotation process for Chinese tips from song reviews. We obtained user reviews of songs from our business partner, one of the most popular Chinese music platforms. We first present the definition of song tips, and then introduce the data preprocessing and annotation steps. Finally, we summarize the characteristics of tips and conduct a detailed analysis of the annotated dataset.
Models
In this section, we elaborate on the learning-to-generate framework GenTMS for automatic tip generation. Fig. 5 illustrates the workflow of GenTMS, which includes two main components: sentence relevance ranking and diversity-weighted re-ranking. Given the candidate review sentences for a song, the sentence relevance ranking module is divided into two parts: content-based ranking and stylistic-based ranking. The content-based ranking component aims at learning to measure the content representativeness
Dataset
The MTips dataset is described in Section 3. We split the dataset into training, validation, and test sets. We first randomly select 9 songs and 50 sentences from each of them to form the test set. This tests the model’s ability to recognize new tips in new reviews of seen songs. We then randomly split the remaining part into training and validation sets at a ratio of 8:2.
Besides, to simulate the practical use of tip extraction, we evaluate our framework on previously-unseen
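The split described above can be sketched as follows. The function name, the `(song, sentence)` pair format, and the fixed seed are assumptions for illustration. The sketch keeps the remaining sentences of the 9 selected songs in the training and validation pool, which is how we read "seen songs" here.

```python
import random


def split_dataset(sentences_by_song: dict[str, list[str]],
                  n_test_songs: int = 9,
                  n_test_per_song: int = 50,
                  train_ratio: float = 0.8,
                  seed: int = 0):
    """Hold out sentences from a few songs for testing, then split the rest 8:2."""
    rng = random.Random(seed)
    test_songs = set(rng.sample(sorted(sentences_by_song), n_test_songs))

    test, rest = [], []
    for song, sentences in sentences_by_song.items():
        sentences = list(sentences)
        if song in test_songs:
            rng.shuffle(sentences)
            test.extend((song, s) for s in sentences[:n_test_per_song])
            sentences = sentences[n_test_per_song:]
        # Leftover sentences of test songs stay available for training.
        rest.extend((song, s) for s in sentences)

    rng.shuffle(rest)
    cut = round(len(rest) * train_ratio)
    return rest[:cut], rest[cut:], test
```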
Experiment evaluation
In this section, we present the evaluation methods for GenTMS. We first analyze the performance of the whole framework, and then report a detailed analysis of each module.
Conclusion and future work
In this paper, we are the first to propose a learning-to-generate framework, named GenTMS, for generating tips from song reviews, and we publish the first Chinese tip dataset in the music domain, named MTips. We analyze the dataset and examine the effectiveness of each module of GenTMS. Benchmark results on top-k precision evaluation and practical evaluation show that the proposed framework is of high quality. In the future, we will explore the use of tips for personalized
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This research was supported by the National Key R&D Program of China (No. 2022YFB3103900), the National Natural Science Foundation of China (No. 62002084, 62276075, 62272132), Shenzhen Basic Research, China (General Project No. JCYJ20220531095214031), the Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies, China (2022B1212010005), and the Major Key Project of PCL, China (Grant No. PCL2022A03, PCL2021A02, PCL2021A09).
References (36)
- et al. (2015). Diversifying customer review rankings. Neural Networks.
- et al. (2020). Abstractive summarization of long texts by representing multiple compositionalities with temporal hierarchical pointer generator network. Neural Networks.
- et al. (2020). Modeling coherence by ordering paragraphs using pointer networks. Neural Networks.
- et al. (2018). Unsupervised tip-mining from customer reviews. Decision Support Systems.
- et al. Fast greedy MAP inference for determinantal point process to improve recommendation diversity.
- et al. Improving the similarity measure of determinantal point processes for extractive multi-document summarization.
- et al. BERT: Pre-training of deep bidirectional transformers for language understanding.
- et al. Identifying helpful sentences in product reviews.
- et al. Extracting and ranking travel tips from user-generated reviews.
- et al. Generating tips from product reviews.