Learning to Generate Tips from Song Reviews

doi:10.1016/j.neunet.2023.01.049

Neural Networks

Volume 161, April 2023, Pages 746-756

https://doi.org/10.1016/j.neunet.2023.01.049

Abstract

Reviews of songs play an important role in online music service platforms. Prior research shows that users can make quicker and more informed decisions when presented with meaningful song reviews. However, song reviews are generally long, and most of them are uninformative for users, which makes it difficult to efficiently grasp the messages needed for making decisions. To solve this problem, one practical strategy is to provide tips, i.e., short, concise, empathetic, and self-contained descriptions of songs. Tips are produced from song reviews and should express non-trivial insights about the songs. To the best of our knowledge, no prior studies have explored the tip generation task in the music domain. In this paper, we create a dataset named MTips for the task and propose a learning-to-generate framework named GenTMS for automatically generating tips from song reviews. The dataset contains 8,003 Chinese tips/non-tips from 128 songs distributed across five song genres. Experimental results show that GenTMS achieves a top-10 precision of 85.56%, outperforming the baseline models by at least 3.34%. In addition, to simulate the practical usage of our proposed framework, we also experiment with previously unseen songs, on which GenTMS again achieves the best performance, with a top-10 precision of 78.89% on average. The results demonstrate the effectiveness of the proposed framework for tip generation in the music domain.

Introduction

Online music services, such as QQ Music, Pandora, Spotify, Apple Music, and Netease Music, currently offer ever-growing catalogs with tens of millions of songs. These services also provide entry points for users to exchange their opinions about songs. The importance of song reviews has been demonstrated for various tasks, such as music genre classification (Oramas, Anke, Lawlor, Serra, & Saggion, 2016) and music recommendation (Tata & Eugenio, 2010). Beyond a song's title, attributes, and description, reviews let users capture its deeper content, find resonance, or discover interesting points. Prior research (Tata & Eugenio, 2010) also shows that reviews facilitate users’ decision-making when choosing which songs to listen to. However, reading many long reviews is time-consuming, and the space for presenting reviews is limited on music platforms, especially on mobile devices (Novgorodov, Elad, Guy, & Radinsky, 2019). One practical strategy is to provide tips: concise, empathetic, and self-contained descriptions of the songs. Fig. 1 illustrates the tips for some songs. Specifically, tips are review fragments or sentences that provide insights into the melody, rhythm, context, or affection of the songs. Considering the diverse perceptions of users, i.e., “there are a thousand Hamlets for a thousand audiences”, there generally exist several tips for one song. To the best of our knowledge, no previous studies have explored the task of tip generation in the music domain.

Although tip generation has been studied in other domains such as products (Hirsch, Novgorodov, Guy, & Nus, 2021) and travel (Guy et al., 2017; Zhu et al., 2018), tips for songs differ in the following two aspects: (1) they are more diverse in pattern, since users may perceive the characteristics, context, or affection of a song in different ways; (2) they are expected to be attractive and are better when they use figures of speech such as metaphor, exaggeration, and rhyme, given that no song is essentially good or bad. Thus, the experience of tip generation in other domains is not directly applicable to the music domain. Besides, the large number of reviews, especially for popular songs, makes finding useful information more challenging. Moreover, previous studies (Gamzu, Gonen, Kutiel, Levy, & Agichtein, 2021) generally extract short sentences from reviews by splitting at full stops, but song reviews are often wrongly punctuated, e.g., the original review shown in Table 1, which results in overly long sliced sentences. Another splitting strategy is based on a sliding window, which may produce semantically incomplete sentences.

To address the above challenges, in this work, we first annotate a high-quality Chinese dataset containing tips of songs, named MTips. To enhance the generalizability of the dataset, we include reviews of songs from five different genres: absolute music, rock and roll, film & television hits, Chinese pop, and Europe & America & Japan & Korea (EAJK) pop. To ensure the quality of the labeled data, we collaborated with one popular music platform and invited five annotators who are responsible for the corresponding business. To correct the wrongly used punctuation in user reviews, we propose a BERT-based punctuation prediction model. With the corrected punctuation, we can directly extract short sentences from reviews according to full stops. While manually labeling the short sentences, we summarize eight characteristics of tips, categorized into two aspects: content relevance to the songs and stylistic pattern. In total, we have labeled 8,003 Chinese tips/non-tips, including 3,062 tips, from the top 100 reviews of 128 songs.
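The extraction step that follows punctuation correction can be illustrated with a minimal sketch. It assumes the punctuation has already been corrected (the BERT-based punctuation prediction model is not shown), and the function name `split_sentences`, the set of sentence-ending marks, and the sample review are illustrative assumptions rather than details taken from the paper.

```python
import re

# Sentence-ending marks treated as "full stops" here (an assumption for
# illustration): Chinese full stop, exclamation mark, and question mark,
# plus the ASCII exclamation and question marks.
_SENTENCE = re.compile(r"[^。！？!?]+[。！？!?]*")


def split_sentences(review: str) -> list[str]:
    """Split a punctuation-corrected review into short candidate sentences."""
    # Each match is a run of ordinary characters plus any trailing end marks.
    return [m.strip() for m in _SENTENCE.findall(review) if m.strip()]


# Invented sample review: "The melody is gentle. It made me cry! I will listen again"
print(split_sentences("旋律很温柔。听了想哭！还会再听"))
```

Each returned fragment is a candidate sentence that would then be labeled as a tip or non-tip.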

Based on the labeled dataset MTips, we then propose a learning-to-generate framework, named GenTMS, for automatically generating tips from song reviews. GenTMS is built upon user reviews, and includes two major modules:

(1) Sentence relevance ranking module, which aims at scoring the representativeness of sentences for one song according to two aspects: the content relevance to the song and stylistic similarity to the annotated tips across songs. Besides the textual relevance, the content-based ranking module also involves the approval numbers of the reviews for the song, considering that the attribute reflects the degree of empathy delivered by the reviews. The stylistic-based ranking focuses on scoring whether the sentences share similar stylistic patterns as the annotated tips across songs.

(2) Diversity-weighted re-ranking module, which aims at increasing the diversity of the top-ranked tips for one song. Topic modeling and the combination of the two ranking scores are adopted in the module.
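The interplay of the two modules can be pictured with a small sketch. It assumes that each candidate sentence already carries a content score, a stylistic score, and a topic label. The linear weight `alpha`, the discount factor `penalty`, and the greedy per-topic discount are hypothetical choices made only for illustration; they are not the formulation used by GenTMS.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    content_score: float  # relevance to the song (hypothetical, in [0, 1])
    style_score: float    # similarity to annotated tips (hypothetical, in [0, 1])
    topic: int            # topic id, e.g. from a topic model (hypothetical)


def relevance(c: Candidate, alpha: float = 0.5) -> float:
    """Combine the two ranking scores with an assumed linear weight."""
    return alpha * c.content_score + (1.0 - alpha) * c.style_score


def diversity_rerank(cands, k=10, alpha=0.5, penalty=0.5):
    """Greedily pick up to k candidates, discounting already-covered topics."""
    remaining = list(cands)
    picked = []
    topic_counts = {}
    while remaining and len(picked) < k:
        # Every earlier pick from the same topic multiplies the score by `penalty`.
        best = max(
            remaining,
            key=lambda c: relevance(c, alpha) * penalty ** topic_counts.get(c.topic, 0),
        )
        picked.append(best)
        remaining.remove(best)
        topic_counts[best.topic] = topic_counts.get(best.topic, 0) + 1
    return picked
```

With `penalty=1.0` the function reduces to plain relevance ranking, which makes the effect of the diversity term easy to compare.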

Experiments show that GenTMS can accurately generate the top-10 tips, with a precision of 85.56%. To simulate the practical usage of our framework, we also conduct an experiment with 9 previously unseen songs, achieving a top-10 precision of 78.89% on average.
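For readers unfamiliar with the metric, top-k precision can be sketched as follows. The sketch assumes that each ranked sentence has a binary label (1 for tip, 0 for non-tip) and that a song with fewer than k candidates is scored over the candidates it has; both the function names and that convention are assumptions, not details confirmed by the paper.

```python
def precision_at_k(ranked_labels, k=10):
    """Fraction of the top-k ranked sentences that are labeled as tips (1 = tip)."""
    top = ranked_labels[:k]
    if not top:
        return 0.0
    return sum(top) / len(top)


def mean_precision_at_k(labels_per_song, k=10):
    """Average top-k precision over several songs."""
    scores = [precision_at_k(labels, k) for labels in labels_per_song]
    return sum(scores) / len(scores) if scores else 0.0
```

The per-song average corresponds to how a figure reported "on average" over a set of songs would be obtained under these assumptions.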

The main contributions of this paper are as follows:

1. To the best of our knowledge, we are the first to introduce and study the tip generation task in the music domain.
2. We summarize the characteristics of tips in the music domain and release the first annotated Chinese dataset for tip generation, named MTips, to facilitate future research. We also provide a detailed analysis of our dataset.
3. We present a learning-to-generate framework named GenTMS for automatically producing tips from song reviews. Extensive evaluation shows the effectiveness of our proposed framework.

Paper structure. The remainder of the paper is organized as follows. Section 2 reviews the related work. Section 3 introduces the annotation process and data analysis. Our proposed framework is presented in Section 4. We describe the experiment setup in Section 5 and elaborate on the benchmark results in Section 6. We conclude and discuss future work in Section 7.

Section snippets

Song review mining

Mining song reviews has long been studied. Hu, Downie, West, and Ehmann (2005) demonstrate a system to predict star ratings by classifying reviews according to the genre of the song. In fact, in addition to star ratings, we can also assign text messages to songs, which is more acceptable to users. Pinter, Paul, Smith, and Brubaker (2020) create a dataset of expert reviews and offer several possible avenues for research. Different from their study in expert reviews, most of the work still focus

Data annotation and analysis

In this section, we describe the workflow of the annotation process of Chinese tips for song reviews. We obtained user reviews of songs from our business partner, one of the most popular Chinese music platforms. We first present the definition of tips of songs, and then introduce the data preprocessing and annotation steps. We finally summarize the characteristics of tips and conduct a detailed analysis of the annotated dataset.

Models

In this section, we elaborate on the learning-to-generate framework GenTMS for automatic tip generation. Fig. 5 illustrates the workflow of GenTMS, which includes two main components: sentence relevance ranking and diversity-weighted re-ranking. Given the candidate review sentences for a song, the sentence relevance ranking module is divided into two parts, content-based ranking and stylistic-based ranking. The content-based ranking component aims at learning to measure the content representativeness

Dataset

The MTips dataset is described in Section 3. We split the dataset into training, validation and test sets. We first randomly select 9 songs and 50 sentences from each of them to form the test sets. This is to test the model’s ability to recognize new tips under new reviews of the seen songs. Then we randomly split the remaining part into training and validation sets in the ratio of 8:2.
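The split described above can be sketched as follows. The sketch assumes the data are held as a mapping from song identifier to its labeled sentences, and that the sentences of the 9 selected songs that are not held out remain in the training/validation pool (consistent with the songs being "seen"). The function name, the data layout, and the fixed seed are illustrative assumptions.

```python
import random


def split_dataset(sentences_by_song, n_test_songs=9, n_test_sents=50,
                  train_ratio=0.8, seed=0):
    """Hold out sentences from a few songs for testing, then split the rest 8:2."""
    rng = random.Random(seed)
    test_songs = set(rng.sample(sorted(sentences_by_song), n_test_songs))
    test, rest = [], []
    for song, sents in sentences_by_song.items():
        sents = list(sents)
        if song in test_songs:
            # Randomly hold out n_test_sents sentences of this song for testing.
            rng.shuffle(sents)
            test.extend((song, s) for s in sents[:n_test_sents])
            sents = sents[n_test_sents:]
        # Everything not held out goes to the training/validation pool.
        rest.extend((song, s) for s in sents)
    rng.shuffle(rest)
    cut = int(len(rest) * train_ratio)
    return rest[:cut], rest[cut:], test
```

Fixing the seed keeps the split reproducible across runs.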

Besides, to simulate the practical use of tip extraction, we evaluate our framework on previously-unseen

Experiment evaluation

In this section, we present the evaluation of GenTMS. We first analyze the performance of the whole framework. Then we report a detailed analysis of each module.

Conclusion and future work

In this paper, we propose GenTMS, the first learning-to-generate framework for generating tips from song reviews, and publish MTips, the first Chinese tip dataset in the music domain. We analyze the dataset and evaluate the effectiveness of each module of GenTMS. Benchmark results on top-k precision evaluation and practical evaluation show that the proposed framework is effective. In the future, we will explore the use of tips for personalized

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This research was supported by the National Key R&D Program of China (No. 2022YFB3103900), the National Natural Science Foundation of China (Nos. 62002084, 62276075, 62272132), Shenzhen Basic Research, China (General Project No. JCYJ20220531095214031), the Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies, China (2022B1212010005), and the Major Key Project of PCL, China (Grant Nos. PCL2022A03, PCL2021A02, PCL2021A09).

References (36)

Krestel, R., et al. (2015). Diversifying customer review rankings. Neural Networks.
Moirangthem, D. S., et al. (2020). Abstractive summarization of long texts by representing multiple compositionalities with temporal hierarchical pointer generator network. Neural Networks.
Pandey, D., et al. (2020). Modeling coherence by ordering paragraphs using pointer networks. Neural Networks.
Zhu, D., et al. (2018). Unsupervised tip-mining from customer reviews. Decision Support Systems.
Chen, L., et al. Fast greedy MAP inference for determinantal point process to improve recommendation diversity.
Cho, S., et al. Improving the similarity measure of determinantal point processes for extractive multi-document summarization.
Devlin, J., et al. BERT: Pre-training of deep bidirectional transformers for language understanding.
Gamzu, I., et al. (2021). Identifying helpful sentences in product reviews.
Guy, I., et al. (2017). Extracting and ranking travel tips from user-generated reviews.
Hirsch, S., et al. (2021). Generating tips from product reviews.
Hofmann, T. (2013). Probabilistic latent semantic analysis.
Hu, X., et al. (2005). Mining music reviews: Promising preliminary results.
Joachims, T. Text categorization with support vector machines: Learning with many relevant features.
Joulin, A., et al. (2016). Bag of tricks for efficient text classification.
Kulesza, A., et al. (2012). Determinantal point processes for machine learning. Foundations and Trends in Machine Learning.
Li, J., et al. A diversity-promoting objective function for neural conversation models.
Li, P., et al. Persona-aware tips generation?
Li, P., et al. Neural rating regression with abstractive tips generation for recommendation.
