Elsevier

Information Sciences

Volumes 451–452, July 2018, Pages 295-309
Leveraging sentiment analysis at the aspects level to predict ratings of reviews

https://doi.org/10.1016/j.ins.2018.04.009

Abstract

Online reviews are an important asset for users who are deciding to buy a product, see a movie, or go to a restaurant, and for managers who are making business decisions. Reviews on e-commerce websites are usually accompanied by ratings, which helps users learn from them. However, many reviews spread across forums or social media are plain text without ratings; this paper calls them non-rated reviews. From the perspective of sentiment analysis at the aspect level, this study develops a predictive framework for calculating ratings for non-rated reviews. The idea behind the framework began with an observation: the sentiment of an aspect is determined by its context, and the rating of a review depends on the sentiment of its aspects and on the number of positive and negative aspects in the review. Viewing term pairs that co-occur with aspects as their context, we conceived of a variant of the Conditional Random Field model, called SentiCRF, for generating term pairs and calculating their sentiment scores from a training set. Then, we developed a cumulative logit model that uses the aspects in a review and their sentiments to predict the rating of the review. In addition, we encountered class imbalance when calculating the sentiment scores of term pairs and conceived of a heuristic re-sampling algorithm to tackle it. Experiments were conducted on the Yelp dataset, and the results demonstrate that the predictive framework is feasible and effective at predicting the ratings of reviews.

Introduction

Online reviews are an important asset for users who are deciding to buy a product, see a movie, or go to a restaurant, and for managers who are making business decisions. In the context of e-commerce, reviews usually refer to the text posted under the products, services, or businesses shown on an e-commerce website. They are typically accompanied by star ratings that range from 1 star to 5 stars, which helps visitors learn from the reviews. Fig. 1 shows an example of such a review on the Amazon website, which concerns the Samsung Galaxy S6. A 3-star rating is assigned to the item.

However, other types of reviews are also widely spread across forums or social media. They do not show significant differences compared with those on the e-commerce website, except for the lack of ratings. This paper calls them non-rated reviews. For example, Fig. 2 illustrates a tweet on Twitter that addresses the iPhone 7.

Many business intelligence (BI) applications consider text messages that are scattered across the Internet to be important data sources. If these applications could provide ratings for non-rated reviews, it would substantially help in making business decisions. For example, a system could collect reviews from the Internet that are associated with a certain business and then calculate star ratings for them. The rated reviews could then be aggregated to track the performance of the business over time, which enables managers to conveniently gain insight into the business. Predicting the ratings of non-rated reviews could therefore be a valuable tool for building business intelligence applications.

Rating prediction is a subfield of sentiment classification [3], [4], [9], [24], [26]. This subfield has attracted much interest since it emerged in 2005 [8]. For instance, TASS is an experimental evaluation workshop for sentiment analysis that has focused on the Spanish language since 2012. Participants are expected to submit experiments for the 6-label evaluation (strong positive, positive, neutral, negative, strong negative and a no-sentiment tag) on a public corpus. The six labels are comparable to the star ratings used in e-commerce.

However, most previous research treats predicting the ratings of reviews as a document-level multilabel classification task [6], [7], [8]. Examining reviews on e-commerce websites, we obtain Observation 1. Motivated by this observation, we believe that the performance of rating prediction can be significantly improved by leveraging sentiment analysis at the aspect level.

Observation 1: Every review involves at least one aspect; almost all of the aspects in a 5-star review carry positive sentiment; the aspects of a 1-star review tend to carry negative sentiment; and a 3-star review comprises a nearly equal number of positive and negative aspects.

Fig. 3 shows a 3-star review taken from the Yelp website, where the aspects highlighted in red have positive sentiments and those highlighted in yellow have negative sentiments. In this review, the number of positive aspects is almost equal to the number of negative aspects.

This study intends to build a predictive framework based on sentiment analysis at the aspect level to provide star ratings for non-rated reviews. Observation 1 is the cornerstone of this study. We also seek to provide evidence for the reliability of the observation via experiments (see Section 4.3 for details). Motivated by Observation 1, the task of predicting ratings can be decomposed into three steps: extracting the aspects, obtaining their sentiments, and then predicting the rating based on the aspects. To calculate the sentiment scores of the aspects, we must derive the context of the aspects, i.e., the terms that co-occur with the aspects in the same sentence, because the sentiment of an aspect depends on its context. Traditional lexicon-based sentiment analysis uses the sentiment scores of words provided by a lexicon to determine the sentiment scores of phrases, sentences, or even documents. Prior work [4], [5], [25] has shown that the sentiment of a word can change depending on its context. Hence, in this study, we use the term pair 〈w1, w2〉 as the basic element, where each of the terms w1 and w2 is considered the context of its counterpart. We use the list of term pairs that co-occur with an aspect as the context of the aspect. By calculating the sentiment scores of the term pairs and then aggregating them, we obtain the sentiment score of the aspect. Furthermore, a cumulative logit model is developed that uses the sentiment scores of the aspects as features for predicting the ratings of the reviews. The main contributions of this study include the following:

  • (1) Develop a variant of the Conditional Random Field model, called SentiCRF, to build term pairs and calculate their sentiment scores.

  • (2) Conceive of a heuristic re-sampling algorithm to address the class imbalance that is encountered when we train the SentiCRF model.

  • (3) Build a framework to predict the ratings of non-rated reviews.
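The three steps described in the Introduction (extract aspects, score them from their term-pair context, predict the rating) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the `PAIR_SCORES` table stands in for scores that SentiCRF would learn, term pairs are formed from all distinct terms of a sentence that mentions the aspect, aggregation is a plain average, and the cumulative logit coefficients `beta` and `thresholds` are hand-picked rather than fitted.

```python
import math
from itertools import combinations

# Hypothetical term-pair scores; in the paper these are learned by SentiCRF.
PAIR_SCORES = {
    ("food", "great"): 1.5,
    ("service", "slow"): -1.2,
}

def term_pairs(tokens, aspect):
    """Unordered pairs of distinct terms in a sentence that mentions the aspect."""
    if aspect not in tokens:
        return []
    return list(combinations(sorted(set(tokens)), 2))

def aspect_score(tokens, aspect, pair_scores=PAIR_SCORES):
    """Aggregate (assumption: average) the scores of known term pairs around an aspect."""
    known = [pair_scores[p] for p in term_pairs(tokens, aspect) if p in pair_scores]
    return sum(known) / len(known) if known else 0.0

def rating_probabilities(features, beta, thresholds):
    """Cumulative logit model: P(rating <= k) = sigmoid(theta_k - beta . x)."""
    eta = sum(b * x for b, x in zip(beta, features))
    cumulative = [1.0 / (1.0 + math.exp(-(t - eta))) for t in thresholds] + [1.0]
    probs = [cumulative[0]]
    probs += [cumulative[k] - cumulative[k - 1] for k in range(1, len(cumulative))]
    return probs

def predict_rating(sentences, aspects, beta=(1.0, -1.0),
                   thresholds=(-2.0, -0.7, 0.7, 2.0)):
    """Predict a 1-5 star rating from counts of positive and negative aspects."""
    scores = [aspect_score(tokens, aspect)
              for tokens in sentences
              for aspect in aspects
              if aspect in tokens]
    n_pos = sum(s > 0 for s in scores)
    n_neg = sum(s < 0 for s in scores)
    probs = rating_probabilities((n_pos, n_neg), beta, thresholds)
    return 1 + probs.index(max(probs)), probs

review = [["the", "food", "was", "great"], ["the", "service", "was", "slow"]]
rating, probs = predict_rating(review, aspects=["food", "service"])
print(rating, [round(p, 3) for p in probs])
```

With one positive and one negative aspect, the toy coefficients place the most probability mass on the middle rating, which mirrors the 3-star case in Observation 1. In practice the coefficients and thresholds would be estimated from rated reviews rather than set by hand.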

The remainder of this paper is organized as follows. Section 2 reviews the related research on predicting the star ratings of the reviews. Section 3 introduces a predictive framework. Section 4 presents the experimental results. We also discuss interesting findings in Section 5. Section 6 offers the concluding remarks.

Section snippets

Related work

The task of predicting the star ratings of reviews originates from the sentiment classification of reviews, i.e., classifying reviews as recommended (thumbs up) or not recommended (thumbs down) [3]. Research work for review classification can be divided into machine learning methods, lexicon-based methods and hybrid methods. Some studies see review classification as a problem of text classification, which employs traditional machine learning methods, e.g., SVM (Support Vector Machine) [1] or

A predictive framework

To better present this study, we first provide three definitions. The Yelp website uses the term business to refer to the target of the reviews. We also employ this term to present our work.

Definition 1 (Business). This study uses business as a general name for the items, products, or services presented on an e-commerce website that are the targets of reviews.

Liu [17] uses the term entity to indicate the products and uses aspects to refer to the features or attributes of the

Experiments

In this section, we evaluate both the feasibility and the performance of the predictive framework developed in this study using the Yelp dataset. The experiments are conducted on a PC with an Intel i7 processor and 32 GB of RAM.

SentiCRF model

In Section 4, we employ the Yelp dataset to train the SentiCRF model. We list the top 20 positive-sentiment and negative-sentiment term pairs in Table 8. We rank the term pairs by their sentiment score, $sc$, which is calculated using the formula $sc = \log(\lambda_{pos} + 1) - \log(\lambda_{neg} + 1)$. If $sc$ is larger than 0, then the term pair exhibits a positive sentiment; otherwise, it indicates a negative sentiment.
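The ranking step can be sketched as follows. This sketch assumes that the sentiment score is the difference of the two logarithms, which is the reading consistent with using 0 as the boundary between positive and negative sentiment. The weights in `example_weights` are made-up placeholders for the λpos and λneg values of each term pair, and the function names are hypothetical.

```python
import math

def sentiment_score(lambda_pos, lambda_neg):
    """sc = log(lambda_pos + 1) - log(lambda_neg + 1); assumes both weights exceed -1."""
    return math.log(lambda_pos + 1) - math.log(lambda_neg + 1)

def rank_term_pairs(weights, top_n=20):
    """Split term pairs into top positive (sc > 0) and top negative (sc <= 0) lists."""
    scored = [(pair, sentiment_score(lp, ln)) for pair, (lp, ln) in weights.items()]
    ranked = sorted(scored, key=lambda item: item[1], reverse=True)
    positive = [item for item in ranked if item[1] > 0][:top_n]
    negative = [item for item in reversed(ranked) if item[1] <= 0][:top_n]
    return positive, negative

# Made-up (lambda_pos, lambda_neg) weights, for illustration only.
example_weights = {
    ("great", "food"): (3.0, 0.0),
    ("slow", "service"): (0.0, 3.0),
    ("friendly", "staff"): (1.0, 0.5),
}

positive, negative = rank_term_pairs(example_weights)
print("positive:", positive)
print("negative:", negative)
```

Under this reading, a term pair is positive exactly when its positive weight exceeds its negative weight, and adding 1 inside each logarithm keeps the score defined when a weight is 0.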

From observing Table 8, we find that ‘downside’ surprisingly shows at the top of the list of positive

Conclusions

Predicting the ratings of the reviews is a task that is both interesting and challenging. This task could be a basic and even key component in a business intelligence application. In this paper, we propose a predictive framework to meet the challenges. We develop a SentiCRF model to build a collection of term pairs and to calculate their sentiments from a training set. To predict the star rating of a non-rated review, we extract the aspects of the review and their contexts, i.e., term pairs

Acknowledgments

This work was supported by National Natural Science Foundation of China (no. 71571145), Humanities and Social Science Foundation of Ministry of Education of China (no. 14YJAZH063) and the Fundamental Research Funds for the Central Universities (nos. JBK120505 and JBK150503).

References (28)

  • T. Wilson et al., Recognizing contextual polarity: an exploration of features for phrase-level sentiment analysis, Comput. Ling. (2009)
  • A. Weichselbraun et al., Extracting and grounding context-aware sentiment lexicons, IEEE Intell. Syst. (2013)
  • L. Qu, G. Ifrim, G. Weikum, The bag-of-opinions method for review rating prediction from sparse text patterns,...
  • D. Tang et al., User modeling with neural network for review rating prediction
