Information Sciences

Volumes 367–368, 1 November 2016, Pages 689-699

Exploiting flexible-constrained K-means clustering with word embedding for aspect-phrase grouping

https://doi.org/10.1016/j.ins.2016.07.002

Abstract

Aspect-phrase grouping is an important task for aspect identification in sentiment analysis. Most existing methods for this task are based on a window-context model, which assumes that aspect-phrases of the same aspect have similar co-occurrence contexts. This assumption does not always hold in practice. In this paper, we develop a novel weighted context representation model based on semantic relevance, which exploits word embeddings to represent aspect-phrases. We further encode lexical knowledge as constraints with a degree of belief, and propose a flexible-constrained K-means algorithm to cluster aspect-phrases. Empirical evaluation shows that the proposed method outperforms existing state-of-the-art methods.

Introduction

Aspect-level sentiment analysis is an important task in sentiment analysis and has attracted increasing attention [5], [7], [16], [17], [23], [31]. A necessary step in this task is to identify and group aspect-phrases from the corpus. The challenge is that people may use different words or phrases to refer to the same aspect in reviews.

To make this clearer, we introduce two concepts: aspect and aspect-phrase. An aspect is the name of a feature of a product, while an aspect-phrase is a word or phrase that actually appears in a sentence to indicate that aspect. For example, the aspect "picture quality" may be expressed as "photo", "image", or "picture". All the aspect-phrases in a group indicate the same aspect, so grouping aspect-phrases is a necessary step for aspect-level sentiment analysis.

In this paper, we assume that all aspect-phrases have been identified by existing extraction methods [7], [9], [11], [12], [13], [18], and we focus on grouping domain-synonymous aspect-phrases.

Existing studies of this problem, which mainly utilize the context information of aspect-phrases, have some limitations. These methods assume that different aspect-phrases of the same aspect should have similar co-occurrence contexts. As shown in [6], [28], [29], [30], [31], they collect contextual words for each aspect-phrase within a text window [−t, t] over all the sentences in which the aspect-phrase appears. The limitations are:

  • 1.

    A fixed value of t does not suit sentences of varying length: a smaller t fails to capture enough surrounding words, while a larger t introduces more noisy words.

  • 2.

    They use term frequency to weight the context. Although term frequency is a simple and effective feature weighting method for clustering, it is often unhelpful for our task.

As Fig. 1 shows, t=4 is not a proper setting for S2 and S3. For instance, the two aspect-phrases in S2 have very similar contexts as well as term frequencies. Taking sentence S2 as an example, when t=4, the context of "picture" is

{picture, clear, bright, sharp, good},

the context of “sound” is

{clear, bright, sharp, sound, good}.

When using (picture, clear, bright, sharp, sound, good) to represent the 6 dimensions in word vector space model, the vector representation of “picture” and “sound” in S2 will be very similar:

⟨1, 1, 1, 1, 0, 1⟩, ⟨0, 1, 1, 1, 1, 1⟩.

These limitations of previous work motivate our research. In our method, the goal of context weighting is to obtain a more reasonable feature representation for each aspect-phrase. Our weighting method is essentially a semantic relevance metric based on word embeddings. Existing methods assume that all the words in a review sentence form a mixture of aspect-phrases. We further assume that the contextual words of an aspect-phrase have higher semantic relevance to that aspect-phrase than to other aspect-phrases, and we propose a novel context extraction and representation method. We use all the surrounding words of an aspect-phrase and weight them by the semantic relevance between the aspect-phrase and each surrounding word. For example, our method obtains the following two vectors for "picture" and "sound" in S2:

⟨1, 0.51, 0.53, 0.57, 0.43, 0.44⟩, ⟨0.43, 0.40, 0.34, 0.35, 1, 0.54⟩,

where the weight of each dimension is the semantic relevance between the aspect-phrase and the corresponding contextual word. The semantic relevance is calculated from word vectors learned by a word embedding model.
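As a concrete illustration of this weighting scheme, the following sketch builds a context vector for an aspect-phrase by scoring each context word with the cosine similarity of word vectors. The embeddings here are random stand-ins (the paper learns real ones from a large web corpus), so the resulting weights are illustrative only.

```python
import numpy as np

# Toy stand-ins for word vectors; the paper learns real embeddings
# from a large-scale web corpus, so these values are illustrative only.
rng = np.random.default_rng(42)
vocab = ["picture", "clear", "bright", "sharp", "sound", "good"]
emb = {w: rng.standard_normal(8) for w in vocab}

def cosine(u, v):
    """Cosine similarity between two word vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def weighted_context(aspect, context_words):
    """Represent an aspect-phrase over its context: each dimension is
    the semantic relevance (cosine similarity) between the aspect and
    the corresponding context word; the aspect itself gets weight 1."""
    return np.array([1.0 if w == aspect else cosine(emb[aspect], emb[w])
                     for w in context_words])

v_picture = weighted_context("picture", vocab)
v_sound = weighted_context("sound", vocab)
```

Because cosine similarity is bounded, every weight falls in [−1, 1], and the aspect-phrase's own dimension is always 1, matching the vectors shown above in form.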

On the other hand, prior constraint knowledge can improve the grouping process, but existing methods still leave much to be desired. By exploiting morphological relations and synonymy among aspect-phrases [28], [29], [30], a Must-Link (ML) constraint, which requires the two points of a constraint to be grouped into the same cluster, is created, e.g., "battery life" and "battery power", or "image" and "picture".

However, because this knowledge is learned without supervision, noisy constraints may be included. Previous methods therefore allow all points to be reassigned to other groups during clustering; in other words, the constraints only provide an initialization, and each constraint can subsequently be violated. We argue that if a constraint is strong enough, it should not be broken. For example, "battery life" and "battery power" should be grouped into the same cluster, while "signal quality" and "picture quality" may be assigned to different groups in phone reviews.

Therefore, we develop the FC-Kmeans algorithm, which encodes a degree of belief in each ML constraint and guarantees the satisfaction of constraints whose degree of belief exceeds a specified threshold. In other words, in our method the constraint relationship between points is no longer 0 or 1, but a real value, e.g.

ml1 = ⟨picture, picture quality, 0.85⟩,  ml2 = ⟨signal quality, picture quality, 0.55⟩,

where a triple ⟨ai, aj, d⟩ represents a constraint: ai and aj denote the ith and jth aspect-phrases, and d is a real value denoting the degree of belief of the constraint. With a threshold of 0.7, constraint ml1 must be satisfied while ml2 may be violated. A constraint that is strong in one domain may be weak in another, so the degree of belief is context-sensitive in our method and the threshold is domain dependent.
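With constraints stored as such triples, splitting them into hard (must be satisfied) and soft (may be violated) sets reduces to a simple threshold filter. The triples and threshold below mirror the ml1/ml2 example; the values are illustrative.

```python
# Must-link constraints as (a_i, a_j, degree-of-belief) triples,
# mirroring ml1 and ml2 above; the belief values are illustrative.
constraints = [
    ("picture", "picture quality", 0.85),
    ("signal quality", "picture quality", 0.55),
]

threshold = 0.7  # domain dependent, chosen per corpus

# Constraints at or above the threshold must be satisfied by the
# clustering; the rest may be violated.
hard = [(a, b) for a, b, d in constraints if d >= threshold]
soft = [(a, b) for a, b, d in constraints if d < threshold]
```

Only the hard set is enforced during cluster assignment; the soft set behaves like the initialization-only constraints of earlier methods.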

In summary, this paper proposes a more flexible framework for constrained K-Means algorithm (called FC-Kmeans) with weighted context representation, which is based on semantic relevance. It makes three main contributions:

  • 1.

    We propose a weighted context representation method for aspect-phrases, which exploits a recent neural language model together with a large-scale external web corpus.

  • 2.

    We develop a flexible-constrained K-means method, which encodes strength into each constraint. The method guarantees the satisfaction of strong constraints, flexibly controlled by a threshold.

  • 3.

    Through experiments on four datasets, we demonstrate the effectiveness of our model on aspect-phrase grouping.
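The contributions above can be made concrete with a minimal sketch of how a constrained K-means loop might honor hard must-links; the function and variable names here are our assumptions, not the authors' implementation. Points joined by an above-threshold constraint are merged into a chunklet via union-find and always assigned to the same cluster, while everything else follows the usual assign/update iterations.

```python
import numpy as np

def fc_kmeans(X, k, hard_links, init, n_iter=20):
    """Flexible-constrained K-means sketch: point indices joined by a
    hard (above-threshold) must-link always share a cluster; soft links
    are simply ignored here. `init` lists the initial centroid indices."""
    X = np.asarray(X, dtype=float)
    n = len(X)

    # Union-find merges hard-linked points into chunklets.
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for i, j in hard_links:
        parent[find(i)] = find(j)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    chunklets = list(groups.values())

    centroids = X[list(init)].copy()
    labels = np.zeros(n, dtype=int)
    for _ in range(n_iter):
        # Assignment: each chunklet jointly picks the centroid that
        # minimizes its total squared distance.
        for members in chunklets:
            d = ((X[members][:, None, :] - centroids[None]) ** 2).sum(-1).sum(0)
            labels[members] = int(np.argmin(d))
        # Update: recompute each centroid from its assigned points.
        for c in range(k):
            pts = X[labels == c]
            if len(pts):
                centroids[c] = pts.mean(axis=0)
    return labels

# Two well-separated groups with a hard link inside each group.
labels = fc_kmeans([[0, 0], [0.1, 0], [0, 0.1],
                    [10, 10], [10.1, 10], [10, 10.1]],
                   k=2, hard_links=[(0, 1), (3, 4)], init=[0, 3])
```

Because a chunklet is assigned as a unit, hard-linked points can never end up in different clusters, which is exactly the guarantee the threshold is meant to provide.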


Related work

In the past decade, document-level and sentence-level sentiment analysis have been studied by many researchers. Although both are useful in many cases, they still leave much to be desired. Therefore, Hu and Liu [7] proposed extracting aspects of entities from opinionated text and summarizing people's opinions on aspects and entities. The work in [7] is the first at the aspect level; its aim is to obtain more fine-grained sentiment analysis. In fact, the entity is usually the reviewed

Overview of our method

A sketch of our method for aspect-phrase grouping is illustrated in Fig. 2. The original context of each aspect-phrase is formed by aggregating the surrounding words of every sentence that contains the aspect-phrase. To obtain a weighted context, we learn a word vector list from a large-scale web corpus, which is easily crawled from the World Wide Web. We then calculate the semantic relevance between an aspect-phrase and its context words based on the obtained word vectors

Experiments

In this section, we evaluate our method and compare it with several baselines, and perform parameter tuning.

Conclusion

This paper studies the problem of product aspect-phrase grouping for aspect-level sentiment analysis. For this grouping task, we exploit a recent neural language model to learn word vectors, and then represent each aspect-phrase with a novel weighted context based on semantic relevance, measured by the cosine distance of word vectors. Meanwhile, we exploit lexical knowledge to learn a constraint collection and estimate its degree of belief, then solve the clustering problem in a

Acknowledgment

This work is supported by the National Natural Science Foundation of China (no. 61173062, 61373108, 61133012), the major program of the National Social Science Foundation of China (no. 11&ZD189), the key project of Natural Science Foundation of Hubei Province, China (no. 2012FFA088), the Educational Commission of Henan Province, China (no. 17A520050), and the High Performance Computing Center of Computer School, Wuhan University.

References (31)

  • I. Titov et al.

    Modeling online reviews with multi-grain topic models

    Proceedings of WWW

    (2008)
  • S. Basu et al.

    Semi-supervised clustering by seeding

    ICML

    (2002)
  • Z. Chen et al.

    Exploiting domain knowledge in aspect extraction.

    EMNLP

    (2013)
  • Y. Choi et al.

    Hierarchical sequential learning for extracting opinions and their attributes

    Proceedings of the ACL 2010 Conference Short Papers

    (2010)
  • R. Collobert et al.

    A unified architecture for natural language processing: Deep neural networks with multitask learning

    ICML, New York, NY, USA

    (2008)
  • L. Fang et al.

    Exploring weakly supervised latent sentiment explanations for aspect-level review analysis

    Proceedings of the 22nd ACM international conference on Conference on information and knowledge management, CIKM

    (2013)
  • H. Guo et al.

    Product feature categorization with multilevel latent semantic association

    CIKM

    (2009)
  • M. Hu et al.

    Mining and summarizing customer reviews

    KDD

    (2004)
  • N. Jakob et al.

    Extracting opinion targets in a single-and cross-domain setting with conditional random fields

    Proceedings of the 2010 conference on empirical methods in natural language processing

    (2010)
  • W. Jin et al.

    Opinionminer: A novel machine learning system for web opinion mining and extraction

    KDD ’09

    (2009)
  • Y. Jo et al.

    Aspect and sentiment unification model for online review analysis

    Proceedings of the Fourth ACM International Conference on Web Search and Data Mining, WSDM ’11

    (2011)
  • S.-M. Kim et al.

    Extracting opinions, opinion holders, and topics expressed in online news media text

    Proceedings of ACL Workshop on Sentiment and Subjectivity in Text

    (2006)
  • N. Kobayashi et al.

    Extracting aspect-evaluation and aspect-of relations in opinion mining

    (2007)
  • L.-W. Ku et al.

    Opinion extraction, summarization and tracking in news and blog corpora.

    Proceedings of AAAI-CAAW

    (2006)
  • F. Li et al.

    Structure-aware review mining and summarization

    Proceedings of the 23rd International Conference on Computational Linguistics

    (2010)