ILWAANet: An Interactive Lexicon-Aware Word-Aspect Attention Network for aspect-level sentiment classification on social networking

https://doi.org/10.1016/j.eswa.2019.113065

Highlights

  • The efficient multiple attention mechanisms can extract salient features.

  • Lexicon resources as in-domain knowledge are incorporated into the neural network.

  • Both Phrase-level and Context-level information of an aspect are captured effectively.

Abstract

An Interactive Lexicon-Aware Word-Aspect Attention Network (ILWAAN) is proposed for aspect-level sentiment classification, which deals with identifying the sentiment polarity of a specific aspect in its context and has potential applications in social networking. In this model, effective multiple attention mechanisms (intra-attention and interactive-attention mechanisms) integrated with sentiment lexicon information are developed to form an aspect-specific representation at two levels: Phrase-level and Aggregation-level information. Specifically, an aspect and its context are fused with sentiment lexicon information, and their relationship representations are learned through lexicon-aware attention operations. This allows the model to incorporate the aspect information into the deep neural network and to learn to attend to the correct sentiment context words conditioned on the informative aspect words. We evaluate our model on three benchmark datasets: Twitter, Laptop, and Restaurant. The experimental results indicate that our model improves performance on aspect-level sentiment classification.

Introduction

Aspect-based sentiment analysis (ABSA) aims to identify the sentiment polarity of an aspect term in its context. Consider, for example, the sentence “The Iphone screen is good, but the battery life is short.” This sentence contains two aspects with opposite polarities: “Iphone screen” is positive, whereas “battery life” is negative. In recent years, the ABSA task has grown to be one of the most active research areas in natural language processing (NLP) and has become important to business and society. For instance, a company may want to know the quality of the “screen” of a phone, or authorities may want to assess the situation of people after an earthquake (an event). This work addresses the problem of modeling the relationship between a specific aspect term and its context. Traditional machine learning approaches and lexicon-based approaches were the first to tackle this problem (Go, Bhayani, & Huang, 2009; Liu, 2010; Saif, He, & Alani, 2012; Kiritchenko, Zhu, Cherry, & Mohammad, 2014a). Most traditional machine learning approaches here are supervised (e.g., Support Vector Machines, Maximum Entropy, Naive Bayes) and require significant, laborious feature engineering, while lexicon-based approaches were used to supply additional features for these models. Specifically, lexicon-based approaches rely on publicly available lexicon resources (e.g., WordNet, SentiWordNet) and classify the sentiment of a text based on the overall sentiment polarity of its lexicon words (Taboada, Brooke, Tofiloski, Voll, & Stede, 2011). However, these methods are not capable of sufficiently modeling the semantic relationship between an aspect and its context. Moreover, they are difficult to adapt to different domains or different languages. As such, the ABSA task poses the challenging problem of incorporating aspect information into learning models for making predictions.
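As an illustration, the minimal sketch below (not from the paper, using a hypothetical toy lexicon) shows the aggregate lexicon scoring described above, and also why it cannot separate aspects: the whole example sentence receives a single overall polarity even though its two aspects disagree.

```python
import re

# Hypothetical toy lexicon; real systems would use resources such as SentiWordNet.
TOY_LEXICON = {"good": 1.0, "great": 1.0, "short": -0.5, "bad": -1.0}

def lexicon_polarity(sentence: str) -> str:
    """Sum the lexicon scores of all tokens and map the sign to a label."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    score = sum(TOY_LEXICON.get(tok, 0.0) for tok in tokens)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

# One overall label for the whole sentence, even though its two aspects
# ("Iphone screen" vs. "battery life") have opposite polarities:
print(lexicon_polarity("The Iphone screen is good, but the battery life is short."))
# -> "positive" (1.0 - 0.5 = 0.5), which is why aspect information is needed
```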

Recently, end-to-end neural networks (Dong, Wei, Tan, Tang, Zhou, & Xu, 2014; Ma, Li, Zhang, & Wang, 2017; Tang, Qin, Feng, & Liu, 2016a; Wang, Huang, Zhu, & Zhao, 2016) have garnered considerable attention and achieved promising performance on the ABSA task without any laborious feature engineering. Such models incorporate aspect information into neural architectures by learning to attend to different parts of a context sentence towards a given aspect term. For example, ATAE-LSTM (Wang et al., 2016) is an attention-based LSTM model that fuses aspect information through a naive concatenation of an aspect and its context words in order to extract the parts that are important for the given aspect. The key idea of an attention mechanism is to extract only the most relevant information for prediction. Notably, most dominant state-of-the-art models let the attention layer learn the relative importance of context words after simply concatenating the context words and aspect information. This places an extra burden on the attention layer, which must model sequential information dominated by the aspect, and incurs additional parameter costs in the LSTM layers (Hochreiter & Schmidhuber, 1997), making it hard to model the relationship between the aspect and its context words. Second, such models use only the context, without considering the aspect representation itself, although the aspect should be an important factor for judging its sentiment polarity. In other words, different words carry different degrees of importance for a specific aspect. In the aspect term “Iphone screen”, for example, “screen” is the main aspect word whose sentiment polarity we aim to predict; as such, “screen” plays a more important role than “Iphone”. Finally, attention-based models mainly use pre-trained word embeddings (e.g., GloVe), which capture the semantics of words. Consequently, dot-product attention selects context words based on the semantics of the words while ignoring their sentiment.
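The following minimal sketch (our own illustration, not from any cited system; all names and dimensions are made up) shows plain dot-product attention between an averaged aspect vector and context word embeddings. Scores depend only on vector similarity, which is the semantics-only behavior criticized above.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 50                                  # embedding size (e.g., GloVe-50)
context = rng.normal(size=(7, d))       # 7 context word embeddings (toy values)
aspect = rng.normal(size=(d,))          # averaged aspect embedding (toy values)

scores = context @ aspect               # dot-product similarity, semantics only
weights = np.exp(scores - scores.max())
weights /= weights.sum()                # softmax over context positions

representation = weights @ context      # attention-weighted context vector
print(weights.round(3), representation.shape)
```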

In this work, we propose a novel model that tackles the above challenges by considering each sentiment context word conditioned on the crucial words of a specific aspect. Specifically, we develop an end-to-end deep neural model that combines multiple attention mechanisms (intra-attention and interactive-attention mechanisms) assisted by sentiment lexicon information. The purpose of the lexicon information is to push the model to pay more attention to the sentiment of words instead of only their semantics. Additionally, the inter-dependence between an aspect and its sentiment context words can be captured. Our model, called the Interactive Lexicon-Aware Word-Aspect Attention Network (ILWAAN), treats an aspect and its context separately and divides the responsibilities of the layers for modeling the relationship between them. More specifically, aspect embeddings and context word embeddings, augmented with sentiment lexicon information, are first encoded via LSTM encoders. Subsequently, an intra-attention mechanism and an average pooling are applied to obtain the information of the aspect at two levels: informative phrase-level and aggregation-level information. The average pooling summarizes the information of the aspect, while the intra-attention mechanism learns to weight the words and sub-phrases within the aspect according to their importance, which then allows the interactive-attention mechanisms to learn the relative importance of the fused context words. Our model achieves not only state-of-the-art performance on benchmark datasets but also a significant improvement over many other neural architectures.
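To make this pipeline concrete, below is a schematic PyTorch sketch under our own assumptions (single-layer LSTMs, a linear intra-attention scorer, dot-product interactive attention, lexicon-augmented inputs of dimension 300 + 1). It is not the authors' released implementation; `ILWAANSketch` and all its parameters are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ILWAANSketch(nn.Module):
    def __init__(self, emb_dim=300, lex_dim=1, hidden=100, n_classes=3):
        super().__init__()
        in_dim = emb_dim + lex_dim                       # lexicon-augmented input
        self.ctx_lstm = nn.LSTM(in_dim, hidden, batch_first=True)
        self.asp_lstm = nn.LSTM(in_dim, hidden, batch_first=True)
        self.intra = nn.Linear(hidden, 1)                # intra-attention scorer
        self.out = nn.Linear(2 * hidden, n_classes)

    def interactive_attend(self, query, keys):
        # score each context position against an aspect query vector
        scores = torch.bmm(keys, query.unsqueeze(2)).squeeze(2)
        alpha = F.softmax(scores, dim=1)
        return torch.bmm(alpha.unsqueeze(1), keys).squeeze(1)

    def forward(self, ctx_emb, asp_emb):
        H_ctx, _ = self.ctx_lstm(ctx_emb)                # encode context
        H_asp, _ = self.asp_lstm(asp_emb)                # encode aspect

        # phrase-level aspect view: intra-attention inside the aspect
        beta = F.softmax(self.intra(H_asp).squeeze(2), dim=1)
        asp_phrase = torch.bmm(beta.unsqueeze(1), H_asp).squeeze(1)
        # aggregation-level aspect view: average pooling
        asp_avg = H_asp.mean(dim=1)

        # interactive attention: each aspect view attends over the context
        r_phrase = self.interactive_attend(asp_phrase, H_ctx)
        r_avg = self.interactive_attend(asp_avg, H_ctx)
        return self.out(torch.cat([r_phrase, r_avg], dim=1))

model = ILWAANSketch()
ctx = torch.randn(4, 20, 301)                            # batch of 4 sentences
asp = torch.randn(4, 3, 301)                             # 3-word aspect terms
print(model(ctx, asp).shape)                             # torch.Size([4, 3])
```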

Our contributions

The principal contributions of this paper are as follows:

  • A novel model is developed to learn the associative word-aspect relationship via multiple attention mechanisms.

  • Lexicon information is incorporated to highlight the important information of an aspect and its context via lexicon-augmented word embeddings (see the sketch after this list). These embeddings push the model to pay more attention to the sentiment context words in a sentence via multiple attention mechanisms.

  • We conduct a comprehensive and in-depth analysis of the inner workings of our proposed model.
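As referenced in the second contribution, a minimal sketch of lexicon-augmented word embeddings follows, assuming the augmentation is a simple concatenation of a lexicon polarity score onto each pre-trained vector; the lookup tables are hypothetical toy data.

```python
import numpy as np

glove = {"screen": np.ones(5), "good": np.full(5, 0.5)}   # toy 5-d "GloVe"
lexicon = {"good": 1.0}                                    # toy sentiment lexicon

def augment(word: str) -> np.ndarray:
    """Append the word's lexicon score to its pre-trained embedding."""
    vec = glove.get(word, np.zeros(5))
    score = lexicon.get(word, 0.0)          # 0.0 for non-lexicon words
    return np.concatenate([vec, [score]])   # embedding dim becomes 5 + 1

print(augment("good"))    # [0.5 0.5 0.5 0.5 0.5 1. ]
print(augment("screen"))  # [1. 1. 1. 1. 1. 0.]
```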

In the remaining sections, related work is introduced in Section 2. The architecture of our proposed model is illustrated in Section 3. Finally, we describe the experiments and analysis in Section 4 and finish by drawing essential conclusions.

Section snippets

Related works

Today, the dominant state-of-the-art models are neural networks, which are widely used for NLP tasks, and the ABSA task is no exception. To incorporate aspect information, several neural architectures based on LSTMs model each sentence towards a given aspect term (Dong, Wei, Tan, Tang, Zhou, & Xu, 2014; Liu & Zhang, 2017; Ma, Li, Zhang, & Wang, 2017; Tang, Qin, Feng, & Liu, 2016a; Wang, Huang, Zhu, & Zhao, 2016). These models utilize the power of LSTM layers and attention layers

Proposed deep neural networks

In this section, the ILWAAN model is introduced; it learns to attend to the correct sentiment context words given an aspect term. Subsequently, variants of the ILWAAN model are developed to conduct a partial evaluation and assess the significance of our model for aspect-level sentiment analysis. For clarity, we illustrate the architecture of the ILWAAN model layer by layer, with the functionality of each component.

Experiments

This section describes the earlier studies compared against our models and presents the evaluation metric, datasets, and model configurations used for the comparison.
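The snippet above does not name the metric; accuracy and macro-F1 are the usual choices for the Twitter, Laptop, and Restaurant benchmarks, so this minimal sketch assumes them (labels and predictions are toy data).

```python
from sklearn.metrics import accuracy_score, f1_score

# Labels: 0 = negative, 1 = neutral, 2 = positive
y_true = [2, 0, 1, 2, 0, 1]   # gold aspect polarities (toy data)
y_pred = [2, 0, 1, 1, 0, 2]   # model predictions (toy data)

print("Accuracy:", accuracy_score(y_true, y_pred))
print("Macro-F1:", f1_score(y_true, y_pred, average="macro"))
```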

Conclusions

This work proposes a novel method for aspect-level sentiment classification that incorporates LSTM, intra-attention, interactive-attention, and sentiment lexicons, in which the model learns to attend to word-aspect associations at two levels: Phrase-level and Context-level information. Our results indicate that lexicon-based approaches are still relevant and contribute effectively to deep learning models. Experiments on SemEval2014 and Twitter show that our model can learn effective features and provide

CRediT authorship contribution statement

Huy-Thanh Nguyen: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing - original draft, Writing - review & editing. Le-Minh Nguyen: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing - original draft, Writing - review & editing.

Declaration of Competing Interest

Both authors agree that there is no conflict of interest associated with this paper.

Acknowledgments

The authors would like to thank the anonymous reviewers for their feedback and comments. This work was supported by the project “Building a machine translation system to support the translation of documents between Vietnamese and Japanese to help managers and businesses in Hanoi approach the Japanese market”, No. TC.02-2016-03.

References (20)

  • P. Chen et al.

    Recurrent attention network on memory for aspect sentiment analysis

    Proceedings of the conference on empirical methods in natural language processing

    (2017)
  • L. Dong et al.

    Adaptive recursive neural network for target-dependent twitter sentiment classification

    Proceedings of the 52nd annual meeting of the association for computational linguistics (volume 2: Short papers)

    (2014)
  • A. Go et al.

    Twitter sentiment classification using distant supervision

    Processing

    (2009)
  • S. Hochreiter et al.

    Long short-term memory

    Neural Computation

    (1997)
  • M. Hu et al.

    Mining and summarizing customer reviews

    Proceedings of the ACM SIGKDD international conference on knowledge discovery and data mining - KDD

    (2004)
  • S. Kiritchenko et al.

    NRC-Canada-2014: Detecting aspects and sentiment in customer reviews

    Proceedings of the 8th international workshop on semantic evaluation (SEMEVAL)

    (2014)
  • S. Kiritchenko et al.

    Sentiment analysis of short informal texts

    Journal of Artificial Intelligence Research

    (2014)
  • B. Liu

Sentiment analysis and subjectivity

    Handbook of natural language processing

    (2010)
  • J. Liu et al.

Attention modeling for targeted sentiment

    Proceedings of the 15th conference of the european chapter of the association for computational linguistics: Volume 2, short papers

    (2017)
  • D. Ma et al.

    Interactive attention networks for aspect-level sentiment classification

    Proceedings of the twenty-sixth international joint conference on artificial intelligence (IJCAI)

    (2017)
There are more references available in the full text version of this article.
