Knowledge-guided multi-granularity GCN for ABSA

https://doi.org/10.1016/j.ipm.2022.103223

Highlights

  • This paper proposes a knowledge-guided multi-granularity graph convolutional network (KMGCN) to address these problems.

  • A multi-granularity attention mechanism is designed to enhance the interaction between aspect terms and opinion words.

  • To handle long-distance dependence, KMGCN uses a graph convolutional network over a semantic graph built from a fine-tuned pre-trained model.

  • In particular, KMGCN uses a mask mechanism guided by conceptual knowledge to cope with new aspect terms.

Abstract

Aspect-based sentiment analysis aims to determine sentiment polarities toward specific aspect terms within the same sentence or document. Most recent studies adopted attention-based neural network models to implicitly connect aspect terms with context words. However, these studies were limited by insufficient interaction between aspect terms and opinion words, leading to poor performance on robustness test sets. In addition, we have found that robustness test sets create new sentences that interfere with the original information of a sentence, which often makes the text too long and leads to the problem of long-distance dependence. At the same time, these new sentences introduce more non-target aspect terms, which mislead the model in the absence of relevant knowledge guidance. This study proposes a knowledge-guided multi-granularity graph convolutional neural network (KMGCN) to solve these problems. A multi-granularity attention mechanism is designed to enhance the interaction between aspect terms and opinion words. To address long-distance dependence, KMGCN uses a graph convolutional network over a semantic graph built from a fine-tuned pre-trained model. In particular, KMGCN uses a mask mechanism guided by conceptual knowledge to handle additional aspect terms (both target and non-target). Experiments are conducted on 12 SemEval-2014 variant benchmarking datasets, and the results demonstrate the effectiveness of the proposed framework.

Introduction

Aspect-based sentiment analysis (ABSA) (Pontiki et al., 2016, Schouten and Frasincar, 2015) is a subtask of text sentiment analysis that differs from the traditional sentiment analysis of documents or sentences. ABSA aims to summarize the sentiment polarities of users toward specific aspect terms in a sentence. For example, in the sentence “Great food but the service was dreadful!”, the sentiment polarities for the aspect terms “food” and “service” are “positive” and “negative”, respectively. Because the two aspect terms in this example express completely opposite sentiment polarities, assigning a single sentence-level sentiment polarity to them is unreasonable. In this regard, compared with sentence-level or document-level sentiment analysis, ABSA can provide better insight into user reviews.

In addition, held-out datasets are often not comprehensive, so trained models tend to inherit the same biases as the training data, which poses a more severe challenge for ABSA. Robustness testing slightly modifies the input and checks whether the prediction changes. Studies on robustness (Xing et al., 2020) examine whether a neural network model has learned real semantic information from the training text. Robustness tests can be divided into three categories. As shown in Fig. 1, we provide two robustness test examples from the SemEval-2014 benchmarking dataset. In Fig. 1, “Original text” is a test sample from the original restaurant-review dataset, and the three boxes below show the three types of conversion. “RevTgt text” reverses the sentiment of the target aspect term: in C1, the sentiment polarity of “service” is positive, and after RevTgt it becomes negative. “RevNon” reverses the sentiment of non-target aspect terms that originally share the target’s sentiment. “AddDiff” adds aspect terms with the opposite sentiment to the target aspect. These robustness test datasets were derived by Gui et al. (2021).

We identify three problems that remain unsolved in current state-of-the-art methods for improving the robustness of ABSA. The first is that these methods do not actually learn the interactive information between aspect terms and context words in the semantic space, so they perform extremely poorly on the robustness test set. Poria, Cambria, and Gelbukh (2016) used convolutional neural networks (CNNs) to capture the connection between aspect terms and context words. Building on these studies, Tang, Qin, and Liu (2016) adopted multiple layers of long short-term memory (LSTM) with memory cells to capture the hidden states of context words. Xue and Li (2018) proposed an easily parallelized model based on CNNs and gating mechanisms that selectively captures the connection between aspect terms and context words. In addition, recurrent neural networks (RNNs) augmented with attention mechanisms (He et al., 2018, Ma et al., 2017) have been widely used to capture the associations between context words and aspect terms. However, these methods only fused aspect information and context information and did not fully use their link information in the semantic space. The lack of this semantic interaction significantly limits their performance on the robustness test dataset. This problem exists not only in ABSA but also in other natural language processing tasks such as question answering and summarization, making it an urgent problem in natural language processing. Therefore, we designed a multi-granularity attention mechanism that enhances the interaction between different words by combining coarse- and fine-grained attention; this is particularly important on the robustness test dataset, because noise elements are artificially added to it.
The multi-granularity attention mechanism dynamically simulates the relationship between aspect terms (target and non-target aspect terms) and opinion words by increasing the interactive information between different words. This helps the model pay more attention to the target aspect terms and corresponding opinion words, ignoring the sentiment information of secondary non-target aspect terms, to maintain the stability of the subject sentiment orientation of the semantic space, thereby improving the robustness of the model.
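To make the idea concrete, below is a minimal sketch of how coarse- and fine-grained attention between aspect terms and context words could be combined. The function names, the mean-pooled coarse query, and the simple averaging fusion are our illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_granularity_attention(context, aspect):
    """Combine coarse- and fine-grained attention between an aspect and its context.

    context: (n, d) hidden states of the n context words
    aspect:  (m, d) hidden states of the m aspect-term words
    Returns an aspect representation of shape (d,).
    """
    # Fine-grained: word-to-word alignment between every aspect word
    # and every context word.
    scores = aspect @ context.T                   # (m, n)
    fine = softmax(scores, axis=-1) @ context     # (m, d)

    # Coarse-grained: pool the aspect into one query and attend once
    # over the whole context.
    query = aspect.mean(axis=0)                   # (d,)
    coarse = softmax(context @ query) @ context   # (d,)

    # Fuse the two granularities (simple averaging as a placeholder
    # for a learned fusion layer).
    return 0.5 * (fine.mean(axis=0) + coarse)

rng = np.random.default_rng(0)
ctx = rng.standard_normal((6, 4))   # 6 context words, hidden dim 4
asp = rng.standard_normal((2, 4))   # a 2-word aspect term
rep = multi_granularity_attention(ctx, asp)
```

The fine-grained path lets each aspect word align with individual opinion words, while the coarse-grained path keeps a sentence-level view; combining both is what distinguishes this from single-level attention.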

Another problem is that as sentences grow longer and the distance between words increases, the model’s attention to some important words weakens. This is a common problem in natural language processing: traditional neural networks rely on serialized processing, so long-distance dependence is a common defect. Graph neural networks (GNNs) broke this deadlock by treating each word as an independent node in a graph and capturing its hidden state through the relationships between words. GNNs (Wu et al., 2020, Zhou et al., 2020), which use each context word as a node to obtain global context information, have made breakthroughs on this problem. Zhao, Hou, and Wu (2020) introduced position encoding for graph convolutional networks. Wang, Shen, Yang, Quan, and Wang (2020) proposed a relational graph attention network to encode a data structure for sentiment prediction. GNNs achieve good results on short and medium-length text but perform poorly on long sequences. Therefore, we propose combining a fine-tuned pre-trained model with a GCN (Wang et al., 2019) to solve this problem. We first obtain a domain-adapted pre-trained language model by pre-training on domain-specific datasets. We then break with the traditional sequence structure and rely only on the semantic information of the input sequence to build a semantic graph. Finally, we use a GCN to obtain the hidden states of the text. This method relies entirely on the semantic information of the text to construct a new graph data structure, abandons the limitation of the sequential structure of text, and offers an approach to solving long-distance dependence.
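The two steps above — building a semantic graph from pre-trained embeddings and running graph convolution over it — can be sketched as follows. The cosine-similarity threshold for connecting nodes and the single-layer GCN are simplifying assumptions on our part; the paper's actual graph construction from the fine-tuned model may differ.

```python
import numpy as np

def semantic_adjacency(embeddings, threshold=0.5):
    """Build a semantic graph: connect words whose (pre-trained)
    embeddings have cosine similarity above a threshold, ignoring
    their positions in the sequence."""
    norm = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = norm @ norm.T
    adj = (sim > threshold).astype(float)
    np.fill_diagonal(adj, 1.0)          # self-loops
    return adj

def gcn_layer(adj, features, weight):
    """One GCN layer: H' = ReLU(D^{-1/2} A D^{-1/2} H W)."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.diag(deg ** -0.5)
    a_norm = d_inv_sqrt @ adj @ d_inv_sqrt
    return np.maximum(a_norm @ features @ weight, 0.0)

# Toy example: three word embeddings, two of them semantically close.
emb = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
adj = semantic_adjacency(emb)
hidden = gcn_layer(adj, emb, np.ones((2, 2)))
```

Because edges come from semantic similarity rather than adjacency in the sentence, two related words at opposite ends of a long sentence are direct neighbors in the graph, which is the point of this construction.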

Finally, existing ABSA models lack guidance from external knowledge. A knowledge graph can effectively introduce external knowledge into a neural network model, guiding it toward a better understanding of the semantic information in text. Incorporating commonsense knowledge into deep learning models has therefore become a popular research topic in natural language processing (NLP), for example in question answering (Dong, Wei, Zhou, & Xu, 2015) and machine reading comprehension (Mihaylov & Frank, 2018). However, few studies have used external knowledge to improve model robustness. We propose an approach to this problem. Many disturbing aspect terms and opinion words are artificially added to the robustness test datasets, which can mislead the model with confusing opinion words and thereby interfere with its judgment of the sentiment polarity of the target aspect terms. Our method also gives target aspect terms a higher decision-making status. For example, the sentence “great burgers, grilled cheeses and french fries, but worst sushi and service is slow” has seven aspect terms in 15 words, yet we only require the model to recognize the sentiment polarity of the target aspect terms. The opinion word “slow” of the non-target aspect term “service”, which does not belong to the food category, obviously has very low correlation with food-category aspect terms, and therefore interferes little with the sentiment polarity of the target aspect term. However, if the model ignores the non-target aspect term “sushi”, then because “french fries” is near the word “worst” in both the semantic space and the sequence structure, the model’s judgment of the sentiment polarity of “french fries” may be inaccurate. We import conceptual knowledge giving the hypernym of “french fries”, which groups it with the food-related words “burgers”, “grilled cheeses”, and “sushi”.
At the same time, non-target aspect terms and their corresponding opinion words are given lower priority, providing a basis for a semantic space dominated by the concepts of the target aspect terms.
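A minimal sketch of such a concept-guided mask is given below. The hard-coded hypernym table stands in for an external concept graph, and the specific weighting scheme (damping only non-target aspects whose concept differs from the target's) is our reading of the example above, not the paper's exact formulation.

```python
# Toy hypernym lookup standing in for an external concept knowledge base.
HYPERNYM = {
    "burgers": "food", "grilled cheeses": "food", "french fries": "food",
    "sushi": "food", "service": "service",
}

def concept_mask(tokens, target, non_targets, keep=1.0, damp=0.1):
    """Assign each token an attention weight guided by conceptual knowledge.

    Non-target aspect terms whose hypernym differs from the target's are
    damped (e.g. "service" when the target is a food item), while
    same-concept non-targets such as "sushi" keep full weight so the model
    can still attach "worst" to them rather than to the target.
    """
    target_concept = HYPERNYM.get(target)
    weights = []
    for tok in tokens:
        if tok != target and tok in non_targets \
                and HYPERNYM.get(tok) != target_concept:
            weights.append(damp)
        else:
            weights.append(keep)
    return weights

weights = concept_mask(
    ["burgers", "sushi", "service", "slow"],
    target="french fries",
    non_targets={"sushi", "service"},
)
```

In a full model these weights would scale the attention scores or hidden states of the corresponding nodes, lowering the priority of off-concept non-target aspects as described above.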

In this study, we first fine-tuned a pre-trained model on the restaurant and laptop datasets to obtain pre-trained features with stronger domain relevance. We then built domain-related semantic graphs based on the fine-tuned pre-trained model and used them as the input of the GCN. For aspect terms, we first introduced external concept knowledge and then used a multi-granularity attention mechanism to enhance the interaction between aspect terms and context words. The main contributions of this paper are summarized as follows:

We propose a multi-granularity attention mechanism that enhances the interaction between aspect terms and context words and helps the model focus on the core target aspect terms and their opinion words.

To solve the problem of long-distance dependence, we propose a GCN based on a fine-tuned pre-trained model.

To prevent artificially added non-target aspect terms and their opinion words from disturbing the model, we introduce external conceptual knowledge that allows it to discount them.

We conducted experiments using 12 robustness test datasets. The results demonstrate that our network outperforms state-of-the-art methods on these datasets.

Section snippets

Aspect-based sentiment analysis

Compared with the traditional sentiment analysis task, ABSA is a more fine-grained sentiment analysis task that supports a more standardized and rationalized analysis of users’ online comments. In this section, we review studies related to ABSA.

Deep neural networks can generate dense word vectors without manual features. Hence, they have attracted increasing attention. Tang et al. (2016) proposed multiple layers in an LSTM, and each layer is an attention model that adds

Proposed methodology

The aspect-based sentiment analysis task aims to predict the sentiment polarity of aspect terms. Suppose we are given a context sequence T^c = {w_1^c, w_2^c, …, w_n^c} and an aspect-term sequence T^t = {w_1^t, w_2^t, …, w_m^t}, where T^t is a sub-sequence of T^c. T^c and T^t consist of n context words and m aspect-term words, respectively. The goal of this task is to predict the sentiment polarity of all terms contained in T^t through the elements in T^c. For this task we propose a knowledge-guided multi-granularity graph
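The input/output contract of this task can be illustrated with a toy stand-in classifier. The tiny opinion lexicon and the inverse-distance weighting below are purely our illustrative assumptions, used only to show what T^c, T^t, and the predicted polarity look like; they are not KMGCN.

```python
def absa_predict(context_tokens, aspect_span):
    """Toy ABSA baseline: score the aspect at aspect_span = (start, end)
    within context_tokens using a small opinion lexicon, weighting each
    opinion word by its inverse distance to the aspect term."""
    lexicon = {"great": 1, "dreadful": -1, "worst": -1, "slow": -1}
    start, end = aspect_span
    score = 0.0
    for i, tok in enumerate(context_tokens):
        if tok in lexicon:
            dist = min(abs(i - start), abs(i - end + 1)) or 1
            score += lexicon[tok] / dist
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

tokens = "great food but the service was dreadful !".split()
pol_food = absa_predict(tokens, (1, 2))      # aspect "food"
pol_service = absa_predict(tokens, (4, 5))   # aspect "service"
```

Note how the same sentence yields different polarities per aspect span, which is exactly why a single sentence-level label is insufficient for this task.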

Dataset and experimental settings

To demonstrate the effectiveness of KMGCN, we conduct experiments on 12 variant benchmarking datasets (Gui et al., 2021, Xing et al., 2020) from SemEval 2014 Task 4. The datasets are described in Table 1. Specifically, “Res” and “Lap” denote the restaurant-review and laptop-review datasets, respectively, and datasets prefixed with “Res” are variants of the restaurant dataset. For example, “Res_RevTgt” reverses the sentiment of the target aspect terms. “RevNon” reverses the non-target aspect

Conclusion and future work

In this study, we propose a knowledge-guided multi-granularity graph convolutional network (KMGCN) for aspect-based sentiment analysis, which integrates external knowledge with multi-granularity graph convolution. KMGCN outperforms all state-of-the-art approaches across all 12 datasets and captures semantic information more accurately. However, our model still has some limitations. First, when the entities in the text are

CRediT authorship contribution statement

Zhenfang Zhu: Methodology, Software, Data curation. Dianyuan Zhang: Conceptualization, Writing – original draft. Lin Li: Visualization. Kefeng Li: Supervision. Jiangtao Qi: Software. Wenling Wang: Validation. Guangyuan Zhang: Investigation. Peiyu Liu: Writing – review & editing.

Acknowledgments

This work was supported in part by the National Social Science Foundation of China (19BYY076), the Shandong Natural Science Foundation (ZR2021MF064, ZR2021QG041), the Key R&D Project of Shandong Province (2019JZZY010129), and the Shandong Social Science Planning Foundation (19BJCJ51).

References (48)

  • Fan, F., Feng, Y., & Zhao, D. (2018). Multi-grained attention network for aspect-level sentiment classification. In...
  • Ferragina, P., & Scaiella, U. (2010). Tagme: On-the-fly annotation of short text fragments (by Wikipedia entities). In...
  • Ferragina, P., et al. (2011). Fast and accurate annotation of short texts with Wikipedia pages. IEEE Software.
  • Gao, T., Yao, X., & Chen, D. (2021). SimCSE: Simple Contrastive Learning of Sentence Embeddings. In Proceedings of the...
  • Gu, S., Zhang, L., Hou, Y., & Song, Y. (2018). A position-aware bidirectional attention network for aspect-level...
  • Gui, T., et al. (2021). TextFlint: Unified multilingual robustness evaluation toolkit for natural language processing.
  • He, R., Lee, W. S., Ng, H. T., & Dahlmeier, D. (2018). Effective attention modeling for aspect-level sentiment...
  • Huang, B., & Carley, K. M. (2018). Parameterized Convolutional Neural Networks for Aspect Level Sentiment...
  • Li, X., Bing, L., Lam, W., & Shi, B. (2018). Transformation Networks for Target-Oriented Sentiment Classification. In...
  • Li, Z., Zou, Y., Zhang, C., Zhang, Q., & Wei, Z. (2021). Learning Implicit Sentiment in Aspect-based Sentiment Analysis...
  • Lin, Y., Liu, Z., Sun, M., Liu, Y., & Zhu, X. (2015). Learning entity and relation embeddings for knowledge graph...
  • Lv, S., et al. Graph-based reasoning over heterogeneous external knowledge for commonsense question answering.
  • Ma, D., Li, S., Zhang, X., & Wang, H. (2017). Interactive attention networks for aspect-level sentiment classification....
  • Ma, F., et al. (2021). Exploiting position bias for robust aspect sentiment classification.